Language Models
Learning objectives
- Understand how language models work
- Know the difference between traditional programming and language models (see the previous section for the foundation)
- Understand what a context window is and why it matters
- Know the most important limitations
In the previous section we talked about generative AI. Now we zoom in on the most discussed type: language models (e.g. ChatGPT, Claude, Gemini). How do they work, and what do you need to know to use them effectively?
What is a language model?
Simply put: a language model is trained on an enormous amount of text. It predicts which word is most likely to come next, based on all the words that came before and on the patterns it learned from its training data.
When you write "What is the capital of", the model expects a country name to come next, followed by an answer that names a city.
Modern models can answer complex questions, write code, summarize, translate, and reason, but they're still generating text based on probability and patterns, not human "understanding" in the full sense.
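To make "predict the next word" concrete, here is a minimal sketch of a toy model that only counts which word follows which. It is an illustration under simplifying assumptions, not how ChatGPT, Claude, or Gemini work internally: real models use neural networks and look at far more than the previous word. The corpus and the function names (`train_bigrams`, `next_word_probabilities`) are made up for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real models are trained on vastly more text.
corpus = (
    "the capital of france is paris . "
    "the capital of sweden is stockholm . "
    "the capital of france is paris ."
)

def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def next_word_probabilities(counts, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    followers = counts[word]
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}

model = train_bigrams(corpus)
probs = next_word_probabilities(model, "of")
most_likely = max(probs, key=probs.get)

print(probs)        # "france" gets 2/3, "sweden" gets 1/3
print(most_likely)  # france
```

After "of", the toy model has only ever seen country names, so it predicts a country. It does this because of counted patterns, not because it knows what a country is.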
How is a language model trained?
- Data collection: Enormous amounts of text from books, articles, the web, Wikipedia, forums, and more.
- Training: The model learns patterns through neural networks: which words follow each other, how sentences are structured, how different text types differ. This requires massive computation, during which billions of parameters are adjusted.
- Fine-tuning: Humans rate responses so the model becomes more helpful, relevant, accurate, and safe.
Result: a model that can generate fluent text on almost any topic, within the scope of its training and knowledge cutoff.
What is a context window?
The context window is the total amount of text the model can work with at once, like short-term memory.
Everything must fit there: your question, previous messages, attached text, and the model's own responses. It's usually measured in tokens (small pieces of text: words, parts of words, or single characters).
Why does it matter?
- Long documents: An entire book rarely fits; you must split it or use tools like RAG.
- Long conversations: When the window is full the model drops the oldest parts (in Intric you may get an error if the context is full).
- Context quality: More relevant context in the right order usually gives better answers.
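The "drops the oldest parts" behavior can be sketched in a few lines. This is a simplified illustration, not the logic of any particular product: `estimate_tokens` is a hypothetical stand-in that counts whitespace-separated words, whereas real tokenizers split text differently, and some systems raise an error instead of trimming.

```python
def estimate_tokens(text):
    """Very rough stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())

def fit_to_window(messages, max_tokens):
    """Keep the newest messages that fit in the budget, dropping the oldest first."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

conversation = [
    "Hi, can you help me plan a trip?",      # 8 estimated tokens
    "Sure, where would you like to go?",     # 7
    "Somewhere warm in Europe in October.",  # 6
    "What is your budget?",                  # 4
]

window = fit_to_window(conversation, max_tokens=12)
print(window)  # only the last two messages fit (6 + 4 = 10 tokens)
```

With a budget of 12 estimated tokens, the first two messages fall out of the window, so the model would no longer "remember" that the conversation was about planning a trip.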
RAG: giving the model current and comprehensive knowledge
RAG (Retrieval-Augmented Generation) means you don't have to cram an entire giant knowledge base into the prompt every time.
Instead:
- Documents are split into smaller chunks with metadata (document, page, section).
- When you ask a question, the system picks the chunks that seem most relevant.
- Only those pieces are sent into the context window together with your question.
It's like a librarian who fetches the right chapter for you instead of dumping the whole shelf on the table.
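The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: relevance is scored by simple word overlap, whereas real RAG systems typically use embeddings and a vector search. The handbook text, the prompt wording, and all function names are invented for the example.

```python
import re

def split_into_chunks(text, source, chunk_size=12):
    """Step 1: split text into chunks of `chunk_size` words, each tagged with metadata."""
    words = text.split()
    chunks = []
    for start in range(0, len(words), chunk_size):
        chunks.append({
            "source": source,
            "position": start // chunk_size,
            "text": " ".join(words[start:start + chunk_size]),
        })
    return chunks

def words_in(text):
    """Lowercase the text and return its set of words, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, chunks, top_k=1):
    """Step 2: rank chunks by how many words they share with the question."""
    question_words = words_in(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(question_words & words_in(chunk["text"])),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, relevant_chunks):
    """Step 3: send only the selected chunks along with the question."""
    context = "\n".join(
        f"[{c['source']} #{c['position']}] {c['text']}" for c in relevant_chunks
    )
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Made-up document, purely for illustration.
handbook = (
    "Vacation requests must be submitted four weeks in advance to your manager. "
    "Expense reports are due on the last working day of each month."
)

chunks = split_into_chunks(handbook, source="handbook")
question = "When are expense reports due?"
top = retrieve(question, chunks)
prompt = build_prompt(question, top)
print(prompt)
```

Only the chunk about expense reports ends up in the prompt; the vacation policy stays on the shelf.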
Limitations of language models
- Knowledge cutoff: Training ends at a certain date; events after that don't exist in the "base model" (connections to the web are a separate layer).
- Hallucinations: The model can sound convincing but be wrong, especially when filling gaps without adequate sources.
- No real understanding: Strong on text patterns; weak on anything requiring real experience, sensation, and shared worldview.
- Context limit: You're constrained by the context window size and how well the right information actually gets included.
- Inconsistency: The same question can give slightly different answers (sampling, temperature, small wording differences).
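Why the same question can give different answers becomes clearer with a small sampling sketch. The scores below are made up, and the functions are illustrative only. The sketch assumes the common approach in which a model's raw scores are turned into probabilities and the next word is then drawn at random, with temperature controlling how strongly the top candidate is favored.

```python
import math
import random

def softmax_with_temperature(scores, temperature):
    """Convert raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores.values()]
    highest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return {word: e / total for word, e in zip(scores, exps)}

def sample_word(scores, temperature, rng):
    """Draw one word at random according to the temperature-adjusted probabilities."""
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Made-up scores for candidate next words.
scores = {"Paris": 3.0, "Lyon": 1.5, "Nice": 1.0}

cold = softmax_with_temperature(scores, temperature=0.2)
warm = softmax_with_temperature(scores, temperature=1.5)
print(cold)  # almost all probability on "Paris"
print(warm)  # probability spread more evenly across the candidates

rng = random.Random(0)
samples = [sample_word(scores, 1.5, rng) for _ in range(5)]
print(samples)  # may contain different words from draw to draw
```

At a low temperature the top candidate wins almost every time, so answers are more repeatable. At a higher temperature, less likely words are picked more often, which gives more varied responses.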
Summary
- Language models predict the next word based on patterns in training data.
- The context window limits how much can be included at once.
- RAG fetches relevant excerpts so you don't have to paste entire documents into the prompt.
- Key risks: knowledge cutoff, hallucinations, lack of "real" understanding, and variation in responses.