Glossary

Large Language Model

A large language model (LLM) is a neural network trained on massive text datasets that generates text by predicting the most likely next tokens given preceding context. GPT-4, Claude, and Gemini are LLMs.

Explanation

LLMs are transformer-based neural networks trained on hundreds of billions to trillions of tokens. Training involves predicting the next token in a sequence billions of times, adjusting billions of parameters to minimize prediction error. The result is a model that has internalized the statistical patterns of language — including code.

LLMs generate text autoregressively: given a prompt, they produce one token at a time, each conditioned on all preceding tokens. The model doesn't 'know' things — it predicts statistically likely continuations. This distinction matters: an LLM can confidently generate plausible-sounding but incorrect information (a hallucination) because confident-sounding text is common in its training data.

Context window: the amount of text an LLM can see at once — its working memory. GPT-4o's 128K-token context window equals roughly 96,000 words. Larger windows allow the model to consider more of your codebase, but models don't process the entire window equally — information from the middle of very long contexts tends to be 'lost.'

Temperature controls randomness: temperature=0 makes the model deterministic (always pick the most likely token — good for code), while higher temperatures increase variability (good for creative tasks). Most coding tools default to low temperature.
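The interaction between temperature and token probabilities can be sketched with a toy sampler. The token names and logit values below are made up for illustration — a real model scores a vocabulary of tens of thousands of tokens:

javascript
// Toy next-token distribution (illustrative values only).
// Logits are the unnormalized scores a model assigns to candidate tokens.
const logits = { 'const': 4.0, 'let': 2.5, 'var': 0.5 };

function softmax(logits, temperature) {
  // Scale logits by 1/temperature, exponentiate, then normalize to sum to 1.
  const scaled = Object.entries(logits).map(
    ([tok, z]) => [tok, Math.exp(z / temperature)]
  );
  const total = scaled.reduce((sum, [, e]) => sum + e, 0);
  return Object.fromEntries(scaled.map(([tok, e]) => [tok, e / total]));
}

function sampleToken(logits, temperature) {
  if (temperature === 0) {
    // Greedy decoding: always pick the highest-scoring token (deterministic).
    return Object.entries(logits).sort((a, b) => b[1] - a[1])[0][0];
  }
  const probs = softmax(logits, temperature);
  let r = Math.random();
  for (const [tok, p] of Object.entries(probs)) {
    if ((r -= p) <= 0) return tok;
  }
  return Object.keys(probs)[0]; // guard against floating-point drift
}

At temperature=0 this always returns 'const'. As temperature rises, the softmax distribution flattens, so lower-scoring tokens like 'var' get sampled more often — the same mechanism that makes high-temperature LLM output more varied.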

Code Example

javascript
// Using the OpenAI API to access an LLM programmatically
import OpenAI from 'openai'; // ESM import — the top-level await below requires an ES module
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  temperature: 0,        // deterministic output for code
  max_tokens: 500,       // limit response length (and cost)
  messages: [
    {
      role: 'system',
      content: 'You are a senior TypeScript engineer. Write production-ready code with error handling.',
    },
    {
      role: 'user',
      content: 'Write a function that fetches a user by ID from PostgreSQL.',
    },
  ],
});

const code = response.choices[0].message.content;

// Key concepts:
// model: which LLM to use (affects quality + cost)
// temperature: 0=deterministic, 1=creative
// max_tokens: caps response length
// messages: conversation history (the context window)
// roles: system (behavior instructions), user (prompt), assistant (prior responses)
// tokens: input + output tokens determine cost
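Since input and output tokens are billed at different rates, a rough cost estimate is a simple weighted sum. The per-million-token prices below are placeholders, not real pricing — check your provider's current rate card, as prices change frequently:

javascript
// Back-of-envelope API cost estimate from token counts.
// Prices are hypothetical placeholders in USD per million tokens.
const PRICE_PER_MILLION = { input: 2.5, output: 10.0 };

function estimateCostUSD(inputTokens, outputTokens) {
  return (
    (inputTokens / 1_000_000) * PRICE_PER_MILLION.input +
    (outputTokens / 1_000_000) * PRICE_PER_MILLION.output
  );
}

// A 2,000-token prompt with a 500-token completion at these placeholder
// rates: estimateCostUSD(2000, 500) → 0.01 USD

This is why max_tokens matters in the example above: output tokens are typically priced several times higher than input tokens, so capping response length directly caps the most expensive part of the bill.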

Why It Matters for Engineers

Understanding LLMs makes you a better consumer of AI coding tools. Knowing they're next-token predictors — not reasoners or knowers — explains why they hallucinate APIs, make confident factual errors, and struggle with novel problems. This understanding tells you when to trust output and when to verify independently. LLM fundamentals also become practical engineering knowledge as models are embedded in production systems: evaluating output quality, understanding context window limits, managing token costs, and building RAG (retrieval-augmented generation) pipelines are emerging skills for the AI era.

Learn This In Practice

Go deeper with the full module on Beyond Vibe Code.

AI-Assisted Dev Foundations →