Understanding Tokens
Every time you use ChatGPT, Claude, or any LLM, your text gets broken into tokens. Understanding tokens helps you use AI more effectively—and manage costs.
What Is a Token?
A token is a chunk of text that the AI processes as a single unit.
Tokens aren't exactly words. They're pieces that the model has learned to recognize:
- Common words = 1 token: "hello", "the", "and"
- Longer words = often multiple tokens: "extraordinary" may split into two or three pieces
- Rare words = more tokens: "pneumonoultramicroscopicsilicovolcanoconiosis" = many tokens
The Rough Math
For English text:
- 1 token ≈ 4 characters
- 1 token ≈ 0.75 words
- 100 tokens ≈ 75 words
Quick estimate: Divide character count by 4.
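Here's that rule of thumb as a minimal Python sketch (no tokenizer needed); the 4-characters-per-token ratio is only a heuristic for English prose, so real counts will vary:

def estimate_tokens(text):
    """Rough estimate for English text: about 4 characters per token."""
    return max(1, len(text) // 4)

print(estimate_tokens("Summarize this text: The quick brown fox jumps over the lazy dog."))  # ~16

For billing or hard limits, use a real tokenizer (see Checking Token Counts below).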
Why Tokens Matter
1. Context Window Limits
Every model has a maximum token limit for input + output combined:
| Model | Context Window |
|---|---|
| GPT-3.5 | 4K - 16K tokens |
| GPT-4 | 8K - 128K tokens |
| Claude 3 | 200K tokens |
| Gemini 1.5 | 1M tokens |
If your conversation exceeds the limit, older messages get dropped or summarized so the rest still fits (an API request that overruns the limit simply returns an error).
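Here's a minimal sketch of how a chat application might keep a conversation under a token budget; count_tokens is just the rough estimator from above, and a real application would use the model's actual tokenizer:

def count_tokens(text):
    # Rough estimate; swap in a real tokenizer (e.g. tiktoken) for accuracy.
    return max(1, len(text) // 4)

def trim_history(messages, budget):
    """Keep the most recent messages whose combined token count fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):      # walk from newest to oldest
        cost = count_tokens(msg["content"])
        if used + cost > budget:
            break                       # everything older gets dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order

Some apps summarize old turns instead of dropping them, but the budget arithmetic is the same.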
2. API Costs
You pay per token:
- Input tokens (your prompt)
- Output tokens (AI's response)
GPT-4 pricing example:
- Input: $30 per 1M tokens
- Output: $60 per 1M tokens
A 2,000-word article = ~2,700 tokens = ~$0.08 to read + ~$0.16 to write.
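The arithmetic behind those numbers, as a tiny sketch (prices are per million tokens and change often, so treat the rates as an example, not a current price list):

def api_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of one request at per-million-token prices."""
    return (input_tokens * input_price_per_m + output_tokens * output_price_per_m) / 1_000_000

article_tokens = 2_700                                            # roughly a 2,000-word article
print(f"Read it:  ${api_cost(article_tokens, 0, 30, 60):.2f}")    # ~$0.08
print(f"Write it: ${api_cost(0, article_tokens, 30, 60):.2f}")    # ~$0.16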
3. Speed
More tokens = slower responses. Generation time grows roughly linearly with output length, so a 4,000-token output takes about ten times longer than a 400-token one.
Tokenization Examples
"Hello world" = 2 tokens
"Hello, world!" = 4 tokens (punctuation matters)
" Hello" = 2 tokens (spaces count)
Code is expensive:
def calculate_sum(a, b):
    return a + b
This simple function = ~15 tokens
Non-English text uses more tokens:
- English: "Hello" = 1 token
- Japanese: "こんにちは" = 3+ tokens
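You can check counts like these yourself with OpenAI's tiktoken library; exact numbers depend on which encoding a model uses, so treat the figures above as illustrative:

import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4 and GPT-3.5-turbo

samples = ["Hello world", "Hello, world!", "def calculate_sum(a, b):\n    return a + b", "こんにちは"]
for text in samples:
    print(len(enc.encode(text)), "tokens:", repr(text))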
Practical Tips
Optimize Prompts for Cost
Expensive:
Please kindly analyze the following text and provide a comprehensive summary that captures all the main points and key insights.
Cheaper (same result):
Summarize this text:
Watch Context Windows
For long documents:
- Break into chunks (see the chunking sketch after this list)
- Summarize in stages
- Use models with larger contexts
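A minimal chunking sketch, assuming you want fixed-size token windows with a small overlap so content split at a boundary isn't lost (the chunk and overlap sizes here are arbitrary):

import tiktoken

def chunk_by_tokens(text, max_tokens=1000, overlap=100):
    """Split text into chunks of at most max_tokens tokens, with some overlap."""
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    chunks, start = [], 0
    while start < len(tokens):
        chunks.append(enc.decode(tokens[start:start + max_tokens]))
        start += max_tokens - overlap   # step forward, keeping a little overlap
    return chunks

Summarize each chunk, then summarize the summaries (the staged approach above).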
Request Concise Outputs
Add to prompts:
- "Be concise"
- "Under 200 words"
- "Bullet points only"
Checking Token Counts
- OpenAI: the tiktoken library or the online tokenizer/playground
- Claude: no official web tokenizer, but counts are roughly similar to GPT-4
- Online tools: many free token counters are available
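For example, tiktoken can pick the encoding that matches a specific OpenAI model and count a prompt before you send it (other providers use different tokenizers, so treat the result as an approximation for them):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")   # selects the encoding that model uses
prompt = "Summarize this text:"
print(len(enc.encode(prompt)), "tokens")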
The Bottom Line
Tokens are the currency of LLMs:
- More tokens = more cost (for APIs)
- More tokens = more context (but with limits)
- More tokens = slower (generation time)
Understanding tokens helps you:
- Write more efficient prompts
- Estimate costs accurately
- Work within context limits
Next up: AI Hallucinations — When AI confidently says things that aren't true