What is TOON?

Token-Oriented Object Notation is a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage.

[Figure: TOON format overview showing token savings, workflow, and performance comparison]
  • 💸 Token-efficient: typically 30-60% fewer tokens than formatted JSON on large uniform arrays
  • 🤿 LLM-friendly guardrails: explicit lengths and field lists enable validation (illustrated below)
  • 🍱 Minimal syntax: removes redundant punctuation (braces, brackets, most quotes)
  • 📐 Indentation-based structure: like YAML, uses whitespace instead of braces
  • 🧺 Tabular arrays: declare keys once, stream data as rows
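
For instance, the explicit length and field list give both the model and a downstream parser something to check: if the declared count or fields don't match the rows that follow, the payload was truncated or malformed. A hypothetical inventory snippet (field names invented for illustration):

products[3]{sku,price,stock}:
  A100,9.99,12
  B200,4.50,0
  C300,19.95,7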

Why TOON?

AI is becoming cheaper and more accessible, but larger context windows also invite larger data inputs. LLM tokens still cost money, and standard JSON is verbose and token-expensive.

JSON Example

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON Equivalent

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

TOON conveys the same information with fewer tokens, making it ideal for LLM input where token costs can add up quickly with large datasets.
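
To make the mechanics concrete, here is a minimal Python sketch of how a uniform array of flat objects can be rendered in TOON's tabular form. This is an illustration only, not the reference implementation (that lives in the repository linked below), and it assumes every row has the same keys and only primitive values that need no quoting or escaping:

def encode_tabular(key, rows):
    """Render a uniform list of flat dicts in TOON's tabular form.

    Sketch only: assumes identical keys per row and values that
    need no quoting or escaping.
    """
    fields = list(rows[0].keys())
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(encode_tabular("users", users))
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user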

Best For
  • Uniform arrays of objects (same fields, primitive values)
  • Large datasets with consistent structure
  • Repeated structures and tables
  • LLM input where token costs matter

Not Ideal For
  • Deeply nested or non-uniform structures
  • Semi-uniform arrays (only ~40–60% of elements are tabular-eligible)
  • Pure tabular data, where plain CSV is even more compact (see the comparison below)
  • API responses or storage (use JSON)
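
For example, the users table above needs only a plain header row in CSV, with no length marker or key braces:

TOON:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

CSV:

id,name,role
1,Alice,admin
2,Bob,user
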
Format Overview
TOON borrows YAML's indentation-based structure for nested objects and CSV's tabular format for uniform data rows, then optimizes both for token efficiency.
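
Combining the two, a nested object can wrap a tabular array: indentation expresses the nesting, and rows carry the uniform data. (Illustrative snippet; the field names are invented, not taken from the spec.)

order:
  id: 7421
  customer: Alice
  items[2]{sku,qty}:
    A100,2
    B200,1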

Performance & Benchmarks

TOON achieves significant token savings while maintaining high retrieval accuracy in LLM applications. Benchmarks show:

  • Token efficiency: 30-60% fewer tokens than JSON
  • Retrieval accuracy: 73.9%, vs 69.7% for JSON

Note: Token counts vary by tokenizer and model. Benchmarks use a GPT-style tokenizer (cl100k/o200k); actual savings will differ with other models.
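
To check the savings on your own payloads, you can count tokens directly. The sketch below assumes OpenAI's tiktoken package and the o200k encoding mentioned above:

import json
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("o200k_base")  # GPT-style tokenizer

json_text = json.dumps(
    {"users": [
        {"id": 1, "name": "Alice", "role": "admin"},
        {"id": 2, "name": "Bob", "role": "user"},
    ]},
    indent=2,
)
toon_text = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"

json_tokens = len(enc.encode(json_text))
toon_tokens = len(enc.encode(toon_text))
print(f"JSON: {json_tokens} tokens, TOON: {toon_tokens} tokens")
print(f"Savings: {1 - toon_tokens / json_tokens:.0%}")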

Learn More

For the complete specification, examples, and more details, visit the official TOON format repository:

TOON Format on GitHub →