What is TOON?

TOON (Token-Oriented Object Notation) is a compact, human-readable data serialization format designed specifically for Large Language Models (LLMs). Unlike JSON, which repeats field names for every object in an array, TOON declares field names once and then transmits data in rows—similar to CSV but with explicit structure. This design typically reduces token usage by 30-60% compared to formatted JSON, especially for uniform arrays of objects. TOON combines YAML's indentation-based structure for nested objects with CSV's tabular efficiency, optimized for LLM contexts where token costs matter.

Tool description

This validator checks TOON format syntax for correctness and provides detailed statistics about the data structure. It parses TOON input using the official @toon-format/toon library, validates the syntax, and outputs comprehensive metrics including character count, line count, number of arrays, objects, primitive values, and total field count. Use this tool to verify TOON data integrity before sending to LLMs or to analyze TOON structure complexity.

Features

  • Syntax validation - Verifies TOON format correctness using official parser
  • Character count - Total number of characters in the input
  • Line count - Number of lines in the TOON data
  • Array detection - Counts all array structures in the data
  • Object detection - Counts all object structures including nested ones
  • Primitive analysis - Counts strings, numbers, booleans, and null values
  • Field counting - Totals all object fields across the entire structure
  • Real-time validation - Instant feedback as you type
  • Syntax highlighting - TOON-specific code highlighting for better readability
  • Error messages - Clear error descriptions for invalid syntax

Use cases

  1. Pre-submission validation - Verify TOON syntax before sending data to LLM APIs to avoid errors and wasted tokens
  2. Structure analysis - Understand the complexity of TOON data by examining array, object, and field counts
  3. Format learning - Test TOON syntax examples to learn the format through trial and error with immediate feedback
  4. Data quality check - Ensure TOON data is properly formatted after generation or conversion from other formats
  5. Token optimization - Analyze TOON structure to identify opportunities for further token reduction

Statistics explained

Characters: Total character count including whitespace and newlines. Useful for comparing TOON compactness against JSON.

Lines: Number of lines in the input. TOON's tabular format typically uses fewer lines than formatted JSON.

Arrays: Count of array structures. TOON's tabular arrays ([N]{fields}:) are more token-efficient than JSON arrays for uniform data.

Objects: Count of object structures. Includes both root objects and nested objects within the data hierarchy.

Primitive values: Total count of all non-composite values (strings, numbers, booleans, null). Indicates data density.

Total fields: Sum of all object properties across the entire structure. High field counts benefit most from TOON's format.

Validation process

  1. Parse TOON input - Uses @toon-format/toon decode function to parse the input string
  2. Validate syntax - If parsing succeeds, TOON syntax is valid; if it throws an error, syntax is invalid
  3. Analyze structure - Recursively traverses the parsed data to count arrays, objects, and primitives
  4. Calculate statistics - Computes character count, line count, and field totals
  5. Display results - Shows validation status and detailed statistics in the output area

TOON format benefits

  • 30-60% fewer tokens than JSON for uniform tabular data
  • Explicit structure with array lengths and field declarations
  • LLM-friendly with guardrails that enable validation
  • Human-readable with minimal syntax and clear structure
  • Lossless representation of JSON data without information loss

When to use TOON

TOON excels with:

  • Large datasets with uniform array structures
  • Repeated objects with the same fields
  • API responses with consistent schemas
  • Database query results with fixed columns
  • Any JSON data where token costs matter

For deeply nested or non-uniform data, JSON may remain more efficient.