ORT (Object Record Table) Format Specification

Version: 1.1.0
Last Updated: 2025

1. Introduction

1.1 Overview

Object Record Table (ORT) is a CSV-like structured data format designed for token optimization in Large Language Model (LLM) contexts. Unlike general-purpose formats such as JSON and YAML, ORT prioritizes token efficiency while remaining human-readable.

1.2 Goals

Token Efficiency: Minimize the number of tokens required to represent structured data
Structural Clarity: Maintain clear relationships between data elements
Native Support: Support objects and arrays as first-class data structures
Simplicity: Keep the syntax simple and predictable

1.3 Use Cases

ORT is ideal for:

Data interchange with LLMs
Uniform data structures with multiple records
Configuration files for AI applications
Structured data logging

ORT is NOT ideal for:

Heterogeneous data structures
Direct application I/O operations
Human-editable configuration files requiring comments throughout

2. Design Philosophy

2.1 Token Optimization

ORT achieves token efficiency through:

Header-based field definitions: Field names declared once in header
Positional value mapping: Data lines contain only values
Minimal delimiters: Using only essential punctuation
No redundant whitespace: Compact representation

2.2 Comparison with Other Formats

# ORT Format (110 characters, 35 tokens)
users:id,profile(name,age,address(city,country)):
1,(John Doe,30,(New York,USA))
2,(Jane Smith,25,(London,UK))

// JSON Format (398 characters, 118 tokens)
{
  "users": [
    {
      "id": 1,
      "profile": {
        "name": "John Doe",
        "age": 30,
        "address": {
          "city": "New York",
          "country": "USA"
        }
      }
    },
    {
      "id": 2,
      "profile": {
        "name": "Jane Smith",
        "age": 25,
        "address": {
          "city": "London",
          "country": "UK"
        }
      }
    }
  ]
}
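The character counts above can be checked with a short script. The following Python sketch is illustrative only: it measures characters, not tokens, because token counts depend on the tokenizer in use. The variable names (`ort_text`, `sizes`) are hypothetical.

```python
import json

# The ORT example from this section, joined with LF line terminators.
ort_text = (
    "users:id,profile(name,age,address(city,country)):\n"
    "1,(John Doe,30,(New York,USA))\n"
    "2,(Jane Smith,25,(London,UK))"
)

# The same data as a Python structure, serialized two ways.
data = {
    "users": [
        {"id": 1, "profile": {"name": "John Doe", "age": 30,
                              "address": {"city": "New York", "country": "USA"}}},
        {"id": 2, "profile": {"name": "Jane Smith", "age": 25,
                              "address": {"city": "London", "country": "UK"}}},
    ]
}
json_pretty = json.dumps(data, indent=2)                 # indented, as shown above
json_compact = json.dumps(data, separators=(",", ":"))   # no optional whitespace

sizes = {
    "ort": len(ort_text),
    "json_pretty": len(json_pretty),
    "json_compact": len(json_compact),
}
print(sizes)
```

Comparing against compact JSON as well as indented JSON gives a fairer picture, since much of the indented size is whitespace.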

3. Lexical Structure

3.1 Character Set

ORT uses UTF-8 encoding and supports the full Unicode character set.

3.2 Line Terminators

Unix style: \n (LF)
Windows style: \r\n (CRLF)
Both are supported and normalized during parsing

3.3 Whitespace

Spaces and tabs are trimmed from line beginnings and endings
Whitespace within values is preserved
Empty lines are ignored
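The line-terminator and whitespace rules in Sections 3.2 and 3.3 can be sketched as a small preprocessing step. This is a minimal Python sketch, not a reference implementation; the function name `logical_lines` is hypothetical.

```python
def logical_lines(text: str) -> list[str]:
    """Split a document into trimmed, non-empty lines."""
    # Normalize Windows line endings (CRLF) to Unix line endings (LF).
    normalized = text.replace("\r\n", "\n")
    lines = []
    for raw in normalized.split("\n"):
        # Trim spaces and tabs at both ends; inner whitespace is untouched.
        line = raw.strip(" \t")
        if line:  # empty lines are ignored
            lines.append(line)
    return lines
```

For example, `logical_lines("users:id,name:\r\n  1,Alice  \r\n\r\n2,Bob\n")` yields the header line followed by the two data lines.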

3.4 Reserved Characters

The following characters have special meaning in ORT:

Character	Purpose	Escape Required
`:`	Header delimiter	No (only in headers)
`,`	Value separator	Yes (in values)
`(`	Object/nested field start	Yes (in string values)
`)`	Object/nested field end	Yes (in string values)
`[`	Array start	Yes (in string values)
`]`	Array end	Yes (in string values)
`\`	Escape character	Yes (always `\\`)
`#`	Comment marker	No (only at line start)

4. Data Types

ORT supports six primitive and composite data types:

4.1 Null

Represents absence of a value.

Syntax: Empty string or no value between delimiters

Examples:

users:id,name,email:
1,John,
2,Jane,jane@example.com

JSON Equivalent:

{
  "users": [
    {"id": 1, "name": "John", "email": null},
    {"id": 2, "name": "Jane", "email": "jane@example.com"}
  ]
}

4.2 Boolean

Boolean values representing true or false.

Syntax:

true - Boolean true
false - Boolean false

Case Sensitivity: Lowercase only

Examples:

settings:enabled,verified:
true,false
false,true

JSON Equivalent:

{
  "settings": [
    {"enabled": true, "verified": false},
    {"enabled": false, "verified": true}
  ]
}

4.3 Number

Numeric values including integers and floating-point numbers.

Syntax:

Integer: 42, -17, 0
Float: 3.14, -0.5, 999.99
Scientific notation: NOT supported in current version

Range:

Integers: 64-bit signed (-2^63 to 2^63-1)
Floats: 64-bit IEEE 754 double precision

Examples:

products:id,price:
101,999.99
102,29.99
103,79.99

Special Cases:

Values that parse as numbers become numbers, so leading zeros are dropped: 007 → 7
Leading zeros are preserved only when the value is kept as a string: "007"
To force string interpretation, use an escape: \007 → "007" (applies only if the escaped form is not parseable as a number)
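Number detection under these rules could look like the Python sketch below. The exact grammar is an assumption: an optional minus sign, digits, and an optional fractional part, with no exponent. The specification does not say whether forms such as `.5`, `5.` or `+5` are numbers, and this sketch rejects them. The 64-bit range check is omitted, and `parse_number` is a hypothetical name.

```python
import re

# Assumed grammar: optional '-', digits, optional '.digits'; no exponent.
_NUMBER = re.compile(r"-?\d+(\.\d+)?")


def parse_number(text: str):
    """Return an int or float, or None when text is not a plain decimal number."""
    if not _NUMBER.fullmatch(text):
        return None
    # A decimal point selects float; otherwise the value is an integer.
    return float(text) if "." in text else int(text)
```

With this grammar `007` becomes the integer 7, and `1e5` is not a number and would fall through to string handling.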

4.4 String

UTF-8 encoded text values.

Syntax: Raw text without quotes

Characteristics:

No surrounding quotes required
Whitespace is trimmed from beginning and end
Internal whitespace is preserved
Special characters must be escaped

Examples:

users:id,name:
1,John Doe
2,Jane Smith

Trimming Behavior:

data:value:
  hello world

Parsed as: "hello world" (leading/trailing spaces removed)

4.5 Array

Ordered collection of values.

Syntax: [value1,value2,value3]

Characteristics:

Square brackets [] delimit arrays
Values separated by commas
Can contain any data type
Can be nested
Empty arrays: []

Examples:

Simple Array:

colors:
[red,green,blue,yellow]

Nested Array:

matrix:
[[1,2,3],[4,5,6],[7,8,9]]

Mixed Type Array:

data:
[42,hello world,true,(id:100,active:false),[1,2,3]]

Array Field:

users:id,tags:
1,[admin,user]
2,[]
3,[guest]

4.6 Object

Unordered collection of key-value pairs.

Syntax:

Inline: (key1:value1,key2:value2)
Header-based: Defined in header, values in data lines

Characteristics:

Parentheses () delimit inline objects
Key-value pairs separated by colons :
Pairs separated by commas
Can be nested
Empty objects: ()

Examples:

Inline Object:

data:
[(id:1,name:Alice),(id:2,name:Bob)]

Header-based Object (Preferred):

users:id,name:
1,Alice
2,Bob

Nested Object:

users:id,profile(name,age):
1,(Alice,30)
2,(Bob,25)

5. Syntax Rules

5.1 Document Structure

An ORT document consists of one or more sections. Each section has:

Header Line: Defines structure and field names
Data Lines: Contains actual values

5.2 Two Main Formats

5.2.1 Named Section Format

Syntax: keyName:field1,field2,...:

Usage: Creating named arrays in the root object

Example:

users:id,name:
1,Alice
2,Bob

Result:

{
  "users": [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"}
  ]
}

5.2.2 Top-Level Format

Syntax: :field1,field2,...:

Usage: Creating root-level objects or arrays

Single Object:

:id,name,email:
1001,Alice Williams,alice@example.com

Result:

{
  "id": 1001,
  "name": "Alice Williams",
  "email": "alice@example.com"
}

Multiple Objects (Array):

:id,name:
1,Alice
2,Bob

Result:

[
  {"id": 1, "name": "Alice"},
  {"id": 2, "name": "Bob"}
]
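The way a tabular section maps onto the result, as described in Sections 5.2.1 and 5.2.2, can be summarized in a few lines. This Python sketch assumes the data lines have already been turned into a list of records; `section_result` is a hypothetical helper, and it does not cover sections whose body is a bare array such as `colors:`.

```python
def section_result(key_name: str, records: list[dict]):
    """Shape the parsed records of one tabular section."""
    if key_name:
        # Named section: always an array stored under the key name.
        return {key_name: records}
    if len(records) == 1:
        # Top-level format with a single data line: one root object.
        return records[0]
    # Top-level format with several data lines: a root array.
    return records
```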

6. Header Syntax

6.1 Header Format

General Form: [keyName]:field1,field2,...:

Components:

Optional Key Name: Identifier for the data section
Colon: Separates key name from fields
Field List: Comma-separated field names
Trailing Colon: Marks end of header
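The components listed above suggest a simple way to take a header apart. The Python sketch below assumes the line has already been trimmed and is known to be a header; `split_header` is a hypothetical name, and the field list is returned as a raw string for later parsing.

```python
def split_header(line: str) -> tuple[str, str]:
    """Split 'keyName:field1,field2:' into (key_name, field_list)."""
    if not line.endswith(":"):
        raise ValueError("header must end with a trailing colon")
    body = line[:-1]  # drop the trailing colon
    # The first colon separates the optional key name from the field list.
    key_name, _, field_list = body.partition(":")
    return key_name, field_list
```

A top-level header such as `:id,name:` produces an empty key name, and a header without fields such as `colors:` produces an empty field list.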

6.2 Field Names

Rules:

Must be valid identifiers
Case-sensitive
Can contain letters, numbers, underscores
Should not start with a number (a convention; parsers may still accept it)
No spaces allowed

Valid Examples:

id
firstName
user_name
item2
_private

Invalid or Discouraged Examples:

first name (invalid: contains a space)
2ndItem (discouraged: starts with a number; technically allowed but not recommended)
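A field-name check that follows these rules might look like the following Python sketch. It assumes "letters" means ASCII letters, which the specification does not state, and it enforces the convention that a name does not start with a digit. `is_recommended_field_name` is a hypothetical name.

```python
import re

# Assumption: ASCII letters only; first character is a letter or underscore.
_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_recommended_field_name(name: str) -> bool:
    """True when name follows the field-name rules and conventions."""
    return _FIELD_NAME.fullmatch(name) is not None
```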

6.3 Nested Fields

Nested fields represent object structures.

Syntax: fieldName(nestedField1,nestedField2,...)

Example:

users:id,profile(name,age,email):
1,(John Doe,30,john@example.com)
2,(Jane Smith,25,jane@example.com)

6.4 Deeply Nested Fields

Nesting can be arbitrarily deep.

Example:

users:id,profile(name,age,address(city,country)):
1,(John Doe,30,(New York,USA))
2,(Jane Smith,25,(London,UK))

JSON Equivalent:

{
  "users": [
    {
      "id": 1,
      "profile": {
        "name": "John Doe",
        "age": 30,
        "address": {
          "city": "New York",
          "country": "USA"
        }
      }
    },
    {
      "id": 2,
      "profile": {
        "name": "Jane Smith",
        "age": 25,
        "address": {
          "city": "London",
          "country": "UK"
        }
      }
    }
  ]
}
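Nested field lists lend themselves to a recursive parser. The Python sketch below is one possible approach under the assumption that the field list is well formed; the name `parse_fields` and the output shape (plain names for simple fields, `(name, children)` tuples for nested fields) are choices made for illustration.

```python
def parse_fields(spec: str) -> list:
    """Parse 'id,profile(name,age)' into ['id', ('profile', ['name', 'age'])]."""
    fields = []
    depth = 0
    start = 0
    # A sentinel comma at the end flushes the last field.
    for i, ch in enumerate(spec + ","):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            item = spec[start:i].strip()
            start = i + 1
            if not item:
                continue
            if item.endswith(")") and "(" in item:
                # Nested field: split the name from its inner field list.
                name, inner = item[:-1].split("(", 1)
                fields.append((name, parse_fields(inner)))
            else:
                fields.append(item)
    return fields
```

Because the function calls itself on the inner list, it handles any nesting depth that fits within Python's recursion limit.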

7. Data Line Syntax

7.1 Basic Data Lines

Data lines contain comma-separated values corresponding to fields in the header.

Rules:

Values must match header field order
Number of values must equal number of fields
Values are separated by commas
Leading/trailing whitespace is trimmed

Example:

users:id,name,age:
1,Alice,30
2,Bob,25

7.2 Value Parsing

Values are parsed in the following order:

Empty string → null
[] → Empty array
() → Empty object
[...] → Array
(...) → Object (if contains :) or Nested object (if in nested field)
Numeric string → Number (if parseable)
true/false → Boolean
Everything else → String

7.3 Nested Object Values

For fields defined with nested structure in header, values should be wrapped in parentheses.

Example:

users:id,profile(name,age):
1,(Alice,30)
2,(Bob,25)

Invalid:

users:id,profile(name,age):
# WRONG: values are not wrapped in parentheses
1,Alice,30

Note (v1.1.0+): The parser now supports dynamic field recognition. If a value doesn't match the expected nested format (e.g., an array [...] instead of an object (...)), the parser will treat it as a regular value instead of throwing an error. This allows for more flexible data structures while maintaining backward compatibility.

7.4 Multiple Data Lines

Multiple data lines create an array of objects.

Example:

users:id,name:
1,Alice
2,Bob
3,Charlie

Result:

{
  "users": [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Charlie"}
  ]
}

8. Nested Structures

8.1 Objects within Objects

Header Definition:

users:id,profile(name,contact(email,phone)):

Data:

1,(John,(john@example.com,555-1234))

Result:

{
  "users": [{
    "id": 1,
    "profile": {
      "name": "John",
      "contact": {
        "email": "john@example.com",
        "phone": "555-1234"
      }
    }
  }]
}

8.2 Arrays within Objects

Example:

users:id,name,tags:
1,Alice,[admin,user]
2,Bob,[guest]
3,Charlie,[]

8.3 Objects within Arrays

Inline Objects:

data:
[(id:1,name:Alice),(id:2,name:Bob)]

Result:

{
  "data": [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"}
  ]
}

8.4 Complex Nesting

Combining all nesting types:

records:id,data(values,metadata(tags,settings(options))):
1,([1,2,3],([dev,test],((verbose:true,debug:false))))

Result:

{
  "records": [{
    "id": 1,
    "data": {
      "values": [1, 2, 3],
      "metadata": {
        "tags": ["dev", "test"],
        "settings": {
          "options": {
            "verbose": true,
            "debug": false
          }
        }
      }
    }
  }]
}

9. Escape Sequences

9.1 Purpose

Escape sequences allow special characters to be included in string values.

9.2 Supported Escape Sequences

Escape Sequence	Character	Description
`\\`	`\`	Backslash
`\,`	`,`	Comma
`\(`	`(`	Left parenthesis
`\)`	`)`	Right parenthesis
`\[`	`[`	Left square bracket
`\]`	`]`	Right square bracket
`\n`	Line Feed	Newline
`\t`	Tab	Horizontal tab
`\r`	Carriage Return	Carriage return

9.3 Examples

Escaping Delimiters

messages:id,text:
1,\(Hello\, World!\)
2,Price: $99\,99
3,Use backslash: \\
4,Array syntax: \[1\,2\,3\]

Result:

{
  "messages": [
    {"id": 1, "text": "(Hello, World!)"},
    {"id": 2, "text": "Price: $99,99"},
    {"id": 3, "text": "Use backslash: \\"},
    {"id": 4, "text": "Array syntax: [1,2,3]"}
  ]
}

Newlines and Tabs

texts:id,content:
1,First line\nSecond line\nThird line
2,Name:\tJohn\nAge:\t30

Result:

{
  "texts": [
    {"id": 1, "content": "First line\nSecond line\nThird line"},
    {"id": 2, "content": "Name:\tJohn\nAge:\t30"}
  ]
}

9.4 Escape Processing

Processing Rules:

Backslash followed by recognized character → Replace with escaped character
Backslash followed by unrecognized character → Keep the character, remove backslash
Backslash at end of string → Keep backslash
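The three processing rules translate fairly directly into code. The following Python sketch is one way to write them; the function name `unescape` matches the pseudocode in Section 12.1, but the implementation itself is an illustration rather than the reference one.

```python
# Escapes that map to a different character; all others map to themselves.
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r"}


def unescape(text: str) -> str:
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ch)
            i += 1
        elif i + 1 == len(text):
            # Rule 3: a backslash at the end of the string is kept.
            out.append("\\")
            i += 1
        else:
            # Rules 1 and 2: drop the backslash; translate n, t, r and
            # keep any other following character as-is.
            nxt = text[i + 1]
            out.append(_ESCAPES.get(nxt, nxt))
            i += 2
    return "".join(out)
```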

10. Comments

10.1 Syntax

Comments start with # at the beginning of a line.

Characteristics:

Line comments only (no inline comments)
# must be the first non-whitespace character
Everything after # to end of line is ignored
Comments can appear anywhere in the document

10.2 Examples

# This is a comment
users:id,name:  # This is NOT a comment (not at line start)
1,Alice
# Another comment
2,Bob
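Detecting a comment line only requires looking at the first non-whitespace character. A minimal Python sketch, with the hypothetical name `is_comment`:

```python
def is_comment(line: str) -> bool:
    """True when '#' is the first non-whitespace character of the line."""
    return line.lstrip(" \t").startswith("#")
```

A `#` that appears later in a line, as in the header of the example above, does not make the line a comment and is treated as part of the line's content.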

10.3 Documentation Style

# User Database
# Format: ID, Name, Email, Active Status
# Last Updated: 2025-01-15

users:id,name,email,active:
1001,Alice Williams,alice@example.com,true
1002,Bob Johnson,bob@example.com,false

11. Complete Examples

11.1 Basic Object Array

users:age,id,name:
30,1,John Doe
25,2,Jane Smith
35,3,Bob Johnson

11.2 Simple Array

colors:
[red,green,blue,yellow]

11.3 Top-Level Single Object

:id,name,email,active:
1001,Alice Williams,alice@example.com,true

11.4 Nested Objects

users:id,profile(name,age,address(city,country)):
1,(John Doe,30,(New York,USA))
2,(Jane Smith,25,(London,UK))

11.5 Nested Array

matrix:
[[1,2,3],[4,5,6],[7,8,9]]

11.6 Mixed Array

data:
[42,hello world,true,(id:100,active:false),[1,2,3]]

11.7 Multiple Sections

products:id,name,price:
101,Laptop,999.99
102,Mouse,29.99
103,Keyboard,79.99

categories:id,name:
1,Electronics
2,Accessories

11.8 Null and Empty Values

records:id,name,email,tags:
1,John Doe,,[]
2,Jane Smith,jane@example.com,()
3,Bob,bob@example.com,[admin,user]

11.9 Escape Characters

messages:id,text:
1,\(Hello\, World!\)
2,Price: $99\,99
3,Use backslash: \\
4,Array syntax: \[1\,2\,3\]

11.10 Newline and Tab

texts:id,content:
1,First line\nSecond line\nThird line
2,Name:\tJohn\nAge:\t30
3,Multi\nLine\nText

11.11 Boolean Values

settings:id,feature,enabled,verified:
1,notifications,true,false
2,dark_mode,false,true
3,auto_save,true,true

12. Parsing Rules

12.1 Value Type Detection

Values are parsed using the following algorithm:

function parse_value(string):
  trimmed = trim(string)

  if trimmed is empty:
    return null

  if trimmed == "[]":
    return empty_array

  if trimmed == "()":
    return empty_object

  if trimmed starts with '[' and ends with ']':
    return parse_array(trimmed)

  if trimmed starts with '(' and ends with ')':
    if contains ':' at depth 0:
      return parse_inline_object(trimmed)
    else:
      return parse_nested_object(trimmed)

  unescaped = unescape(trimmed)

  if unescaped is valid number:
    return number

  if unescaped == "true":
    return true

  if unescaped == "false":
    return false

  return string

12.2 Delimiter Depth Tracking

When parsing values, track nesting depth to correctly identify delimiters:

depth = 0
bracket_depth = 0

for each character:
  if character == '(':
    depth++
  else if character == ')':
    depth--
  else if character == '[':
    bracket_depth++
  else if character == ']':
    bracket_depth--
  else if character == ',' and depth == 0 and bracket_depth == 0:
    # This comma is a value separator
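The pseudocode in Sections 12.1 and 12.2 can be combined into a small runnable sketch. The Python below is an illustration under stated assumptions, not the reference parser: it treats a backslash pair as never being a delimiter (the depth-tracking pseudocode does not mention escapes), it treats a parenthesized value as an inline object only when every item has a key, and it returns positional values such as `(Alice,30)` as a list because field names come from the header. All function names other than `parse_value` and `unescape` are hypothetical.

```python
import re

_NUMBER = re.compile(r"-?\d+(\.\d+)?")
_ESCAPES = {"n": "\n", "t": "\t", "r": "\r"}


def unescape(text: str) -> str:
    # Drop each backslash and keep (or translate) the character after it.
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(1)),
                  text, flags=re.S)


def split_top_level(text: str, sep: str = ",") -> list[str]:
    """Split on sep only where both nesting depths are zero."""
    parts, buf = [], []
    depth = bracket_depth = 0
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            buf.append(text[i:i + 2])  # escaped pair is never a delimiter
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "[":
            bracket_depth += 1
        elif ch == "]":
            bracket_depth -= 1
        if ch == sep and depth == 0 and bracket_depth == 0:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    parts.append("".join(buf))
    return parts


def parse_value(text: str):
    trimmed = text.strip()
    if trimmed == "":
        return None
    if trimmed == "[]":
        return []
    if trimmed == "()":
        return {}
    if trimmed.startswith("[") and trimmed.endswith("]"):
        return [parse_value(p) for p in split_top_level(trimmed[1:-1])]
    if trimmed.startswith("(") and trimmed.endswith(")"):
        items = split_top_level(trimmed[1:-1])
        pairs = [split_top_level(item, ":") for item in items]
        if all(len(p) >= 2 for p in pairs):  # key:value items, inline object
            return {unescape(p[0].strip()): parse_value(":".join(p[1:]))
                    for p in pairs}
        return [parse_value(item) for item in items]  # positional values
    unescaped = unescape(trimmed)
    if _NUMBER.fullmatch(unescaped):
        return float(unescaped) if "." in unescaped else int(unescaped)
    if unescaped == "true":
        return True
    if unescaped == "false":
        return False
    return unescaped
```

A data line would first be split with `split_top_level(line)` and each piece then passed to `parse_value`.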

12.3 Field Count Validation

Rule: Number of values in data line must exactly match number of fields in header.

Example Error:

users:id,name,age:
# ERROR: Expected 3 values, got 2
1,Alice
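A field count check fits naturally into the step that pairs header fields with parsed values. The Python sketch below assumes flat (non-nested) fields and already-parsed values; `build_record` is a hypothetical helper, and the error text follows the message style shown in Section 13.5.

```python
def build_record(fields: list[str], values: list, line_no: int) -> dict:
    """Pair header fields with values, enforcing an exact count match."""
    if len(values) != len(fields):
        raise ValueError(
            f"Line {line_no}: Expected {len(fields)} values but got {len(values)}"
        )
    # Positional mapping: the n-th value belongs to the n-th field.
    return dict(zip(fields, values))
```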

12.4 Nested Object Validation

Rule: For nested fields, values should be wrapped in parentheses and contain correct number of nested values.

Example Error:

users:id,profile(name,age):
# ERROR: Expected 2 nested values, got 1
1,(Alice)

Dynamic Field Recognition (v1.1.0+):

When a nested field receives a value that doesn't match the expected format, the parser will attempt to parse it as a regular value:

users:id,profile(name,age):
# Parsed as an array instead of a nested object
1,[x,y,z]

This behavior allows the parser to handle non-uniform data structures where the same field may contain different types across records. The generator will detect such non-uniform arrays and output them using inline object format instead of tabular format.
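The uniformity check on the generator side could be sketched as follows. This Python example makes two assumptions that the specification does not spell out: "value type" is compared at the level of object, array, or scalar (so a null and a string in the same field still count as uniform), and only the top level of each record is inspected. The names `kind` and `is_uniform` are hypothetical.

```python
def kind(value) -> str:
    """Coarse structural type used for the uniformity comparison."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    return "scalar"


def is_uniform(records: list) -> bool:
    """True when all records share the same keys and structural value kinds."""
    if not records or not all(isinstance(r, dict) for r in records):
        return False
    first = {key: kind(value) for key, value in records[0].items()}
    return all(
        {key: kind(value) for key, value in r.items()} == first
        for r in records[1:]
    )
```

A generator following this approach would emit the tabular format when `is_uniform` returns true and the inline object format otherwise.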

13. Best Practices

13.1 When to Use ORT

Good Use Cases:

Uniform data structures (same fields across records)
Large datasets for LLM consumption
Token-optimized data transfer
Structured logging

Poor Use Cases:

Heterogeneous data (different fields per record)
Direct application configuration
Human-primary editing scenarios
Data with frequent schema changes

13.2 Naming Conventions

Field Names:

Use camelCase or snake_case consistently
Keep names concise but descriptive
Avoid abbreviations unless widely understood

Examples:

# Good
users:id,firstName,lastName,emailAddress:

# Acceptable
users:id,first_name,last_name,email_address:

# Avoid: too cryptic
users:i,fn,ln,ea:

13.3 Structure Design

Prefer Flat Over Nested (when reasonable):

# Better for token efficiency
users:id,name,city,country:
1,John,New York,USA

# More nested, but more tokens
users:id,name,address(city,country):
1,John,(New York,USA)

Use Nesting for Logical Grouping:

# Good use of nesting
users:id,profile(name,age,email),settings(theme,language):

13.4 Data Organization

Group Related Sections:

# Good: Related data together
users:id,name:
1,Alice
2,Bob

user_roles:user_id,role:
1,admin
2,user

Use Comments for Clarity:

# User Master Data
users:id,name,email:
1,Alice,alice@example.com

# User Permissions
permissions:user_id,resource,access:
1,/admin,read-write

13.5 Error Handling

Always Validate:

Field count matches value count
Nested structures are properly formed
Escape sequences are valid
Data types are appropriate

Provide Clear Error Messages:

Line 5: Expected 3 values but got 2
  1,Alice

13.6 Performance Considerations

For Large Datasets:

Stream parsing when possible
Validate headers before processing data
Use appropriate buffer sizes
Consider memory constraints for nested structures

Token Optimization:

Minimize field name lengths (while maintaining clarity)
Use flat structures when appropriate
Avoid redundant nesting

Appendix A: Type Conversion Table

ORT Value	Parsed Type	Notes
(empty)	null	Empty string
`true`	boolean	Lowercase only
`false`	boolean	Lowercase only
`42`	number	Integer
`3.14`	number	Float
`-17`	number	Negative number
`hello`	string	Raw text
`[]`	array	Empty array
`[1,2,3]`	array	Array of numbers
`()`	object	Empty object
`(a:1,b:2)`	object	Inline object
`(val1,val2)`	object	Nested object (context-dependent)

Appendix B: Character Encoding

ORT documents must be encoded in UTF-8. Parsers should:

Accept UTF-8 with or without BOM
Reject invalid UTF-8 sequences
Preserve Unicode characters in string values
Handle surrogate pairs correctly
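In Python, the first two requirements can be met with the standard `utf-8-sig` codec, which strips a leading BOM when present and otherwise behaves like plain UTF-8. A minimal sketch, where `decode_ort` is a hypothetical name:

```python
def decode_ort(data: bytes) -> str:
    """Decode an ORT document, accepting an optional UTF-8 BOM."""
    # The default error mode is strict, so invalid UTF-8 sequences
    # raise UnicodeDecodeError instead of being silently replaced.
    return data.decode("utf-8-sig")
```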

Appendix C: Implementation Notes

C.1 Parser Requirements

A compliant ORT parser must:

Support all data types defined in Section 4
Implement escape sequence processing (Section 9.2)
Validate field/value count matching (Section 12.3)
Handle arbitrary nesting depth (Section 8)
Ignore comments and empty lines (Section 10)

C.2 Generator Requirements

A compliant ORT generator must:

Escape special characters in string values
Generate valid headers for object arrays
Maintain field order consistency
Output minimal whitespace
Use UTF-8 encoding
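The escaping requirement is the inverse of the table in Section 9.2. The Python sketch below shows one way a generator might escape a string value; `escape_value` is a hypothetical name, and the sketch does not address strings that would otherwise be read back as numbers, booleans, or null.

```python
# Inverse of the escape table in Section 9.2.
_SPECIAL = {
    "\\": "\\\\", ",": "\\,",
    "(": "\\(", ")": "\\)",
    "[": "\\[", "]": "\\]",
    "\n": "\\n", "\t": "\\t", "\r": "\\r",
}


def escape_value(text: str) -> str:
    """Escape reserved and control characters in a string value."""
    return "".join(_SPECIAL.get(ch, ch) for ch in text)
```

For instance, the string `(Hello, World!)` is written as `\(Hello\, World!\)`, matching the example in Section 9.3.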

C.3 Error Handling

Implementations should provide clear error messages including:

Line number
Problematic content
Description of the error
Suggested fix (when applicable)

14. Changelog

Version 1.1.0

Release Date: 2025

New Features

Dynamic Field Recognition in Parser
- Parser now handles cases where a field is defined as nested in the header but receives a different value type
- Arrays can now be parsed even when the header defines a nested object structure
- Fallback parsing for values that don't match the expected nested format
Improved Uniform Array Detection in Generator
- Generator now checks both key names AND value types when determining if an array is uniform
- Arrays with same keys but different value types (e.g., object vs array) are now correctly identified as non-uniform
- Non-uniform arrays are generated using inline object format instead of tabular format

Bug Fixes

Fixed parsing error "Expected nested object in parentheses" when array values appear in nested field positions
Fixed parsing error "Expected X values but got Y" for non-uniform object arrays

Example

JSON with non-uniform array:

{
  "test": [
    { "input": { "pairs": [["a","b"],["c,d","e:f",true]] } },
    { "input": ["x", "y", "true", true, 10] }
  ]
}

Previous behavior (v1.0.1): Generated invalid ORT that couldn't be parsed back

New behavior (v1.1.0): Generates valid inline object format

test:
[(input:(pairs:[[a,b],[c\,d,e:f,true]])),(input:[x,y,true,true,10])]

Version 1.0.1

Release Date: 2025

Initial stable release
Full support for all data types (null, boolean, number, string, array, object)
Nested field syntax in headers
Escape sequence support
Multi-language implementations (Rust, TypeScript, Python)

End of Specification

