Commit 4e65a90

feat: add Table.from_records and Table.from_file factory methods
Add two factory classmethods to Table for convenient initialization:

- Table.from_records(path, records, key): Create a table with initial records. Validates all records before writing, writes atomically with a header, and provides indexed error messages for debugging.
- Table.from_file(path, key=None): Load an existing file with automatic key detection from the header. Raises FileError if file doesn't exist.

Update README quick start and examples to use the new factory methods.
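The "validate everything, then write atomically, with indexed errors" behavior described in the commit message can be sketched with the standard library alone. This is a minimal illustration and not the library's implementation: `write_all_or_nothing` and `RecordError` are hypothetical names, the only validation is a key-presence check, and the header shape is copied from the README example in this commit. Field ordering in the real library's output may differ.

```python
import json
import os
import tempfile
from pathlib import Path


class RecordError(ValueError):
    """Hypothetical stand-in for the library's validation errors."""


def write_all_or_nothing(path, records, key):
    """Validate every record first, then write header + records in one step."""
    # Header shape taken from the README example in this commit.
    header = {"$jsonlt": {"key": key, "version": 1}}
    lines = [json.dumps(header, separators=(",", ":"))]

    for index, record in enumerate(records):
        try:
            if key not in record:
                raise RecordError(f"missing key field {key!r}")
            lines.append(json.dumps(record, separators=(",", ":")))
        except RecordError as e:
            # Re-raise the same exception type with the record position prepended.
            raise type(e)(f"record at index {index}: {e}") from e

    # Nothing has touched the target path until this point.
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary file in the same directory, then rename it over
    # the target so readers never observe a partially written file.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write("\n".join(lines) + "\n")
        os.replace(tmp_name, target)
    except BaseException:
        os.unlink(tmp_name)
        raise
```

Because validation finishes before the temporary file is created, a bad record at any index leaves the target path untouched, which matches the "no file is written" guarantee stated in the `from_records` docstring.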
1 parent 46d2228 commit 4e65a90

3 files changed: +560 -37 lines changed
README.md

Lines changed: 47 additions & 32 deletions
@@ -37,20 +37,29 @@ Requires Python 3.10 or later.
 ```python
 from jsonlt import Table

-# Open or create a table
-table = Table("users.jsonlt", key="id")
-
-# Insert or update records
-table.put({"id": "alice", "role": "admin", "email": "alice@example.com"})
-table.put({"id": "bob", "role": "user", "email": "bob@example.com"})
+# Create a table with initial records
+table = Table.from_records(
+    "users.jsonlt",
+    [
+        {"id": "alice", "role": "admin", "email": "alice@example.com"},
+        {"id": "bob", "role": "user", "email": "bob@example.com"},
+    ],
+    key="id",
+)

 # Read records
 user = table.get("alice")  # Returns the record or None
 exists = table.has("bob")  # Returns True

+# Update a record
+table.put({"id": "alice", "role": "admin", "email": "alice@newdomain.com"})
+
 # Delete records (appends a tombstone)
 table.delete("bob")

+# Later, load the existing table
+table = Table.from_file("users.jsonlt")
+
 # Iterate over all records
 for record in table.all():
     print(record)
@@ -59,9 +68,11 @@ for record in table.all():
 The underlying file after these operations:

 ```jsonl
-{"id": "alice", "role": "admin", "email": "alice@example.com"}
-{"id": "bob", "role": "user", "email": "bob@example.com"}
-{"id": "bob", "$deleted": true}
+{"$jsonlt":{"key":"id","version":1}}
+{"id":"alice","email":"alice@example.com","role":"admin"}
+{"id":"bob","email":"bob@example.com","role":"user"}
+{"id":"alice","email":"alice@newdomain.com","role":"admin"}
+{"id":"bob","$deleted":true}
 ```

 ## When to use JSONLT
@@ -75,10 +86,14 @@ JSONLT is not a database. For large datasets, high write throughput, or complex
 JSONLT supports multi-field compound keys for composite identifiers:

 ```python
-orders = Table("orders.jsonlt", key=("customer_id", "order_id"))
-
-orders.put({"customer_id": "alice", "order_id": 1, "total": 99.99})
-orders.put({"customer_id": "alice", "order_id": 2, "total": 149.99})
+orders = Table.from_records(
+    "orders.jsonlt",
+    [
+        {"customer_id": "alice", "order_id": 1, "total": 99.99},
+        {"customer_id": "alice", "order_id": 2, "total": 149.99},
+    ],
+    key=("customer_id", "order_id"),
+)

 order = orders.get(("alice", 1))
 ```
@@ -136,23 +151,25 @@ table.reload()

 ### Table

-| Method                        | Description                    |
-|-------------------------------|--------------------------------|
-| `Table(path, key)`            | Open or create a table         |
-| `get(key)`                    | Get a record by key, or `None` |
-| `has(key)`                    | Check if a key exists          |
-| `put(record)`                 | Insert or update a record      |
-| `delete(key)`                 | Delete a record                |
-| `all()`                       | Iterate all records            |
-| `keys()`                      | Iterate all keys               |
-| `items()`                     | Iterate (key, record) pairs    |
-| `count()`                     | Number of records              |
-| `find(predicate, limit=None)` | Find matching records          |
-| `find_one(predicate)`         | Find first match               |
-| `transaction()`               | Start a transaction            |
-| `compact()`                   | Remove historical entries      |
-| `clear()`                     | Remove all records             |
-| `reload()`                    | Reload from disk               |
+| Method                                   | Description                    |
+|------------------------------------------|--------------------------------|
+| `Table(path, key)`                       | Open or create a table         |
+| `Table.from_records(path, records, key)` | Create table with records      |
+| `Table.from_file(path)`                  | Load existing table            |
+| `get(key)`                               | Get a record by key, or `None` |
+| `has(key)`                               | Check if a key exists          |
+| `put(record)`                            | Insert or update a record      |
+| `delete(key)`                            | Delete a record                |
+| `all()`                                  | Iterate all records            |
+| `keys()`                                 | Iterate all keys               |
+| `items()`                                | Iterate (key, record) pairs    |
+| `count()`                                | Number of records              |
+| `find(predicate, limit=None)`            | Find matching records          |
+| `find_one(predicate)`                    | Find first match               |
+| `transaction()`                          | Start a transaction            |
+| `compact()`                              | Remove historical entries      |
+| `clear()`                                | Remove all records             |
+| `reload()`                               | Reload from disk               |

 The `Table` class also supports `len(table)`, `key in table`, and `for record in table`.

@@ -195,8 +212,6 @@ The JSONLT format draws from related work including [BEADS](https://github.com/s

 The development of this library involved AI language models, specifically Claude (Anthropic). AI tools contributed to drafting code, tests, and documentation. Human authors made all design decisions and final implementations, and they reviewed, edited, and validated AI-generated content. The authors take full responsibility for the correctness of this software.

-This disclosure promotes transparency about modern software development practices.
-
 ## License

 MIT License. See [LICENSE](LICENSE) for details.
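The README hunk above shows the file as an append-only log: a header line, two inserts, an update, and a tombstone. One way to read such a log is to replay it line by line, letting later entries win. The following is a rough sketch of that idea, not the library's `compute_logical_state`; the `replay` function is a hypothetical name, and it assumes a header is present and the key is a single field.

```python
import json


def replay(lines):
    """Fold a JSONLT-style log into {key: record}; later lines win."""
    key_field = None
    state = {}
    for line in lines:
        entry = json.loads(line)
        if "$jsonlt" in entry:
            # The header names the key field (single-field keys only here).
            key_field = entry["$jsonlt"]["key"]
            continue
        key = entry[key_field]
        if entry.get("$deleted") is True:
            state.pop(key, None)  # a tombstone removes the record
        else:
            state[key] = entry  # a put inserts or overwrites
    return state


# The five lines from the README example in this commit.
log = [
    '{"$jsonlt":{"key":"id","version":1}}',
    '{"id":"alice","email":"alice@example.com","role":"admin"}',
    '{"id":"bob","email":"bob@example.com","role":"user"}',
    '{"id":"alice","email":"alice@newdomain.com","role":"admin"}',
    '{"id":"bob","$deleted":true}',
]
state = replay(log)
```

After replay, `state` holds only `alice`, with the updated `alice@newdomain.com` address; `bob` was removed by the tombstone.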

src/jsonlt/_table.py

Lines changed: 176 additions & 5 deletions
@@ -7,10 +7,10 @@
 # pyright: reportImportCycles=false

 from pathlib import Path
-from typing import TYPE_CHECKING, ClassVar
+from typing import TYPE_CHECKING, ClassVar, cast
 from typing_extensions import override

-from ._constants import MAX_RECORD_SIZE
+from ._constants import JSONLT_VERSION, MAX_RECORD_SIZE
 from ._encoding import validate_no_surrogates
 from ._exceptions import (
     ConflictError,
@@ -20,7 +20,7 @@
     TransactionError,
 )
 from ._filesystem import FileSystem, RealFileSystem
-from ._header import serialize_header
+from ._header import Header, serialize_header
 from ._json import serialize_json, utf8_byte_length
 from ._keys import (
     Key,
@@ -36,8 +36,9 @@
 from ._state import compute_logical_state

 if TYPE_CHECKING:
-    from ._header import Header
-    from ._json import JSONObject
+    from collections.abc import Iterable, Mapping
+
+    from ._json import JSONObject, JSONValue
     from ._transaction import Transaction

 __all__ = ["Table"]
@@ -144,6 +145,176 @@ def __init__(
         # Initial load
         self._load(key)

+    @classmethod
+    def from_records(  # noqa: PLR0913
+        cls,
+        path: "Path | str",
+        records: "Mapping[str, object] | Iterable[Mapping[str, object]]",
+        key: KeySpecifier,
+        *,
+        auto_reload: bool = True,
+        lock_timeout: float | None = None,
+        max_file_size: int | None = None,
+        _fs: "FileSystem | None" = None,
+    ) -> "Table":
+        """Create a table from a list of records.
+
+        Creates a new file at the specified path with the given records.
+        All records are validated before writing, and the file is written
+        atomically. If any record is invalid, no file is written.
+
+        A header with the key specifier is always written, making the
+        file self-describing.
+
+        Args:
+            path: Path to create the JSONLT file at.
+            records: A single record dict or iterable of record dicts.
+            key: Key specifier for the table.
+            auto_reload: If True (default), check for file changes before
+                each read operation and reload if necessary.
+            lock_timeout: Maximum seconds to wait for file lock on write
+                operations. None means wait indefinitely.
+            max_file_size: Maximum allowed file size in bytes when loading.
+                If the file exceeds this limit, LimitError is raised.
+            _fs: Internal filesystem abstraction for testing. Do not use.
+
+        Returns:
+            A new Table instance backed by the created file.
+
+        Raises:
+            InvalidKeyError: If any record is missing required key fields,
+                has invalid key values, or contains $-prefixed fields.
+            LimitError: If any key exceeds 1024 bytes or any record exceeds 1 MiB.
+            FileError: If the file cannot be created.
+
+        Example:
+            >>> table = Table.from_records(
+            ...     "users.jsonlt",
+            ...     [
+            ...         {"id": "alice", "role": "admin"},
+            ...         {"id": "bob", "role": "user"},
+            ...     ],
+            ...     key="id",
+            ... )
+            >>> table.count()
+            2
+        """
+        file_path = Path(path) if isinstance(path, str) else path
+        fs = RealFileSystem() if _fs is None else _fs
+        normalized_key = normalize_key_specifier(key)
+
+        # Normalize records: single dict -> list
+        if isinstance(records, dict):
+            record_list = cast("list[Mapping[str, object]]", [records])
+        else:
+            record_list = cast("list[Mapping[str, object]]", list(records))
+
+        # Build lines: header + validated records
+        lines: list[str] = [
+            serialize_header(Header(version=JSONLT_VERSION, key=normalized_key))
+        ]
+
+        for index, record in enumerate(record_list):
+            try:
+                record_value = cast("JSONValue", record)
+                record_obj = cast("JSONObject", record)
+
+                validate_no_surrogates(record_value)
+                validate_record(record_obj, normalized_key)
+                extracted_key = extract_key(record_obj, normalized_key)
+                validate_key_length(extracted_key)
+
+                serialized = serialize_json(record)
+                if utf8_byte_length(serialized) > MAX_RECORD_SIZE:
+                    msg = f"record size exceeds maximum {MAX_RECORD_SIZE}"
+                    raise LimitError(msg)  # noqa: TRY301
+
+                lines.append(serialized)
+            except (InvalidKeyError, LimitError) as e:  # noqa: PERF203
+                msg = f"record at index {index}: {e}"
+                raise type(e)(msg) from e
+
+        fs.ensure_parent_dir(file_path)
+        fs.atomic_replace(file_path, lines)
+
+        return cls(
+            file_path,
+            key=normalized_key,
+            auto_reload=auto_reload,
+            lock_timeout=lock_timeout,
+            max_file_size=max_file_size,
+            _fs=_fs,
+        )
+
+    @classmethod
+    def from_file(
+        cls,
+        path: "Path | str",
+        key: "KeySpecifier | None" = None,
+        *,
+        auto_reload: bool = True,
+        lock_timeout: float | None = None,
+        max_file_size: int | None = None,
+        _fs: "FileSystem | None" = None,
+    ) -> "Table":
+        """Load a table from an existing file.
+
+        Opens an existing JSONLT file. If the file has a header with a
+        key specifier, uses that key. An explicit key parameter can be
+        provided to override or when the file has no header.
+
+        This method is semantically equivalent to the Table constructor
+        but explicitly indicates the intent to load an existing file
+        (as opposed to potentially creating a new one).
+
+        Args:
+            path: Path to the existing JSONLT file.
+            key: Optional key specifier. If None, auto-detected from the
+                file header. If provided, must match the header key (if any).
+            auto_reload: If True (default), check for file changes before
+                each read operation and reload if necessary.
+            lock_timeout: Maximum seconds to wait for file lock on write
+                operations. None means wait indefinitely.
+            max_file_size: Maximum allowed file size in bytes when loading.
+                If the file exceeds this limit, LimitError is raised.
+            _fs: Internal filesystem abstraction for testing. Do not use.
+
+        Returns:
+            A Table instance backed by the file.
+
+        Raises:
+            FileError: If the file does not exist or cannot be read.
+            InvalidKeyError: If no key specifier can be determined (file
+                has no header and key not provided), or if the provided
+                key doesn't match the header key.
+            ParseError: If the file contains invalid content.
+
+        Example:
+            >>> # File has header with key
+            >>> table = Table.from_file("users.jsonlt")
+            >>> table.key_specifier
+            'id'
+
+            >>> # File without header, provide key explicitly
+            >>> table = Table.from_file("data.jsonlt", key="name")
+        """
+        file_path = Path(path) if isinstance(path, str) else path
+        fs = RealFileSystem() if _fs is None else _fs
+
+        stats = fs.stat(file_path)
+        if not stats.exists:
+            msg = f"file not found: {file_path}"
+            raise FileError(msg)
+
+        return cls(
+            file_path,
+            key=key,
+            auto_reload=auto_reload,
+            lock_timeout=lock_timeout,
+            max_file_size=max_file_size,
+            _fs=_fs,
+        )
+
     def _load(self, caller_key: "KeySpecifier | None" = None) -> None:
         """Load or reload the table from disk.
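The key-resolution rules in the `from_file` docstring (use the header key, allow an explicit key when there is no header, reject a mismatch) can be illustrated with a small standalone function. This is only a sketch under assumptions: `resolve_key` is a hypothetical helper that is not part of jsonlt, it uses `ValueError` where the library documents `InvalidKeyError`, and it compares keys as plain values, so compound keys (tuple versus JSON list) are not handled.

```python
import json


def resolve_key(first_line, caller_key=None):
    """Pick the key specifier from a file's first line and/or the caller."""
    entry = json.loads(first_line) if first_line.strip() else {}
    header = entry.get("$jsonlt") if isinstance(entry, dict) else None
    header_key = header.get("key") if isinstance(header, dict) else None

    if header_key is None and caller_key is None:
        # Neither source supplies a key.
        raise ValueError("no key specifier: file has no header and key not provided")
    if header_key is not None and caller_key is not None and header_key != caller_key:
        # An explicit key must agree with the header when both exist.
        raise ValueError(f"key {caller_key!r} does not match header key {header_key!r}")
    return caller_key if header_key is None else header_key


# Header present: the key is detected automatically.
detected = resolve_key('{"$jsonlt":{"key":"id","version":1}}')
# No header: the caller must supply the key.
explicit = resolve_key('{"name":"x","value":1}', caller_key="name")
```

Here `detected` is `"id"` and `explicit` is `"name"`; passing `caller_key="name"` together with the first header line would raise, mirroring the documented mismatch case.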
