Herodote

Herodote is a Rust command-line tool that converts GPT-generated conversation data (in JSON format) into structured Markdown files. It is modular, writes files in parallel for performance, and needs only an input file and an output folder, which makes it practical for archiving, publishing, or analyzing GPT conversations.

Features

JSON Parsing: Reads GPT conversation data stored in JSON format.
Markdown Export: Converts conversations into clean, human-readable Markdown files.
Parallel Processing: Uses multi-threading (via rayon) for efficient file writing, even with large datasets.
Customizable Output: Normalizes filenames and ensures compatibility with Markdown editors.
Error Handling: Handles file system and parsing errors gracefully.
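
To make the parallel-processing feature concrete, here is a minimal sketch of writing many Markdown files concurrently. The project itself uses rayon; this sketch swaps in std::thread::scope from the standard library so that it runs without extra crates. The function name write_all_parallel and its workers parameter are hypothetical and not taken from the project's source.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread;

/// Writes each (file name, contents) pair into `dir`, splitting the files
/// across up to `workers` scoped threads. Returns the written paths in input
/// order, or the first I/O error encountered.
/// Hypothetical helper: the real project uses rayon instead of std threads.
fn write_all_parallel(
    dir: &Path,
    files: &[(String, String)],
    workers: usize,
) -> io::Result<Vec<PathBuf>> {
    fs::create_dir_all(dir)?;

    // Ceiling division so that at most `workers` chunks are produced.
    let workers = workers.max(1);
    let chunk_size = ((files.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| -> io::Result<Vec<PathBuf>> {
        // Spawn one thread per chunk; each thread writes its files in order.
        let handles: Vec<_> = files
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || -> io::Result<Vec<PathBuf>> {
                    let mut written = Vec::with_capacity(chunk.len());
                    for (name, contents) in chunk {
                        let path = dir.join(name);
                        fs::write(&path, contents)?;
                        written.push(path);
                    }
                    Ok(written)
                })
            })
            .collect();

        // Join in spawn order so the result keeps the input order.
        let mut all = Vec::new();
        for handle in handles {
            let written = handle
                .join()
                .map_err(|_| io::Error::new(io::ErrorKind::Other, "worker thread panicked"))??;
            all.extend(written);
        }
        Ok(all)
    })
}
```

With rayon, the same idea is usually expressed with a parallel iterator over the files, which also takes care of choosing the number of worker threads.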

Purpose

GPT tools often generate structured JSON data containing user interactions and assistant responses. This project provides a simple way to transform that raw data into Markdown files, which can then be:

Archived for later reference.
Published in blogs or documentation.
Analyzed for research or development purposes.

This project is well-suited for developers, researchers, or writers working with GPT-generated data who want a streamlined solution for managing and exporting conversations.

Build & Run

Build the project:
    cargo build

Run the tool:
    cargo run -- -i path/to/input.json -o path/to/output/

Run the tests:
    cargo test

Usage

Prerequisites

Rust installed on your machine (use Rustup to install).
A valid JSON file containing GPT interaction data.

Command-line Interface

This tool is designed to be used via the command line. Below are the supported options:

USAGE:
    herodote [OPTIONS]

OPTIONS:
    -i, --input <FILE>          Path to the input JSON file containing GPT conversations
    -o, --output-folder <DIR>   Path to the folder where Markdown files will be saved
    -h, --help                  Show this help message
    -V, --version               Show version information

Example Usage

Convert a JSON file of GPT conversations into Markdown files (the release binary below is produced by cargo build --release):

./target/release/herodote -i conversations.json -o output/

This will:

Parse conversations.json.
Create Markdown files in the output/ directory, one file per conversation.

Input File Format

The tool expects a JSON file with the following structure:

{
  "title": "Conversation Title",
  "update_time": 1672531200,
  "mapping": {
    "1": {
      "message": {
        "author": {
          "role": "user"
        },
        "content": {
          "parts": ["Hello!"]
        },
        "create_time": 1672531200
      }
    },
    "2": {
      "message": {
        "author": {
          "role": "assistant"
        },
        "content": {
          "parts": ["Hi! How can I assist you?"]
        },
        "create_time": 1672531220
      }
    }
  }
}

Each conversation is represented as a mapping of interaction nodes, where:

title is the conversation title.
mapping contains the individual user and assistant messages.
update_time specifies the last update timestamp of the conversation.
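
As an illustration of how this structure might be held in memory once parsed, here is a minimal sketch using plain Rust types. The type and function names (Message, Conversation, valid_messages) are hypothetical and not taken from model.rs. The real project presumably deserializes with Serde, which is left out here to keep the example dependency-free. Ordering messages by create_time and dropping blank text are assumptions about how the nodes are validated.

```rust
/// One node of `mapping`, flattened: the author role, the joined `parts`,
/// and the `create_time` timestamp (seconds since the Unix epoch).
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    text: String,
    create_time: i64,
}

/// A whole conversation: `title`, `update_time`, and the messages taken
/// from `mapping`.
#[derive(Debug, Clone, PartialEq)]
struct Conversation {
    title: String,
    update_time: i64,
    messages: Vec<Message>,
}

/// Returns the messages whose text is not blank, oldest first.
/// Assumption: `create_time` is a usable ordering key for the nodes.
fn valid_messages(conversation: &Conversation) -> Vec<&Message> {
    let mut kept: Vec<&Message> = conversation
        .messages
        .iter()
        .filter(|message| !message.text.trim().is_empty())
        .collect();
    // The sort is stable, so messages with equal timestamps keep their order.
    kept.sort_by_key(|message| message.create_time);
    kept
}
```

For the example above, valid_messages would yield the user message "Hello!" followed by the assistant reply.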

Output Example

For the above JSON, the tool generates a Markdown file like this:

File: 2023-01-01-Conversation_Title.md

# Conversation Title

## Question
Hello!

## Answer
Hi! How can I assist you?
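
The layout above can be reproduced with a short rendering function. This is a sketch under stated assumptions rather than the code in conversation_writer.rs: the name render_markdown is hypothetical, messages are passed as (role, text) pairs, and roles other than "user" and "assistant" are skipped.

```rust
/// Renders a conversation as Markdown: a top-level title, then one
/// `## Question` section per user message and one `## Answer` section per
/// assistant message, in the order given.
/// Hypothetical helper; skipping other roles is an assumption.
fn render_markdown(title: &str, messages: &[(&str, &str)]) -> String {
    let mut out = format!("# {}\n", title);
    for (role, text) in messages {
        let heading = match *role {
            "user" => "Question",
            "assistant" => "Answer",
            _ => continue, // e.g. system messages are not exported
        };
        // Blank line before each section, heading, then the message text.
        out.push_str(&format!("\n## {}\n{}\n", heading, text));
    }
    out
}
```

Calling render_markdown("Conversation Title", &[("user", "Hello!"), ("assistant", "Hi! How can I assist you?")]) returns the Markdown shown in the output example.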

Key Points

Efficient Multi-threading: Uses rayon for concurrent file writing, ensuring scalability for large datasets.
Data Validation: Ensures only valid interactions (e.g., non-empty text) are processed.
Filename Normalization: Converts titles into safe, human-readable filenames.
Extensibility: The modular design allows easy integration with other tools or formats (e.g., HTML export).
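
To illustrate filename normalization, here is a minimal sketch that builds a name such as 2023-01-01-Conversation_Title.md from a title and an update_time. The function names are hypothetical and the exact rules in utils.rs may differ. This sketch assumes that the date is taken in UTC, that every run of characters other than ASCII letters and digits becomes one underscore, and that an empty result falls back to "untitled".

```rust
/// Converts a count of days since 1970-01-01 into (year, month, day) in the
/// proleptic Gregorian calendar, using the well-known days-to-civil algorithm.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Replaces each run of characters other than ASCII letters and digits with
/// a single underscore and trims underscores at both ends.
/// Assumption: an empty result falls back to "untitled".
fn normalize_title(title: &str) -> String {
    let mut out = String::with_capacity(title.len());
    for c in title.chars() {
        if c.is_ascii_alphanumeric() {
            out.push(c);
        } else if !out.is_empty() && !out.ends_with('_') {
            out.push('_');
        }
    }
    let trimmed = out.trim_end_matches('_');
    if trimmed.is_empty() {
        "untitled".to_string()
    } else {
        trimmed.to_string()
    }
}

/// Builds `YYYY-MM-DD-Normalized_Title.md` from a title and a Unix timestamp
/// in seconds, interpreting the timestamp as UTC.
fn markdown_filename(title: &str, update_time: i64) -> String {
    let (year, month, day) = civil_from_days(update_time.div_euclid(86_400));
    format!("{:04}-{:02}-{:02}-{}.md", year, month, day, normalize_title(title))
}
```

For the input shown earlier, markdown_filename("Conversation Title", 1672531200) gives 2023-01-01-Conversation_Title.md, matching the output example.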

Development

Directory Structure

src/
├── conversation_writer.rs  # Handles Markdown file writing
├── main.rs                 # CLI entry point
├── model.rs                # Data structures for the GPT export and the target model
└── utils.rs                # Helper functions for filenames and dates

Contributing

Contributions are welcome! If you find a bug, want to suggest a feature, or improve documentation, feel free to open an issue or pull request.

To-Do List

Add support for HTML export.
Implement logging instead of eprintln!.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Rayon for parallel processing.
Serde for JSON parsing.
Rust community for its excellent libraries and tools.
