A lightweight Python script that uses the Google Gemini API to generate structured datasets in JSONL format.
It reads the base prompt from a .txt file whose path is set in your .env file, and writes each generated object as one line of a .jsonl file.
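The read-prompt and write-JSONL steps can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: the helper names `read_prompt` and `append_jsonl` are hypothetical, and it assumes the `PROMPT_FILE` variable from .env has already been loaded into the process environment.

```python
import json
import os
from pathlib import Path


def read_prompt(env_var: str = "PROMPT_FILE") -> str:
    """Return the text of the prompt file whose path is stored in env_var.

    Hypothetical helper: assumes the .env values are already present
    in os.environ (for example via a dotenv loader).
    """
    return Path(os.environ[env_var]).read_text(encoding="utf-8")


def append_jsonl(path: str, objects: list[dict]) -> None:
    """Append each object to path as one JSON document per line."""
    with open(path, "a", encoding="utf-8") as f:
        for obj in objects:
            # json.dumps emits no raw newlines, so each object stays on one line.
            f.write(json.dumps(obj, ensure_ascii=False) + "\n")
```

Opening the file in append mode means the results of each API call can be written as soon as they arrive, so a failure partway through does not lose earlier objects.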
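Clone the repository and install the dependencies:

    git clone https://github.com/erkerem2/gemini-jsonl-generator.git
    cd gemini-jsonl-generator
    pip install -r requirements.txt

Create a .env file containing your API key and the path to your prompt file:

    GEMINI_KEY="your_api_key"
    PROMPT_FILE="path/to/prompt.txt"
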
Run the script:

    python main.py \
      --total 100 \
      --per_call 20 \
      --outfile output.jsonl \
      --temperature 0.7

The script accepts the following arguments:

| Argument | Description | Default |
|---|---|---|
| `--api_key` | Gemini API key (can be read from .env) | `GEMINI_KEY` from .env |
| `--model` | Model name | gemini-2.5-flash-lite |
| `--total` | Total number of JSON objects to generate | 6 |
| `--per_call` | Number of objects per API call | 2 |
| `--outfile` | Output file name | output.jsonl |
| `--temperature` | Sampling temperature | 0.8 |
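
To show how these arguments fit together, here is a rough sketch of an `argparse` parser using the defaults documented above, plus a hypothetical `batch_sizes` helper that splits `--total` into per-call request sizes. How `main.py` actually parses its arguments, and how it handles a total that is not a multiple of `--per_call`, is an assumption here.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(description="Generate a JSONL dataset with Gemini.")
    # Falls back to the GEMINI_KEY environment variable (None if unset).
    parser.add_argument("--api_key", default=os.environ.get("GEMINI_KEY"))
    parser.add_argument("--model", default="gemini-2.5-flash-lite")
    parser.add_argument("--total", type=int, default=6)
    parser.add_argument("--per_call", type=int, default=2)
    parser.add_argument("--outfile", default="output.jsonl")
    parser.add_argument("--temperature", type=float, default=0.8)
    return parser


def batch_sizes(total: int, per_call: int) -> list[int]:
    """Split total into request sizes of at most per_call objects each.

    Assumption: a final, smaller call covers any remainder.
    """
    if total < 0 or per_call <= 0:
        raise ValueError("total must be >= 0 and per_call must be > 0")
    full_calls, remainder = divmod(total, per_call)
    return [per_call] * full_calls + ([remainder] if remainder else [])


args = build_parser().parse_args(["--total", "100", "--per_call", "20"])
print(batch_sizes(args.total, args.per_call))  # [20, 20, 20, 20, 20]
```

With the documented defaults (`--total 6`, `--per_call 2`), the same helper yields three calls of two objects each.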