FastAPI wrapper over the Groq API.
To run the API from a container, clone this repository to a remote host:

```
git clone https://github.com/A1eksMa/groq/
```

Put your Groq token into the `YOUR_SECRET_GROQ_TOKEN` variable in `groq/groq/config.py` (or create a `groq/groq/.env` file). Configure the port in the `PORT` variable in `groq/config.py`.
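For reference, a minimal `config.py` might look like the sketch below. Only the names `YOUR_SECRET_GROQ_TOKEN` and `PORT` come from this README; loading an optional `.env` file via `python-dotenv` is an assumption.

```python
# groq/config.py -- hypothetical sketch, not the repository's actual file.
import os

from dotenv import load_dotenv  # pip install python-dotenv (assumed)

load_dotenv()  # pick up an optional .env file if present

# The Groq API token and the port the FastAPI app listens on.
YOUR_SECRET_GROQ_TOKEN = os.getenv("YOUR_SECRET_GROQ_TOKEN", "")
PORT = int(os.getenv("PORT", "5555"))
```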
Build the container from the Dockerfile and run it:

```
docker build -t groq .
docker run -d -p 5555:5555 --name groq groq
```

To use the API you need a Groq API token from console.groq.com.
The main endpoint of the API forwards 8 parameters to the original Groq API. Build a dictionary with the Groq API parameters and send a POST request to the API.
The project is designed for easy extension with new models. The logic for this is contained within the `groq/routers/` directory.

- Model-Specific Files: For each supported model (e.g., `gemma2_9b_it`), there is a corresponding Python file (e.g., `gemma2_9b_it.py`). This file defines the specific API endpoint for that model.
- Common Logic Files: To avoid code duplication, there are files named `common_*_logic.py` (e.g., `common_text_logic.py`, `common_tts_logic.py`). These files contain the shared processing logic for different categories of models (text generation, text-to-speech, etc.).
Currently, most model-specific files simply call the appropriate function from a common logic file. However, this structure provides flexibility for the future. If a specific model requires unique parameters or special handling, its dedicated file can be easily modified to implement that custom logic without affecting other models.
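To illustrate, a model-specific file can be a thin wrapper around the common logic. This is a hypothetical sketch: the `process_text_request` helper and its signature are assumptions, not the repository's actual code.

```python
# groq/routers/gemma2_9b_it.py -- hypothetical sketch.
from fastapi import APIRouter, Request

from .common_text_logic import process_text_request  # assumed shared helper

router = APIRouter()

@router.post("/gemma2-9b-it")
async def gemma2_9b_it(request: Request):
    # Delegate to the shared text-generation logic; a model that needs
    # unique parameters could replace this call with custom handling.
    return await process_text_request(await request.json())
```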
The API has a main dispatcher endpoint `/` that routes requests to model-specific endpoints based on the `MODEL` parameter in the request body.
This is the main entry point. It expects a POST request with a JSON body similar to the `groq_api_params` structure described below. It inspects the `MODEL` field and forwards the request to the appropriate model-specific endpoint. If the model is not found, it returns a 404 error.
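A dispatcher of this kind can be sketched as follows. The routing on `MODEL`, the 404 on a missing model, and the body shape (one inner dict keyed by the caller's token) follow the README's description; the routing table and handler are assumptions for illustration.

```python
# Hypothetical sketch of the dispatcher endpoint, not the actual code.
from fastapi import FastAPI, HTTPException, Request

app = FastAPI()

async def handle_text(params: dict) -> str:
    # Stand-in for the shared text-generation logic.
    return f"(would call Groq with model {params['MODEL']})"

# Map MODEL values to handlers; this table is illustrative only.
MODELS = {
    "gemma2-9b-it": handle_text,
    "llama-3.3-70b-versatile": handle_text,
}

@app.post("/")
async def dispatch(request: Request):
    body = await request.json()
    # The body is keyed by the caller's Groq token; take the inner dict.
    params = next(iter(body.values()))
    handler = MODELS.get(params.get("MODEL"))
    if handler is None:
        raise HTTPException(status_code=404, detail="Model not found")
    return await handler(params)
```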
The following endpoints are fully functional and provide the core logic for interacting with the Groq API. They support normal, JSON, and streaming modes as shown in the examples below:

- `/qwen-qwen3-32b`
- `/deepseek-r1-distill-llama-70b`
- `/gemma2-9b-it`
- and others.
The following model endpoints are currently implemented as placeholders. They will return a simple JSON response indicating that they are under development.
- `/distil-whisper-large-v3-en`
- `/whisper-large-v3`
- `/whisper-large-v3-turbo`
- `/playai-tts`
- `/playai-tts-arabic`
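Such a placeholder can be as simple as the sketch below; the exact wording of the response body is an assumption.

```python
# Hypothetical placeholder endpoint for a model that is not wired up yet.
from fastapi import APIRouter

router = APIRouter()

@router.post("/playai-tts")
async def playai_tts():
    # Return a simple JSON response until the real logic is implemented.
    return {"status": "under development"}
```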
The following examples show how to use the main dispatcher endpoint /. The dispatcher will automatically route the request to the correct model endpoint based on the "MODEL" field.
```
groq_api_params = {
    "YOUR_SECRET_GROQ_TOKEN": {
        "MODEL": model: str,
        "MESSAGES": messages: Iterable[ChatCompletionMessageParam],
        "TEMPERATURE": temperature: Optional[float] | NotGiven = NOT_GIVEN,
        "MAX_COMPLETION_TOKENS": max_completion_tokens: Optional[int] | NotGiven = NOT_GIVEN,
        "TOP_P": top_p: Optional[float] | NotGiven = NOT_GIVEN,
        "STREAM": stream: Optional[Literal[False]] | NotGiven = NOT_GIVEN,
        "RESPONSE_FORMAT": response_format: Optional[completion_create_params.ResponseFormat] | NotGiven = NOT_GIVEN,
        "STOP": stop: Union[Optional[str], List[str], None] | NotGiven = NOT_GIVEN,
    }
}
```
(See the Groq API documentation for details.)

The `MESSAGES` field carries the conversation context and looks like a list of dicts:
```python
[
    {
        'role': 'user',
        'content': prompt,
    },
    {
        'role': 'assistant',
        'content': response,
    },
]
```
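For a multi-turn conversation, each completed exchange is appended to this list so the model keeps the context. A small sketch; the `record_turn` helper is hypothetical:

```python
# Sketch: accumulating context across turns by appending to the list.
messages = []

def record_turn(prompt: str, response: str) -> None:
    # Store one completed user/assistant exchange in the running context.
    messages.append({'role': 'user', 'content': prompt})
    messages.append({'role': 'assistant', 'content': response})

record_turn("Hello World!", "Hello! How can I help you?")
# `messages` now has the structure shown above and can be sent as the
# MESSAGES field of the next request.
```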
Select a model from the list of available models:
- "qwen-qwen3-32b",
- "deepseek-r1-distill-llama-70b",
- "gemma2-9b-it",
- "compound-beta",
- "compound-beta-mini",
- "compound-beta-oss",
- "llama-3.1-8b-instant",
- "llama-3.3-70b-versatile",
- "meta-llama/llama-4-maverick-17b-128e-instruct",
- "meta-llama/llama-4-scout-17b-16e-instruct",
- "meta-llama/llama-guard-4-12b",
- "meta-llama/llama-prompt-guard-2-22m",
- "meta-llama/llama-prompt-guard-2-86m",
- "moonshotai/kimi-k2-instruct",
- "openai/gpt-oss-20b",
- "openai/gpt-oss-120b",
- "whisper-large-v3",
- "whisper-large-v3-turbo",
- "playai-tts",
- "playai-tts-arabic", (see actual models at the Groq API or console.groq.com)
The API response:

- Normal mode: a plain multi-line string (`stream=False`),
- JSON mode: a JSON object (`stream=False` and `response_format={'type': 'json_object'}`),
- Stream mode: streamed output (`stream=True`).
```python
import requests

HOST = "your_IP_address"
PORT = 5555
URL = f"http://{HOST}:{PORT}/"

prompt = "Hello World!"

# Set the dictionary of Groq API parameters
groq_api_params = {
    "YOUR_SECRET_GROQ_TOKEN": {
        "MODEL": "llama-3.3-70b-versatile",
        "MESSAGES": [{"role": "user", "content": prompt}],
        "TEMPERATURE": 1,
        "MAX_COMPLETION_TOKENS": 1024,
        "TOP_P": 1,
        "STREAM": False,
        "STOP": None,
    }
}

# Send a POST request to the API
response = requests.post(URL, json=groq_api_params)

# Check if the request was successful
if response.status_code == 200:
    # The body is a JSON-encoded string: strip the surrounding quotes
    # and unescape it
    text = response.text[1:-1]
    text = text.replace('\\"', '"')
    text = text.replace('\\n', '\n')
    print(text)
else:
    print("Error:", response.status_code)
```

Output:
> Hello. It's nice to meet you. Is there something I can help you with, or would you like to chat?
```python
import requests

HOST = "your_IP_address"
PORT = 5555
URL = f"http://{HOST}:{PORT}/"

# Set the dictionary of Groq API parameters
groq_api_params = {
    "YOUR_SECRET_GROQ_TOKEN": {
        "MODEL": "llama-3.3-70b-versatile",
        "MESSAGES": [],
        "TEMPERATURE": 1,
        "MAX_COMPLETION_TOKENS": 1024,
        "TOP_P": 1,
        "STREAM": False,
        "RESPONSE_FORMAT": {'type': 'json_object'},
        "STOP": None,
    }
}

# Add a system prompt that pins the output shape
groq_api_params["YOUR_SECRET_GROQ_TOKEN"]["MESSAGES"].append({
    'role': 'system',
    'content': 'Always return me json, that looks like a: '
               '{"ANSWER": text string with your message}',
})

# Add the user prompt
groq_api_params["YOUR_SECRET_GROQ_TOKEN"]["MESSAGES"].append({
    'role': 'user',
    'content': 'Hello World!',
})

# Send a POST request to the API
response = requests.post(URL, json=groq_api_params)

# Check if the request was successful
if response.status_code == 200:
    # Strip the surrounding quotes and unescape the JSON-encoded string
    text = response.text[1:-1]
    text = text.replace('\\"', '"')
    text = text.replace('\\n', '\n')
    print(text)
else:
    print("Error:", response.status_code)
```

Output (JSON):
```json
{
    "ANSWER": "Hello, welcome to our conversation, how can I assist you today?"
}
```
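Because JSON mode returns a parseable object, the unescaped text can be loaded directly; a short follow-up, assuming the `text` variable from the example above:

```python
import json

# `text` holds the unescaped JSON document from the previous example.
data = json.loads(text)
print(data["ANSWER"])
```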
```python
import requests

HOST = "your_IP_address"
PORT = 5555
URL = f"http://{HOST}:{PORT}/"

prompt = "Hello World!"

# Set the dictionary of Groq API parameters
groq_api_params = {
    "YOUR_SECRET_GROQ_TOKEN": {
        "MODEL": "llama-3.3-70b-versatile",
        "MESSAGES": [{"role": "user", "content": prompt}],
        "TEMPERATURE": 1,
        "MAX_COMPLETION_TOKENS": 1024,
        "TOP_P": 1,
        "STREAM": True,
        "STOP": None,
    }
}

# Send a POST request and keep the connection open for streaming
response = requests.post(URL, json=groq_api_params, stream=True)

# Check if the request was successful
if response.status_code == 200:
    # Print each chunk as it arrives
    for chunk in response.iter_content(chunk_size=1024):
        if chunk:
            print(chunk.decode("utf-8"))
else:
    print("Error:", response.status_code)
```

Output (stream):
```
>Hello
>.
> It
>'s
> nice
> to
> meet
> you
>.
> Is
> there
> something
> I
> can
> help
> you
> with
>,
> or
> would
> you
> like
> to
> chat
>?
```