Conversation

@AlexMikhalev
Summary

This PR adds a complete Cerebras Inference API integration to rust-genai, including custom streaming support.

What's Included

  • Custom Cerebras Streamer - Handles Cerebras's streaming format, which terminates with a StreamEnded error instead of the usual [DONE] marker
  • Full Chat Support - OpenAI-compatible chat completions with all standard parameters
  • Streaming Chat - Real-time streaming responses with proper error handling
  • JSON Mode - Structured output support
  • Model Namespacing - cerebras::model_name format support
  • Comprehensive Tests - Live provider tests for all supported features
  • Example Code - Working example demonstrating usage
  • CI Integration - Dedicated workflow for Cerebras tests

Key Features

  • Models: cerebras::llama-3.1-8b, cerebras::llama-3.3-70b, and others
  • Authentication: CEREBRAS_API_KEY environment variable
  • Endpoint: https://api.cerebras.ai/v1/
  • Streaming: Custom implementation handling Cerebras's stream termination
  • Rate Limiting: Aware of ~3 requests/second limit

Files Added/Modified

  • src/adapter/adapters/cerebras/streamer.rs - New custom streamer
  • src/adapter/adapters/cerebras/adapter_impl.rs - Updated to use custom streamer
  • src/adapter/adapters/cerebras/mod.rs - Added streamer module
  • .github/workflows/cerebras-tests.yml - Dedicated CI workflow
  • .github/workflows/ci.yml - Updated to compile Cerebras tests
  • examples/c11-cerebras.rs - Usage example (already added)
  • tests/tests_p_cerebras.rs - Live tests (already added)

Testing

All tests pass when run with an API key:

  • Chat completions ✅
  • Streaming chat ✅
  • JSON mode ✅
  • Temperature control ✅
  • Stop sequences ✅
  • Model listing ✅
  • Multi-system messages ✅

CI/CD

  • Main CI compiles Cerebras code without an API key
  • Dedicated workflow runs live tests when CEREBRAS_API_KEY is configured
  • Workflow triggers on changes to Cerebras-related files

Usage

```rust
let client = Client::default();
let model = "cerebras::llama-3.1-8b";
let response = client.exec_chat(model, chat_req, None).await?;
```

The integration is production-ready and follows all existing patterns in the codebase.

…mmit; ensure dispatcher wiring and default auth/endpoint; skip provider runs in CI
- Create CerebrasStreamer to handle Cerebras's streaming format
- Fix handling of stream termination, which ends with a StreamEnded error instead of [DONE]
- Implement proper usage capture for Cerebras responses
- Ensure graceful stream ending with proper StreamEnd events
- All streaming tests now pass for Cerebras adapter
- Create dedicated Cerebras tests workflow that runs when API key is available
- Add conditional execution based on CEREBRAS_API_KEY repository variable
- Include Cerebras example execution in CI tests
- Update main CI to compile Cerebras tests even without API key
- Workflow triggers on changes to Cerebras-related files
- Create workflow_dispatch workflow for manual testing
- Support both compile-only and live testing modes
- Provide clear messaging when API key is not configured
- Useful for debugging Cerebras integration issues
- Remove redundant else block in CerebrasStreamer
- Ensure all clippy checks pass with project standards
- Maintain code quality and consistency
- Add wiremock-based mock server tests for Anthropic and OpenRouter
- Create detailed test specification document
- Add test dependencies (wiremock, uuid) to Cargo.toml
- Implement 8 mock test scenarios covering basic chat, tools, streaming, auth errors, and JSON mode
- All tests pass and satisfy CI requirements (fmt, clippy, build, package)
Signed-off-by: Alex Mikhalev <alex@metacortex.engineer>
@AlexMikhalev AlexMikhalev merged commit b04c8dd into main Oct 15, 2025
3 checks passed
@AlexMikhalev AlexMikhalev deleted the feature/cerebras-integration branch October 16, 2025 10:41