# Speechall Python SDK

Python SDK for the Speechall API, a powerful speech-to-text transcription service supporting multiple AI models and providers.

## Features
- Multiple AI Models: Access various speech-to-text models from different providers (OpenAI Whisper, and more)
- Flexible Input: Transcribe local audio files or remote URLs
- Rich Output Formats: Get results in text, JSON, SRT, or VTT formats
- Speaker Diarization: Identify and separate different speakers in audio
- Custom Vocabulary: Improve accuracy with domain-specific terms
- Replacement Rules: Apply custom text transformations to transcriptions
- Language Support: Auto-detect languages or specify from a wide range of supported languages
- Async Support: Built with async/await support using httpx
## Installation

```bash
pip install speechall
```

## Quick Start

```python
import os

from speechall import SpeechallApi

# Initialize the client
client = SpeechallApi(token=os.getenv("SPEECHALL_API_TOKEN"))

# Transcribe a local audio file
with open("audio.mp3", "rb") as audio_file:
    audio_data = audio_file.read()

response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    punctuation=True
)

print(response.text)
```

## Transcribing Remote Files

```python
import os

from speechall import SpeechallApi

client = SpeechallApi(token=os.getenv("SPEECHALL_API_TOKEN"))

response = client.speech_to_text.transcribe_remote(
    file_url="https://example.com/audio.mp3",
    model="openai.whisper-1",
    language="auto",  # Auto-detect language
    output_format="json"
)

print(response.text)
```

## Speaker Diarization

Identify different speakers in your audio:
```python
response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    diarization=True,
    speakers_expected=2
)

for segment in response.segments:
    print(f"[Speaker {segment.speaker}] {segment.text}")
```

## Custom Vocabulary

Improve accuracy for specific terms:
```python
response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    custom_vocabulary=["Kubernetes", "API", "Docker", "microservices"]
)
```

## Replacement Rules

Apply custom text transformations:
```python
from speechall import ReplacementRule, ExactRule

replacement_rules = [
    ReplacementRule(
        rule=ExactRule(find="API", replace="Application Programming Interface")
    )
]

response = client.speech_to_text.transcribe_remote(
    file_url="https://example.com/audio.mp3",
    model="openai.whisper-1",
    language="en",
    output_format="json",
    replacement_ruleset=replacement_rules
)
```

## Listing Available Models

```python
models = client.speech_to_text.list_speech_to_text_models()

for model in models:
    print(f"{model.model_identifier}: {model.display_name}")
    print(f"  Provider: {model.provider}")
```

## Authentication

Get your API token from speechall.com and set it as an environment variable:
```bash
export SPEECHALL_API_TOKEN="your-token-here"
```

Or pass it directly when initializing the client:

```python
from speechall import SpeechallApi

client = SpeechallApi(token="your-token-here")
```

## Output Formats

- `text`: Plain text transcription
- `json`: JSON with detailed information (segments, timestamps, metadata)
- `json_text`: JSON with simplified text output
- `srt`: SubRip subtitle format
- `vtt`: WebVTT subtitle format

## Language Support

Use ISO 639-1 language codes (e.g., `en`, `es`, `fr`, `de`) or `auto` for automatic detection.
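To illustrate the subtitle formats above, here is a stdlib-only sketch (not part of the SDK) that renders hypothetical `(start, end, text)` segments as SRT cues: each cue is numbered and uses comma-decimal `HH:MM:SS,mmm` timestamps, whereas VTT begins with a `WEBVTT` header and uses dot-decimal timestamps instead.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm (comma decimal)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render (start, end, text) tuples as numbered SRT cues."""
    cues = []
    for i, (start, end, text) in enumerate(segments, start=1):
        cues.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(cues) + "\n"


segments = [(0.0, 2.5, "Hello, world."), (2.5, 5.0, "Second caption.")]
print(segments_to_srt(segments))
```

When you request `output_format="srt"` or `"vtt"`, the API returns ready-made subtitle documents, so a helper like this is only needed if you want to build subtitles yourself from `json` segments.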
## API Reference

- `SpeechallApi`: Main client for the Speechall API
- `AsyncSpeechallApi`: Async client for the Speechall API

### `transcribe()`

Transcribe a local audio file.

Parameters:

- `model` (str): Model identifier (e.g., `"openai.whisper-1"`)
- `request` (bytes): Audio file content
- `language` (str): Language code or `"auto"`
- `output_format` (str): Output format (`text`, `json`, `srt`, `vtt`)
- `punctuation` (bool): Enable automatic punctuation
- `diarization` (bool): Enable speaker identification
- `speakers_expected` (int, optional): Expected number of speakers
- `custom_vocabulary` (list, optional): List of custom terms
- `initial_prompt` (str, optional): Context prompt for the model
- `temperature` (float, optional): Model temperature (0.0-1.0)

### `transcribe_remote()`

Transcribe audio from a URL.

Parameters: Same as `transcribe()`, but with `file_url` instead of `request`.

### `list_speech_to_text_models()`

List all available models.
## Examples

Check out the examples directory for more detailed usage examples:

- `transcribe_local_file.py` - Transcribe local audio files
- `transcribe_remote_file.py` - Transcribe remote audio URLs
## Requirements

- Python 3.8+
- httpx >= 0.27.0
- pydantic >= 2.0.0
- typing-extensions >= 4.0.0
## Development

```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy .
```

## Links

- Documentation: docs.speechall.com
- GitHub: github.com/speechall/speechall-python-sdk
- Issues: github.com/speechall/speechall-python-sdk/issues
## License

MIT License - see the LICENSE file for details.