The health check endpoint (/healthz) has been added to the Python SDK to facilitate production monitoring and service health checks. This endpoint provides a quick way to verify that the ScrapeGraphAI API service is operational and ready to handle requests.
Related: GitHub Issue #62
- Production Monitoring: Regular health checks for alerting and monitoring systems
- Container Health Checks: Kubernetes liveness/readiness probes, Docker HEALTHCHECK
- Load Balancer Health Checks: Ensure service availability before routing traffic
- Integration with Monitoring Tools: Prometheus, Datadog, New Relic, etc.
- Pre-request Validation: Verify service is available before making API calls
- Service Discovery: Health status for service mesh and discovery systems
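As an illustration of the pre-request validation use case, the following is a minimal sketch rather than part of the SDK. `require_healthy` and `ServiceUnavailable` are hypothetical names introduced here, and `fake_healthz` stands in for the SDK's `client.healthz` so the example runs without network access.

```python
from typing import Any, Callable, Dict


class ServiceUnavailable(RuntimeError):
    """Raised when the health check does not report a healthy service."""


def require_healthy(check: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Run a health check and raise unless the status is 'healthy'."""
    try:
        health = check()
    except Exception as exc:
        # Any failure to reach the API is treated as "unavailable".
        raise ServiceUnavailable(f"health check failed: {exc}") from exc
    if health.get("status") != "healthy":
        raise ServiceUnavailable(f"service status: {health.get('status')}")
    return health


def fake_healthz() -> Dict[str, Any]:
    """Hypothetical stand-in for client.healthz (no network access)."""
    return {"status": "healthy", "message": "Service is operational"}


if __name__ == "__main__":
    # With the SDK this would be: require_healthy(client.healthz)
    require_healthy(fake_healthz)
    print("service is healthy, safe to send requests")
```

Passing the check as a callable keeps the gate independent of how the client is constructed, so the same helper can wrap either a real client or a mock.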
The health check endpoint is available in the latest version of the Python SDK:
pip install scrapegraph-py

from scrapegraph_py import Client

# Initialize client
client = Client.from_env()

# Check health status
health = client.healthz()
print(health)
# {'status': 'healthy', 'message': 'Service is operational'}

# Clean up
client.close()

import asyncio
from scrapegraph_py import AsyncClient

async def check_health():
    async with AsyncClient.from_env() as client:
        health = await client.healthz()
        print(health)
        # {'status': 'healthy', 'message': 'Service is operational'}

asyncio.run(check_health())

Check the health status of the ScrapeGraphAI API service.
Returns:
dict: Health status information with at least the following fields:
- status (str): Health status (e.g., 'healthy', 'unhealthy', 'degraded')
- message (str): Human-readable status message
Raises:
- APIError: If the API returns an error response
- ConnectionError: If unable to connect to the API
Asynchronous version of the health check method.
Returns:
dict: Health status information (same structure as sync version)
Raises:
- Same exceptions as synchronous version
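One way to bound how long an asynchronous health check may take is to wrap the call in `asyncio.wait_for`. The sketch below is an assumption-laden illustration, not SDK behavior: `StubAsyncClient` is a hypothetical stand-in for `AsyncClient`, and reporting a timeout as a locally built `'unhealthy'` response is a design choice made here for simplicity.

```python
import asyncio
from typing import Any, Dict


class StubAsyncClient:
    """Hypothetical stand-in exposing an async healthz() like AsyncClient."""

    def __init__(self, delay: float = 0.0) -> None:
        self.delay = delay

    async def healthz(self) -> Dict[str, Any]:
        await asyncio.sleep(self.delay)  # simulate network latency
        return {"status": "healthy", "message": "Service is operational"}


async def healthz_with_timeout(client: Any, timeout: float = 3.0) -> Dict[str, Any]:
    """Await client.healthz(), reporting 'unhealthy' if it exceeds the timeout."""
    try:
        return await asyncio.wait_for(client.healthz(), timeout=timeout)
    except asyncio.TimeoutError:
        # Built locally, not returned by the API.
        return {"status": "unhealthy", "message": f"timed out after {timeout}s"}


if __name__ == "__main__":
    fast = asyncio.run(healthz_with_timeout(StubAsyncClient(delay=0.0)))
    slow = asyncio.run(healthz_with_timeout(StubAsyncClient(delay=0.2), timeout=0.05))
    print(fast["status"], slow["status"])
```

Callers that need to distinguish a timeout from a genuine `'unhealthy'` response could re-raise the timeout instead of converting it.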
from scrapegraph_py import Client

client = Client.from_env()

try:
    health = client.healthz()
    if health.get('status') == 'healthy':
        print("✓ Service is operational")
    else:
        print(f"⚠ Service status: {health.get('status')}")
except Exception as e:
    print(f"✗ Health check failed: {e}")
finally:
    client.close()

from fastapi import FastAPI, HTTPException
from scrapegraph_py import AsyncClient

app = FastAPI()

@app.get("/health")
async def health_check():
    """Health check endpoint for load balancers"""
    try:
        async with AsyncClient.from_env() as client:
            health = await client.healthz()
            if health.get('status') == 'healthy':
                return {
                    "status": "healthy",
                    "scrape_graph_api": "operational"
                }
            else:
                raise HTTPException(
                    status_code=503,
                    detail="ScrapeGraphAI API is unhealthy"
                )
    except HTTPException:
        # Re-raise the 503 above unchanged instead of re-wrapping it below
        raise
    except Exception as e:
        raise HTTPException(
            status_code=503,
            detail=f"Health check failed: {e}"
        )

#!/usr/bin/env python3
"""
Kubernetes liveness probe script for ScrapeGraphAI
Returns exit code 0 if healthy, 1 if unhealthy
"""
import sys
from scrapegraph_py import Client

def main():
    try:
        client = Client.from_env()
        health = client.healthz()
        client.close()
        if health.get('status') == 'healthy':
            sys.exit(0)
        else:
            sys.exit(1)
    except Exception:
        sys.exit(1)

if __name__ == "__main__":
    main()

The health check endpoint supports mock mode for testing:
from scrapegraph_py import Client
# Enable mock mode
client = Client(
    api_key="sgai-00000000-0000-0000-0000-000000000000",
    mock=True
)

health = client.healthz()
print(health)
# {'status': 'healthy', 'message': 'Service is operational'}

Custom Mock Responses:
from scrapegraph_py import Client
custom_response = {
    "status": "degraded",
    "message": "Custom mock response",
    "uptime": 12345
}

client = Client(
    api_key="sgai-00000000-0000-0000-0000-000000000000",
    mock=True,
    mock_responses={"/v1/healthz": custom_response}
)

health = client.healthz()
print(health)
# {'status': 'degraded', 'message': 'Custom mock response', 'uptime': 12345}

{
    "status": "healthy",
    "message": "Service is operational"
}

- healthy: Service is fully operational
- degraded: Service is operational but experiencing issues
- unhealthy: Service is not operational
Note: The actual status values and additional fields may vary based on the API implementation.
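Because the status values may vary, probe scripts are safer when they treat anything unrecognized as a failure. The following is a minimal sketch of that idea; `exit_code_for` and its `allow_degraded` flag are hypothetical names introduced for illustration and are not part of the SDK.

```python
from typing import Any, Mapping


def exit_code_for(health: Mapping[str, Any], allow_degraded: bool = False) -> int:
    """Map a health response to a process exit code (0 = pass, 1 = fail)."""
    status = str(health.get("status", "")).lower()
    if status == "healthy":
        return 0
    if status == "degraded" and allow_degraded:
        # Optionally tolerate a degraded service, e.g. for liveness probes.
        return 0
    # 'unhealthy', a missing field, or an unrecognized value all fail the check.
    return 1


if __name__ == "__main__":
    print(exit_code_for({"status": "healthy", "message": "Service is operational"}))
    print(exit_code_for({"status": "degraded"}, allow_degraded=True))
    print(exit_code_for({"status": "maintenance"}))
```

Whether `degraded` should pass is a deployment decision: tolerating it in a liveness probe avoids needless restarts, while a readiness probe may prefer to fail and stop routing traffic.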
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# Health check using the SDK
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD python -c "from scrapegraph_py import Client; import sys; c = Client.from_env(); h = c.healthz(); c.close(); sys.exit(0 if h.get('status') == 'healthy' else 1)"

CMD ["python", "app.py"]

version: '3.8'
services:
  app:
    build: .
    environment:
      - SGAI_API_KEY=${SGAI_API_KEY}
    healthcheck:
      test: ["CMD", "python", "-c", "from scrapegraph_py import Client; import sys; c = Client.from_env(); h = c.healthz(); c.close(); sys.exit(0 if h.get('status') == 'healthy' else 1)"]
      interval: 30s
      timeout: 3s
      retries: 3
      start_period: 5s

apiVersion: apps/v1
kind: Deployment
metadata:
  name: scrapegraph-app
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels;
  # the label value used here is illustrative
  selector:
    matchLabels:
      app: scrapegraph-app
  template:
    metadata:
      labels:
        app: scrapegraph-app
    spec:
      containers:
      - name: app
        image: your-app:latest
        env:
        - name: SGAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: scrapegraph-secret
              key: api-key
        # Liveness probe - restarts container if unhealthy
        livenessProbe:
          exec:
            command:
            - python
            - -c
            - |
              from scrapegraph_py import Client
              import sys
              c = Client.from_env()
              h = c.healthz()
              c.close()
              sys.exit(0 if h.get('status') == 'healthy' else 1)
          initialDelaySeconds: 10
          periodSeconds: 30
          timeoutSeconds: 5
          failureThreshold: 3
        # Readiness probe - removes from service if not ready
        readinessProbe:
          exec:
            command:
            - python
            - -c
            - |
              from scrapegraph_py import Client
              import sys
              c = Client.from_env()
              h = c.healthz()
              c.close()
              sys.exit(0 if h.get('status') == 'healthy' else 1)
          initialDelaySeconds: 5
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 2

- Basic: scrapegraph-py/examples/utilities/healthz_example.py
- Async: scrapegraph-py/examples/utilities/healthz_async_example.py

- scrapegraph-py/tests/test_client.py - Synchronous tests
- scrapegraph-py/tests/test_async_client.py - Asynchronous tests
- scrapegraph-py/tests/test_healthz_mock.py - Mock mode tests
# Run all mock-mode health check tests
cd scrapegraph-py
pytest tests/test_healthz_mock.py -v
# Run specific test
pytest tests/test_healthz_mock.py::test_healthz_mock_sync -v

- Implement Timeout: Always set a reasonable timeout for health checks (3-5 seconds recommended)
- Use Appropriate Intervals: Don't check too frequently; 30 seconds is a good default
- Handle Failures Gracefully: Implement retry logic with exponential backoff
- Monitor and Alert: Integrate with monitoring systems for automated alerting
- Test in Mock Mode: Use mock mode in CI/CD pipelines to avoid API calls
- Log Health Check Results: Keep records of health check outcomes for debugging
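The retry and logging practices above can be sketched as follows. This is one possible shape under stated assumptions, not an SDK feature: `check_with_backoff` is a hypothetical helper, the default delays are arbitrary, and the `sleep` parameter exists only so the wait can be replaced in tests.

```python
import logging
import time
from typing import Any, Callable, Dict

logger = logging.getLogger("healthcheck")


def check_with_backoff(
    check: Callable[[], Dict[str, Any]],
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the check reports 'healthy', retrying with backoff."""
    for attempt in range(retries + 1):
        try:
            health = check()
            status = health.get("status")
            logger.info("health check attempt %d: status=%s", attempt + 1, status)
            if status == "healthy":
                return True
        except Exception as exc:
            logger.warning("health check attempt %d failed: %s", attempt + 1, exc)
        if attempt < retries:
            # Delay doubles after each failed attempt: 0.5s, 1s, 2s, ...
            sleep(base_delay * (2 ** attempt))
    return False
```

With the SDK, the helper would be called as `check_with_backoff(client.healthz)`. Keeping `retries` small matters for container probes, since the total wait has to fit inside the probe's own timeout.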
For issues, questions, or contributions, please visit: