# Devstral + OpenHands Deployment

A comprehensive repository for deploying the Devstral model with OpenHands for web access on a server, eliminating the need for LM Studio on the client side.
## 🚀 Quick Start

```bash
# Clone the repository
git clone <repository-url>
cd devstral-openhands-deployment

# Build and run with default settings (Ollama)
./build-and-run.sh

# Or with Text Generation WebUI
./build-and-run.sh -t textgen

# Or with llama.cpp and GPU acceleration
./build-and-run.sh -t llamacpp -g
```

```bash
# Run the interactive setup script
./scripts/setup.sh

# Test your deployment
./scripts/test-deployment.sh
```

```bash
# Quick start with Ollama (recommended for beginners)
cp examples/quick-start-ollama.yml docker-compose.yml && docker-compose up -d

# Production setup with Text Generation WebUI
cp examples/production-textgen.yml docker-compose.yml && docker-compose up -d

# High-performance setup with llama.cpp
cp examples/high-performance-llamacpp.yml docker-compose.yml && docker-compose up -d
```

## Table of Contents

- 🚀 Quick Start
- 🎯 Core Concept
- 📦 Repository Structure
- ⚙️ Deployment Options
- 🔧 Configuration
- 📊 Monitoring & Testing
- 🐛 Troubleshooting
- 🏆 Advantages
- 📚 Documentation
- 🤝 Contributing
## 🎯 Core Concept

This deployment replaces LM Studio with a server-side solution that:

1. **Serves the Devstral GGUF model** via an HTTP API using your choice of:
   - Ollama (user-friendly, web UI included)
   - Text Generation WebUI (feature-rich, production-ready)
   - llama.cpp server (high-performance, minimal overhead)
2. **Runs OpenHands** in a Docker container configured to connect to your model API
3. **Provides web access** to the OpenHands interface from any browser
4. **Enables centralized control** with optional monitoring and scaling
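Under the hood, OpenHands talks to the model backend over plain HTTP chat-completion requests. The sketch below shows the shape of such a request, assuming an OpenAI-compatible endpoint such as the llama.cpp server's `/v1/chat/completions`; the base URL, port, and model name are illustrative defaults from this README, not guarantees.

```python
# Sketch: the kind of chat-completion request OpenHands sends to the model
# server. Assumes an OpenAI-compatible endpoint (llama.cpp / textgen);
# URL, port, and model name are illustrative defaults.
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible chat-completions endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("http://localhost:8080", "devstral", "Write a hello-world in Python.")
# When the server is running, urllib.request.urlopen(req) returns the JSON completion.
```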
## 📦 Repository Structure

```
devstral-openhands-deployment/
├── README.md                      # This file
├── LICENSE                        # MIT License
├── .gitignore                     # Git ignore rules
├── Dockerfile                     # Complete deployment container
├── docker-compose.standalone.yml  # Standalone Docker deployment
├── build-and-run.sh               # Build and run script
├── DOCKER_DEPLOYMENT.md           # Docker deployment guide
├── ollama-setup/                  # Ollama deployment files
│   ├── README.md
│   ├── docker-compose.yml
│   ├── Modelfile
│   └── test-api.sh
├── text-generation-webui/         # Text Generation WebUI setup
│   ├── README.md
│   ├── docker-compose.yml
│   ├── settings.yaml
│   └── test-api.sh
├── llamacpp-server/               # llama.cpp server setup
│   ├── README.md
│   ├── docker-compose.yml
│   ├── docker-compose.gpu.yml
│   ├── test-api.sh
│   └── start-server.sh
├── examples/                      # Complete deployment examples
│   ├── README.md
│   ├── quick-start-ollama.yml
│   ├── production-textgen.yml
│   └── high-performance-llamacpp.yml
├── docs/                          # Documentation
│   ├── deployment-guide.md
│   └── troubleshooting.md
└── scripts/                       # Utility scripts
    ├── setup.sh                   # Interactive setup script
    └── test-deployment.sh         # Deployment testing script
```
## ⚙️ Deployment Options

### All-in-One Docker Container

**Best for:** All users, production, development, easy setup

**Features:**
- Complete containerized solution
- Automatic service orchestration
- Multiple deployment types in one container
- Easy configuration and management

**Quick Start:**

```bash
./build-and-run.sh
```

### Ollama

**Best for:** First-time users, development, quick prototyping

**Features:**
- User-friendly web interface
- Easy model management
- Built-in model library
- Simple configuration

**Quick Start:**

```bash
cd ollama-setup/
docker-compose up -d
```

### Text Generation WebUI

**Best for:** Production environments, advanced features, monitoring

**Features:**
- Comprehensive web interface
- Advanced model parameters
- Chat templates and personas
- API compatibility
- Monitoring and logging

**Quick Start:**

```bash
cd text-generation-webui/
docker-compose up -d
```

### llama.cpp Server

**Best for:** Maximum performance, minimal overhead, GPU acceleration

**Features:**
- Optimized inference engine
- GPU acceleration support
- Minimal resource usage
- OpenAI-compatible API

**Quick Start:**

```bash
cd llamacpp-server/
docker-compose up -d
```

### System Requirements
- Minimum: 8GB RAM, 4 CPU cores, 20GB disk space
- Recommended: 16GB+ RAM, 8+ CPU cores, 50GB+ disk space
- GPU: NVIDIA GPU with 8GB+ VRAM (optional, for acceleration)
**Software:**
- Docker 20.10+
- Docker Compose 2.0+
- Git (for cloning)
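A host can be checked against the minimums above with a short sketch using only Python's standard library. The RAM probe relies on `os.sysconf`, which is typically available on Linux and is skipped elsewhere; the thresholds mirror the minimums listed.

```python
# Sketch: check a host against the minimum requirements above
# (8 GB RAM, 4 CPU cores, 20 GB free disk). Linux-oriented; the RAM
# probe is skipped on platforms without os.sysconf.
import os
import shutil

GIB = 1024 ** 3


def check_minimums(path: str = ".") -> dict:
    """Report CPU cores, free disk, and (where available) RAM, plus a pass/fail flag."""
    report = {
        "cpu_cores": os.cpu_count() or 0,
        "free_disk_gib": shutil.disk_usage(path).free / GIB,
        "ram_gib": None,
    }
    try:
        report["ram_gib"] = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB
    except (ValueError, OSError, AttributeError):
        pass  # RAM size not exposed on this platform
    report["ok"] = (
        report["cpu_cores"] >= 4
        and report["free_disk_gib"] >= 20
        and (report["ram_gib"] is None or report["ram_gib"] >= 8)
    )
    return report


print(check_minimums())
```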
## 🔧 Configuration

Create a `.env` file to customize your deployment:

```bash
# Model Configuration
MODEL_FILE=devstral-model.gguf
MODEL_NAME=devstral
CONTEXT_SIZE=4096

# Performance Settings
THREADS=8
BATCH_SIZE=512
GPU_LAYERS=0  # Set to 35+ for GPU acceleration

# Port Configuration
OPENHANDS_PORT=3000
MODEL_SERVER_PORT=8080

# Paths
WORKSPACE_PATH=./workspace
MODELS_PATH=./models
```

```bash
# Use environment variables
MODEL_FILE=my-model.gguf GPU_LAYERS=35 docker-compose up -d

# Use a specific example
cp examples/high-performance-llamacpp.yml docker-compose.yml
docker-compose up -d
```

## 📊 Monitoring & Testing

All deployments include built-in health checks:
```bash
# Check service status
docker-compose ps

# View service logs
docker-compose logs -f

# Test API endpoints
./scripts/test-deployment.sh
```

For production deployments with monitoring:

```bash
# Start with monitoring stack
docker-compose --profile monitoring up -d

# Access monitoring interfaces
# Grafana: http://localhost:3001
# Prometheus: http://localhost:9090
```

```bash
# Comprehensive deployment test
./scripts/test-deployment.sh

# Quick API test (varies by deployment)
curl http://localhost:8080/v1/models   # llama.cpp
curl http://localhost:11434/api/tags   # Ollama
curl http://localhost:5000/api/v1/model  # Text Generation WebUI
```
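The same endpoint checks can be scripted. The sketch below polls each backend and reports its health; the URLs assume this README's default ports and only the standard library is used.

```python
# Sketch: minimal health poller for the model-server endpoints above.
# The URLs assume this README's default ports; adjust them to your setup.
import urllib.error
import urllib.request

ENDPOINTS = {
    "llama.cpp": "http://localhost:8080/v1/models",
    "Ollama": "http://localhost:11434/api/tags",
    "Text Generation WebUI": "http://localhost:5000/api/v1/model",
}


def probe(url: str, timeout: float = 3.0):
    """Return the HTTP status code, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def classify(status) -> str:
    """Map a status code (or None) to a human-readable verdict."""
    if status is None:
        return "unreachable"
    return "healthy" if 200 <= status < 300 else f"unhealthy ({status})"


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(f"{name}: {classify(probe(url))}")
```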
## 🐛 Troubleshooting

1. **Docker Image Pull Failures:**

   ```bash
   # Use the correct OpenHands image
   docker pull docker.all-hands.dev/all-hands-ai/openhands:0.40
   ```

2. **Model Loading Issues:**

   ```bash
   # Check that the model file exists
   ls -la models/

   # Verify the model inside the container
   docker exec -it ollama ollama list
   ```

3. **API Connection Issues:**

   ```bash
   # Test network connectivity
   docker exec -it openhands curl http://ollama:11434/api/tags
   ```

4. **Port Conflicts:**

   ```yaml
   # Change ports in docker-compose.yml
   ports:
     - "3001:3000"  # Changed from 3000:3000
   ```
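Before remapping ports, you can check which of the defaults are already taken on the host. A small sketch, standard library only; the port list mirrors this README's defaults:

```python
# Sketch: check whether this README's default ports are already in use
# before editing docker-compose.yml.
import socket

DEFAULT_PORTS = {
    3000: "OpenHands web UI",
    8080: "model server (llama.cpp)",
    11434: "Ollama API",
    5000: "Text Generation WebUI API",
}


def port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 when something accepts the connection
        return s.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for port, service in DEFAULT_PORTS.items():
        state = "free" if port_free(port) else "IN USE"
        print(f"{port:>6}  {service}: {state}")
```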
For detailed troubleshooting, see `docs/troubleshooting.md`.
## 🏆 Advantages

- 🌐 **Web Access**: Access from any browser, on any device
- 🔄 **Centralized Control**: Manage from a single server
- 📈 **Scalability**: Easy to scale resources or add instances
- 🔒 **Security**: Centralized security management
- 👥 **Multi-User**: Support for multiple concurrent users
- 📊 **Monitoring**: Built-in monitoring and logging
- 🚀 **Performance**: Dedicated server resources
- 🐳 **Containerized**: Consistent deployment across environments
- ⚡ **Quick Setup**: Automated setup scripts and examples
- 🔧 **Configurable**: Multiple deployment options and configurations
- 🧪 **Testable**: Comprehensive testing scripts
- 📖 **Documented**: Extensive documentation and examples
## 📚 Documentation

- [Docker Deployment Guide](DOCKER_DEPLOYMENT.md): Complete Docker deployment instructions
- [Deployment Guide](docs/deployment-guide.md): Step-by-step deployment instructions
- [Troubleshooting Guide](docs/troubleshooting.md): Common issues and solutions
- [Examples](examples/README.md): Ready-to-use deployment examples
- [Ollama Setup](ollama-setup/README.md): Ollama-specific instructions
- [Text Generation WebUI](text-generation-webui/README.md): WebUI setup guide
- [llama.cpp Server](llamacpp-server/README.md): High-performance setup
## 🤝 Contributing

Contributions are welcome! Please feel free to submit a pull request. For major changes, please open an issue first to discuss what you would like to change.
```bash
# Clone the repository
git clone <repository-url>
cd devstral-openhands-deployment

# Test your changes
./scripts/test-deployment.sh

# Submit a pull request
```

## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.
## Acknowledgments

- **OpenHands** - AI agent framework
- **Ollama** - Local LLM server
- **Text Generation WebUI** - Web interface for LLMs
- **llama.cpp** - High-performance LLM inference
## Support

- **Issues**: GitHub Issues
- **Discussions**: GitHub Discussions
- **Documentation**: `docs/`

**Ready to deploy?** Start with `./scripts/setup.sh` for an interactive setup experience!