Ei Tach! ("Hi there!" in Saarland dialect)
My name is John Dekka, and I'm an enthusiastic nerd originally from the idyllic Saarland (Germany) region. I spend a lot of time programming, mastering Linux, and of course exploring the real world as well.
I'm proud to be a true Linux enthusiast! Höat sich komisch an, is awa mó so (Saarland dialect for "sounds odd, but that's just how it is"). 🤔
My passion isn't limited to computers: I'm also deeply fascinated by the universe, its boundaries, and what might possibly lie beyond them. When it comes to technology, I'd describe myself as highly enthusiastic, not just about deepening my understanding, but also about solving complex problems faster and more effectively.
I've been tinkering with this stuff for about 25 years now. Operating systems, software, hardware, programming in all sorts of languages, gaming, modding... That's my thing.
It somehow grounds me, and I can really lose myself in it. Every now and then, I also work on a few smaller projects. 🤏
From local experimentation to production-grade systems: bridging theory and practical deployment.
- Model Optimization & Quantization:
  - Fine-tuning models for edge devices (e.g., `GGUF`, `INT4`/`INT8` quantization) using tools like `llama.cpp`, `TensorRT-LLM`, or `vLLM`.
  - Benchmarking trade-offs between latency, memory, and accuracy for local inference.
  - Implementing speculative decoding (e.g., with Medusa or Lookahead Decoding) to accelerate generation.
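The core latency/accuracy trade-off behind `INT8` quantization can be sketched in a few lines of plain Python. This is a toy illustration of symmetric per-tensor quantization, not how `llama.cpp` or `TensorRT-LLM` implement it internally:

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: map floats to [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # avoid div-by-zero for all-zero tensors
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map INT8 codes back to floats; the difference is the quantization error."""
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.99]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
error = max(abs(a - b) for a, b in zip(weights, restored))
```

The benchmarking part is then measuring how much that rounding error actually costs you in downstream accuracy versus the 4x memory saving.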
- Distributed Training:
  - Setting up multi-GPU/TPU clusters (e.g., with `FSDP`, `DeepSpeed`, or `Ray Train`) for efficient fine-tuning.
  - Managing mixed-precision training (`FP16`, `BF16`) and gradient checkpointing to optimize resource usage.
- Custom Architectures:
  - Adapting transformer variants (e.g., Mixture-of-Experts, Retentive Networks, State Space Models) for niche tasks.
  - Experimenting with hybrid architectures (e.g., combining LLMs with symbolic reasoning or graph neural networks).
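The routing idea at the heart of Mixture-of-Experts fits in a toy sketch: a gate scores the experts, only the top-k actually run, and their outputs are mixed by renormalized gate weight. The lambda "experts" here are hypothetical stand-ins for real expert sub-networks:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of gate logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_top_k(gate_logits, experts, x, k=2):
    """Send input x only to the top-k experts, mix outputs by gate weight."""
    probs = softmax(gate_logits)
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)  # renormalize over the selected experts
    return sum(probs[i] / total * experts[i](x) for i in top)

# Hypothetical "experts": cheap stand-ins for expert feed-forward blocks.
experts = [lambda x: 2 * x, lambda x: x + 1, lambda x: -x]
y = route_top_k([2.0, 1.0, -1.0], experts, 3.0, k=2)
```

The appeal for niche tasks: parameter count grows with the number of experts, while per-token compute stays roughly constant at k experts.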
- Dataset Curation:
  - Cleaning, deduplicating, and augmenting datasets (e.g., with the `datasets` library, `Unsloth`, or custom pipelines).
  - Synthetic data generation using LLMs (e.g., backtranslation, self-instruct, or evolutionary prompts).
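A minimal, dependency-free sketch of the deduplication step: normalize each row so trivial variants collide, hash, and keep only first occurrences. Real pipelines go further (e.g., MinHash for near-duplicates), but the shape is the same:

```python
import hashlib
import re

def normalize(text):
    """Lowercase and collapse whitespace so near-identical rows hash alike."""
    return re.sub(r"\s+", " ", text.strip().lower())

def deduplicate(rows):
    """Keep the first occurrence of each normalized text, drop repeats."""
    seen, kept = set(), []
    for row in rows:
        digest = hashlib.sha256(normalize(row).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(row)
    return kept

corpus = ["Hello  World", "hello world", "Goodbye"]
clean = deduplicate(corpus)
```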
- Fine-Tuning Workflows:
  - LoRA/QLoRA: Efficient parameter-efficient fine-tuning for domain-specific tasks.
  - Direct Preference Optimization (DPO) or Reinforcement Learning (RLHF) for alignment.
  - Continual Learning: Adapting models to new tasks without catastrophic forgetting (e.g., with adapters or elastic weight consolidation).
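Why LoRA is "parameter-efficient" comes down to one equation: instead of updating a d×d weight matrix, you train two low-rank factors and apply W + (alpha / r) · B·A. A rough pure-Python sketch with tiny matrices (real implementations like `peft` do this on GPU tensors):

```python
def matmul(a, b):
    """Naive matrix multiply, fine for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_update(W, A, B, alpha, r):
    """Effective weight W + (alpha / r) * B @ A, the core LoRA reparameterization."""
    delta = matmul(B, A)
    s = alpha / r
    return [[w + s * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

d, r = 4, 1                       # full dimension vs. low rank
W = [[0.0] * d for _ in range(d)] # frozen base weight (zeros just for illustration)
A = [[1.0, 0.0, 0.0, 0.0]]        # r x d, trainable
B = [[0.5], [0.0], [0.0], [0.0]]  # d x r, trainable
W_eff = lora_update(W, A, B, alpha=2.0, r=r)

full_params, lora_params = d * d, d * r + r * d  # 16 vs. 8 here; the gap explodes as d grows
```

At d = 4096 and r = 8, that's roughly 16.8M frozen parameters versus about 65K trainable ones per matrix, which is the whole trick.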
- Evaluation:
  - Designing automated benchmarks (e.g., with EleutherAI's `lm-evaluation-harness` or custom metrics).
  - Human-in-the-loop validation for subjective tasks (e.g., creativity, bias, or safety).
- Advanced Prompting Techniques:
  - Chain-of-Thought (CoT), Tree-of-Thought (ToT), and Graph-of-Thought (GoT) for complex reasoning.
  - Self-Consistency and Ensemble Prompting to improve reliability.
  - Dynamic Prompts: Generating or refining prompts at runtime based on user input or context.
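Self-consistency is simple enough to sketch end-to-end: sample several reasoning paths and majority-vote the final answer. The `noisy_model` below is a hypothetical stand-in for a temperature-sampled LLM call, deliberately right more often than wrong:

```python
import random
from collections import Counter

def self_consistent_answer(sample_fn, prompt, n=5):
    """Sample n independent answers and return the majority vote."""
    answers = [sample_fn(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Hypothetical stand-in for an LLM sampled at temperature > 0:
# answers correctly ~70% of the time.
random.seed(0)
def noisy_model(prompt):
    return "42" if random.random() < 0.7 else "41"

best = self_consistent_answer(noisy_model, "What is 6 * 7?", n=11)
```

With an odd n, a model that is individually right 70% of the time is right far more often after voting, which is the reliability gain the bullet above refers to.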
- Agentic Workflows:
  - Building multi-agent systems (e.g., custom orchestration) for collaborative tasks.
  - Tool Use & Function Calling: Integrating LLMs with APIs, databases, or proprietary tools (e.g., `LangChain`, `LlamaIndex`).
  - Memory & State Management: Implementing long-term context (e.g., vector stores, graph-based memory, or `LongLoRA`).
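The function-calling loop reduces to: the model emits a structured tool call, the runtime parses it and dispatches to real code. A bare-bones sketch with a hypothetical tool registry and a fabricated example of model output (frameworks like `LangChain` wrap exactly this pattern):

```python
import json

# Hypothetical tool registry: names the model is allowed to call.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
    "add": lambda a, b: a + b,
}

def dispatch(model_output):
    """Parse a JSON tool call emitted by the model and invoke the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

# A fabricated example of what a function-calling model might emit:
result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

In a full agent loop, `result` would be fed back to the model as a tool message so it can decide the next step.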
- RAG Pipelines:
  - Hybrid Search: Combining semantic (e.g., `FAISS`, `Weaviate`) and keyword-based (e.g., `BM25`) retrieval.
  - Query Rewriting: Using LLMs to expand or refine queries before retrieval.
  - Post-Retrieval Processing: Reranking (e.g., with cross-encoders), fusion, or hallucination detection.
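One common way to fuse the semantic and keyword result lists is reciprocal rank fusion, which needs only ranks, not comparable scores. A small sketch with made-up document IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked doc-id lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from the two retrievers:
semantic = ["doc3", "doc1", "doc2"]  # e.g., vector-similarity order
keyword = ["doc3", "doc1", "doc4"]   # e.g., BM25 order
fused = reciprocal_rank_fusion([semantic, keyword])
```

The constant k = 60 is the value commonly used in the RRF literature; it damps the influence of any single list's top rank.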
- Local & Edge AI:
  - Packaging models for offline use (e.g., with `ONNX`, `Core ML`, or `TFLite`).
  - Optimizing for Raspberry Pi, Jetson, or browser-based inference (e.g., `WebLLM`).
- Observability:
  - Logging and monitoring model performance (e.g., with `Prometheus`, `Weights & Biases`, or custom dashboards).
  - A/B Testing: Comparing model versions in production.
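A dependency-free sketch of the kind of latency tracking a proper Prometheus histogram would give you: record per-request latencies, then report p50/p95. The percentile method here is a deliberately crude nearest-rank pick, good enough for a toy monitor:

```python
class LatencyMonitor:
    """Minimal in-process monitor: record request latencies, report percentiles."""

    def __init__(self):
        self.samples = []

    def observe(self, seconds):
        self.samples.append(seconds)

    def report(self):
        xs = sorted(self.samples)

        def pct(p):
            # Crude nearest-rank percentile; fine for a sketch.
            idx = min(len(xs) - 1, int(p * len(xs)))
            return xs[idx]

        return {"p50": pct(0.50), "p95": pct(0.95)}

monitor = LatencyMonitor()
for latency in [0.12, 0.15, 0.11, 0.90, 0.14, 0.13, 0.16, 0.12, 0.15, 0.14]:
    monitor.observe(latency)
stats = monitor.report()
```

The p95 outlier (one slow 0.9 s request) is exactly what median-only dashboards hide, which is why tail percentiles are worth tracking.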
- Documentation: Writing clear, reproducible guides for model training, deployment, or API usage.
- Open-Source Contributions: Publishing tools, datasets, or benchmarks (e.g., on Hugging Face, GitHub, or Papers With Code).
- Mentorship: Guiding teams on adopting LLM workflows or debugging training issues.
Building tools and solving problems.
- Python (Daily Driver 🐍)
- JavaScript
- Go (Golang)
- Lua
- Bash
- Web Design: Creating modern, responsive interfaces.
Keeping the lights on and the doors locked.
- Linux Server Administration: Deep knowledge of Linux internals and server management.
- OpSec: Strong focus on operational security and privacy.
- German (Native)
- English (Fluent)
- Norwegian (Godt nok, i.e. "good enough")
Feel free to reach out!
