
Conversation

@dapopov-st (Contributor) commented Apr 23, 2025

Llama Training with Adapter Integration

Overview

This PR introduces a custom Llama implementation with an adapter pattern for integration with LLM Foundry and MosaicML Composer. The changes provide an end-to-end workflow for training Llama models (tested locally; still needs to be ported to Modal).

Key Changes

Custom Llama Implementation (llmfoundry/models/llama/*)

  • Updated model architecture
  • Model definition and training utilities
  • Support for both LoRA fine-tuning and full model training

Adapter Pattern Integration (llmfoundry/models/llama/model.py)

  • Custom adapter implementation that plugs into Composer's training framework (a minimal sketch follows this list)
  • Weight transfer between the HuggingFace checkpoint and the custom implementation
  • Parameter management to keep memory usage in check
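
The wrapper itself is not reproduced in this description; the following is a minimal sketch of the adapter idea, assuming a custom Llama module and a HuggingFace checkpoint. CustomLlama is implied but not shown, and map_hf_key, ComposerCustomLlama, and the model id are hypothetical placeholders rather than names from this PR; Composer's ComposerModel interface centers on forward() and loss().

# A minimal sketch, not the PR's actual adapter. map_hf_key stands in for the
# key-remapping logic of the custom implementation in llmfoundry/models/llama/.
import torch
import torch.nn.functional as F
from composer.models import ComposerModel
from transformers import AutoModelForCausalLM


class ComposerCustomLlama(ComposerModel):
    """Adapter exposing a custom Llama module through Composer's interface."""

    def __init__(self, model: torch.nn.Module, pad_token_id: int):
        super().__init__()
        self.model = model
        self.pad_token_id = pad_token_id

    def forward(self, batch):
        # Composer passes the dataloader batch straight through.
        return self.model(batch["input_ids"])  # -> logits

    def loss(self, outputs, batch):
        # Standard next-token cross entropy; pad positions are ignored.
        return F.cross_entropy(
            outputs.view(-1, outputs.size(-1)),
            batch["labels"].view(-1),
            ignore_index=self.pad_token_id,
        )


def map_hf_key(key: str) -> str:
    # Hypothetical key remapping; the real mapping depends on the custom module names.
    return key.replace("model.", "")


def load_hf_weights(custom_model: torch.nn.Module, hf_name: str = "meta-llama/Llama-2-7b-hf"):
    """Copy pretrained HuggingFace weights into the custom implementation."""
    hf_state = AutoModelForCausalLM.from_pretrained(hf_name).state_dict()
    custom_model.load_state_dict({map_hf_key(k): v for k, v in hf_state.items()}, strict=False)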

Local Training Workflow (local_llama_training_instruct.py)

  • End-to-end script for local model training (an illustrative orchestration sketch follows this list)
  • Dataset preparation, training, conversion, evaluation, and inference
  • Configured for memory-constrained hardware
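
As a rough illustration only (the actual logic lives in local_llama_training_instruct.py), the stages can be chained through LLM Foundry's standard entry points roughly as follows; the YAML paths, checkpoint paths, and flags shown are abbreviated placeholders.

# Illustrative orchestration of the local pipeline; script paths follow
# LLM Foundry's layout, but the YAMLs and flags here are placeholders.
import subprocess

def run(cmd: list[str]) -> None:
    print(">>", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Train with Composer's launcher against one of the YAML templates.
run(["composer", "scripts/train/train.py", "scripts/train/yamls/llama/llama_lora.yaml"])

# 2. Convert the Composer checkpoint to HuggingFace format.
run(["python", "scripts/inference/convert_composer_to_hf.py",
     "--composer_path", "checkpoints/latest-rank0.pt",
     "--hf_output_path", "hf_export/"])

# 3. Evaluate the exported model.
run(["composer", "scripts/eval/eval.py", "scripts/eval/yamls/hf_eval.yaml"])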

Evaluation Improvements (llmfoundry/command_utils/eval.py)

  • PEFT adapter format handling to address device metadata issues
  • Conversion between safetensors and bin formats (sketched below)
  • Error handling for model evaluation
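
The safetensors-to-bin conversion amounts to something like the sketch below; the adapter directory is a placeholder, the filenames are PEFT's defaults, and eval.py may wrap this differently.

# A minimal sketch of converting a PEFT adapter from safetensors to .bin.
import os
import torch
from safetensors.torch import load_file

def convert_adapter_to_bin(adapter_dir: str) -> str:
    src = os.path.join(adapter_dir, "adapter_model.safetensors")
    dst = os.path.join(adapter_dir, "adapter_model.bin")
    # Loading onto CPU strips per-tensor device metadata before re-saving.
    state_dict = load_file(src, device="cpu")
    torch.save(state_dict, dst)
    return dst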

Configuration Templates (scripts/train/yamls/llama/*)

  • YAML templates for various training scenarios
  • Settings for training performance
  • Examples for both LoRA and full model fine-tuning (see the PEFT sketch below)
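
For orientation, the LoRA scenario ultimately reduces to wrapping the model with a PEFT config along the lines below; the rank, alpha, dropout, and target modules shown are illustrative defaults that the YAML templates would normally control.

# Illustrative LoRA wrapping via PEFT; every hyperparameter here is a
# placeholder that the YAML templates would normally supply.
from peft import LoraConfig, get_peft_model

def maybe_apply_lora(model, use_lora: bool):
    if not use_lora:
        return model  # full fine-tuning: all parameters remain trainable
    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    return get_peft_model(model, lora_config)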

Modal Integration

For Modal deployment, the integration follows the same pattern with these key considerations:

  • Model accessibility through HF_TOKEN and secrets management
  • Using get_hf_token() and download_model_if_needed() functions where appropriate
  • Container setup based on Dockerfile-dpv-branch
  • Training configuration using the YAML template structure

The implementation is designed to work in both local and Modal environments.
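
A rough sketch of that Modal wiring, assuming a Modal secret that exposes HF_TOKEN and an image built from Dockerfile-dpv-branch; the secret name, GPU type, timeout, and training entry point are assumptions rather than what this PR ships.

# Hedged sketch of the Modal setup; secret name, GPU type, and the training
# command are assumptions for illustration.
import modal

image = modal.Image.from_dockerfile("Dockerfile-dpv-branch")
app = modal.App("llama-training", image=image)

@app.function(gpu="A100", secrets=[modal.Secret.from_name("hf-secret")], timeout=8 * 60 * 60)
def train(yaml_path: str = "scripts/train/yamls/llama/llama_lora.yaml"):
    import os
    import subprocess
    assert os.environ.get("HF_TOKEN"), "HF_TOKEN should be provided by the Modal secret"
    # Same launcher as the local workflow; the YAML path is a placeholder.
    subprocess.run(["composer", "scripts/train/train.py", yaml_path], check=True)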

Additional Documentation

Please refer to the updated README.md for implementation details, including:

  • Custom training configuration options
  • Weight loading mechanisms
  • Model architecture approaches
  • Adapter design pattern for bridging the custom model files with LLM Foundry's training architecture
  • Model registration and framework integration

Testing

The implementation has been tested on 2x RTX 3090 GPUs for both LoRA fine-tuning and full model training scenarios. A single 24 GB GPU should suffice for training; in case of OOM errors, adjust the YAMLs accordingly.

@galopyz (Contributor) commented Apr 23, 2025

Awesome. Here is a simple sequence packing snippet that can be added to _generate_batches(self) in GreedyBestFitSequencePacker for decoder models:

import numpy as np  # these imports live at module level in practice
import torch

if self.suppress_masking:
    # Next-token-prediction labels: shift the packed batch left by one token
    # and fill the last position of each row with the pad token.
    labels = np.full_like(batch, self.pad_token_id)
    labels[:, :-1] = batch[:, 1:]
    yieldval = {
        "input_ids": torch.from_numpy(batch),
        "labels": torch.from_numpy(labels),
        "cu_seqlens": cu_seq_lens,
        "max_seqlen": max_seq_lens,
    }

Basically, this creates the labels when masking is suppressed.

If the Llama model accepts input batches shaped like yieldval, then adding sequence packing would be straightforward.
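
To make the consumption side concrete, here is a hedged sketch of how a decoder forward pass and loss could use a batch shaped like yieldval; the model's keyword arguments are assumed, not taken from this PR.

# A sketch of consuming a packed batch shaped like yieldval; the model's
# signature is an assumption about the custom Llama, not its real API.
import torch
import torch.nn.functional as F

def training_step(model, batch, pad_token_id: int) -> torch.Tensor:
    # cu_seqlens / max_seqlen delimit the packed sequences so that attention
    # (e.g. a varlen flash-attention kernel) never crosses sequence boundaries.
    logits = model(
        input_ids=batch["input_ids"],
        cu_seqlens=batch["cu_seqlens"],
        max_seqlen=batch["max_seqlen"],
    )
    # Labels are already shifted left by one in the packer, so no extra shift
    # here; positions filled with pad_token_id drop out of the loss.
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        batch["labels"].view(-1),
        ignore_index=pad_token_id,
    )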
