44 changes: 43 additions & 1 deletion CHANGELOG.md
@@ -5,6 +5,47 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.0] - 2025-07-06

### Added
- **Advanced Learning Rate Scheduling**: Comprehensive expansion of learning rate scheduling capabilities
- **PolynomialLR**: Polynomial decay with configurable power for smooth learning rate transitions
- **CyclicalLR**: Cyclical learning rates with triangular, triangular2, and exponential range modes
- **WarmupScheduler**: Generic warmup wrapper that can be applied to any base scheduler
- **LRScheduleVisualizer**: ASCII visualization tool for learning rate schedules
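
For intuition, polynomial decay interpolates from the initial rate down to a floor, with the curvature set by the power. A minimal standalone sketch (the function name and signature are illustrative, not the crate's actual API):

```rust
/// Polynomial decay: interpolate from `initial_lr` down to `end_lr`
/// over `total_epochs`, with curvature controlled by `power`.
fn polynomial_lr(initial_lr: f64, end_lr: f64, power: f64, epoch: usize, total_epochs: usize) -> f64 {
    if epoch >= total_epochs {
        return end_lr; // clamp once the schedule is exhausted
    }
    let progress = epoch as f64 / total_epochs as f64;
    (initial_lr - end_lr) * (1.0 - progress).powf(power) + end_lr
}

fn main() {
    // power = 1.0 reduces to plain linear decay; power > 1.0 decays faster early on
    for epoch in [0, 25, 50, 75, 100] {
        println!("epoch {epoch:>3}: lr = {:.6}", polynomial_lr(0.01, 0.001, 2.0, epoch, 100));
    }
}
```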

- **Enhanced Scheduler Integration**:
- Convenience factory methods for new schedulers in `ScheduledOptimizer`
- Helper functions: `polynomial`, `cyclical`, `cyclical_triangular2`, `cyclical_exp_range`
- Complete integration with existing training infrastructure
- Comprehensive test coverage for all new schedulers

- **Learning Rate Visualization**:
- ASCII-based schedule visualization with customizable dimensions
- Schedule generation utilities for analysis and debugging
- Visual comparison tools for different scheduler behaviors
- Integration examples showing visualization usage

- **Advanced Training Examples**:
- `advanced_lr_scheduling.rs`: Comprehensive demonstration of new schedulers
- Warmup + cyclical learning rate combinations
- Best practices example with dropout + gradient clipping + advanced scheduling
- Performance comparison between different scheduling strategies

### Technical Improvements
- Extended scheduler trait system to support generic warmup wrapper
- Robust cyclical learning rate computation with proper cycle handling
- Polynomial decay implementation with numerical stability
- Comprehensive error handling and edge case management
- Enhanced documentation with visual examples and mathematical formulations
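
The "proper cycle handling" above presumably follows the standard triangular policy; one way to sketch it (after Smith's cyclical learning rates, not the crate's exact code):

```rust
/// Triangular cyclical LR (after Smith, 2015): oscillate between
/// `base_lr` and `max_lr`, taking `step_size` iterations per half-cycle.
fn cyclical_lr(base_lr: f64, max_lr: f64, step_size: usize, iteration: usize) -> f64 {
    let step = step_size as f64;
    let t = iteration as f64;
    let cycle = (1.0 + t / (2.0 * step)).floor(); // 1-based cycle index
    let x = (t / step - 2.0 * cycle + 1.0).abs(); // position within the cycle, 0 at the peak
    base_lr + (max_lr - base_lr) * (1.0 - x).max(0.0)
}

fn main() {
    // rises from 0.001 to 0.01 over 10 iterations, then falls back
    for it in [0, 5, 10, 15, 20] {
        println!("iter {it:>2}: lr = {:.4}", cyclical_lr(0.001, 0.01, 10, it));
    }
}
```

The triangular2 and exp_range modes mentioned in the changelog would additionally scale the amplitude per cycle.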

### Benefits
- More sophisticated learning rate control for better training quality
- Modern scheduling techniques used in state-of-the-art deep learning
- Visualization capabilities for schedule analysis and debugging
- Flexible warmup support for any existing scheduler
- Production-ready implementations with comprehensive testing

## [0.3.0] - 2025-07-03

### Added
@@ -189,4 +230,5 @@ When contributing to this project, please:

- **v0.1.0**: Initial LSTM implementation with forward pass
- **v0.2.0**: Complete training system with BPTT and optimizers
- **v0.3.0**: Learning rate scheduling, GRU implementation, BiLSTM, enhanced dropout, and model persistence
- **v0.4.0**: Advanced learning rate scheduling with 12 different schedulers, warmup support, cyclical rates, and visualization
6 changes: 5 additions & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "rust-lstm"
version = "0.4.0"
authors = ["Alex Kholodniak <alexandrkholodniak@gmail.com>"]
edition = "2021"
rust-version = "1.70"
@@ -65,6 +65,10 @@ path = "examples/text_classification_bilstm.rs"
name = "learning_rate_scheduling"
path = "examples/learning_rate_scheduling.rs"

[[example]]
name = "advanced_lr_scheduling"
path = "examples/advanced_lr_scheduling.rs"

[[example]]
name = "gru_example"
path = "examples/gru_example.rs"
52 changes: 46 additions & 6 deletions README.md
@@ -35,9 +35,11 @@ graph TD

- **LSTM, BiLSTM & GRU Networks** with multi-layer support
- **Complete Training System** with backpropagation through time (BPTT)
- **Multiple Optimizers**: SGD, Adam, RMSprop with comprehensive learning rate scheduling
- **Advanced Learning Rate Scheduling**: 12 different schedulers including OneCycle, Warmup, Cyclical, and Polynomial
- **Loss Functions**: MSE, MAE, Cross-entropy with softmax
- **Advanced Dropout**: Input, recurrent, output dropout, variational dropout, and zoneout
- **Schedule Visualization**: ASCII visualization of learning rate schedules
- **Model Persistence**: Save/load models in JSON or binary format
- **Peephole LSTM variant** for enhanced performance

@@ -47,7 +49,7 @@ Add to your `Cargo.toml`:

```toml
[dependencies]
rust-lstm = "0.4.0"
```

### Basic Usage
@@ -185,18 +187,50 @@ graph LR
style D2 fill:#fff3e0
```

### Advanced Learning Rate Scheduling

The library includes 12 different learning rate schedulers with visualization capabilities:

```rust
use rust_lstm::{create_step_lr_trainer, create_one_cycle_trainer};
use rust_lstm::{
create_step_lr_trainer, create_one_cycle_trainer, create_cosine_annealing_trainer,
ScheduledOptimizer, PolynomialLR, CyclicalLR, WarmupScheduler,
LRScheduleVisualizer, Adam
};

// Step decay: reduce LR by 50% every 10 epochs
let mut trainer = create_step_lr_trainer(network, 0.01, 10, 0.5);

// OneCycle policy for modern deep learning
let mut trainer = create_one_cycle_trainer(network, 0.1, 100);

// Cosine annealing with warm restarts
let mut trainer = create_cosine_annealing_trainer(network, 0.01, 20, 1e-6);

// Advanced combinations - Warmup + Cyclical scheduling
let base_scheduler = CyclicalLR::new(0.001, 0.01, 10);
let warmup_scheduler = WarmupScheduler::new(5, base_scheduler, 0.0001);
let optimizer = ScheduledOptimizer::new(Adam::new(0.01), warmup_scheduler, 0.01);

// Polynomial decay with visualization
let poly_scheduler = PolynomialLR::new(100, 2.0, 0.001);
LRScheduleVisualizer::print_schedule(poly_scheduler, 0.01, 100, 60, 10);
```
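
The ASCII visualization can be pictured as one text row per epoch, with bar length proportional to the learning rate. A toy sketch (`LRScheduleVisualizer`'s actual signature may differ; `schedule_rows` is a name invented here):

```rust
/// Render one text row per epoch, with bar length proportional to the
/// learning rate (scaled so the maximum rate fills `width` columns).
fn schedule_rows(lrs: &[f64], width: usize) -> Vec<String> {
    let max = lrs.iter().cloned().fold(f64::MIN, f64::max);
    lrs.iter()
        .enumerate()
        .map(|(epoch, lr)| {
            let bar = ((lr / max) * width as f64).round() as usize;
            format!("{epoch:>4} | {}", "*".repeat(bar))
        })
        .collect()
}

fn main() {
    // exponential decay: each bar is 70% the height of the previous one
    let lrs: Vec<f64> = (0..8).map(|e| 0.01 * 0.7f64.powi(e)).collect();
    for row in schedule_rows(&lrs, 40) {
        println!("{row}");
    }
}
```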

#### Available Schedulers:
- **ConstantLR**: No scheduling (baseline)
- **StepLR**: Step decay at regular intervals
- **MultiStepLR**: Multi-step decay at specific milestones
- **ExponentialLR**: Exponential decay each epoch
- **CosineAnnealingLR**: Smooth cosine oscillation
- **CosineAnnealingWarmRestarts**: Cosine with periodic restarts
- **OneCycleLR**: One cycle policy for super-convergence
- **ReduceLROnPlateau**: Adaptive reduction on validation plateaus
- **LinearLR**: Linear interpolation between rates
- **PolynomialLR** ✨: Polynomial decay with configurable power
- **CyclicalLR** ✨: Triangular, triangular2, and exponential range modes
- **WarmupScheduler** ✨: Gradual warmup wrapper for any base scheduler
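
The warmup wrapper listed above can be sketched as a scheduler that linearly ramps up, then delegates to the wrapped schedule. This is a hedged illustration under an assumed trait shape — the trait, struct, and field names here are not the crate's actual API:

```rust
/// Any schedule exposed through this trait can be wrapped with warmup.
/// (Trait and names are illustrative, not the crate's actual API.)
trait LrSchedule {
    fn lr_at(&self, epoch: usize) -> f64;
}

/// Linearly ramp from `start_lr` up to the wrapped schedule's first value,
/// then delegate with the epoch counter shifted past the warmup phase.
struct Warmup<S: LrSchedule> {
    warmup_epochs: usize,
    start_lr: f64,
    inner: S,
}

impl<S: LrSchedule> LrSchedule for Warmup<S> {
    fn lr_at(&self, epoch: usize) -> f64 {
        if epoch < self.warmup_epochs {
            let target = self.inner.lr_at(0);
            let frac = (epoch + 1) as f64 / self.warmup_epochs as f64;
            self.start_lr + (target - self.start_lr) * frac
        } else {
            self.inner.lr_at(epoch - self.warmup_epochs)
        }
    }
}

/// Trivial base schedule used to exercise the wrapper.
struct Constant(f64);
impl LrSchedule for Constant {
    fn lr_at(&self, _epoch: usize) -> f64 {
        self.0
    }
}

fn main() {
    let sched = Warmup { warmup_epochs: 5, start_lr: 0.0001, inner: Constant(0.01) };
    for epoch in 0..8 {
        println!("epoch {epoch}: lr = {:.5}", sched.lr_at(epoch));
    }
}
```

Because the wrapper only needs `lr_at`, it composes with any base scheduler, which is what makes a generic warmup wrapper possible.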

## Architecture

- **`layers`**: LSTM and GRU cells (standard, peephole, bidirectional) with dropout
@@ -223,7 +257,8 @@ cargo run --example bilstm_example # Bidirectional LSTM
cargo run --example dropout_example # Comprehensive dropout demo

# Learning and scheduling
cargo run --example learning_rate_scheduling # Basic schedulers
cargo run --example advanced_lr_scheduling # Advanced schedulers with visualization

# Real-world applications
cargo run --example stock_prediction
@@ -257,8 +292,12 @@ cargo run --example model_inspection
### Learning Rate Schedulers
- **StepLR**: Decay by factor every N epochs
- **OneCycleLR**: One cycle policy (warmup + annealing)
- **CosineAnnealingLR**: Smooth cosine oscillation with warm restarts
- **ReduceLROnPlateau**: Reduce when validation loss plateaus
- **PolynomialLR**: Polynomial decay with configurable power
- **CyclicalLR**: Triangular oscillation with multiple modes
- **WarmupScheduler**: Gradual increase wrapper for any scheduler
- **LinearLR**: Linear interpolation between learning rates
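
Of these, ReduceLROnPlateau is the only stateful, loss-driven scheduler; the core idea fits in a few lines. A minimal sketch (illustrative only, not the crate's implementation):

```rust
/// Scale the learning rate by `factor` whenever the validation loss
/// has not improved for more than `patience` consecutive epochs.
/// (Illustrative sketch, not the crate's ReduceLROnPlateau.)
struct ReduceOnPlateau {
    lr: f64,
    factor: f64,
    patience: usize,
    best: f64,
    bad_epochs: usize,
}

impl ReduceOnPlateau {
    fn new(lr: f64, factor: f64, patience: usize) -> Self {
        Self { lr, factor, patience, best: f64::INFINITY, bad_epochs: 0 }
    }

    /// Call once per epoch with the current validation loss; returns the LR to use.
    fn step(&mut self, val_loss: f64) -> f64 {
        if val_loss < self.best {
            self.best = val_loss; // improvement: reset the counter
            self.bad_epochs = 0;
        } else {
            self.bad_epochs += 1;
            if self.bad_epochs > self.patience {
                self.lr *= self.factor; // plateau: decay and start counting again
                self.bad_epochs = 0;
            }
        }
        self.lr
    }
}

fn main() {
    let mut sched = ReduceOnPlateau::new(0.01, 0.5, 1);
    for loss in [1.0, 0.8, 0.8, 0.8, 0.79] {
        println!("loss {loss}: lr = {}", sched.step(loss));
    }
}
```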

## Testing

@@ -295,6 +334,7 @@ cargo run --example text_classification_bilstm # Classification accuracy

## Version History

- **v0.4.0**: Advanced learning rate scheduling with 12 different schedulers, warmup support, cyclical learning rates, polynomial decay, and ASCII visualization
- **v0.3.0**: Bidirectional LSTM networks with flexible combine modes
- **v0.2.0**: Complete training system with BPTT and comprehensive dropout
- **v0.1.0**: Initial LSTM implementation with forward pass