When training with the Megatron trainer backend in LLaMA-Factory, enabling both `neat_packing` and `context_parallel_size` (CP) results in an `AssertionError: neat_packing + context_parallel alignment mismatch (sub-sequence length not aligned to 2 * cp_size)`.
config:
........
cutoff_len: 32768
neat_packing: true
context_parallel_size: 4
.........
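
The constraint exists because Megatron-LM's context parallelism splits every sequence into 2 * cp_size chunks so that causal-attention work is balanced across CP ranks; with `context_parallel_size: 4`, every packed sub-sequence must therefore be a multiple of 8 tokens, which neat packing does not guarantee. Below is a minimal Python sketch of one workaround, padding each packed sub-sequence up to the alignment boundary before packing. This is an illustration of the constraint, not LLaMA-Factory's actual code; the function name `pad_to_cp_alignment` and the `pad_id` parameter are hypothetical.

```python
# Sketch: pad each packed sub-sequence so its length is divisible by
# 2 * cp_size, the chunking granularity Megatron's context parallelism
# uses to balance causal-attention work across CP ranks.
from typing import List


def pad_to_cp_alignment(subseq: List[int], cp_size: int, pad_id: int = 0) -> List[int]:
    """Pad one packed sub-sequence to a multiple of 2 * cp_size tokens."""
    align = 2 * cp_size  # with cp_size=4 this is 8
    remainder = len(subseq) % align
    if remainder:
        subseq = subseq + [pad_id] * (align - remainder)
    # Mirrors the failing check: length must now align to 2 * cp_size.
    assert len(subseq) % align == 0, (
        "neat_packing + context_parallel alignment mismatch "
        "(sub-sequence length not aligned to 2 * cp_size)"
    )
    return subseq


# Example: a 13-token sub-sequence with cp_size=4 is padded to 16 tokens.
padded = pad_to_cp_alignment(list(range(13)), cp_size=4)
print(len(padded))  # 16
```

An alternative workaround, at the cost of losing neat packing's attention isolation, is to disable `neat_packing` (or the CP dimension) so the two features are not combined.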