Description:
I'm running the Wanda pruning method on Llama 3.2-1b, but the process hangs at "loading calibration data" and never proceeds. Pruning Llama 2 with the same setup completes without issues. The command and relevant information are below.
Command:
python main_new.py --model /Data/llama3.2-1b --prune_method wanda --sparsity_ratio 0.5 --sparsity_type unstructured --save out/llama3.2-1b/unstructured/wanda/ --save_model out/llama3.2-1b/unstructured/wanda/pruned_model
Environment:
- torch 1.10.1
- transformers 4.28.0
- accelerate 0.18.0
- Number of GPUs: 4
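For reference, the installed versions listed above can be confirmed programmatically. This is a generic standard-library sketch; `report_versions` is a hypothetical helper written for this report, not part of the repo:

```python
from importlib import metadata

def report_versions(packages):
    """Return {package: installed version, or 'not installed'}."""
    report = {}
    for pkg in packages:
        try:
            report[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            report[pkg] = "not installed"
    return report

# Print the versions of the packages relevant to this issue.
print(report_versions(["torch", "transformers", "accelerate"]))
```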
Debug Output:
Arguments:
sparsity_type: unstructured
sparsity_ratio: 0.5
prune_method: wanda
------------------------------
loading llm model /Data/llama3.2-1b
Some weights of LlamaForCausalLM were not initialized from the model checkpoint at /Data/llama3.2-1b and are newly initialized:
['model.layers.1.self_attn.rotary_emb.inv_freq', 'model.layers.10.self_attn.rotary_emb.inv_freq', 'model.layers.7.self_attn.rotary_emb.inv_freq', ...]
You should probably TRAIN this model on a downstream task to be able to use it for predictions and inference.
use device cuda:0
pruning starts
loading calibration data
Problem:
- The pruning process stalls at "loading calibration data" and never proceeds.
- The same command prunes Llama 2 without issues; the hang is specific to Llama 3.2-1b.
Steps Taken:
I ran the same command against both Llama 3.2-1b and Llama 2. Only the Llama 3.2-1b run hangs; the Llama 2 run prunes successfully, which isolates the issue to this model.
Expected Behavior:
The pruning process should proceed without getting stuck at the "loading calibration data" step.
Any advice or insights would be greatly appreciated.
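As a diagnostic, one way to tell a genuine hang apart from a slow first-time calibration-dataset download is to wrap the suspected call in a timeout. This is a sketch; `run_with_timeout` is a hypothetical helper written for this report, not part of the Wanda repo:

```python
import threading
import time

def run_with_timeout(fn, timeout_s):
    """Run fn in a daemon thread; return (finished, result_dict)."""
    result = {}

    def worker():
        try:
            result["value"] = fn()
        except Exception as exc:  # surface errors instead of losing them in the thread
            result["error"] = exc

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout_s)          # wait at most timeout_s seconds
    return not t.is_alive(), result

# In the real repo you would wrap the calibration loader here, e.g. the
# function behind "loading calibration data". Demo with a stand-in fast call:
finished, result = run_with_timeout(lambda: sum(range(1000)), timeout_s=5)
print("finished:", finished, "value:", result.get("value"))
# → finished: True value: 499500
```

If the wrapped loader never finishes even with a generous timeout, the hang is in data loading itself rather than just a slow download.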