8 changes: 8 additions & 0 deletions build.sh
@@ -0,0 +1,8 @@
#!/bin/bash
#rm -rf build
cmake -S . -B build -DLLAMA_CUBLAS=ON -DLLAMA_GGML_PERF=ON #-DLLAMA_RUN_WARMUP=OFF
cmake --build build --config Release

# With -DLLAMA_GGML_PERF=ON:
#   -> average execution time of each operator in the prefill and decode stages
#   -> ratio of active neurons located on the CPU
3 changes: 3 additions & 0 deletions llama.cpp
@@ -2863,6 +2863,9 @@ struct llama_gpu_split_loader {
}
}

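// Debug output: print the first two dimensions of layer 0's ffn_up tensor.
ggml_tensor * up = model->layers[0].ffn_up;
// ne[] holds int64_t values; cast to long long so the format is portable.
printf("\n>>> %lld, %lld\n", (long long) up->ne[0], (long long) up->ne[1]);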

const int64_t t_mlp_us = ggml_time_us() - t_start_mlp_us;
LLAMA_LOG_INFO(" done (%.2f ms)\n", t_mlp_us / 1000.0);

5 changes: 4 additions & 1 deletion powerinfer-py/powerinfer/export_split.py
@@ -18,7 +18,10 @@ def load_activation_weights(models_base: Path):
    # But for now, let's assume it is a plain directory of activation_{0, ... , n_layers - 1}.pt
    *_, files = next(os.walk(models_base))
    activation_files = [f for f in files if re.match(r"activation_\d+\.pt", f)]
    activation_files.sort()

    # Order the files by numeric layer index so that layer 10 comes after layer 9, not after layer 1.
    layer_num = np.array([int(re.sub(r'[^0-9]', '', f)) for f in activation_files])
    idx = np.argsort(layer_num)
    activation_files = [activation_files[i] for i in idx]
    return [torch.load(models_base / f) for f in activation_files]

def append_gpu_idx(gguf: GGUFWriter, i_layer: int, activation, select_count) -> None:
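The change to `load_activation_weights` replaces a plain string sort with a sort on the numeric layer index. The sketch below illustrates why that matters, assuming file names of the form `activation_<n>.pt` as in the diff. The helper `sort_by_layer` is a hypothetical name and is not part of the PR. It uses a key function instead of `np.argsort`, which is an alternative technique to the one in the change.

```python
import re


def sort_by_layer(filenames):
    """Return activation file names ordered by numeric layer index.

    Hypothetical helper for illustration; not part of the PR.
    """
    def layer_index(name):
        match = re.fullmatch(r"activation_(\d+)\.pt", name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        return int(match.group(1))

    return sorted(filenames, key=layer_index)


names = ["activation_10.pt", "activation_2.pt", "activation_0.pt", "activation_1.pt"]

# A plain string sort places layer 10 before layer 2.
lexicographic = sorted(names)

# Sorting on the parsed integer keeps the layers in model order.
numeric = sort_by_layer(names)

print(lexicographic)
print(numeric)
```

With more than ten layers, the string sort would pair activation data with the wrong layer, which is the problem the numeric ordering avoids.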
1 change: 1 addition & 0 deletions scripts/pg19_firstbook_128.txt
@@ -0,0 +1 @@
Half-way down the Rue Saint-Denis, almost at the corner of the Rue du Petit-Lion, there stood formerly one of those delightful houses which enable historians to reconstruct old Paris by analogy. The threatening walls of this tumbledown abode seemed to have been decorated with hieroglyphics. For what other name could the passer-by give to the Xs and Vs which the horizontal or diagonal timbers traced on the front, outlined by little parallel cracks in the plaster? It was evident that every beam quivered in its mortices at the passing of the lightest vehicle. This venerable structure was crowned by a