hvzzzz/cva6_experiments

Experiment Log

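For each bitstream, this log records four sets of results: implementation utilization, implementation timing, UnixBench, and the NAS Parallel Benchmarks (NPB). Performance counters are measured in every benchmark run.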

Baseline bitstream

This is the bitstream produced at the end of the COMPAS 2025 tutorial, with performance counters enabled in ~/cva6/core/include/cv64a6_imafdch_sv39_config_pkg.sv:

#from
localparam CVA6ConfigPerfCounterEn = 0;

#to
localparam CVA6ConfigPerfCounterEn = 1;
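
The same edit can be scripted when several bitstream variants are built. The following is a minimal sketch, assuming each parameter is defined exactly once in the form `localparam NAME = VALUE;`. The helper name `set_localparam` is hypothetical and not part of this repository.

```python
import re


def set_localparam(source: str, name: str, value: int) -> str:
    """Return `source` with the integer value of `localparam <name>` replaced."""
    # Capture everything up to the old value and the trailing semicolon,
    # so only the number itself is rewritten.
    pattern = re.compile(rf"(localparam\s+{re.escape(name)}\s*=\s*)\d+(\s*;)")
    updated, count = pattern.subn(rf"\g<1>{value}\g<2>", source)
    if count != 1:
        raise ValueError(f"expected exactly one definition of {name}, found {count}")
    return updated


snippet = "localparam CVA6ConfigPerfCounterEn = 0;"
print(set_localparam(snippet, "CVA6ConfigPerfCounterEn", 1))
```

Raising an error when the parameter is missing or defined more than once avoids silently building a bitstream with an unintended configuration.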

Project Summary

Implementation Utilization

✅ Reading report from: ./bitstreams/reports/utilization/SoC_wrapper_utilization_placed_base.rpt

./.ob-jupyter/80f69f1fc27fac1f504bac7de189b99ab81f522b.png

Implementation Timing

--- Timing Summary ---
| Metric | Value (ns) |
|---|---|
| WNS (Setup) | 1.218 |
| WHS (Hold) | 0.051 |
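
A positive WNS means the design meets setup timing with some slack. A rough upper bound on the clock frequency can be estimated as f_max = 1 / (T_clk - WNS). The sketch below assumes a 20 ns (50 MHz) clock period, which is not stated in this log, so treat the constant as a placeholder to replace with the real constraint.

```python
def fmax_mhz(clock_period_ns: float, wns_ns: float) -> float:
    """Estimate the maximum clock frequency in MHz from the setup slack."""
    critical_path_ns = clock_period_ns - wns_ns
    if critical_path_ns <= 0:
        raise ValueError("WNS must be smaller than the clock period")
    return 1000.0 / critical_path_ns


ASSUMED_PERIOD_NS = 20.0  # assumption: not taken from the timing report
print(f"{fmax_mhz(ASSUMED_PERIOD_NS, 1.218):.2f} MHz")
```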

File locations

Bitstream

./bitstreams/baseline.bit

Implementation utilization report

./bitstreams/reports/utilization/SoC_wrapper_utilization_placed_base.rpt

Implementation timing report

./bitstreams/reports/timing/SoC_wrapper_timing_summary_routed_base.rpt

Implementation power report

./bitstreams/reports/power/SoC_wrapper_power_routed_base.rpt

UnixBench

Experiment metadata ./runs/UnixBench/unixbench_19700101_064547_base.txt

✅ Loading data from: ./runs/UnixBench/unixbench_19700101_064547_base.txt

./.ob-jupyter/5dc1bd1f49c23865a5455de0268ca0612b7b5dc7.png

=== UnixBench Results (Detailed) ===
| | Label | Category | IPC | Instructions | Cycles | Time_s | Score | Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | Arith Overhead | Integer / ALU | 0.388300 | 96160360 | 247619804 | 4.952396 | 22607716.000000 | lps |
| 1 | Arith Register | Integer / ALU | 0.568500 | 140705146 | 247489519 | 4.949790 | 658341.000000 | lps |
| 2 | Arith Short | Integer / ALU | 0.569000 | 140859170 | 247565240 | 4.951305 | 659170.000000 | lps |
| 3 | Arith Int | Integer / ALU | 0.568700 | 140769091 | 247527031 | 4.950541 | 658720.000000 | lps |
| 4 | Dhrystone | Integer / ALU | 0.447600 | 81389671 | 181828082 | 3.636562 | 259837.000000 | lps |
| 5 | Arith Double | Integer / ALU | 0.568800 | 140675372 | 247333099 | 4.946662 | 658256.000000 | lps |
| 6 | Whetstone | Float / FPU | 0.410800 | 103499392 | 251932322 | 5.038646 | 19.449000 | MWIPS |
| 7 | Sys Mix | System / OS | 0.174900 | 43254593 | 247366131 | 4.947323 | 22651.000000 | lps |
| 8 | Sys GetPID | System / OS | 0.225900 | 55509267 | 245775079 | 4.915502 | 205011.000000 | lps |
| 9 | Sys Exec | System / OS | 0.119100 | 2620492 | 21995437 | 0.439909 | 20.000000 | lps |
| 10 | Pipe_Throughput | System / OS | 0.140400 | 34742090 | 247521360 | 4.950427 | 8840.000000 | lps |
| 11 | Context_Switching | System / OS | 0.116900 | 14197502 | 121473123 | 2.429462 | 856.000000 | lps |
✅ Data exported to unixbench_results_final.csv
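
The IPC column is the ratio of retired instructions to elapsed cycles. The sketch below recomputes it for two rows of the baseline UnixBench table. The function name is illustrative and is not taken from the notebook that produced these results.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle from the two performance-counter readings."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


# (Instructions, Cycles) copied from the baseline UnixBench table.
rows = {
    "Dhrystone": (81_389_671, 181_828_082),
    "Sys Exec": (2_620_492, 21_995_437),
}

for label, (instr, cyc) in rows.items():
    print(f"{label}: IPC = {ipc(instr, cyc):.4f}")
```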

NAS Parallel Benchmark

Experiment metadata ./runs/NPB/NPB_19700101_000506_base.txt


./.ob-jupyter/0754cef1cbe6a2e5a6fad24ef614879a75e57328.png

=== NAS Parallel Benchmark Results ===
| | Label | Category | IPC | Instructions | Cycles | Time_s | Score | Score Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | EP | Compute Bound | 0.297000 | 2091350139 | 7042643975 | 284.540000 | 0.120000 | Mop/s |
| 1 | IS | Memory Bound | 0.213100 | 7566437 | 35513805 | 1.440000 | 0.450000 | Mop/s |
| 2 | CG | Memory Bound | 0.167400 | 355825955 | 2125041156 | 85.900000 | 0.780000 | Mop/s |
| 3 | MG | Memory Bound | 0.192400 | 33959879 | 176499365 | 7.130000 | 1.070000 | Mop/s |
| 4 | FT | Mixed / Streaming | 0.159500 | 586665770 | 3678664319 | 148.680000 | 1.190000 | Mop/s |
| 5 | BT | Mixed / Streaming | 0.209400 | 549503500 | 2624049654 | 106.010000 | 2.150000 | Mop/s |
| 6 | SP | Mixed / Streaming | 0.181200 | 357293866 | 1971849827 | 79.680000 | 1.210000 | Mop/s |
| 7 | LU | Mixed / Streaming | 0.229800 | 200392992 | 872027385 | 35.360000 | 2.890000 | Mop/s |

1st Iteration

This bitstream raises the parameters listed below to test the resource limits of the PYNQ-Z2 FPGA on which the experiments run. The data-cache associativity is the exception: it is lowered from 4 to 2 ways.

#from
localparam CVA6ConfigDcacheByteSize = 4096;
localparam CVA6ConfigDcacheSetAssoc = 4;

localparam CVA6ConfigRASDepth = 2;
localparam CVA6ConfigBTBEntries = 16;
localparam CVA6ConfigBHTEntries = 16;
#to
localparam CVA6ConfigDcacheByteSize = 8192;
localparam CVA6ConfigDcacheSetAssoc = 2;

localparam CVA6ConfigRASDepth = 4;
localparam CVA6ConfigBTBEntries = 32;
localparam CVA6ConfigBHTEntries = 32;
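
To see what the data-cache change means for the cache geometry, the number of sets can be derived as size / (ways * line size). The sketch below assumes a 16-byte cache line, which is not stated in this log, so the absolute set counts are illustrative and only the ratio between the two configurations is independent of that assumption.

```python
def cache_sets(byte_size: int, ways: int, line_bytes: int) -> int:
    """Number of sets in a set-associative cache."""
    if byte_size % (ways * line_bytes) != 0:
        raise ValueError("cache size must be a multiple of ways * line size")
    return byte_size // (ways * line_bytes)


ASSUMED_LINE_BYTES = 16  # assumption: not taken from the configuration shown

before = cache_sets(4096, 4, ASSUMED_LINE_BYTES)  # baseline D-cache
after = cache_sets(8192, 2, ASSUMED_LINE_BYTES)   # 1st iteration D-cache
print(f"sets before: {before}, sets after: {after}, ratio: {after // before}x")
```

Doubling the capacity while halving the associativity quadruples the number of sets, so each way grows in depth rather than the cache growing in width.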

Project Summary

Implementation Utilization

✅ Reading report from: ./bitstreams/reports/utilization/SoC_wrapper_utilization_placed_1st.rpt

./.ob-jupyter/2ec5524f50090227c3df06c7f1acc8125c316a8f.png

Implementation Timing

--- Timing Summary ---
| Metric | Value (ns) |
|---|---|
| WNS (Setup) | 1.351 |
| WHS (Hold) | 0.056 |

File locations

Bitstream

./bitstreams/1st.bit

Implementation utilization report

./bitstreams/reports/utilization/SoC_wrapper_utilization_placed_1st.rpt

Implementation timing report

./bitstreams/reports/timing/SoC_wrapper_timing_summary_routed_1st.rpt

Implementation power report

./bitstreams/reports/power/SoC_wrapper_power_routed_1st.rpt

UnixBench

Experiment metadata ./runs/UnixBench/unixbench_19700101_003833_1st.txt

✅ Loading data from: ./runs/UnixBench/unixbench_19700101_003833_1st.txt

./.ob-jupyter/7506470d3801753bb051a873b483ad8e2a435fc0.png

=== UnixBench Results (Detailed) ===
| | Label | Category | IPC | Instructions | Cycles | Time_s | Score | Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | Arith Overhead | Integer / ALU | 0.391500 | 95753191 | 244578185 | 4.891564 | 22495175.000000 | lps |
| 1 | Arith Register | Integer / ALU | 0.573100 | 141364718 | 246647777 | 4.932956 | 661449.000000 | lps |
| 2 | Arith Short | Integer / ALU | 0.573100 | 140696214 | 245510428 | 4.910209 | 658204.000000 | lps |
| 3 | Arith Int | Integer / ALU | 0.573600 | 141335408 | 246388097 | 4.927762 | 661411.000000 | lps |
| 4 | Dhrystone | Integer / ALU | 0.452300 | 83175606 | 183896069 | 3.677921 | 265552.000000 | lps |
| 5 | Arith Double | Integer / ALU | 0.573400 | 141352529 | 246535206 | 4.930704 | 661348.000000 | lps |
| 6 | Whetstone | Float / FPU | 0.417900 | 103434562 | 247524633 | 4.950493 | 19.731000 | MWIPS |
| 7 | Sys Mix | System / OS | 0.180400 | 44336566 | 245778092 | 4.915562 | 23285.000000 | lps |
| 8 | Sys GetPID | System / OS | 0.215500 | 53320540 | 247461857 | 4.949237 | 195993.000000 | lps |
| 9 | Sys Exec | System / OS | 0.124600 | 2604873 | 20913995 | 0.418280 | 21.000000 | lps |
| 10 | Pipe_Throughput | System / OS | 0.152200 | 37684057 | 247551644 | 4.951033 | 9751.000000 | lps |
| 11 | Context_Switching | System / OS | 0.124700 | 14407118 | 115529727 | 2.310595 | 1009.000000 | lps |

NAS Parallel Benchmark

Experiment metadata ./runs/NPB/NPB_19700101_002355_1st.txt


./.ob-jupyter/08cb551cfa343f02c93eaf993cabac721263db7c.png

=== NAS Parallel Benchmark Results ===
| | Label | Category | IPC | Instructions | Cycles | Time_s | Score | Score Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | EP | Compute Bound | 0.312400 | 2082531833 | 6665771571 | 269.340000 | 0.120000 | Mop/s |
| 1 | IS | Memory Bound | 0.249300 | 7445551 | 29870149 | 1.220000 | 0.540000 | Mop/s |
| 2 | CG | Memory Bound | 0.179700 | 350957451 | 1953553106 | 79.030000 | 0.840000 | Mop/s |
| 3 | MG | Memory Bound | 0.202500 | 33901924 | 167451164 | 6.970000 | 1.090000 | Mop/s |
| 4 | FT | Mixed / Streaming | 0.165900 | 582655330 | 3513083635 | 141.860000 | 1.250000 | Mop/s |
| 5 | BT | Mixed / Streaming | 0.212300 | 548700198 | 2584271595 | 104.490000 | 2.190000 | Mop/s |
| 6 | SP | Mixed / Streaming | 0.188500 | 355398830 | 1885536072 | 76.360000 | 1.270000 | Mop/s |
| 7 | LU | Mixed / Streaming | 0.241600 | 199234007 | 824743421 | 33.420000 | 3.060000 | Mop/s |

2nd Iteration

#from
  localparam CVA6ConfigIcacheByteSize = 4096;
  localparam CVA6ConfigDcacheSetAssoc = 2;
#to
  localparam CVA6ConfigIcacheByteSize = 8192;
  localparam CVA6ConfigDcacheSetAssoc = 4;

Project Summary

Implementation Utilization

✅ Reading report from: ./bitstreams/reports/utilization/SoC_wrapper_utilization_placed_2nd.rpt

./.ob-jupyter/072d10f6f4faef62027b7fc81c8d056e0228c87c.png

Implementation Timing

--- Timing Summary ---
| Metric | Value (ns) |
|---|---|
| WNS (Setup) | 1.496 |
| WHS (Hold) | 0.051 |

File locations

Bitstream

./bitstreams/2nd.bit

Implementation utilization report

./bitstreams/reports/utilization/SoC_wrapper_utilization_placed_2nd.rpt

Implementation timing report

./bitstreams/reports/timing/SoC_wrapper_timing_summary_routed_2nd.rpt

Implementation power report

./bitstreams/reports/power/SoC_wrapper_power_routed_2nd.rpt

UnixBench

Experiment metadata ./runs/UnixBench/unixbench_19700101_001958_2nd.txt

✅ Loading data from: ./runs/UnixBench/unixbench_19700101_001958_2nd.txt

./.ob-jupyter/fe1f54608913db97faeb9ea5b73ab5299a28a7f3.png

=== UnixBench Results (Detailed) ===
| | Label | Category | IPC | Instructions | Cycles | Time_s | Score | Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | Arith Overhead | Integer / ALU | 0.408700 | 101317858 | 247886038 | 4.957721 | 23890762.000000 | lps |
| 1 | Arith Register | Integer / ALU | 0.597800 | 147624586 | 246961875 | 4.939237 | 691936.000000 | lps |
| 2 | Arith Short | Integer / ALU | 0.598900 | 148504569 | 247979271 | 4.959585 | 696400.000000 | lps |
| 3 | Arith Int | Integer / ALU | 0.598700 | 148399622 | 247887814 | 4.957756 | 695912.000000 | lps |
| 4 | Dhrystone | Integer / ALU | 0.471300 | 87991678 | 186683112 | 3.733662 | 281643.000000 | lps |
| 5 | Arith Double | Integer / ALU | 0.598800 | 148435454 | 247898996 | 4.957980 | 696012.000000 | lps |
| 6 | Whetstone | Float / FPU | 0.436200 | 108567539 | 248878043 | 4.977561 | 20.754000 | MWIPS |
| 7 | Sys Mix | System / OS | 0.207900 | 51521386 | 247796049 | 4.955921 | 27655.000000 | lps |
| 8 | Sys GetPID | System / OS | 0.249300 | 61428770 | 246364031 | 4.927281 | 229367.000000 | lps |
| 9 | Sys Exec | System / OS | 0.132100 | 2800643 | 21208683 | 0.424174 | 23.000000 | lps |
| 10 | Pipe_Throughput | System / OS | 0.195200 | 48358031 | 247792073 | 4.955841 | 13081.000000 | lps |
| 11 | Context_Switching | System / OS | 0.135800 | 15766359 | 116069269 | 2.321385 | 1216.000000 | lps |

NAS Parallel Benchmark

Experiment metadata ./runs/NPB/NPB_19700101_002355_1st.txt


./.ob-jupyter/89aacd822027c9a83e8eba09afb5a7335701c951.png

=== NAS Parallel Benchmark Results ===
| | Label | Category | IPC | Instructions | Cycles | Time_s | Score | Score Unit |
|---|---|---|---|---|---|---|---|---|
| 0 | EP | Compute Bound | 0.325500 | 2075695973 | 6377044336 | 257.330000 | 0.130000 | Mop/s |
| 1 | IS | Memory Bound | 0.254800 | 7414347 | 29097247 | 1.170000 | 0.560000 | Mop/s |
| 2 | CG | Memory Bound | 0.186900 | 349903827 | 1872230866 | 75.600000 | 0.880000 | Mop/s |
| 3 | MG | Memory Bound | 0.214800 | 33482205 | 155861090 | 6.290000 | 1.210000 | Mop/s |
| 4 | FT | Mixed / Streaming | 0.171700 | 579361103 | 3373637514 | 136.150000 | 1.300000 | Mop/s |
| 5 | BT | Mixed / Streaming | 0.221700 | 545606453 | 2460613665 | 99.280000 | 2.300000 | Mop/s |
| 6 | SP | Mixed / Streaming | 0.199100 | 352486732 | 1770307322 | 71.440000 | 1.350000 | Mop/s |
| 7 | LU | Mixed / Streaming | 0.251000 | 198250642 | 789748912 | 31.860000 | 3.210000 | Mop/s |
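
One way to compare the three bitstreams is the NPB speedup relative to the baseline, computed as baseline time divided by candidate time. The sketch below uses the Time_s values from the three NPB tables in this log. The dictionary layout and the function name are illustrative and are not taken from the repository's notebook.

```python
# Time_s per benchmark as (baseline, 1st iteration, 2nd iteration),
# copied from the NPB tables in this log.
npb_time_s = {
    "EP": (284.54, 269.34, 257.33),
    "IS": (1.44, 1.22, 1.17),
    "CG": (85.90, 79.03, 75.60),
    "MG": (7.13, 6.97, 6.29),
    "FT": (148.68, 141.86, 136.15),
    "BT": (106.01, 104.49, 99.28),
    "SP": (79.68, 76.36, 71.44),
    "LU": (35.36, 33.42, 31.86),
}


def speedup(baseline_s: float, candidate_s: float) -> float:
    """Speedup of a candidate run over the baseline (values above 1 are faster)."""
    return baseline_s / candidate_s


for name, (base, first, second) in npb_time_s.items():
    print(f"{name}: 1st x{speedup(base, first):.3f}, 2nd x{speedup(base, second):.3f}")
```

Each configuration was measured once here, so small differences between bitstreams should be read with run-to-run variation in mind.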

About

This repository contains performance measurement results for a RISC-V CVA6 core running on a PYNQ-Z2 FPGA.
