update `master` branch from `embench-2.0-branch` by lesteral · Pull Request #211 · embench/embench-iot

lesteral · 2026-02-04T21:24:10Z

@jeremybennett - Can you please update the master branch, from the embench-2.0-branch, as per this PR?

Thanks & regards, Lester

This change removes all floating-point operations from the benchmark, and reduces the size of the x86 executable to 57k. It also enables the use of deeper trees (max_depth increased from 4 to 5), which slightly increases the complexity of the benchmark. Overall accuracy on the 8x8 downscaled MNIST dataset is 95.82%.

Update xgboost benchmark to use uint8-quantized weights
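The commit message does not describe how the weights were quantized, so the following is only a generic sketch of linear uint8 quantization, not the benchmark's actual scheme. The function names and the sample weights are hypothetical. Quantization like this would run offline with floating point, so that the code under test only handles the resulting integers.

```python
def quantize_uint8(weights):
    """Map float weights linearly onto the integer range 0..255.

    Hypothetical helper: returns the quantized values plus the scale and
    offset needed to approximately recover the original weights.
    """
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 if hi != lo else 1.0
    quantized = [round((w - lo) / scale) for w in weights]
    return quantized, scale, lo


def dequantize(quantized, scale, lo):
    """Approximate inverse of quantize_uint8."""
    return [lo + q * scale for q in quantized]


# Illustrative weights only; these are not values from the benchmark.
weights = [-0.8, -0.1, 0.3, 1.2]
quantized, scale, lo = quantize_uint8(weights)
restored = dequantize(quantized, scale, lo)
print(quantized)
```

Each restored weight differs from the original by at most about half of `scale`, which is the usual accuracy trade-off of an 8-bit representation.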

If we call exit, we end up pulling in the C standard library.

* support/beebsc.c: Use assert_beebs rather than assert with init_heap_beebs.
* support/beebsc.h: rewrite assert_beebs to not use exit.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

We separate out the CPU_MHZ into its two roles. The first uses GLOBAL_SCALE_FACTOR to scale the benchmarks when building so each runs in around 4 seconds. The second is to work out the Embench score per MHz.

We now scale the benchmarks, with two nested loops, one for the LOCAL_SCALE_FACTOR and one for the GLOBAL_SCALE_FACTOR. This allows us to not overflow the loop count with 8/16-bit architectures, while being able to scale up to modern big fast machines.

We adjust LOCAL_SCALE_FACTOR values for the benchmarks kept from Embench IoT 1.0 to take account of improvements in compiler performance.

* baseline-data/speed.json: Updated for Embench 2.0.
* benchmark_speed.py: Script updated for new GLOBAL_SCALE_FACTOR; remove parallel execution; new options to generate MD and CSV output.f; generate total and per MHz scores for relative results.
* doc/README.md: Updated to document GLOBAL_SCALE_FACTOR.
* examples/arm/stm32f4-discovery/README.md: Updated to use GLOBAL_SCALE_FACTOR.
* pylib/embench_core.py: Add MD and CSV to class output_format; move stats output functions to benchmark_speed.py.
* pylib/run_stm32f4-discovery.py: Move --cpu_mhz to benchmark_speed.py, pass args to functions.
* sconstruct.py: Add --gsf option and help test, remove trailing whitespace.
* src/aha-mont64/mont64.c: Use LOCAL_SCALE_FACTOR and GLOBAL_SCALE_FACTOR in nested loop to scale performance.
* src/crc32/crc_32.c: Likewise.
* src/depthconv/depthconv.c: Likewise.
* src/edn/libedn.c: Likewise.
* src/huffbench/libhuffbench.c: Likewise.
* src/matmult-int/matmult-int.c: Likewise.
* src/md5sum/md5.c: Likewise.
* src/nettle-aes/nettle-aes.c: Likewise.
* src/nettle-sha256/nettle-sha256.c: Likewise.
* src/nsichneu/libnsichneu.c: Likewise.
* src/picojpeg/picojpeg_test.c: Likewise.
* src/qrduino/qrtest.c: Likewise.
* src/sglib-combined/combined.c: Likewise.
* src/slre/libslre.c: Likewise.
* src/statemate/libstatemate.c: Likewise.
* src/tarfind/tarfind.c: Likewise.
* src/ud/libud.c: Likewise.
* src/wikisort/libwikisort.c: Likewise.
* src/xgboost/testbench.c: Likewise.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>
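The nested-loop scaling described in this commit can be modelled in a few lines. This is a rough sketch rather than the benchmark code itself (which is C): the factor values are hypothetical, and placing the GLOBAL_SCALE_FACTOR loop outside the LOCAL_SCALE_FACTOR loop is an assumption, since the commit message does not state the nesting order. The point is that each counter stays small even when the total iteration count would not fit in a 16-bit counter.

```python
# Hypothetical values for illustration; real factors are set per benchmark
# and per build.
LOCAL_SCALE_FACTOR = 1000
GLOBAL_SCALE_FACTOR = 100

UINT16_MAX = 0xFFFF  # largest value a 16-bit unsigned loop counter can hold


def run_scaled(body, lsf=LOCAL_SCALE_FACTOR, gsf=GLOBAL_SCALE_FACTOR):
    """Call body() lsf * gsf times using two nested loops."""
    calls = 0
    for _ in range(gsf):      # global scaling, chosen for the target's speed
        for _ in range(lsf):  # per-benchmark scaling
            body()
            calls += 1
    return calls


def fits_16_bit(count):
    """True if count can be held in a 16-bit unsigned counter."""
    return 0 <= count <= UINT16_MAX


total = run_scaled(lambda: None)
print(total)                            # total number of body() calls
print(fits_16_bit(total))               # a single flat counter would overflow
print(fits_16_bit(LOCAL_SCALE_FACTOR))  # each nested counter fits
print(fits_16_bit(GLOBAL_SCALE_FACTOR))
```

With these example values a single flat loop would need to count to 100000, which exceeds 65535, while neither nested counter goes above 1000.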

* sconstruct.py: Set up the environment from the parent process. Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

The previous data fell foul of the scons config not importing the environment, so it was in fact collected with the system GCC 13.2. This commit correctly records data for GCC 14.1, and adjusts local scale factors accordingly.

* baseline-data/speed.json: Updated data for GCC 14.1.
* src/aha-mont64/mont64.c: Adjust LOCAL_SCALE_FACTOR.
* src/edn/libedn.c: Likewise.
* src/huffbench/libhuffbench.c: Likewise.
* src/matmult-int/matmult-int.c: Likewise.
* src/md5sum/md5.c: Likewise.
* src/nettle-aes/nettle-aes.c: Likewise.
* src/nettle-sha256/nettle-sha256.c: Likewise.
* src/sglib-combined/combined.c: Likewise.
* src/sglib-combined/sglib.h: Likewise, also replace assert by assert_beebs throughout.
* src/slre/libslre.c: Adjust LOCAL_SCALE_FACTOR.
* src/statemate/libstatemate.c: Likewise.
* src/tarfind/tarfind.c: Likewise.
* src/ud/libud.c: Likewise.
* src/wikisort/libwikisort.c: Likewise.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

* baseline-data/size.json: Updated values for Embench 2.0
* benchmark_size.py: Extend to measure BSS separately, add CSV and MarkDown output formats, generate statistics for relative runs.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>
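One way to measure BSS separately is to classify ELF sections by their header type and flags. The sketch below is an illustration under that assumption and is not taken from benchmark_size.py. The flag and type constants are the standard ELF values, while the category names, function names, and the sample section table are hypothetical.

```python
# Standard ELF section header constants.
SHT_PROGBITS = 1     # section holds program-defined data
SHT_NOBITS = 8       # section occupies no space in the file (e.g. .bss)
SHF_WRITE = 0x1      # writable at run time
SHF_ALLOC = 0x2      # occupies memory at run time
SHF_EXECINSTR = 0x4  # contains executable instructions


def classify(sh_type, sh_flags):
    """Return a size category for a section, or None if it is not loaded."""
    if not sh_flags & SHF_ALLOC:
        return None  # debug info, comments, symbol tables, ...
    if sh_flags & SHF_EXECINSTR:
        return "text"
    if sh_flags & SHF_WRITE:
        return "bss" if sh_type == SHT_NOBITS else "data"
    return "rodata"


def size_by_category(sections):
    """Sum section sizes per category for (name, type, flags, size) tuples."""
    totals = {"text": 0, "rodata": 0, "data": 0, "bss": 0}
    for _name, sh_type, sh_flags, size in sections:
        category = classify(sh_type, sh_flags)
        if category is not None:
            totals[category] += size
    return totals


# Hypothetical section table; a real script would read this from the ELF file.
sections = [
    (".text", SHT_PROGBITS, SHF_ALLOC | SHF_EXECINSTR, 1200),
    (".rodata", SHT_PROGBITS, SHF_ALLOC, 300),
    (".data", SHT_PROGBITS, SHF_ALLOC | SHF_WRITE, 64),
    (".bss", SHT_NOBITS, SHF_ALLOC | SHF_WRITE, 512),
    (".comment", SHT_PROGBITS, 0, 40),
]
totals = size_by_category(sections)
print(totals)
```

Classifying by flags rather than by section name means that sections such as `.text.foo`, produced by `-ffunction-sections`, are counted without a name list.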

* benchmark_speed.py (benchmark_speed): Ensure res is set before use.
* pylib/run_stm32f4-discovery.py: Add dictionary of exported functions.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

We have updated the defaults, to be based on using garbage collection of unused sections.

The baseline data for speed is from a run configured with:

    scons --config-dir=examples/arm/stm32f4-discovery/ \
          cc=arm-none-eabi-gcc \
          cflags='-O2 -mcpu=cortex-m4 -mthumb -mfloat-abi=soft -ffunction-sections -fdata-sections' \
          ldflags='-O2 -Wl,--gc-sections -mcpu=cortex-m4 -mthumb -mfloat-abi=soft -T${CONFIG_DIR}/STM32F407IGHX_FLASH.ld -L${CONFIG_DIR} -static -nostartfiles' \
          user_libs='m startup' gsf=16

with results collected using:

    ./benchmark_speed.py --target-module run_stm32f4-discovery \
          --gdb-command gdb-multiarch --cpu-mhz 16 --gsf 16 --absolute \
          --baseline-output

The baseline for size is from a run configured with:

    scons --config-dir=examples/arm/stm32f4-discovery/ cc=arm-none-eabi-gcc \
          cflags='-Os -ffunction-sections -fdata-sections -mcpu=cortex-m4 -mfloat-abi=soft -mthumb ' \
          ldflags='-Os -Wl,--gc-sections -mcpu=cortex-m4 -mfloat-abi=soft -mthumb -T${CONFIG_DIR}/STM32F407IGHX_FLASH.ld -L${CONFIG_DIR} -static -nostartfiles' \
          user_libs='m startup' gsf=1

with results collected using:

    ./benchmark_size.py --absolute --baseline-output

* baseline-data/size.json: Update data.
* baseline-data/speed.json: Likewise.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

This is a read through to clarify wording, and ensure consistency for Embench 2.0 and its Arm reference board.

* README.md: Updated for Embench 2.0.
* doc/Makefile: Correct spelling of hunspell dictionary
* doc/README.md: Updated for Embench 2.0.
* doc/custom.wordlist: Add new words needed for updated documentation.
* examples/arm/stm32f4-discovery/README.md: Updated for Embench 2.0.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

* examples/riscv32/cv32e40pv2fpga/README.md: Created.
* examples/riscv32/cv32e40pv2fpga/boardsupport.c: Created.
* examples/riscv32/cv32e40pv2fpga/boardsupport.h: Created.
* examples/riscv32/cv32e40pv2fpga/link.ld: Created.
* examples/riscv32/cv32e40pv2fpga/openocd-nexys-hs2.cfg: Created.
* examples/riscv32/cv32e40pv2fpga/unilink.ld: Created.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

I-mikan-I added 30 commits March 4, 2024 12:36

Add dummy/empty benchmark for computing stdlib size overhead

ace099e

Update apple/darwin size configurations for dummy benchmark

d0101be

embench#186 calculate architecture-dependent heap size

8bec9db

WIP (embench#192): Migrate to scons build tooling

e623a06

WIP (embench#192): Update scons build toolchain

771d2f8

Update examples for new config layout

041082e

Remove dummy libs

acecfd6

Remove legacy build script

0174a81

Remove benchmarks

4514138

Update README, baseline

15d0dcd

Update size calculation based on ELF section flags

a3991bd

Update speed interface

dc20d75

Dummy benchmark noinline

ad0dbf3

README update for embench-iot 2.0

8e94f7d

Update licenses

b1f18f0

Update ChangeLog

6903bbe

Update TOC

c89e7c9

Remove unrequired integer to float promotions

3368d58

WIP: Add DepthConv Benchmark

07282ee

Add Depthconv;

5e4ff58

Remove minver; Remove st; Update SPDX Identifiers

Update SPDX for rest of source files

a59e648

nsichneu: remove volatile and fix undefined inputs

88684ea

Make runnin time architecture independent

61c8a10

Remove rest of volatile variables

03d6fea

Revert "Remove rest of volatile variables"

4e8d060

This reverts commit 03d6fea.

Revert "Make runnin time architecture independent"

bd57be1

This reverts commit 61c8a10.

Revert "nsichneu: remove volatile and fix undefined inputs"

b325f0d

This reverts commit 88684ea.

Record new baseline results.

7360c5a

Setup toolchain (STM32CubeF4 + gcc based). Update python scripts. Update example README.

Fix undefined overflow in beebsc (issue on arm)

72efe84

Add (floating-point) xgboost classification benchmark

d1ee0dd

ZSusskind and others added 19 commits May 1, 2024 11:17

Merge pull request embench#194 from ZSusskind/embench-2.0-rc.1

5a6f66c

Update xgboost benchmark to use uint8-quantized weights

Update xgboost measurements

4478808

Update AUTHORS

3c86bdd

Non-Unix: Add file extension support

92f35fa

Remove lto from example build configuration

9726f80

Set up scons with the existing environment.

020f7d3

* sconstruct.py: Set up the environment from the parent process. Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

Remove lief

89203fe

Add dictionary of exported modules and fix bug in speed benchmark.

46c79fe

* benchmark_speed.py (benchmark_speed): Ensure res is set before use.
* pylib/run_stm32f4-discovery.py: Add dictionary of exported functions.

Signed-off-by: Jeremy Bennett <jeremy.bennett@embecosm.com>

Update argument parsing method (extend)

0be80d4

Remove workspace settings folder

b84c264

Remove CPU_MHZ references

6f2ddf3

lesteral wants to merge 49 commits into embench:master from lesteral:embench-2.0-branch
