Deduplicate walltime perf symbols, unwind data and debug info across pids#240
Deduplicate walltime perf symbols, unwind data and debug info across pids#240GuillaumeLagrange wants to merge 3 commits intomainfrom
Conversation
99e0ea4 to
1a32faa
Compare
15316f4 to
9305a03
Compare
a3617af to
7f65690
Compare
not-matthias
left a comment
There was a problem hiding this comment.
I think the struct naming can be clarified a bit, I was struggling quite a lot to understand how they map together. Maybe also using custom types for the key that's used in the hashmaps.
The overall logic looks good! Next round should be quick
...apshots/codspeed_runner__executor__wall_time__perf__unwind_data__tests__cpp_unwind_data.snap
Outdated
Show resolved
Hide resolved
8c864c5 to
692a99b
Compare
This fixes a regression introduced in 8b37208. We now filter pids using `bench_pids`, except for `exec-harness` integrations, where we take all pids.
192d2ce to
14cb24f
Compare
- Store pid-agnostic data in a file or json map under a mapped `path_key` for each elf - For each pid, store pid specific data, mostly the computed load_bias from where each module was loaded into memory at runtime, alongside a key to retrieve the pid-agnostic data This way, we only write to disk relevant parts of the information.
Also add a rebuild trigger to make it easier to run GITHUB_ACTIONS=1 cargo test` locally. We could have a better trigger, but this will do for now.
14cb24f to
9a7f9bc
Compare
not-matthias
left a comment
There was a problem hiding this comment.
LGTM overall. Just a few minor stylistic comments.
| pub mod unwind_data; | ||
|
|
||
| const PERF_METADATA_CURRENT_VERSION: u64 = 1; | ||
| const PERF_METADATA_CURRENT_VERSION: u64 = 3; |
| use std::path::PathBuf; | ||
|
|
||
| #[derive(Default)] | ||
| pub struct MountedModule { |
There was a problem hiding this comment.
Not sure if "mounted" is the right name for this. Wouldn't it be more like MappedModule?
| /// Unwind data extracted from the mapped ELF file | ||
| pub unwind_data: Option<UnwindData>, | ||
| /// Per-process mounting information | ||
| pub process_mounted_module: HashMap<pid_t, ProcessMountedModule>, |
| let load_bias = | ||
| assert_and_get_load_bias(start_addr, end_addr, file_offset, MODULE_PATH, 0x0); |
There was a problem hiding this comment.
Don't have to change it, but we could instead also define a variable load_bias that is passed as the last parameter to assert_load_bias, which we then can also use below in the unwind_data_from_elf
Feels a bit cleaner than having a assert_and_get function
Since the real deduplicator source of truth is actually the path of the elf file on disk, I settled on
it allows for the filename to be determined by the actual path, without having to either nest all the files, or have to sanitize paths to file name and end up in awkward file-name length limitations
Here's what it looks like on a 1000 iterations report of one simple program