Skip to content

Conversation

@JimB123
Copy link
Owner

@JimB123 JimB123 commented Nov 19, 2024

Here's how I would recommend to change zmalloc_get_allocator_info.

Notes:

  • For simple cached values like this, I prefer using statics scoped to the function. IMO this is easier to understand and maintain than using globals.
  • I modified jeQueryKeyInit and jeQueryCtlInterface to simply assert on failure rather than passing around return codes. These shouldn't fail.
  • I modified jeRefreshStats to locally cache its query key and simplify the interface.
  • I modified jeQueryCtlInterface to allow calls with a null value pointer. Its a common pattern to allow optional values when querying multiple things. This will transparently ignore unsupplied optional values.

JimB123 pushed a commit that referenced this pull request Feb 17, 2025
Fix new unittest networking use-after-free error


```
==96611==ERROR: AddressSanitizer: heap-use-after-free on address 0x503000075e00 at pc 0x55e52cbe1495 bp 0x7ffd9e1fc690 sp 0x7ffd9e1fc688
READ of size 8 at 0x503000075e00 thread T0
    #0 0x55e52cbe[149](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:150)4 in freeReplicaReferencedReplBuffer /home/runner/work/valkey/valkey/src/replication.c:401:27
    #1 0x55e52cbe7abf in freeClientReplicationData /home/runner/work/valkey/valkey/src/replication.c:1261:5
    #2 0x55e52cb17a44 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:188:5
    #3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    #4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    #5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

0x503000075e00 is located 16 bytes inside of 24-byte region [0x503000075df0,0x503000075e08)
freed by thread T0 here:
    #0 0x55e52ca50a7a in free (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212a7a) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    #1 0x55e52cb905ba in listEmpty /home/runner/work/valkey/valkey/src/adlist.c:64:9
    #2 0x55e52cb179e5 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:179:9
    #3 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    #4 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    #5 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #6 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #7 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)

previously allocated by thread T0 here:
    #0 0x55e52ca50d13 in malloc (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x212d13) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
    #1 0x55e52cbb844f in ztrymalloc_usable_internal /home/runner/work/valkey/valkey/src/zmalloc.c:[155](https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457#step:10:156):17
    #2 0x55e52cbb844f in valkey_malloc /home/runner/work/valkey/valkey/src/zmalloc.c:184:17
    #3 0x55e52cb90be6 in listAddNodeTail /home/runner/work/valkey/valkey/src/adlist.c:126:17
    #4 0x55e52cb17873 in test_writeToReplica /home/runner/work/valkey/valkey/src/unit/test_networking.c:167:9
    #5 0x55e52cac976b in runTestSuite /home/runner/work/valkey/valkey/src/unit/test_main.c:26:28
    #6 0x55e52cac9bae in main /home/runner/work/valkey/valkey/src/unit/test_main.c:61:14
    #7 0x7fded4c2a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #8 0x7fded4c2a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
    #9 0x55e52c9b5ec4 in _start (/home/runner/work/valkey/valkey/src/valkey-unit-tests+0x177ec4) (BuildId: 587aaf0e86abaf104cbb714f290b1436f8ddf614)
```


https://github.com/valkey-io/valkey/actions/runs/13230922385/job/36927929457

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
JimB123 pushed a commit that referenced this pull request Jul 16, 2025
When calling the command `EVAL error{} 0`, Valkey crashes with the
following stack trace. This patch ensures we never leave the
`err_info.msg` field null when we fail to extract a proper error
message.

```
=== VALKEY BUG REPORT START: Cut & paste starting from here ===
2595901:M 18 Jun 2025 01:20:12.917 # valkey 8.1.2 crashed by signal: 11, si_code: 1
2595901:M 18 Jun 2025 01:20:12.917 # Accessing address: (nil)
2595901:M 18 Jun 2025 01:20:12.917 # Crashed running the instruction at: 0x726f8e57ed1d

------ STACK TRACE ------
EIP:
/usr/lib/libc.so.6(+0x16ed1d) [0x726f8e57ed1d]

2595905 bio_aof
/usr/lib/libc.so.6(+0x9de22) [0x726f8e4ade22]
/usr/lib/libc.so.6(+0x91fda) [0x726f8e4a1fda]
/usr/lib/libc.so.6(+0x9264c) [0x726f8e4a264c]
/usr/lib/libc.so.6(pthread_cond_wait+0x14e) [0x726f8e4a4d1e]
valkey-server *:6379(bioProcessBackgroundJobs+0x1b4) [0x6530abb46db4]
/usr/lib/libc.so.6(+0x957eb) [0x726f8e4a57eb]
/usr/lib/libc.so.6(+0x11918c) [0x726f8e52918c]

2595904 bio_close_file
/usr/lib/libc.so.6(+0x9de22) [0x726f8e4ade22]
/usr/lib/libc.so.6(+0x91fda) [0x726f8e4a1fda]
/usr/lib/libc.so.6(+0x9264c) [0x726f8e4a264c]
/usr/lib/libc.so.6(pthread_cond_wait+0x14e) [0x726f8e4a4d1e]
valkey-server *:6379(bioProcessBackgroundJobs+0x1b4) [0x6530abb46db4]
/usr/lib/libc.so.6(+0x957eb) [0x726f8e4a57eb]
/usr/lib/libc.so.6(+0x11918c) [0x726f8e52918c]

2595901 valkey-server *
/usr/lib/libc.so.6(+0x3def0) [0x726f8e44def0]
/usr/lib/libc.so.6(+0x16ed1d) [0x726f8e57ed1d]
valkey-server *:6379(sdscatfmt+0x894) [0x6530abaa24a4]
valkey-server *:6379(luaCallFunction+0x39a) [0x6530abbc66ea]
valkey-server *:6379(+0x1a0992) [0x6530abbc6992]
valkey-server *:6379(scriptingEngineCallFunction+0x98) [0x6530abbc1298]
valkey-server *:6379(+0x11ff55) [0x6530abb45f55]
valkey-server *:6379(call+0x174) [0x6530aba94454]
valkey-server *:6379(processCommand+0x93d) [0x6530aba958dd]
valkey-server *:6379(processCommandAndResetClient+0x21) [0x6530abaa9d11]
valkey-server *:6379(processInputBuffer+0xe3) [0x6530abaaee83]
valkey-server *:6379(readQueryFromClient+0x65) [0x6530abaaef55]
valkey-server *:6379(+0x18e31a) [0x6530abbb431a]
valkey-server *:6379(aeProcessEvents+0x24a) [0x6530aba790ca]
valkey-server *:6379(aeMain+0x2d) [0x6530aba7938d]
valkey-server *:6379(main+0x3f6) [0x6530aba6e7b6]
/usr/lib/libc.so.6(+0x276b5) [0x726f8e4376b5]
/usr/lib/libc.so.6(__libc_start_main+0x89) [0x726f8e437769]
valkey-server *:6379(_start+0x25) [0x6530aba70235]

2595906 bio_lazy_free
/usr/lib/libc.so.6(+0x9de22) [0x726f8e4ade22]
/usr/lib/libc.so.6(+0x91fda) [0x726f8e4a1fda]
/usr/lib/libc.so.6(+0x9264c) [0x726f8e4a264c]
/usr/lib/libc.so.6(pthread_cond_wait+0x14e) [0x726f8e4a4d1e]
valkey-server *:6379(bioProcessBackgroundJobs+0x1b4) [0x6530abb46db4]
/usr/lib/libc.so.6(+0x957eb) [0x726f8e4a57eb]
/usr/lib/libc.so.6(+0x11918c) [0x726f8e52918c]

4/4 expected stacktraces.

------ STACK TRACE DONE ------

------ REGISTERS ------
2595901:M 18 Jun 2025 01:20:12.920 # 
RAX:0000000000000000 RBX:0000726f8dd35663
RCX:0000000000000000 RDX:0000000000000000
RDI:0000000000000000 RSI:0000000000000010
RBP:00007ffc2b821a80 RSP:00007ffc2b821938
R8 :000000000000000c R9 :00006530abc111b8
R10:0000000000000001 R11:0000000000000003
R12:00006530abc49adc R13:00006530abc111b7
R14:0000000000000001 R15:0000000000000001
RIP:0000726f8e57ed1d EFL:0000000000010283
CSGSFS:002b000000000033
2595901:M 18 Jun 2025 01:20:12.921 * hide-user-data-from-log is on, skip logging stack content to avoid spilling user data.

------ INFO OUTPUT ------
# Server
redis_version:7.2.4
server_name:valkey
valkey_version:8.1.2
valkey_release_stage:ga
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:38d65aa7b4148d2c
server_mode:standalone
os:Linux 6.14.6-arch1-1 x86_64
arch_bits:64
monotonic_clock:POSIX clock_gettime
multiplexing_api:epoll
gcc_version:15.1.1
process_id:2595901
process_supervised:no
run_id:a0b75f67a217a81142f17553028c010e86c1ee80
tcp_port:6379
server_time_usec:1750209612917634
uptime_in_seconds:16
uptime_in_days:0
hz:10
configured_hz:10
clients_hz:10
lru_clock:5379148
executable:/home/fusl/valkey-server
config_file:
io_threads_active:0
availability_zone:
listener0:name=tcp,bind=*,bind=-::*,port=6379

# Clients
connected_clients:1
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:0
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
pubsub_clients:0
watching_clients:0
clients_in_timeout_table:0
total_watched_keys:0
total_blocking_keys:0
total_blocking_keys_on_nokey:0
paused_reason:none
paused_actions:none
paused_timeout_milliseconds:0

# Memory
used_memory:911824
used_memory_human:890.45K
used_memory_rss:15323136
used_memory_rss_human:14.61M
used_memory_peak:911824
used_memory_peak_human:890.45K
used_memory_peak_perc:100.29%
used_memory_overhead:892232
used_memory_startup:891824
used_memory_dataset:19592
used_memory_dataset_perc:97.96%
allocator_allocated:1845952
allocator_active:1986560
allocator_resident:6672384
allocator_muzzy:0
total_system_memory:67323842560
total_system_memory_human:62.70G
used_memory_lua:34816
used_memory_vm_eval:34816
used_memory_lua_human:34.00K
used_memory_scripts_eval:184
number_of_cached_scripts:1
number_of_functions:0
number_of_libraries:0
used_memory_vm_functions:33792
used_memory_vm_total:68608
used_memory_vm_total_human:67.00K
used_memory_functions:224
used_memory_scripts:408
used_memory_scripts_human:408B
maxmemory:0
maxmemory_human:0B
maxmemory_policy:noeviction
allocator_frag_ratio:1.00
allocator_frag_bytes:0
allocator_rss_ratio:3.36
allocator_rss_bytes:4685824
rss_overhead_ratio:2.30
rss_overhead_bytes:8650752
mem_fragmentation_ratio:17.18
mem_fragmentation_bytes:14431168
mem_not_counted_for_evict:0
mem_replication_backlog:0
mem_total_replication_buffers:0
mem_clients_slaves:0
mem_clients_normal:0
mem_cluster_links:0
mem_aof_buffer:0
mem_allocator:jemalloc-5.3.0
mem_overhead_db_hashtable_rehashing:0
active_defrag_running:0
lazyfree_pending_objects:0
lazyfreed_objects:0

# Persistence
loading:0
async_loading:0
current_cow_peak:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1750209596
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:-1
rdb_current_bgsave_time_sec:-1
rdb_saves:0
rdb_last_cow_size:0
rdb_last_load_keys_expired:0
rdb_last_load_keys_loaded:0
aof_enabled:0
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_rewrites:0
aof_rewrites_consecutive_failures:0
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0

# Stats
total_connections_received:1
total_commands_processed:0
instantaneous_ops_per_sec:0
total_net_input_bytes:34
total_net_output_bytes:0
total_net_repl_input_bytes:0
total_net_repl_output_bytes:0
instantaneous_input_kbps:0.00
instantaneous_output_kbps:0.00
instantaneous_input_repl_kbps:0.00
instantaneous_output_repl_kbps:0.00
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
expired_stale_perc:0.00
expired_time_cap_reached_count:0
expire_cycle_cpu_milliseconds:0
evicted_keys:0
evicted_clients:0
evicted_scripts:0
total_eviction_exceeded_time:0
current_eviction_exceeded_time:0
keyspace_hits:0
keyspace_misses:0
pubsub_channels:0
pubsub_patterns:0
pubsubshard_channels:0
latest_fork_usec:0
total_forks:0
migrate_cached_sockets:0
slave_expires_tracked_keys:0
active_defrag_hits:0
active_defrag_misses:0
active_defrag_key_hits:0
active_defrag_key_misses:0
total_active_defrag_time:0
current_active_defrag_time:0
tracking_total_keys:0
tracking_total_items:0
tracking_total_prefixes:0
unexpected_error_replies:0
total_error_replies:0
dump_payload_sanitizations:0
total_reads_processed:1
total_writes_processed:0
io_threaded_reads_processed:0
io_threaded_writes_processed:0
io_threaded_freed_objects:0
io_threaded_accept_processed:0
io_threaded_poll_processed:0
io_threaded_total_prefetch_batches:0
io_threaded_total_prefetch_entries:0
client_query_buffer_limit_disconnections:0
client_output_buffer_limit_disconnections:0
reply_buffer_shrinks:0
reply_buffer_expands:0
eventloop_cycles:170
eventloop_duration_sum:17739
eventloop_duration_cmd_sum:0
instantaneous_eventloop_cycles_per_sec:9
instantaneous_eventloop_duration_usec:99
acl_access_denied_auth:0
acl_access_denied_cmd:0
acl_access_denied_key:0
acl_access_denied_channel:0

# Replication
role:master
connected_slaves:0
replicas_waiting_psync:0
master_failover_state:no-failover
master_replid:d35a0bb7979f490a60174bb363524431d7eb2428
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:10485760
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:0.012543
used_cpu_user:0.016853
used_cpu_sys_children:0.000000
used_cpu_user_children:0.000000
used_cpu_sys_main_thread:0.012440
used_cpu_user_main_thread:0.016714

# Modules

# Commandstats

# Errorstats

# Latencystats

# Cluster
cluster_enabled:0

# Keyspace

------ CLIENT LIST OUTPUT ------
id=2 addr=127.0.0.1:41372 laddr=127.0.0.1:6379 fd=10 name=*redacted* age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=12 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=17060 events=r cmd=eval user=*redacted* redir=-1 resp=2 lib-name= lib-ver= tot-net-in=34 tot-net-out=0 tot-cmds=0

------ CURRENT CLIENT INFO ------
id=2 addr=127.0.0.1:41372 laddr=127.0.0.1:6379 fd=10 name=*redacted* age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=12 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=17060 events=r cmd=eval user=*redacted* redir=-1 resp=2 lib-name= lib-ver= tot-net-in=34 tot-net-out=0 tot-cmds=0
argc: 3
argv[0]: "eval"
argv[1]: 7 bytes
argv[2]: 1 bytes

------ EXECUTING CLIENT INFO ------
id=2 addr=127.0.0.1:41372 laddr=127.0.0.1:6379 fd=10 name=*redacted* age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=12 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=17060 events=r cmd=eval user=*redacted* redir=-1 resp=2 lib-name= lib-ver= tot-net-in=34 tot-net-out=0 tot-cmds=0
argc: 3
argv[0]: "eval"
argv[1]: 7 bytes
argv[2]: 1 bytes

------ MODULES INFO OUTPUT ------

------ CONFIG DEBUG OUTPUT ------
repl-diskless-load disabled
debug-context ""
sanitize-dump-payload no
lazyfree-lazy-user-del yes
lazyfree-lazy-server-del yes
import-mode no
lazyfree-lazy-user-flush yes
list-compress-depth 0
dual-channel-replication-enabled no
repl-diskless-sync yes
activedefrag no
lazyfree-lazy-expire yes
io-threads 1
replica-read-only yes
client-query-buffer-limit 1gb
slave-read-only yes
lazyfree-lazy-eviction yes
proto-max-bulk-len 512mb

------ FAST MEMORY TEST ------
2595901:M 18 Jun 2025 01:20:12.921 # Bio worker thread #0 terminated
2595901:M 18 Jun 2025 01:20:12.921 # Bio worker thread #1 terminated
2595901:M 18 Jun 2025 01:20:12.921 # Bio worker thread #2 terminated
*** Preparing to test memory region 6530abce2000 (212992 bytes)
*** Preparing to test memory region 726f8af7f000 (2621440 bytes)
*** Preparing to test memory region 726f8b200000 (8388608 bytes)
*** Preparing to test memory region 726f8ba00000 (4194304 bytes)
*** Preparing to test memory region 726f8bffe000 (8388608 bytes)
*** Preparing to test memory region 726f8c7ff000 (8388608 bytes)
*** Preparing to test memory region 726f8d000000 (8388608 bytes)
*** Preparing to test memory region 726f8dc00000 (4194304 bytes)
*** Preparing to test memory region 726f8e290000 (16384 bytes)
*** Preparing to test memory region 726f8e3d2000 (20480 bytes)
*** Preparing to test memory region 726f8e5f8000 (32768 bytes)
*** Preparing to test memory region 726f8eb58000 (12288 bytes)
*** Preparing to test memory region 726f8eb5c000 (16384 bytes)
*** Preparing to test memory region 726f8ed63000 (4096 bytes)
*** Preparing to test memory region 726f8eef2000 (397312 bytes)
*** Preparing to test memory region 726f8efc7000 (4096 bytes)
.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.O
Fast memory test PASSED, however your memory can still be broken. Please run a memory test for several hours if possible.

------ DUMPING CODE AROUND EIP ------
Symbol: (null) (base: (nil))
Module: /usr/lib/libc.so.6 (base 0x726f8e410000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=(nil) -D -b binary -m i386:x86-64 /tmp/dump.bin
------

=== VALKEY BUG REPORT END. Make sure to include from START to END. ===
```

---------

Signed-off-by: Fusl <fusl@meo.ws>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
JimB123 pushed a commit that referenced this pull request Jul 16, 2025
…y-io#2257)

**Current state**
During `hashtableScanDefrag`, rehashing is paused to prevent entries
from moving, but the scan callback can still delete entries which
triggers `hashtableShrinkIfNeeded`. For example, the
`expireScanCallback` can delete expired entries.

**Issue**
This can cause the table to be resized and the old memory to be freed
while the scan is still accessing it, resulting in the following memory
access violation:

```
[err]: Sanitizer error: =================================================================
==46774==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000003100 at pc 0x0000004704d3 bp 0x7fffcb062000 sp 0x7fffcb061ff0
READ of size 1 at 0x611000003100 thread T0
    #0 0x4704d2 in isPositionFilled /home/gusakovy/Projects/valkey/src/hashtable.c:422
    #1 0x478b45 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1768
    #2 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    #3 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    #4 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    #5 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    #6 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    #7 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    #8 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    #9 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    #10 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    #11 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)
    #12 0x452e39 in _start (/local/home/gusakovy/Projects/valkey/src/valkey-server+0x452e39)

0x611000003100 is located 0 bytes inside of 256-byte region [0x611000003100,0x611000003200)
freed by thread T0 here:
    #0 0x7f471a34a1e5 in __interceptor_free (/lib64/libasan.so.4+0xd81e5)
    #1 0x4aefbc in zfree_internal /home/gusakovy/Projects/valkey/src/zmalloc.c:400
    #2 0x4aeff5 in valkey_free /home/gusakovy/Projects/valkey/src/zmalloc.c:415
    #3 0x4707d2 in rehashingCompleted /home/gusakovy/Projects/valkey/src/hashtable.c:456
    #4 0x471b5b in resize /home/gusakovy/Projects/valkey/src/hashtable.c:656
    #5 0x475bff in hashtableShrinkIfNeeded /home/gusakovy/Projects/valkey/src/hashtable.c:1272
    #6 0x47704b in hashtablePop /home/gusakovy/Projects/valkey/src/hashtable.c:1448
    #7 0x47716f in hashtableDelete /home/gusakovy/Projects/valkey/src/hashtable.c:1459
    #8 0x480038 in kvstoreHashtableDelete /home/gusakovy/Projects/valkey/src/kvstore.c:847
    #9 0x50c12c in dbGenericDeleteWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:490
    #10 0x515f28 in deleteExpiredKeyAndPropagateWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:1831
    #11 0x516103 in deleteExpiredKeyAndPropagate /home/gusakovy/Projects/valkey/src/db.c:1844
    #12 0x6d8642 in activeExpireCycleTryExpire /home/gusakovy/Projects/valkey/src/expire.c:70
    valkey-io#13 0x6d8706 in expireScanCallback /home/gusakovy/Projects/valkey/src/expire.c:139
    valkey-io#14 0x478bd8 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1770
    valkey-io#15 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    valkey-io#16 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    valkey-io#17 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    valkey-io#18 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    valkey-io#19 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    valkey-io#20 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    valkey-io#21 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    valkey-io#22 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    valkey-io#23 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    valkey-io#24 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)

previously allocated by thread T0 here:
    #0 0x7f471a34a753 in __interceptor_calloc (/lib64/libasan.so.4+0xd8753)
    #1 0x4ae48c in ztrycalloc_usable_internal /home/gusakovy/Projects/valkey/src/zmalloc.c:214
    #2 0x4ae757 in valkey_calloc /home/gusakovy/Projects/valkey/src/zmalloc.c:257
    #3 0x4718fc in resize /home/gusakovy/Projects/valkey/src/hashtable.c:645
    #4 0x475bff in hashtableShrinkIfNeeded /home/gusakovy/Projects/valkey/src/hashtable.c:1272
    #5 0x47704b in hashtablePop /home/gusakovy/Projects/valkey/src/hashtable.c:1448
    #6 0x47716f in hashtableDelete /home/gusakovy/Projects/valkey/src/hashtable.c:1459
    #7 0x480038 in kvstoreHashtableDelete /home/gusakovy/Projects/valkey/src/kvstore.c:847
    #8 0x50c12c in dbGenericDeleteWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:490
    #9 0x515f28 in deleteExpiredKeyAndPropagateWithDictIndex /home/gusakovy/Projects/valkey/src/db.c:1831
    #10 0x516103 in deleteExpiredKeyAndPropagate /home/gusakovy/Projects/valkey/src/db.c:1844
    #11 0x6d8642 in activeExpireCycleTryExpire /home/gusakovy/Projects/valkey/src/expire.c:70
    #12 0x6d8706 in expireScanCallback /home/gusakovy/Projects/valkey/src/expire.c:139
    valkey-io#13 0x478bd8 in hashtableScanDefrag /home/gusakovy/Projects/valkey/src/hashtable.c:1770
    valkey-io#14 0x4789c2 in hashtableScan /home/gusakovy/Projects/valkey/src/hashtable.c:1729
    valkey-io#15 0x47e3ca in kvstoreScan /home/gusakovy/Projects/valkey/src/kvstore.c:402
    valkey-io#16 0x6d9040 in activeExpireCycle /home/gusakovy/Projects/valkey/src/expire.c:297
    valkey-io#17 0x4859d2 in databasesCron /home/gusakovy/Projects/valkey/src/server.c:1269
    valkey-io#18 0x486e92 in serverCron /home/gusakovy/Projects/valkey/src/server.c:1577
    valkey-io#19 0x4637dd in processTimeEvents /home/gusakovy/Projects/valkey/src/ae.c:370
    valkey-io#20 0x4643e3 in aeProcessEvents /home/gusakovy/Projects/valkey/src/ae.c:513
    valkey-io#21 0x4647ea in aeMain /home/gusakovy/Projects/valkey/src/ae.c:543
    valkey-io#22 0x4a61fc in main /home/gusakovy/Projects/valkey/src/server.c:7291
    valkey-io#23 0x7f471957c139 in __libc_start_main (/lib64/libc.so.6+0x21139)

SUMMARY: AddressSanitizer: heap-use-after-free /home/gusakovy/Projects/valkey/src/hashtable.c:422 in isPositionFilled
Shadow bytes around the buggy address:
  0x0c227fff85d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff85e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff85f0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c227fff8600: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8610: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
=>0x0c227fff8620:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8630: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c227fff8640: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8650: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8660: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c227fff8670: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==46774==ABORTING
```


**Solution**
Suggested solution is to also pause auto shrinking during
`hashtableScanDefrag`. I noticed that there was already a
`hashtablePauseAutoShrink` method and `pause_auto_shrink` counter, but
it wasn't actually used in `hashtableShrinkIfNeeded` so I fixed that.

**Testing**
I created a simple tcl test that (most of the times) triggers this
error, but it's a little clunky so I didn't add it as part of the PR:

```
start_server {tags {"expire hashtable defrag"}} {
    test {hashtable scan defrag on expiry} {

        r config set hz 100

        set num_keys 20
        for {set i 0} {$i < $num_keys} {incr i} {
            r set "key_$i" "value_$i"
        }

        for {set j 0} {$j < 50} {incr j} {
            set expire_keys 100
            for {set i 0} {$i < $expire_keys} {incr i} {
                # Short expiry time to ensure they expire quickly
                r psetex "expire_key_${i}_${j}" 100 "expire_value_${i}_${j}"
            }

            # Verify keys are set
            set initial_size [r dbsize]
            assert_equal $initial_size [expr $num_keys + $expire_keys]
            
            after 150
            for {set i 0} {$i < 10} {incr i} {
                r get "expire_key_${i}_${j}"
                after 10
            }
        }

        set remaining_keys [r dbsize]
        assert_equal $remaining_keys $num_keys

        # Verify server is still responsive
        assert_equal [r ping] {PONG}
    } {}
}
```
Compiling with ASAN using `make noopt SANITIZER=address valkey-server`
and running the test causes error above. Applying the fix resolves the
issue.

Signed-off-by: Yakov Gusakov <yaakov0015@gmail.com>
JimB123 pushed a commit that referenced this pull request Oct 14, 2025
With valkey-io#1401, we introduced additional filters to CLIENT LIST/KILL
subcommand. The intended behavior was to pick the last value of the
filter. However, we introduced memory leak for all the preceding
filters.

Before this change:
```
> CLIENT LIST IP 127.0.0.1 IP 127.0.0.1
id=4 addr=127.0.0.1:37866 laddr=127.0.0.1:6379 fd=10 name= age=0 idle=0 flags=N capa= db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=21 multi-mem=0 rbs=16384 rbp=16384 obl=0 oll=0 omem=0 tot-mem=16989 events=r cmd=client|list user=default redir=-1 resp=2 lib-name= lib-ver= tot-net-in=49 tot-net-out=0 tot-cmds=0
```
Leak:
```
Direct leak of 11 byte(s) in 1 object(s) allocated from:
    #0 0x7f2901aa557d in malloc (/lib64/libasan.so.4+0xd857d)
    #1 0x76db76 in ztrymalloc_usable_internal /workplace/harkrisp/valkey/src/zmalloc.c:156
    #2 0x76db76 in zmalloc_usable /workplace/harkrisp/valkey/src/zmalloc.c:200
    #3 0x4c4121 in _sdsnewlen.constprop.230 /workplace/harkrisp/valkey/src/sds.c:113
    #4 0x4dc456 in parseClientFiltersOrReply.constprop.63 /workplace/harkrisp/valkey/src/networking.c:4264
    #5 0x4bb9f7 in clientListCommand /workplace/harkrisp/valkey/src/networking.c:4600
    #6 0x641159 in call /workplace/harkrisp/valkey/src/server.c:3772
    #7 0x6431a6 in processCommand /workplace/harkrisp/valkey/src/server.c:4434
    #8 0x4bfa9b in processCommandAndResetClient /workplace/harkrisp/valkey/src/networking.c:3571
    #9 0x4bfa9b in processInputBuffer /workplace/harkrisp/valkey/src/networking.c:3702
    #10 0x4bffa3 in readQueryFromClient /workplace/harkrisp/valkey/src/networking.c:3812
    #11 0x481015 in callHandler /workplace/harkrisp/valkey/src/connhelpers.h:79
    #12 0x481015 in connSocketEventHandler.lto_priv.394 /workplace/harkrisp/valkey/src/socket.c:301
    valkey-io#13 0x7d3fb3 in aeProcessEvents /workplace/harkrisp/valkey/src/ae.c:486
    valkey-io#14 0x7d4d44 in aeMain /workplace/harkrisp/valkey/src/ae.c:543
    valkey-io#15 0x453925 in main /workplace/harkrisp/valkey/src/server.c:7319
    valkey-io#16 0x7f2900cd7139 in __libc_start_main (/lib64/libc.so.6+0x21139)
```

Note: For filter ID / NOT-ID we group all the option and perform
filtering whereas for remaining filters we only pick the last filter
option.

---------

Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant