[pull] master from ray-project:master #832
Merged
pull[bot] merged 6 commits into garymm:master on Mar 16, 2026
Conversation
Updating lock file for CI py3.10 deps. Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
…61708) The compile_pip_requirements rule used the autodetecting Python toolchain, which resolved to the system Python. Fix by inlining the compile_pip_requirements logic as a py_binary + py_test pair with exec_compatible_with = ["//bazel:py310"], which forces Bazel to select the hermetic Python 3.10 toolchain already registered in WORKSPACE. Topic: fix-requirements-update Signed-off-by: andrew <andrew@anyscale.com>
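A minimal sketch of what such an inlined pair could look like (target names, file names, and args here are illustrative, not the actual ones from the PR):

```starlark
# BUILD sketch (illustrative names). exec_compatible_with constrains the
# exec platform so toolchain resolution picks the hermetic py310 toolchain
# registered in WORKSPACE instead of whatever `python` is on PATH.
py_binary(
    name = "requirements.update",
    srcs = ["pip_compile.py"],
    data = ["requirements.txt"],
    exec_compatible_with = ["//bazel:py310"],
)

py_test(
    name = "requirements_test",
    srcs = ["pip_compile.py"],
    main = "pip_compile.py",
    args = ["--check"],
    data = [
        "requirements.txt",
        "requirements_compiled.txt",
    ],
    exec_compatible_with = ["//bazel:py310"],
)
```

The `.update` binary regenerates the lock file; the test re-runs the same script in check mode so CI fails when the lock file drifts.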
## Summary
- Add client IP:port to Ray Serve HTTP access logs
- Thread the client address from the proxy through the request context and metadata to the replica
- Handle both proxy-routed and direct ingress HTTP paths
For services behind a load balancer, uvicorn's `ProxyHeadersMiddleware`
(enabled by default) resolves `X-Forwarded-For` into `scope["client"]`
automatically, so the logged IP reflects the original client when
`FORWARDED_ALLOW_IPS` is configured.
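A simplified, dependency-free mimic of what that middleware does (the real `ProxyHeadersMiddleware` in uvicorn walks trusted hops more carefully; this sketch only shows the idea):

```python
# Simplified mimic of X-Forwarded-For resolution: if the TCP peer is a
# trusted proxy, replace scope["client"] with the left-most forwarded hop.
def resolve_client(scope, trusted_hosts):
    client = scope.get("client")  # (host, port) of the direct TCP peer
    if client is None or client[0] not in trusted_hosts:
        return client  # peer is not a trusted proxy: keep the socket peer
    headers = dict(scope.get("headers") or [])
    xff = headers.get(b"x-forwarded-for")
    if not xff:
        return client
    # The left-most entry is the original external client.
    host = xff.decode("latin-1").split(",")[0].strip()
    return (host, 0)  # the original port is unknown behind a proxy
```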
## How It Works
The client IP is available at the entry point (proxy or direct ingress
replica) but needs to reach the replica's access log, which runs in a
separate process. The data flows through existing infrastructure:
```
External Client (10.0.91.46:54321)
|
[ Proxy ]
| 1. Reads scope["client"] via proxy_request.client
| 2. format_client_address() formats the raw tuple into "host:port"
| 3. Logs it in the proxy access log
| 4. Passes it into _RequestContext._client
|
[ DeploymentHandle ]
| 5. default_impl.py copies _RequestContext._client → RequestMetadata._client
|
[ Replica ]
| 6. Reads request_metadata._client and logs it in the replica access log
```
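The formatting step (step 2 above) is small; a hedged sketch of what `format_client_address()` amounts to (the actual implementation in Ray Serve may differ):

```python
from typing import Optional, Tuple

# Turn scope["client"]'s (host, port) tuple into the "host:port" string
# that both the proxy and replica access logs render.
def format_client_address(client: Optional[Tuple[str, int]]) -> Optional[str]:
    if client is None:  # e.g. a unix-socket connection has no TCP peer
        return None
    host, port = client
    return f"{host}:{port}"
```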
For **direct ingress HTTP** (replica serves HTTP directly, no proxy),
the replica reads `scope["client"]` itself and formats it with the same
`format_client_address()`.
---
## Update: Feature flag gating
Per review feedback, the client IP logging is now gated behind a feature
flag that is **off by default**:
```
RAY_SERVE_LOG_CLIENT_ADDRESS=1
```
The gate is centralized in `access_log_msg()` in `logging_utils.py` —
when the flag is off, the `client` parameter is ignored and the log
format is unchanged from before this PR. The client address data still
flows through the request context, but is simply not rendered in logs
unless the flag is enabled.
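A sketch of how such a centralized gate can work (the parameter list and message format here are illustrative; the real `access_log_msg()` in `logging_utils.py` may differ):

```python
import os

# When the flag is off, the client parameter is ignored and the message
# matches the pre-PR format; when on, the client address is prepended.
def access_log_msg(method, route, status, latency_ms, client=None):
    base = f"{method} {route} {status} {latency_ms:.1f}ms"
    if os.environ.get("RAY_SERVE_LOG_CLIENT_ADDRESS", "0") != "1":
        return base  # flag off: unchanged log format
    return f"{client} {base}" if client else base
```

Centralizing the check in one function means every call site stays flag-agnostic: callers always pass the client address and the gate decides whether to render it.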
**Tests:** Added a parametrized integration test
(`test_http_access_log_client_address`) that verifies both flag states —
client IP present when on, absent when off.
---------
Signed-off-by: harshit <harshit@anyscale.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Bazel 7 removed the exec_tools attribute from genrule; patch protobuf's BUILD files to use tools instead. Signed-off-by: andrew <andrew@anyscale.com>
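The patch amounts to renaming one attribute per genrule; a sketch with illustrative rule contents:

```starlark
genrule(
    name = "generate_sources",
    srcs = ["input.proto"],
    outs = ["out.bin"],
    cmd = "$(location :protoc) --descriptor_set_out=$@ $<",
    # Bazel 7 removed exec_tools; `tools` is likewise built for the
    # exec configuration on Bazel 6+, so the swap preserves semantics.
    tools = [":protoc"],  # was: exec_tools = [":protoc"]
)
```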
…gs for Bazel 7 (#61695)
- gRPC's grpc_deps() pulls in rules_apple 1.1.3, which uses apple_common.multi_arch_split, removed in Bazel 7. Override to 3.2.1 (compatible with Bazel 6/7/8) before grpc_deps() runs so the maybe() call is a no-op.
- rules_apple 3.2.1 requires apple_support >= 1.11.1.
- Patch is_xcode_at_least_version to return False instead of fail() on CLT-only CI machines where xcode_config.xcode_version() is None.
- Set BAZEL_NO_APPLE_CPP_TOOLCHAIN=1 to skip apple_cc_toolchain on CLT-only machines, where it would fail with "Xcode version must be specified".
- Override -mmacosx-version-min to 12.0 to satisfy std::filesystem and std::variant requirements (the generic toolchain defaults to 10.11).

Signed-off-by: andrew <andrew@anyscale.com>
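The override ordering can be sketched as a WORKSPACE fragment (the sha256 is a placeholder and the URL shape is illustrative):

```starlark
# Defining build_bazel_rules_apple *before* grpc_deps() runs means gRPC's
# maybe(http_archive, name = "build_bazel_rules_apple", ...) is a no-op,
# because maybe() skips repositories that are already defined.
http_archive(
    name = "build_bazel_rules_apple",
    sha256 = "<pinned sha256>",
    url = "https://github.com/bazelbuild/rules_apple/releases/download/3.2.1/rules_apple.3.2.1.tar.gz",
)

load("@com_github_grpc_grpc//bazel:grpc_deps.bzl", "grpc_deps")
grpc_deps()  # sees rules_apple already defined; 3.2.1 wins over 1.1.3
```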
…Killer (#60330)
## Description
In recent investigations of memory usage issues, we found that:
* After a worker becomes IDLE when done with a task, it can still occupy a comparatively large amount of memory (~1GB).
* The current Ray OOM killer only kills worker processes with a task scheduled on them.

Ideally we should investigate the root cause of why IDLE workers still hold large amounts of memory, then fix the memory usage issue and/or update the OOM killer's logic based on the findings. While that is happening, a short-term mitigation is for the OOM killer to prioritize killing IDLE workers that occupy large amounts of memory. This PR implements that mitigation:
1. Add a Ray config `idle_worker_killing_memory_threshold_bytes` indicating the threshold above which the OOM killer should consider killing an IDLE worker. The default is 1GB. The threshold avoids killing freshly created IDLE workers from worker pre-start, since killing those won't help much with memory usage.
2. Update the OOM killer logic to check for and pick an IDLE worker to kill, if possible, before applying the existing memory-killing logic.
3. Update the `ray_memory_manager_worker_eviction_total` metric to include a `MemoryManager.IdleWorkerEviction.Total` type tracking the number of idle-worker terminations.
4. Add the corresponding test cases.
5. Some code cleanup along the way.
## Related issues
N/A
## Additional information
Log line changes.
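The selection logic described above can be sketched in Python (the real policy lives in the raylet's C++ worker-killing code; the data shape and function name here are made up for illustration):

```python
# Default of idle_worker_killing_memory_threshold_bytes per the PR: 1GB.
IDLE_THRESHOLD_BYTES = 1024 ** 3

def pick_worker_to_kill(workers):
    """workers: list of dicts with keys 'idle', 'memory_bytes', 'pid'."""
    # First preference: the largest IDLE worker above the threshold, so
    # freshly pre-started workers holding little memory are spared.
    idle = [w for w in workers
            if w["idle"] and w["memory_bytes"] >= IDLE_THRESHOLD_BYTES]
    if idle:
        return max(idle, key=lambda w: w["memory_bytes"])
    # Otherwise fall back to the pre-existing policy (stubbed here as
    # "largest busy worker"; the actual fallback is more involved).
    busy = [w for w in workers if not w["idle"]]
    return max(busy, key=lambda w: w["memory_bytes"]) if busy else None
```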
When killing an idle worker, we output a log line like the following:
```
[2026-02-27 23:33:10,289 I 3779325 3779325] (raylet) node_manager.cc:3078: Killing 1 worker(s), kill details: Memory on the node (IP: 172.31.14.189, ID: 60e4ea8d3b8fc0f4e99ed19c87bf5f9282797707af6e5babca343c7d) was 31.01GB / 62.01GB (0.500018), which exceeds the memory usage threshold of 0.500000; Object store memory usage: [- objects spillable: 0; - bytes spillable: 0; - objects unsealed: 0; - bytes unsealed: 0; - objects in use: 0; - bytes in use: 0; - objects evictable: 0; - bytes evictable: 0; ; - objects created by worker: 0; - bytes created by worker: 0; - objects restored: 0; - bytes restored: 0; - objects received: 0; - bytes received: 0; - objects errored: 0; - bytes errored: 0; ; Eviction Stats:; (global lru) capacity: 104857600; (global lru) used: 0%; (global lru) num objects: 0; (global lru) num evictions: 0; (global lru) bytes evicted: 0]; Ray killed 1 worker(s) based on the killing policy: [Worker with no lease granted: job ID=01000000, pid=3779512, required resources={CPU: 1}, actual memory used=1.18GB, worker ID=23139757f947661f6be1db7c25ee7b7ce449c21e927bdc2134d9b08e)]; To see more information about memory usage on this node, use `ray logs raylet.out -ip 172.31.14.189`; Top 10 memory users:
PID     MEM(GB) COMMAND
3779511 18.92   ray::allocate_memory
3779512 1.18    ray::IDLE
3779514 1.18    ray::IDLE
3779513 1.17    ray::IDLE
3779519 1.17    ray::IDLE
3752505 0.95    bazel(core-1792) --add-opens=java.base/java.lang=ALL-UNNAMED -Xverify:none -Djava.util.logging.confi...
3753337 0.86    /home/ubuntu/.cursor-server/cli/servers/Stable-7b98dcb824ea96c9c62362a5e80dbf0d1aae4770/server/node ...
3754326 0.82    /home/ubuntu/.cursor-server/cli/servers/Stable-7b98dcb824ea96c9c62362a5e80dbf0d1aae4770/server/node ...
3760346 0.76    /home/ubuntu/.cursor-server/extensions/ms-vscode.cpptools-1.23.6-linux-x64/bin/cpptools-srv 3753753 ...
3753753 0.65    /home/ubuntu/.cursor-server/extensions/ms-vscode.cpptools-1.23.6-linux-x64/bin/cpptools
suggestions: Refer to the documentation on how to address the out of memory issue: https://docs.ray.io/en/latest/ray-core/scheduling/ray-oom-prevention.html. Consider provisioning more memory on this node or reducing task parallelism by requesting more CPUs per task. To adjust the kill threshold, set the environment variable `RAY_memory_usage_threshold` when starting Ray. To disable worker killing, set the environment variable `RAY_memory_monitor_refresh_ms` to zero.
```
Followup action items:
* Investigate and fix why workers still occupy large amounts of memory after becoming IDLE.
* Based on that investigation, improve the memory killer with better heuristics for choosing which worker processes to kill.
---------
Signed-off-by: myan <myan@anyscale.com>
Signed-off-by: Mengjin Yan <mengjinyan3@gmail.com>
Co-authored-by: Ibrahim Rabbani <israbbani@gmail.com>
Co-authored-by: Kunchen (David) Dai <54918178+Kunchd@users.noreply.github.com>
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)