[Deepin-Kernel-SIG] [linux 6.18-y] [Upstream] Update kernel base to 6.18.5 #1442
base: linux-6.18.y
Conversation
[ Upstream commit 86730ac255b0497a272704de9a1df559f5d6602e ]

After the blamed commit below, if the MPC subflow is already in TCP_CLOSE status or has fallen back to TCP at mptcp_disconnect() time, mptcp_do_fastclose() skips setting the `send_fastclose` flag and the later __mptcp_close_ssk() no longer resets the related subflow context.

Any later connection will be created with both the `request_mptcp` flag and the msk-level fallback status off (the latter is unconditionally cleared at MPTCP disconnect time), leading to a warning in subflow_data_ready():

```
WARNING: CPU: 26 PID: 8996 at net/mptcp/subflow.c:1519 subflow_data_ready (net/mptcp/subflow.c:1519 (discriminator 13))
Modules linked in:
CPU: 26 UID: 0 PID: 8996 Comm: syz.22.39 Not tainted 6.18.0-rc7-05427-g11fc074f6c36 #1 PREEMPT(voluntary)
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
RIP: 0010:subflow_data_ready (net/mptcp/subflow.c:1519 (discriminator 13))
Code: 90 0f 0b 90 90 e9 04 fe ff ff e8 b7 1e f5 fe 89 ee bf 07 00 00 00 e8 db 19 f5 fe 83 fd 07 0f 84 35 ff ff ff e8 9d 1e f5 fe 90 <0f> 0b 90 e9 27 ff ff ff e8 8f 1e f5 fe 4c 89 e7 48 89 de e8 14 09
RSP: 0018:ffffc9002646fb30 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88813b218000 RCX: ffffffff825c8435
RDX: ffff8881300b3580 RSI: ffffffff825c8443 RDI: 0000000000000005
RBP: 000000000000000b R08: ffffffff825c8435 R09: 000000000000000b
R10: 0000000000000005 R11: 0000000000000007 R12: ffff888131ac0000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f88330af6c0(0000) GS:ffff888a93dd2000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f88330aefe8 CR3: 000000010ff59000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 tcp_data_ready (net/ipv4/tcp_input.c:5356)
 tcp_data_queue (net/ipv4/tcp_input.c:5445)
 tcp_rcv_state_process (net/ipv4/tcp_input.c:7165)
 tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1955)
 __release_sock (include/net/sock.h:1158 (discriminator 6) net/core/sock.c:3180 (discriminator 6))
 release_sock (net/core/sock.c:3737)
 mptcp_sendmsg (net/mptcp/protocol.c:1763 net/mptcp/protocol.c:1857)
 inet_sendmsg (net/ipv4/af_inet.c:853 (discriminator 7))
 __sys_sendto (net/socket.c:727 (discriminator 15) net/socket.c:742 (discriminator 15) net/socket.c:2244 (discriminator 15))
 __x64_sys_sendto (net/socket.c:2247)
 do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f883326702d
```

Address the issue by setting an explicit `fastclosing` flag at fastclose time, and checking that flag after mptcp_do_fastclose().

Fixes: ae15506 ("mptcp: fix duplicate reset on fastclose")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251212-net-mptcp-subflow_data_ready-warn-v1-2-d1f9fd1c36c8@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
[ Adjust context ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f1a77dfc3b045c3dd5f6e64189b9f52b90399f07)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
commit e78e70dbf603c1425f15f32b455ca148c932f6c1 upstream.

Pull out the !sd check to simplify code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Chris Mason <clm@meta.com>
Link: https://patch.msgid.link/20251107161739.525916173@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c7ca7e0ff6f0f55ef57c1596286076492f199f9a)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
commit 08d473dd8718e4a4d698b1113a14a40ad64a909b upstream.

Simplify code by adding a few variables.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Chris Mason <clm@meta.com>
Link: https://patch.msgid.link/20251107161739.655208666@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d4ffb9ce8e6501bebbf833cd7ae3f34eab1f76ba)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
commit 33cf66d88306663d16e4759e9d24766b0aaa2e17 upstream.

Add a randomized algorithm that runs newidle balancing proportional to its success rate. This improves schbench significantly:

  6.18-rc4:                2.22 Mrps/s
  6.18-rc4+revert:         2.04 Mrps/s
  6.18-rc4+revert+random:  2.18 Mrps/s

Conversely, per Adam Li this affects SpecJBB slightly, reducing it by 1%:

  6.17:                -6%
  6.17+revert:          0%
  6.17+revert+random:  -1%

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Chris Mason <clm@meta.com>
Link: https://lkml.kernel.org/r/6825c50d-7fa7-45d8-9b81-c6e7e25738e2@meta.com
Link: https://patch.msgid.link/20251107161739.770122091@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 98a26893fad4180d8ea210d8749392790dfddc81)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
commit 3af870aedbff10bfed220e280b57a405e972229f upstream.

Commit f2060bd ("nfs/localio: add refcounting for each iocb IO associated with NFS pgio header") inadvertently reintroduced the same potential for __put_cred() triggering BUG_ON(cred == current->cred) that commit 992203a ("nfs/localio: restore creds before releasing pageio data") fixed.

Fix this by saving and restoring the cred around each {read,write}_iter call within the respective for loop of nfs_local_call_{read,write} using scoped_with_creds().

NOTE: this fix started by first reverting the following commits:

  94afb627dfc2 ("nfs: use credential guards in nfs_local_call_read()")
  bff3c841f7bd ("nfs: use credential guards in nfs_local_call_write()")
  1d18101a644e ("Merge tag 'kernel-6.19-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs")

followed by narrowly fixing the cred lifetime issue by using scoped_with_creds(). In doing so, this commit's changes appear more extensive than they really are (as evidenced by comparing to v6.18's fs/nfs/localio.c).

Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Acked-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/linux-next/20251205111942.4150b06f@canb.auug.org.au/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7a28d65e4beb7627738b75c6a23f36ae54470f93)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Link: https://lore.kernel.org/r/20260109111950.344681501@linuxfoundation.org
Tested-by: Ronald Warsow <rwarsow@gmx.de>
Tested-by: Slade Watkins <sr@sladewatkins.com>
Tested-by: Achill Gilgenast <achill@achill.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Brett A C Sheffield <bacs@librecast.net>
Tested-by: Brett Mastbergen <bmastbergen@ciq.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Jeffrin Jose T <jeffrin@rajagiritech.edu.in>
Tested-by: Mark Brown <broonie@kernel.org>
Tested-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit dc554c8fb361f13580da3f5a98ad8b494a788666)
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Reviewer's Guide

Updates the kernel base from 6.18.4 to 6.18.5, pulling in upstream scheduler randomness-based newidle balancing, fixing NFS local I/O credential scoping, and refining MPTCP fastclose handling and state tracking.

Sequence diagram for scheduler newidle balancing with NI_RANDOM:

```mermaid
sequenceDiagram
participant CPU
participant this_rq as rq_this
participant sd as sched_domain
participant rng as sched_rng
CPU->>this_rq: sched_balance_newidle(rq_this, rf)
this_rq->>sd: rcu_dereference_check_sched_domain(this_rq->sd)
alt no_sched_domain
this_rq-->>CPU: return (no balancing)
else has_sched_domain
this_rq->>sd: check get_rd_overloaded && avg_idle < max_newidle_lb_cost
alt not_overloaded_or_too_short_idle
this_rq->>sd: update_next_balance(sd, next_balance)
this_rq-->>CPU: return
else overloaded_and_idle_long_enough
loop over_domains
this_rq->>sd: check SD_BALANCE_NEWIDLE
alt SD_BALANCE_NEWIDLE_set
sd->>sd: weight = 1
alt NI_RANDOM_enabled
this_rq->>rng: sched_rng()
rng-->>this_rq: random_u32
this_rq->>sd: d1k = random_u32 % 1024
this_rq->>sd: weight = 1 + sd.newidle_ratio
alt d1k > weight
this_rq->>sd: update_newidle_stats(sd, success=0)
this_rq-->>this_rq: continue (skip balance)
else d1k <= weight
this_rq->>sd: weight = (1024 + weight/2) / weight
end
end
this_rq->>this_rq: t0 = sched_clock_cpu(this_cpu)
this_rq->>this_rq: pulled_task = sched_balance_rq(...)
this_rq->>this_rq: t1 = sched_clock_cpu(this_cpu)
this_rq->>sd: domain_cost = t1 - t0
this_rq->>sd: update_newidle_cost(sd, domain_cost, weight * !!pulled_task)
end
end
this_rq-->>CPU: return
end
end
```
Sequence diagram for NFS local I/O per-iteration credential override:

```mermaid
sequenceDiagram
participant Worker as nfs_local_call_read
participant Filp as struct_file
participant Ops as file_operations
loop for_each_iter_i
Worker->>Worker: configure iocb->kiocb.ki_flags (DIRECT or not)
Worker->>Worker: save_cred = override_creds(Filp.f_cred)
Worker->>Ops: read_iter(&iocb->kiocb, &iocb->iters[i])
Ops-->>Worker: status
Worker->>Worker: revert_creds(save_cred)
alt status != -EIOCBQUEUED
Worker->>Worker: handle partial read or errors
end
end
```
Sequence diagram for MPTCP fastclose and subflow disconnect:

```mermaid
sequenceDiagram
participant App as Application
participant Msk as mptcp_sock
participant MPTCP as mptcp_core
participant Ssk as subflow_sock
App->>MPTCP: mptcp_do_fastclose(sk)
MPTCP->>Msk: mptcp_set_state(sk, TCP_CLOSE)
MPTCP->>Msk: Msk.fastclosing = 1
MPTCP-->>App: return
App->>MPTCP: __mptcp_close_ssk(sk, ssk, flags)
MPTCP->>MPTCP: need_push = compute_push(flags)
alt !dispose_it
MPTCP->>MPTCP: __mptcp_retransmit_pending_data(sk)
MPTCP->>MPTCP: __mptcp_subflow_disconnect(ssk, subflow, Msk.fastclosing)
MPTCP-->>App: return
else dispose_it
MPTCP->>MPTCP: close subflow and cleanup
MPTCP-->>App: return
end
```
Class diagram for updated scheduler domain and RNG state:

```mermaid
classDiagram
class sched_domain {
+unsigned int newidle_call
+unsigned int newidle_success
+unsigned int newidle_ratio
+u64 max_newidle_lb_cost
+unsigned long last_decay_max_lb_cost
+unsigned long last_balance
+unsigned int balance_interval
+unsigned int nr_balance_failed
}
class rnd_state {
<<per_cpu>>
}
class rq {
<<per_cpu_shared_aligned>>
}
class sched_features {
+bool WA_BIAS
+bool UTIL_EST
+bool LATENCY_WARN
+bool NI_RANDOM
}
class sched_core_helpers {
+bool update_newidle_cost(sched_domain sd, u64 cost, unsigned int success)
+void update_newidle_stats(sched_domain sd, unsigned int success)
+u32 sched_rng()
+void sched_init_smp()
}
sched_domain --> sched_core_helpers : used_by
rnd_state --> sched_core_helpers : provides_state_for
rq --> sched_core_helpers : used_by
sched_features --> sched_core_helpers : controls_behavior
```
Class diagram for updated MPTCP socket state:

```mermaid
classDiagram
class mptcp_sock {
+unsigned int fastopening : 1
+unsigned int in_accept_queue : 1
+unsigned int free_first : 1
+unsigned int rcvspace_init : 1
+unsigned int fastclosing : 1
+u32 notsent_lowat
+int keepalive_cnt
+int keepalive_idle
}
class mptcp_protocol_helpers {
+void mptcp_do_fastclose(sock sk)
+int mptcp_disconnect(sock sk, int flags)
+void __mptcp_close_ssk(sock sk, sock ssk, int flags)
+void __mptcp_subflow_disconnect(sock ssk, mptcp_subflow_context subflow, bool fastclosing)
}
mptcp_sock <-- mptcp_protocol_helpers : operates_on
```
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Hey - I've found 2 issues, and left some high level feedback:
- The new per-CPU random state initialization in sched_init_smp() uses prandom_init_once(&sched_rnd_state), but prandom_init_once is designed for a single struct rnd_state, not a __percpu variable; consider explicitly seeding each CPU’s sched_rnd_state instance instead of passing the percpu symbol address.
- In mptcp, __mptcp_subflow_disconnect() no longer uses the 'flags' argument and only consumes the 'fastclosing' boolean, so you might simplify the signature and call sites further by dropping the unused 'subflow' argument or clarifying why it must be retained.
## Individual Comments
### Comment 1
<location> `kernel/sched/core.c:124-127` </location>
<code_context>

```diff
 EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
 DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+DEFINE_PER_CPU(struct rnd_state, sched_rnd_state);
 #ifdef CONFIG_SCHED_PROXY_EXEC
```

</code_context>
<issue_to_address>
**issue (bug_risk):** Per-CPU rnd_state is only initialized once and likely only for a single CPU instance.
`DEFINE_PER_CPU(struct rnd_state, sched_rnd_state);` combined with a single `prandom_init_once(&sched_rnd_state);` in `sched_init_smp()` initializes only one per-CPU instance (effectively CPU0). Other CPUs will call `prandom_u32_state(this_cpu_ptr(&sched_rnd_state))` with an uninitialized `rnd_state`, leading to undefined behaviour and poor randomness on non-boot CPUs.
Please ensure each CPU’s `sched_rnd_state` is explicitly initialized (e.g. iterate with `for_each_possible_cpu()` and use `prandom_seed()` / `prandom_u32_state_init()`, or give each CPU its own `prandom_init_once` guard) rather than initializing the per-CPU symbol only once globally.
</issue_to_address>
### Comment 2
<location> `kernel/sched/fair.c:12125` </location>
<code_context>

```diff
 }
-static inline bool update_newidle_cost(struct sched_domain *sd, u64 cost)
+static inline void update_newidle_stats(struct sched_domain *sd, unsigned int success)
 {
+	sd->newidle_call++;
```

</code_context>
<issue_to_address>
**issue (complexity):** Consider separating newidle cost tracking from NI_RANDOM statistics and encapsulating the probability logic in a helper so the code’s responsibilities and control flow are more explicit and easier to follow.
You can reduce the added complexity without changing behavior by:
1. **Decoupling cost/decay from NI_RANDOM stats.**
2. **Encapsulating the NI_RANDOM probability logic into a helper.**
3. **Giving `weight` a single clear meaning at the call site.**
### 1. Split stats from cost tracking
Right now `update_newidle_cost()` both updates stats and cost/decay. You can keep its original responsibility and move NI_RANDOM stats to a separate helper:
```c
static inline void update_newidle_stats(struct sched_domain *sd,
unsigned int success)
{
sd->newidle_call++;
sd->newidle_success += success;
if (sd->newidle_call >= 1024) {
sd->newidle_ratio = sd->newidle_success;
sd->newidle_call /= 2;
sd->newidle_success /= 2;
}
}
static inline bool update_newidle_cost(struct sched_domain *sd, u64 cost)
{
unsigned long next_decay = sd->last_decay_max_lb_cost + HZ;
unsigned long now = jiffies;
if (cost > sd->max_newidle_lb_cost) {
sd->max_newidle_lb_cost = cost;
sd->last_decay_max_lb_cost = now;
} else if (time_after(now, next_decay)) {
sd->max_newidle_lb_cost =
(sd->max_newidle_lb_cost * 253) / 256;
sd->last_decay_max_lb_cost = now;
return true;
}
return false;
}
```
Call sites then become explicit about *why* stats are updated:
```c
/* decay-only path */
need_decay = update_newidle_cost(sd, 0);
/* newidle balance path */
domain_cost = t1 - t0;
curr_cost += domain_cost;
t0 = t1;
update_newidle_stats(sd, success_weight * !!pulled_task);
update_newidle_cost(sd, domain_cost);
```
This removes the hidden “if (cost) update_newidle_stats(..)” coupling and makes both responsibilities straightforward.
### 2. Encapsulate NI_RANDOM dice + weight logic
The inlined NI_RANDOM block both decides whether to run and computes the success multiplier, while mutating `weight` in two roles. You can hide that behind a helper with a clear contract:
```c
static inline bool newidle_should_run(struct sched_domain *sd,
unsigned int *success_weight)
{
*success_weight = 1;
if (!sched_feat(NI_RANDOM))
return true;
/*
* Throw a 1k sided dice; and only run newidle_balance according
* to the observed success rate.
*/
u32 d1k = sched_rng() % 1024;
unsigned int w = 1 + sd->newidle_ratio;
if (d1k > w) {
update_newidle_stats(sd, 0);
return false; /* skip balance */
}
*success_weight = (1024 + w / 2) / w;
return true; /* run balance */
}
```
Then the main loop becomes linear and `weight` has a single meaning (“success multiplier”):
```c
for_each_domain(this_cpu, sd) {
u64 domain_cost;
unsigned int success_weight = 1;
update_next_balance(sd, &next_balance);
if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost)
break;
if (sd->flags & SD_BALANCE_NEWIDLE) {
if (!newidle_should_run(sd, &success_weight))
continue;
pulled_task = sched_balance_rq(this_cpu, this_rq, sd,
CPU_NEWLY_IDLE,
&continue_balancing);
t1 = sched_clock_cpu(this_cpu);
domain_cost = t1 - t0;
curr_cost += domain_cost;
t0 = t1;
update_newidle_stats(sd, success_weight * !!pulled_task);
update_newidle_cost(sd, domain_cost);
}
if (pulled_task || !continue_balancing)
break;
}
```
This preserves all existing behavior (same dice, same `newidle_ratio` usage, same success scaling) while:
- Removing mixed responsibilities from `update_newidle_cost`.
- Making the NI_RANDOM behavior self-contained and easier to reason about.
- Avoiding `weight` being reused in two unrelated roles.
</issue_to_address>
Pull request overview
This PR updates the Linux kernel from version 6.18.4 to 6.18.5, incorporating 6 upstream commits that address scheduler optimizations, NFS credential handling, and MPTCP protocol state tracking.
Changes:
- Introduces randomized newidle balancing in the scheduler based on historical success rates to optimize load balancing decisions
- Fixes NFS local I/O to properly scope credential overrides per iteration rather than across the entire loop
- Adds explicit fastclose state tracking in MPTCP to fix subflow disconnect behavior
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| Makefile | Bumps kernel SUBLEVEL from 4 to 5 |
| include/linux/sched/topology.h | Adds newidle balancing statistics fields to sched_domain |
| kernel/sched/sched.h | Introduces per-CPU RNG state and helper for scheduler randomization |
| kernel/sched/core.c | Initializes per-CPU RNG state during SMP initialization |
| kernel/sched/features.h | Adds NI_RANDOM feature flag for randomized newidle balancing |
| kernel/sched/topology.c | Initializes newidle statistics to 50% success rate baseline |
| kernel/sched/fair.c | Implements randomized newidle balancing logic and statistics tracking |
| net/mptcp/protocol.h | Adds fastclosing bit field to mptcp_sock structure |
| net/mptcp/protocol.c | Updates fastclose tracking and subflow disconnect logic |
| fs/nfs/localio.c | Moves credential override/revert inside iteration loops |
Context (`kernel/sched/fair.c`):

```c
 * rate.
 */
u32 d1k = sched_rng() % 1024;
weight = 1 + sd->newidle_ratio;
```

**Copilot AI** (Jan 16, 2026):

The weight calculation adds 1 to `newidle_ratio`, which can reach 1025 (1 + 1024 max). However, the comparison checks `if (d1k > weight)`, where `d1k` is `sched_rng() % 1024` (0–1023). This means that when `newidle_ratio` is at its maximum (1024), `weight` becomes 1025 and the skip condition can never be true, since `d1k` can never exceed 1023. This appears inconsistent with the intended probabilistic behavior. The logic should likely be `if (d1k >= weight)` to properly handle the boundary case, or the weight calculation needs adjustment.

Suggested change:

```c
weight = 1 + sd->newidle_ratio;
if (weight > 1023)
	weight = 1023;
```
Context (`kernel/sched/topology.c`):

```c
.last_balance = jiffies,
.balance_interval = sd_weight,
/* 50% success rate */
```

**Copilot AI** (Jan 16, 2026):

The initialization values represent a 50% success rate as noted in the comment, but the relationship between these values isn't immediately clear. Consider adding a comment explaining that `newidle_ratio = (newidle_success * 1024) / newidle_call`, which equals 512 for the 50% baseline, to help future maintainers understand the invariant.

Suggested change:

```c
/* 50% success rate:
 * newidle_ratio = (newidle_success * 1024) / newidle_call
 * (256 * 1024) / 512 = 512
 */
```
Context (`kernel/sched/fair.c`):

```c
if (d1k > weight) {
	update_newidle_stats(sd, 0);
	continue;
}
```

**Copilot AI** (Jan 16, 2026):

This weight calculation implements rounding division but the purpose of the weight transformation is unclear. Consider adding a comment explaining that this converts the success probability into a scaling factor for the statistics update, helping maintainers understand the randomization algorithm.

Suggested change:

```c
/*
 * Convert the (1..1024) success probability into a
 * scaling factor for the statistics update. We ran
 * newidle_balance with probability weight/1024, so
 * use ~1024/weight (rounded) to keep the expected
 * newidle cost comparable to the non-randomized case.
 */
```
Update kernel base to 6.18.5.

```
$ git log --oneline v6.18.4..v6.18.5 | wc
      6      35     345
```
Summary by Sourcery

Update to Linux kernel 6.18.5 while refining scheduler newidle balancing heuristics, fixing NFS local I/O credential handling, and improving MPTCP fast close state tracking.