fix(scheduler): schedule tasks on new pods when BatchSandbox scales out #399

zerone0x wants to merge 1 commit into alibaba:main
Conversation
When a BatchSandbox is scaled out (replicas increased), the existing in-memory `TaskScheduler` was not informed of the new task specs for the additional replicas. As a result, newly created pods received no task assignment, leaving them idle.

Root causes fixed:

1. Add `TaskScheduler.AddTasks()` – a new interface method that appends task nodes for specs not yet tracked by the scheduler. Already-tracked task names are silently skipped, making it safe to call with the full task list on every reconciliation cycle.
2. Call `AddTasks()` in `getTaskScheduler()` after `UpdatePods()` so that any replicas added since the scheduler was first created get their task nodes registered immediately.
3. Remove the spurious `pods = append(pods, pod)` in `scaleBatchSandbox()` that doubled every entry in the local `pods` slice on each iteration without any effect (`range` evaluates the initial slice length once), making the code easier to read and reason about.

Fixes alibaba#102

Co-Authored-By: Claude <noreply@anthropic.com>
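The no-op `append` described in point 3 can be demonstrated in isolation. A minimal sketch (pod names simplified to strings; `doubleViaRangeAppend` is an illustrative helper, not code from this PR):

```go
package main

import "fmt"

// doubleViaRangeAppend shows why appending to a slice inside a range
// loop does not extend the iteration: range captures the slice header
// (pointer, length, capacity) once, before the first iteration.
func doubleViaRangeAppend(pods []string) []string {
	for _, pod := range pods {
		pods = append(pods, pod) // mirrors the spurious line removed by this PR
	}
	return pods
}

func main() {
	out := doubleViaRangeAppend([]string{"pod-0", "pod-1"})
	fmt.Println(len(out)) // 4: each original entry appended exactly once
}
```

The loop terminates because it iterates only over the original length; the only effect is that every entry ends up duplicated in the local slice, which is why removing the line changes no behaviour.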
Pull request overview
Fixes the scheduler cache behavior so newly created pods receive task assignments after a BatchSandbox scales out (replicas increased).
Changes:
- Added `AddTasks()` to the `TaskScheduler` interface and implemented it in the default scheduler.
- Updated controller reconciliation to call `AddTasks()` after `UpdatePods()` when reusing a cached scheduler.
- Updated mocks and added unit tests for `AddTasks`; removed a confusing no-op `append` in scale code.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| kubernetes/internal/scheduler/interface.go | Extends scheduler interface with AddTasks() to support scale-out. |
| kubernetes/internal/scheduler/default_scheduler.go | Implements AddTasks() to append previously-untracked task nodes. |
| kubernetes/internal/scheduler/mock/interface.go | Updates generated gomock to include AddTasks(). |
| kubernetes/internal/controller/batchsandbox_controller.go | Calls AddTasks() on cached scheduler and removes spurious append. |
| kubernetes/internal/scheduler/default_scheduler_test.go | Adds unit tests validating AddTasks() behavior across scenarios. |
```go
	UpdatePods(pod []*corev1.Pod)
	ListTask() []Task
	StopTask() []Task
	// AddTasks registers task specs that are not yet tracked by the scheduler.
	// Tasks whose names are already tracked are silently skipped, making this
	// safe to call with the full task list during a scale-out reconciliation.
	AddTasks(tasks []*apis.Task) error
```
The new AddTasks method uses *apis.Task here, but the implementation and mock in this PR use *api.Task. If these are different packages/types, defaultTaskScheduler and MockTaskScheduler will not satisfy TaskScheduler and the build will fail. Unify the task type across the interface, default implementation, and mock (same import path and identifier), and regenerate mocks if needed.
```go
func (sch *defaultTaskScheduler) AddTasks(tasks []*api.Task) error {
	newNodes, err := initTaskNodes(tasks)
	if err != nil {
		return err
	}
	for _, node := range newNodes {
		if _, exists := sch.taskNodeByNameIndex[node.Name]; !exists {
			sch.taskNodes = append(sch.taskNodes, node)
			sch.taskNodeByNameIndex[node.Name] = node
		}
	}
	return nil
}
```
AddTasks currently calls initTaskNodes(tasks) for the full task list every time, even though most tasks may already be tracked. In a reconciliation loop, this can become unnecessarily expensive as replica counts grow (allocating []*taskNode proportional to total replicas every call). Consider filtering to only the missing task specs first (by checking taskNodeByNameIndex on the incoming tasks), then initializing/appending nodes only for that smaller subset.
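The filtering the reviewer suggests can be sketched as follows. The types here (`Task`, `taskNode`, `initTaskNodes`, `scheduler`) are simplified stand-ins for the scheduler's real ones, so this is an illustration of the idea rather than a drop-in patch:

```go
package main

import "fmt"

// Simplified stand-ins for api.Task and the scheduler's taskNode.
type Task struct{ Name string }
type taskNode struct{ Name string }

// Stand-in for the real initTaskNodes: builds one node per spec.
func initTaskNodes(tasks []*Task) ([]*taskNode, error) {
	nodes := make([]*taskNode, 0, len(tasks))
	for _, t := range tasks {
		nodes = append(nodes, &taskNode{Name: t.Name})
	}
	return nodes, nil
}

type scheduler struct {
	taskNodes           []*taskNode
	taskNodeByNameIndex map[string]*taskNode
}

// AddTasks filters out already-tracked specs *before* building nodes,
// so per-call allocation is proportional to the number of new tasks,
// not the total replica count.
func (sch *scheduler) AddTasks(tasks []*Task) error {
	var missing []*Task
	for _, t := range tasks {
		if _, exists := sch.taskNodeByNameIndex[t.Name]; !exists {
			missing = append(missing, t)
		}
	}
	if len(missing) == 0 {
		return nil // fast path: steady-state reconciliation allocates nothing
	}
	newNodes, err := initTaskNodes(missing)
	if err != nil {
		return err
	}
	for _, node := range newNodes {
		sch.taskNodes = append(sch.taskNodes, node)
		sch.taskNodeByNameIndex[node.Name] = node
	}
	return nil
}

func main() {
	sch := &scheduler{taskNodeByNameIndex: map[string]*taskNode{}}
	_ = sch.AddTasks([]*Task{{Name: "t0"}, {Name: "t1"}})
	_ = sch.AddTasks([]*Task{{Name: "t0"}, {Name: "t1"}, {Name: "t2"}}) // idempotent
	fmt.Println(len(sch.taskNodes)) // 3
}
```

The behaviour is unchanged from the PR's version; only the allocation pattern differs, which matters when the same full task list is passed on every reconciliation cycle.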
Summary
Closes #102
When a BatchSandbox is scaled out (replicas increased), the existing in-memory `TaskScheduler` was not informed of the new task specs for the additional replicas. As a result, newly created pods received no task assignment, leaving them idle indefinitely.

Root Causes & Fixes

1. `TaskScheduler.AddTasks()` – new interface method

   Appends task nodes for specs not yet tracked by the scheduler. Already-tracked task names are silently skipped, making it safe to call with the full task list on every reconciliation cycle (no duplicates).

2. Call `AddTasks()` in `getTaskScheduler()` after `UpdatePods()`

   The existing code only called `UpdatePods(pods)` when a scheduler was retrieved from the cache. Now it also calls `AddTasks(taskSpecs)` with the current replica set so that any replicas added since the scheduler was first created get their task nodes registered immediately and can receive pod assignments on the next `Schedule()` call.

3. Remove spurious `pods = append(pods, pod)` in `scaleBatchSandbox()`

   This line doubled every entry in the local `pods` slice on each iteration without any practical effect (Go's `range` evaluates the initial slice header once), making the code harder to reason about. Removing it has no behaviour change but improves clarity.

Changes
- kubernetes/internal/scheduler/interface.go – add `AddTasks([]*api.Task) error` to the `TaskScheduler` interface
- kubernetes/internal/scheduler/default_scheduler.go – implement `AddTasks` on `defaultTaskScheduler`
- kubernetes/internal/scheduler/mock/interface.go – regenerate mock to include `AddTasks`
- kubernetes/internal/controller/batchsandbox_controller.go – call `AddTasks` on scale-out; remove spurious append
- kubernetes/internal/scheduler/default_scheduler_test.go – add `Test_addTasks` covering scale-out, no-op, and empty-scheduler cases

Testing
- Existing unit tests pass (`TestBatchSandboxReconciler_scheduleTasks`, `Test_parseIndex`, `Test_assignTaskNodes`, `Test_scheduleTaskNodes`, `Test_initTaskNodes`)
- New `Test_addTasks` covering three scenarios: scale-out (new tasks appended, existing skipped), no-op (same tasks, no duplicates), empty scheduler (initial population)

Breaking Changes
`AddTasks` is a new method added to the `TaskScheduler` interface. The mock has been updated accordingly; any other custom implementations of this internal interface will need to add the method.
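For illustration, satisfying the new interface method from a custom implementation might look like this. `myScheduler` and the simplified `Task` type are hypothetical stand-ins, not part of this PR:

```go
package main

import "fmt"

// Stand-in for api.Task (assumed shape: a named task spec).
type Task struct{ Name string }

// myScheduler is a hypothetical custom implementation of the internal
// TaskScheduler interface. After this PR it must also implement AddTasks.
type myScheduler struct{ tracked map[string]bool }

// AddTasks registers previously-unseen task specs; scheduling itself
// would happen elsewhere in a real implementation.
func (s *myScheduler) AddTasks(tasks []*Task) error {
	for _, t := range tasks {
		s.tracked[t.Name] = true
	}
	return nil
}

func main() {
	s := &myScheduler{tracked: map[string]bool{}}
	_ = s.AddTasks([]*Task{{Name: "t0"}})
	fmt.Println(len(s.tracked)) // 1
}
```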