Multiple sequential MCD replica reduction repeatedly picks the same machine for deletion #1084

@aaronfern

Description

How to categorize this issue?

/area auto-scaling
/kind bug
/priority 3

What happened:
We recently identified a bug: when an external actor (for example, cluster-autoscaler) performs multiple sequential MCD replica scale-downs, the MCS controller repeatedly picks the same machine for deletion.
This causes problems, especially with how the external actor processes the scale-down. For example, when using cluster-autoscaler with the mcm provider, cluster-autoscaler cordons the nodes it has marked for termination. This leads to cluster states with multiple cordoned nodes that are never deleted.

What you expected to happen:
When selecting a machine for deletion due to a replica reduction, MCM should not select machines that are already in a terminating state.
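The expected behavior can be sketched as a filter applied before the deletion candidate is chosen. This is a hypothetical illustration, not MCM's actual selection code; the `Machine` struct and `pickForDeletion` helper are invented here, and the real controller works with `v1alpha1.Machine` objects whose status carries a phase such as "Terminating".

```go
package main

import "fmt"

// Machine is a simplified stand-in for MCM's machine object,
// carrying only the fields this sketch needs.
type Machine struct {
	Name  string
	Phase string // e.g. "Running", "Terminating"
}

// pickForDeletion returns up to count machines to delete for a
// replica scale-down, skipping machines already in a terminating
// state so that sequential scale-downs select distinct machines.
func pickForDeletion(machines []Machine, count int) []string {
	var picked []string
	for _, m := range machines {
		if len(picked) == count {
			break
		}
		if m.Phase == "Terminating" {
			// Already chosen by a previous scale-down; do not pick again.
			continue
		}
		picked = append(picked, m.Name)
	}
	return picked
}

func main() {
	machines := []Machine{
		{Name: "machine-a", Phase: "Terminating"}, // from an earlier scale-down
		{Name: "machine-b", Phase: "Running"},
		{Name: "machine-c", Phase: "Running"},
	}
	// A second scale-down of one replica should pick machine-b,
	// not machine-a again.
	fmt.Println(pickForDeletion(machines, 1)) // [machine-b]
}
```

Without the phase check, both scale-downs would resolve to the same machine, matching the symptom described above.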

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • Others:

Metadata

Labels

  • area/auto-scaling: Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related
  • kind/bug: Bug
  • priority/1: Priority (lower number equals higher priority)
