-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-13846. Handle sequenceid synch during close container in SCM. #9228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| // TODO: Remove the protobuf conversion after fixing ContainerStateMap. | ||
| final ContainerID id = ContainerID.getFromProtobuf(containerID); | ||
|
|
||
| try (AutoCloseableLock ignored = writeLock(id)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A read lock is acquired in the previous method. Can you check if there is possibility of a deadlock or some other problem with acquiring the write lock here while the read lock is held? AFAIK java doesn't allow a normal reentrant read write lock to be upgraded from read to write. Not sure if the striped lock being used here has some handling for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not used this method anymore now.
| final ContainerInfo containerInfo = containers.getContainerInfo(id); | ||
| Long currentSequenceId = containerInfo.getSequenceId(); | ||
| // Delegate to @Replicate method with current sequenceId | ||
| updateContainerStateWithSequenceId(containerID, event, currentSequenceId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should call the interface method updateContainerStateWithSequenceId instead of the implementation being called here, right? Only then will the proxy be invoked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Siddhant for pointing, updated the fix.
| // Get current info instead of stale oldInfo | ||
| ContainerInfo currentInfo = containers.getContainerInfo(id); | ||
| currentInfo.setState(oldState); // Revert only the state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed when line 385 is reverting the change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverting here is for transactionBuffer
siddhantsangwan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashishkumar50 I left some comments. Please also add some testing for this change.
...server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
Show resolved
Hide resolved
...server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
Outdated
Show resolved
Hide resolved
@siddhantsangwan Thanks for the review, fixed comments and added UT. |
...server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
Outdated
Show resolved
Hide resolved
...server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManagerImpl.java
Outdated
Show resolved
Hide resolved
siddhantsangwan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one comment above. Otherwise LGTM.
siddhantsangwan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @ashishkumar50 for fixing this. Pending green CI.
What changes were proposed in this pull request?
Currently in SCM only state is synched from leader to followers using raft. In some cases sequence id in follower may lag even though state gets changed to CLOSED. More details described in Jira.
In this PR, we are also synching sequence id from leader to follower when there is state change.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13846
How was this patch tested?
Verified in docker cluster and added new test