Extend the db-maintenance check logic to handle duplicate ongoing builds #41
The weekly build starts on a Friday and typically takes about 5 days, with DB maintenance happening in the last 10-20 hours when the (quick) SwapTables event and the (not quick) CodedEvent_SNOMED rebuild event happen.
Occasionally something delays or slows down a build so that it takes longer than a week, and the next Friday's build starts before the previous one has finished. The previous logic for the maintenance mode check only looked at the most recently started overall build; this meant that if a new build started before the previous one had finished, the new build would have no associated SwapTables/CodedEvent_SNOMED events yet, and we'd report that we were out of maintenance mode when we weren't.
We now look for the most recent TWO builds, so we can check that they're not both ongoing. If they are both ongoing, we use the earlier of the two to check for associated SwapTables/CodedEvent_SNOMED events. If we determine that we're not in maintenance mode, we now also do a final check to ensure that the CodedEvent_SNOMED table really is available.
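As a rough illustration of the decision described above (not the code in this PR), the check can be sketched as follows. The `Build` record, its field names, the `check_maintenance` function and the `table_available` callback are all hypothetical stand-ins. The sketch also assumes that an unavailable CodedEvent_SNOMED table is reported as being in maintenance mode, and that the build count is the number of ongoing builds among the latest two.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Build:
    """Hypothetical summary of one weekly build (not the real schema)."""
    started_at: int             # any sortable start timestamp
    finished: bool              # has the overall build completed?
    maintenance_started: bool   # SwapTables/CodedEvent_SNOMED events begun?
    maintenance_finished: bool  # ...and have they completed?


def check_maintenance(
    builds: Sequence[Build],
    table_available: Callable[[], bool],
) -> Tuple[bool, int]:
    """Return (in_maintenance_mode, ongoing_build_count)."""
    # Only the two most recently started builds are considered.
    latest_two = sorted(builds, key=lambda b: b.started_at, reverse=True)[:2]
    ongoing = [b for b in latest_two if not b.finished]
    build_count = len(ongoing)

    in_maintenance = False
    if latest_two:
        # If both builds are ongoing, the earlier one is the build whose
        # maintenance events matter; otherwise use the ongoing build, or
        # the most recent build when nothing is ongoing.
        if ongoing:
            reference = min(ongoing, key=lambda b: b.started_at)
        else:
            reference = latest_two[0]
        in_maintenance = (
            reference.maintenance_started and not reference.maintenance_finished
        )

    # Final safety check (assumed behaviour): if the events say we are not
    # in maintenance but the table is missing, report maintenance anyway.
    if not in_maintenance and not table_available():
        in_maintenance = True

    return in_maintenance, build_count
```

With two overlapping builds where only the earlier one has started its maintenance events, this sketch returns `(True, 2)`, whereas looking only at the newest build would have reported no maintenance.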
We return both the maintenance mode status and the build count so that the RAP Agent can include the build count in telemetry, and we can set up alerts in Honeycomb if we ever see a build count of 2.
Depends on the associated job-runner PR to handle the new output format.
Closes #11
Closes #12