-
Notifications
You must be signed in to change notification settings - Fork 65
Add workflow to run SDGym multi-table benchmark monthly and publish results #518
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: issue-515-_benchmark_multi_table_compute_gcp
Are you sure you want to change the base?
Add workflow to run SDGym multi-table benchmark monthly and publish results #518
Conversation
| workflow_dispatch: | ||
| schedule: | ||
| - cron: '0 5 1 * *' | ||
| push: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for testing the workflows, I will remove it before merging
sdgym/run_benchmark/utils.py
Outdated
| OUTPUT_DESTINATION_AWS = 's3://sdgym-benchmark/Benchmarks/' | ||
| UPLOAD_DESTINATION_AWS = 's3://sdgym-benchmark/Benchmarks/' | ||
| OUTPUT_DESTINATION_AWS = ( | ||
| 's3://sdgym-benchmark/Debug/GCP_Github/' # 's3://sdgym-benchmark/Benchmarks/' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing purposes, TODO: Update it before merging
| def post_benchmark_launch_message(date_str, compute_service='AWS', modality='single_table'): | ||
| """Post a message to the SDV Alerts Slack channel when the benchmark is launched.""" | ||
| channel = SLACK_CHANNEL | ||
| channel = DEBUG_SLACK_CHANNEL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing purposes, TODO: Update it before merging
| def post_benchmark_uploaded_message(folder_name, commit_url=None, modality='single_table'): | ||
| """Post benchmark uploaded message to sdv-alerts slack channel.""" | ||
| channel = SLACK_CHANNEL | ||
| channel = DEBUG_SLACK_CHANNEL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing purposes, TODO: Update it before merging
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## issue-515-_benchmark_multi_table_compute_gcp #518 +/- ##
================================================================================
+ Coverage 78.76% 78.93% +0.16%
================================================================================
Files 33 33
Lines 2793 2825 +32
================================================================================
+ Hits 2200 2230 +30
- Misses 593 595 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
223d2b2 to
9f10efc
Compare
| compute_quality_score=True, | ||
| compute_diagnostic_score=True, | ||
| compute_privacy_score=True, | ||
| compute_privacy_score=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this for testing purposes only or it will change to this now ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the benchmark we don't compute the privacy_score for now:
| compute_privacy_score=False, |
| MODALITY_TO_GDRIVE_LINK = { | ||
| 'single_table': 'https://docs.google.com/spreadsheets/d/1W3tsGOOtbtTw3g0EVE0irLgY_TN_cy2W4ONiZQ57OPo/edit?usp=sharing', | ||
| 'multi_table': 'https://docs.google.com/spreadsheets/d/1R13RktVvKnxRecYIge07OBpbX1vbEkE2D1_2idNAKSY/edit?usp=sharing', | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this won't change probably, but could we use the id from sdgym/run_benchmark/upload_benchmark_results.py if it doesn't cause some circular dependencies ? Or maybe define them in a constants.py file. That way we should just change the id in one file and not have to worry about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we do this, let's move to constants file
amontanez24
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Resolve #516
CU-86b7w17ze
@amontanez24 From what I've experienced, we can currently run 4-gpu machines at the same location:
The initial plan is to have 6 instances running on January 1st (4 for single-table, 2 for multi-table), so I wanted to discuss the options with you:
Maybe there is a better solution that I'm happy to discuss.