Add ADAM and RMSProp optimizers #2041
Conversation
Memory benchmark result

| Test Name                            | %Δ      | Master (MB) | PR (MB)   | Δ (MB)  | Time PR (s) | Time Master (s) |
| ------------------------------------ | ------- | ----------- | --------- | ------- | ----------- | --------------- |
| test_objective_jac_w7x               | 2.08 %  | 3.952e+03   | 4.034e+03 | 82.14   | 47.33       | 43.93           |
| test_proximal_jac_w7x_with_eq_update | 1.70 %  | 6.613e+03   | 6.726e+03 | 112.66  | 198.04      | 198.83          |
| test_proximal_freeb_jac              | -0.08 % | 1.323e+04   | 1.322e+04 | -10.12  | 106.29      | 106.91          |
| test_proximal_freeb_jac_blocked      | -0.06 % | 7.550e+03   | 7.545e+03 | -4.70   | 93.96       | 94.68           |
| test_proximal_freeb_jac_batched      | 0.17 %  | 7.561e+03   | 7.575e+03 | 13.02   | 93.96       | 94.27           |
| test_proximal_jac_ripple             | -0.27 % | 3.579e+03   | 3.570e+03 | -9.60   | 71.90       | 73.01           |
| test_proximal_jac_ripple_bounce1d    | -2.68 % | 3.765e+03   | 3.664e+03 | -101.03 | 84.17       | 85.57           |
| test_eq_solve                        | -0.09 % | 2.028e+03   | 2.026e+03 | -1.75   | 101.17      | 101.58          |

For the memory plots, go to the summary of
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@ Coverage Diff @@
## master #2041 +/- ##
=======================================
Coverage 95.75% 95.76%
=======================================
Files 102 102
Lines 28344 28374 +30
=======================================
+ Hits 27142 27171 +29
- Misses 1202 1203 +1
desc/optimize/stochastic.py
> Update rule for 'rmsprop':
>
>     v_{k} = beta*v_{k-1} + (1-beta)*grad(x_{k})^2
>     x_{k+1} = x_{k} - alpha * grad(x_{k}) / (sqrt(v_{k}) + epsilon)
Some literature uses different Greek letters for alpha and beta, but I used these to make the API slightly more compact.
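For reference, a minimal standalone sketch of this update rule (illustrative only, not the code added in this PR; the default values of `alpha`, `beta`, and `epsilon` here are assumptions, not the PR's defaults):

```python
import numpy as np

def rmsprop_step(x, grad, v, alpha=1e-2, beta=0.9, epsilon=1e-8):
    """One RMSProp update following the rule quoted above.

    x, grad, and v are arrays of the same shape; alpha is the step size,
    beta the averaging factor, epsilon a numerical-stability constant.
    """
    # v_k = beta * v_{k-1} + (1 - beta) * grad(x_k)^2
    v = beta * v + (1 - beta) * grad**2
    # x_{k+1} = x_k - alpha * grad(x_k) / (sqrt(v_k) + epsilon)
    x_new = x - alpha * grad / (np.sqrt(v) + epsilon)
    return x_new, v
```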
Sorry for the unrelated change, noticed this while fixing the doc errors
>         ValueError,
>         "x_scale should be one of 'auto' or array-like, got {}".format(x_scale),
>     )
>     if isinstance(x_scale, str):
Should say in the docstring that `x_scale="auto"` essentially means no scaling, since we just set x_scale to 1 in that case.
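A minimal sketch of that behavior (hypothetical helper name and signature, not the PR's actual code): when `x_scale="auto"`, the scale vector is simply all ones, so the update is effectively unscaled.

```python
import numpy as np

def resolve_x_scale(x_scale, x):
    """Return a per-variable scale vector; "auto" means no scaling."""
    if isinstance(x_scale, str):
        if x_scale != "auto":
            raise ValueError(
                "x_scale should be one of 'auto' or array-like, got {}".format(x_scale)
            )
        # "auto" -> a scale of 1 for every variable, i.e. no scaling
        return np.ones_like(x)
    # otherwise use the user-supplied array-like scale as-is
    return np.asarray(x_scale, dtype=float)
```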
dpanici
left a comment
Just the small docstring fix; it should be explicit that `x_scale="auto"` does no scaling here.
`x_scale` is now used with SGD methods too.