Fixes for structured LBFGS approximation. #38
base: develop
Changes from all commits
9613f3d
b534954
91b1d30
11d9611
321f418
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -349,21 +349,27 @@ def qn_matvec(self, v): | |
| for i in range(npairs): | ||
| k = (self.insert + i) % npairs | ||
| if ys[k] is not None: | ||
| # Form all a[] and ad[] vectors for the current step | ||
| a[:, k] = y[:, k] - s[:, k] / self.gamma | ||
| ad[:, k] = yd[:, k] - s[:, k] / self.gamma | ||
| for j in range(i): | ||
| l = (self.insert + j) % npairs | ||
| if ys[l] is not None: | ||
| alTs = np.dot(a[:, l], s[:, k]) | ||
| adlTs = np.dot(ad[:, l], s[:, k]) | ||
| update = -alTs / aTs[l] * ad[:, l] - adlTs / aTs[l] * \ | ||
| a[:, l] + adTs[l] / aTs[l] * alTs * a[:, l] | ||
| a[:, k] += update.copy() | ||
| ad[:, k] += update.copy() | ||
| aTsk = np.dot(a[:, l], s[:, k]) | ||
| adTsk = np.dot(ad[:, l], s[:, k]) | ||
| aTsl = np.dot(a[:, l], s[:, l]) | ||
| adTsl = np.dot(ad[:, l], s[:, l]) | ||
| update = (aTsk / aTsl) * ad[:, l] + (adTsk / aTsl) * a[:, l] - \ | ||
| (aTsk * adTsl / aTsl**2) * a[:, l] | ||
| a[:, k] -= update.copy() | ||
| ad[:, k] -= update.copy() | ||
|
Collaborator
Why these changes? It's hard to tell whether they actually modify the behavior or are just a rewrite.
Author
If I remember correctly, it's just a rewrite to make the similarity between structured LBFGS and structured LSR1 obvious.
Collaborator
Ok. In that case, it's wasteful to recompute the dot products that are already stored in
Author
Ok, I'll simplify it. |
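Whether the change preserves behavior can be checked directly from the two versions in the diff. The sketch below is illustrative only: the helper names `update_removed` and `update_added` and the test vectors are made up here and are not part of the PR. It evaluates both update expressions on the same inputs, taking `aTs[l]` and `adTs[l]` as the cached inner products. The first two terms agree once the `+=` versus `-=` sign flip is accounted for, but the last term is divided by `aTs[l]` once in the removed lines and by `aTs[l]**2` in the added lines, so the two steps coincide only when `aTs[l] == 1`.

```python
import numpy as np

def update_removed(a_l, ad_l, s_k, aTs_l, adTs_l):
    """Term that the removed lines add (+=) to a[:, k] and ad[:, k]."""
    alTs = np.dot(a_l, s_k)
    adlTs = np.dot(ad_l, s_k)
    return (-alTs / aTs_l * ad_l - adlTs / aTs_l * a_l
            + adTs_l / aTs_l * alTs * a_l)

def update_added(a_l, ad_l, s_k, aTs_l, adTs_l):
    """Term that the added lines subtract (-=) from a[:, k] and ad[:, k]."""
    aTsk = np.dot(a_l, s_k)
    adTsk = np.dot(ad_l, s_k)
    return ((aTsk / aTs_l) * ad_l + (adTsk / aTs_l) * a_l
            - (aTsk * adTs_l / aTs_l**2) * a_l)

# Made-up vectors, for illustration only.
a_l = np.array([1.0, 2.0, 0.0])
ad_l = np.array([0.0, 1.0, 1.0])
s_l = np.array([2.0, 1.0, 0.0])
s_k = np.array([1.0, 0.0, 1.0])
aTs_l = np.dot(a_l, s_l)    # plays the role of the cached aTs[l]
adTs_l = np.dot(ad_l, s_l)  # plays the role of the cached adTs[l]

old_step = update_removed(a_l, ad_l, s_k, aTs_l, adTs_l)  # net change, old code
new_step = -update_added(a_l, ad_l, s_k, aTs_l, adTs_l)   # net change, new code
print(np.allclose(old_step, new_step))  # prints False
```

On these inputs the two net changes differ by a multiple of `a[:, l]`, which is consistent with the `aTs[k]**2` denominator already used in the `q` update further down.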
||
|
|
||
| # Form inner products with current s[] and input vector | ||
| aTs[k] = np.dot(a[:, k], s[:, k]) | ||
| adTs[k] = np.dot(ad[:, k], s[:, k]) | ||
| aTv = np.dot(a[:, k], v[:]) | ||
| adTv = np.dot(ad[:, k], v[:]) | ||
| q += aTv / aTs[k] * ad[:, k] + adTv / aTs[k] * \ | ||
| a[:, k] - aTv * adTs[k] / aTs[k]**2 * a[:, k] | ||
|
|
||
| q += (aTv / aTs[k]) * ad[:, k] + (adTv / aTs[k]) * a[:, k] - \ | ||
| (aTv * adTs[k] / aTs[k]**2) * a[:, k] | ||
| return q | ||
Wasn't this already computed in aTs[l]?
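In the code shown, `aTs[l]` and `adTs[l]` are filled in at the end of iteration `l` of the outer loop, and columns `a[:, l]` and `ad[:, l]` are not modified afterwards, so the inner loop can reuse the stored values instead of recomputing `aTsl` and `adTsl`. A minimal sketch of that simplification, assuming the column-per-pair array layout seen in the diff; the function name `subtract_previous_pairs` and its `previous` argument are hypothetical and do not appear in the PR:

```python
import numpy as np

def subtract_previous_pairs(a, ad, s, aTs, adTs, k, previous):
    """Subtract the contribution of earlier pairs from columns k of a and ad.

    Reuses the cached inner products aTs[l] and adTs[l] rather than
    recomputing np.dot(a[:, l], s[:, l]) and np.dot(ad[:, l], s[:, l]).
    """
    for l in previous:
        aTsk = np.dot(a[:, l], s[:, k])
        adTsk = np.dot(ad[:, l], s[:, k])
        update = ((aTsk / aTs[l]) * ad[:, l] + (adTsk / aTs[l]) * a[:, l]
                  - (aTsk * adTs[l] / aTs[l]**2) * a[:, l])
        a[:, k] -= update
        ad[:, k] -= update
```

The `.copy()` calls from the diff are dropped in this sketch because `update` is already a freshly allocated array, so subtracting it from both columns does not alias either of them.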