![]() |
Stan Math Library
5.2.0
Automatic Differentiation
|
|
inline |
Strong Wolfe line search for maximization.
Finds step $\alpha$ along $a(\alpha) = a_0 + \alpha\,p$ satisfying the strong Wolfe conditions for maximization:
where $\phi(\alpha)$ is the objective and $\phi'(\alpha) = \nabla\phi^\top p$ is the directional derivative. Falls back to Armijo-only or convergence statuses when strong Wolfe cannot be achieved.
The search maintains a left endpoint low, a right endpoint high, and a fallback best (best Armijo-satisfying point seen so far). Non-finite evaluations are contracted by factor opt.tau automatically.
The user given initial step size is expanded by opt.scale_up until either Wolfe conditions are violated or opt.max_alpha is reached. If the first evaluation satisfies both conditions, the search expands further ("zoom-up") to find the largest such step before accepting.
Starting from $\alpha_0 = \text{clamp}(\text{curr.alpha} \cdot \text{scale_up},\; \text{min_alpha},\; \text{max_alpha})$, evaluates increasing step sizes to find a bracket [low, high] guaranteed to contain a Wolfe point. If the first evaluation already satisfies both conditions, the search expands further ("zoom-up") to find the largest such step before accepting.
The goal of this phase is to find a valid bracket [low, high] Ideally such that the low endpoint satisfies Armijo and has a positive derivative, while the high endpoint has a negative derivative. This gives us a shape like /\ to search through in the zoom phase. If such a bracket cannot be found within the step size limits, that is fine but the zoom phase will start with bisections instead of the cubic interpolation.
Narrows [low, high] to find a strong Wolfe point. Uses cubic interpolation when the bracket has a derivative sign change and high satisfies Armijo; otherwise bisects.
If no strong Wolfe point is found, falls back to the best Armijo-satisfying point (status Armijo) or Fail if none exists.
Must compute $\phi(\alpha)$ and $\phi'(\alpha)$ at e.alpha(), store results in e.obj() and e.dir(), and populate out with the full candidate state (theta, theta_grad, a). Must not mutate curr or prev.
| Info | Wolfe working set (prev_, curr_, scratch_, p_, init_dir_) |
| UpdateFun | Step evaluator satisfying the contract above |
| Options | Provides c1, c2, tau, scale_up, min/max_alpha, max_iterations, and convergence thresholds |
| Stream | Output stream type (may be nullptr) |
| wolfe_info | Working set with baseline (prev_), current (curr_), scratch space, direction p_, and initial derivative |
| update_fun | Step evaluator callback |
| opt | Line search options |
| msgs | Optional diagnostic stream (nullptr to disable) |
WolfeStatus with:Wolfe – both conditions satisfied, curr_ updatedArmijo – only Armijo satisfied, curr_ updatedConvergedGradient / ConvergedObjective / ConvergedObjectiveAndGradient – early convergence, curr_ updated iff Armijo holdsIntervalTooSmall – bracket collapsed, curr_ updated iff Armijo holdsStepTooSmall – $\alpha$ contracted below min_alphaReachedMaxStep – evaluation budget exceededFail – no Armijo point found, curr_ unchanged For each case | armijo | wolfe | sign(g) | Action ----—+----—+------—+-----------------------------— | [1] T | T | | Accept alpha | [2] T | F | > 0 | set low=high, expand high | [3] T | F | < 0 | stop | [4] F | T | | stop | [5] F | F | | stop
Zoom phase
| # | Armijo | f(m) > f(l) | Wolfe | f'(mid) | Action |
|---|---|---|---|---|---|
| 1 | T | T | T | - | Accept |
| 2 | T | T | F | > 0 | low = mid |
| 3 | T | T | F | <= 0 | high = mid |
| 4 | T | F | - | - | high = mid |
| 5 | F | - | - | - | high = mid |
Definition at line 699 of file wolfe_line_search.hpp.