Now, suppose that I add the additional information that the first k derivatives of f are all non-negative, for some moderate k (maybe 2, 3, or 4; something like that). Are there any worthwhile tricks which could speed things up in this case?
Comments
I'd like to do it all very hands-off, so bisection seemed sensible if unambitious. If there's reason to believe that Newton will be automatically stable (which some other comments hint at), then it's indeed a natural choice.
Don't dismiss the secant method. It's a bit less aggressive than Newton, but it doesn't need a derivative at all, which can make it faster in practice (fewer evaluations of f).
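For concreteness, a minimal sketch of the secant method for this problem (the function name f, the target t, and the starting points are assumptions from the setup above, not anything the commenter specified):

```python
def secant(f, t, x0, x1, tol=1e-12, max_iter=100):
    """Solve f(x) = t by the secant method from two starting points."""
    g0, g1 = f(x0) - t, f(x1) - t
    for _ in range(max_iter):
        if g1 == g0:  # flat chord; stop rather than divide by zero
            break
        # Replace f by the chord through the last two iterates and
        # take the chord's crossing of t as the next iterate.
        x0, x1 = x1, x1 - g1 * (x1 - x0) / (g1 - g0)
        g0, g1 = g1, f(x1) - t
        if abs(g1) < tol:
            break
    return x1
```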
Assume that f is basically a polynomial, so it's not appealing to do something like "call a polynomial solver" as a sub-routine; I am really only interested in something very low-tech and fast. The problem should morally be very easy, which is why I think that such a solution might be available.
In the end, I'm probably interested in solving this equation for many values of t > 0, so in principle one could get a low amortised cost in various ways. I'd be most keen to find a solution which is sensible even for a single solve, though.
My limited intuition is that one could use these constraints to build an aggressive line search. Monotonicity tells you which direction to search, and the bounds on the derivatives can be used to construct a polynomial of appropriate order whose zeros tell you how far to extrapolate.
For higher derivatives, it feels like the trick should extend to higher-order curves than tangent lines, but I've not worked out even this much rigorously enough to be super-confident in it.
I imagine it does, though in the case that I'm considering, checking whether the higher-order Taylor polynomial exceeds t is going to be a similar cost to checking with f. It's a slightly weird request in that way.
I'd guess Newton's method for root-finding might be a good choice here, as the convergence theory relies on bounds on the derivatives, though of course the convergence conditions are pretty non-prescriptive, so you'll just have to try it. You could also combine it with an initial bisection warm start.
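A rough sketch of that combination, assuming f is increasing on the bracket with f(lo) <= t <= f(hi) and that df is its derivative (all names here are illustrative):

```python
def newton_with_warm_start(f, df, t, lo=0.0, hi=1.0,
                           bisect_steps=10, tol=1e-12, max_iter=50):
    """A few bisection steps to localise the root, then Newton."""
    # Bisection warm start: shrink the bracket while keeping
    # f(lo) <= t <= f(hi), relying on monotonicity of f.
    for _ in range(bisect_steps):
        mid = 0.5 * (lo + hi)
        if f(mid) < t:
            lo = mid
        else:
            hi = mid
    # Newton iterations from the bracket midpoint (assumes df > 0 there).
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        step = (f(x) - t) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x
```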
Check out Halley's method, which has a cubic convergence rate. By comparison, Newton's method has quadratic convergence and the bisection method has linear convergence.
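A minimal sketch of Halley's method applied to g(x) = f(x) - t, assuming the first two derivatives df and d2f are available (names are placeholders, not from the post):

```python
def halley(f, df, d2f, t, x, tol=1e-12, max_iter=50):
    """Halley's method for f(x) = t; needs first and second derivatives."""
    for _ in range(max_iter):
        g = f(x) - t
        g1, g2 = df(x), d2f(x)
        # Halley update: x <- x - 2 g g' / (2 g'^2 - g g'').
        step = 2.0 * g * g1 / (2.0 * g1 * g1 - g * g2)
        x -= step
        if abs(step) < tol:
            break
    return x
```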
I recently had to implement Brent's method (for finding the inverse cumulative function of a distribution) and it can sometimes be much faster than the bisection method: https://en.m.wikipedia.org/wiki/Brent%27s_method
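If hand-rolling isn't required, SciPy ships Brent's method as scipy.optimize.brentq. A small runnable example, using a made-up instance of the f described below (the coefficients and k here are purely illustrative):

```python
from scipy.optimize import brentq

# Illustrative instance: f(x) = poly(x) + x^k / (1 - x), increasing on [0, 1).
def f(x, k=3):
    return 1.0 + 2.0 * x + x**2 + x**k / (1.0 - x)

t = 10.0
# brentq needs a sign change of f - t on the bracket; f blows up near 1,
# so an upper endpoint just below 1 suffices.
root = brentq(lambda x: f(x) - t, 0.0, 1.0 - 1e-12)
print(root, f(root))
```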
f(x) = poly(x) + x^k / (1 - x), where the coefficients of poly are positive.
If its second derivative is non-negative, its first derivative is monotonic, and I think it's therefore convex.
So f is lower bounded by its tangent at any sampled point.
If you sample x_0 and f(x_0) < t, rather than bisecting [x_0, 1], bisect [the point where the tangent at x_0 reaches t, 1].
And I was right not to be super-confident... I now think what I suggested gives you the new search area [x_0, new point] rather than [new point, 1]: since the tangent lower-bounds f, f has already reached t by the point where the tangent does, so the root lies to the left of it. Hopefully you get what I was going for 😅
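For what it's worth, a minimal sketch of this tangent-tightened bisection with the correction applied (f, df, and the bracket [0, 1] are assumptions from the setup above):

```python
def tangent_bisect(f, df, t, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for f(x) = t, with the bracket tightened by tangents.

    Assumes f is increasing and convex on [lo, hi] with f(lo) <= t <= f(hi).
    """
    while hi - lo > tol:
        m = 0.5 * (lo + hi)
        fm = f(m)
        if fm < t:
            lo = m
            d = df(m)
            if d > 0:
                # The tangent at m lower-bounds f (convexity), so f reaches t
                # no later than the tangent does: the tangent's crossing point
                # m + (t - fm)/f'(m) is a valid new right endpoint.
                hi = min(hi, m + (t - fm) / d)
        else:
            hi = m
    return 0.5 * (lo + hi)
```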