1e-x describes the step length used in the finite difference formula that checks whether the computed gradient is consistent with the function itself. Specifically, we use a four-point finite difference scheme to approximate the derivative of f at x in a direction dx using the formula:
<grad f(x),dx> = (f(x-2 eps dx) - 8 f(x-eps dx) + 8 f(x+eps dx) - f(x+2 eps dx)) / (12 eps)
Here, = really means approximately equal to, and <.,.> denotes the inner product, which is probably <x,y> = x' y, with ' meaning transpose. Anyway, this approximation itself has error and we don’t know a priori how to find the best eps for it: when eps is too large, the truncation error of the formula dominates, and when eps is too small, floating-point cancellation does. Therefore, we generate a series of tests where eps varies from 1e+2 = 100 to 1e-5 = 0.00001. Generally, we want to see a hump in these results, which means the error starts out somewhat high at 1e+2, becomes low, and then becomes high again by the time we hit 1e-5, but it really depends on the scaling of the problem. Sometimes, the error is just low for every eps. If it’s always high, then that indicates a problem with the definition of the gradient. Meaning, there’s likely a mistake in the code that generates the gradient, which will probably break the optimization process. All that said, this is really a tool to help find and diagnose problems, not a perfectly precise technique.
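If it helps to see the idea in code, here is a minimal sketch of such a check in plain Python. It is not the solver's actual implementation. The function names, the test problem, and the relative error formula |approx - exact| / (1e-16 + |exact|) are all assumptions made for illustration.

```python
import math

def directional_fd(f, x, dx, eps):
    """Four-point finite difference estimate of <grad f(x), dx>."""
    def shifted(t):
        # Returns the point x + t * dx
        return [xi + t * di for xi, di in zip(x, dx)]
    return (f(shifted(-2 * eps)) - 8 * f(shifted(-eps))
            + 8 * f(shifted(eps)) - f(shifted(2 * eps))) / (12 * eps)

def gradient_check(f, grad, x, dx, exponents=range(2, -6, -1)):
    """Return (eps, relative error) pairs for eps = 1e+2 down to 1e-5."""
    # <grad f(x), dx> computed from the user-supplied gradient
    exact = sum(g * d for g, d in zip(grad(x), dx))
    results = []
    for k in exponents:
        eps = 10.0 ** k
        approx = directional_fd(f, x, dx, eps)
        # Assumed error measure; a real solver may report something different
        rel_err = abs(approx - exact) / (1e-16 + abs(exact))
        results.append((eps, rel_err))
    return results

# Hypothetical test problem: f(x) = sin(x0) + x0 * x1^2
def f(x):
    return math.sin(x[0]) + x[0] * x[1] ** 2

def grad_f(x):
    return [math.cos(x[0]) + x[1] ** 2, 2 * x[0] * x[1]]

def bad_grad_f(x):
    # Deliberate bug: the factor 2 is missing in the second component
    return [math.cos(x[0]) + x[1] ** 2, x[0] * x[1]]

x0, dx0 = [0.5, -1.2], [0.3, 0.7]
good = gradient_check(f, grad_f, x0, dx0)
bad = gradient_check(f, bad_grad_f, x0, dx0)
for (eps, e_good), (_, e_bad) in zip(good, bad):
    print(f"eps = {eps:.0e}   correct grad: {e_good:.2e}   buggy grad: {e_bad:.2e}")
```

With the correct gradient, the error column should start high, drop by many orders of magnitude for moderate eps, and creep back up as eps shrinks. With the buggy gradient, it should stay large for every eps, which is the warning sign described above.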
Anyway, hopefully that helps. Let me know if you have additional questions!