Parallization of Optizelle


Can Optimzelle work with cuSolver to accelerate the interior point optimization with GPU? I did not find much info about parallization in the tech document. Thanks.


Hi He Sun! I’ll give a qualified yes, but it somewhat depends on the structure of your problem. Do you have equality constraints as well or do you have purely inequality constraints? Are your inequality constraints bounds or a more general cone constraint like a semidefinite program?

In terms of the interior point machinery, there’s only a few places where parallelization pays off and it only occurs with semidefinite programs. Namely, the operator linv for semidefinite constraints involves the inverse of a matrix, which can be computed with cuSOLVER. There’s also the feasibility line search srch, which involves a generalized eigenvalue problem. The typical transformation to solve this problem involves a Choleski factorization followed by an eigenvalue solve and this Choleski factorization can be parallelized with cuSOLVER. Both of these operations can be parallelized without modifying the code, but by providing a custom vector space, which is descried in chapter 6.3 of the manual. Now, for second order cone constraints and the nonnegative orthant (normal bounds), there’s really no need for an inverse or factorization, so this kind of parallelization doesn’t help these sorts of inequalities.

Beyond the pure interior point pieces, there are two different linear systems that can be parallelized. First, there’s the linear system associated with the Hessian of the Lagrangian. For a pure inequality constrained problem, this operator is:

hess f(x) dx + h'(x)* inv(h(x)) (h'(x)dx z)

where the * denotes the operator adjoint. If there are equality constraints, this operator becomes:

hess f(x) dx + (g''(x)dx)*y + h'(x)* inv(h(x)) (h'(x)dx z)

We solve this system with a preconditioned truncated-CG algorithm, so we can precondition it with an inverse or factorization from cuSOLVE. This preconditioner goes into fns.PH. An example of how to define a preconditioner for this linear system solve, but without the interior point pieces, can be found in the Rosenbrock example in chapter 1.3. Additional information on how to define a preconditioner can be found in chapter 3.5.

If we have equality constraints, g(x)=0, we should also precondition the system g'(x)g'(x)*, since we require this preconditioner in a number of places in the composite step algorithm. Here, again, we can use a package like cuSOLVER to find the inverse of the matrix g'(x)g'(x)*. This preconditioner is described in chapter 3.5 and goes into the structure fns.PSchur_left.

Generally speaking, I think it’s important to define a good, fast preconditioner for the equality constraints. For the objective, it’s a little bit tricky. A perfect preconditioner will force the algorithm to attempt a pure Newton step at each iteration. This may or may not be a descent direction and if it’s not, we’ll end up retreating to the Cauchy point, which essentially ignores any curvature information in the Hessian of the Lagrangian. As such, it’s sometimes helpful to not have a perfect preconditioner, or inverse, there in order to better capture as much as the positive curvature of the Hessian that we can.

Finally, the actual evaluations of the objective and constraints can be parallelized. It’s a little uncommon to see an inverse in the constraints, but moderately common to see an inverse in the objective. For example, this comes up in reduced space formulations for parameter estimation. In any case, this inverse can be parallelized using something like cuSOLVER.

Anyway, hopefully this gets you started. Let me know if I can clarify anything and certainly if you have more information about the structure of the problem, such as those questions that I noted above, I may be able to give a more directed answer. Short of that, the manual isn’t clear enough on how this works, so thanks for the question! Let me figure out a way to rework the manual to make this more clear.


Actually, one minor correction. I just checked the code and it looks like we’ve turned off the preconditioner for the Lagrangian of the Hessian when we have equality constraints. It’s currently this way because we use the preconditioner to inject the nullspace projection for the composite step algorithm. That said, it’s possible to use a preconditioner here, but we’d need to fix some code. Nonetheless, like I mentioned above, I don’t generally like trying to precondition this operator due to curvature concerns.