Example of optimization with thousand of variables

I would like to have an example of an optimization case with thousand of variables. Is there such a thing? I have used an optimization code for a few variables for a while, but I don’t think it can handle thousands of variables, that is why I am looking into Optizelle

Hi, Pablo! Here’s a quick and dirty example using Octave:

% Grab the Optizelle library
global Optizelle;
setupOptizelle();

% Set the problem size
m = 10000;

% Setup the linear system
A = sprandn(m,m,0.01);
b = randn(m,1);

% Find the least squares objective
f = struct();
f.eval = @(x) norm(A*x-b).^2;
f.grad = @(x) 2.*(A'*(A*x-b));
f.hessvec = @(x,dx) 2.*(A'*(A*dx));

% Set the bound constraint
h.eval = @(x) x;
h.p = @(x,dx) dx;
h.ps = @(x,dz) dz;
h.pps = @(x,dx,dz) zeros(m,1);

% Generate an initial guess
x = ones(m,1);
z = zeros(m,1);

% Create an optimization state
state=Optizelle.InequalityConstrained.State.t( ...
    Optizelle.Rm,Optizelle.Rm,x,z);
state.iter_max = 2000;
state.eps_mu = 1e-4;
state.eps_grad = 1e-4;

% Create a bundle of functions
fns=Optizelle.InequalityConstrained.Functions.t;
fns.f=f; 
fns.h=h;

% Solve the optimization problem
state=Optizelle.InequalityConstrained.Algorithms.getMin( ...
    Optizelle.Rm,Optizelle.Rm,Optizelle.Messaging.stdout,fns,state);

% Print out the reason for convergence
fprintf('The algorithm converged due to: %s\n', ...
    Optizelle.OptimizationStop.to_string(state.opt_stop));

This is a nonnegative least squares problem with 10000 variables. Now, that said, there are a couple of factors as to how well Optizelle, or any other second-order algorithm, will work. First, anything that requires a factorization is going to be hard to scale. For example, if your problem has thousands of equality constraints and you don’t have a good preconditioner or the system is too large to factorize, it’s probably not going to solve. Second, unconstrained and bound constrained problems tend to scale better and I’ve solved problems with hundreds of millions of variables. However, the structure of the problem meant that the spectrum of the Hessian was compact. This helps enormously otherwise it’s not practical to solve the second-order system sufficiently enough for fast convergence.

Basically, this is a way to say that the algorithms can scale to thousands, or even millions, of variables, but the structure of the problem will dictate as to whether or not the algorithm will work well.

Anyway, thanks for the note and let me know if I can clarify anything.