Bad_alloc exception on cpp usage

Hello friends

I am developing an interface to solve my nonlinear optimization problems using Optizelle 1.3.0, and I am still taking my first steps. I have compiled and run nq_sqp_exercise.cpp successfully, but I am having problems with my own examples. When I run my own code, even on toy problems with fewer than 10 variables and constraints, I get the following after a few callback evaluations:

terminate called after throwing an instance of ‘std::bad_alloc’
what(): std::bad_alloc

Running my software under valgrind (Linux, Intel icpc compiler) with the Constrained optimizer, I get:


terminate called after throwing an instance of ‘std::bad_alloc’
what(): std::bad_alloc
==8620==
==8620== Process terminating with default action of signal 6 (SIGABRT)
==8620== at 0x6BDAFB7: raise (raise.c:51)
==8620== by 0x6BDC920: abort (abort.c:79)
==8620== by 0xF80B9C2: __gnu_cxx::__verbose_terminate_handler() [clone .cold] (vterminate.cc:95)
==8620== by 0xF817445: __cxxabiv1::__terminate(void (*)()) (eh_terminate.cc:48)
==8620== by 0xF8174B0: std::terminate() (eh_terminate.cc:58)
==8620== by 0xF817703: __cxa_throw (eh_throw.cc:95)
==8620== by 0xF80DEA8: std::__throw_bad_alloc() (functexcept.cc:54)
==8620== by 0xF1AD8EC: void Optizelle::solveInKrylov<double, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY>(unsigned long const&, double const, double const*, std::__cxx11::list<Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY::Vector, std::allocator<Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY::Vector> > const&, Optizelle::Operator<double, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY> const&, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY::Vector const&, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY::Vector&) [clone .constprop.1142] (in /opt/optizelle_versions/Optizelle-1.3.0-Source/build/install/lib/liboptizelle.so)
==8620== by 0xF1ADCF0: std::pair<double, unsigned long> Optizelle::gmres<double, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY>(Optizelle::Operator<double, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY> const&, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY::Vector const&, double, unsigned long, unsigned long, Optizelle::Operator<double, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY> const&, Optizelle::Operator<double, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY> const&, Optizelle::GMRESManipulator<double, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY> const&, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::XXxYY::Vector&) [clone .constprop.1139] (in /opt/optizelle_versions/Optizelle-1.3.0-Source/build/install/lib/liboptizelle.so)
==8620== by 0xF1E784F: Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::Algorithms::findEqualityMultiplier(Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::Functions::t const&, Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::State::t&) (in /opt/optizelle_versions/Optizelle-1.3.0-Source/build/install/lib/liboptizelle.so)
==8620== by 0x5A185B: Optizelle::EqualityConstrained<double, Optizelle::Rm, Optizelle::Rm>::Algorithms::CompositeStepManipulator<Optizelle::Constrained<double, Optizelle::Rm, Optizelle::Rm, Optizelle::Rm> >::eval(Optizelle::Constrained<double, Optizelle::Rm, Optizelle::Rm, Optizelle::Rm>::Functions::t const&, Optizelle::Constrained<double, Optizelle::Rm, Optizelle::Rm, Optizelle::Rm>::State::t&, Optizelle::OptimizationLocation::t const&) const (optizelle.h:9684)
==8620== by 0x5A5771: Optizelle::ConversionManipulator<Optizelle::Constrained<double, Optizelle::Rm, Optizelle::Rm, Optizelle::Rm>, Optizelle::Unconstrained<double, Optizelle::Rm> >::eval(Optizelle::Unconstrained<double, Optizelle::Rm>::Functions::t const&, Optizelle::Unconstrained<double, Optizelle::Rm>::State::t&, Optizelle::OptimizationLocation::t const&) const (optizelle.h:1688)


Does anyone have an idea of what could be wrong in my code? Unfortunately, my problem data is not hardcoded, so I cannot post my formulation here. My interface to Optizelle belongs to a large system, and the problem parameters come through a complex path structure.

Thank you in advance

Regards

Wendel Melo

Hi Wendel,

Thanks for trying things out. Without running your code, it’s hard to know for sure. That said, if I had to guess, the size parameter m is incorrect due to an underflow. The type Natural in the routine solveInKrylov is typedef’d to size_t, which is unsigned, so decrementing it below zero wraps around to an enormous value. The simplest place to get a std::bad_alloc is on the allocation of y in the line

std::vector <Real> y(m);

For example, the code:

#include <vector>

int main() {
    // -1 wraps around to the largest unsigned value
    unsigned m = -1;
    std::vector <double> y(m);
}

gives the error

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

Given that the stack trace doesn’t go down deeper, I don’t think it’s in the other routines. Though, the line X_Vector V_y(X::init(x)); also allocates memory, so it’s possible that x was already enormous and the allocation of a new vector failed. I’d have to think if that’d be std::bad_alloc specifically. But, at this point, I’d check the parameter m. Do you have this running in a debugger?

Joe

In retrospect, there’s absolutely no reason why you’d call that routine directly, so there’s something else going on. Give me a bit and I’ll check some other things.

OK, if you can get a debugger into that routine, that would help. That parameter m is the current number of Krylov vectors that we’re working with. It should be something reasonable unless something has gone awry with the number of iterations that GMRES is using. That number is controlled by augsys_iter_max and, to a lesser extent, by augsys_rst_freq, which are the maximum number of solver iterations and how often we restart GMRES, respectively. Now, since the number of variables you’re working with is small, this really shouldn’t happen unless there’s a bug somewhere, which is possible.

By the way, I see that you’re using icpc. That should be fine, but I don’t test on it. Could you try with gcc?

OK, to summarize, steps to take:

  1. Turn on a debugger and figure out exactly what line it’s crashing on and how big of an allocation it’s trying to do

  2. Recompile with gcc to determine if it’s a compiler difference between icpc and gcc

Sound good?
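
If attaching a debugger is awkward inside your larger system, another way to see how big the failing allocation is would be to replace the global operator new in your executable so that it reports the requested size before throwing. This is only a rough sketch and not anything that ships with Optizelle. The names last_failed_request and allocation_fails are made up for illustration, and this simplified version does not call the new-handler the way the default operator new does:

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// Size of the most recent request that could not be satisfied (0 if none).
// Hypothetical name, used only for this sketch.
static std::size_t last_failed_request = 0;

// Replacement for the global allocation function. It records and prints
// the requested size before throwing std::bad_alloc.
void* operator new(std::size_t n) {
    if (void* p = std::malloc(n == 0 ? 1 : n)) {
        return p;
    }
    last_failed_request = n;
    std::fprintf(stderr, "operator new failed: %zu bytes requested\n", n);
    throw std::bad_alloc();
}

// Matching deallocation functions, since the memory came from malloc.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical helper for trying the hook out: returns true if a request
// of n bytes throws std::bad_alloc.
bool allocation_fails(std::size_t n) {
    try {
        void* p = ::operator new(n);
        ::operator delete(p);
        return false;
    } catch (std::bad_alloc const&) {
        return true;
    }
}
```

A request close to the maximum value of size_t in that output would point at an unsigned counter that wrapped around rather than at a genuinely large problem.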

Ah, this may be related to a pull request that was never tracked down:

You can also try replacing the lines:

iter--;
i--;

on lines 2526 and 2527 of linalg.h by

if ( iter > 0 ) { iter--; }
if ( i    > 0 ) { i--; }

That’s probably it. As to why, I’m still kind of baffled and that’s why it never was merged. If there are no iterations, the operator should give a vector whose norm is not nan.

Anyway, if you have time, please check and verify this. If this truly is the problem, it would be good to know why we’re getting a nan and triggering the code on iteration 0.

Thank you for your attention, Joe. I have made the changes that you recommended in linalg.h, and the bad_alloc exception is no longer raised.

However, I am still having problems. Even on my toy problems, I am getting opt_stop 2 (StepSmall), and the constrained optimizer takes only 2 iterations. I printed the input and output of the callback evaluation methods and noticed that, in the inequality evaluation structure, in the method:

// z=h'(x)*dy
void ps( X::Vector const &x, Y::Vector const &dy, X::Vector &z) const

the input array dy arrives with infinity and nan values in my runs. I think this should not happen, right? Does anyone have an idea of what is happening now?

Regards

Wendel

Let’s see if we can sort this out.

The small-step stopping condition is likely being triggered because some function is returning a nan somewhere and the optimization is backing off the step in an attempt to find a point where it does not. There’s a reasonable argument to be made that all nans should be trapped and reported, but we don’t do this in order to accommodate a moderately common situation that arises in parameter estimation. If a parameter estimation solve is based on an ODE or PDE solve that uses some kind of method of lines, it’s possible, if not likely, for the optimizer to find a parameter that violates stability. That leads the DE solve to go awry and probably return a nan. By backing off the step when this happens, the hope is that a less aggressive step lands on a feasible parameter. Candidly, it’s better to just add an appropriate bound so this doesn’t happen, but most people do not.
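
As a rough illustration of that backing off, and not the actual Optizelle globalization code, here is a one-dimensional sketch. The names backoff_until_finite and toy_objective, the halving factor, and the min_alpha threshold are all assumptions made for the example:

```cpp
#include <cmath>
#include <functional>

// Halves the step length alpha until f(x + alpha*dx) is finite.
// Returns the accepted alpha, or 0.0 if alpha drops below min_alpha
// without ever producing a finite value.
double backoff_until_finite(
    std::function<double(double)> const& f,
    double x,
    double dx,
    double alpha = 1.0,
    double min_alpha = 1e-8
) {
    while (alpha >= min_alpha) {
        if (std::isfinite(f(x + alpha * dx))) {
            return alpha;
        }
        alpha *= 0.5;
    }
    return 0.0;
}

// Toy objective that returns nan for x > 1, standing in for a
// simulation that goes unstable past some parameter value.
double toy_objective(double x) {
    return std::sqrt(1.0 - x);
}
```

Starting from x = 0 with dx = 4, the trial points 4 and 2 give nan and the point 1 is accepted, so the step is cut to a quarter. If the function returns nan no matter how short the step is, the step shrinks until nothing is left, and from the outside that looks like a small-step stop.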

Alright, so why are we getting nans here? Good question, and one that is difficult to answer with the information available. There are a number of places where the adjoint of the inequality, h'(x)*, could be fed a nan. It could mean that the Lagrange multiplier contains nans. It could mean that the output from the forward operator, h'(x), contains nans. The essential assumption that we make in Optizelle is that the starting point provided by the user ensures that f(x), g(x), h(x), and all of their derivatives can be evaluated and do not contain any infs or nans. We also assume that h(x) > 0. If not, we need to reformulate the problem to add a slack variable since this is not done automatically. I suppose that would break things. Is your starting point strictly feasible with respect to the inequality? I’d have to think about whether or not that could lead to a nan entering the adjoint of h, but it’s still a problem. Would you also run some derivative checks to see if we’re generating reasonable results? Set

state.dscheme = Optizelle::DiagnosticScheme::DiagnosticsOnly;
state.f_diag = Optizelle::FunctionDiagnostics::SecondOrder;
state.g_diag = Optizelle::FunctionDiagnostics::SecondOrder;
state.h_diag = Optizelle::FunctionDiagnostics::SecondOrder;

and rerun.
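
For the starting point, a quick check along these lines before calling the optimizer might help. It is only a sketch and assumes your vectors are plain std::vector<double>. The helper names all_finite and strictly_positive are made up and are not part of Optizelle:

```cpp
#include <cmath>
#include <vector>

// True if no entry is inf or nan.
bool all_finite(std::vector<double> const& v) {
    for (double value : v) {
        if (!std::isfinite(value)) {
            return false;
        }
    }
    return true;
}

// True if every entry is finite and strictly greater than zero,
// which is the h(x) > 0 assumption on the starting point.
bool strictly_positive(std::vector<double> const& h) {
    for (double value : h) {
        if (!std::isfinite(value) || !(value > 0.0)) {
            return false;
        }
    }
    return true;
}
```

The idea would be to evaluate f, g, and h at the initial x with your own callbacks, pass the results of g and h through all_finite, and pass the result of h through strictly_positive. If either check fails, that is the first thing to fix.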

Joe