Hillclimbing

Colin Champion

Summary: this page contains my implementation of a couple of optimisation routines.

linmax finds the maximum value of a function over a given interval. The prototype is

xy linmax(double (*f)(double),xy A,xy B,xy C,double tol) ;

where

f is the function;
A, B, C are 3 xy structures; and
tol is the tolerance.

The return value is the argument/value pair at the maximum/minimum. A, B, and C correspond to function evaluations at 3 points, and should satisfy either A.x<B.x<C.x or A.x>B.x>C.x; moreover they should bracket the maximim: ie. B.y should be strictly greater than each of A.y and C.y. f is a function on a single argument returning a single value. The tolerance is absolute.

#include "memory.h"
#include <math.h>

xy linmax(double (*f)(double),xy A,xy B,xy C,double tol) 
{ int iter,t ; 
  double d,e,p,q,r,*a=0 ; 
  xy U,V,W ; 

  if(C.x<A.x) { V = C ; C = A ; A = V ; } 
  if(A.y>C.y) { W = A ; V = C ; } else { W = C ; V = A ; } 
  if(!(A.x<B.x&&B.x<C.x&&A.y<B.y&&C.y<B.y&&tol>0)) throw up("bad call") ;

  d = B.x - W.x ;
  e = C.x - A.x ;
  
  for(iter=0;B.x-A.x>2*tol||C.x-B.x>2*tol;iter++)
  { if(iter==1000) throw up("infinite loop") ; 
    U.x = A.x ; 
    if(fabs(e)>tol) // fit parabola
    { r = (B.x-W.x)*(B.y-V.y) ; 
      q = (B.x-V.x)*(B.y-W.y) ; 
      p = (B.x-V.x)*q - (B.x-W.x)*r ; 
      q = 2 * (q-r) ; 
      if(q>0) p = -p ; else q = -q ; 
      r = e ; 
      e = d ;
      if(2*fabs(p)<fabs(q*r)) 
      { U.x = B.x + (d=p/q) ; 
        if(U.x<A.x||U.x>C.x) U.x = A.x ; 
        else if(U.x<A.x+2*tol) U.x = B.x + (d=(A.x+2*tol-B.x)) ; 
        else if(U.x>C.x-2*tol) U.x = B.x + (d=(C.x-2*tol-B.x)) ; 
      }
    }
    if(U.x==A.x) // a golden section step
    { if(B.x<(A.x+C.x)/2) e = C.x - B.x ; else e = A.x - B.x ; 
      d = 0.382 * e ; 
    } 
    if(fabs(d)<tol) { if(d>0) d = tol ; else d = -tol ; } 
    U.x = B.x + d ;  
    U.y = f(U.x) ;
    if(U.y>B.y)
    { if(U.x<B.x) C = B ; else A = B ; 
      V = W ; W = B ; B = U ; 
    }
    else
    { if(U.x<B.x) A = U ; else C = U ; 
      if(U.y>W.y) { V = W ; W = U ; } 
      else if(U.y>V.y) V = U ;  
    }
  }
  return B ; 
}

Store the following as linmaxtest.c. It finds π as the maximum of –cos x between x=2 and x=4.

#include <math.h>
#include "memory.h"

xy linmax(double (*f)(double),xy A,xy B,xy C,double tol) ;
static int ng ; 
static double g(double theta) { ng += 1 ; return -cos(theta) ; } 

int main()
{ double tol,q,u,anc[1] ; 
  xy A,B,C ; 

  for(tol=1e-2;tol>5e-11;tol/=10)
  { A = xy(2,g) ; 
    B = xy(3,g) ; 
    C = xy(4,g) ; 
    ng = 0 ; 
    u = linmax(g,A,B,C,tol).x ; 
    printf("tol = %.1e ; %2d fn evals ; pi ~= %.10f\n",tol,ng,u) ; 
  }
}

Results as follows:

tol = 1.0e-02 ;  3 fn evals ; pi ~= 3.1455637887
tol = 1.0e-03 ;  5 fn evals ; pi ~= 3.1415989278
tol = 1.0e-04 ;  5 fn evals ; pi ~= 3.1415989278
tol = 1.0e-05 ;  5 fn evals ; pi ~= 3.1415889278
tol = 1.0e-06 ;  6 fn evals ; pi ~= 3.1415926404
tol = 1.0e-07 ;  6 fn evals ; pi ~= 3.1415926404
tol = 1.0e-08 ;  6 fn evals ; pi ~= 3.1415926536
tol = 1.0e-09 ;  7 fn evals ; pi ~= 3.1415926536
tol = 1.0e-10 ; 14 fn evals ; pi ~= 3.1415926536

linmax.c linmaxtest.c 
utils: memory.h

Download the code then compile and run using:

g++ -O -g -o linmaxtest linmaxtest.c linmax.c ; linmaxtest

doc : source : test : download

doc : source : mdplib source : references : test : download

The prototypes are

double uoamin(double (*func)(double*),double *x,int n,
              double xinc,double tol,int maxiter) ;
double uoamax(double (*func)(double*),double *x,int n,
              double xinc,double tol,int maxiter) ;
int uoacond() ;

where

f is the function;
x is an n-long vector defining the start point;
n is the dimension of the space;
xinc is an offset (the same for all dimensions) used to generate some further initial points;
tol is the radius of tolerance for the minimum/maximum, and
maxiter is the maximum number of function iterations to perform.

Choose between uoamin and uoamax depending on whether you want to minimise or maximise f.

On return x will hold the approximate location of the minimum/maximum and the return value will be the value of f as computed at that point. The algorithm terminates when one of the three following conditions is met:

The position x is believed to be within a radius tol of the true minimum/maximum (cond=0); or
x is believed to be within a larger radius ρ of the true minimum/maximum, but this cannot be tightened because f seems to be completely flat within this radius (cond=1); or
maxiter function evaluations have been performed (cond=2).

A call to uoacond() returns the value of cond corresponding to the most recent call to newuoa.

If you’re looking for reliable software you’d be better off running Attractive Chaos’s C++ translation of Powell’s original Fortran code (or Steven Johnson’s), but if you want to look inside you may find my code more congenial.

My version used Attractive Chaos’s as a starting point. I tried to convert it into the sort of code I’d have written myself. At each step I checked correctness using the test program below. Unfortunately the rewriting was error-prone and I occasionally introduced bugs which weren’t shown up by the test, but which became evident as the code became clearer and faulty logic started to show through. I can’t be confident of having removed all the bugs I introduced, which is why I can’t recommend my code to anyone who wants something he or she can trust.

At first I kept my output bit-identical to Attractive Chaos’s on my test file: this required me to dispense with a couple of minor code reorganisations. Later, when I made a version of lincoa I found some duplicated code. I replaced this by shared code but this led to small changes in the output for both programs. At this point I allowed myself the minor code reorganisations.

Incidentally I changed every line of code while working through it.

• uoacond • biglag • bigden • sethd • trsapp • setrhodelta • setknewdistsq • uoamin • uoamax

/*
  This is NEWUOA for unconstrained minimization. The codes were written
  by Powell in Fortran and then translated to C with f2c. I further
  modified the code to make it independent of libf2c and f2c.h. Please
  refer to "The NEWUOA software for unconstrained optimization without
  derivatives", which is available at www.damtp.cam.ac.uk, for more
  information.
 */
/*
  The original fortran codes are distributed without restrictions. The
  C++ codes are distributed under MIT license.
 */
/* The MIT License

   Copyright (c) 2004, by M.J.D. Powell <mjdp@cam.ac.uk>
                 2008, by Attractive Chaos <attractivechaos@aol.co.uk>
                 2015, by Colin Champion <colin.champion@masterlyinactivity.com>

   Permission is hereby granted, free of charge, to any person obtaining
   a copy of this software and associated documentation files (the
   "Software"), to deal in the Software without restriction, including
   without limitation the rights to use, copy, modify, merge, publish,
   distribute, sublicense, and/or sell copies of the Software, and to
   permit persons to whom the Software is furnished to do so, subject to
   the following conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   SOFTWARE.
*/

#include "memory.h"
#include <math.h>
#define pi 3.141592653589793
static int cond = -1 ; 
int uoacond() { return cond ; }
int update(int,int,double **,double **,int,double *,double,int *) ;
void shiftxbase(int n,int npt,double **xp,double *xopt,double *vlag,double **b,
                double **z,double *pq,double *hq,int idz) ;
double calcvlagbeta(int n,int npt,double *vlag,double **z,double **b,double *w,
                    double *xopt,double *d,double beta,double dsq,double xsq,
                    int idz) ;
int chooseknew(int n,int npt,double beta,double **z,double *vlag,double **xp,
               double *xopt,double detrat,double rhosq,int ktemp,int idz) ;
double genpqw(int n,int npt,double *pqw,double **z,int knew,int idz) ;
void updatehq(int n,int npt,double *hq,double pqknew,double **xpt,int knew) ;
void updategopt(int n,double *gopt,double *hq,double *step) ;
void updategopt2(int n,int npt,double *gopt,double *xopt,double *pq,double **) ;

/* -------------------------------------------------------------------------- */

static double biglag(int n,int npt,double *xopt,double **xp,double **b,
                     double **z,int idz,int knew,double delta,double *d)
{ /* N is the number of variables. NPT is the number of interpolation
   * equations. XOPT is the best interpolation point so far. XPT
   * contains the coordinates of the current interpolation
   * points. BMAT provides the last N columns of H.  ZMAT and IDZ give
   * a factorization of the first NPT by NPT submatrix of H. NDIM is
   * the first dimension of BMAT and has the value NPT+N.  KNEW is the
   * index of the interpolation point that is going to be moved. DELTA
   * is the current trust region bound. D will be set to the step from
   * XOPT to the new point. ALPHA will be set to the KNEW-th diagonal
   * element of the H matrix. HCOL, GC, GD, S and W will be used for
   * working space. */

  /* The step D is calculated in a way that attempts to maximize the
   * modulus of LFUNC(XOPT+D), subject to the bound ||D|| <= DELTA,
   * where LFUNC is the KNEW-th Lagrange function. */

  int i,j,k,iterc,isave ;
  double sp,ss,cf1,cf2,cf3,cf4,cf5,dhd,cth,tau,sth,sum,temp,step,angle,scale ;
  double denom,delsq,tempa,tempb,taubeg,tauold,taumax,dd,gg,alpha ;
  double *hcol=vector(npt),*w=vector(n),*s=vector(n),*gd=vector(n) ; 
  double *gc=vector(n) ; 

  delsq = delta * delta ;

  /* Set the first NPT components of HCOL to the leading elements of
   * the KNEW-th column of H. */
  alpha = genpqw(n,npt,hcol,z,knew,idz) ; 

  /* Set the unscaled initial direction D. Form the gradient of LFUNC
   * atXOPT, and multiply D by the second derivative matrix of LFUNC. */
  for(dd=i=0;i<n;i++) 
  { d[i] = xp[knew][i] - xopt[i] ; gc[i] = b[i][knew] ; dd += d[i]*d[i] ; }

  for(k=0;k<npt;k++) 
  { for(temp=sum=j=0;j<n;j++) 
    { temp += xp[k][j] * xopt[j] ; sum += xp[k][j] * d[j] ; }
    temp *= hcol[k] ;
    sum *= hcol[k] ;
    for(i=0;i<n;i++) 
    { gc[i] += temp * xp[k][i] ; gd[i] += sum * xp[k][i] ; }
  }

  /* Scale D and GD, with a sign change if required. Set S to another
   * vector in the initial two dimensional subspace. */
  for(gg=sp=dhd=i=0;i<n;i++) 
  { gg += gc[i] * gc[i] ; sp += d[i] * gc[i] ; dhd += d[i] * gd[i] ; }

  scale = delta / sqrt(dd) ;
  if(sp*dhd<0) scale = -scale ;
  if(sp*sp>dd*.99*gg) temp = 1 ; else temp = 0 ; 
  tau = scale * (fabs(sp)+0.5*scale*fabs(dhd)) ;
  if(gg*delsq<tau*.01*tau) temp = 1 ;

  for(i=0;i<n;i++) 
  { d[i] *= scale ; gd[i] *= scale ; s[i] = gc[i] + temp*gd[i] ; }

  /* Begin the iteration by overwriting S with a vector that has the
   * required length and direction, except that termination occurs if
   * the given D and S are nearly parallel. */
  for(iterc=0;iterc<n;iterc++) 
  { for(dd=sp=ss=i=0;i<n;i++) 
    { dd += d[i] * d[i] ; sp += d[i] * s[i] ; ss += s[i] * s[i] ; }
    temp = dd*ss - sp*sp ;
    if(temp<=dd*1e-8*ss) break ;
    denom = sqrt(temp);
    for(i=0;i<n;i++) { s[i] = (dd*s[i]-sp*d[i]) / denom ; w[i] = 0 ; }

    /* Calculate the coefficients of the objective function on the
     * circle, beginning with the multiplication of S by the second
     * derivative matrix. */
    for(k=0;k<npt;k++) 
    { for(sum=j=0;j<n;j++) sum += xp[k][j] * s[j] ;
      sum *= hcol[k] ; 
      for(j=0;j<n;j++) w[j] += sum * xp[k][j] ;
    }

    for(cf1=cf2=cf3=cf4=cf5=i=0;i<n;i++) 
    { cf1 += s[i] * w[i];
      cf2 += d[i] * gc[i];
      cf3 += s[i] * gc[i];
      cf4 += d[i] * gd[i];
      cf5 += s[i] * gd[i];
    }

    cf1 /= 2 ;
    cf4 = cf4/2 - cf1 ;
    /* Seek the value of the angle that maximizes the modulus of TAU. */
    taubeg = cf1 + cf2 + cf4 ;
    taumax = tauold = taubeg;
    temp = 2 * pi / 50.0 ;

    for(isave=0,i=1;i<50;tauold=tau,i++) 
    { angle = (double) i*temp;
      cth = cos(angle) ;
      sth = sin(angle) ;
      tau = cf1 + (cf2 + cf4*cth) * cth + (cf3 + cf5*cth) * sth ;
      if(fabs(tau)>fabs(taumax)) 
      { taumax = tau ; isave = i ; tempa = tauold ; } 
      else if(i==isave+1) tempb = tau ;
    }

    if(isave==0) tempa = tau ; else if(isave==49) tempb = taubeg ;
    step = 0 ;
    if(tempa!=tempb) 
    { tempa -= taumax ;
      tempb -= taumax ;
      step = 0.5 * (tempa-tempb) / (tempa+tempb) ;
    }
    angle = temp * (isave+step) ;
    /* Calculate the new D and GD. Then test for convergence. */
    cth = cos(angle) ;
    sth = sin(angle) ;
    tau = cf1 + (cf2 + cf4*cth) * cth + (cf3 + cf5*cth) * sth ;

    for(i=0;i<n;i++) 
    { d[i] = cth*d[i] + sth*s[i] ;
      gd[i] = cth*gd[i] + sth*w[i] ;
      s[i] = gc[i] + gd[i] ;
    }
    if(fabs(tau)<=fabs(taubeg)*1.1) break ;
  }
  free(hcol,w,s,gd,gc) ; 
  return alpha ; 
}
/* -------------------------------------------------------------------------- */
/*page*/
/* -------------------------------------------------------------------------- */

static double bigden(int n,int npt,double *xopt,double **xp,double **b, 
                     double **z,int idz,int kopt,int knew, double *d,double *w,
                     double *vlag)
{ /* N is the number of variables.
   * NPT is the number of interpolation equations.
   * XOPT is the best interpolation point so far.
   * XPT contains the coordinates of the current interpolation points.
   * BMAT provides the last N columns of H.
   * ZMAT and IDZ give a factorization of the first NPT by NPT
     submatrix of H.
   * NDIMis the first dimension of BMAT and has the value NPT+N.
   * KOPT is the index of the optimal interpolation point.
   * KNEW is the index of the interpolation point that is going to be
     moved.
   * D will be set to the step from XOPT to the new point, and on entry it
   * should be the D that was calculated by the last call of BIGLAG. 
     The length of the initial D provides a trust region bound on the final D.
   * W will be set to Wcheck for the final choice of D.
   * VLAG will be set to Theta*Wcheck+e_b for the final choice of D.
   * D is calculated in a way that should provide a denominator with a
     large modulus in the updating formula when the KNEW-th
     interpolation point is shifted to the new position XOPT+D. 
   * the RETURN value beta will be found in the updating formula when
     the KNEW-th interpolation point is moved to its new position.
*/
  int i,j,k,isave,iterc,jc,nw,ksav,ndim=npt+n ;
  double dd,ds,ss,den[9],par[9],tau,sum,diff,temp,step,beta,alpha,angle ;
  double denex[9],tempa,tempb,tempc,ssden,dtest,xoptd,xopts,denold,denmax ; 
  double densav,dstemp,sumold,sstemp,xoptsq ;
  double **prod=matrix(ndim,5),**wvec=matrix(ndim,5),*s=vector(n) ;

  /* Store the first NPT elements of the KNEW-th column of H in W(N+1)
   * to W(N+NPT). */
  alpha = genpqw(n,npt,w+n,z,knew,idz) ; 

  /* The initial search direction D is taken from the last call of
   * BIGBDLAG, and the initial S is set below, usually to the direction
   * from X_OPT to X_KNEW, but a different direction to an
   * interpolation point may be chosen, in order to prevent S from
   * being nearly parallel to D. */
  for(dd=ds=ss=i=0;i<n;i++)
  { dd += d[i]*d[i] ; 
    s[i] = xp[knew][i] - xopt[i] ;
    ds += d[i] * s[i] ;
    ss += s[i] * s[i] ;
  }

  if(ds*ds>dd*.99*ss) 
  { ksav = knew ;
    dtest = ds * ds / ss ;
    for(k=0;k<npt;k++) if(k!=kopt)
    { for(dstemp=sstemp=i=0;i<n;i++) 
      { diff = xp[k][i] - xopt[i] ;
        dstemp += d[i] * diff ; 
        sstemp += diff*diff ; 
      }
      if(dstemp*dstemp/sstemp<dtest) 
      { ksav = k ;
        dtest = dstemp * dstemp / sstemp ;
        ds = dstemp ;
        ss = sstemp ;
      }
    }
    for(i=0;i<n;i++) s[i] = xp[ksav][i] - xopt[i] ;
  }

  ssden = dd*ss - ds*ds ;
  densav = 0 ;
  /* Begin the iteration by overwriting S with a vector that has the
   * required length and direction. */

  /* ------------------------------------------------------------------------ */

  for(iterc=1;;iterc++)
  { temp = 1.0 / sqrt(ssden) ;
    for(xoptsq=xoptd=xopts=i=0;i<n;i++) 
    { s[i] = temp * (dd*s[i]-ds*d[i]) ;
      xoptd += xopt[i] * d[i] ;
      xopts += xopt[i] * s[i] ;
      xoptsq += xopt[i] * xopt[i] ; 
    }
    /* Set the coefficients of the first 2 terms of BETA. */
    tempa = 0.5 * xoptd * xoptd ;
    tempb = 0.5 * xopts * xopts ;
    den[0] = dd * (xoptsq + 0.5*dd) + tempa + tempb ;
    den[1] = 2 * xoptd * dd ;
    den[2] = 2 * xopts * dd ;
    den[3] = tempa - tempb ;
    den[4] = xoptd * xopts ;
    for(i=4;i<8;i++) den[i+1] = 0 ; 

    /* Put the coefficients of Wcheck in WVEC. */
    for(k=0;k<npt;k++)
    { for(tempa=tempb=tempc=i=0;i<n;i++)
      { tempa += xp[k][i] * d[i];
        tempb += xp[k][i] * s[i];
        tempc += xp[k][i] * xopt[i];
      }
      wvec[k][0] = 0.25 * (tempa*tempa + tempb*tempb) ;
      wvec[k][1] = tempa * tempc ;
      wvec[k][2] = tempb * tempc ;
      wvec[k][3] = 0.25 * (tempa*tempa - tempb*tempb) ;
      wvec[k][4] = 0.5 * tempa * tempb ;
    }
    for(i=0;i<n;i++)
    { wvec[i+npt][0] = 0 ;
      wvec[i+npt][1] = d[i] ;
      wvec[i+npt][2] = s[i] ;
      wvec[i+npt][3] = 0 ;
      wvec[i+npt][4] = 0 ;
    }

    /* Put the coefficents of THETA*Wcheck in PROD. */
    for(jc=0;jc<5;jc++)
    { if(jc==2||jc==3) nw = ndim ; else nw = npt ; 
      for(k=0;k<npt;k++) prod[k][jc] = 0 ;
      for(j=0;j<npt-n-1;j++) 
      { for(sum=k=0;k<npt;k++) sum += z[j][k] * wvec[k][jc] ;
        if(j<idz) sum = -sum ;
        for(k=0;k<npt;k++) prod[k][jc] += sum * z[j][k] ;
      }
      if(nw==ndim) for(k=0;k<npt;k++) 
      { for(sum=j=0;j<n;j++) sum += b[j][k] * wvec[npt+j][jc] ;
        prod[k][jc] += sum ;
      }
      for(j=0;j<n;j++) 
      { for(sum=i=0;i<nw;i++) sum += b[j][i] * wvec[i][jc] ;
        prod[npt+j][jc] = sum ;
      }
    }

    /* Include in DEN the part of BETA that depends on THETA. */
    for(k=0;k<ndim;k++)
    { for(sum=i=0;i<5;i++) sum += ( par[i] = 0.5 * prod[k][i] * wvec[k][i] ) ;
      den[0] -= par[0] + sum ;

      tempa = prod[k][0] * wvec[k][1] + prod[k][1] * wvec[k][0] ;
      tempb = prod[k][1] * wvec[k][3] + prod[k][3] * wvec[k][1] ;
      tempc = prod[k][2] * wvec[k][4] + prod[k][4] * wvec[k][2] ;
      den[1] -= tempa + 0.5 * (tempb+tempc) ;
      den[5] -= 0.5 * (tempb-tempc) ;

      tempa = prod[k][0] * wvec[k][2] + prod[k][2] * wvec[k][0] ;
      tempb = prod[k][1] * wvec[k][4] + prod[k][4] * wvec[k][1] ;
      tempc = prod[k][2] * wvec[k][3] + prod[k][3] * wvec[k][2] ;
      den[2] -= tempa + 0.5 * (tempb-tempc) ;
      den[6] -= 0.5 * (tempb+tempc) ;

      tempa = prod[k][0] * wvec[k][3] + prod[k][3] * wvec[k][0] ;
      den[3] -= tempa + par[1] - par[2] ;

      tempa = prod[k][0] * wvec[k][4] + prod[k][4] * wvec[k][0] ;
      tempb = prod[k][1] * wvec[k][2] + prod[k][2] * wvec[k][1] ;
      den[4] -= tempa + 0.5 * tempb ;
      den[7] -= par[3] - par[4] ;

      tempa = prod[k][3] * wvec[k][4] + prod[k][4] * wvec[k][3] ;
      den[8] -= 0.5 * tempa ;
    }

    /* Extend DEN so that it holds all the coefficients of DENOM. */
    for(sum=i=0;i<5;i++) 
    { tempa = prod[knew][i] ; sum += ( par[i] = 0.5 * tempa * tempa ) ; }
    denex[0] = alpha*den[0] + par[0] + sum ;

    tempa = 2 * prod[knew][0] * prod[knew][1] ;
    tempb = prod[knew][1] * prod[knew][3] ;
    tempc = prod[knew][2] * prod[knew][4] ;
    denex[1] = alpha*den[1] + tempa + tempb + tempc ;
    denex[5] = alpha*den[5] + tempb - tempc ;

    tempa = 2 * prod[knew][0] * prod[knew][2] ;
    tempb = prod[knew][1] * prod[knew][4] ;
    tempc = prod[knew][2] * prod[knew][3] ;
    denex[2] = alpha*den[2] + tempa + tempb - tempc ;
    denex[6] = alpha*den[6] + tempb + tempc ;

    tempa = 2 * prod[knew][0] * prod[knew][3] ;
    denex[3] = alpha*den[3] + tempa + par[1] - par[2] ;

    tempa = 2 * prod[knew][0] * prod[knew][4] ;
    denex[4] = alpha*den[4] + tempa + prod[knew][1]*prod[knew][2] ;
    denex[7] = alpha*den[7] + par[3] - par[4] ;
    denex[8] = alpha*den[8] + prod[knew][3]*prod[knew][4] ;

    /* Seek the value of the angle that maximizes the modulus of DENOM. */
    sum = denex[0] + denex[1] + denex[3] + denex[5] + denex[7];
    denold = denmax = sum ;
    temp = 2 * pi / 50 ;
    par[0] = 1 ;
    for(isave=0,i=1;i<50;i++)
    { angle = i * temp ;
      par[1] = cos(angle) ;
      par[2] = sin(angle) ;
      for(j=4;j<9;j+=2)
      { par[j-1] = par[1]*par[j-3] - par[2]*par[j-2] ;
        par[j] = par[1]*par[j-2] + par[2]*par[j-3] ;
      }
      sumold = sum ;
      for(sum=j=0;j<9;j++) sum += denex[j]*par[j] ;
      if(fabs(sum)>fabs(denmax)) { denmax = sum ; isave = i ; tempa = sumold ; }
      else if(i==isave+1) tempb = sum ;
    }

    if(isave==0) tempa = sum ; else if(isave==49) tempb = denold ;
    step = 0 ;
    if(tempa!=tempb) 
    { tempa -= denmax ;
      tempb -= denmax ;
      step = 0.5 * (tempa-tempb) / (tempa+tempb) ;
    }

    /* Calculate the new parameters of the denominator, the new VLAG
     * vector and the new D. Then test for convergence. */
    angle = temp * (isave+step) ;
    par[1] = cos(angle) ;
    par[2] = sin(angle) ;
    for(j=4;j<9;j+=2) 
    { par[j-1] = par[1]*par[j-3] - par[2]*par[j-2] ;
      par[j] = par[1]*par[j-2] + par[2]*par[j-3] ;
    }
    for(denmax=beta=j=0;j<9;j++)
    { beta += den[j]*par[j] ; denmax += denex[j]*par[j] ; } 

    for(k=0;k<ndim;k++) for(vlag[k]=j=0;j<5;j++)
      vlag[k] += prod[k][j] * par[j] ;

    tau = vlag[knew] ;
    for(dd=tempa=tempb=i=0;i<n;i++)
    { d[i] = par[1]*d[i] + par[2]*s[i] ;
      w[i] = xopt[i] + d[i] ;
      dd += d[i] * d[i] ;
      tempa += d[i] * w[i] ;
      tempb += w[i] * w[i] ;
    }

    if(iterc>=n) break ;
    if(iterc>1) densav = fmax(densav, denold) ;
    if(fabs(denmax)<=fabs(densav)*1.1) break ;
    densav = denmax ;

    /* Set S to 0.5 the gradient of the denominator with respect to
     * D. Then branch for the next iteration. */
    for(i=0;i<n;i++)
    { temp = tempa*xopt[i] + tempb*d[i] - vlag[npt+i] ;
      s[i] = tau*b[i][knew] + alpha*temp ;
    }

    for(k=0;k<npt;k++)
    { for(sum=j=0;j<n;j++) sum += xp[k][j] * w[j] ;
      temp = (tau*w[n+k]-alpha*vlag[k]) * sum ;
      for(i=0;i<n;i++) s[i] += temp * xp[k][i] ;
    }

    for(ss=ds=i=0;i<n;i++) { ss += s[i]*s[i] ; ds += d[i]*s[i] ; }
    ssden = dd*ss - ds*ds ;
    if(ssden<dd*1e-8*ss) break ;
  }

  /* Set the vector W before the RETURN from the subroutine. */
  for(k=0;k<ndim;k++) for(w[k]=j=0;j<5;j++) w[k] +=  wvec[k][j] * par[j] ;
  vlag[kopt] += 1 ;
  freematrix(prod,wvec) ; free(s) ; 
  return beta ;
}
/* -------------------------------------------------------------------------- */
/*page*/
/* -------------------------------------------------------------------------- */

/* The following instructions set the vector HD to the vector D multiplied 
   by the second derivative matrix of Q.  */

static void 
  sethd(double *hd,int n,double **xp,double *d,double *hq,double *pq,int npt)
{ int i,j,k ; 
  double temp ; 

  for(i=0;i<n;i++) hd[i] = 0 ; 
  updategopt(n,hd,hq,d) ; 
  updategopt2(n,npt,hd,d,pq,xp) ; 
}
/* -------------------------------------------------------------------------- */

static double trsapp(int n, int npt, double *xopt, double **xp, double *gq,  
                     double *hq, double *pq,double delsq, double *step)
{ /* The arguments NPT, XOPT, XPT, GQ, HQ and PQ have their usual meanings, 
   * in order to define the current quadratic model Q.
   * DELTA is the trust region radius, and has to be positive. STEP
   * will be set to the calculated trial step. CRVMIN will be returned as the
   * least curvature of H aint the conjugate directions that occur,
   * except that it is set to 0 ifSTEP goes all the way to the trust
   * region boundary. The calculation of STEP begins with the
   * truncated conjugate gradient method. If the boundary of the trust
   * region is reached, then further changes to STEP may be made, each
   * one being in the 2D space spanned by the current STEP and the
   * corresponding gradient of Q. Thus STEP should provide a
   * substantial reduction to Q within the trust region. */

  int i,j,iterc,isave,itermax ;
  double dd,cf,dg,gg,ds,sg,ss,dhd,dhs,cth,sgk,shs,sth,qadd,qbeg,qred,qmin ;
  double temp,qsav,qnew,ggbeg,alpha,angle,reduc,ggsav,tempa,tempb,bstep ;
  double ratio,angtest,crvmin ;
  double *d=vector(n),*g=vector(n),*hd=vector(n),*hs=vector(n) ; 

  itermax = n ;
  for(i=0;i<n;i++) d[i] = xopt[i] ; 
  sethd(hd,n,xp,d,hq,pq,npt) ;
  /* Prepare for the first line search. */
  for(qred=dd=i=0;i<n;i++)
  { step[i] = hs[i] = 0 ; 
    g[i] = gq[i] + hd[i] ;
    d[i] = -g[i] ; 
    dd += d[i]*d[i] ;
  }
  if(dd==0) { free(d,g,hd,hs) ; return 0 ; }
  ggbeg = gg = dd ;

  /* ------------------------------------------------------------------------ */
  /*page*/
  /* -- Calculate the step to the trust region boundary and the product HD -- */

  for(ds=ss=0,iterc=1;ss<delsq;iterc++)
  { temp = delsq - ss;
    bstep = temp / (ds + sqrt(ds * ds + dd * temp));
    sethd(hd,n,xp,d,hq,pq,npt) ;

    for(dhd=j=0;j<n;j++) dhd += d[j] * hd[j] ;
    /* Update CRVMIN and set the step-length ALPHA. */
    alpha = bstep ;
    if(dhd>0) 
    { if(iterc==1) crvmin = dhd/dd ; else crvmin = fmin(crvmin,dhd/dd) ;
      alpha = fmin(alpha,gg/dhd) ;
    }
    qred += ( qadd = alpha * (gg - 0.5*alpha*dhd) ) ;

    /* Update STEP and HS. */
    for(ggsav=gg,gg=i=0;i<n;i++)
    { step[i] += alpha * d[i] ; 
      hs[i] += alpha * hd[i] ; 
      temp = g[i] + hs[i] ; 
      gg += temp * temp ; 
    }

    /* Begin another conjugate direction iteration if required. */
    if(alpha>=bstep) break ; 
    ds = 0 ; 
    if(qadd>qred*.01&&gg>ggbeg*1e-4&&iterc<itermax) 
    { temp = gg / ggsav ;
      for(dd=ss=i=0;i<n;i++)
      { d[i] = temp*d[i] - g[i] - hs[i] ; 
        dd += d[i]*d[i] ; 
        ds += d[i]*step[i] ; 
        ss += step[i]*step[i] ;
      }
    }
    if(ds<=0) { free(d,g,hd,hs) ; return crvmin ; }
  }
  /* ------------------------------------------------------------------------ */
  /*page*/
  /* ------------ Test whether an alternative iteration is required --------- */

  for(iterc++;gg>ggbeg*1e-4;iterc++)
  { for(sg=shs=i=0;i<n;i++) { sg += step[i]*g[i] ; shs += step[i]*hs[i] ; } 
    sgk = sg + shs ;
    angtest = sgk / sqrt(gg*delsq) ;
    if(angtest<=-.99) break ; 
    /* Begin the alternative iteration by calculating D and HD and some
     * scalar products. */
    temp = sqrt(delsq*gg - sgk*sgk) ;
    tempa = delsq / temp ;
    tempb = sgk / temp ;
    for(i=0;i<n;i++) d[i] = tempa*(g[i]+hs[i]) - tempb*step[i] ; 
    sethd(hd,n,xp,d,hq,pq,npt) ;

    for(dg=dhd=dhs=i=0;i<n;i++)
    { dg += d[i]*g[i] ; dhd += hd[i]*d[i] ; dhs += hd[i]*step[i] ; }

    /* Seek the value of the angle that minimizes Q. */
    cf = 0.5 * (shs-dhd) ;
    qbeg = sg + cf ;
    qsav = qmin = qbeg ;
    temp = 2 * pi / 50 ;

    for(isave=0,i=1;i<50;i++)
    { angle = i * temp ;
      cth = cos(angle) ;
      sth = sin(angle) ;
      qnew = (sg+cf*cth)*cth + (dg+dhs*cth)*sth ;
      if(qnew<qmin) { qmin = qnew ; isave = i ; tempa = qsav ; }
      else if(i==isave+1) tempb = qnew ;
      qsav = qnew ;
    }
    if(isave==0) tempa = qnew ; else if(isave==49) tempb = qbeg ;
    angle = 0 ;
    if(tempa!=tempb) 
    { tempa -= qmin ; 
      tempb -= qmin ;
      angle = 0.5 * (tempa-tempb) / (tempa+tempb) ;
    }
    angle = temp * (isave+angle) ;

    /* Calculate the new STEP and HS. Then test for convergence. */
    cth = cos(angle) ;
    sth = sin(angle) ;
    reduc = qbeg - (sg + cf * cth) * cth - (dg + dhs * cth) * sth ;
    for(gg=i=0;i<n;i++)
    { step[i] = cth*step[i] + sth*d[i] ; 
      hs[i] = cth*hs[i] + sth*hd[i] ; 
      temp = g[i] + hs[i] ;
      gg += temp*temp ;
    }

    qred += reduc ;
    ratio = reduc / qred ;
    if(iterc>=itermax||ratio<=0.01) break ; 
  }
  free(d,g,hd,hs) ; 
  return 0 ; 
}
/* -------------------------------------------------------------------------- */
/*page*/
/* -------------------------------------------------------------------------- */

static double setrhodelta(double *rho,double rhoend,double *delta)
{ double ratio=rho[0]/rhoend ; 

  delta[0] = rho[0] / 2 ; 
  if(ratio<=16) rho[0] = rhoend ;
  else if(ratio<=250) rho[0] = sqrt(ratio) * rhoend ;
  else rho[0] *= 0.1 ; 

  delta[0] = fmax(delta[0],rho[0]) ;
  return ratio ; 
}
static double setknewdistsq
  (int n,int npt,double delta,int *knew,double *xopt,double **xp) 
{ int j,k ;
  double sum,temp,distsq=4*delta*delta ;
  for(k=0;k<npt;k++)
  { for(sum=j=0;j<n;j++) { temp = xp[k][j] - xopt[j] ; sum += temp*temp ; }
    if(sum>distsq) { knew[0] = k ; distsq = sum ; } 
  }
  return distsq ; 
}
/* -------------------------------------------------------------------------- */

double uoamin(double (*func)(double*),double *x,int n,
              double rhobeg,double rhoend,int maxfun)
{
    /* This subroutine seeks the least value of a function of many
     * variables, by a trust region method that forms quadratic models
     * by interpolation. There can be some freedom in the interpolation
     * conditions, which is taken up by minimizing the Frobenius norm of
     * the change to the second derivative of the quadratic model,
     * beginning with a zero matrix. The arguments of the subroutine are
     * as follows. */

    /* N must be set to the number of variables and must be at least
     * two. NPT is the number of interpolation conditions. Its value
     * must be in the interval [N+2,(N+1)(N+2)/2]. Initial values of the
     * variables must be set in X(1),X(2),...,X(N). They will be changed
     * to the values that give the least calculated F. xinc and tol
     * must be set to the initial and final values of a trust region
     * radius, so both must be positive with tol<=xinc. Typically
     * xinc should be about one tenth of the greatest expected change
     * to a variable, and tol should indicate the accuracy that is
     * required in the final values of the variables. MAXFUN must be set
     * to an upper bound on the number of calls of CALFUN.  */

  /* SUBROUTINE func(x) must be provided by the user. */

  /* XBASE will hold a shift of origin that should reduce the contributions
   * from rounding errors to values of the model and Lagrange functions.
   * XOPT will be set to the displacement from XBASE of the vector of
     variables that provides the least calculated F so far.
   * XNEW will be set to the displacement from XBASE of the vector of
     variables for the current calculation of F. */

  /* ------------------------------------------------------------------------ */
  /*page*/
  /* ------------------------------------------------------------------------ */

  /* XPT will contain the interpolation point coordinates relative to XBASE.
   * FVAL will hold the values of F at the interpolation points.
   * GQ will hold the gradient of the quadratic model at XBASE.
   * HQ will hold the explicit second derivatives of the quadratic model.
   * PQ will contain the parameters of the implicit second derivatives
     of the quadratic model.
   * BMAT will hold the last N columns of H.
   * ZMAT will hold the factorization of the leading NPT by NPT submatrix
   * of H, this factorization being ZMAT times Diag(DZ) times ZMAT^T,
   * where the elements of DZ are plus or minus 1.0, asspecified by IDZ.
   * NDIM is the first dimension of BMAT and has the value NPT+N.
   * D is reserved for trial steps from XOPT.
   * VLAG will contain the values of the Lagrange functions at a new point X.
   * They are part of a product that requires VLAG to be of length NDIM. */

  int i,j,k,ih,nf,nh,nfm,idz,ipt,jpt,nfmm,knew,kopt,nptm,flip=(n<0) ;
  n = abs(n) ;  
  int ktemp,ksave,nfsav,itemp,itest,npt=2*n+1 ;
  double f,dsq,rho,sum,fbeg,diff,beta,gisq,temp,suma,sumb,fopt,gqsq ; 
  double xipt,xjpt,diffa,diffb,diffc,alpha,delta,recip,reciq,fsave ;
  double dnorm,ratio,dstep,vquad,detrat,crvmin,distsq,xoptsq ;

  double *xbase=vector(n),*xopt=vector(n),*xnew=vector(n),*fval=vector(npt) ; 
  double *gq=vector(n),*hq=vector((n*(n+1))/2),*pq=vector(npt),*d=vector(n) ;
  double *vlag=vector(npt+n),*w=vector(2*npt),*pqw=vector(npt) ; 
  double **z=matrix(npt-(n+1),npt),**b=matrix(n,npt+n),**xp=matrix(npt,n) ;

  nh = (n*(n+1)) / 2 ;
  nptm = npt - (n+1) ;
  if(maxfun<1) maxfun = 1 ; 
  for(j=0;j<n;j++) xbase[j] = x[j] ; 

  /* Begin the initialization procedure. NF becomes 1 more than the
   * number of function values so far. The coordinates of the
   * displacement of the next initial interpolation point from XBASE
   * are set in XPT(NF,.). */

  recip = 1 / (rhobeg*rhobeg) ;
  reciq = sqrt(0.5) * recip ;

  /* ------------------------------------------------------------------------ */
  /*page*/
  /* ------------------------------------------------------------------------ */

  for(nf=0;nf<npt;nf++)
  { nfm = nf-1 ;
    nfmm = nfm-n ;

    if(nfm<2*n) // always happens when npt = 2*n+1
    { if(nfm>=0&&nfm<n) xp[nf][nfm] = rhobeg ;
      else if(nfm>=n) xp[nf][nfmm] = -rhobeg ;
    } 
    else        // the case which never happens
    { itemp = nfmm / n ;
      jpt = nfm - itemp * n - n ;
      ipt = jpt + itemp ;
      if(ipt>=n) { ipt -= n ; swap(ipt,jpt) ; } 
      xipt = rhobeg ;
      if(fval[ipt+n+1] < fval[ipt+1]) xipt = -xipt ;
      xjpt = rhobeg ;
      if(fval[jpt+n+1] < fval[jpt+1]) xjpt = -xjpt ;
      xp[nf][ipt] = xipt ;
      xp[nf][jpt] = xjpt ;
    }
    /* Calculate the next value of F. The least function value so far and its
     * index are required. */
    for(j=0;j<n;j++) x[j] = xp[nf][j] + xbase[j];
    if(flip) fval[nf] = f = -func(x) ; else fval[nf] = f = func(x) ; 

    if(nf==0) { fbeg = fopt = f ; kopt = 0 ; }
    else if(f<fopt) { fopt = f ; kopt = nf ; }

    /* Set the non0 initial elements of BMAT and the quadratic model
     * in the cases when NF is at most 2*N+1. */
    if(nfm<2*n) 
    { if(nfm>=0&&nfm<n) 
      { gq[nfm] = (f-fbeg) / rhobeg ;
        if(npt<=nf+n) 
        { b[nfm][0] = -1/rhobeg ; 
          b[nfm][nf] = 1/rhobeg ; 
          b[nfm][npt+nfm-1] = -(rhobeg*rhobeg)/2 ; 
        }
      } 
      else if(nfm>=n) 
      { b[nfmm][nf-n] = 1/(2*rhobeg) ; 
        b[nfmm][nf] = -1/(2*rhobeg) ; 
        z[nfmm][0] = -(reciq+reciq) ;
        z[nfmm][nf] = z[nfmm][nf-n] = reciq ;
        ih = (nfmm * (nfmm+3)) / 2 ;
        temp = (fbeg-f) / rhobeg ;
        hq[ih] = (gq[nfmm]-temp) / rhobeg;
        gq[nfmm] = .5 * (gq[nfmm] + temp);
      }
    }
        /* Set the off-diagonal second derivatives of the Lagrange
         * functions and the initial quadratic model. */
    else 
    { ih = (ipt*(ipt+1))/2 + jpt ;
      if(xipt<0) ipt += n ;
      if(xjpt<0) jpt += n ;
      z[nfmm][0] = z[nfmm][nf] = recip ;
      z[nfmm][ipt] = z[nfmm][jpt] = -recip ;
      hq[ih] = (fbeg - fval[ipt+1] - fval[jpt+1] + f) / (xipt*xjpt) ;
    }
  }
  /* ------------------------------------------------------------------------ */
  /*page*/
  /* - Begin the iterative procedure, because the initial model is complete - */

  diffa = diffb = itest = idz = 0 ;
  for(i=0;i<n;i++) xopt[i] = xp[kopt][i] ; 
    /* Generate the next trust region step and test its length. Set KNEW
     * to -1 ifthe purpose of the next F will be to improve the
     * model. */

  /* ------------------------------------------------------------------------ */

  for(knew=-1,delta=rho=rhobeg,nfsav=nf;;)
  { if(knew<0) for(;;)
    { crvmin = trsapp(n,npt,xopt,xp,gq,hq,pq,delta*delta,d) ;
      for(dsq=i=0;i<n;i++) dsq += d[i]*d[i] ; 
      dnorm = fmin(delta,sqrt(dsq)) ;
      if(dnorm>=rho/2) break ; // knew must be -1
      delta /= 10 ;
      ratio = -1 ;
      if(delta<=rho*1.5) delta = rho ;
      temp = crvmin * .125 * rho * rho ;
      if(nf<=nfsav+2||temp<=diffa||temp<=diffb||temp<=diffc) 
      { distsq = setknewdistsq(n,npt,delta,&knew,xopt,xp) ;
        /* If KNEW is positive, then set DSTEP, and branch back for the next
         * iteration, which will generate a "model step". */
        if(knew>=0) 
        { dstep = fmax(rho,fmin(sqrt(distsq)/10,delta/2)) ; 
          dsq = dstep*dstep ; 
          break ;
        }
        else if(delta>rho||dnorm>rho) continue ;
      }
      /* Return from the calculation, after another Newton-Raphson step,
       * if it is too short to have been tried before. */
      if(rho<=rhoend) 
      { for(i=0;i<n;i++) 
        { xnew[i] = xopt[i] + d[i] ; x[i] = xbase[i] + xnew[i] ; }
        if(flip) f = -func(x) ; else f = func(x) ;
        knew = -2 ; 
        break ;
      }
      /* The calculations with the current value of RHO are complete. Pick
       * the next values of RHO and DELTA. */
      ratio = setrhodelta(&rho,rhoend,&delta) ; 
      nfsav = nf ; 
    }
    if(knew==-2) { cond = 0 ; break ; }

    /* ---------------------------------------------------------------------- */
    /*page*/
    /* ---------------------------------------------------------------------- */

    /* Shift XBASE if XOPT may be too far from XBASE. First make the
     * changes to BMAT that do not depend on ZMAT. */
    for(xoptsq=i=0;i<n;i++) xoptsq += xopt[i] * xopt[i] ; 
    if(dsq<=xoptsq/1000) 
    { updategopt(n,gq,hq,xopt) ;
      updategopt2(n,npt,gq,xopt,pq,xp) ;
      shiftxbase(n,npt,xp,xopt,vlag,b,z,pq,hq,idz) ; 
      for(j=0;j<n;j++) { xbase[j] += xopt[j] ; xopt[j] = 0 ; }
      xoptsq = 0 ; 
    }

    /* Pick the model step ifKNEW is positive. A different choice of D
     * may be made later, ifthe choice of D by BIGLAG causes
     * substantial cancellation in DENOM. */
    if(knew>=0) alpha = biglag(n,npt,xopt,xp,b,z,idz,knew,dstep,d) ;

    /* Calculate VLAG and BETA for the current choice of D. The first
     * NPT components of W_check will be held in W. */
    for(k=0;k<npt;k++)
    { for(vlag[k]=suma=sumb=j=0;j<n;j++)
      { suma += xp[k][j] * d[j] ;
        sumb += xp[k][j] * xopt[j] ;
        vlag[k] += b[j][k] * d[j] ;
      }
      w[k] = suma * (suma/2+sumb) ;
    }
    /* ---------------------------------------------------------------------- */
    /*page*/
    /* ---------------------------------------------------------------------- */

    beta = calcvlagbeta(n,npt,vlag,z,b,w,xopt,d,beta,dsq,xoptsq,idz) ;
    vlag[kopt] += 1 ;

    /* If KNEW is positive and if the cancellation in DENOM is unacceptable, 
     * then BIGDEN calculates an alternative model step. */
    if(knew>=0&&fabs(1 + alpha*beta / (vlag[knew]*vlag[knew]))<=0.8) 
      beta = bigden(n,npt,xopt,xp,b,z,idz,kopt,knew,d,w,vlag) ;

    /* Calculate the next value of the objective function. */
    if(nf+1>=maxfun) { cond = 2 ; break ; }
    for(i=0;i<n;i++) { xnew[i] = xopt[i] + d[i] ; x[i] = xbase[i] + xnew[i] ; }
    nf += 1 ; 
    if(flip) f = -func(x) ; else f = func(x) ;

    /* Use the quadratic model to predict the change in F due to the
     * step D, and set DIFF to the error of this prediction. */
    for(vquad=ih=j=0;j<n;j++)
    { vquad += d[j] * gq[j] ; 
      for(i=0;i<=j;ih++,i++)
      { temp = d[i]*xnew[j] + d[j]*xopt[i] ;
        if(i==j) temp /= 2 ;
        vquad += temp * hq[ih] ;
      }
    }

    for(k=0;k<npt;k++) vquad += pq[k] * w[k] ;
    diff = f - fopt - vquad ;
    diffc = diffb ;
    diffb = diffa ;
    diffa = fabs(diff) ;
    if(dnorm>rho) nfsav = nf ;
    /* Update FOPT and XOPT ifthe new F is the least value of the
     * objective function so far. The branch when KNEW is positive
     * occurs ifD is not a trust region step. */
    fsave = fopt ;
    if(f<fopt) { fopt = f ; for(i=0;i<n;i++) xopt[i] = xnew[i] ; }
    ksave = knew ;

    if(knew<0) 
    { /* Pick the next value of DELTA after a trust region step. */
      if(vquad>=0) { cond = 1 ; break ; }
      ratio = (f-fsave) / vquad ;
      if(ratio<=0.1) delta = dnorm/2 ;
      else if(ratio<=.7) delta = fmax(delta/2,dnorm) ;
      else delta = fmax(delta/2,2*dnorm) ;
      if(delta<=rho*1.5) delta = rho ;

      /* Set KNEW to the index of the next interpolation point to be
       * deleted. */
      temp = fmax(delta/10,rho) ;
      if(f>=fsave) { ktemp = kopt ; detrat = 1 ; }
      else { ktemp = -1 ; detrat = 0 ; } 

      knew = chooseknew(n,npt,beta,z,vlag,xp,xopt,detrat,temp*temp,ktemp,idz) ; 
    }
     /* Update BMAT, ZMAT and IDZ, so that the KNEW-th interpolation
      * point can be moved. Begin the updating of the quadratic model,
      * starting with the explicit second derivative term. */
    if(knew>=0) idz = update(n,npt,b,z,idz,vlag,beta,&knew) ;

    if(knew>=0)
    { /* Update the other second derivative parameters, and then the
       * gradient vector of the model. Also include the new interpolation
       * point. */
      updatehq(n,npt,hq,pq[knew],xp,knew) ;
      pq[knew] = 0 ;
      genpqw(n,npt,pqw,z,knew,idz) ; 
      for(k=0;k<npt;k++) pq[k] += diff * pqw[k] ; 

      fval[knew] = f ;
      for(gqsq=i=0;i<n;i++)
      { gq[i] += diff * b[i][knew];
        gqsq += gq[i] * gq[i];
        xp[knew][i] = xnew[i];
      }

      /* If a trust region step makes a small change to the objective
       * function, then calculate the gradient of the least Frobenius norm
       * interpolant at XBASE, and store it in W, using VLAG for a vector
       * of right hand sides. */
      if(ksave==-1&&delta==rho&&fabs(ratio)>.01) itest = 0 ;
      else if(ksave==-1&&delta==rho) 
      { for(k=0;k<npt;k++) vlag[k] = fval[k] - fval[kopt] ; 
        for(gisq=i=0;i<n;i++)
        { for(w[i]=k=0;k<npt;k++) w[i] += b[i][k] * vlag[k] ;
          gisq += w[i]*w[i] ; 
        }
        itest += 1 ; 
        if(gqsq<gisq*100) itest = 0 ;
        if(itest>=3)
        { for(i=0;i<n;i++) gq[i] = w[i] ; 
          for(ih=0;ih<nh;ih++) hq[ih] = 0 ; 
          for(j=0;j<nptm;j++) 
          { for(w[j]=k=0;k<npt;k++) w[j] += vlag[k] * z[j][k] ;
            if(j<idz) w[j] = -w[j] ; 
          }
          for(k=0;k<npt;k++) 
            for(pq[k]=j=0;j<nptm;j++) pq[k] += z[j][k] * w[j];
          itest = 0 ; 
        }
      }

      if(f<fsave) kopt = knew ;
      knew = -1 ; 
      /* If a trust region step has provided a sufficient decrease in F,
       * then branch for another trust region calculation. The case
       * KSAVE>0 occurs when the new function value was calculated by a
       * model step. */
      if(f<=fsave+vquad/10||ksave>=0) continue ;
      /* Alternatively, find out ifthe interpolation points are close
       * enough to the best point so far. */
    }
    distsq = setknewdistsq(n,npt,delta,&knew,xopt,xp) ;
      /* If KNEW is positive, then set DSTEP, and branch back for the next
       * iteration, which will generate a "model step". */
    if(knew>=0) 
    { dstep = fmax(rho,fmin(sqrt(distsq)/10,delta/2)) ; dsq = dstep*dstep ; }
    else if(ratio<=0&&delta<=rho&&dnorm<=rho) 
    /* The calculations with the current value of RHO are complete. Pick
     * the next values of RHO and DELTA. */
    { if(rho<=rhoend) { cond = 0 ; break ; }
      ratio = setrhodelta(&rho,rhoend,&delta) ; 
      nfsav = nf ; 
    }
  }
  /* Return from the calculation, after another Newton-Raphson step,
   * if it is too short to have been tried before. */
  if(fopt<=f) { for(i=0;i<n;i++) x[i] = xbase[i] + xopt[i] ; f = fopt ; }
 
  free(xbase,xopt,xnew,w,fval,gq) ; 
  free(hq,pq,d,vlag,pqw) ; 
  freematrix(z,b,xp) ; 
  if(flip) return -f ; else return f ;
}
/* -------------------------------------------------------------------------- */

double uoamax(double (*func)(double*),double *x,int n,
                 double xinc,double tol,int max_iter)
{ return uoamin(func,x,-n,xinc,tol,max_iter) ; }

• updategopt • updategopt2 • update • shiftxbase • calcvlagbeta • chooseknew • genpqw • updatehq

/*
  The original fortran codes are distributed without restrictions. The
  C++ codes are distributed under MIT license.
 */
/* The MIT License

   Copyright (c) 2004, by M.J.D. Powell <mjdp@cam.ac.uk>
                 2008, by Attractive Chaos <attractivechaos@aol.co.uk>
                 2015, by Colin Champion <colin.champion@masterlyinactivity.com>

   Permission is hereby granted, free of charge, to any person obtaining
   a copy of this software and associated documentation files (the
   "Software"), to deal in the Software without restriction, including
   without limitation the rights to use, copy, modify, merge, publish,
   distribute, sublicense, and/or sell copies of the Software, and to
   permit persons to whom the Software is furnished to do so, subject to
   the following conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   SOFTWARE.
*/
#include"memory.h"
#include <math.h>
#define max(a,b) ((a)>(b)?(a):(b))
#define min(a,b) ((a)<(b)?(a):(b))

void updategopt(int n,double *gopt,double *hq,double *step)
{ int i,j,ih ; 
  for(ih=j=0;j<n;j++) for(i=0;i<=j;i++,ih++) 
  { if(i<j) gopt[j] += hq[ih] * step[i] ; gopt[i] += hq[ih] * step[j] ; }
}
void updategopt2(int n,int npt,double *gopt,double *xopt,double *pq,double **xp)
{ int i,k ;
  double sum,temp ; 
  for(k=0;k<npt;k++)
  { for(sum=i=0;i<n;i++) sum += xp[k][i] * xopt[i];
    temp = pq[k] * sum ;
    for(i=0;i<n;i++) gopt[i] += temp * xp[k][i] ;
  }
}
/* -------------------------------------------------------------------------- */
/*page*/
/* -------------------------------------------------------------------------- */

int update(int n,int npt,double **b,double **z,int idz,double *vlag,
           double beta,int *kn)
{
    /* The arrays BMAT and ZMAT with IDZ are updated, in order to shift
     * the interpolation point that has index KNEW. On entry, VLAG
     * contains the components of the vector Theta*Wcheck+e_b of the
     * updating formula (6.11), and BETA holds the value of the
     * parameter that has this name. */

  int nptm=npt-n-1,i,j,ja,jb,jl,jp,iflag,knew=kn[0] ;
  double tau,temp,scala,scalb,alpha,denom,tempa,tempb,tausq,sqrtdn ; 
  double *w=vector(npt+n) ;
static int upcall=-1 ; upcall += 1 ; 
  if(knew<0) throw up("knew=%d\n",knew) ; 

    /* Apply the rotations that put zeros in the KNEW-th row of ZMAT. */

  for(jl=0,j=1;j<npt-n-1;j++)
  { if(j==idz)  jl = j ;
    else if(z[j][knew]!=0) 
    { tempa = z[jl][knew];
      tempb = z[j][knew];
      temp = sqrt(tempa*tempa+tempb*tempb) ;
      tempa /= temp ;
      tempb /= temp ;
      for(i=0;i<npt;i++)
      { temp = tempa*z[jl][i] + tempb*z[j][i] ;
        z[j][i] = tempa*z[j][i] - tempb*z[jl][i] ;
        z[jl][i] = temp ;
      }
      z[j][knew] = 0;
    }
  }

  /* Put the first NPT components of the KNEW-th column of HLAG into
   * W, and calculate the parameters of the updating formula. */
  if(idz<1) tempa = z[0][knew] ; else tempa = -z[0][knew] ;
  for(i=0;i<npt;i++) w[i] = tempa * z[0][i] ; 
  if(jl>0) for(tempb=z[jl][knew],i=0;i<npt;i++) w[i] += tempb * z[jl][i] ; 

  alpha = w[knew] ;
  tau = vlag[knew] ;
  tausq = tau * tau ;
  denom = alpha*beta + tausq ;
  vlag[knew] -= 1 ;
  if(denom==0) { kn[0] = -1 ; free(w) ; return idz ; } 

  /* Complete the updating of ZMAT when there is only 1.0 nonzero
   * element in the KNEW-th row of the new matrix ZMAT, but, ifIFLAG
   * is set to 1.0, then the first column of ZMAT will be exchanged
   * with another 1.0 later. */
  iflag = 0 ;
  sqrtdn = sqrt(fabs(denom)) ;

  /* ------------------------------------------------------------------------ */
  /*page*/
  /* ------------------------------------------------------------------------ */

  if(jl==0)
  { tempa = tau / sqrtdn ; 
    tempb = z[0][knew] / sqrtdn ;
    for(i=0;i<npt;i++) z[0][i] = tempa*z[0][i] - tempb*vlag[i] ;
    if(denom<0) { if(idz==0) idz = 1 ; else iflag = 1 ; }
  }
  else 
  { /* Complete the updating of ZMAT in the alternative case. */
    if(beta>=0) ja = jl ; else ja = 0 ; 
    jb = jl - ja ;
    temp = z[jb][knew] / denom ;
    tempa = temp * beta ;
    tempb = temp * tau ;
    temp = z[ja][knew] ;
    scala = 1 / sqrt(fabs(beta)*temp*temp+tausq) ;
    scalb = scala * sqrtdn ;
    for(i=0;i<npt;i++)
    { z[ja][i] = scala * (tau * z[ja][i] - temp * vlag[i]) ;
      z[jb][i] = scalb * (z[jb][i] - tempa * w[i] - tempb * vlag[i]) ;
    }
    if(denom<=0) { if(beta<0) idz += 1 ; else iflag = 1 ; }
  }
  if(denom==0) throw up("denom is 0") ; 

  /* IDZ is reduced in the following case, and usually the first
   * column of ZMAT is exchanged with a later 1.0. */
  if(iflag==1) for(idz-=1,i=0;i<npt;i++) swap(z[0][i],z[idz][i]) ; 

  /* Finally, update the matrix BMAT. */
  for(j=0;j<n;j++)
  { jp = npt + j ; 
    w[jp] = b[j][knew] ;
    tempa = (alpha*vlag[jp] - tau*w[jp]) / denom ;
    tempb = -(beta*w[jp] + tau*vlag[jp]) / denom ;
    for(i=0;i<=jp;i++)
    { b[j][i] += tempa*vlag[i] + tempb*w[i] ; 
      if(i>=npt) b[i-npt][jp] = b[j][i];
    }
  }
  free(w) ; 
  return idz ;
}
/* -------------------------------------------------------------------------- */
/*page*/
/* -------------------------------------------------------------------------- */

void shiftxbase(int n,int npt,double **xp,double *xopt,double *vlag,double **b,
                double **z,double *pq,double *hq,int idz)
{ int i,j,k,ih ;
  double tempq,temp,sum,xoptsq,sumz,*w=vector(2*npt) ;

  for(xoptsq=i=0;i<n;i++) xoptsq += xopt[i] * xopt[i] ; 
  tempq = xoptsq/4 ;

  for(k=0;k<npt;k++)
  { for(sum=i=0;i<n;i++) sum += xp[k][i] * xopt[i];
    sum -= xoptsq/2 ;
    w[npt+k] = sum ;
    for(i=0;i<n;i++)
    { xp[k][i] -= xopt[i]/2 ;
      vlag[i] = b[i][k];
      w[i] = sum*xp[k][i] + tempq*xopt[i] ;
      for(j=0;j<=i;j++) b[j][npt+i] += vlag[i]*w[j] + w[i]*vlag[j] ;
    }
  }

  /* Then the revisions of BMAT that depend on ZMAT are calculated. */
  for(k=0;k<npt-n-1;k++)
  { for(sumz=i=0;i<npt;i++) { sumz += z[k][i] ; w[i] = w[npt+i] * z[k][i] ; }
    for(j=0;j<n;j++)
    { sum = tempq * sumz * xopt[j];
      for(i=0;i<npt;i++) sum += w[i] * xp[i][j];
      vlag[j] = sum;
      if(k<idz) sum = -sum;
      for(i=0;i<npt;i++) b[j][i] += sum * z[k][i];
    }
    for(i=0;i<n;i++)
    { if(k>=idz) temp = vlag[i] ; else temp = -vlag[i] ; 
      for(j=0;j<=i;j++) b[j][npt+i] += temp * vlag[j] ;
    }
  }

  /* The following instructions complete the shift of XBASE,
   * including the changes to the parameters of the quadratic
   * model. */

  for(ih=j=0;j<n;j++)
  { for(w[j]=k=0;k<npt;k++)
    { w[j] += pq[k] * xp[k][j] ; xp[k][j] -= xopt[j]/2 ; }
    for(i=0;i<=j;ih++,i++)
    { hq[ih] += w[i]*xopt[j] + xopt[i]*w[j] ; b[j][npt+i] = b[i][npt+j] ; }
  }
  free(w) ; 
}
/* -------------------------------------------------------------------------- */
/*page*/
/* -------------------------------------------------------------------------- */

//C     Calculate VLAG and BETA for the current choice of STEP. The first NPT
//C       elements of VLAG are set to the values of the Lagrange functions at
//C       XPT(KOPT,.)+STEP(.). The first NPT components of W_check are held
//C       in W, where W_check is defined in a paper on the updating method.

double calcvlagbeta(int n,int npt,double *vlag,double **z,double **b,double *w,
                    double *xopt,double *d,double beta,double dsq,double xsq,
                    int idz) 
{ int i,j,k ;
  double sum,dx,bsum ;
  for(beta=k=0;k<npt-n-1;k++)
  { for(sum=i=0;i<npt;i++) sum += z[k][i] * w[i] ;
    if(k<idz) { beta += sum*sum ; sum = -sum ; } else beta -= sum*sum ; 
    for(i=0;i<npt;i++) vlag[i] += sum * z[k][i];
  }

  for(bsum=dx=j=0;j<n;j++)
  { for(sum=i=0;i<npt;i++) sum += w[i] * b[j][i];
    bsum += sum * d[j] ;
    for(k=0;k<n;k++) sum += b[k][npt+j] * d[k];
    vlag[npt+j] = sum ; 
    bsum += sum * d[j] ; 
    dx += d[j] * xopt[j] ; 
  }

  // more numerical instability here? 
  return dx*dx + dsq*((xsq+dx) + dx + dsq/2) + beta - bsum ;
}
/* -------------------------------------------------------------------------- */

int chooseknew(int n,int npt,double beta,double **z,double *vlag,double **xp,
               double *xopt,double detrat,double rhosq,int ktemp,int idz)
{ int i,j,k,knew ;
  double temp,qtemp,distsq,hdiag ;

  for(knew=-1,k=0;k<npt;k++) if(k!=ktemp)
  { for(hdiag=j=0;j<npt-n-1;j++) 
      if(j<idz) hdiag -= z[j][k]*z[j][k] ; else hdiag += z[j][k]*z[j][k] ; 
    temp = fabs(beta*hdiag+vlag[k]*vlag[k]) ;
    for(distsq=j=0;j<n;j++) 
    { qtemp = xp[k][j] - xopt[j] ; distsq += qtemp*qtemp ; }
    if(distsq>=rhosq) 
    { if(rhosq==0) temp *= distsq * distsq ; 
      else { qtemp = distsq / rhosq ; temp *= qtemp*qtemp*qtemp ; }
    }
    if(temp>detrat) { detrat = temp ; knew = k ; } 
  }
  return knew ; 
}
/* -------------------------------------------------------------------------- */

double genpqw(int n,int npt,double *pqw,double **z,int knew,int idz)
{ int i,j,k ;
  double temp ; 
  for(k=0;k<npt;k++) pqw[k] = 0 ; 
  for(j=0;j<npt-n-1;j++)
  { if(j<idz) temp = -z[j][knew] ; else temp = z[j][knew] ; 
    if(temp!=0) for(k=0;k<npt;k++) pqw[k] += temp * z[j][k] ;
  }
  return pqw[knew] ; 
}
void updatehq(int n,int npt,double *hq,double pqknew,double **xpt,int knew) 
{ int i,j,ih ; 
  double temp ; 
  for(ih=i=0;i<n;i++) 
  { temp = pqknew * xpt[knew][i] ;
    for(j=0;j<=i;j++,ih++) hq[ih] += temp * xpt[knew][j] ; 
  }
}

Prof. Powell wrote a long account of the NEWUOA algorithm, available from the DAMTP website, as The NEWUOA software for unconstrained optimization without derivatives. It isn’t an easy read.
He also wrote an attractive paper A view of algorithms for optimization without derivatives available from the same place.
NEWUOA has a Wikipedia page.
Zaikun Zhang has a page listing programs by Prof. Powell including some emails containing information about them.
Attractive Chaos (secular name unknown) provides a C++ translation of Powell’s original Fortran. This was the starting point for my own version, and I’ve put a copy in my archive.
Steven G. Johnson provides another C conversion on github. This was unknown to me in 2015; presumably it’s bubbled up the Google listings since.

The test program uoatest.c below calls uoamin to minimise the Chebyshev function used by Powell to illustrate his own Fortran code. It does this in 2, 4, 6, 8, 10 and 20 dimensions (Powell stopped at 8). The correct output is this:

 2    31: 3.78846734e-19:  0.211325  0.788675
 4    90: 3.52620610e-14:  0.102673  0.406204  0.593796  0.897327
 6   193: 1.55535235e-12:  0.066876  0.288740  0.366683  0.633317  0.711260 ...
 8   341: 3.51687373e-03:  0.043153  0.193091  0.266329  0.500000  0.500000 ...
10   563: 4.77271370e-03:  0.074748  0.171518  0.286434  0.359647  0.470750 ...
20  2418: 4.57295520e-03:  0.024600  0.070921  0.116574  0.176668  0.206808 ...

[Note that it is slightly changed since the bugfix of 22 Feb 2020.]

Attractive Chaos’s code gives the following:

 2    31: 3.78847194e-19:  0.211325  0.788675
 4    91: 2.19176132e-15:  0.102673  0.406204  0.593796  0.897327
 6   190: 3.12702103e-14:  0.066877  0.288741  0.366682  0.633318  0.711259 ...
 8   303: 3.51687373e-03:  0.043153  0.193091  0.266329  0.500000  0.500000 ...
10   668: 6.50395480e-03:  0.059620  0.166708  0.239170  0.398884  0.398885 ...
20  1934: 4.57295560e-03:  0.024600  0.070921  0.116573  0.176667  0.206807 ...

Notice that I perform noticeably more function evaluations for n=20 but get an appreciably smaller minimum for n=10.

Attractive Chaos’s results are not identical to Powell’s own, which are these:

    At the return from NEWUOA     Number of function values =    31
    Least value of F =  3.788471046857957D-19         The corresponding X is:
     2.113249D-01   7.886751D-01

    At the return from NEWUOA     Number of function values =    90
    Least value of F =  3.526693206487107D-14         The corresponding X is:
     1.026728D-01   4.062038D-01   5.937962D-01   8.973272D-01

    At the return from NEWUOA     Number of function values =   198
    Least value of F =  4.343402133989936D-14         The corresponding X is:
     6.687652D-02   2.887405D-01   3.666823D-01   6.333176D-01   7.112593D-01
     9.331234D-01

    At the return from NEWUOA     Number of function values =   314
    Least value of F =  3.516873725862449D-03         The corresponding X is:
     4.315284D-02   1.930909D-01   2.663288D-01   5.000002D-01   4.999999D-01
     7.336712D-01   8.069093D-01   9.568473D-01

Powell’s results are no better, just different, and the discrepancy isn’t surprising given the numerical delicacy of the algorithm. Powell remarked that “the results are still highly sensitive to computer rounding errors” and was in the habit of giving two results for each problem, differing through having the x-coordinates permuted (which makes no difference mathematically); his pairs of results typically differ more widely between themselves than ours do from his on the Chebyshev problem.

• initchebyfunc • closechebyfunc • chebyfunc • main

#include "memory.h"
static int n=0,ncalls=0 ; 
static double **y=0 ; 

void initchebyfunc(int k) { n = k ; ncalls = 0 ; y = matrix(n+1,n) ; } 
int closechebyfunc() 
{ int ret=ncalls ; freematrix(y) ; ncalls = n = 0 ; y = 0 ; return ret ; } 

double chebyfunc(double *x) 
{ double q,qret ; 
  int i,j ; 

  for(j=0;j<n;j++) { y[0][j] = 1 ; y[1][j] = 2*x[j]-1 ; } 
  for(i=2;i<=n;i++) for(j=0;j<n;j++) y[i][j] = 2*y[1][j]*y[i-1][j] - y[i-2][j] ;
  for(qret=i=0;i<=n;i++)
  { for(q=j=0;j<n;j++) q += y[i][j] ; 
    q /= n ; 
    if((i&1)==0) q += 1/((i-1.0)*(i+1.0)) ; 
    qret += q*q ; 
  }
  ncalls += 1 ; 
  return qret ; 
} 
double uoamin(double (*)(double*),double*,int,double,double,int) ;
double uoamax(double (*)(double*),double*,int,double,double,int) ;

int main(int argc,char **argv)
{ int i,ind,n,ncalls,order[]={2,4,6,8,10,20,0,0} ; 
  double *x,y ; 

  for(ind=0;(n=order[ind]);ind++)
  { x = vector(n) ; 
    for(i=0;i<n;i++) x[i] = (i+1.0) / (n+1.0) ;
    initchebyfunc(n) ; 
    y = uoamin(chebyfunc,x,n,x[0]/5,1e-6,50000) ; 
    ncalls = closechebyfunc() ;
    printf("%2d%6d: %.8e:",n,ncalls,y) ; 
    for(i=0;i<n&&i<5;i++) printf(" %9.6f",x[i]) ; 
    if(n>5) printf(" ...") ; 
    printf("\n") ; 
    free(x) ; 
  }
}

newuoa.c mdplib.c uoatest.c
condorcet: memory.h

Download the code and compile using:

g++ -c -O -g newuoa.c mdplib.c
g++ -o uoatest -g -O uoatest.c mdplib.o newuoa.o

Then run uoatest without any arguments.

doc : source : mdplib source : references : test : download