0
votes

I am using SciPy root function to find a root of a function which involves an ODE (in the function passed to root I am calling SciPy solve_ivp). When using "krylov" method in the root, it solves it with no problem. But when I use any other method it raise a ValueError: ValueError: could not broadcast input array from shape (2,1) into shape (2,).

Any help is much appreciated.

Thanks, Kourosh

2
Do you know anything about numpy broadcasting? no? then you haven't read enough basic numpy. Or how about a traceback with the error? My guess - just a guess - is that your objective function returns a (2,1) shape array, but it is supposed to be a (2,) shape, a flat 1d array. Did you review the method docs, and their requirements? Paying close attention to the shape of the parameter requirements? Clearly it's a broadcasted assignment issue, but the rest is just a guess. - hpaulj
Thank you for your comment. Yes, I know about numpy broadcasting. My objective function return a scalar value. My question is how one of the scipy root methods i.e., krylov solves it with no problem but other methods raise an error. - Kourosh

2 Answers

0
votes

So, I think I figure out what the problem was. I will post it here in case someone else has the same problem.

The problem is how scipy root function is treating x0 differently when krylov versus other methods is used. For every method other than krylov it treats x0 as an array of length 1 (since in my case I am solving 1 equation and not system of equations). This might result in broadcasting error mentioned above. I also report this in scipy github page. This can be resolved by converting x0 from array to scalar (e.g., with .item() method).

0
votes

The code you posted on the bug report (why didn't you post any of there here?):

In [1]: from scipy.optimize import root
   ...: from scipy.integrate import solve_ivp
   ...: 
   ...: 
   ...: def fun_d(x, d0):
   ...:     return(d0 * x)
   ...: 
   ...: 
   ...: def ode_fun(t, z, a, b, c, d0):
   ...:     x, y = z
   ...:     d = fun_d(x, d0)
   ...:     u = a*d*x - b*x*y
   ...:     v = -c*y + d*x*y
   ...:     return([u, v])
   ...: 
   ...: 
   ...: def fun_err(d0, a, b, c):
   ...:     sol = solve_ivp(ode_fun, t_span=[0, 15], y0=[10, 5],
   ...:                     method='RK45', args=(a, b, c, d0), t_eval=[12, ],
   ...:                     rtol=1e-11, atol=1e-13)
   ...:     obj_sol = solve_ivp(ode_fun, t_span=[0, 15], y0=[10, 5],
   ...:                         method='RK45', args=(1.5, 1, 3, 1), t_eval=[12, ],
   ...:                         rtol=1e-11, atol=1e-13)
   ...:     obj_x = obj_sol.y[0][-1]
   ...:     return(sol.y[0][-1] - obj_x)
   ...: 
   ...: 
   ...: a = 1.5
   ...: b = 1
   ...: c = 3
   ...: 

The case that runs:

In [2]: d_sol = root(fun_err, x0=0.5, args=(a, b, c), method='krylov')
/usr/local/lib/python3.8/dist-packages/scipy/optimize/_nonlin.py:366: RuntimeWarning: invalid value encountered in double_scalars
  and dx_norm/self.x_rtol <= x_norm))
In [3]: d_sol
Out[3]: 
     fun: array([1.82191151e-06])
 message: 'A solution was found at the specified tolerance.'
     nit: 2
  status: 1
 success: True
       x: array(0.53107372)

The failure case, with FULL TRACEBACK

In [4]: d_sol = root(fun_err, x0=0.5, args=(a, b, c), method='hybr')
Traceback (most recent call last):
  Input In [4] in <cell line: 1>
    d_sol = root(fun_err, x0=0.5, args=(a, b, c), method='hybr')
  File /usr/local/lib/python3.8/dist-packages/scipy/optimize/_root.py:234 in root
    sol = _root_hybr(fun, x0, args=args, jac=jac, **options)
  File /usr/local/lib/python3.8/dist-packages/scipy/optimize/_minpack_py.py:226 in _root_hybr
    shape, dtype = _check_func('fsolve', 'func', func, x0, args, n, (n,))
  File /usr/local/lib/python3.8/dist-packages/scipy/optimize/_minpack_py.py:24 in _check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
  Input In [1] in fun_err
    sol = solve_ivp(ode_fun, t_span=[0, 15], y0=[10, 5],
  File /usr/local/lib/python3.8/dist-packages/scipy/integrate/_ivp/ivp.py:580 in solve_ivp
    message = solver.step()
  File /usr/local/lib/python3.8/dist-packages/scipy/integrate/_ivp/base.py:181 in step
    success, message = self._step_impl()
  File /usr/local/lib/python3.8/dist-packages/scipy/integrate/_ivp/rk.py:144 in _step_impl
    y_new, f_new = rk_step(self.fun, t, y, self.f, h, self.A,
  File /usr/local/lib/python3.8/dist-packages/scipy/integrate/_ivp/rk.py:61 in rk_step
    K[0] = f
ValueError: could not broadcast input array from shape (2,1) into shape (2,)

based on the branching in root source, it also works for any of these:

 `broyden1', 'broyden2', 'anderson', 'linearmixing',
 'diagbroyden', 'excitingmixing', 'krylov'

According to the root docs x0 is supposed to a numpy array:

Parameters
fun : callable
A vector function to find a root of.

x0 : ndarray
Initial guess.

The root of the problem is that

def fun_d(x, d0):
    return(d0 * x)

and hence ode_fun produces a different result depending on whether x0 is np.array(.5) or np.array([.5]). In the problem cases ode_fun returns (2,1) array (or list that becomes that), where as it should be a flat (2,) array.