optiml.opti.unconstrained.stochastic.gradient_descent module
- class optiml.opti.unconstrained.stochastic.gradient_descent.StochasticGradientDescent(f, x=None, batch_size=None, eps=1e-06, tol=1e-08, epochs=1000, step_size=0.01, momentum_type='none', momentum=0.9, callback=None, callback_args=(), shuffle=True, random_state=None, verbose=False)
Bases: StochasticMomentumOptimizer
Stochastic Gradient Descent (SGD) for the minimization of the provided function f.
At each iteration the current point is moved along the negative of the gradient estimated on a mini batch of the data, with a step length given by a fixed (or scheduled) learning rate. The step can optionally be accelerated by a classical heavy-ball (Polyak) or Nesterov momentum term, as selected by momentum_type.
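The update described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the optiml implementation: the helper name momentum_step, the grad_fn argument and the exact Nesterov formulation (gradient evaluated at the look-ahead point) are hypothetical choices made for the example, and the library may arrange the same terms differently.

```python
import numpy as np


def momentum_step(x, prev_step, grad_fn, step_size=0.01,
                  momentum_type='none', momentum=0.9):
    """One SGD update (hypothetical helper, not part of optiml).

    grad_fn(point) is assumed to return the mini-batch gradient
    estimate at `point`; prev_step is the step taken at the
    previous iteration (zeros at the first one).
    """
    if momentum_type == 'none':
        # plain SGD: move against the gradient estimate
        step = -step_size * grad_fn(x)
    elif momentum_type == 'polyak':
        # heavy-ball: keep a fraction of the previous step
        step = momentum * prev_step - step_size * grad_fn(x)
    elif momentum_type == 'nesterov':
        # Nesterov: evaluate the gradient at the look-ahead point
        look_ahead = x + momentum * prev_step
        step = momentum * prev_step - step_size * grad_fn(look_ahead)
    else:
        raise ValueError(f'unknown momentum type: {momentum_type}')
    return x + step, step


# Toy objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x = np.array([1.0, -2.0])
step = np.zeros_like(x)
for _ in range(200):
    x, step = momentum_step(x, step, lambda p: p,
                            step_size=0.1, momentum_type='nesterov')
```

On this toy quadratic all three variants drive x towards the origin; they differ only in how much of the previous step is carried over and where the gradient is evaluated.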
- Parameters:
f – the objective function.
x – ([n x 1] real column vector): the starting point of the algorithm.
batch_size – (integer scalar or None, optional, default value None): the size of the mini batches used to estimate the gradient; if None the full sample is used.
eps – (real scalar, optional, default value 1e-6): the accuracy in the stopping criterion: the algorithm is stopped when the norm of the gradient is less than or equal to eps.
tol – (real scalar, optional, default value 1e-8): the tolerance used in the optimality conditions of the Lagrangian dual (when f is a Lagrangian dual).
epochs – (integer scalar, optional, default value 1000): the maximum number of epochs before the algorithm is stopped.
step_size – (real scalar > 0, callable or iterable, optional, default value 0.01): the learning rate, i.e., the size of the step taken along the negative gradient.
momentum_type – (string in {‘none’, ‘polyak’, ‘nesterov’}, optional, default value ‘none’): the kind of momentum to apply (‘none’, heavy-ball ‘polyak’ or ‘nesterov’).
momentum – (real scalar in [0, 1) or iterable, optional, default value 0.9): the momentum factor, i.e., the fraction of the previous step retained in the current one.
callback – (callable, optional, default value None): a function called at each iteration with the optimizer instance (and callback_args) as arguments; it can raise StopIteration to interrupt the optimization.
callback_args – (tuple, optional, default value ()): additional positional arguments passed to the callback at each call.
shuffle – (boolean, optional, default value True): whether to shuffle the order of the mini batches at the beginning of each epoch.
random_state – (integer scalar or None, optional, default value None): seed for the random number generator, for reproducibility.
verbose – (boolean or integer, optional, default value False): print details about each iteration if True (or every verbose epochs if an integer), nothing otherwise.
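The callback contract described above (called with the optimizer instance followed by callback_args, and allowed to raise StopIteration) can be illustrated with a toy driver loop. Everything in this sketch is hypothetical: toy_loop stands in for the optimizer's own iteration loop, and stop_after is an example callback, neither is optiml code.

```python
def stop_after(optimizer, max_calls, state):
    """Example callback: interrupt the run after max_calls invocations.

    The first positional argument is the optimizer instance; the
    remaining ones are whatever was passed as callback_args.
    """
    state['calls'] = state.get('calls', 0) + 1
    if state['calls'] >= max_calls:
        raise StopIteration


def toy_loop(callback, callback_args=(), max_iter=100):
    """Stand-in for an optimizer loop (an assumption, not optiml code)."""
    optimizer = object()  # placeholder for the optimizer instance
    done = 0
    for _ in range(max_iter):
        done += 1
        try:
            callback(optimizer, *callback_args)
        except StopIteration:
            # the callback asked to interrupt the optimization
            break
    return done


state = {}
iterations = toy_loop(stop_after, callback_args=(5, state))
```

With the real class, the same callable would presumably be passed as callback=stop_after together with callback_args=(5, state).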
- callback(args=())
- check_lagrangian_dual_conditions()
- check_lagrangian_dual_optimality()
- is_augmented_lagrangian_dual()
- is_batch_end()
- is_lagrangian_dual()
- is_verbose()
- iter_mini_batches()
Return an iterator that successively yields tuples of aligned mini batches of size batch_size from the sliceable arrays returned by f.args(), in random order (when shuffle is True) without replacement.
- Returns:
an infinite iterator of mini batches (one tuple of aligned slices per step).
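A generator with the behaviour described above might look like the following sketch. It is an assumption-laden illustration rather than the library's code: the name mini_batch_iterator and its arguments are hypothetical, the arrays are passed in directly instead of being read from f.args(), and the reshuffle is done on sample indices at the start of each pass over the data.

```python
import numpy as np


def mini_batch_iterator(arrays, batch_size, shuffle=True, random_state=None):
    """Endlessly yield tuples of aligned mini batches (hypothetical helper).

    arrays is a tuple of NumPy arrays sharing the same first dimension,
    e.g. (X, y); each yielded tuple holds the same rows of every array.
    """
    rng = np.random.default_rng(random_state)
    n_samples = len(arrays[0])
    while True:  # one pass of the inner loop is one epoch
        # sampling without replacement: every row appears once per epoch
        order = rng.permutation(n_samples) if shuffle else np.arange(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            yield tuple(a[idx] for a in arrays)


X = np.arange(10).reshape(5, 2)
y = np.arange(5)
batches = mini_batch_iterator((X, y), batch_size=2, random_state=0)
X_batch, y_batch = next(batches)
```

Because the iterator never terminates, the caller decides when to stop, for example by counting epochs; when the sample size is not a multiple of batch_size, the last batch of each epoch in this sketch is simply smaller.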