optiml.ml.svm.smo module

class optiml.ml.svm.smo.SMO(quad, X, y, K, kernel, C, tol=0.001, verbose=False)

Bases: ABC

Base abstract class for the sequential minimal optimization (SMO) algorithm used to train the dual SVM formulation. It holds the data, the kernel matrix and the optimization state shared by the classifier and regression variants.

Subclasses must implement _take_step, _examine_example and minimize.

Parameters:

quad (Quadratic instance) – The quadratic objective of the dual problem, used to monitor the cost during the optimization.
X (ndarray of shape (n_samples, n_features)) – Training data.
y (ndarray of shape (n_samples,)) – Target values associated with X.
K (ndarray of shape (n_samples, n_samples)) – Precomputed kernel (Gram) matrix of the training data.
kernel (Kernel instance) – The kernel function used to build K. If it is a LinearKernel, the primal weight vector w is maintained explicitly.
C (float) – Regularization parameter, i.e., the upper bound on the Lagrange multipliers.
tol (float, default=1e-3) – Tolerance for the KKT stopping criterion.
verbose (bool or int, default=False) – Controls the verbosity of progress messages to stdout.
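As an illustration of the K and kernel parameters, the following is a minimal NumPy sketch of how a Gram matrix of the expected shape can be built for a linear kernel, and of the primal weight vector that a linear kernel makes available. The helpers linear_gram and primal_weights are hypothetical names introduced here; they are not part of optiml, and the formula for w assumes the classification case with labels in {-1, +1}.

```python
import numpy as np


def linear_gram(X):
    """Gram matrix of the linear kernel: K[i, j] = <x_i, x_j>."""
    return X @ X.T


def primal_weights(alphas, y, X):
    """Primal weights w = sum_i alphas[i] * y[i] * X[i] (linear kernel only)."""
    return (alphas * y) @ X


# Toy data: 3 samples, 2 features (illustrative values only).
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, -1.0, 1.0])
alphas = np.array([0.5, 0.5, 0.0])

K = linear_gram(X)                 # shape (n_samples, n_samples)
w = primal_weights(alphas, y, X)   # shape (n_features,)

# With a linear kernel the two ways of scoring the training points agree.
scores_dual = (alphas * y) @ K
scores_primal = X @ w
```

Keeping w explicitly means a prediction costs one dot product with w instead of a sum over all support vectors, which is presumably why the linear case is treated specially.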

minimize()

class optiml.ml.svm.smo.SMOClassifier(quad, X, y, K, kernel, C, tol=0.001, verbose=False)

Bases: SMO

Implements John Platt’s sequential minimal optimization algorithm for training a support vector classifier.

SMO is an algorithm for solving the large quadratic programming (QP) problems that arise when training support vector machines. First developed by John C. Platt in 1998, it breaks a large QP problem into a series of the smallest possible QP subproblems, each involving two Lagrange multipliers, which are then solved analytically.

This class follows the original algorithm by Platt with additional modifications by Keerthi et al.
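The analytic solution of one two-multiplier subproblem can be sketched as follows. This is a simplified version of the update in Platt's paper, not the code of this class: the function name two_variable_step and its arguments are assumptions made for illustration, the error terms E1 and E2 are taken to be f(x_i) - y_i, and the degenerate case of a non-positive curvature eta is skipped instead of being handled by evaluating the objective at the bounds.

```python
def two_variable_step(a1, a2, y1, y2, E1, E2, K11, K12, K22, C):
    """Jointly optimize two multipliers, keeping y1*a1 + y2*a2 constant.

    Returns the updated pair (a1_new, a2_new), or None if no step is taken.
    """
    # Feasible segment for a2 given the box [0, C] and the equality constraint.
    if y1 != y2:
        low, high = max(0.0, a2 - a1), min(C, C + a2 - a1)
    else:
        low, high = max(0.0, a1 + a2 - C), min(C, a1 + a2)
    if low >= high:
        return None

    # Curvature of the objective along the constraint line.
    eta = K11 + K22 - 2.0 * K12
    if eta <= 0.0:
        return None  # degenerate case, omitted in this sketch

    # Unconstrained optimum for a2, clipped to the feasible segment.
    a2_new = a2 + y2 * (E1 - E2) / eta
    a2_new = min(max(a2_new, low), high)

    # Move a1 by the opposite amount so the equality constraint still holds.
    a1_new = a1 + y1 * y2 * (a2 - a2_new)
    return a1_new, a2_new


# Two orthonormal points of opposite class, starting from alpha = 0.
step = two_variable_step(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, 0.0, 1.0, C=10.0)
```

Because each subproblem has a closed-form solution, no numerical QP solver is needed in the inner loop.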

References

John C. Platt. Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines.

S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, K.R.K. Murthy. Improvements to Platt’s SMO Algorithm for SVM Classifier Design. Technical Report CD-99-14.

Parameters:

quad (Quadratic instance) – The quadratic objective of the dual problem, used to monitor the cost during the optimization.
X (ndarray of shape (n_samples, n_features)) – Training data.
y (ndarray of shape (n_samples,)) – Target values associated with X.
K (ndarray of shape (n_samples, n_samples)) – Precomputed kernel (Gram) matrix of the training data.
kernel (Kernel instance) – The kernel function used to build K. If it is a LinearKernel, the primal weight vector w is maintained explicitly.
C (float) – Regularization parameter, i.e., the upper bound on the Lagrange multipliers.
tol (float, default=1e-3) – Tolerance for the KKT stopping criterion.
verbose (bool or int, default=False) – Controls the verbosity of progress messages to stdout.
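To make the role of tol concrete, here is a minimal sketch of the per-example KKT check from Platt's original paper. It is an assumption that this matches the class's behavior only in spirit: the Keerthi et al. variant followed here uses a pair of thresholds instead of this single-example test. The name violates_kkt is hypothetical, and E is taken to be f(x) - y for a label y in {-1, +1}.

```python
def violates_kkt(alpha, y, E, C, tol=1e-3):
    """Return True if the example violates the KKT conditions by more than tol."""
    r = E * y  # equals y * f(x) - 1 when y is +1 or -1
    # Inside the margin but the multiplier can still grow, or
    # outside the margin but the multiplier can still shrink.
    return (r < -tol and alpha < C) or (r > tol and alpha > 0)


# A misclassified point with alpha = 0 is a violator.
needs_update = violates_kkt(alpha=0.0, y=1, E=-1.0, C=1.0)
```

A larger tol stops the optimization earlier at the price of a less accurate solution.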

minimize()

class optiml.ml.svm.smo.SMORegression(quad, X, y, K, kernel, C, epsilon, tol=0.001, verbose=False)

Bases: SMO

Implements Smola and Schölkopf's sequential minimal optimization algorithm for training a support vector regressor.

SMO is an algorithm for solving the large quadratic programming (QP) problems that arise when training support vector machines. First developed by John C. Platt in 1998, it breaks a large QP problem into a series of the smallest possible QP subproblems, which are then solved analytically.

This class incorporates the modifications to the original SMO algorithm suggested by Alex J. Smola and Bernhard Schölkopf to solve regression problems, together with the further performance improvements by Shevade et al.

References

G.W. Flake, S. Lawrence. Efficient SVM Regression Training with SMO.

Alex J. Smola, Bernhard Schölkopf. A Tutorial on Support Vector Regression. NeuroCOLT2 Technical Report Series NC2-TR-1998-030.

S.K. Shevade, S.S. Keerthi, C. Bhattacharyya, K.R.K. Murthy. Improvements to SMO Algorithm for SVM Regression. Technical Report CD-99-16.

Parameters:

quad (Quadratic instance) – The quadratic objective of the dual problem, used to monitor the cost during the optimization.
X (ndarray of shape (n_samples, n_features)) – Training data.
y (ndarray of shape (n_samples,)) – Target values associated with X.
K (ndarray of shape (n_samples, n_samples)) – Precomputed kernel (Gram) matrix of the training data.
kernel (Kernel instance) – The kernel function used to build K. If it is a LinearKernel, the primal weight vector w is maintained explicitly.
C (float) – Regularization parameter, i.e., the upper bound on the Lagrange multipliers.
epsilon (float) – Epsilon of the epsilon-insensitive loss. It defines the epsilon-tube around the targets: predictions whose absolute error is at most epsilon incur no penalty in the regression problem.
tol (float, default=1e-3) – Tolerance for the KKT stopping criterion.
verbose (bool or int, default=False) – Controls the verbosity of progress messages to stdout.
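The effect of epsilon can be seen in a small sketch of the epsilon-insensitive loss. The function name epsilon_insensitive_loss and the sample values are illustrative assumptions and are not part of optiml.

```python
import numpy as np


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample loss max(0, |y_true - y_pred| - epsilon)."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - epsilon)


y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.05, 2.5, 2.0])

# The first prediction falls inside the tube and costs nothing;
# the other two are penalized only by the amount they exceed epsilon.
losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
```

Samples with zero loss lie inside the tube, which is what makes the support vector regression solution sparse.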

minimize()