validation accuracy on them.
Option -C conducts cross validation under different parameters and finds
-the best one. This option is supported only by -s 0, -s 2 (for finding
-C) and -s 11 (for finding C, p). If the solver is not specified, -s 2
+the best one. This option is supported only by -s 0, -s 2 (for finding
+C) and -s 11 (for finding C, p). If the solver is not specified, -s 2
is used.
Formulations:
solvers (currently -s 0, -s 2 and -s 11 are supported) and
different number of CV folds. Further, users can use
the -c option to specify the smallest C value of the
-search range. This option is useful when users want to
+search range. This option is useful when users want to
rerun the parameter selection procedure from a specified
C under a different setting, such as a stricter stopping
tolerance -e 0.0001 in the above example. Similarly, for
--s 11, users can use the -p option to specify the
-maximal p value of the search range.
+-s 11, users can use the -p option to specify the
+maximal p value of the search range.
> train -c 10 -w1 2 -w2 5 -w3 2 four_class_data_file
conducts cross validation many times under parameters C = start_C,
2*start_C, 4*start_C, 8*start_C, ..., and finds the best one with
the highest cross validation accuracy. For -s 11, it conducts cross
- validation many times with a two-fold loop. The outer loop considers a
+ validation many times with a two-fold loop. The outer loop considers a
default sequence of p = 19/20*max_p, ..., 1/20*max_p, 0 and
- under each p value the inner loop considers a sequence of parameters
+ under each p value the inner loop considers a sequence of parameters
C = start_C, 2*start_C, 4*start_C, ..., and finds the best one with the
lowest mean squared error.
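
The search order described in this hunk can be sketched as below. This is only an illustration of which (p, C) candidates are visited; it runs no cross validation and leaves out the early stop that applies once the folds become stable. The names `c_sequence`, `p_sequence` and `search_grid`, and the exact form of the max_C cap, are assumptions made for this sketch and are not LIBLINEAR functions.

```python
def c_sequence(start_C, max_C):
    """Candidate C values: start_C, 2*start_C, 4*start_C, ... up to max_C."""
    values = []
    C = start_C
    while C <= max_C:  # assumed cap; the real procedure may also stop earlier
        values.append(C)
        C *= 2
    return values


def p_sequence(max_p, steps=20):
    """Default outer-loop p values: 19/20*max_p, ..., 1/20*max_p, 0."""
    return [i * max_p / steps for i in range(steps - 1, -1, -1)]


def search_grid(start_C, max_C, max_p):
    """Yield (p, C) pairs in the order of the two-level loop used for -s 11."""
    for p in p_sequence(max_p):              # outer loop over p
        for C in c_sequence(start_C, max_C):  # inner loop over C
            yield p, C
```

For -s 0 and -s 2 only the inner sequence applies, since p is not used by classification.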
all folds become stable or C reaches max_C.
If start_p <= 0, then this procedure calculates a maximal p for prob as
- the start_p. Otherwise, the procedure starts with the first
- i/20*max_p <= start_p so the outer sequence is i/20*max_p,
+ the start_p. Otherwise, the procedure starts with the first
+ i/20*max_p <= start_p so the outer sequence is i/20*max_p,
(i-1)/20*max_p, ..., 0.
-
- The best C, the best p, and the corresponding accuracy (or MSE) are
- assigned to *best_C, *best_p and *best_score, respectively. For
+
+ The best C, the best p, and the corresponding accuracy (or MSE) are
+ assigned to *best_C, *best_p and *best_score, respectively. For
classification, *best_p is not used, and the returned value is -1.
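
A minimal sketch of the start_p rule above, under stated assumptions: max_p is passed in rather than calculated from prob, it is taken to be positive, and i ranges over 19, 18, ..., 0 to match the default sequence. The function name `outer_p_values` is hypothetical and is not part of the library.

```python
def outer_p_values(start_p, max_p, steps=20):
    """Return the outer p sequence i/20*max_p, (i-1)/20*max_p, ..., 0.

    If start_p <= 0, max_p is used as start_p, which gives the default
    sequence starting at 19/20*max_p. Assumes max_p > 0.
    """
    if start_p <= 0:
        start_p = max_p
    # Find the first (largest) i whose grid point does not exceed start_p.
    for i in range(steps - 1, -1, -1):
        if i * max_p / steps <= start_p:
            return [j * max_p / steps for j in range(i, -1, -1)]
    return [0.0]  # only reached when the max_p > 0 assumption is violated
```

With max_p = 2.0, a start_p of 1.0 begins the outer loop at 10/20*max_p, while a start_p of 0 yields the full 20-value default sequence.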
- Function: double predict(const model *model_, const feature_node *x);
returned model is just a scalar: cross-validation accuracy for
classification and mean-squared error for regression.
-If the '-C' option is specified, best parameters are found by cross
+If the '-C' option is specified, best parameters are found by cross
validation. The parameter selection utility is supported only by -s 0,
--s 2 (for finding C) and -s 11 (for finding C, p). The returned
-model is a three dimensional vector with the best C, the best p, and
-the corresponding cross-validation accuracy or mean squared error. The
+-s 2 (for finding C) and -s 11 (for finding C, p). The returned
+model is a three dimensional vector with the best C, the best p, and
+the corresponding cross-validation accuracy or mean squared error. The
returned best p for -s 0 and -s 2 is set to -1 because the p parameter
is not used by classification models.
structure. If '-v' is specified, cross validation is
conducted and the returned model is just a scalar: cross-validation
accuracy for classification and mean-squared error for regression.
-
+
If the '-C' option is specified, best parameters are found
- by cross validation. The parameter selection utility is supported
+ by cross validation. The parameter selection utility is supported
only by -s 0, -s 2 (for finding C) and -s 11 (for finding C, p).
- The returned structure is a triple with the best C, the best p,
- and the corresponding cross-validation accuracy or mean squared
- error. The returned best p for -s 0 and -s 2 is set to -1 because
+ The returned structure is a triple with the best C, the best p,
+ and the corresponding cross-validation accuracy or mean squared
+ error. The returned best p for -s 0 and -s 2 is set to -1 because
the p parameter is not used by classification models.
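
As an illustration of how the returned triple might be read, the sketch below treats a negative best p as the classification case. It assumes the triple is ordered (best C, best p, score) as described above; the helper name `describe_selection` and the dictionary keys are made up for this example and are not part of the interface.

```python
def describe_selection(result):
    """Interpret a (best_C, best_p, score) triple from parameter selection."""
    best_C, best_p, score = result
    if best_p < 0:
        # -s 0 and -s 2: p is not used, so it is returned as -1
        # and the score is the cross-validation accuracy.
        return {"task": "classification", "C": best_C, "cv_accuracy": score}
    # -s 11: both C and p are selected and the score is the mean squared error.
    return {"task": "regression", "C": best_C, "p": best_p, "cv_mse": score}
```

Searched p values are never negative, so the -1 sentinel cannot be confused with a selected p.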