scikeras.wrappers.BaseWrapper

class scikeras.wrappers.BaseWrapper(model=None, *, build_fn=None, warm_start=False, random_state=None, optimizer='rmsprop', loss=None, metrics=None, batch_size=None, validation_batch_size=None, verbose=1, callbacks=None, validation_split=0.0, shuffle=True, run_eagerly=False, epochs=1, **kwargs)

Implementation of the scikit-learn classifier API for Keras.

Below is a list of SciKeras-specific parameters. For details on other parameters, please see the tf.keras.Model documentation.

Parameters:

model : Union[None, Callable[…, tf.keras.Model], tf.keras.Model], default None: Used to build the Keras Model. When called, must return a compiled instance of a Keras Model to be used by fit, predict, etc. If None, you must implement _keras_build_fn.
optimizer : Union[str, tf.keras.optimizers.Optimizer, Type[tf.keras.optimizers.Optimizer]], default “rmsprop”: This can be a string naming one of Keras’ built-in optimizers, an instance of tf.keras.optimizers.Optimizer, or a class inheriting from tf.keras.optimizers.Optimizer. Only strings and classes support parameter routing.
loss : Union[Union[str, tf.keras.losses.Loss, Type[tf.keras.losses.Loss], Callable], None], default None: The loss function to use for training. This can be a string naming one of Keras’ built-in losses, an instance of tf.keras.losses.Loss, or a class inheriting from tf.keras.losses.Loss. Only strings and classes support parameter routing.
random_state : Union[int, np.random.RandomState, None], default None: Set the TensorFlow random number generators to a reproducible deterministic state using this seed. Pass an int for reproducible results across multiple function calls.
warm_start : bool, default False: If True, subsequent calls to fit will _not_ reset the model parameters but will reset the epoch to zero. If False, subsequent fit calls will reset the entire model. This has no impact on partial_fit, which always trains for a single epoch starting from the current epoch.
batch_size : Union[int, None], default None: Number of samples per gradient update. This will be applied to both fit and predict. To specify different numbers, pass fit__batch_size=32 and predict__batch_size=1000 (for example). To auto-adjust the batch size to use all samples, pass batch_size=-1.
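
The fit__ and predict__ prefixes mentioned for batch_size can be pictured with a small pure-Python sketch. This is not SciKeras's implementation: route_params is a hypothetical helper, and the rule that a prefixed value overrides the unprefixed one for its destination is an assumption made for illustration.

```python
from typing import Any, Dict


def route_params(params: Dict[str, Any], destination: str) -> Dict[str, Any]:
    """Collect the keyword arguments intended for one destination.

    Hypothetical helper: unprefixed parameters apply everywhere, and a
    ``<destination>__`` prefixed parameter overrides the unprefixed value
    for that destination only (an assumed precedence rule).
    """
    prefix = destination + "__"
    # Start from the parameters that carry no routing prefix at all.
    routed = {k: v for k, v in params.items() if "__" not in k}
    # Then let prefixed parameters for this destination take over.
    for key, value in params.items():
        if key.startswith(prefix):
            routed[key[len(prefix):]] = value
    return routed


params = {"batch_size": 64, "fit__batch_size": 32, "predict__batch_size": 1000}
fit_kwargs = route_params(params, "fit")
predict_kwargs = route_params(params, "predict")
print(fit_kwargs)      # {'batch_size': 32}
print(predict_kwargs)  # {'batch_size': 1000}
```

With only batch_size=64 set, both destinations would receive 64, which matches the statement that the unprefixed value applies to both fit and predict.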

Attributes:

model_ : tf.keras.Model

The instantiated and compiled Keras Model. For pre-built models, this will just be a reference to the passed Model instance.

history_ : Dict[str, List[Any]]

Dictionary of the format {metric_str_name: [epoch_0_data, epoch_1_data, ..., epoch_n_data]}.
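
Because history_ maps each metric name to a list with one entry per epoch, it can be inspected with ordinary dictionary and list operations. The sketch below uses a hand-written dictionary in place of a fitted estimator's history_; the metric names and values are made up for illustration, and best_epoch is a hypothetical helper.

```python
from typing import Dict, List

# Made-up values standing in for est.history_ after four epochs.
history: Dict[str, List[float]] = {
    "loss": [0.90, 0.55, 0.40, 0.42],
    "accuracy": [0.60, 0.75, 0.82, 0.81],
}


def best_epoch(history: Dict[str, List[float]], metric: str = "loss") -> int:
    """Return the zero-based index of the epoch with the lowest metric value."""
    values = history[metric]
    return min(range(len(values)), key=values.__getitem__)


epoch = best_epoch(history)
print(epoch, history["accuracy"][epoch])  # 2 0.82
```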

initialized_ : bool

Checks if the estimator is initialized.

target_encoder_ : sklearn-transformer

Transformer used to pre/post process the target y.

feature_encoder_ : sklearn-transformer

Transformer used to pre/post process the features/input X.

n_outputs_expected_ : int

The number of outputs the Keras Model is expected to have, as determined by target_encoder_.

target_type_ : str

One of:

‘continuous’: y is an array-like of floats that are not all integers, and is 1d or a column vector.
‘continuous-multioutput’: y is a 2d array of floats that are not all integers, and both dimensions are of size > 1.
‘binary’: y contains <= 2 discrete values and is 1d or a column vector.
‘multiclass’: y contains more than two discrete values, is not a sequence of sequences, and is 1d or a column vector.
‘multiclass-multioutput’: y is a 2d array that contains more than two discrete values, is not a sequence of sequences, and both dimensions are of size > 1.
‘multilabel-indicator’: y is a label indicator matrix, an array of two dimensions with at least two columns, and at most 2 unique values.
‘unknown’: y is array-like but none of the above, such as a 3d array, sequence of sequences, or an array of non-sequence objects.

y_shape_ : Tuple[int]

Shape of the target y that the estimator was fitted on.

y_dtype_ : np.dtype

Dtype of the target y that the estimator was fitted on.

X_shape_ : Tuple[int]

Shape of the input X that the estimator was fitted on.

X_dtype_ : np.dtype

Dtype of the input X that the estimator was fitted on.

n_features_in_ : int

The number of features seen during fit.

Parameters:

model (None | Callable[[...], Model] | Model)
build_fn (None | Callable[[...], Model] | Model)
warm_start (bool)
random_state (int | RandomState | None)
optimizer (str | Optimizer | Type[Optimizer])
loss (str | Loss | Type[Loss] | Callable | None)
metrics (List[str | Metric | Type[Metric] | Callable] | None)
batch_size (int | None)
validation_batch_size (int | None)
verbose (int)
callbacks (List[Callback | Type[Callback]] | None)
validation_split (float)
shuffle (bool)
run_eagerly (bool)
epochs (int)

property current_epoch: int

Returns the current training epoch.

Returns:

int: Current training epoch.

property feature_encoder

Retrieve a transformer for features / X.

Metadata will be collected from get_metadata if the transformer implements that method. Override this method to implement a custom data transformer for the features.

Returns:

sklearn transformer: Transformer implementing the sklearn transformer interface.

fit(X, y, sample_weight=None, **kwargs)

Constructs a new model using the model parameter and fits it to (X, y).

Parameters:

X : Union[array-like, sparse matrix, dataframe] of shape (n_samples, n_features): Training samples, where n_samples is the number of samples and n_features is the number of features.
y : Union[array-like, dataframe] of shape (n_samples,) or (n_samples, n_outputs): True labels for X.
sample_weight : array-like of shape (n_samples,), default=None: Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
**kwargs : Dict[str, Any]: Extra arguments to route to Model.fit.

Returns:

BaseWrapper: A reference to the instance, which allows chained calls (for example, est.fit(X, y).predict(X)).

Return type:

BaseWrapper

Warning

Passing estimator parameters as keyword arguments (i.e., as **kwargs) to fit is not supported by the Scikit-Learn API, and will be removed in a future version of SciKeras. These parameters can also be specified by prefixing fit__ to a parameter at initialization (BaseWrapper(..., fit__batch_size=32, predict__batch_size=1000)) or by using set_params (est.set_params(fit__batch_size=32, predict__batch_size=1000)).

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict: Parameter names mapped to their values.

initialize(X, y=None)

Initialize the model without any fitting.

You only need to call this method if you explicitly do not want to do any fitting (for example, with a pretrained model). You should _not_ call this right before calling fit; calling fit will do this automatically.

Parameters:

X : Union[array-like, sparse matrix, dataframe] of shape (n_samples, n_features): Training samples, where n_samples is the number of samples and n_features is the number of features.
y : Union[array-like, dataframe] of shape (n_samples,) or (n_samples, n_outputs), default None: True labels for X.

Returns:

BaseWrapper: A reference to the BaseWrapper instance for chained calling.

Return type:

BaseWrapper

property initialized_: bool

Checks if the estimator is initialized.

Returns:

bool: True if the estimator is initialized (i.e., it can be used for inference or is ready to train), otherwise False.

partial_fit(X, y, sample_weight=None, **kwargs)

Fit the estimator for a single epoch, preserving the current training history and model parameters.

Parameters:

X : Union[array-like, sparse matrix, dataframe] of shape (n_samples, n_features): Training samples, where n_samples is the number of samples and n_features is the number of features.
y : Union[array-like, dataframe] of shape (n_samples,) or (n_samples, n_outputs): True labels for X.
sample_weight : array-like of shape (n_samples,), default=None: Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.
**kwargs : Dict[str, Any]: Extra arguments to route to Model.fit.

Returns:

BaseWrapper: A reference to the instance, which allows chained calls (for example, instance.partial_fit(X, y).predict(X)).

Return type:

BaseWrapper
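
The documented interplay between fit, warm_start, and partial_fit can be illustrated with a toy stand-in class. This sketch is not SciKeras code: ToyEstimator is hypothetical, a single float stands in for the model parameters, and only the epoch and reset bookkeeping described in this reference is mimicked.

```python
class ToyEstimator:
    """Toy stand-in that mimics only the documented epoch/reset bookkeeping."""

    def __init__(self, epochs: int = 1, warm_start: bool = False):
        self.epochs = epochs
        self.warm_start = warm_start
        self.weight = 0.0          # stands in for the model parameters
        self.current_epoch = 0

    def _train_one_epoch(self) -> None:
        self.weight += 1.0         # pretend each epoch nudges the weights
        self.current_epoch += 1

    def fit(self) -> "ToyEstimator":
        if not self.warm_start:
            self.weight = 0.0      # cold start: the whole model is reset
        self.current_epoch = 0     # the epoch counter restarts either way
        for _ in range(self.epochs):
            self._train_one_epoch()
        return self

    def partial_fit(self) -> "ToyEstimator":
        self._train_one_epoch()    # one epoch, continuing from current state
        return self


cold = ToyEstimator(epochs=3).fit().fit()
warm = ToyEstimator(epochs=3, warm_start=True).fit().fit()
print(cold.weight, cold.current_epoch)  # 3.0 3
print(warm.weight, warm.current_epoch)  # 6.0 3
warm.partial_fit()
print(warm.weight, warm.current_epoch)  # 7.0 4
```

The second fit discards the first run's weights only in the cold case, while partial_fit always adds exactly one epoch on top of whatever state exists.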

predict(X, **kwargs)

Returns predictions for the given test data.

Parameters:

X : Union[array-like, sparse matrix, dataframe] of shape (n_samples, n_features): Input samples, where n_samples is the number of samples and n_features is the number of features.
**kwargs : Dict[str, Any]: Extra arguments to route to Model.predict.

Returns:

array-like: Predictions, of shape (n_samples,) or (n_samples, n_outputs).

Warning

Passing estimator parameters as keyword arguments (i.e., as **kwargs) to predict is not supported by the Scikit-Learn API, and will be removed in a future version of SciKeras. These parameters can also be specified by prefixing predict__ to a parameter at initialization (BaseWrapper(..., fit__batch_size=32, predict__batch_size=1000)) or by using set_params (est.set_params(fit__batch_size=32, predict__batch_size=1000)).

score(X, y, sample_weight=None)

Returns the score on the given test data and labels.

No default scoring function is implemented in BaseWrapper; you must subclass and implement one.

Parameters:

X : Union[array-like, sparse matrix, dataframe] of shape (n_samples, n_features): Test input samples, where n_samples is the number of samples and n_features is the number of features.
y : Union[array-like, dataframe] of shape (n_samples,) or (n_samples, n_outputs): True labels for X.
sample_weight : array-like of shape (n_samples,), default=None: Array of weights that are assigned to individual samples. If not provided, then each sample is given unit weight.

Returns:

float: Score for the test data set.

Return type:

float

static scorer(y_true, y_pred, **kwargs)

Scoring function for model.

This is not implemented in BaseWrapper; it exists as a stub for documentation.

Parameters:

y_true : array-like of shape (n_samples,) or (n_samples, n_outputs): True labels.
y_pred : array-like of shape (n_samples,) or (n_samples, n_outputs): Predicted labels.
**kwargs : dict: Extra parameters passed to the scorer.

Returns:

float: Score for the test data set.

Return type:

float
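
Since BaseWrapper leaves scorer unimplemented, a subclass has to supply a function with this signature. The following is a minimal pure-Python sketch of what such a function might look like for 1d labels; accuracy_scorer is a hypothetical name, the choice of accuracy as the metric is an assumption, and the extra keyword arguments are simply ignored here.

```python
from typing import Sequence


def accuracy_scorer(y_true: Sequence, y_pred: Sequence, **kwargs) -> float:
    """Fraction of predictions that match the true labels exactly.

    Hypothetical scorer for 1d labels; **kwargs is accepted to match the
    documented signature but is not used in this sketch.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot score an empty sample")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


result = accuracy_scorer([0, 1, 1, 0], [0, 1, 0, 0])
print(result)  # 0.75
```

A subclass could expose a function like this as its static scorer method so that score has something to call.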

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object. This also supports routed parameters, e.g., classifier__optimizer__learning_rate.

Parameters:

**params : dict: Estimator parameters.

Returns:

BaseWrapper: Estimator instance.

Return type:

BaseWrapper
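
The <component>__<parameter> naming convention can be read as a path that is peeled apart at each double underscore. The sketch below shows only that naming convention in plain Python; split_param and walk are hypothetical helpers, and the sketch says nothing about how scikit-learn or SciKeras actually store or apply the values.

```python
from typing import List, Tuple


def split_param(key: str) -> Tuple[str, str]:
    """Split '<component>__<parameter>' at the first double underscore."""
    component, _, rest = key.partition("__")
    return component, rest


def walk(key: str) -> List[str]:
    """Peel a parameter name apart one level at a time."""
    steps = []
    while key:
        head, key = split_param(key)
        steps.append(head)
    return steps


first = split_param("classifier__optimizer__learning_rate")
path = walk("classifier__optimizer__learning_rate")
print(first)  # ('classifier', 'optimizer__learning_rate')
print(path)   # ['classifier', 'optimizer', 'learning_rate']
```

Single underscores, as in learning_rate, are left untouched; only the double underscore acts as a separator.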

property target_encoder

Retrieve a transformer for targets / y.

Metadata will be collected from get_metadata if the transformer implements that method. Override this method to implement a custom data transformer for the target.

Returns:

target_encoder: Transformer implementing the sklearn transformer interface.