AutoMLPreset

class lightautoml.automl.presets.base.AutoMLPreset(task, timeout=3600, memory_limit=16, cpu_limit=4, gpu_ids='all', debug=False, timing_params=None, config_path=None, **kwargs)

Bases: AutoML

Base class for AutoML presets.

It is almost the same as AutoML, but with delayed initialization: initialization starts on fit, and some params are inferred from the data. A preset should be defined via the .create_automl method, and params should be set via a YAML config. The most useful case is end-to-end model development.

Commonly, the _params kwargs (e.g. timing_params) are set via a config file (the config_path argument). If you need to change just a few params, you can pass them as a dict of dicts, like JSON. To see the available params and their descriptions, look at the default config template. To generate a config template, call SomePreset.get_config('config_path.yml').
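The dict-of-dicts override can be pictured as a recursive merge on top of the defaults loaded from the config file. The sketch below only illustrates that idea and is not LightAutoML's own code: the helper deep_update and the keys mode, tuning_rate and use_algos are hypothetical names chosen for the example.

```python
from copy import deepcopy


def deep_update(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with `overrides` merged in recursively."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing.
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical defaults, standing in for what a config template might contain.
default_config = {
    "timing_params": {"mode": 1, "tuning_rate": 0.7},
    "general_params": {"use_algos": "auto"},
}

# Only the params that need to change are passed, as a dict of dicts.
user_overrides = {"timing_params": {"mode": 0}}

config = deep_update(default_config, user_overrides)
print(config["timing_params"])
```

Under this assumption, untouched keys keep their default values, so a partial override never has to repeat the whole config.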

Example

>>> automl = SomePreset(Task('binary'), timeout=3600)
>>> automl.fit_predict(data, roles={'target': 'TARGET'})

Parameters:

task (Task) – Task to solve.
timeout (int) – Timeout in seconds.
memory_limit (int) – Memory limit that is passed to each automl.
cpu_limit (int) – CPU limit that is passed to each automl.
gpu_ids (Optional[str]) – GPU IDs that are passed to each automl.
verbose – Controls the verbosity: the higher, the more messages. <1: messages are not displayed; >=1: the computation process for layers is displayed; >=2: information about fold processing is also displayed; >=3: the hyperparameter optimization process is also displayed; >=4: the training process for every algorithm is displayed.
timing_params (Optional[dict]) – Timing param dict.
config_path (Optional[str]) – Path to config file.
**kwargs (Any) – Not used.

classmethod get_config(path=None)

Create new config template.

Parameters: path (Optional[str]) – Path to config.
Return type: Optional[dict]
Returns: Config.

create_automl(**fit_args)

Abstract method that defines how to build the automl.

Here you should create all automl components, such as readers, levels, timers, and blenders. The ._initialize method should be called at the end to create the automl.

Parameters: **fit_args – Params that are passed to the .fit_predict method.
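The delayed-initialization pattern described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's implementation: SketchPreset, its attributes, the component placeholders and the row-count rule are all hypothetical. Only the overall flow (store params in the constructor, build components in create_automl from the fit arguments, finish with _initialize) follows the text.

```python
class SketchPreset:
    """Hypothetical stand-in for a preset; not the LightAutoML class."""

    def __init__(self, task, timeout=3600):
        # Construction only stores params; no components are built yet.
        self.task = task
        self.timeout = timeout
        self.reader = None
        self.levels = None
        self.initialized = False

    def _initialize(self, reader, levels):
        # Stand-in for the final step that assembles the automl.
        self.reader = reader
        self.levels = levels
        self.initialized = True

    def create_automl(self, **fit_args):
        # Components can depend on the data passed to fit_predict.
        n_rows = len(fit_args["train_data"])
        reader = {"task": self.task, "n_rows": n_rows}
        levels = [["linear", "gbm"]]
        # Hypothetical rule: add a second level only for larger datasets.
        if n_rows >= 1000:
            levels.append(["gbm"])
        # Called at the end, once every component exists.
        self._initialize(reader, levels)

    def fit_predict(self, train_data, roles):
        # Initialization is triggered by the fit call, not by the constructor.
        self.create_automl(train_data=train_data, roles=roles)
        # A real preset would train here; the sketch returns dummy predictions.
        return [0.0] * len(train_data)


preset = SketchPreset("binary")
initialized_before_fit = preset.initialized
rows = [{"x": 1, "TARGET": 0}, {"x": 2, "TARGET": 1}]
preds = preset.fit_predict(rows, roles={"target": "TARGET"})
print(initialized_before_fit, preset.initialized, preset.levels)
```

Deferring construction this way lets a preset choose its components from properties of the training data, which are unknown when the object is created.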

fit_predict(train_data, roles, train_features=None, cv_iter=None, valid_data=None, valid_features=None, verbose=0, path_to_save=None)

Fit on the input data and make predictions on the validation part.

Parameters:

train_data (Any) – Dataset to train.
roles (dict) – Roles dict.
train_features (Optional[Sequence[str]]) – Feature names, if they can’t be inferred from train_data.
cv_iter (Optional[Iterable]) – Custom cv-iterator. For example, TimeSeriesIterator.
valid_data (Optional[Any]) – Optional validation dataset.
valid_features (Optional[Sequence[str]]) – Optional validation dataset feature names, if they can’t be inferred from valid_data.
verbose (int) – Verbosity level that is passed to each automl.
path_to_save (Optional[str]) – The path that joblib will use to save the model after the fit stage is completed. Use the *.joblib format.

Return type:

LAMLDataset

Returns:

Dataset with predictions. Call .data to get predictions array.

static set_verbosity_level(verbose)

Verbosity level setter.

Parameters: verbose (int) – Controls the verbosity: the higher, the more messages. <1: messages are not displayed; >=1: the computation process for layers is displayed; >=2: information about fold processing is also displayed; >=3: the hyperparameter optimization process is also displayed; >=4: the training process for every algorithm is displayed.
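The verbosity thresholds are cumulative: each level adds one kind of message to those shown at the levels below it. A minimal sketch of that rule follows. The function displayed_messages is a hypothetical helper for illustration and says nothing about how the library implements logging internally.

```python
def displayed_messages(verbose: int) -> list:
    """List the message kinds shown at a given verbosity level."""
    thresholds = [
        (1, "computation process for layers"),
        (2, "folds processing"),
        (3, "hyperparameters optimization"),
        (4, "training process for every algorithm"),
    ]
    # A message kind is shown once `verbose` reaches its threshold.
    return [name for level, name in thresholds if verbose >= level]


for level in (0, 2, 4):
    print(level, displayed_messages(level))
```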