WhiteBoxPreset
- class lightautoml.automl.presets.whitebox_presets.WhiteBoxPreset(task, timeout=3600, memory_limit=16, cpu_limit=4, gpu_ids=None, timing_params=None, config_path=None, general_params=None, reader_params=None, read_csv_params=None, whitebox_params=None)[source]
Bases: AutoMLPreset
Preset for AutoWoE - logistic regression over binned features (scorecard).
Supported data roles - numbers, dates, categories.
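To make "logistic regression over binned features" concrete, here is a minimal sketch of the weight-of-evidence (WoE) encoding step that scorecard models are typically built on. It is not AutoWoE's implementation. The bin edges, the smoothing constant, the sign convention (log of negative share over positive share) and the helper names `bin_index` and `woe_by_bin` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_index(value, edges):
    """Return the index of the bin that `value` falls into.

    `edges` are sorted upper bounds; values above the last edge
    go to the final bin.
    """
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)


def woe_by_bin(values, targets, edges, smoothing=0.5):
    """Compute WoE per bin as ln(share of negatives / share of positives).

    `smoothing` is added to every bin count so that empty bins do not
    produce a division by zero (an illustrative choice, not AutoWoE's).
    """
    pos = defaultdict(float)
    neg = defaultdict(float)
    for value, target in zip(values, targets):
        b = bin_index(value, edges)
        if target == 1:
            pos[b] += 1
        else:
            neg[b] += 1

    n_bins = len(edges) + 1
    total_pos = sum(pos.values()) + smoothing * n_bins
    total_neg = sum(neg.values()) + smoothing * n_bins
    return {
        b: math.log(
            ((neg[b] + smoothing) / total_neg)
            / ((pos[b] + smoothing) / total_pos)
        )
        for b in range(n_bins)
    }


# Toy data: a numeric feature and a binary target (1 = event).
ages = [22, 25, 31, 38, 45, 52, 58, 63]
defaulted = [1, 1, 1, 0, 0, 1, 0, 0]
woe = woe_by_bin(ages, defaulted, edges=[30, 50])
```

In a scorecard, each raw feature value is replaced by the WoE of its bin, and a logistic regression is then fitted on the encoded columns, which keeps the model small and readable.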
Limitations:
Simple time management.
No memory management.
Working only with pandas.DataFrame.
No batch inference.
No text support.
No parallel execution.
No GPU usage.
No cross-validation scheme. Only holdout validation is supported (CV is created inside AutoWoE, but no out-of-fold predictions are returned).
Common use case: fit a lightweight, interpretable model for a binary classification task.
The _params kwargs (e.g. timing_params) are usually set via a config file (the config_path argument). If you need to change only a few parameters, you can pass them as a dict of dicts, like JSON. For the available parameters and their descriptions, see the default config template. To generate a config template, call
WhiteBoxPreset.get_config('config_path.yml').
- Parameters:
task (Task) – Task to solve.
timeout (int) – Timeout in seconds.
memory_limit (int) – Memory limit that is passed to each automl.
cpu_limit (int) – CPU limit that is passed to each automl.
gpu_ids (Optional[str]) – GPU IDs that are passed to each automl.
read_csv_params (Optional[dict]) – Params to pass to pandas.read_csv (case of train/predict from file).
whitebox_params (Optional[dict]) – Params of the WhiteBox algo (see the config file).
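The "dict of dicts" override mentioned above can be pictured as a recursive merge of a few user-supplied values over a full set of defaults. The sketch below shows that general idea only. It is not taken from the LightAutoML source, and the group and option names (`group_a`, `option_x`, and so on) are hypothetical placeholders, not real config keys.

```python
import copy


def merge_params(defaults, overrides):
    """Return a copy of `defaults` with `overrides` applied recursively.

    Nested dicts are merged key by key; any other value replaces
    the default outright. Neither input is modified.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_params(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults, standing in for values loaded from a config template.
default_config = {
    "group_a": {"option_x": 1, "option_y": True},
    "group_b": {"option_z": "auto"},
}

# Only the value that should change is supplied.
overrides = {"group_a": {"option_x": 5}}

config = merge_params(default_config, overrides)
```

With this pattern, only the changed value needs to be spelled out. Everything else keeps the value from the template.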
- property whitebox
Get wrapped AutoWoE object.
- Returns:
Model.
- create_automl(*args, **kwargs)[source]
Create basic WhiteBoxPreset instance from data.
- Parameters:
*args – Not used.
**kwargs – Everything passed to .fit_predict.
- fit_predict(train_data, roles, train_features=None, cv_iter=None, valid_data=None, valid_features=None, verbose=0, **fit_params)[source]
Fit and get prediction on validation dataset.
Almost the same as lightautoml.automl.base.AutoML.fit_predict.
Additional features - working with different data formats. Supported now:
- Parameters:
train_data (Any) – Dataset to train.
roles (dict) – Roles dict.
train_features (Optional[Sequence[str]]) – Optional feature names, if they can't be inferred from train_data.
cv_iter (Optional[Iterable]) – Custom cv-iterator. For example, TimeSeriesIterator.
valid_features (Optional[Sequence[str]]) – Optional validation dataset features, if they cannot be inferred from valid_data.
verbose (int) – Controls the verbosity: the higher, the more messages. <1 : messages are not displayed; >=1 : the computation process for layers is displayed; >=2 : the information about fold processing is also displayed; >=3 : the hyperparameter optimization process is also displayed; >=4 : the training process for every algorithm is displayed.
fit_params – Others.
- Return type:
- Returns:
Dataset with predictions. Call .data to get predictions array.
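A minimal usage sketch that ties the constructor and fit_predict together is shown below. The import paths and signatures follow this page. The `Task("binary")` call and the `"target"` key in the roles dict come from general LightAutoML usage rather than from this section, so treat them as assumptions. The helper names `make_roles` and `fit_whitebox` are hypothetical.

```python
def make_roles(target_column):
    """Build a minimal roles dict that only names the target column.

    The "target" key is assumed from common LightAutoML usage.
    """
    return {"target": target_column}


def fit_whitebox(train_df, target_column, timeout=600):
    """Fit a WhiteBoxPreset on a pandas.DataFrame and return predictions.

    Returns the fitted preset and the raw predictions array.
    """
    # Imported lazily so this sketch can be loaded without LightAutoML installed.
    from lightautoml.automl.presets.whitebox_presets import WhiteBoxPreset
    from lightautoml.tasks import Task

    automl = WhiteBoxPreset(Task("binary"), timeout=timeout)
    predictions = automl.fit_predict(train_df, roles=make_roles(target_column))

    # fit_predict returns a dataset object; .data holds the predictions array.
    return automl, predictions.data
```

Calling `fit_whitebox(train_df, "TARGET")` on a DataFrame with a binary `TARGET` column would return the fitted preset together with the validation predictions. The wrapped AutoWoE model is then available through the `whitebox` property.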