TabularNLPAutoML
- class lightautoml.automl.presets.text_presets.TabularNLPAutoML(task, timeout=3600, memory_limit=16, cpu_limit=4, gpu_ids='all', debug=False, timing_params=None, config_path=None, general_params=None, reader_params=None, read_csv_params=None, nested_cv_params=None, tuning_params=None, selection_params=None, nn_params=None, lgb_params=None, cb_params=None, rf_params=None, linear_l2_params=None, nn_pipeline_params=None, gbm_pipeline_params=None, linear_pipeline_params=None, text_params=None, tfidf_params=None, autonlp_params=None)
Bases: TabularAutoML
Classic preset for working with tabular and text data.
Supported data roles: numbers, dates, categories, text.
Limitations: no memory management.
GPU support: catboost/lightgbm (if installed for GPU) and NN model training.
Commonly, _params kwargs (e.g. timing_params) are set via a config file (the config_path argument). If you need to change just a few params, you can pass them as a dict of dicts, like JSON. For the available params and their descriptions, see the default config template. To generate a config template, call
TabularNLPAutoML.get_config('config_path.yml').
- Parameters:
task (Task) – Task to solve.
timeout (int) – Timeout in seconds.
memory_limit (int) – Memory limit that is passed to each automl.
cpu_limit (int) – CPU limit that is passed to each automl.
gpu_ids (Optional[str]) – GPU IDs that are passed to each automl.
debug (bool) – To catch running model exceptions or not.
read_csv_params (Optional[dict]) – Params to pass to pandas.read_csv (case of train/predict from file).
nested_cv_params (Optional[dict]) – Param dict for nested cross-validation.
selection_params (Optional[dict]) – Params of feature selection.
nn_params (Optional[dict]) – Params of neural network model.
nn_pipeline_params (Optional[dict]) – Params of feature generation for neural network models.
gbm_pipeline_params (Optional[dict]) – Params of feature generation for boosting models.
linear_pipeline_params (Optional[dict]) – Params of feature generation for linear models.
text_params (Optional[dict]) – General params of text features.
autonlp_params (Optional[dict]) – Params of text embeddings features.
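The "dict of dicts" override style described above can be sketched as follows. This is a minimal illustration, not the library's recommended setup: the helper names `build_preset_kwargs` and `make_automl` are hypothetical, and the inner keys (`use_algos`, `lang`) are assumptions that should be checked against the generated config template. The library import is deferred so the kwargs can be inspected without LightAutoML installed.

```python
def build_preset_kwargs(timeout=600, cpu_limit=2):
    """Collect constructor arguments for the preset in one place.

    The nested keys below are assumptions; verify them against the
    template produced by TabularNLPAutoML.get_config(...).
    """
    return {
        "timeout": timeout,          # seconds
        "cpu_limit": cpu_limit,
        # Each *_params argument is itself a dict ("dict of dicts").
        "general_params": {"use_algos": [["linear_l2", "lgb"]]},
        "text_params": {"lang": "en"},
    }


def make_automl(task_name="binary", **overrides):
    """Create the preset; imports lazily so this module loads without LightAutoML."""
    from lightautoml.automl.presets.text_presets import TabularNLPAutoML
    from lightautoml.tasks import Task

    kwargs = build_preset_kwargs()
    kwargs.update(overrides)  # caller-supplied values win over the defaults
    return TabularNLPAutoML(task=Task(task_name), **kwargs)
```

Only the params named in the dicts are overridden; everything else is expected to fall back to the default config.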
- create_automl(**fit_args)
Create a basic automl instance.
- Parameters:
**fit_args – Contains all information needed to create the automl instance.
- predict(data, features_names=None, batch_size=None, n_jobs=1)
Get dataset with predictions.
Almost the same as lightautoml.automl.base.AutoML.predict on a new dataset, with additional features.
Additional features: working with different data formats. Supported now:
Parallel inference: you can pass n_jobs to speed up prediction (requires more RAM).
Batch inference: you can pass batch_size to decrease RAM usage (may take longer).
- Parameters:
- Return type:
- Returns:
Dataset with predictions.
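The trade-off between n_jobs and batch_size can be sketched with a small wrapper. The helper names `predict_options` and `run_inference` and the default chunk size of 1000 rows are hypothetical; the wrapper also assumes the returned dataset exposes its predictions through a `.data` attribute, which should be verified against the installed version.

```python
def predict_options(low_memory=False, n_cores=1, chunk_rows=1000):
    """Choose predict() keyword arguments.

    n_jobs > 1 speeds up prediction at the cost of RAM;
    batch_size lowers RAM usage at the cost of time.
    chunk_rows is an illustrative value, not a library default.
    """
    options = {"n_jobs": n_cores}
    if low_memory:
        options["batch_size"] = chunk_rows
    return options


def run_inference(automl, data, **choices):
    """Call predict() on a fitted preset and return the raw predictions."""
    preds = automl.predict(data, **predict_options(**choices))
    # Assumption: the returned dataset stores predictions in `.data`.
    return preds.data
```

With a fitted preset, `run_inference(automl, test_df, low_memory=True)` would request batched inference, where `automl` and `test_df` are placeholders for the caller's own objects.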