TabularDataFeatures

class lightautoml.pipelines.features.base.TabularDataFeatures(**kwargs)

Bases: object

Helper class containing basic feature transformations for tabular data.

This class can be shared by all tabular feature pipelines to simplify the .create_automl definition.

__init__(**kwargs)

Set default parameters for tabular pipeline constructor.

Parameters

**kwargs – Additional parameters.

static get_cols_for_datetime(train)

Get datetime columns to calculate features.

Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

Return type

Tuple[List[str], List[str]]

Returns

Two lists of feature names: base dates and common dates.

get_datetime_diffs(train)

Compute differences between all datetimes and the base date.

Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

Return type

Optional[LAMLTransformer]

Returns

Transformer or None if no required features.
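
The idea of differencing dates against a base date can be illustrated with a minimal standard-library sketch. This is not LightAutoML's implementation: the helper name datetime_diffs, the output column naming, and the choice of seconds as the unit are all assumptions made for illustration.

```python
from datetime import datetime


def datetime_diffs(rows, base_col, other_cols):
    """Hypothetical helper: per-row difference in seconds between each
    date column in other_cols and the base date column."""
    out = []
    for row in rows:
        base = row[base_col]
        out.append({
            f"diff_{col}__{base_col}": (row[col] - base).total_seconds()
            for col in other_cols
        })
    return out


rows = [
    {"signup": datetime(2021, 1, 1), "purchase": datetime(2021, 1, 3)},
    {"signup": datetime(2021, 2, 1), "purchase": datetime(2021, 2, 1, 12)},
]
diffs = datetime_diffs(rows, base_col="signup", other_cols=["purchase"])
```

Here the first row yields a two-day difference and the second a twelve-hour difference, both expressed in seconds.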

get_datetime_seasons(train, outp_role=None)

Get season params from dates.

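Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

outp_role – Role for the output features. Defaults to None.

Return type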

Optional[LAMLTransformer]

Returns

Transformer or None if no required features.
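
As a rough illustration of what "season params" can mean, the sketch below extracts calendar components from a single datetime. It is an assumption-laden example, not the library's transformer: the function name season_features and the particular set of components are hypothetical.

```python
from datetime import datetime


def season_features(value):
    """Hypothetical helper: split a datetime into seasonal components."""
    return {
        "year": value.year,
        "month": value.month,
        "weekday": value.weekday(),  # Monday is 0, Sunday is 6
        "hour": value.hour,
    }


features = season_features(datetime(2021, 3, 15, 9, 30))
```

Each component can then be treated as a categorical or numeric feature, depending on the output role.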

static get_numeric_data(train, feats_to_select=None, prob=None)

Select numeric features.

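Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

feats_to_select – Features to select. Defaults to None.

prob – Defaults to None.

Return type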

Optional[LAMLTransformer]

Returns

Transformer.

static get_freq_encoding(train, feats_to_select=None)

Get frequency encoding part.

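Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

feats_to_select – Features to select. Defaults to None.

Return type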

Optional[LAMLTransformer]

Returns

Transformer.
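
Frequency encoding replaces each category with how often it occurs in the training data. The sketch below shows the general technique with the standard library only. The helper names and the choice to map unseen categories to 0 are assumptions, not LightAutoML behavior.

```python
from collections import Counter


def fit_freq_encoding(values):
    """Hypothetical helper: count occurrences of each category."""
    return dict(Counter(values))


def apply_freq_encoding(values, counts):
    """Replace each category with its training count (0 if unseen)."""
    return [counts.get(v, 0) for v in values]


train_values = ["a", "b", "a", "c", "a", "b"]
counts = fit_freq_encoding(train_values)
encoded = apply_freq_encoding(["a", "b", "z"], counts)
```

Fitting on the training data and applying the stored counts elsewhere keeps the encoding consistent between train and test sets.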

get_ordinal_encoding(train, feats_to_select=None)

Get the ordinal-encoded part.

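Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

feats_to_select – Features to select. Defaults to None.

Return type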

Optional[LAMLTransformer]

Returns

Transformer.
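
Ordinal encoding maps each category to an integer that respects a sort order. A minimal sketch of the technique follows. The helper names, the use of lexicographic order, and the reserved code 0 for unseen values are illustrative assumptions rather than the library's actual rules.

```python
def fit_ordinal_encoding(values):
    """Hypothetical helper: map sorted unique categories to 1, 2, 3, ..."""
    return {v: i for i, v in enumerate(sorted(set(values)), start=1)}


def apply_ordinal_encoding(values, mapping):
    """Encode categories; unseen values get the reserved code 0."""
    return [mapping.get(v, 0) for v in values]


mapping = fit_ordinal_encoding(["low", "high", "mid", "low"])
encoded = apply_ordinal_encoding(["mid", "low", "unknown"], mapping)
```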

get_categorical_raw(train, feats_to_select=None)

Get label encoded categories data.

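Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

feats_to_select – Features to select. Defaults to None.

Return type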

Optional[LAMLTransformer]

Returns

Transformer.

get_target_encoder(train)

Get the target encoder class for the dataset.

Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

Return type

Optional[type]

Returns

Target encoder class, or None.

get_binned_data(train, feats_to_select=None)

Get encoded quantiles of numeric features.

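Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

feats_to_select – Features to select. Defaults to None.

Return type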

Optional[LAMLTransformer]

Returns

Transformer.
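
Quantile binning turns a numeric feature into a small set of bin indices whose edges are quantiles of the training data. The following sketch uses only the standard library. The helper names and the default of four bins are assumptions, and the library's own binning may differ.

```python
from bisect import bisect_right
from statistics import quantiles


def fit_quantile_bins(values, n_bins=4):
    """Hypothetical helper: compute the n_bins - 1 inner quantile edges."""
    return quantiles(values, n=n_bins)


def apply_quantile_bins(values, edges):
    """Map each value to the index of the bin it falls into."""
    return [bisect_right(edges, v) for v in values]


edges = fit_quantile_bins([1, 2, 3, 4, 5, 6, 7, 8], n_bins=4)
bins = apply_quantile_bins([1, 3, 5, 8], edges)
```

The resulting bin indices can be passed on as an encoded categorical feature.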

get_categorical_intersections(train, feats_to_select=None)

Get transformer that implements categorical intersections.

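Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

feats_to_select – Features to select. Defaults to None.

Return type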

Optional[LAMLTransformer]

Returns

Transformer.
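
A categorical intersection combines the values of several categorical columns into one new category. The sketch below builds all pairwise combinations. The helper name, the separators, and the default degree of 2 are illustrative assumptions, not the library's settings.

```python
from itertools import combinations


def categorical_intersections(rows, cols, degree=2):
    """Hypothetical helper: join the values of every `degree`-sized
    combination of columns into a single crossed category."""
    out = []
    for row in rows:
        out.append({
            "__".join(combo): "_".join(str(row[c]) for c in combo)
            for combo in combinations(cols, degree)
        })
    return out


rows = [{"city": "Paris", "device": "mobile", "plan": "free"}]
crossed = categorical_intersections(rows, ["city", "device", "plan"])
```

Because the number of combinations grows quickly with the number of columns, such crossing is usually restricted to a small set of selected features.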

get_uniques_cnt(train, feats)

Get unique value counts.

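Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

feats – Features to count unique values for.

Return type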

Series

Returns

Series.
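
Counting unique values per feature can be sketched as follows. This example returns a plain dict instead of a Series so that it stays standard-library only. The helper name uniques_cnt and the row format are assumptions.

```python
def uniques_cnt(rows, feats):
    """Hypothetical helper: number of distinct values per feature."""
    return {f: len({row[f] for row in rows}) for f in feats}


rows = [
    {"city": "Paris", "device": "mobile"},
    {"city": "Rome", "device": "mobile"},
    {"city": "Paris", "device": "mobile"},
]
cardinality = uniques_cnt(rows, ["city", "device"])
```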

get_top_categories(train, top_n=5)

Get top categories by importance.

If feature importance is not defined, or features have the same importance, they are sorted by unique value counts. In the latter case, the init parameter ascending_by_cardinality defines the sort order: ascending or descending.

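Parameters

train (Union[PandasDataset, NumpyDataset]) – Dataset with train data.

top_n – Number of top categories to return. Defaults to 5.

Return type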

List[str]

Returns

List.
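
One possible reading of the ranking rule described above is sketched here: sort by importance first, then break ties (or rank everything, when no importance is given) by cardinality. The function signature and the dict inputs are hypothetical, and the library's actual tie-breaking may differ in detail.

```python
def top_categories(cardinality, importance=None, top_n=5,
                   ascending_by_cardinality=False):
    """Hypothetical helper: rank features by importance (descending),
    breaking ties by unique value count in the requested direction."""
    importance = importance or {}
    sign = 1 if ascending_by_cardinality else -1
    ranked = sorted(
        cardinality,
        key=lambda f: (-importance.get(f, 0.0), sign * cardinality[f]),
    )
    return ranked[:top_n]


cardinality = {"city": 50, "device": 3, "plan": 4}
importance = {"city": 0.2, "device": 0.7, "plan": 0.2}

by_importance = top_categories(cardinality, importance, top_n=2)
by_importance_asc = top_categories(
    cardinality, importance, top_n=2, ascending_by_cardinality=True
)
no_importance = top_categories(cardinality, top_n=2)
```

In this example "city" and "plan" share the same importance, so the cardinality direction decides which of the two makes the top two.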