LabelEncoder

class lightautoml.transformers.categorical.LabelEncoder(subs=None, random_state=42)

Bases: lightautoml.transformers.base.LAMLTransformer

Simple label encoder that assigns labels in order of frequency.

Labels are integers from 1 to n. Unknown categories are encoded as 0. NaN is treated as a category value in its own right.
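The encoding rule described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names `fit_frequency_labels` and `encode`, the `NAN_KEY` sentinel, the choice to give label 1 to the most frequent category, and the tie-breaking rule are all assumptions made for the example.

```python
import math
from collections import Counter

NAN_KEY = "__nan__"  # hypothetical sentinel so that every NaN lands in one category


def _key(value):
    # NaN != NaN in Python, so map every NaN to a single hashable key.
    if isinstance(value, float) and math.isnan(value):
        return NAN_KEY
    return value


def fit_frequency_labels(values):
    """Build a category -> label dict, labels 1..n by descending frequency (assumed order)."""
    counts = Counter(_key(v) for v in values)
    # Ties are broken by the string form of the category to keep the result deterministic.
    ordered = sorted(counts, key=lambda k: (-counts[k], str(k)))
    return {category: label for label, category in enumerate(ordered, start=1)}


def encode(values, mapping):
    """Encode values with the fitted dict; categories not seen during fitting become 0."""
    return [mapping.get(_key(v), 0) for v in values]


train = ["a", "a", "a", "b", "b", float("nan")]
mapping = fit_frequency_labels(train)
encoded = encode(["b", "a", "z", float("nan")], mapping)
```

Here `"z"` was never seen during fitting, so it falls back to 0, while NaN receives an ordinary label like any other category.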

__init__(subs=None, random_state=42)
Parameters
  • subs (Optional[int]) – Subsample size used to calculate frequencies. If None, the full data is used.

  • random_state (int) – Random state used to draw the subsample.
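The interplay of the two parameters can be illustrated with a short sketch: frequencies are counted on a seeded random subsample when `subs` is set, and on all rows otherwise. The function `subsample_counts` is hypothetical, and how the library actually draws its subsample is not stated here, so the sampling method below is an assumption.

```python
import random
from collections import Counter


def subsample_counts(values, subs=None, random_state=42):
    """Count category frequencies on at most `subs` randomly chosen values."""
    if subs is not None and subs < len(values):
        # A seeded generator makes the subsample reproducible across calls.
        rng = random.Random(random_state)
        values = rng.sample(values, subs)
    return Counter(values)


values = ["a"] * 6 + ["b"] * 4
full_counts = subsample_counts(values)          # subs=None: all 10 values are counted
sub_counts = subsample_counts(values, subs=5)   # only 5 sampled values are counted
```

With a fixed `random_state`, repeated calls on the same data yield the same counts, which keeps the resulting label order stable.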

fit(dataset)

Estimate label frequencies and create encoding dicts.

Parameters

dataset (Union[NumpyDataset, PandasDataset]) – Pandas or Numpy dataset of categorical features.

Returns

self.

transform(dataset)

Transform a categorical dataset to integer labels.

Parameters

dataset (Union[NumpyDataset, PandasDataset]) – Pandas or Numpy dataset of categorical features.

Return type

NumpyDataset

Returns

Numpy dataset with encoded labels.
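The fit/transform contract documented above (one encoding dict per feature, `fit` returning `self`, `transform` returning integer labels) can be mimicked with a small stand-in class. `TinyLabelEncoder` is hypothetical and works on lists of tuples rather than on `NumpyDataset` or `PandasDataset` objects. The descending-frequency order and tie-breaking are assumptions, and NaN handling is left out for brevity.

```python
from collections import Counter


class TinyLabelEncoder:
    """Hypothetical stand-in for illustration; not the LightAutoML class."""

    def fit(self, rows):
        # Build one encoding dict per column from the column's category counts.
        self.dicts = []
        for column in zip(*rows):
            counts = Counter(column)
            ordered = sorted(counts, key=lambda k: (-counts[k], str(k)))
            self.dicts.append({cat: i for i, cat in enumerate(ordered, start=1)})
        return self  # returning self allows fit(...).transform(...) chaining

    def transform(self, rows):
        # Look up each value in its column's dict; unseen categories map to 0.
        return [
            [mapping.get(value, 0) for mapping, value in zip(self.dicts, row)]
            for row in rows
        ]


train_rows = [("red", "S"), ("red", "M"), ("blue", "M")]
encoder = TinyLabelEncoder().fit(train_rows)
encoded_rows = encoder.transform([("blue", "M"), ("green", "S")])
```

Each column is encoded independently, so the same integer can stand for different categories in different columns.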