TargetEncoder

class lightautoml.transformers.categorical.TargetEncoder(alphas=(0.5, 1.0, 2.0, 5.0, 10.0, 50.0, 250.0, 1000.0))

Bases: LAMLTransformer

Out-of-fold target encoding.

Limitations:

  • Requires a .folds attribute on the dataset: an array of int fold indices from 0 to n_folds-1.

  • Works only on label-encoded features, so apply it after label encoding.

Parameters:

alphas (Sequence[float]) – Candidate smoothing coefficients.
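The idea of out-of-fold encoding with a smoothing coefficient can be sketched as follows. This is a minimal illustration, not LightAutoML's implementation: it assumes the common smoothed-mean formula (category target sum + alpha * prior) / (category count + alpha), computed only from rows outside the current fold. The function name oof_target_encode is hypothetical, and plain Python lists stand in for dataset arrays.

```python
def oof_target_encode(codes, target, folds, alpha):
    """Encode each row using target statistics from rows in the other folds.

    codes  - label-encoded category per row (ints)
    target - target value per row
    folds  - fold index per row, values 0..n_folds-1
    alpha  - smoothing coefficient; larger values pull toward the prior
    """
    n_folds = max(folds) + 1
    encoded = [0.0] * len(codes)
    for fold in range(n_folds):
        # Collect statistics only from rows outside the current fold.
        sums, counts = {}, {}
        total, n_out = 0.0, 0
        for c, y, f in zip(codes, target, folds):
            if f != fold:
                sums[c] = sums.get(c, 0.0) + y
                counts[c] = counts.get(c, 0) + 1
                total += y
                n_out += 1
        prior = total / n_out  # mean target of the out-of-fold rows
        # Encode the rows that belong to the current fold.
        for i, (c, f) in enumerate(zip(codes, folds)):
            if f == fold:
                encoded[i] = (sums.get(c, 0.0) + alpha * prior) / (counts.get(c, 0) + alpha)
    return encoded


codes = [0, 0, 1, 1, 0, 1]
target = [1, 0, 1, 1, 1, 0]
folds = [0, 1, 0, 1, 0, 1]
enc = oof_target_encode(codes, target, folds, alpha=1.0)
```

Because each row is encoded without seeing its own target, the encoded feature leaks less target information than a plain per-category mean would.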

static binary_score_func(candidates, target)

Score the candidate alphas with the logloss metric.

Parameters:
  • candidates (ndarray) – Candidate out-of-fold (oof) encodings.

  • target (ndarray) – Target array.

Return type:

int

Returns:

Index of best encoder.
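As a rough sketch of what selecting a candidate by logloss can look like: the helper below is hypothetical and is not the library's code. It assumes candidates holds one row per sample and one column per alpha, uses nested lists in place of an ndarray, and clips probabilities to keep the logarithm finite.

```python
import math


def best_by_logloss(candidates, target, eps=1e-7):
    """Return the index of the column with the lowest mean logloss.

    candidates - one row per sample, one column per candidate alpha (assumed layout)
    target     - binary labels (0 or 1), one per row
    """
    n_candidates = len(candidates[0])
    scores = [0.0] * n_candidates
    for row, y in zip(candidates, target):
        for j, p in enumerate(row):
            p = min(max(p, eps), 1 - eps)  # clip so log() stays finite
            scores[j] -= y * math.log(p) + (1 - y) * math.log(1 - p)
    scores = [s / len(target) for s in scores]
    return scores.index(min(scores))


candidates = [
    [0.9, 0.6],
    [0.2, 0.5],
    [0.8, 0.5],
]
target = [1, 0, 1]
best = best_by_logloss(candidates, target)
```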

static reg_score_func(candidates, target)

Score the candidate alphas with the MSE metric.

Parameters:
  • candidates (ndarray) – Candidate out-of-fold (oof) encodings.

  • target (ndarray) – Target array.

Return type:

int

Returns:

Index of best encoder.
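The regression case can be sketched the same way, with mean squared error as the score. Again, best_by_mse is a hypothetical helper under the assumption that candidates has one row per sample and one column per alpha; nested lists stand in for an ndarray.

```python
def best_by_mse(candidates, target):
    """Return the index of the column with the lowest mean squared error.

    candidates - one row per sample, one column per candidate alpha (assumed layout)
    target     - numeric target, one value per row
    """
    n_candidates = len(candidates[0])
    scores = [0.0] * n_candidates
    for row, y in zip(candidates, target):
        for j, pred in enumerate(row):
            scores[j] += (pred - y) ** 2
    scores = [s / len(target) for s in scores]
    return scores.index(min(scores))


candidates = [
    [10.0, 11.5],
    [20.0, 19.0],
    [30.0, 31.5],
]
target = [11.0, 19.0, 31.0]
best = best_by_mse(candidates, target)
```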

fit(dataset)

Fit encoder.

fit_transform(dataset)

Compute the out-of-fold (oof) encoding and save the encoding statistics for use on new data.

Parameters:

dataset (Union[NumpyDataset, PandasDataset]) – Pandas or Numpy dataset of label-encoded categorical features.

Return type:

NumpyDataset

Returns:

NumpyDataset with target-encoded features.

transform(dataset)

Transform categorical dataset to target encoding.

Parameters:

dataset (Union[NumpyDataset, PandasDataset]) – Pandas or Numpy dataset of categorical features.

Return type:

Union[NumpyDataset, CSRSparseDataset]

Returns:

Numpy dataset with encoded labels.
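To illustrate how statistics saved at fit time might be applied to new data, here is a minimal sketch. It is not the library's code: fit_encoding_table and apply_encoding_table are hypothetical names, the smoothed-mean formula is the commonly used one, and falling back to the global prior for unseen codes is an assumption.

```python
def fit_encoding_table(codes, target, alpha):
    """Build a per-category smoothed mean from the full training data."""
    prior = sum(target) / len(target)  # global target mean
    sums, counts = {}, {}
    for c, y in zip(codes, target):
        sums[c] = sums.get(c, 0.0) + y
        counts[c] = counts.get(c, 0) + 1
    table = {c: (sums[c] + alpha * prior) / (counts[c] + alpha) for c in sums}
    return table, prior


def apply_encoding_table(codes, table, prior):
    """Look up each code; unseen codes fall back to the prior (an assumption)."""
    return [table.get(c, prior) for c in codes]


train_codes = [0, 0, 1, 1, 0, 1]
train_target = [1, 1, 1, 0, 1, 0]
table, prior = fit_encoding_table(train_codes, train_target, alpha=1.0)

new_codes = [1, 0, 2]  # code 2 never appeared in training
new_encoded = apply_encoding_table(new_codes, table, prior)
```

Unlike the out-of-fold pass used on training rows, new rows have no target of their own to leak, so a single lookup table built from all training rows is sufficient in this sketch.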