LGBAdvancedPipeline

class lightautoml.pipelines.features.lgb_pipeline.LGBAdvancedPipeline(feats_imp=None, top_intersections=5, max_intersection_depth=3, subsample=None, multiclass_te_co=3, auto_unique_co=10, output_categories=False, fill_na=False, ascending_by_cardinality=False, use_groupby=False, groupby_types=['delta_median', 'delta_mean', 'min', 'max', 'std', 'mode', 'is_mode'], groupby_triplets=[], groupby_top_based_on='cardinality', groupby_top_categorical=3, groupby_top_numerical=3, **kwargs)

Bases: FeaturesPipeline, TabularDataFeatures

Creates an advanced feature pipeline for tree-based models.

Includes:

  • Handling of categorical and numeric features according to their role parameters.

  • Date handling: extracting seasonality features and creating date differences.

  • Creation of categorical intersections.
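As an illustration of what a categorical intersection is, the following minimal sketch combines the values of several categorical columns into new combined features, up to a given depth. It is not the library's implementation: the helper name `categorical_intersections`, the column names, and the `__`/`_` separators are all hypothetical and chosen only for this example.

```python
from itertools import combinations


def categorical_intersections(rows, columns, max_depth=3):
    """Build combined features from every group of 2..max_depth columns.

    Hypothetical helper for illustration; not part of LightAutoML.
    """
    result = {}
    for depth in range(2, max_depth + 1):
        for combo in combinations(columns, depth):
            # Feature name, e.g. "city__plan" (the separator is an assumption).
            name = "__".join(combo)
            # One combined value per row, e.g. "Oslo_pro".
            result[name] = ["_".join(str(row[c]) for c in combo) for row in rows]
    return result


rows = [
    {"city": "Oslo", "plan": "pro", "os": "linux"},
    {"city": "Rome", "plan": "free", "os": "mac"},
]
features = categorical_intersections(rows, ["city", "plan", "os"], max_depth=3)
for name, values in features.items():
    print(name, values)
```

With three columns and a depth of 3, the sketch yields three pairwise features and one triple. The number of combinations grows quickly with the number of columns, which is presumably why the pipeline limits both the number of categories considered and the intersection depth.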

Parameters:
  • feats_imp (Union[ImportanceEstimator, SelectionPipeline, None]) – Feature importances mapping.

  • top_intersections (int) – Maximum number of categories used to generate intersections.

  • max_intersection_depth (int) – Maximum depth of categorical intersections.

  • subsample (Union[float, int, None]) – Subsample used to calculate data statistics.

  • multiclass_te_co (int) – Cutoff on the number of classes that determines whether target encoding is used for categorical features in multiclass tasks.

  • auto_unique_co (int) – Cardinality cutoff; high-cardinality categorical features are switched to target encoding.
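The cardinality cutoff can be pictured with the small sketch below. It is an illustration under stated assumptions, not the library's code: the function `choose_encoding` is hypothetical, the strict `>` comparison is assumed, and the `"label"` fallback for low-cardinality features is a placeholder for whatever encoding the pipeline actually applies.

```python
def choose_encoding(values, auto_unique_co=10):
    """Pick an encoding name from the number of distinct values.

    Hypothetical helper: the comparison and the fallback are assumptions.
    """
    n_unique = len(set(values))
    # Above the cutoff, switch to target encoding, as the parameter description says.
    return "target" if n_unique > auto_unique_co else "label"


low_cardinality = ["a", "b", "a", "c"]
high_cardinality = [f"id_{i}" for i in range(50)]

print(choose_encoding(low_cardinality))
print(choose_encoding(high_cardinality))
```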

create_pipeline(train)

Creates the feature pipeline for tree-based models.

Parameters:

train (Union[PandasDataset, NumpyDataset]) – Dataset with training features.

Return type:

LAMLTransformer

Returns:

Transformer.