NpIterativeFeatureSelector

class lightautoml.pipelines.selection.permutation_importance_based.NpIterativeFeatureSelector(feature_pipeline, ml_algo=None, imp_estimator=None, fit_on_holdout=True, feature_group_size=5, max_features_cnt_in_result=None)

Bases: lightautoml.pipelines.selection.base.SelectionPipeline

Select features sequentially using chunks to find the best combination of chunks.

The general idea of the algorithm is to check groups of features one at a time, in order of feature importance. If adding a group improves the quality of the model, the group is selected; otherwise it is ignored.
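The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: the function iterative_chunk_selection and the score_fn callback are hypothetical names, the features are assumed to be already sorted by importance (most important first), a higher score is assumed to be better, and the first chunk is assumed to always be kept as the baseline.

```python
from typing import Callable, List, Optional, Sequence


def iterative_chunk_selection(
    ordered_features: Sequence[str],
    score_fn: Callable[[List[str]], float],
    feature_group_size: int = 5,
    max_features_cnt_in_result: Optional[int] = None,
) -> List[str]:
    """Greedy chunk-wise selection (illustrative sketch, not the library code).

    ordered_features: feature names sorted by importance, most important first.
    score_fn: hypothetical callback that fits a model on the given features
        and returns a validation score (assumed: higher is better).
    """
    if feature_group_size < 1:
        raise ValueError("feature_group_size must be >= 1")

    # Split the importance-ordered features into consecutive chunks.
    chunks = [
        list(ordered_features[i : i + feature_group_size])
        for i in range(0, len(ordered_features), feature_group_size)
    ]
    if not chunks:
        return []

    # Assumption: the most important chunk is always kept as the baseline.
    selected = chunks[0]
    best_score = score_fn(selected)

    for chunk in chunks[1:]:
        # Stop once the limit on the number of selected features is reached.
        if (
            max_features_cnt_in_result is not None
            and len(selected) >= max_features_cnt_in_result
        ):
            break
        candidate = selected + chunk
        score = score_fn(candidate)
        # Keep the chunk only if it improves the score; otherwise ignore it.
        if score > best_score:
            selected, best_score = candidate, score

    return selected


# Toy scoring function: rewards "useful" features, penalizes feature count.
USEFUL = {"a", "b", "f"}


def toy_score(features: List[str]) -> float:
    return sum(1.0 for f in features if f in USEFUL) - 0.1 * len(features)


result = iterative_chunk_selection(list("abcdefgh"), toy_score, feature_group_size=2)
print(result)
```

In the toy run the chunk containing only uninformative features is skipped, while a later chunk that contains a useful feature is accepted as a whole, including its uninformative member. Smaller chunks give finer-grained selection at the cost of more model fits.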

__init__(feature_pipeline, ml_algo=None, imp_estimator=None, fit_on_holdout=True, feature_group_size=5, max_features_cnt_in_result=None)
Parameters
  • feature_pipeline (FeaturesPipeline) – Composition of feature transforms.

  • ml_algo (Optional[MLAlgo]) – Tuple (MLAlgo, ParamsTuner).

  • imp_estimator (Optional[ImportanceEstimator]) – Feature importance estimator.

  • fit_on_holdout (bool) – Whether to use the holdout iterator.

  • feature_group_size (Optional[int]) – Chunk size, i.e. the number of features in each group that is checked.

  • max_features_cnt_in_result (Optional[int]) – Limit on the number of features after selection; once it is reached, selection stops.

perform_selection(train_valid=None)

Select features iteratively by checking model quality for the currently selected features plus a new group.

Parameters

train_valid (Optional[TrainValidIterator]) – Iterator for dataset.