TfidfTextTransformer
- class lightautoml.transformers.text.TfidfTextTransformer(default_params=None, freeze_defaults=True, subs=None, random_state=42)[source]
Bases: lightautoml.transformers.text.TunableTransformer
Simple Tfidf vectorizer.
- __init__(default_params=None, freeze_defaults=True, subs=None, random_state=42)[source]
- Parameters
Note
The behaviour of freeze_defaults:
True: params may be rewritten depending on dataset.
False: params may be changed only manually or with tuning.
- init_params_on_input(dataset)[source]
Get transformer parameters depending on dataset parameters.
- Parameters
dataset (Union[NumpyDataset, PandasDataset]) – Dataset used for model parameter initialization.
- Return type
- Returns
Parameters of model.
- fit(dataset)[source]
Fit tfidf vectorizer.
- Parameters
dataset (Union[NumpyDataset, PandasDataset]) – Pandas or Numpy dataset of text features.
- Returns
self.
- transform(dataset)[source]
Transform text dataset to sparse tfidf representation.
- Parameters
dataset (Union[NumpyDataset, PandasDataset]) – Pandas or Numpy dataset of text features.
- Return type
- Returns
Sparse dataset with encoded text.
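To make the fit/transform contract above concrete, here is a minimal pure-Python sketch of a TF-IDF encoder. It is not LightAutoML's implementation: the class name TinyTfidf, the whitespace tokenizer, the smoothed IDF formula, and the dict-per-row sparse format are all assumptions chosen for illustration. It only shows the general pattern: fit learns a vocabulary and IDF weights and returns self, and transform maps each text to a sparse row in which only the non-zero columns are stored.

```python
import math
from collections import Counter


class TinyTfidf:
    """Toy fit/transform TF-IDF encoder; illustrative only, not LightAutoML code."""

    def fit(self, texts):
        # Document frequency: number of texts that contain each token.
        df = Counter()
        for text in texts:
            df.update(set(text.lower().split()))
        n_docs = len(texts)
        # Fixed column order for the learned vocabulary.
        self.vocab_ = {tok: i for i, tok in enumerate(sorted(df))}
        # Smoothed IDF (one common variant): ln((1 + n) / (1 + df)) + 1.
        self.idf_ = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in df}
        return self  # mirrors the documented "Returns: self"

    def transform(self, texts):
        rows = []
        for text in texts:
            # Tokens outside the fitted vocabulary are ignored.
            counts = Counter(t for t in text.lower().split() if t in self.vocab_)
            row = {self.vocab_[t]: c * self.idf_[t] for t, c in counts.items()}
            # L2-normalize each row; a text with no known tokens stays empty.
            norm = math.sqrt(sum(v * v for v in row.values()))
            rows.append({j: v / norm for j, v in row.items()} if norm else {})
        return rows


texts = ["good movie", "bad movie", "good good plot"]
encoded = TinyTfidf().fit(texts).transform(texts + ["unseen words only"])
```

Each element of encoded is a {column_index: weight} mapping, so a text made only of unseen tokens yields an empty row. A real sparse dataset would typically hold the same information in a compressed sparse matrix rather than in dicts.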