WeightedAverageTransformer
- class lightautoml.text.weighted_average_transformer.WeightedAverageTransformer(embedding_model, embed_size, weight_type='idf', use_svd=True, alpha=0.001, verbose=False, **kwargs)
Bases: sklearn.base.TransformerMixin
Weighted average of word embeddings.
- __init__(embedding_model, embed_size, weight_type='idf', use_svd=True, alpha=0.001, verbose=False, **kwargs)
Calculate sentence embedding as weighted average of word embeddings.
- Parameters
embedding_model (Dict) – word2vec, fastText, etc. Should have a dict interface {<word>: <embedding>}.
embed_size (int) – Size of the embedding.
weight_type (str) – ‘idf’ for idf weights, ‘sif’ for smoothed inverse frequency weights, ‘1’ for equal weights.
use_svd (bool) – Subtract the projection onto the first singular vector.
alpha (float) – Parameter for sif weights.
verbose (bool) – Add prints.
**kwargs – Unused arguments.
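The parameters above can be illustrated with a minimal NumPy sketch of a weighted average of word embeddings. This is not the library's implementation: the helper name `sentence_embeddings`, the exact idf and sif formulas, the normalization by the sum of weights, and the handling of unknown words are all assumptions chosen for illustration.

```python
import math
from collections import Counter

import numpy as np


def sentence_embeddings(sentences, embedding_model, embed_size,
                        weight_type="idf", use_svd=True, alpha=0.001):
    """Hypothetical helper: one weighted-average vector per tokenized sentence."""
    n_docs = len(sentences)
    # Number of sentences containing each word, and total occurrences per word.
    doc_freq = Counter(w for s in sentences for w in set(s))
    term_freq = Counter(w for s in sentences for w in s)
    total_tokens = sum(term_freq.values())

    def weight(word):
        if weight_type == "idf":
            # Smoothed idf (assumed form; the library may use another variant).
            return math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0
        if weight_type == "sif":
            # Smoothed inverse frequency: alpha / (alpha + p(word)).
            return alpha / (alpha + term_freq[word] / total_tokens)
        return 1.0  # weight_type == "1": all weights are equal

    result = np.zeros((n_docs, embed_size))
    for i, sentence in enumerate(sentences):
        # Assumption: words missing from the embedding model are skipped.
        known = [w for w in sentence if w in embedding_model]
        if not known:
            continue  # the sentence keeps a zero vector
        weights = np.array([weight(w) for w in known])
        vectors = np.array([embedding_model[w] for w in known], dtype=float)
        result[i] = weights @ vectors / weights.sum()

    if use_svd:
        # Subtract each row's projection onto the first right singular vector.
        _, _, vt = np.linalg.svd(result, full_matrices=False)
        first = vt[0]
        result = result - np.outer(result @ first, first)
    return result


# Toy embedding model with a plain dict interface {<word>: <embedding>}.
toy_model = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
toy_sentences = [["a", "b"], ["a"], ["b", "zzz"]]
equal_avg = sentence_embeddings(toy_sentences, toy_model, 2,
                                weight_type="1", use_svd=False)
sif_avg = sentence_embeddings(toy_sentences, toy_model, 2,
                              weight_type="sif", use_svd=True)
```

With `weight_type="1"` and `use_svd=False`, the result is a plain mean of the known word vectors. With `use_svd=True`, every output row is orthogonal to the dominant direction of the unadjusted sentence matrix.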