LAMLDataset

class lightautoml.dataset.base.LAMLDataset(data, features, roles, task=None, **kwargs)

Bases: object

Base class for creating datasets.

__init__(data, features, roles, task=None, **kwargs)

Create a dataset from the given data, features, roles, and special attributes.

Parameters:
  • data (Any) – 2D array of data; the concrete container type depends on the dataset type.

  • features (Optional[list]) – Feature names or None for empty data.

  • roles (Optional[Dict[str, ColumnRole]]) – Feature roles or None for empty data.

  • task (Optional[Task]) – Task for the dataset, if it is used for training or validation.

  • **kwargs (Any) – Named array-like attributes (target, group, etc.).

property features

Get the list of feature names.

Returns:

Feature names.

property data

Get data attribute.

Returns:

Any, array-like or None.

property roles

Get roles dict.

Returns:

Dict of feature roles.

property inverse_roles

Get inverse dict of feature roles.

Returns:

Dict mapping each role to the names of the features that have it.
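The inversion described above can be sketched in plain Python. This is an illustration only, not the library's implementation: `invert_roles` is a hypothetical helper, and plain strings stand in for `ColumnRole` objects.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def invert_roles(roles: Dict[str, Hashable]) -> Dict[Hashable, List[str]]:
    """Group feature names by role (hypothetical helper, not a library call)."""
    inverse: Dict[Hashable, List[str]] = defaultdict(list)
    for feature, role in roles.items():
        # Several features may share one role, so each role maps to a list.
        inverse[role].append(feature)
    return dict(inverse)


# Strings are used here as stand-ins for ColumnRole instances.
roles = {"age": "Numeric", "income": "Numeric", "city": "Category"}
inverse = invert_roles(roles)
```

Here `inverse` groups `age` and `income` under the same role key, while `city` sits under its own.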

set_data(data, features, roles)

Set data, features, and roles in place for an empty dataset.

Parameters:
  • data (Any) – 2D array-like or None.

  • features (Any) – List of feature names.

  • roles (Any) – Roles dict.

empty()

Get a new dataset with the same task, targets, and groups, but without features.

Return type:

LAMLDataset

Returns:

New empty dataset.
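A minimal sketch of how `empty()` and `set_data()` might be used together, assuming a toy stand-in class. `ToyDataset` and its `extras` attribute are hypothetical and written only for this illustration; the real class may store and copy its attributes differently.

```python
from typing import Any, Dict, List, Optional


class ToyDataset:
    """Hypothetical stand-in that mimics the documented interface."""

    def __init__(self, data: Any, features: Optional[List[str]],
                 roles: Optional[Dict[str, str]], task: Optional[str] = None,
                 **kwargs: Any) -> None:
        self.task = task
        self.extras = dict(kwargs)  # named arrays such as target or group
        self.data: Any = None
        self.features: List[str] = []
        self.roles: Dict[str, str] = {}
        if data is not None:
            self.set_data(data, features, roles)

    def set_data(self, data: Any, features: List[str], roles: Dict[str, str]) -> None:
        # Fill the dataset in place.
        self.data = data
        self.features = list(features)
        self.roles = dict(roles)

    def empty(self) -> "ToyDataset":
        # Keep the task and the extra arrays, drop the feature matrix.
        return ToyDataset(None, None, None, task=self.task, **self.extras)


train = ToyDataset([[1.0], [2.0]], ["x"], {"x": "Numeric"},
                   task="reg", target=[0.1, 0.2])
shell = train.empty()
shell.set_data([[10.0], [20.0]], ["x_scaled"], {"x_scaled": "Numeric"})
```

In this sketch the shell keeps `task` and `target` from `train`, while `train` itself is left unchanged.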

property shape

Get size of 2d feature matrix.

Returns:

Tuple of 2 elements.

classmethod concat(datasets)

Concatenate multiple datasets.

By default, takes an empty dataset from datasets[0] and concatenates all features from the others.

Parameters:

datasets (Sequence[LAMLDataset]) – Sequence of datasets.

Return type:

LAMLDataset

Returns:

Concatenated dataset.
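The idea of column-wise concatenation can be sketched with plain Python containers. This is an assumption-laden illustration, not the library's code: `concat_tables` and the `(rows, features, roles)` triple are hypothetical, and the real method operates on dataset objects.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical toy representation: (rows, feature names, roles).
Table = Tuple[List[List[Any]], List[str], Dict[str, str]]


def concat_tables(tables: Sequence[Table]) -> Table:
    """Stack the columns of several tables that share the same rows."""
    if not tables:
        raise ValueError("need at least one table")
    n_rows = len(tables[0][0])
    rows: List[List[Any]] = [[] for _ in range(n_rows)]
    features: List[str] = []
    roles: Dict[str, str] = {}
    for data, feats, table_roles in tables:
        if len(data) != n_rows:
            raise ValueError("all tables must have the same number of rows")
        for out_row, row in zip(rows, data):
            out_row.extend(row)  # append this table's columns to each row
        features.extend(feats)
        roles.update(table_roles)
    return rows, features, roles


left = ([[1], [2]], ["a"], {"a": "Numeric"})
right = ([[10, 0], [20, 1]], ["b", "c"], {"b": "Numeric", "c": "Category"})
rows, features, roles = concat_tables([left, right])
```

The result has one row per input row and the feature names of all inputs in order.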

drop_features(droplist)

Drop columns from the dataset in place.

Parameters:

droplist (Sequence[str]) – Feature names.

Returns:

Dataset without the dropped columns.
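An in-place column drop can be sketched as follows. `drop_columns` is a hypothetical helper working on plain lists and dicts; it only illustrates the idea of keeping data, feature names, and roles in sync, and is not how the library implements it.

```python
from typing import Any, Dict, List, Sequence


def drop_columns(rows: List[List[Any]], features: List[str],
                 roles: Dict[str, str], droplist: Sequence[str]) -> None:
    """Remove the named columns from rows, features, and roles in place."""
    drop = set(droplist)
    keep_idx = [i for i, name in enumerate(features) if name not in drop]
    for row in rows:
        row[:] = [row[i] for i in keep_idx]  # mutate each row in place
    features[:] = [features[i] for i in keep_idx]
    for name in drop:
        roles.pop(name, None)  # ignore names that are not present


rows = [[1, "a", True], [2, "b", False]]
features = ["x", "y", "z"]
roles = {"x": "Numeric", "y": "Category", "z": "Category"}
drop_columns(rows, features, roles, ["y"])
```

Because every container is mutated rather than rebuilt, existing references to `rows`, `features`, and `roles` see the change.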

static from_dataset(dataset)

Abstract method that defines how to create this type of dataset from another type.

Parameters:

dataset (LAMLDataset) – Dataset of the original type.

Return type:

LAMLDataset

Returns:

Dataset converted to this type.
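The conversion pattern can be illustrated with two toy container types. Both classes below are hypothetical and exist only for this sketch: a subclass-style `from_dataset` takes a dataset of another type and returns an instance of its own type.

```python
from typing import Any, Dict, List


class RowsDataset:
    """Hypothetical row-oriented container."""

    def __init__(self, rows: List[List[Any]], features: List[str]) -> None:
        self.rows = rows
        self.features = features


class ColumnsDataset:
    """Hypothetical column-oriented container."""

    def __init__(self, columns: Dict[str, List[Any]]) -> None:
        self.columns = columns

    @staticmethod
    def from_dataset(dataset: RowsDataset) -> "ColumnsDataset":
        # Re-pack row-major data into one list per feature name.
        columns = {
            name: [row[i] for row in dataset.rows]
            for i, name in enumerate(dataset.features)
        }
        return ColumnsDataset(columns)


source = RowsDataset([[1, "a"], [2, "b"]], ["x", "y"])
converted = ColumnsDataset.from_dataset(source)
```

Keeping the conversion as a static method on the target type means callers only need to know which type they want, not how the source stores its data.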

property dataset_type

Get the type of the dataset.