CSRSparseDataset

class lightautoml.dataset.np_pd_dataset.CSRSparseDataset(data, features=(), roles=None, task=None, **kwargs)[source]

Bases: lightautoml.dataset.np_pd_dataset.NumpyDataset

Dataset that contains sparse features and np.ndarray targets.

to_pandas()[source]

Not implemented.

Return type

Any

to_numpy()[source]

Convert to NumpyDataset.

Return type

NumpyDataset

Returns

NumpyDataset.
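
To illustrate what converting CSR data to a dense matrix involves, here is a minimal pure-Python sketch. It is not the library's implementation: `csr_to_dense` is a hypothetical helper, and the three input arrays follow the usual CSR layout (`data`, `indices`, `indptr`) that `scipy.sparse.csr_matrix` also uses.

```python
from typing import List, Sequence, Tuple


def csr_to_dense(
    data: Sequence[float],
    indices: Sequence[int],
    indptr: Sequence[int],
    shape: Tuple[int, int],
) -> List[List[float]]:
    """Expand CSR arrays into a dense row-major matrix (hypothetical helper)."""
    n_rows, n_cols = shape
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        # indptr[row]:indptr[row + 1] delimits the stored entries of this row.
        for k in range(indptr[row], indptr[row + 1]):
            dense[row][indices[k]] = data[k]
    return dense


# A 2 x 3 matrix with three stored values.
dense = csr_to_dense([1.0, 2.0, 3.0], [0, 2, 2], [0, 2, 3], (2, 3))
```

Every entry that is not stored becomes an explicit zero, so the dense form can need far more memory than the sparse one.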

property shape

Get the size of the 2D feature matrix.

Return type

Tuple[Optional[int], Optional[int]]

Returns

Tuple of 2 elements: number of rows and number of columns.
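
The `Optional` entries in the return type suggest that a dimension may be unknown, for example before any data has been set. The sketch below assumes that reading. `SketchDataset` is a hypothetical stand-in and not the LightAutoML class.

```python
from typing import List, Optional, Tuple


class SketchDataset:
    """Hypothetical stand-in; not the LightAutoML class."""

    def __init__(self, rows: Optional[List[List[float]]] = None) -> None:
        self.rows = rows

    @property
    def shape(self) -> Tuple[Optional[int], Optional[int]]:
        # Assumption: with no data set, both dimensions are reported as None.
        if self.rows is None:
            return None, None
        n_cols = len(self.rows[0]) if self.rows else 0
        return len(self.rows), n_cols


empty_shape = SketchDataset().shape
filled_shape = SketchDataset([[1.0, 0.0, 2.0], [0.0, 0.0, 3.0]]).shape
```

Under this assumption, callers that unpack `shape` should be prepared for `None` in either position.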

__init__(data, features=(), roles=None, task=None, **kwargs)[source]

Create dataset from csr_matrix.

Parameters

Note

The behavior depends on the type of the features parameter:

  • list: must have the same length as data.shape[1].

  • None: names are set automatically, like feat_0, feat_1 …

  • Prefix: names are set automatically, like Prefix_0, Prefix_1 …
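
The three naming cases above can be sketched as a small pure-Python function. `resolve_feature_names` is a hypothetical helper written for illustration, not the library's code. It covers only the three documented cases, and raising `ValueError` on a length mismatch is an assumption.

```python
from typing import List, Optional, Sequence, Union


def resolve_feature_names(
    features: Union[None, str, Sequence[str]], n_cols: int
) -> List[str]:
    """Return one name per column (hypothetical helper)."""
    if features is None:
        # No names given: fall back to feat_0, feat_1, ...
        return [f"feat_{i}" for i in range(n_cols)]
    if isinstance(features, str):
        # A string is treated as a prefix: Prefix_0, Prefix_1, ...
        return [f"{features}_{i}" for i in range(n_cols)]
    names = list(features)
    if len(names) != n_cols:
        # Assumption: a mismatch is an error; the docs only say the lengths should match.
        raise ValueError(f"expected {n_cols} names, got {len(names)}")
    return names


default_names = resolve_feature_names(None, 3)
prefixed_names = resolve_feature_names("tfidf", 2)
explicit_names = resolve_feature_names(["age", "income"], 2)
```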

The behavior depends on the type of the roles parameter:

  • list: must have the same length as data.shape[1].

  • None: NumericRole(np.float32) is set automatically.

  • ColumnRole: a single role.

  • dict.
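
The role cases above can be sketched in the same way. Everything here is illustrative: `Role` is a hypothetical stand-in for a LightAutoML column role, `resolve_roles` is not a library function, and treating the dict as a mapping from feature name to role is an assumption, since the documentation does not describe the dict form. Applying a single role to every column is likewise an assumed reading.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Union


@dataclass(frozen=True)
class Role:
    """Hypothetical stand-in for a LightAutoML column role."""

    name: str
    dtype: str = "float32"


def resolve_roles(
    roles: Union[None, Role, Dict[str, Role], Sequence[Role]],
    names: List[str],
) -> Dict[str, Role]:
    """Map every feature name to a role (hypothetical helper)."""
    if roles is None:
        # Default: a float32 numeric role for every column.
        return {name: Role("Numeric") for name in names}
    if isinstance(roles, Role):
        # Assumption: a single role applies to all columns.
        return {name: roles for name in names}
    if isinstance(roles, dict):
        # Assumption: keys are feature names.
        return {name: roles[name] for name in names}
    role_list = list(roles)
    if len(role_list) != len(names):
        raise ValueError(f"expected {len(names)} roles, got {len(role_list)}")
    return dict(zip(names, role_list))


names = ["feat_0", "feat_1"]
default_roles = resolve_roles(None, names)
shared_roles = resolve_roles(Role("Category", "int32"), names)
listed_roles = resolve_roles([Role("Numeric"), Role("Category", "int32")], names)
```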

set_data(data, features=(), roles=None)[source]

Set data, features, and roles in place for an empty dataset.

Parameters

Note

The behavior depends on the type of the features parameter:

  • list: must have the same length as data.shape[1].

  • None: names are set automatically, like feat_0, feat_1 …

  • Prefix: names are set automatically, like Prefix_0, Prefix_1 …

The behavior depends on the type of the roles parameter:

  • list: must have the same length as data.shape[1].

  • None: NumericRole(np.float32) is set automatically.

  • ColumnRole: a single role.

  • dict.

static from_dataset(dataset)[source]

Convert dataset to sparse dataset.

Return type

CSRSparseDataset

Returns

Dataset in sparse form.
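
As a rough illustration of what moving a dense feature matrix into sparse form means, the pure-Python sketch below builds the three CSR arrays from nested lists. `dense_to_csr` is a hypothetical helper and not the library's conversion code.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(
    dense: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int], List[int]]:
    """Build CSR arrays (data, indices, indptr) from a dense matrix."""
    data: List[float] = []
    indices: List[int] = []
    indptr: List[int] = [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                # Store only non-zero values together with their column index.
                data.append(value)
                indices.append(col)
        # Each row ends where the next one begins.
        indptr.append(len(data))
    return data, indices, indptr


csr_data, csr_indices, csr_indptr = dense_to_csr([[1.0, 0.0, 2.0], [0.0, 0.0, 3.0]])
```

Only three of the six values are stored here; the saving grows with the share of zeros in the matrix.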