NumpyDataset
- class lightautoml.dataset.np_pd_dataset.NumpyDataset(data, features=(), roles=None, task=None, **kwargs)[source]
Bases: LAMLDataset
Dataset that contains info in np.ndarray format.
Create a dataset from numpy arrays.
- Parameters:
data (Union[ndarray, csr_matrix, None]) – 2d array of features.
features (Union[Sequence[str], str, None]) – Feature names.
roles (Union[Sequence[ColumnRole], ColumnRole, Dict[str, ColumnRole], None]) – Roles specifier.
**kwargs (ndarray) – Named attributes such as target, group, etc.
Note
The features parameter behaves differently depending on its type:
list - must have the same length as data.shape[1].
None - names are set automatically: feat_0, feat_1, …
Prefix (str) - names are set automatically: Prefix_0, Prefix_1, …
The roles parameter behaves differently depending on its type:
list - must have the same length as data.shape[1].
None - NumericRole(np.float32) is set automatically.
ColumnRole - a single role.
dict - roles given as Dict[str, ColumnRole].
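The feature-naming rules in the note above can be sketched as a standalone helper. This is an illustration only, not LightAutoML's implementation: `resolve_feature_names` and its `n_cols` argument (standing in for `data.shape[1]`) are hypothetical, and the sketch models only the three documented cases, not the `features=()` default from the signature.

```python
from typing import List, Sequence, Union


def resolve_feature_names(
    features: Union[Sequence[str], str, None], n_cols: int
) -> List[str]:
    """Hypothetical helper mirroring the documented naming rules."""
    if features is None:
        # None: default names feat_0, feat_1, ...
        return [f"feat_{i}" for i in range(n_cols)]
    if isinstance(features, str):
        # A single string acts as a prefix: Prefix_0, Prefix_1, ...
        return [f"{features}_{i}" for i in range(n_cols)]
    names = list(features)
    if len(names) != n_cols:
        # A list must supply exactly one name per column.
        raise ValueError(f"expected {n_cols} feature names, got {len(names)}")
    return names


print(resolve_feature_names(None, 3))        # ['feat_0', 'feat_1', 'feat_2']
print(resolve_feature_names("emb", 2))       # ['emb_0', 'emb_1']
print(resolve_feature_names(["a", "b"], 2))  # ['a', 'b']
```

The string check comes before the sequence check because a Python `str` is itself a sequence of characters and would otherwise be mistaken for a list of names.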
- property features
Features list.
- property roles
Roles dict.
- set_data(data, features=(), roles=None)[source]
Set data, features, and roles in place for an empty dataset.
- Parameters:
data (Union[ndarray, csr_matrix]) – 2d np.ndarray of features.
features (Union[Sequence[str], str, None]) – Feature names.
roles (Union[Sequence[ColumnRole], ColumnRole, Dict[str, ColumnRole], None]) – Roles specifier.
Note
The features parameter behaves differently depending on its type:
list - must have the same length as data.shape[1].
None - names are set automatically: feat_0, feat_1, …
Prefix (str) - names are set automatically: Prefix_0, Prefix_1, …
The roles parameter behaves differently depending on its type:
list - must have the same length as data.shape[1].
None - NumericRole(np.float32) is set automatically.
ColumnRole - a single role.
dict - roles given as Dict[str, ColumnRole].
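The role-handling cases in the note above can be sketched in the same spirit. This is a hedged illustration, not the library's code: `resolve_roles` is a hypothetical helper, roles are represented by plain strings instead of real ColumnRole objects, and two behaviors are assumptions rather than documented facts: that a single role applies to every feature, and that a dict is used as given.

```python
from typing import Dict, Mapping, Sequence, Union

# Placeholder string standing in for the documented default role.
DEFAULT_ROLE = "NumericRole(np.float32)"


def resolve_roles(
    roles: Union[Sequence[str], str, Mapping[str, str], None],
    features: Sequence[str],
) -> Dict[str, str]:
    """Hypothetical helper: build a feature-to-role dict from a roles specifier."""
    if roles is None:
        # None: every feature gets the default numeric role.
        return {name: DEFAULT_ROLE for name in features}
    if isinstance(roles, Mapping):
        # dict: assumed to map feature names to roles already.
        return dict(roles)
    if isinstance(roles, (list, tuple)):
        # list: one role per column, in feature order.
        if len(roles) != len(features):
            raise ValueError(f"expected {len(features)} roles, got {len(roles)}")
        return dict(zip(features, roles))
    # Single role: assumed to apply to every feature.
    return {name: roles for name in features}


features = ["age", "city"]
print(resolve_roles(None, features))
print(resolve_roles("CategoryRole", features))
print(resolve_roles(["NumericRole", "CategoryRole"], features))
```

The result is always a dict keyed by feature name, which matches the shape described for the `roles` property ("Roles dict").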
- to_pandas()[source]
Convert to PandasDataset.
- Return type:
PandasDataset
- Returns:
Same dataset in PandasDataset format.
- static from_dataset(dataset)[source]
Convert an arbitrary dataset to numpy format.
- Parameters:
dataset (TypeVar(Dataset, bound=LAMLDataset)) – Dataset.
- Return type:
NumpyDataset
- Returns:
Numpy dataset.