prepare_factorized_array

skrough.dataprep.prepare_factorized_array(data_x: ndarray) -> tuple[numpy.ndarray, numpy.ndarray]

Factorize data table.

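Factorize the data table column by column, replacing the values of each feature with integer codes, and return the domain size (the number of distinct values) of each feature.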

Parameters:

data_x – A dataset to be factorized.

Returns:

A tuple consisting of the following elements:

  • the factorized data, returned as a 2D array

  • the feature domain sizes, returned as a 1D array, i.e., a single value (the domain size) for each column

Examples

>>> import numpy as np
>>> from skrough.dataprep import prepare_factorized_array
>>> ar = np.array([[5, 3],
...                [9, 3],
...                [5, 2]])
>>> prepare_factorized_array(ar)
(array([[0, 0],
        [1, 0],
        [0, 1]]),
array([2, 2]))
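
The behavior shown in the example above can be approximated with a short NumPy-only sketch. This is not the skrough implementation; `factorize_columns_sketch` is a hypothetical name. It assumes that codes are assigned per column in order of first appearance, which is inferred from the example output (in the second column, 3 maps to 0 and 2 maps to 1). Handling of missing values is not covered.

```python
import numpy as np


def factorize_columns_sketch(data_x: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical stand-in for illustration, not skrough's own code.

    Assumption: each column is encoded independently, and codes are
    assigned in order of first appearance within the column.
    """
    n_rows, n_cols = data_x.shape
    codes = np.empty((n_rows, n_cols), dtype=np.int64)
    domain_sizes = np.empty(n_cols, dtype=np.int64)
    for j in range(n_cols):
        seen = {}  # maps each distinct value to its integer code
        for i, value in enumerate(data_x[:, j].tolist()):
            # A new value gets the next free code; a known value reuses its code.
            codes[i, j] = seen.setdefault(value, len(seen))
        domain_sizes[j] = len(seen)  # number of distinct values in column j
    return codes, domain_sizes


ar = np.array([[5, 3],
               [9, 3],
               [5, 2]])
codes, sizes = factorize_columns_sketch(ar)
print(codes.tolist())  # [[0, 0], [1, 0], [0, 1]]
print(sizes.tolist())  # [2, 2]
```

The explicit dictionary keeps the first-appearance ordering visible. An approach based on sorted unique values would produce different codes for the second column of this example, although the domain sizes would be the same.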