get_homogeneity

skrough.homogeneity.get_homogeneity(distribution: ndarray[Any, dtype[int64]]) → ndarray[Any, dtype[int8]]

Compute distribution homogeneity.

Compute homogeneity for a given input distribution. The function is mainly used to compute the homogeneity of decision attributes. The distribution is a 2D array where:

  • rows correspond to separate contexts, e.g., groups of objects or equivalence classes,

  • values in columns for a particular row represent a discrete distribution, i.e., the number of occurrences of each distinct value of the decision attribute.

The result is a sequence of integer values (0 or 1), one for each group/context (row) of the distribution input. A value of 1 means that the row contains at most one non-zero value, i.e., the row is homogeneous (an all-zero row therefore also counts as homogeneous); a value of 0 means that the row is non-homogeneous.
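The rule described above can be sketched in plain NumPy. This is only an illustration of the documented behaviour, not skrough's actual implementation; the name homogeneity_sketch and the error message are hypothetical.

```python
import numpy as np


def homogeneity_sketch(distribution: np.ndarray) -> np.ndarray:
    """Hypothetical re-implementation of the documented rule (not skrough's code)."""
    distribution = np.asarray(distribution)
    if distribution.ndim != 2:
        # Mirrors the documented ValueError for non-2D input; the message is assumed.
        raise ValueError("distribution must be a two-dimensional array")
    # Count the non-zero entries in each row (one row per group/context).
    nonzero_per_row = np.count_nonzero(distribution, axis=1)
    # A row is homogeneous when at most one decision value occurs in it.
    return (nonzero_per_row <= 1).astype(np.int8)


result = homogeneity_sketch(np.asarray([[0, 0], [1, 1], [0, 3], [5, 0]]))
print(result.tolist())  # [1, 0, 1, 1]
```

Counting non-zero entries per row keeps the check vectorized, and the comparison with 1 makes an all-zero row homogeneous, as in the first row of the input.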

Parameters:

distribution – A 2D array representing a distribution.

Raises:

ValueError – If distribution is not a two-dimensional array.

Returns:

An array of integer values, 0 or 1, one per row of the distribution input argument: 0 marks a non-homogeneous row and 1 marks a homogeneous row.

Examples

>>> import numpy as np
>>> from skrough.homogeneity import get_homogeneity
>>> get_homogeneity(
...     np.asarray(
...         [
...             [0, 0],
...             [1, 1],
...             [0, 3],
...             [5, 0],
...         ]
...     )
... )
array([1, 0, 1, 1])