Top-level APIs
These methods and objects are available directly in the babeldata module.
Functions in 'common'
glass_wool(x_in, maxstd, side='both')
Iteratively remove outliers from data.
Iteratively removes outliers from normally distributed input data until no remaining value lies more than maxstd standard deviations from the mean.
The returned array has the same length as the input; outliers are set to np.nan.
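The iteration described above can be sketched in plain NumPy. This is a minimal illustration, not babeldata's implementation: the name `iterative_clip_sketch` is hypothetical, it handles only a single symmetric threshold, and it assumes that the mean and the population standard deviation (NumPy's default `ddof=0`) are recomputed from the remaining values on every pass.

```python
import numpy as np


def iterative_clip_sketch(x_in, maxstd):
    """Hypothetical helper: symmetric iterative outlier removal."""
    x = np.array(x_in, dtype=float)  # float copy, so the input is untouched
    while True:
        keep = ~np.isnan(x)  # values that have not been rejected yet
        if not keep.any():
            break
        mean = x[keep].mean()
        std = x[keep].std()  # assumption: population std (ddof=0)
        # Absolute deviation from the mean, left at 0 for rejected entries.
        deviation = np.zeros_like(x)
        deviation[keep] = np.abs(x[keep] - mean)
        outliers = deviation > maxstd * std
        if not outliers.any():
            break  # converged: nothing left to remove
        x[outliers] = np.nan
    return x


x = np.array([1., 442., 443., 444., 445., 446., 447., 448., 449., 900.])
clipped = iterative_clip_sketch(x, 2.0)  # first and last entries become nan
```

Recomputing the statistics after each pass matters because a large outlier inflates the standard deviation and can mask smaller ones until it has been removed.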
Drop NaNs after calling glass_wool
Use x[~np.isnan(x)] to remove outliers (np.nan) from the returned array.
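As a short illustration of the tip above, the snippet below uses a hand-written array as a stand-in for a glass_wool result (the values are made up for the example) and shows three common ways of working with the NaN markers.

```python
import numpy as np

# Stand-in for an array returned by glass_wool: rejected values are np.nan.
cleaned = np.array([np.nan, 442., 443., 444., np.nan])

# Keep only the values that were not flagged as outliers.
kept = cleaned[~np.isnan(cleaned)]

# Positions of the rejected entries in the original array.
rejected_idx = np.flatnonzero(np.isnan(cleaned))

# NaN-aware reductions skip the rejected entries without dropping them first.
mean_of_kept = np.nanmean(cleaned)
```

Keeping the NaN-marked array around is useful when the positions of the outliers must stay aligned with another array, such as timestamps.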
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x_in | np.ndarray | The input data. | required |
maxstd | float \| Tuple[float, float] | The maximum number of standard deviations a value may lie from the mean before it is treated as an outlier. If a float is given, the same limit is used for the lower and upper sides of the distribution. If a tuple is given and side is "both", the first value is the lower-side limit and the second value is the upper-side limit. | required |
side | str | The side(s) on which to remove outliers. Options are "lower", "upper", or "both" (default). | 'both' |
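One way to picture how maxstd and side combine is to resolve them into a pair of per-side limits. The helper below is a hypothetical sketch, not part of babeldata. How a tuple interacts with side="lower" or side="upper" is not specified in the table, so the sketch simply keeps the matching tuple element and disables the other side; treat that behavior as an assumption.

```python
import math
from typing import Tuple, Union


def resolve_limits_sketch(
    maxstd: Union[float, Tuple[float, float]], side: str = "both"
) -> Tuple[float, float]:
    """Hypothetical helper: return (lower_limit, upper_limit) in standard deviations."""
    if side not in ("lower", "upper", "both"):
        raise ValueError(f"side must be 'lower', 'upper', or 'both', got {side!r}")
    if isinstance(maxstd, tuple):
        lower, upper = maxstd  # documented order: lower side first, upper side second
    else:
        lower = upper = maxstd  # a single float applies to both sides
    # An infinite limit means no value on that side can count as an outlier.
    if side == "lower":
        upper = math.inf
    elif side == "upper":
        lower = math.inf
    return float(lower), float(upper)


limits_symmetric = resolve_limits_sketch(2.0)
limits_asymmetric = resolve_limits_sketch((2.0, 4.0), "both")
limits_upper_only = resolve_limits_sketch(2.0, "upper")
```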
Returns:
Type | Description |
---|---|
numpy.ndarray | A copy of the input data with outliers set to np.nan. |
Examples:
Cut values at plus and minus 2 standard deviations from the mean:
>>> import numpy as np
>>> from babeldata.common import glass_wool
>>> x = np.array([1., 442., 443., 444., 445., 446., 447., 448., 449., 900.])
>>> glass_wool(x, 2.0)
array([ nan, 442., 443., 444., 445., 446., 447., 448., 449., nan])
Only cut upper outliers:
>>> glass_wool(x, 2.0, side='upper')
array([ 1., 442., 443., 444., 445., 446., 447., 448., 449., nan])
Only cut lower outliers:
>>> glass_wool(x, 2.0, side='lower')
array([ nan, 442., 443., 444., 445., 446., 447., 448., 449., 900.])
Use asymmetric upper and lower limits:
>>> glass_wool(x, (2.0, 4.0), 'both')
array([ nan, 442., 443., 444., 445., 446., 447., 448., 449., 900.])
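To see why 900 survives the asymmetric call while 1 does not, it helps to express both values as distances from the mean in standard deviations. The check below is a sketch that assumes the population standard deviation (`ddof=0`) and assumes the statistics are recomputed after the first value is removed; the conclusions are the same with the sample standard deviation.

```python
import numpy as np

x = np.array([1., 442., 443., 444., 445., 446., 447., 448., 449., 900.])

# First pass: statistics of the full array.
mean = x.mean()  # 446.5
std = x.std()  # assumption: population std (ddof=0)
z = (x - mean) / std  # signed distance from the mean, in standard deviations
z_low, z_high = z[0], z[-1]

# Second pass: statistics after the lower outlier (1.0) has been removed.
remaining = x[1:]
z_high_second = (remaining[-1] - remaining.mean()) / remaining.std()

print(f"first pass:  z(1)={z_low:.2f}, z(900)={z_high:.2f}")
print(f"second pass: z(900)={z_high_second:.2f}")
```

Both end values lie a little more than two standard deviations from the mean, so a symmetric limit of 2.0 removes both. With limits (2.0, 4.0), the value 1 still exceeds the lower limit, but 900 stays below the upper limit of 4.0 on both passes and is kept.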
Notes
- The input data must be a 1D numpy.ndarray.
- The function is optimized with the @jit decorator for improved performance.
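Because the input must be a 1D array, and because np.nan can only be stored in floating-point arrays, it can be convenient to normalize data before the call. The helper below is a hypothetical sketch; whether glass_wool itself accepts integer or nested input is not stated here.

```python
import numpy as np


def as_1d_float_sketch(values):
    """Hypothetical helper: coerce input to a 1D float64 array."""
    arr = np.asarray(values, dtype=np.float64)  # converts lists and int arrays
    return arr.ravel()  # flatten to 1D; returns a view when possible


grid = [[1, 442, 443], [444, 445, 900]]  # nested list of ints
flat = as_1d_float_sketch(grid)

# A result of the same length can be restored to the original layout afterwards.
restored = flat.reshape(np.shape(grid))
```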