Function to extract intervals from a sequence.
An interval is defined as the distance between consecutive occurrences
of the similar elements in the sequence, with boundary intervals counted as positions
from sequence edges to first/last occurrence. The intervals are extracted
based on the specified binding direction (start or end)
and mode (normal, lossy, cycle, or redundant).
Example how intervals are extracted from the sequence with "binding.start" and "mode.normal".
X |
b |
a |
b |
c |
b |
b |
1 |
-> |
2 |
-> |
2 |
a |
-> |
2 |
|
|
|
c |
-> |
-> |
-> |
4 |
|
intervals |
1 |
2 |
2 |
4 |
2 |
Parameters:
Name |
Type |
Description |
Default |
X
|
|
Array to exctact an intervals from. Must be a 1-dimensional array.
|
required
|
binding
|
int
|
start = 1 - Intervals are extracted from left to right.
end = 2 – Intervals are extracted from right to left.
|
required
|
mode
|
int
|
Mode handling the intervals at the sequence boundaries:
lossy = 1 - Both interval from the start of the sequence
to the first element occurrence and interval from the
last element occurrence to the end of the sequence
are not taken into account.
normal = 2 - Interval from the start of the sequence to
the first occurrence of the element
(in case of binding to the beginning)
or interval from the last occurrence of the element to
the end of the sequence
(in case of binding to the end) is taken into account.
cycle = 3 - Interval from the start of the sequence to
the first element occurrence
and interval from the last element occurrence to the
end of the sequence are summed
into one interval (as if sequence was cyclic).
Interval is placed either in the beginning of
intervals array (in case of binding to the beginning)
or in the end (in case of binding to the end).
redundant = 4 - Both interval from start of the sequence
to the first element occurrence and the interval from
the last element occurrence to the end of the
sequence are taken into account. Their placement in results
array is determined by the binding.
|
required
|
Returns:
Name | Type |
Description |
order |
ndarray
|
Intervals extracted from the sequence
|
Raises:
Type |
Description |
Not1DArrayException
|
When X parameter is not a 1-dimensional array
|
ValueError
|
When binding or mode is not valid
|
Examples:
Get intervals from a sequence binding to start and mode normal.
| import foapy
source = ['a', 'b', 'a', 'c', 'a', 'd']
intervals = foapy.intervals(source, foapy.binding.start, foapy.mode.normal)
print(intervals)
# [1 2 2 3 2 5]
|
Get intervals from a emprty sequence.
| import foapy
source = []
intervals = foapy.intervals(source, foapy.binding.start, foapy.mode.normal)
print(intervals)
# []
|
Getting an intervals of an array with more than 1 dimension is not allowed.
| import foapy
source = [[1, 2], [3, 4]]
intervals = foapy.intervals(source, foapy.binding.start, foapy.mode.normal)
# Not1DArrayException:
# {'message': 'Incorrect array form. Expected d1 array, exists 2'}
|
Source code in .tox/docs-deploy/lib/python3.11/site-packages/foapy/core/_intervals.py
| def intervals(X, binding: int, mode: int) -> ndarray:
"""
Function to extract intervals from a sequence.
An interval is defined as the distance between consecutive occurrences
of the similar elements in the sequence, with boundary intervals counted as positions
from sequence edges to first/last occurrence. The intervals are extracted
based on the specified binding direction ([start][foapy.binding.start] or [end][foapy.binding.end])
and mode ([normal][foapy.mode.normal], [lossy][foapy.mode.lossy], [cycle][foapy.mode.cycle], or [redundant][foapy.mode.redundant]).
Example how intervals are extracted from the sequence with **"binding.start"** and **"mode.normal"**.
| **X** | b | a | b | c | b |
|:-------------:|:------:|:------:|:------:|:------:|:-----:|
| b | 1 | -> | 2 | -> | 2 |
| a | -> | 2 | | | |
| c | -> | -> | -> | 4 | |
| **intervals** | **1** | **2** | **2** | **4** | **2** |
Parameters
----------
X: array_like
Array to exctact an intervals from. Must be a 1-dimensional array.
binding: int
[start][foapy.binding.start] = 1 - Intervals are extracted from left to right.
[end][foapy.binding.end] = 2 – Intervals are extracted from right to left.
mode: int
Mode handling the intervals at the sequence boundaries:
[lossy][foapy.mode.lossy] = 1 - Both interval from the start of the sequence
to the first element occurrence and interval from the
last element occurrence to the end of the sequence
are not taken into account.
[normal][foapy.mode.normal] = 2 - Interval from the start of the sequence to
the first occurrence of the element
(in case of binding to the beginning)
or interval from the last occurrence of the element to
the end of the sequence
(in case of binding to the end) is taken into account.
[cycle][foapy.mode.cycle] = 3 - Interval from the start of the sequence to
the first element occurrence
and interval from the last element occurrence to the
end of the sequence are summed
into one interval (as if sequence was cyclic).
Interval is placed either in the beginning of
intervals array (in case of binding to the beginning)
or in the end (in case of binding to the end).
[redundant][foapy.mode.redundant] = 4 - Both interval from start of the sequence
to the first element occurrence and the interval from
the last element occurrence to the end of the
sequence are taken into account. Their placement in results
array is determined by the binding.
Returns
-------
order : ndarray
Intervals extracted from the sequence
Raises
-------
Not1DArrayException
When X parameter is not a 1-dimensional array
ValueError
When binding or mode is not valid
Examples
--------
Get intervals from a sequence binding to [start][foapy.binding.start] and mode [normal][foapy.mode.normal].
``` py linenums="1"
import foapy
source = ['a', 'b', 'a', 'c', 'a', 'd']
intervals = foapy.intervals(source, foapy.binding.start, foapy.mode.normal)
print(intervals)
# [1 2 2 3 2 5]
```
Get intervals from a emprty sequence.
``` py linenums="1"
import foapy
source = []
intervals = foapy.intervals(source, foapy.binding.start, foapy.mode.normal)
print(intervals)
# []
```
Getting an intervals of an array with more than 1 dimension is not allowed.
``` py linenums="1"
import foapy
source = [[1, 2], [3, 4]]
intervals = foapy.intervals(source, foapy.binding.start, foapy.mode.normal)
# Not1DArrayException:
# {'message': 'Incorrect array form. Expected d1 array, exists 2'}
```
""" # noqa: E501
# Validate binding
if binding not in {constants_binding.start, constants_binding.end}:
raise ValueError(
{"message": "Invalid binding value. Use binding.start or binding.end."}
)
# Validate mode
valid_modes = [
constants_mode.lossy,
constants_mode.normal,
constants_mode.cycle,
constants_mode.redundant,
]
if mode not in valid_modes:
raise ValueError(
{"message": "Invalid mode value. Use mode.lossy,normal,cycle or redundant."}
)
ar = np.asanyarray(X)
if ar.shape == (0,):
return []
if binding == constants_binding.end:
ar = ar[::-1]
perm = ar.argsort(kind="mergesort")
mask_shape = ar.shape
mask = np.empty(mask_shape[0] + 1, dtype=bool)
mask[:1] = True
mask[1:-1] = ar[perm[1:]] != ar[perm[:-1]]
mask[-1:] = True # or mask[-1] = True
first_mask = mask[:-1]
last_mask = mask[1:]
intervals = np.empty(ar.shape, dtype=np.intp)
intervals[1:] = perm[1:] - perm[:-1]
delta = len(ar) - perm[last_mask] if mode == constants_mode.cycle else 1
intervals[first_mask] = perm[first_mask] + delta
inverse_perm = np.empty(ar.shape, dtype=np.intp)
inverse_perm[perm] = np.arange(ar.shape[0])
if mode == constants_mode.lossy:
intervals[first_mask] = 0
intervals = intervals[inverse_perm]
result = intervals[intervals != 0]
elif mode == constants_mode.normal:
result = intervals[inverse_perm]
elif mode == constants_mode.cycle:
result = intervals[inverse_perm]
elif mode == constants_mode.redundant:
result = intervals[inverse_perm]
redundant_intervals = len(ar) - perm[last_mask]
if binding == constants_binding.end:
redundant_intervals = redundant_intervals[::-1]
result = np.concatenate((result, redundant_intervals))
if binding == constants_binding.end:
result = result[::-1]
return result
|