Coverage for src/foapy/characteristics/ma/_identifying_information.py: 100%

3 statements  

coverage.py v7.8.0, created at 2025-05-17 20:45 +0000

import numpy as np


def identifying_information(intervals, dtype=None):
    """
    Calculates the identifying information (amount of information) of the
    intervals grouped by congeneric sequence.

    $$
    \\left[ H_j \\right]_{1 \\le j \\le m} =
    \\left[
    \\log_2 { \\left(\\frac{1}{n_j} * \\sum_{i=1}^{n_j} \\Delta_{ij} \\right) }
    \\right]_{1 \\le j \\le m}
    $$

    where \\( \\Delta_{ij} \\) represents the $i$-th interval of the $j$-th
    congeneric intervals array, \\( n_j \\) is the total
    number of intervals in the $j$-th congeneric intervals array
    and $m$ is the number of congeneric intervals arrays.

    Parameters
    ----------
    intervals : array_like
        An array of congeneric intervals arrays
    dtype : dtype, optional
        The dtype of the output

    Returns
    -------
    : array
        An array of the identifying information of congeneric intervals.
        An empty congeneric intervals array yields 0.

    Examples
    --------

    Calculate the identifying information of a sequence.

    ``` py linenums="1"
    import foapy
    import numpy as np

    source = np.array(['a', 'b', 'a', 'c', 'a', 'd'])
    order = foapy.ma.order(source)
    intervals = foapy.ma.intervals(order, foapy.binding.start, foapy.mode.normal)
    result = foapy.characteristics.ma.identifying_information(intervals)
    print(result)
    # [0.73696559 1.         2.         2.5849625 ]
    ```

    Calculate the identifying information of congeneric intervals of a sequence.

    ``` py linenums="1"
    import foapy

    X = []
    X.append([1, 1, 4, 4])
    X.append([3, 1, 3])
    X.append([5, 3, 1])

    result = foapy.characteristics.ma.identifying_information(X)
    print(result)
    # [1.32192809 1.22239242 1.5849625 ]
    ```
    """  # noqa: W605

    # log2 of the mean interval per congeneric array; empty arrays yield 0
    return np.asanyarray(
        [
            np.log2(np.mean(line, dtype=dtype), dtype=dtype) if len(line) != 0 else 0
            for line in intervals
        ],
        dtype=dtype,
    )
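
The formula in the docstring can be cross-checked by hand without NumPy. The sketch below is a hypothetical, dependency-free reference (the name `identifying_information_reference` is an assumption and is not part of foapy); it takes log2 of the arithmetic mean of each congeneric intervals array and, like the listing above, maps an empty array to 0. It ignores the `dtype` parameter.

```python
import math


def identifying_information_reference(intervals):
    """Hypothetical pure-Python cross-check; not part of foapy."""
    result = []
    for line in intervals:
        if len(line) == 0:
            # Mirror the listing above: an empty array contributes 0.
            result.append(0.0)
        else:
            # log2 of the arithmetic mean of the intervals in this array.
            result.append(math.log2(sum(line) / len(line)))
    return result


# Same input as the second docstring example, plus one empty array.
values = identifying_information_reference([[1, 1, 4, 4], [3, 1, 3], [5, 3, 1], []])
print([round(v, 8) for v in values])
```

The first three values should agree with the second docstring example to the printed precision, since the means are 2.5, 7/3 and 3; the trailing entry exercises the empty-array branch.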