Time Series Features

Features:

 Time domain:
mean, variance, standard deviation, maximum, minimum, number of zero crossings, range (maximum minus minimum), mode
Frequency domain:
DC component; mean, variance, standard deviation, skewness and kurtosis of the spectrum shape; mean, variance, standard deviation, skewness and kurtosis of the amplitudes
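The time-domain features above can be sketched as follows. This is a minimal sketch with NumPy, not code from the original post; the zero-crossing convention (a sign change between consecutive samples) and the tie-breaking for the mode are my assumptions.

```python
import numpy as np

def time_domain_features(x):
    # Compute the listed time-domain features of a 1-D series.
    x = np.asarray(x, dtype=float)
    values, counts = np.unique(x, return_counts=True)
    return {
        "mean": x.mean(),
        "variance": x.var(),
        "std": x.std(),
        "max": x.max(),
        "min": x.min(),
        # count sign changes between consecutive samples
        "zero_crossings": int(np.sum(np.diff(np.signbit(x).astype(int)) != 0)),
        "range": x.max() - x.min(),
        # most frequent value (first one on ties)
        "mode": values[np.argmax(counts)],
    }
```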

R

1. Spectral entropy of a time series
The following compute features based on tiled (non-overlapping) windows:
2. lumpiness: the variance of the window variances
3. stability: the variance of the window means
The following compute features based on sliding (overlapping) windows:
4. max_level_shift: finds the largest mean shift between two consecutive windows
5. max_var_shift: finds the largest variance shift between two consecutive windows
6. max_kl_shift: finds the largest shift in Kullback-Leibler divergence between two consecutive windows
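As an illustration of the sliding-window shift features, here is a minimal sketch of a max_level_shift-style computation. This is my assumed implementation, not the tsfeatures source; the window width is a free parameter.

```python
import numpy as np

def max_level_shift(x, width=10):
    # Largest jump in the mean between two consecutive
    # (non-overlapping) windows of length `width`, scanned
    # over all sliding positions.
    x = np.asarray(x, dtype=float)
    # means of every sliding window of length `width`
    means = np.convolve(x, np.ones(width) / width, mode="valid")
    # difference between the means of two windows `width` apart,
    # i.e. two back-to-back windows
    shifts = np.abs(means[width:] - means[:-width])
    return shifts.max()
```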


7. Number of crossing points: the number of times a time series crosses the median
8. Number of flat spots: the number of flat spots in a time series
9. Hurst coefficient: computes the Hurst coefficient, indicating the level of fractional differencing of a time series
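The median-crossing count and flat-spot count can be sketched as below. This is an assumed implementation, not the tsfeatures source; in particular, defining a flat spot as a run of values falling in the same equal-width bin (10 bins here) is my assumption.

```python
import numpy as np

def crossing_points(x):
    # Number of times the series crosses its median.
    x = np.asarray(x, dtype=float)
    above = x > np.median(x)
    return int(np.sum(above[1:] != above[:-1]))

def flat_spots(x, bins=10):
    # Longest run of values that stay inside the same
    # equal-width bin of the sample range.
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), bins + 1)[1:-1]
    binned = np.digitize(x, edges)
    run, longest = 1, 1
    for a, b in zip(binned[:-1], binned[1:]):
        run = run + 1 if a == b else 1
        longest = max(longest, run)
    return longest
```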


10. Autocorrelation-based features: Computes various measures based on autocorrelation coefficients of the original series, first-differenced series and second-differenced series
x_acf1
x_acf10
diff1_acf1
diff1_acf10
diff2_acf1
diff2_acf10
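The autocorrelation-based features can be sketched as follows. This is an assumed implementation, not the tsfeatures source: `acf` uses the standard biased sample autocorrelation, and the `*_acf10` values are the sums of squares of the first 10 autocorrelations.

```python
import numpy as np

def acf(x, lag):
    # Biased sample autocorrelation at the given lag.
    x = np.asarray(x, dtype=float)
    mu, var, n = x.mean(), x.var(), len(x)
    return np.sum((x[:n - lag] - mu) * (x[lag:] - mu)) / (n * var)

def acf_features(x):
    # Lag-1 ACF and sum of squares of the first 10 ACF values,
    # for the series and its first and second differences.
    x = np.asarray(x, dtype=float)
    d1, d2 = np.diff(x), np.diff(x, n=2)
    return {
        "x_acf1": acf(x, 1),
        "x_acf10": sum(acf(x, k) ** 2 for k in range(1, 11)),
        "diff1_acf1": acf(d1, 1),
        "diff1_acf10": sum(acf(d1, k) ** 2 for k in range(1, 11)),
        "diff2_acf1": acf(d2, 1),
        "diff2_acf10": sum(acf(d2, k) ** 2 for k in range(1, 11)),
    }
```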
11. Partial autocorrelation-based features: Computes various measures based on partial autocorrelation coefficients of the original series, first-differenced series and second-differenced series
x_pacf5
diff1_pacf5
diff2_pacf5

12. Parameter estimates of Holt's linear trend method: estimates the smoothing parameter for the level (alpha) and the smoothing parameter for the trend (beta)
13. Autocorrelation coefficient at lag 1 of the residual: Computes the first order autocorrelation of the residual series of the deterministic trend model

**. Strength of trend and seasonality of a time series:
Computes various measures of trend and seasonality of a time series based on an STL decomposition
- Summary statistics:
14. the length of a time series
15. the variance of a time series
16. the variance of the residuals
17. the variance of the detrend series
18. the variance of the deseasonal series
19. the number of seasonal periods
20. Measure of trend strength
21. Measure of seasonal strength
22. Find time of peak and trough for each component
23. Compute measure of spikiness
24. Compute measures of linearity and curvature
25. ACF of remainder
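The trend- and seasonal-strength measures can be sketched as below, assuming the components have already been obtained from an STL decomposition (e.g. statsmodels' `STL`). The formulas `1 - Var(remainder)/Var(component + remainder)`, clipped at zero, are the usual definitions; treat this as a sketch, not the tsfeatures source.

```python
import numpy as np

def decomposition_strengths(trend, seasonal, remainder):
    # Strength of trend: how much variance disappears when the
    # trend component is removed from the detrended-plus-remainder series.
    trend_strength = max(0.0, 1 - np.var(remainder) / np.var(np.asarray(trend) + np.asarray(remainder)))
    # Strength of seasonality, analogously.
    seasonal_strength = max(0.0, 1 - np.var(remainder) / np.var(np.asarray(seasonal) + np.asarray(remainder)))
    return trend_strength, seasonal_strength
```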
**. Heterogeneity coefficients
Computes various measures of heterogeneity of a time series. First the series is pre-whitened using an AR model to give a new series y.
We fit a GARCH(1,1) model to y and obtain the residuals, e. Then the four measures of heterogeneity are:
26. the sum of squares of the first 12 autocorrelations of y^2
27. the sum of squares of the first 12 autocorrelations of e^2
28. the R^2 value of an AR model applied to y^2
29. the R^2 value of an AR model applied to e^2
The statistics obtained from y^2 are the ARCH effects, while those from e^2 are the GARCH effects.

python
1. the absolute energy of the time series which is the sum over the squared values
E = \sum_{i=1,\ldots, n} x_i^2
2. the sum over the absolute value of consecutive changes in the series x
\sum_{i=1, \ldots, n-1} \mid x_{i+1}- x_i \mid
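The two formulas above have direct one-line implementations; this is a minimal sketch (tsfresh provides them as `abs_energy` and `absolute_sum_of_changes`).

```python
import numpy as np

def abs_energy(x):
    # Sum over the squared values of the series.
    x = np.asarray(x, dtype=float)
    return np.dot(x, x)

def absolute_sum_of_changes(x):
    # Sum of absolute differences between consecutive values.
    x = np.asarray(x, dtype=float)
    return np.abs(np.diff(x)).sum()
```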
3. Calculates the value of an aggregation function (e.g. var or mean) of the autocorrelation taken over all possible lags (1 to the length of x)
\frac{1}{n-1} \sum_{l=1,\ldots, n} \frac{1}{(n-l)\sigma^{2}} \sum_{t=1}^{n-l}(X_{t}-\mu )(X_{t+l}-\mu)
where n is the length of the time series X_i, \sigma^2 its variance and \mu its mean.
4. Calculates a linear least-squares regression for values of the time series that were aggregated over chunks versus the sequence from 0 up to the number of chunks minus one. The chunk size specifies how many time series values are in each chunk. Further, the aggregation function could be "max", "min", "mean" or "median".
5. Implements a vectorized Approximate entropy algorithm.
For short time-series this method is highly dependent on the parameters, but should be stable for N > 2000, see:
Yentes et al. (2012) - The Appropriate Use of Approximate Entropy and Sample Entropy with Short Data Sets
Other shortcomings and alternatives discussed in:
Richman & Moorman (2000) - Physiological time-series analysis using approximate entropy and sample entropy
6. This feature fits the unconditional maximum likelihood of an autoregressive AR(k) process. The k parameter is the maximum lag of the process
X_{t}=\varphi_0 +\sum _{{i=1}}^{k}\varphi_{i}X_{{t-i}}+\varepsilon_{t}
For each configuration from param, which should contain the maxlag "k", such an AR process is fitted. Then the coefficients \varphi_{i} whose index i is contained in "coeff" are returned.
7. The Augmented Dickey-Fuller test is a hypothesis test which checks whether a unit root is present in a time series sample. 
8.  Calculates the autocorrelation of the specified lag, according to the formula [1]
\frac{1}{(n-l)\sigma^{2}} \sum_{t=1}^{n-l}(X_{t}-\mu )(X_{t+l}-\mu)
where n is the length of the time series X_i, \sigma^2 its variance, \mu its mean, and l the lag.
Compare with 3.

9. First bins the values of x into max_bins equidistant bins. Then calculates the value of
- \sum_{k=0}^{min(max\_bins, len(x))} p_k log(p_k) \cdot \mathbf{1}_{(p_k > 0)}
where p_k is the percentage of samples in bin k.
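The binned-entropy formula above can be sketched as follows (tsfresh calls this `binned_entropy`); this is an assumed implementation using `np.histogram` for the equidistant binning.

```python
import numpy as np

def binned_entropy(x, max_bins=10):
    # Entropy of the distribution of x over max_bins equidistant bins.
    x = np.asarray(x, dtype=float)
    counts, _ = np.histogram(x, bins=max_bins)
    p = counts / len(x)
    p = p[p > 0]  # the indicator 1_{p_k > 0} drops empty bins
    return -np.sum(p * np.log(p))
```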

10. This function calculates the value of
\frac{1}{n-2lag} \sum_{i=0}^{n-2lag} x_{i + 2 \cdot lag}^2 \cdot x_{i + lag} \cdot x_{i}
which is
\mathbb{E}[L^2(X)^2 \cdot L(X) \cdot X]
where \mathbb{E} is the mean and L is the lag operator. It was proposed in [1] as a measure of non linearity in the time series.
References
[1] Schreiber, T. and Schmitz, A. (1997).
Discrimination power of measures for nonlinearity in a time series
PHYSICAL REVIEW E, VOLUME 55, NUMBER 5
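The c3 statistic above can be implemented with three shifted views of the series; a minimal sketch (tsfresh calls this `c3`), returning 0 when the series is too short for the lag:

```python
import numpy as np

def c3(x, lag):
    # Mean of x_{i+2*lag}^2 * x_{i+lag} * x_i over all valid i.
    x = np.asarray(x, dtype=float)
    n = len(x)
    if 2 * lag >= n:
        return 0.0
    return np.mean(x[2 * lag:] ** 2 * x[lag:n - lag] * x[:n - 2 * lag])
```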
11. First fixes a corridor given by the quantiles ql and qh of the distribution of x. Then calculates the average, absolute value of consecutive changes of the series x inside this corridor.
Think of selecting a corridor on the y-axis and only calculating the mean of the absolute change of the time series inside this corridor.
12. This is an estimate of time series complexity [1] (a more complex time series has more peaks, valleys, etc.). It calculates the value of
\sqrt{ \sum_{i=0}^{n-2} ( x_{i} - x_{i+1})^2 }
References
[1] Batista, Gustavo EAPA, et al (2014).
CID: an efficient complexity-invariant distance for time series.
Data Mining and Knowledge Discovery 28.3 (2014): 634-669.
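The complexity estimate above reduces to one line; a minimal sketch (tsfresh calls this `cid_ce`), with the optional z-normalization that the CID paper recommends:

```python
import numpy as np

def cid_ce(x, normalize=False):
    # Square root of the sum of squared consecutive differences.
    x = np.asarray(x, dtype=float)
    if normalize:
        x = (x - x.mean()) / x.std()  # z-normalize first if requested
    return np.sqrt(np.sum(np.diff(x) ** 2))
```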
13. the number of values in x that are higher than the mean of x

14. the number of values in x that are lower than the mean of x

15. Calculates a Continuous wavelet transform for the Ricker wavelet, also known as the “Mexican hat wavelet” which is defined by
\frac{2}{\sqrt{3a} \pi^{\frac{1}{4}}} (1 - \frac{x^2}{a^2}) exp(-\frac{x^2}{2a^2})
where a is the width parameter of the wavelet function.
This feature calculator takes three different parameters: widths, coeff and w. It calculates the CWT once for each different widths array, and then returns the values for the given coefficient coeff and width w. (For each dict in param, one feature is returned.)
16. Calculates the sum of squares of chunk i out of N chunks expressed as a ratio with the sum of squares over the whole series

17. the spectral centroid (mean), variance, skew, and kurtosis of the absolute fourier transform spectrum.
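The spectral moments in 17 can be sketched by treating the absolute FFT spectrum as a distribution over frequency bins. This is an assumed implementation (tsfresh calls this `fft_aggregated`); using the bin index as the frequency axis is my simplification.

```python
import numpy as np

def fft_aggregated(x):
    # Centroid, variance, skew and kurtosis of the absolute
    # rFFT spectrum, viewed as a distribution over bin indices.
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.arange(len(spectrum))
    p = spectrum / spectrum.sum()          # normalize to a distribution
    centroid = np.sum(freqs * p)
    variance = np.sum((freqs - centroid) ** 2 * p)
    skew = np.sum((freqs - centroid) ** 3 * p) / variance ** 1.5
    kurtosis = np.sum((freqs - centroid) ** 4 * p) / variance ** 2
    return centroid, variance, skew, kurtosis
```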

18. Calculates the Fourier coefficients of the one-dimensional discrete Fourier transform for real input, using the fast Fourier transform algorithm
A_k =  \sum_{m=0}^{n-1} a_m \exp \left \{ -2 \pi i \frac{m k}{n} \right \}, \qquad k = 0,
\ldots , n-1.
The resulting coefficients will be complex; this feature calculator can return the real part (attr=="real"), the imaginary part (attr=="imag"), the absolute value (attr=="abs") and the angle in degrees (attr=="angle").

19. Returns the first location of the maximum value of x. The position is calculated relative to the length of x.

20. the first location of the minimal value of x. The position is calculated relative to the length of x.

21. Coefficients of the polynomial h(x), which has been fitted to the deterministic dynamics of the Langevin model
\dot{x}(t) = h(x(t)) + \mathcal{N}(0,R)
as described by [1].
For short time-series this method is highly dependent on the parameters.
References
[1] Friedrich et al. (2000): Physics Letters A 271, p. 217-222
Extracting model equations from experimental data
22. Checks if any value in x occurs more than once

23. Checks if the maximum value of x is observed more than once

24. Checks if the minimal value of x is observed more than once
25. Calculates the relative index i where q% of the mass of the time series x lies to the left of i. For example, for q = 50% this feature calculator returns the mass center of the time series

26.  the kurtosis of x (calculated with the adjusted Fisher-Pearson standardized moment coefficient G2).

27. Boolean variable denoting if the standard deviation of x is higher than 'r' times the range (the difference between the max and min of x). Hence it checks if
std(x) > r * (max(X)-min(X))
According to a rule of thumb, the standard deviation should be about a fourth of the range of the values.

28. the last location of the maximum value of x. The position is calculated relative to the length of x.

29. the last location of the minimal value of x. The position is calculated relative to the length of x.
30. the length of x
31. Calculates a linear least-squares regression for the values of the time series versus the sequence from 0 to the length of the time series minus one. This feature assumes the signal to be uniformly sampled. It will not use the time stamps to fit the model. The parameters control which of the characteristics are returned.
Possible extracted attributes are “pvalue”, “rvalue”, “intercept”, “slope”, “stderr”, see the documentation of linregress for more information.
32. the length of the longest consecutive subsequence in x that is bigger than the mean of x

33. the length of the longest consecutive subsequence in x that is smaller than the mean of x

34. Largest fixed point of the dynamics, \arg\max_x \{h(x)=0\}, estimated from the polynomial h(x), which has been fitted to the deterministic dynamics of the Langevin model
\dot{x}(t) = h(x(t)) + \mathcal{N}(0,R)
as described by
Friedrich et al. (2000): Physics Letters A 271, p. 217-222. Extracting model equations from experimental data.
For short time-series this method is highly dependent on the parameters.

35. Calculates the highest value of the time series x.
36. the mean of x
37. the mean over the absolute differences between subsequent time series values, which is
\frac{1}{n} \sum_{i=1,\ldots, n-1} | x_{i+1} - x_{i}|
38. the mean over the differences between subsequent time series values, which is
\frac{1}{n} \sum_{i=1,\ldots, n-1}  x_{i+1} - x_{i}

39.  the mean value of a central approximation of the second derivative
\frac{1}{n} \sum_{i=1,\ldots, n-1}  \frac{1}{2} (x_{i+2} - 2 \cdot x_{i+1} + x_i)
40. the median of x

41. Calculates the lowest value of the time series x.
42. Calculates the number of crossings of x on m. A crossing is defined as two sequential values where the first value is lower than m and the next is greater, or vice-versa. If you set m to zero, you will get the number of zero crossings.
43. This feature calculator searches for different peaks in x. To do so, x is smoothed by a Ricker wavelet for widths ranging from 1 to n. This feature calculator returns the number of peaks that occur at enough width scales and with a sufficiently high signal-to-noise ratio (SNR).
44. Calculates the number of peaks of at least support n in the time series x. A peak of support n is defined as a subsequence of x where a value occurs, which is bigger than its n neighbours to the left and to the right.
Hence in the sequence
>>> x = [3, 0, 0, 4, 0, 0, 13]
4 is a peak of support 1 and 2 because in the subsequences
>>> [0, 4, 0]
>>> [0, 0, 4, 0, 0]
4 is still the highest value. Here, 4 is not a peak of support 3 because 13 is the third neighbour to the right of 4 and is bigger than 4.
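The support-n peak count can be sketched as below; a minimal assumed implementation (tsfresh calls this `number_peaks`), checking that a value strictly exceeds all n neighbours on each side. It reproduces the worked example above.

```python
import numpy as np

def number_peaks(x, n):
    # Count values strictly bigger than their n neighbours
    # on both the left and the right.
    x = np.asarray(x, dtype=float)
    count = 0
    for i in range(n, len(x) - n):
        left, right = x[i - n:i], x[i + 1:i + n + 1]
        if np.all(x[i] > left) and np.all(x[i] > right):
            count += 1
    return count
```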
45. Calculates the value of the partial autocorrelation function at the given lag. The lag k partial autocorrelation of a time series \lbrace x_t, t = 1 \ldots T \rbrace equals the partial correlation of x_t and x_{t-k}, adjusted for the intermediate variables \lbrace x_{t-1}, \ldots, x_{t-k+1} \rbrace ([1]). Following [2], it can be defined as
\alpha_k = \frac{ Cov(x_t, x_{t-k} | x_{t-1}, \ldots, x_{t-k+1})}
{\sqrt{ Var(x_t | x_{t-1}, \ldots, x_{t-k+1}) Var(x_{t-k} | x_{t-1}, \ldots, x_{t-k+1} )}}
with (a) x_t = f(x_{t-1}, \ldots, x_{t-k+1}) and (b) x_{t-k} = f(x_{t-1}, \ldots, x_{t-k+1}) being AR(k-1) models that can be fitted by OLS. Be aware that in (a), the regression is done on past values to predict x_t, whereas in (b), future values are used to calculate the past value x_{t-k}. It is said in [1] that "for an AR(p), the partial autocorrelations [ \alpha_k ] will be nonzero for k<=p and zero for k>p." With this property, it is used to determine the lag of an AR process.
References
[1] Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015).
Time series analysis: forecasting and control. John Wiley & Sons.
 
46. the percentage of unique values that are present in the time series more than once:
len(different values occurring more than once) / len(different values)
This means the percentage is normalized to the number of unique values, in contrast to percentage_of_reoccurring_datapoints_to_all_datapoints.

47. the ratio of data points that are present in the time series more than once:
# of data points occurring more than once / # of all data points
This means the ratio is normalized to the number of data points in the time series, in contrast to percentage_of_reoccurring_values_to_all_values.
48. Calculates the q quantile of x. This is the value of x greater than q% of the ordered values from x.

49. Count observed values within the interval [min, max)
50. Ratio of values that are more than r*std(x) (so r sigma) away from the mean of x. 
51. Returns a factor which is 1 if all values in the time series occur only once, and below one if this is not the case. In principle, it just returns
# unique values / # values
52. Calculates and returns the sample entropy of x.
53. the sample skewness of x (calculated with the adjusted Fisher-Pearson standardized moment coefficient G1).
54. This feature calculator estimates the cross power spectral density of the time series x at different frequencies. To do so, the time series is first shifted from the time domain to the frequency domain.
The feature calculator returns the power spectrum of the different frequencies.
55. the standard deviation of x
56. the sum of all data points that are present in the time series more than once
57. the sum of all values that are present in the time series more than once
58. Calculates the sum over the time series values
59. Boolean variable denoting if the distribution of x looks symmetric. This is the case if
| mean(X)-median(X)| < r * (max(X)-min(X))

60. This function calculates the value of
\frac{1}{n-2lag} \sum_{i=0}^{n-2lag} x_{i + 2 \cdot lag}^2 \cdot x_{i + lag} - x_{i + lag} \cdot  x_{i}^2
which is
\mathbb{E}[L^2(X)^2 \cdot L(X) - L(X) \cdot X^2]
where \mathbb{E} is the mean and L is the lag operator. It was proposed in [1] as a promising feature to extract from time series.
References
[1] Fulcher, B.D., Jones, N.S. (2014).
Highly comparative feature-based time-series classification.
Knowledge and Data Engineering, IEEE Transactions on 26, 3026–3037.
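The time-reversal asymmetry statistic above can be sketched with three shifted views of the series; a minimal assumed implementation (tsfresh calls this `time_reversal_asymmetry_statistic`), returning 0 when the series is too short for the lag:

```python
import numpy as np

def time_reversal_asymmetry_statistic(x, lag):
    # Mean of x_{i+2*lag}^2 * x_{i+lag} - x_{i+lag} * x_i^2 over all valid i.
    x = np.asarray(x, dtype=float)
    n = len(x)
    if 2 * lag >= n:
        return 0.0
    one = x[2 * lag:]       # x_{i+2*lag}
    two = x[lag:n - lag]    # x_{i+lag}
    return np.mean(one ** 2 * two - two * x[:n - 2 * lag] ** 2)
```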

61. Counts occurrences of a given value in the time series x.
62. the variance of x 
63. Boolean variable denoting if the variance of x is greater than its standard deviation, which is equivalent to the variance of x being larger than 1.
