Feature Extraction for Time Series Analysis: A Practical Guide
Time series data presents unique challenges and opportunities in data science.
When I embarked on my journey in machine learning, it was my passion for physics that drew me in, which is a rather unconventional motivation. My academic pursuits revealed a deep appreciation for coding and data science. At that time, the nature of the data didn't concern me; my primary goal was to immerse myself in programming and write as much code as I could every day.
However, irrespective of initial preferences, career paths tend to guide you toward certain data types. For instance, working at SpaceX likely involves substantial signal processing, whereas a position at Netflix would probably engage you with natural language processing (NLP) and recommendation systems. If you found yourself at Tesla, your focus would likely shift towards computer vision and image-related tasks.
During my early career as a physicist, and later through my PhD in engineering, I was quickly introduced to the realm of signal processing. This is the essence of engineering: whenever you collect data and derive insights, you are essentially dealing with signals. It’s important to note that signals are not exclusive to engineering; finance also heavily relies on time series data, such as stock price fluctuations.
The key takeaway here is: time series data is distinct. Many transformations and processing techniques applicable to tabular data or images may not hold the same relevance for time series. Take feature extraction, for example.
The concept of feature extraction involves refining the data to identify and retain essential characteristics that can enhance subsequent machine learning processes. In essence, it serves to optimize the input for machine learning by prioritizing significant features while discarding irrelevant ones.
The complete feature extraction process is illustrated below:
When we differentiate between feature extraction methods for tabular data versus signal data, we recognize that they operate in entirely different contexts.
For instance, concepts such as peaks and valleys, the Fourier Transform, and Wavelet Transform only gain significance in the context of signals. My intention in outlining these points is to emphasize that a specific suite of feature extraction techniques is uniquely suited for signal data.
Broadly, feature extraction methods can be categorized into two groups:
- Data-driven methods: These techniques focus on deriving features by merely analyzing the signals themselves, without consideration for the machine learning objectives, such as classification or forecasting.
- Model-based methods: These approaches take a holistic view, seeking to identify features tailored to the specific problem at hand.
Data-driven methods typically offer computational simplicity and do not rely on target outputs. However, their drawback is that the features they produce may lack specificity for your particular task. For example, applying a Fourier Transform to a signal might yield general features that are not as effective as those specifically learned through an end-to-end model.
In this article, we will concentrate on data-driven methods. Specifically, we will explore domain-specific, frequency-based, time-based, and statistical-based methods. Let's dive in!
1. Domain-Specific Feature Extraction
The first method I’ll discuss is intentionally somewhat vague. The optimal way to extract features often hinges on the particular problem you are tackling. For example, if you are analyzing a signal from an engineering experiment and need to focus on the amplitude after t = 6s, that information is crucial for your analysis, even if it may not seem significant in a broader context.
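As a toy illustration, here is what that might look like in code. Everything here is a made-up assumption for the sake of the example: the decaying oscillation, the sampling setup, and the t = 6 s cutoff are hypothetical, not a recipe from any particular experiment:

```python
import numpy as np

# Hypothetical experiment: a decaying 2 Hz oscillation, 10 s sampled at ~100 Hz
t = np.linspace(0, 10, 1000)
y = np.exp(-t / 5) * np.sin(2 * np.pi * 2 * t)

# Domain knowledge (assumed here) says only the behaviour after t = 6 s matters
late = y[t > 6]
features = {
    "max_amp": np.max(np.abs(late)),    # largest excursion after the cutoff
    "mean_amp": np.mean(np.abs(late)),  # average magnitude after the cutoff
}
```

The point is not the specific numbers but the pattern: prior knowledge tells you which slice of the signal carries information, and the features are computed only from that slice.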
2. Frequency-Based Feature Extraction
2.1 Explanation
This technique pertains to the spectral analysis of your time series or signal. The most straightforward way to analyze a signal is in the time domain, where each data point is simply the signal's value at a given moment.
To illustrate, consider a signal in its natural (time) domain: visualized, it is simply the signal's amplitude plotted against time.
This represents the simplest form of our dataset. We can transform this into the frequency domain, where we decompose the signal into its periodic components—frequencies, amplitudes, and phases.
The Discrete Fourier Transform Y(k) of a signal y sampled at N points is expressed as follows:

$$Y(k) = \sum_{n=0}^{N-1} y(n)\, e^{-2\pi i k n / N}, \qquad k = 0, \dots, N-1$$
From this expression we can read off the amplitude and phase of each frequency component k. For feature extraction, we can take the 10 components with the highest amplitudes and record their amplitudes, frequencies, and phases, yielding 30 features (10 each for amplitude, frequency, and phase).
Moreover, we can extend this method by utilizing wavelets instead of sines or cosines, leading to what is known as Wavelet Decomposition.
Understanding this material can be complex, so let’s proceed with some coding to demonstrate its application.
2.2 Code
Let's implement the basic Fourier Transform in practice.
First, we need to import the necessary libraries:
import numpy as np
import matplotlib.pyplot as plt
Next, let’s consider this signal as our example:
t = np.linspace(0, 1, 1000)
y = np.sin(2 * np.pi * 1 * t) + 0.4 * np.sin(2 * np.pi * 2 * t) + 2 * np.sin(2 * np.pi * 3.2 * t)
This signal comprises three primary components: one with amplitude = 1 and frequency = 1, one with amplitude = 0.4 and frequency = 2, and one with amplitude = 2 and frequency = 3.2. We can recover these through the Fourier Transform:
from scipy.fft import fft
y_f = fft(y)
frequencies = np.fft.fftfreq(len(y), d=t[1] - t[0])
If we plot the amplitude spectrum 2 * np.abs(y_f) / len(y) against the positive frequencies, we observe three distinct peaks that correspond to the respective amplitudes and frequencies.
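As a sanity check, here is a self-contained sketch that recovers the three components numerically. Note one assumption I make: I use a 5-second window so that all three frequencies, including 3.2 Hz, fall exactly on the frequency grid; with a 1-second window, the 3.2 Hz component does not sit on a grid point and its energy leaks into neighbouring bins:

```python
import numpy as np
from scipy.fft import fft

# 5 s at 200 Hz: frequency resolution is 1/5 = 0.2 Hz, so 1, 2, and 3.2 Hz are all on-grid
t = np.arange(0, 5, 0.005)
y = np.sin(2 * np.pi * 1 * t) + 0.4 * np.sin(2 * np.pi * 2 * t) + 2 * np.sin(2 * np.pi * 3.2 * t)

y_f = fft(y)
freqs = np.fft.fftfreq(len(y), d=0.005)
amps = 2 * np.abs(y_f) / len(y)          # one-sided amplitude estimate

pos = freqs > 0
top3 = np.argsort(amps[pos])[-3:]        # indices of the three largest spectral peaks
peak_freqs = np.sort(freqs[pos][top3])   # → [1.0, 2.0, 3.2]
peak_amps = np.sort(amps[pos][top3])     # → [0.4, 1.0, 2.0]
```

With on-grid frequencies the recovered amplitudes match the generating ones essentially exactly, which is a useful way to validate a frequency-based feature pipeline before applying it to real data.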
While elaborate plotting isn’t necessary here, we can wrap the whole procedure in a simple function:

def extract_features(signal, time_array=None, num_features=10, max_frequency=None):
    # sampling interval (assume unit spacing if no time array is given)
    dt = 1.0 if time_array is None else time_array[1] - time_array[0]
    spectrum = fft(signal)
    freqs = np.fft.fftfreq(len(signal), d=dt)
    keep = freqs > 0 if max_frequency is None else (freqs > 0) & (freqs <= max_frequency)
    amps = 2 * np.abs(spectrum[keep]) / len(signal)
    top = np.argsort(amps)[::-1][:num_features]
    # amplitudes, frequencies, and phases of the strongest components
    return np.concatenate([amps[top], freqs[keep][top], np.angle(spectrum[keep])[top]])
This function allows you to input the signal y and, optionally, the time array, the number of peaks to consider, and the maximum frequency to explore.
If we aim to extract features using wavelets, we would need to install the following library:
pip install PyWavelets
Then we would execute the wavelet transform.
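A minimal sketch of what that could look like, assuming the Daubechies-4 mother wavelet and the energy of each sub-band as the extracted feature (both are arbitrary choices, not the only sensible ones):

```python
import numpy as np
import pywt

t = np.linspace(0, 1, 1000)
y = np.sin(2 * np.pi * 1 * t) + 0.4 * np.sin(2 * np.pi * 2 * t) + 2 * np.sin(2 * np.pi * 3.2 * t)

# 3-level discrete wavelet decomposition: one approximation + three detail sub-bands
coeffs = pywt.wavedec(y, 'db4', level=3)

# one simple feature per sub-band: the energy of its coefficients
features = [float(np.sum(c ** 2)) for c in coeffs]
```

Each sub-band captures the signal at a different scale, so the energy vector gives a compact multi-resolution summary of where the signal's variation lives.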
> Note: I delve into wavelets in detail in another article; feel free to check it out for more insights.
3. Statistical-Based Feature Extraction
3.1 Explanation
Another viable approach for feature extraction is leveraging traditional statistical methods. There are numerous statistical metrics that can be computed from a signal, ranging from simple to complex:
- The mean, which is simply the sum of the signal divided by its number of time steps.
- The variance, indicating how much the signal deviates from the mean.
- Skewness and Kurtosis, which assess the non-Gaussian characteristics of the signal distribution. Skewness measures asymmetry, while kurtosis evaluates "tailedness."
- Quantiles: These values segment the time series into intervals defined by probability ranges.
- Autocorrelation: This metric quantifies how strongly the current values of the series relate to its past values, revealing repeating patterns.
- Entropy: Reflects the complexity or unpredictability of the time series.
In 2024, each of these properties can be easily computed with just a single line of code.
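For instance, here is one possible sketch using NumPy and SciPy on a stand-in random signal. The lag-1 autocorrelation and the histogram-based entropy estimate are just two of several reasonable conventions, not canonical definitions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
y = rng.normal(size=1000)                 # stand-in signal

mean = np.mean(y)                         # central tendency
variance = np.var(y)                      # spread around the mean
skewness = stats.skew(y)                  # asymmetry of the distribution
kurt = stats.kurtosis(y)                  # "tailedness" of the distribution
q25, q50, q75 = np.quantile(y, [0.25, 0.50, 0.75])
lag1_autocorr = np.corrcoef(y[:-1], y[1:])[0, 1]   # correlation with the previous step
hist, _ = np.histogram(y, bins=30, density=True)
shannon_entropy = stats.entropy(hist[hist > 0])    # entropy of the value distribution
```

Stacking these scalars into one vector per signal already gives a usable, cheap feature set for many classification and forecasting baselines.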
4. Time-Based Feature Extraction
4.1 Explanation
In this section, we focus on extracting time features, particularly peaks and valleys within the signal. We will utilize the find_peaks function from SciPy for this purpose.
The find_peaks function accepts numerous parameters that refine peak detection, such as the expected width, threshold, or plateau size. If such constraints are known (e.g., you may only want to consider peaks with an amplitude greater than 2), they can be set accordingly. Otherwise, the default settings may suffice.
We also have the flexibility to determine how many peaks/features we want to extract. For instance, if we choose N = 10, we will identify the 10 largest peaks and valleys, resulting in 20 features (10 locations and 10 amplitudes).
4.2 Code
The implementation for this is straightforward:
from scipy.signal import find_peaks
peaks, _ = find_peaks(y, height=2)
Be mindful that if you specify N = 10 peaks, but only have M = 4 peaks in your signal, the remaining 6 locations and amplitudes will default to 0.
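Putting the pieces together, a zero-padded peak extractor might look like the following. `peak_features` is a hypothetical helper of my own, and passing height=-np.inf is just a trick to make find_peaks report every peak's height:

```python
import numpy as np
from scipy.signal import find_peaks

def peak_features(signal, n_peaks=10):
    """Locations and heights of the n largest peaks, zero-padded to a fixed length."""
    peaks, props = find_peaks(signal, height=-np.inf)   # height=... makes 'peak_heights' available
    order = np.argsort(props["peak_heights"])[::-1][:n_peaks]
    locs = np.zeros(n_peaks)
    amps = np.zeros(n_peaks)
    locs[:len(order)] = peaks[order]                    # sample indices of the kept peaks
    amps[:len(order)] = props["peak_heights"][order]    # their heights
    return np.concatenate([locs, amps])                 # fixed-length 2 * n_peaks vector

t = np.linspace(0, 1, 1000)
signal = np.sin(2 * np.pi * 4 * t)      # exactly 4 interior maxima
feats = peak_features(signal, n_peaks=10)
```

The fixed output length is the important design choice: every signal maps to the same 2 * n_peaks-dimensional vector, so the features can be stacked into a regular matrix for any downstream model.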
5. Which Method Should You Use?
Having explored four distinct classes of methods, you may wonder which one is most suitable for your analysis.
While I could take the diplomatic route and say, "It depends on the problem," it is indeed true that it often does.
In reality, if you have the opportunity for domain-based feature extraction, that should be your primary focus. When the underlying physics of an experiment or prior knowledge of the issue is clear, these features should be prioritized, and sometimes even considered as the only relevant ones. However, it’s also common not to have domain-based features at your disposal, and that’s perfectly acceptable.
Regarding frequency, statistical, and time-based features, I recommend integrating them all. Add these features to your dataset and assess their effectiveness—whether they aid, hinder, or confuse your machine learning model.
6. Conclusions
I appreciate your time and attention as we recap the key points covered in this article:
- We introduced the concept of feature extraction, emphasizing its significance and the need for specific techniques in time series analysis.
- We clarified the distinction between model-based and data-driven feature extraction methods, focusing on the latter in this discussion.
- We explored domain-based feature extraction techniques, which are tailored to the specific problem at hand.
- We examined spectral techniques that utilize the Fourier/frequency spectrum of signals.
- We covered statistical techniques that derive metrics like mean, standard deviation, entropy, and autocorrelation from signals.
- We looked into time-based techniques, which focus on extracting peak information from signals.
- We provided guidance on selecting the appropriate technique for your particular situation.
7. About Me
Thank you once again for your engagement; it truly means a lot!
My name is Piero Paialunga, and I am currently pursuing a Ph.D. in Aerospace Engineering at the University of Cincinnati while also working as a Machine Learning Engineer for Gen Nine. I write about AI and Machine Learning on my blog and LinkedIn. If you enjoyed this article and wish to learn more about machine learning, feel free to:
- Connect with me on LinkedIn for updates and insights.
- Subscribe to my newsletter for new stories and the opportunity to reach out with questions.
- Become a referred member to access unlimited articles from myself and many other leading writers in Machine Learning and Data Science.
- Interested in collaborating? Check out my rates and projects on Upwork!
For inquiries or potential collaborations, you can reach me at: