Springer papers


Josif Grabocka , Erind Bedalli and Lars Schmidt-Thieme


Time-series classification has gained wide attention within the Machine Learning community, due to its large range of applicability varying from medical diagnosis, financial markets, up to shape and trajectory classification. The current state-of-art methods applied in time-series classification rely on detecting similar instances through neighboring algorithms. Dynamic Time Warping (DTW) is a similarity measure that can identify the similarity of two time-series, through the computation of the optimal warping alignment of time point pairs, therefore DTW is immune towards patterns shifted in time or distorted in size/shape. Unfortunately the classification time complexity of computing the DTW distance of two series is quadratic, subsequently DTW based nearest neighbor classification deteriorates to quartic order of time complexity per test set. The high time complexity order causes the classification of long time series to be practically infeasible. In this study we propose a fast linear classification complexity method. Our method projects the original data to a reduced latent dimensionality using matrix factorization, while the factorization is learned efficiently via stochastic gradient descent with fast convergence rates and early stopping. The latent data dimensionality is set to be as low as the cardinality of the label variable. Finally, Support Vector Machines with polynomial kernels are applied to classify the reduced dimensionality data. Experimentations over long time series datasets from the UCR collection demonstrate the superiority of our method, which is orders of magnitude faster than baselines while being superior even in terms of classification accuracy.


Machine Learning Data Mining Time Series Classification Dimensionality Reduction