Self-Learning of Multivariate Time Series Using Perceptually Important Points

Timo Lintonen; Tomi Räty

doi:10.1109/JAS.2019.1911777

Volume 6 Issue 6

Nov. 2019

IEEE/CAA Journal of Automatica Sinica

Timo Lintonen and Tomi Räty, "Self-Learning of Multivariate Time Series Using Perceptually Important Points," IEEE/CAA J. Autom. Sinica, vol. 6, no. 6, pp. 1318-1331, Nov. 2019. doi: 10.1109/JAS.2019.1911777

Abstract

In machine learning, positive-unlabelled (PU) learning is a special case of semi-supervised learning. In PU learning, the training set contains some labelled positive examples and a set of unlabelled examples drawn from both the positive and the negative class. PU learning has gained attention in many domains, especially for time-series data, where labelled data are difficult to obtain and examples from the negative class are particularly hard to acquire. Self-learning is a semi-supervised method capable of PU learning on time-series data. In the self-learning approach, observations are moved one at a time from the unlabelled data into the positive class until a stopping criterion is reached, and the model is retrained with the current labels after each addition. The main problem in self-learning is knowing when to stop. Several stopping criteria have been proposed in the literature, but they tend to be inaccurate or difficult to apply. This publication proposes a novel stopping criterion, called Peak evaluation using perceptually important points, to address this problem for time-series data. The criterion is distinctive in that it has no tunable hyperparameters, which makes it easily applicable in an unsupervised setting. At the same time, it is flexible, because it makes no assumptions about the balance of the dataset between the positive and the negative class.
- Positive-unlabelled (PU) learning
- Self-learning
- Stopping criterion
- Time series
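
The self-learning loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes univariate series of equal length, a 1-nearest-neighbour model with Euclidean distance, and a fixed number of additions (`n_steps`, a hypothetical parameter) standing in for the stopping criterion. The function name `self_learn` and its return values are also assumptions made for illustration.

```python
import numpy as np

def self_learn(series, positive_idx, n_steps):
    """Greedy 1-NN self-learning sketch for positive-unlabelled data.

    series       : array-like of shape (n_series, length)
    positive_idx : indices of the initially labelled positive series
    n_steps      : number of additions (placeholder for a stopping criterion)

    Returns the indices in the order they were added to the positive set
    and the distance at which each one was added.
    """
    series = np.asarray(series, dtype=float)
    positive = list(positive_idx)
    labelled = set(positive)
    unlabelled = [i for i in range(len(series)) if i not in labelled]
    added, min_dists = [], []

    for _ in range(min(n_steps, len(unlabelled))):
        # Distance from every unlabelled series to every positive series.
        diff = series[unlabelled][:, None, :] - series[positive][None, :, :]
        dist = np.linalg.norm(diff, axis=2)
        # Each unlabelled series is scored by its nearest positive neighbour.
        nearest = dist.min(axis=1)
        best = int(np.argmin(nearest))
        min_dists.append(float(nearest[best]))
        # "Retraining" a 1-NN model amounts to extending the positive set.
        idx = unlabelled.pop(best)
        positive.append(idx)
        added.append(idx)

    return added, min_dists

data = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0], [5.1, 5.0]]
order, dists = self_learn(data, positive_idx=[0], n_steps=2)
```

In this sketch the sequence of addition distances (`dists`) is the kind of trace a stopping criterion would inspect; how the paper's criterion evaluates such a trace is not specified in the abstract and is not reproduced here.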

Highlights

Peak evaluation using perceptually important points is proposed as a stopping criterion for self-learning on time-series data.
The stopping criterion is flexible: it is applicable whether the time-series data are dominated by the negative class, dominated by the positive class, or balanced.
The stopping criterion requires no tunable hyperparameters and functions automatically.
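
For readers unfamiliar with perceptually important points (PIPs), the following is a generic sketch of how they are commonly extracted from a single series. It is not the paper's peak evaluation procedure: it assumes a univariate series with at least two samples, uses vertical distance to the line joining neighbouring PIPs, and takes the number of points (`n_points`) as an input. The function name is hypothetical.

```python
import numpy as np

def perceptually_important_points(y, n_points):
    """Return sorted indices of n_points PIPs in a 1-D series.

    Starts from the two end points and repeatedly adds the sample with
    the largest vertical distance to the straight line joining its two
    neighbouring PIPs.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    pips = [0, n - 1]

    while len(pips) < min(n_points, n):
        best_idx, best_dist = None, -1.0
        for left, right in zip(pips[:-1], pips[1:]):
            if right - left < 2:
                continue  # no samples strictly between these two PIPs
            x = np.arange(left + 1, right)
            # Value of the connecting line at each interior sample.
            line = y[left] + (y[right] - y[left]) * (x - left) / (right - left)
            dist = np.abs(y[x] - line)
            k = int(np.argmax(dist))
            if dist[k] > best_dist:
                best_idx, best_dist = int(x[k]), float(dist[k])
        if best_idx is None:
            break
        pips.append(best_idx)
        pips.sort()

    return pips

peak = perceptually_important_points([0.0, 1.0, 5.0, 1.0, 0.0], 3)
```

With three points the sketch picks the two end points and the central peak of the example series, which illustrates why PIPs are a natural tool for locating peaks in a curve.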
