ANALYSIS OF NOVEL FEATURE SELECTION CRITERION BASED ON INTERACTIONS OF HIGHER ORDER IN CASE OF PRODUCTION PLANT DATA


Mateusz Pawluk
Dariusz Wierzba

Abstract
Feature selection plays a vital role in the processing pipeline of today’s data science applications and is a crucial step of the overall modeling process. Because large and highly structured data can now be extracted in many fields, feature selection remains a serious problem in machine learning, with no optimal solution proposed so far. In recent years, methods based on concepts derived from information theory have attracted particular attention and eventually led to a general framework. The criterion developed by the author et al., namely IIFS (Interaction Information Feature Selection), extends state-of-the-art methods by incorporating interactions of higher order, both 3-way and 4-way. In this article, data from an industrial site were carefully selected in order to benchmark this approach against others. The results show that including side effects in IIFS can significantly reorder the output set of features and improve the overall error estimate for the selected classifier.
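To illustrate what a 3-way interaction means in this setting, the sketch below estimates the interaction information II(X1; X2; Y) = I(X1, X2; Y) - I(X1; Y) - I(X2; Y) from empirical frequencies of discrete data. It is a minimal illustration of the general quantity, not the IIFS criterion itself; the function names and the XOR toy data are assumptions made for this example only.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Empirical joint Shannon entropy (in bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_information(xs, y):
    """I(X; Y), where xs is a list of columns treated as one joint variable."""
    return entropy(*xs) + entropy(y) - entropy(*xs, y)


def interaction_information(x1, x2, y):
    """3-way interaction information: I(X1, X2; Y) - I(X1; Y) - I(X2; Y)."""
    return (mutual_information([x1, x2], y)
            - mutual_information([x1], y)
            - mutual_information([x2], y))


# Hypothetical toy data: the class label is the XOR of two binary features.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]

print(mutual_information([x1], y))         # each feature alone carries no information
print(interaction_information(x1, x2, y))  # the pair jointly determines the label
```

In the XOR case each feature has zero mutual information with the label, so a criterion that scores features one at a time would discard both, while the positive interaction term reveals that they are informative together.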

Article Details

How to cite
Pawluk, M., & Wierzba, D. (2019). ANALYSIS OF NOVEL FEATURE SELECTION CRITERION BASED ON INTERACTIONS OF HIGHER ORDER IN CASE OF PRODUCTION PLANT DATA. Metody Ilościowe W Badaniach Ekonomicznych, 20(3), 209–216. https://doi.org/10.22630/MIBE.2019.20.3.20
References

Battiti R. (1994) Using Mutual Information for Selecting Features in Supervised Neural-Net Learning. IEEE Transactions on Neural Networks and Learning Systems, 5(4), 537-550.

Brown G., Pocock A., Zhao M. J., Luján M. (2012) Conditional Likelihood Maximisation: a Unifying Framework for Information Theoretic Feature Selection. Journal of Machine Learning Research, 13(1), 27-66.

Jakulin A., Bratko I. (2004) Quantifying and Visualizing Attribute Interactions: an Approach Based on Entropy. Manuscript.

Lin D., Tang X. (2006) Conditional Infomax Learning: an Integrated Framework for Feature Extraction and Fusion. LNCS Springer, 3951, 68-82.

Pawluk M., Teisseyre P., Mielniczuk J. (2019) Information-Theoretic Feature Selection Using High-Order Interactions. LNCS Springer, 11331.

Peng H., Long F., Ding C. (2005) Feature Selection Based on Mutual Information: Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1226-1238.

Tharwat A. (2018) Classification Assessment Methods. Applied Computing and Informatics.

Xing E., Jordan M., Karp R. (2001) Feature Selection for High-Dimensional Genomic Microarray Data. ICML Proceedings of the Eighteenth International Conference on Machine Learning, 601-608.

Yang H. H., Moody J. (1999) Data Visualization and Feature Selection: New Algorithms for Nongaussian Data. Advances in Neural Information Processing Systems, 12, 687-693.

Zhang F., Li W., Zhang Y., Feng Z. (2018) Data Driven Feature Selection for Machine Learning Algorithms in Computer Vision. IEEE Internet of Things Journal, 5(6).
