Machine Learning Based Integrative Regression Analysis of High-Dimensional Heterogeneous Data
Ramana S1*, Gomathi S2, Jayashree S2, Pranesh K2 and Vignesh S2
1Assistant Professor, Department of Information Technology, Nandha College of Technology, Perundurai Main Road, India
2UG – Final Year, Department of Information Technology, Nandha College of Technology, Perundurai Main Road, India
*Corresponding Author: Ramana S, Assistant Professor, Department of Information Technology, Nandha College of Technology, Perundurai Main Road, India.
Received: May 09, 2024; Published: May 26, 2025
Abstract
In the present digital era, massive amounts of data are continuously generated at exceptional and increasing scales. This data has become an important and indispensable part of every economy, industry, organization, business and individual. Handling these large datasets is a major challenge because of the heterogeneity of their formats, so efficient data processing techniques are needed both to handle heterogeneous data and to meet the computational requirements of processing such volumes. In supervised learning, missing values usually appear in the training set. They may introduce bias, degrading the quality of the supervised learning process and the performance of classification algorithms, so a reliable method for dealing with missing values is necessary. In this paper, we analyse the difference between imputation of missing values and imputation in real-world applications. The objective of this project is to review, describe and reflect on heterogeneous data and the complexity of processing it, and on the use of machine learning algorithms, which play a major role in data analytics. We experimentally show that our approach significantly outperforms some standard machine learning methods for handling missing values in classification tasks.
Keywords: Heterogeneous Data; Missing Data; Predictive Analytics; Convolutional Neural Network; Long Short-Term Memory; Data Imputation
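As a point of reference for the missing-value problem described in the abstract, the following is a minimal sketch of a simple baseline imputation on heterogeneous (mixed numeric and categorical) records. It is not the authors' method, which the abstract does not specify; the function names `impute_column` and `impute_table` and the sample records are hypothetical and chosen only for illustration. Numeric columns are filled with the column mean and all other columns with the most frequent observed value.

```python
from collections import Counter
from statistics import mean


def impute_column(values):
    """Fill None entries: column mean if numeric, most frequent value otherwise."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed, so nothing to impute from
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in observed
    )
    if is_numeric:
        fill = mean(observed)
    else:
        fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def impute_table(rows):
    """Impute each column of a list-of-rows table independently."""
    columns = list(zip(*rows))
    filled = [impute_column(col) for col in columns]
    return [list(row) for row in zip(*filled)]


# Hypothetical records: age (numeric), city (categorical), score (numeric).
rows = [
    [25, "chennai", 1.0],
    [None, "erode", 3.0],
    [35, None, None],
    [30, "erode", 2.0],
]
completed = impute_table(rows)
```

A baseline of this kind is typically what a learned imputation approach is compared against. It ignores relationships between columns, which is one reason it can bias a downstream classifier.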