Quantitative Analysis of a Weak Correlation between Complicated Data on the Basis of Principal Component Analysis

Bibliographic Details
Main Authors: Tao Pang, Haitao Zhang, Liliang Wen, Jun Tang, Bing Zhou, Qianxu Yang, Yong Li, Jiajun Wang, Aiming Chen, Zhongda Zeng
Format: Article
Language: English
Published: Wiley 2021-01-01
Series: Journal of Analytical Methods in Chemistry
Online Access: http://dx.doi.org/10.1155/2021/8874827
Description
Summary: Mining weak correlation information between two highly complex data matrices is a very challenging task. This study proposes a new method, principal component analysis-based multiconfidence ellipse analysis (PCA/MCEA). The method first applies confidence ellipses to describe the differences and correlations among different categories of objects/samples on the basis of a PCA of a single target data matrix. This makes it possible to count the objects contained in the overlapping and nonoverlapping areas of the ellipses obtained from the PCA runs. A quantitative evaluation index of the correlation between data matrices is then defined by comparing the PCA results of more than one data matrix, and the similarities and differences between the data matrices are further quantified through a comprehensive analysis of the outcomes. Complicated tobacco agriculture data, which include rich features of climate, altitude, and the chemical composition of tobacco leaves, were used as an example to illustrate the strategy of the proposed method. The number of objects in these data reached 171,516, with 14, 4, and 5 descriptors of climate, altitude, and chemicals, respectively. Using the new method, the complex but weak relationship between these independent and dependent variables was studied. Three widely used conventional methods were applied for comparison. The results showed the power of the new method to discover weak correlations between complicated data.
ISSN: 2090-8865, 2090-8873
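
Illustrative sketch: The summary describes drawing a confidence ellipse per sample category in PCA score space and counting the objects that fall in overlapping and nonoverlapping ellipse areas. The Python below is one possible way to reproduce that counting step, not the authors' implementation. The function names, the autoscaling step, the use of two components, and the 95% ellipse (a chi-square threshold on the squared Mahalanobis distance) are all assumptions made for illustration. The paper's quantitative correlation index is not specified in this record and is not reproduced here.

```python
import numpy as np

# Assumption: 95% confidence ellipse in a 2-D score plot, i.e. the 0.95
# quantile of a chi-square distribution with 2 degrees of freedom.
CHI2_95_2DOF = 5.991


def pca_scores(X, n_components=2):
    """Project autoscaled data onto its first principal components via SVD."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T


def inside_ellipse(points, group, threshold=CHI2_95_2DOF):
    """Mask of points lying inside the confidence ellipse fitted to `group`."""
    mean = group.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(group, rowvar=False))
    diff = points - mean
    # Squared Mahalanobis distance of every point to the group centre.
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return d2 <= threshold


def overlap_counts(scores, labels):
    """Count samples inside several, exactly one, or none of the class ellipses."""
    classes = np.unique(labels)
    membership = np.column_stack(
        [inside_ellipse(scores, scores[labels == c]) for c in classes]
    )
    n_inside = membership.sum(axis=1)
    return {
        "overlap": int((n_inside > 1).sum()),
        "single": int((n_inside == 1).sum()),
        "outside": int((n_inside == 0).sum()),
    }
```

Under these assumptions, running `overlap_counts(pca_scores(X), labels)` separately on each data matrix (for example climate and chemical descriptors, with the same category labels) gives counts that could then be compared across matrices. How the paper turns such counts into its evaluation index is not stated in the summary.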