Correlation and regression in contingency tables -- from Wolfram Library Archive

Title

Correlation and regression in contingency tables

Author

Thomas Colignatus

Organization:

Thomas Cool Consultancy & Econometrics

URL:	http://thomascool.eu/

Revision date

2007-06-05

Description

Nominal data currently lack a correlation coefficient of the kind already defined for real-valued data. A measure can be designed using the determinant, with the useful interpretation that the determinant gives the ratio between volumes. With M an m × n contingency table with m ≥ n, and A = Normalized[M], A'A is a square n × n matrix and the suggested measure is r = Sqrt[Det[A'A]]. With M an a × a × ... × a contingency matrix, pairwise correlations can be collected in a k × k matrix R. A matrix of such pairwise correlations is called an association matrix. If that matrix is also positive semi-definite (PSD), it is a proper correlation matrix. The overall correlation is then R = f[R], where f can be chosen to impose PSD-ness. One option is to use R = Sqrt[1 - Det[R]]. However, for both nominal and cardinal data the advisable choice is to take the maximal multiple correlation within R. The resulting measure of “nominal correlation” measures the distance between a main diagonal and the off-diagonal elements, and is thus a measure of strong correlation. Cramer’s V measure for pairwise correlation can be generalized in this manner too. It measures the distance between all diagonals (including cross-diagonals and subdiagonals) and statistical independence, and is thus a measure of weaker correlation. Finally, when variances are also defined, regression coefficients can be determined from the variance-covariance matrix.
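
As a rough illustration of the two pairwise measures named in the description, the sketch below computes r = Sqrt[Det[A'A]] and Cramer's V for a small table in plain Python. It is not taken from the package in the download. The description does not say what Normalized[M] does, so the sketch assumes it scales each column of M to unit Euclidean length; the author's definition may differ. The function names (det, normalize_columns, volume_ratio, cramers_v) are hypothetical. Cramer's V uses the standard formula Sqrt[chi-square / (N (min(m, n) - 1))].

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def normalize_columns(table):
    """ASSUMPTION: stand-in for Normalized[M]; unit-length columns."""
    m, n = len(table), len(table[0])
    norms = [math.sqrt(sum(table[i][j] ** 2 for i in range(m))) for j in range(n)]
    if any(norm == 0.0 for norm in norms):
        raise ValueError("a column with only zeros cannot be normalized")
    return [[table[i][j] / norms[j] for j in range(n)] for i in range(m)]


def volume_ratio(table):
    """r = sqrt(det(A'A)) for an m x n table with m >= n."""
    a = normalize_columns(table)
    m, n = len(a), len(a[0])
    gram = [[sum(a[i][j] * a[i][k] for i in range(m)) for k in range(n)]
            for j in range(n)]
    return math.sqrt(max(det(gram), 0.0))  # clamp tiny negative round-off


def cramers_v(table):
    """Cramer's V; assumes every row and column total is positive."""
    m, n = len(table), len(table[0])
    total = sum(sum(row) for row in table)
    rows = [sum(row) for row in table]
    cols = [sum(table[i][j] for i in range(m)) for j in range(n)]
    chi2 = 0.0
    for i in range(m):
        for j in range(n):
            expected = rows[i] * cols[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return math.sqrt(chi2 / (total * (min(m, n) - 1)))


table = [[8, 2], [2, 8]]
print(round(volume_ratio(table), 3), round(cramers_v(table), 3))  # 0.882 0.6
```

Under the assumed normalization, a purely diagonal table gives r = 1 and a table with proportional columns gives r = 0, which matches the volume-ratio reading of the determinant. For the example table the two measures differ (15/17 against 0.6), so they are not interchangeable.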

Subjects

	Business and Economics
	Mathematics > Probability and Statistics
	Wolfram Technology > Application Packages > Additional Applications > Cool Economics

Keywords

association, correlation, contingency table, volume ratio, determinant, nonparametric methods, nominal data, nominal scale, categorical data, Fisher’s exact test, odds ratio, tetrachoric correlation coefficient, phi, Cramer’s V, Pearson, contingency coefficient, uncertainty coefficient, Theil’s U, eta, meta-analysis, Simpson’s paradox, causality, statistical independence, regression

Downloads

ColignatusCorrelation.zip (157.5 KB) - ZIP archive
