








Correlation and regression in contingency tables






Organization:  Thomas Cool Consultancy & Econometrics 






20070605






Nominal data currently lack a correlation coefficient of the kind already defined for real data. A measure can be designed using the determinant, with the useful interpretation that the determinant gives the ratio between volumes. With M an m × n contingency table with m ≥ n, and A = Normalized[M], A'A is a square n × n matrix and the suggested measure is r = Sqrt[Det[A'A]]. With M an a × a × ... × a contingency matrix in k dimensions, the pairwise correlations can be collected in a k × k matrix R. A matrix of such pairwise correlations is called an association matrix. If that matrix is also positive semidefinite (PSD) then it is a proper correlation matrix. The overall correlation then is f[R], where f can be chosen to impose positive semidefiniteness. An option is to use f[R] = Sqrt[1 - Det[R]]. However, for both nominal and cardinal data the advisable choice is to take the maximal multiple correlation within R. The resulting measure of “nominal correlation” measures the distance between the main diagonal and the off-diagonal elements, and thus is a measure of strong correlation. Cramer’s V measure for pairwise correlation can be generalized in this manner too. It measures the distance between all diagonals (including cross-diagonals and subdiagonals) and statistical independence, and thus is a measure of weaker correlation. Finally, when variances are also defined, regression coefficients can be determined from the variance-covariance matrix.
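The two determinant formulas in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not define Normalized[M], so the helper `normalize_columns` below assumes it means scaling each column of M to unit Euclidean length, and all function names are hypothetical. Under that assumption A'A is a Gram matrix with a unit diagonal, and its determinant is the squared volume spanned by the normalized columns.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def normalize_columns(table):
    """ASSUMPTION: stand-in for Normalized[M]; unit Euclidean length per column."""
    m, n = len(table), len(table[0])
    norms = [math.sqrt(sum(table[i][j] ** 2 for i in range(m))) for j in range(n)]
    if any(norm == 0.0 for norm in norms):
        raise ValueError("a column of zeros cannot be normalized")
    return [[table[i][j] / norms[j] for j in range(n)] for i in range(m)]


def nominal_r(table):
    """r = Sqrt[Det[A'A]] for an m x n table with m >= n."""
    a = normalize_columns(table)
    m, n = len(a), len(a[0])
    if m < n:
        raise ValueError("expected at least as many rows as columns")
    # Gram matrix A'A (n x n).
    gram = [[sum(a[i][p] * a[i][q] for i in range(m)) for q in range(n)]
            for p in range(n)]
    return math.sqrt(max(det(gram), 0.0))


def overall_from_matrix(R):
    """f[R] = Sqrt[1 - Det[R]] for a k x k matrix of pairwise correlations."""
    return math.sqrt(max(1.0 - det(R), 0.0))


diagonal_table = [[10, 0], [0, 20]]        # all mass on the main diagonal
independent_table = [[10, 20], [30, 60]]   # proportional columns

print(nominal_r(diagonal_table))
print(nominal_r(independent_table))
print(overall_from_matrix([[1.0, 0.6], [0.6, 1.0]]))
```

In this sketch a purely diagonal table gives r = 1 and a table with proportional columns (statistical independence) gives r = 0. For two variables, Det[R] = 1 - r², so Sqrt[1 - Det[R]] reduces to the absolute pairwise correlation.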












association, correlation, contingency table, volume ratio, determinant, nonparametric methods, nominal data, nominal scale, categorical data, Fisher’s exact test, odds ratio, tetrachoric correlation coefficient, phi, Cramer’s V, Pearson, contingency coefficient, uncertainty coefficient, Theil’s U, eta, meta-analysis, Simpson’s paradox, causality, statistical independence, regression






 ColignatusCorrelation.zip (157.5 KB)  ZIP archive 







     
