Abstract. Data mining problems in business and industry are mostly concerned with prediction on large datasets. We first present data mining methods in Mathematica for predicting continuous outcomes using adaptive linear least-squares fits. These methods use RLink, a link we have developed between Mathematica and R, to compute the fits with methods developed by others as an R package. In particular, we illustrate MARS-like (multivariate adaptive regression splines) model fits, using Mathematica wrapper routines to integrate the fitting method of the R "earth" package seamlessly into Mathematica. Many data mining problems are instead concerned with predicting binomial (true/false) outcomes. For example, in advertising, one is often interested in predicting whether an individual with given characteristics will respond to an advertisement. Alternatively, in a race for consumer dollars, one is often interested in predicting which brand or product a consumer is most likely to purchase, based on both consumer and brand characteristics. We then present adaptive methods for making these predictions. Originally implemented by us in R, these methods have been implemented in Mathematica using MLINK.