KTH Matematik
Time: 16 June 2017, 13:50-14:20. Seminar room 3721, Department of Mathematics, KTH, Lindstedtsvägen 25, floor 7.
Speaker: Goran Dizdarevic
Title: Data fusion for consumer behaviour (Master's thesis)
Abstract
This thesis analyses different methods of data fusion by fitting a chosen number of statistical models to empirical consumer data and evaluating them against a selection of performance measures. The main purpose of the models is to predict business-related consumer variables. Conventional methods such as decision trees, linear models and K-nearest neighbours have been suggested, as well as single-layered neural networks and the naive Bayesian classifier. Furthermore, ensemble methods for both classification and regression have been investigated by minimizing the cross-entropy and RMSE of predicted outcomes using the iterative non-linear BFGS optimization algorithm. The time consumption of the models and methods for feature selection are also discussed. Data regarding consumer drinking habits, transaction and purchase history, and socio-demographic background is provided by Nepa. Evaluation of the performance measures indicates that the naive Bayesian classifier predicts consumer drinking habits most accurately, whereas the random forest, although the most time-consuming, is preferred when classifying the Consumer Satisfaction Index (CSI). Regression of CSI yields similar performance across all models. Moreover, the ensemble methods increased the prediction accuracy slightly while also increasing the time consumption.
Page maintainer: Filip Lindskog. Updated: 25/02-2009