Tuesday, May 26, 2009

Misusing Factor Analysis: How to make common errors?

This comes out of my personal experience. Factor analysis is one of the most (ab)used statistical techniques. MBA students use it at their leisure, and I do not remember how many times I have used it myself. The following list of errors is obviously not exhaustive, but these are the errors I was about to make. Thankfully, I was able to correct my understanding in time. (Very important from an interview perspective.)

Coolest mistake to make: I want to find out which factors are important to the customer. Ha! Run a factor analysis on the data, and the eigenvalues tell me the relative importance of the factors. The higher the eigenvalue, the higher the importance.

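The fact: High eigenvalues do not represent importance. An eigenvalue represents the amount of variance in the observed variables that the factor explains; with standardized variables, dividing it by the number of variables gives the share of total variance explained. Variance measures how much the responses spread around the mean (the expected value), and it says nothing about how much the customer cares. So you see, importance does not come into the picture. If you really want to know importance, I would recommend other techniques, such as multidimensional scaling, among others.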
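To make this concrete, here is a minimal sketch for the simplest possible case: two standardized variables with correlation r, whose correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and 1 - |r|. The function names and the r values are made up for illustration; this is not the output of any particular statistics package.

```python
def eigenvalues_2x2_corr(r):
    """Eigenvalues of the correlation matrix [[1, r], [r, 1]], largest first."""
    return (1 + abs(r), 1 - abs(r))


def variance_share(eigenvalue, n_variables):
    """Share of total variance explained, assuming standardized variables."""
    return eigenvalue / n_variables


# Illustrative correlations only; no real survey data is used here.
for r in (0.0, 0.5, 0.9):
    first, second = eigenvalues_2x2_corr(r)
    share = variance_share(first, 2)
    print(f"r={r:.1f}  eigenvalues=({first:.1f}, {second:.1f})  "
          f"first factor explains {share:.0%} of the variance")
```

The first eigenvalue grows only because the two variables become more correlated. Nothing in the calculation knows whether customers consider either variable important.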

Another, not-so-common mistake: Factor A has 10 variables, whereas Factor B has only 2. Since Factor A has more variables, customers must have a higher preference/significance/whatever for Factor A.

The fact: We must understand how factor analysis works. Let us say there is a bag called the residual, which holds the variance not yet explained; right now it is 100%. Loosely speaking, I take one variable out and look at how it correlates with the other variables. Variables that move together get grouped, so slowly I find a factor with one or more variables in it. Let us say I get a factor that explains 30% of the variance. The bag now holds a 70% residual. The process continues until the bag is empty (in practice we keep only the first few factors). So I think it is clear that the only thing we can compare between two factors is how much variance they explain. The number of variables in a factor reflects how many correlated variables were in the data, not customer preference.
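The bag picture can be sketched in a few lines. This is only an illustration under stated assumptions: it uses principal-component-style extraction (power iteration plus deflation) as a stand-in for whatever extraction method your package uses, and the correlation matrix CORR and all function names are made up. Items 0 to 2 correlate 0.6 with each other, items 3 and 4 correlate 0.5, and the two groups are uncorrelated.

```python
import math
import random

# Made-up correlation matrix for five survey items (illustrative only).
CORR = [
    [1.0, 0.6, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.6, 0.0, 0.0],
    [0.6, 0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.0, 0.5, 1.0],
]


def mat_vec(m, v):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def largest_component(m, rng, iters=1000):
    """Power iteration: largest eigenvalue of a symmetric matrix and its unit eigenvector."""
    # A random start avoids beginning exactly orthogonal to the component we want.
    v = [rng.uniform(-1.0, 1.0) for _ in m]
    for _ in range(iters):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    vv = sum(x * x for x in v)
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v))) / vv
    return eigenvalue, v


def extract_variance_shares(corr, seed=0):
    """Extract components one at a time; return each one's share of total variance."""
    rng = random.Random(seed)
    n = len(corr)
    residual = [row[:] for row in corr]  # the "bag", full at the start
    shares = []
    for _ in range(n):
        eigenvalue, v = largest_component(residual, rng)
        shares.append(eigenvalue / n)
        # Deflation: take what this component explained out of the bag.
        residual = [
            [residual[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
            for i in range(n)
        ]
    return shares


bag = 1.0
for k, share in enumerate(extract_variance_shares(CORR), start=1):
    bag -= share
    print(f"factor {k}: explains {share:.0%}, left in the bag: {max(bag, 0.0):.0%}")
```

In this made-up matrix the first component covers three items and explains 44% of the variance (eigenvalue 2.2 out of 5), while the second covers two items and explains 30% (eigenvalue 1.5). The first one is bigger only because more correlated items happened to be included, which is exactly why the variable count cannot be read as preference.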

Gyan: Before I close, I would like to share one more thing I learnt from my current project at a telecom company. Though the chances are small, clients from different sectors can produce different factors, or different variables within the factors, given the same set of variables. So a researcher must always be careful. If the sample size for each sector is large enough for analysis, the researcher should cross-check by taking each sector individually and finding its factors. It may help in understanding the customer better. That's what factor analysis is all about. :)
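A quick first check before rerunning the full factor analysis per sector is to compare the correlations sector by sector, since the correlations drive which variables end up in the same factor. The sketch below is hypothetical throughout: the sector names, the variable names "price" and "network", and the ratings are all invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented ratings from two hypothetical client sectors (not real data).
RATINGS = {
    "retail": {"price": [1, 2, 3, 4, 5], "network": [2, 3, 3, 5, 5]},
    "banking": {"price": [1, 2, 3, 4, 5], "network": [4, 2, 5, 1, 3]},
}


def correlation_by_sector(ratings, var_a, var_b):
    """Correlation between two variables, computed separately for each sector."""
    return {
        sector: pearson(columns[var_a], columns[var_b])
        for sector, columns in ratings.items()
    }


by_sector = correlation_by_sector(RATINGS, "price", "network")
for sector, r in by_sector.items():
    print(f"{sector}: correlation between price and network = {r:.2f}")
```

In this invented data the two variables move together strongly in one sector and not at all in the other, so they would likely load on the same factor in the first sector but not in the second. A pooled analysis would hide that difference.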
