Estimation: A Generalization

It's worth noting that least squares is a form of statistical estimation; as we recall from lecture, the idea is to define a criterion for matching a model to some data, here the sum of squared errors between the model and the data, and then choose the parameters that minimize it. The result is a least-squares estimator for the given problem. There are, however, two key points worth making. First, as we have seen on homework assignments, a closed-form least-squares estimator does not necessarily exist. While we may be able to find a not-so-closed-form estimator, we may not be able to calculate it quickly enough for a particular application. Fortunately, there are several other options.
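For a case where the closed form does exist, here is a minimal sketch in Python of the least-squares fit of a straight line y = slope * x + intercept. The function name fit_line and the sample data are made up for illustration; they are not from lecture or the homework.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared and cross deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # These values minimize the sum of squared residuals.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

When the model is nonlinear in its parameters, no such formula is generally available and one falls back on an iterative search, which is where the speed concern above comes in.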

One particularly common option is to use a maximum likelihood estimator. What we want to find is the value of the parameter x that maximizes the probability of the observed result o. One does this by writing the probability of o as a function of the parameters (the likelihood), taking the partial derivatives with respect to each parameter, setting them to zero, and solving for the maximum. Yet another approach is a minimum variance estimator; these estimators attempt to minimize the variance of the estimate, usually among unbiased estimators. Not only that, but their error is well understood: the variance of any unbiased estimator is bounded below by the Cramér-Rao lower bound. A reasonable discussion can be found in Steven Kay's text on Statistical Signal Processing (Vol 1, Estimation Theory).
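As a small worked case of the maximum likelihood recipe, assume (purely for illustration) that the observations o_1, ..., o_n are independent draws from an exponential distribution with unknown rate x. The log-likelihood is n*log(x) - x*sum(o_i); its derivative n/x - sum(o_i) is zero at x = n / sum(o_i). The sketch below computes that estimate and compares the log-likelihood at nearby rates; the function names and the data are hypothetical.

```python
import math


def log_likelihood(rate, observations):
    """Log-likelihood of i.i.d. exponential observations at a given rate."""
    return len(observations) * math.log(rate) - rate * sum(observations)


def mle_rate(observations):
    """Rate at which the derivative of the log-likelihood is zero."""
    return len(observations) / sum(observations)


# Hypothetical observations, e.g. waiting times in seconds.
observations = [0.5, 1.5, 1.0, 2.0, 1.0]
best = mle_rate(observations)

# Sanity check: moving the rate in either direction lowers the likelihood.
at_best = log_likelihood(best, observations)
lower = log_likelihood(0.9 * best, observations)
higher = log_likelihood(1.1 * best, observations)
```

The exponential model was picked only because the derivative can be solved by hand; for most models the maximization has to be done numerically.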

Second, as with all statistical techniques, a certain amount of error-checking is necessary. Just because one can run the least squares algorithm on a model and a data set does not necessarily mean that the data matches the model in any way. Standard ways of checking the result include the correlation coefficient and the p-value; both are well-understood statistical metrics. One particularly hilarious illustration of this fact was recently splashed all over Slashdot: a Czech ornithologist ran a study of scientific output, measured in papers per year and citations per paper, versus quantity of beer consumed for a number of his colleagues. Unfortunately, he forgot to double-check his statistics: the correlation coefficient was a scant 0.5, which means a linear fit accounts for only about a quarter of the variance in the data. A far more thorough explanation of the faults of this study is linked below. Granted, a correlation coefficient of 0.5 is pretty good for some areas, but one would normally like to see something better; even then, however, the acceptable accuracy of the estimator is clearly application specific. That said, it always pays to double-check the results.
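To make the two checks concrete, here is a rough sketch in Python of the Pearson correlation coefficient together with a p-value estimated by a permutation test. The permutation test is a stand-in chosen because it needs only the standard library; it is not necessarily how the study's statistics were computed, and the function names, trial count, and data are all hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Estimate how often shuffled data correlates at least as strongly."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Made-up data: x and a noisy response y.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 1.9, 3.5, 3.0, 4.8, 4.1, 6.2, 5.9]
r = pearson_r(xs, ys)
r_squared = r ** 2  # fraction of variance explained by a linear fit
p = permutation_p_value(xs, ys)
```

Squaring r is the quick sanity check: a correlation of 0.5 gives r squared of 0.25, so three quarters of the variation is left unexplained by the fitted line.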

http://en.wikipedia.org/wiki/Cramér-Rao_inequality

http://en.wikipedia.org/wiki/Minimum_variance_unbiased

http://en.wikipedia.org/wiki/Maximum_likelihood

http://life.lithoguru.com/index.php?itemid=119
