Although the theory of numerical techniques may seem dry and utterly technical, and their effects are mostly noticed only when they are incorrectly applied (just count the number of ‘rounding error caused tragic mishap’ posts), I want to show an example of how a specific numerical technique and its interpretation can come to dominate an entire discipline and lead people to advocate deeply flawed philosophies.
My first exposure to PCA, and factor analysis in general, came from a rather unexpected corner: The Mismeasure of Man, by Stephen Jay Gould. I’m sure at least a few of you have read the book, but for those who haven’t, it is about the history, development, and abuses of a particular view of intelligence and intelligence testing.
A quick digression into the history of PCA and factor analysis: it was psychologists, most notably Charles Spearman, who first developed the general theory and practice of factor analysis in the early 20th century. Spearman, an English psychologist and statistician, noticed that schoolchildren’s exam scores were closely correlated across subjects. This led him to posit a ‘general intelligence’ factor, g, which he extracted with an early form of factor analysis closely related to PCA. As we are learning, PCA decomposes high-dimensional data into a smaller number of factor dimensions and ranks them by how much of the input data’s variance each one accounts for (a related technique, common factor analysis, does not try to account for all of the variance, but only decomposes the variance that the variables have in common). Spearman posited that g, the principal factor, was simply general intelligence.
Spearman’s work was influential in psychology, especially in the new field of intelligence testing. It seemed to lend credence to men like Lewis Terman, who popularized various forms of IQ tests and was among the psychologists who persuaded the Army to administer intelligence tests to recruits during World War I.
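To make the PCA step described above concrete, here is a minimal sketch in Python with NumPy. The data are synthetic and every name and number in it (`pca`, `n_students`, the 0.8 and 0.6 weights) is my own assumption for illustration, not anything taken from Spearman or Gould. Each made-up ‘exam score’ is a shared component plus subject-specific noise, so the scores are positively correlated by construction.

```python
import numpy as np

def pca(scores):
    """Return (components, explained_variance_ratio) for a samples-by-variables array."""
    centered = scores - scores.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions,
    # already ordered from largest to smallest singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variances = singular_values ** 2 / (len(scores) - 1)
    return vt, variances / variances.sum()

rng = np.random.default_rng(0)
n_students, n_subjects = 500, 4  # hypothetical sizes

# Assumption: score = shared component + independent per-subject noise.
shared = rng.normal(size=(n_students, 1))
scores = 0.8 * shared + 0.6 * rng.normal(size=(n_students, n_subjects))

components, ratio = pca(scores)
print("explained variance ratio:", np.round(ratio, 2))
print("first component loadings:", np.round(components[0], 2))
```

The first component soaks up most of the variance and loads on every subject with the same sign. Note what the output is, though: a statement about how the variance in the table of scores is distributed. Here I know there is a single shared cause only because I built it into the data; the decomposition by itself would look much the same for any set of positively correlated columns.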
Armed with the mathematical reasoning behind PCA and the apparent existence of general intelligence as a real, measurable quantity, Terman used his tests to make sweeping claims:
“High-grade or border-line deficiency… is very, very common among Spanish-Indian and Mexican families of the Southwest and also among negroes. Their dullness seems to be racial, or at least inherent in the family stocks from which they come… Children of this group should be segregated into separate classes… They cannot master abstractions but they can often be made into efficient workers… from a eugenic point of view they constitute a grave problem because of their unusually prolific breeding” (The Measurement of Intelligence, 1916, pp. 91–92).
Although this sounds wildly offensive to modern sensibilities, Terman’s views were only mildly controversial in his day. Poor performance by Southern and Eastern European immigrants on intelligence tests became a reason to restrict their immigration to the US. This particular view of intelligence lent credence to the belief that white men, especially Northern Europeans, had an indubitable claim to superior intelligence. Even worse, it led this racist and repugnant view to be taken seriously as an academic position.
According to Gould, who spends much of his book excoriating the reasoning behind a general intelligence factor, Terman’s and Spearman’s mistake was to confuse statistical factors with real ones. He calls this error in reasoning reification (from the Latin res, ‘thing’), and argues that the existence of some general intelligence quantity in our minds cannot be justified through statistical decomposition alone, without proper empirical support. He describes the subject of his book as
…the abstraction of intelligence as a single entity, its location within the brain, its quantification as one number for each individual, and the use of these numbers to rank people in a single series of worthiness, invariably to find that oppressed and disadvantaged groups—races, classes, or sexes—are innately inferior and deserve their status. (pp. 24–25)
In other words, the misapplication of a numerical technique became a motivating force for social injustice.
Personally, I find parts of Gould’s book to be a stretch. However, his point has clear implications for anyone who will apply the techniques we are learning to real-world settings: many of them (PCA, curve-fitting, etc.) imply a model of reality that may well be incorrect. For example, if you lack strong justification for assuming that a smooth, continuous curve underlies a set of data points, then fitting an interpolating polynomial to them is just begging for spurious results.
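The interpolation warning is easy to see for yourself. The following is a small sketch in plain Python, under assumptions I made up for the purpose: the ‘true’ signal is flat at zero, and the eleven equally spaced measurements carry an alternating error of ±0.1. The function name `lagrange_eval` and the data are hypothetical; the method is ordinary Lagrange interpolation.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the unique interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Lagrange basis polynomial: 1 at xi, 0 at every other node.
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

# Hypothetical data: a flat signal (true value 0) with alternating +/-0.1 error.
xs = [float(i) for i in range(11)]
ys = [0.1 * (-1) ** i for i in range(11)]

# Sample the degree-10 interpolant on a fine grid between the data points.
grid = [i / 100 for i in range(1001)]
peak = max(abs(lagrange_eval(xs, ys, x)) for x in grid)

print(f"largest |data value|: {max(abs(y) for y in ys):.2f}")
print(f"largest |interpolant| between points: {peak:.2f}")
```

The polynomial passes through every data point exactly, yet between the points near the ends of the range it swings to values many times larger than anything in the data. Nothing in the measurements supports those swings; they come entirely from the assumption that a single smooth polynomial generated the points.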
http://en.wikipedia.org/wiki/Factor_analysis
http://en.wikipedia.org/wiki/IQ
http://en.wikipedia.org/wiki/Lewis_Terman
http://en.wikipedia.org/wiki/Principal_components_analysis
http://books.google.com/books?id=Xv1OupfjBDIC&pg=PA133&lpg=PA133&dq=nonorthogonal+factors&source=web&ots=g4QvGDZ3VZ&sig=a2U6B_HGJbHihs9fM9jXRx2e9RY&hl=en
http://www.amazon.com/Mismeasure-Man-Stephen-Jay-Gould/dp/0393314251





