The fifteen-factor approach used is summarized as follows:
The 15 primary scales can be further organized into 5 global scales: Extraversion (Warmth, Liveliness, Social Boldness, Privateness, and Self-Reliance), Anxiety (Emotional Stability, Vigilance, Apprehension, and Tension), Tough-Mindedness (Warmth, Sensitivity, Abstractedness, and Openness to Change), Independence (Dominance, Social Boldness, Vigilance, and Openness to Change) and Self-Control (Liveliness, Rule-Consciousness, and Perfectionism). The global scales of the 16PF are similar to the 5 FFM domains; in particular, Extraversion overlaps considerably with FFM extraversion, Anxiety with Neuroticism, Self-Control with Conscientiousness, and Tough-Mindedness with (negative) Openness. The Independence scale, however, has no clear-cut analogue in the FFM.
(Agreeableness is the FFM domain without a clear 16PF analog.)
This is much larger than the overlap that can be observed when individual personality traits are analyzed one at a time, or when less fine-grained personality measurements are used, such as the "Big Five" personality traits (also known as the Five Factor Model, or FFM). The authors provide some examples of the gender distinctions that are obscured in highly aggregated personality measurements:
For example, FFM extraversion has loadings on two narrower dimensions, warmth/affiliation (consistently higher in females) and dominance/venturesomeness (consistently higher in males). These two effects of opposite sign result in a small overall sex difference in extraversion, with females typically scoring (slightly) higher than males. A similar pattern of crossover sex differences has been found in openness to experience, with males scoring higher on the “ideas” dimension and females on the “aesthetics” dimension of this trait. Sex differences in Conscientiousness are also confined to just some of its components.
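The cancellation described in that passage can be made concrete with a little arithmetic. The sketch below is only an illustration, not the authors' calculation: it assumes a broad trait scored as the unit-weighted sum of two standardized facets that share a within-sex correlation `r`, and the facet effect sizes are made-up numbers chosen to have opposite signs.

```python
import math

def composite_d(d1, d2, r):
    """Standardized sex difference on the sum of two standardized facets.

    d1, d2: facet-level effect sizes (Cohen's d), hypothetical values.
    r: assumed within-sex correlation between the two facets.
    """
    mean_gap = d1 + d2                 # mean differences simply add
    pooled_sd = math.sqrt(2 + 2 * r)   # SD of a sum of two unit-variance scores
    return mean_gap / pooled_sd

warmth_d = 0.5       # hypothetical facet difference in one direction
dominance_d = -0.4   # hypothetical facet difference in the other direction
print(round(composite_d(warmth_d, dominance_d, r=0.3), 3))
```

With these invented inputs the composite difference comes out at about 0.06, far smaller than either facet-level difference, which is the blurring effect the authors describe.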
At the level of individual, fine-grained personality traits, the study found that:
[T]he largest differences between the sexes were found in Sensitivity, Warmth, and Apprehension (higher in females), and Emotional stability, Dominance, Rule-consciousness, and Vigilance (higher in males). These effects subsume the classic sex differences in instrumentality/expressiveness or dominance/nurturance.
The statistical methods used in the latent variable analysis are basically the same ones used to discern patterns in large quantities of genetic data in large population samples, by breaking the data down into dimensions of variation or hypothetical ancestral components. They also account for measurement error. From the perspective of someone familiar with linear algebra, the concepts used are akin to eigenvector analysis. According to the study authors:
When observed scores were used and univariate effect sizes were aggregated by simply averaging them (the weakest methodology), the overall male-female difference was “small” and consistent with Hyde's meta-analytic results. However, when univariate effect sizes were estimated on latent variables and aggregated in a multivariate index (the strongest methodology), sex differences increased about tenfold and became extremely large.
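To see why the two ways of aggregating can diverge, here is a rough sketch in plain Python. It assumes the multivariate index is a Mahalanobis-type distance, D = sqrt(d' R^-1 d), where d is the vector of univariate effect sizes and R is the within-sex correlation matrix. The two-trait numbers are invented for illustration and are not taken from the study, and the helper names are my own.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis_d(d, corr):
    """Multivariate distance sqrt(d' R^-1 d) between two group centroids."""
    x = solve(corr, d)
    return math.sqrt(sum(di * xi for di, xi in zip(d, x)))

def mean_abs_d(d):
    """The 'simple averaging' alternative: mean absolute univariate effect."""
    return sum(abs(v) for v in d) / len(d)

effects = [0.5, -0.5]            # hypothetical univariate effect sizes
corr = [[1.0, 0.5], [0.5, 1.0]]  # hypothetical within-sex correlations
print(mean_abs_d(effects), mahalanobis_d(effects, corr))
```

In this toy case the averaged effect is 0.5 while the multivariate distance is about 1.0, because two positively correlated traits that differ in opposite directions separate the groups more than either trait does alone. The tenfold increase reported in the quote also reflects the use of latent variables, which this sketch does not model.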
The controversial conclusion of the study's authors in Italy and the U.K. is that the "idea that there are only minor differences between the personality profiles of males and females should be rejected as based on inadequate methodology." The leading meta-analysis, by J. S. Hyde in 2005, found a 75% overlap between men and women on a wide variety of personality traits.
Probably the most important potential issue with this analysis relative to other studies of gender differences in personality, which the authors appropriately identify, is that:
Estimating group differences on latent variables is clearly preferable to relying on observed scores, but this methodology depends on the assumption of measurement invariance, i.e., the assumption that the construct being measured is actually the same in both groups. Booth and Irwing found that between-sex invariance was violated for the five global scales of the 16PF (analogous to the Big Five), but satisfied for the 15 primary factors of personality. There is evidence that the same may apply to FFM inventories. Measurement invariance is thus another reason to measure sex differences at the level of narrow traits, instead of focusing on broad traits like the Big Five.
In their conclusion they argue that there are measurement invariance issues, but that these tend to produce underestimates rather than overestimates of the gender differences, because people who complete self-reported personality inventories tend to interpret questions with reference to others of the same gender. The suggestive evidence that they offer on that point, however, isn't a particularly rigorous refutation of the concern.
The fact that a more fine-grained set of personality traits and simultaneous multivariate analysis dramatically increase the gender differences found in the data is strong support for their hypothesis. Even random noise in the data will necessarily heighten gender differences when this kind of analysis is done, relative to a less fine-grained set of personality traits and univariate analysis. But the magnitude of the differences is dramatic enough to lend weight to their hypothesis that apparent male-female similarities in personality are, to a significant degree, products of methodologies that blur the differences. Those methodologies include aggregating subtraits with different gender biases, and ignoring correlations that are both statistically important and influence how the combination of personality traits present and interact as a whole.
The margin of error claimed for the main output of the study that is compared to prior metastudies, a measure "D" that can be equated to the percentage of overlap in personality space between men and women, is about +/-5%. The margin of error in the old 75% overlap estimate isn't entirely clear, and appears to be greater, since it was binned in three large categories that each spanned a difference of a factor of two or three on an equivalent measure. But, by any measure, the effect was huge.
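For readers who want to see how a standardized distance translates into an overlap percentage, here is a small sketch. It assumes both groups are normally distributed with equal variances, and it shows two common conventions: the shared area under the two curves, and that shared area as a fraction of the total area covered by both curves together. Which convention a given paper reports is something to check against the paper itself, and the distances in the loop are purely illustrative.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def overlap_coefficient(d):
    """Shared area under two equal-variance normal curves d SDs apart."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)

def joint_overlap(d):
    """Shared area as a fraction of the area covered by both curves combined."""
    ovl = overlap_coefficient(d)
    return ovl / (2.0 - ovl)

for d in (0.2, 0.5, 1.0, 2.0, 3.0):   # illustrative distances only
    print(d, round(overlap_coefficient(d), 3), round(joint_overlap(d), 3))
```

Both measures start at 100% when the distance is zero and fall steadily as the distance grows, so a several-fold increase in the measured distance moves the overlap from "mostly shared" to "mostly separate".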
In the kind of language physicists like to use, there was a roughly 32 sigma difference between the global level gender differences seen in the new analysis and the differences found in the most disparate of the individual traits in the leading metastudy of personality differences between genders. Not a three point two sigma difference, a thirty-two sigma difference, in a study which does not suffer from common defects in behavioral science research like small sample sizes. Large sample sizes should greatly reduce the amount of statistical noise in the effects.
Physicists tend to insist on five sigma effects before they are considered significant, and social scientists tend to be quite a bit more lenient in their standards of statistical significance.
It is also worth noting that neither result (the old similarity hypothesis or their strong difference hypothesis) is congruent with a distinctly feminist or anti-feminist view. Some of the earlier notions of feminism, which were pivotal in motivating an end to legal sex discrimination in most forms of education and employment, focused on breaking down gender stereotypes and de-emphasizing gender differences. More modern feminist scholarship has had more of a "different but equal" character to it, recognizing strong differences in masculine and feminine ways of conceptualizing situations and moral issues, and arguing for a perspective that gives greater weight than traditional social and legal arrangements do to perspectives that are more feminine. Likewise, this research doesn't begin to probe the extent to which a more complex than binary deconstruction of masculine-feminine differences would affect their analysis, or the extent to which atypical gender identities and sexual orientations could be driving some of the remaining overlap between gender and personality that they observe. Ten percent overlaps start to reach scales where bimodal continuums of sexual orientation and gender identity at reported levels could materially impact the result.