Muscle Profiling – Is Muscularity Evidence of a Crime?

Muscularity (mesomorphy) has long been alleged to go hand in hand with negative characteristics such as assertiveness or even criminality (e.g., Sheldon, 1940). The modern pursuit of muscle has likewise been demonized through alleged associations with undesirable behaviors, condemnations that seem destined to produce a more formal muscle profiling.

Modern “muscle profiling” finds support from Harrison Pope, a psychiatrist with widely known positions on what he considers pathologies related to an inappropriate desire for muscle, dedication to working out, and anabolic-androgenic steroid (AAS) use. His assertion that “…if a man is fairly lean, has an FFMI greater than about 25, and claims that he has achieved this physical condition without the use of steroids, he is almost certainly lying” (Pope, Phillips, & Olivardia, 2000, pp. 35-36) casts a veneer of science over muscle profiling and presages the treatment of muscularity itself as evidence of AAS use. In a climate of condemnation of both muscle and AAS use, such assertions demand scrutiny.

The Fat-Free Mass Index (FFMI) – A Quick Primer

Why the FFMI? Expressing fat-free mass (FFM) as a percentage of body mass (BM) can be deceptive: height (H) relates linearly to FFM, and taller, slightly built individuals often exhibit similar or greater percentages of FFM than shorter, more muscular individuals. The FFMI addresses this with a formula that mirrors the BMI (BM/H²) but places FFM in the numerator (FFM/H²; VanItallie, Yang, Heymsfield, Funk, & Boileau, 1990).
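The contrast between percent FFM and FFMI can be sketched in a few lines of Python. The two men below are hypothetical; their heights, body masses, and fat-free masses are invented purely for illustration and come from no study.

```python
def percent_ffm(ffm_kg, body_mass_kg):
    """Fat-free mass as a percentage of body mass."""
    return 100.0 * ffm_kg / body_mass_kg


def ffmi(ffm_kg, height_m):
    """Fat-Free Mass Index: FFM divided by height squared (kg/m^2)."""
    return ffm_kg / height_m ** 2


# Hypothetical men (illustrative values only, not data from any study)
tall_slight = {"height_m": 1.90, "body_mass_kg": 75.0, "ffm_kg": 66.0}
short_muscular = {"height_m": 1.70, "body_mass_kg": 82.0, "ffm_kg": 72.0}

for label, man in (("tall, slight", tall_slight),
                   ("short, muscular", short_muscular)):
    pct = percent_ffm(man["ffm_kg"], man["body_mass_kg"])
    idx = ffmi(man["ffm_kg"], man["height_m"])
    print(f"{label}: {pct:.1f}% FFM, FFMI = {idx:.1f}")
```

With these made-up numbers the two men have nearly identical percentages of FFM, while the FFMI separates them clearly.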

The “Normalized” FFMI. Kouri, Pope, Katz, and Oliva (1995) found that, nonetheless, taller non-AAS users’ FFMIs were approximately 2 points higher than shorter non-users’, even when the shorter men appeared more muscular. They therefore “normalized” the FFMI (hereafter the nFFMI, for clarity) by examining the FFMI/height relationship in a group of elite non-users with FFMIs > 22, because they “…felt that the distribution of the elite group would more closely reflect the dictates of physiology and not be confounded by lack of achievement, as in less muscular subjects” (Kouri et al., 1995, p. 224). In that group, FFMI rose by 6.1 for each meter of height, so each man’s FFMI was adjusted by 6.1 multiplied by the difference between a “standard height” of 1.8 meters (approximately 5’11″) and his own height (see formula below). Consequently, FFMIs for those shorter than 1.8 meters increase and FFMIs for those taller decrease. The formula is (Kouri et al., 1995; Pope et al., 2000):

nFFMI = FFMI + 6.1(1.8-H)

Where nFFMI = normalized Fat-Free Mass Index; FFMI is the Fat-Free Mass Index; H = height in meters; and “6.1 (1.8-H)” is the correction standardized to 1.8 meters.
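As a minimal sketch, the formula translates directly into Python. The constant and function names are my own, and the 80 kg of FFM and the three heights are illustrative inputs chosen only to show the direction of the correction.

```python
STANDARD_HEIGHT_M = 1.8  # "standard height" in the formula above
SLOPE = 6.1              # height-correction coefficient in the formula above


def ffmi(ffm_kg, height_m):
    """Fat-Free Mass Index: FFM divided by height squared (kg/m^2)."""
    return ffm_kg / height_m ** 2


def nffmi(ffm_kg, height_m):
    """Normalized FFMI: FFMI + 6.1 * (1.8 - H), with H in meters."""
    return ffmi(ffm_kg, height_m) + SLOPE * (STANDARD_HEIGHT_M - height_m)


# Illustrative inputs: same FFM, three heights
for height_m in (1.70, 1.80, 1.90):
    raw = ffmi(80.0, height_m)
    adjusted = nffmi(80.0, height_m)
    print(f"H = {height_m:.2f} m: FFMI = {raw:.2f}, nFFMI = {adjusted:.2f}")
```

At exactly 1.8 meters the correction vanishes; below that height the nFFMI exceeds the FFMI, and above it the nFFMI is lower.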

What does this mean?

“We believe that, if a man is fairly lean, has an FFMI greater than about 25, and claims that he has achieved this physical condition without the use of steroids, he is almost certainly lying” (Pope et al., 2000, pp. 35-36). Many bodybuilders have protested that they or others achieved that level of muscularity without AAS; Pope’s assertion dismisses them as “…almost certainly lying.” In fact, in the Kouri et al. (1995) study, only AAS users had nFFMIs over 25.

So, take three men who deny AAS use, each “fairly lean” with the same 80 kg of FFM: at 66 inches tall, nFFMI = 29.22; at 69 inches tall, nFFMI = 26.33; and at 72 inches tall, nFFMI = 23.74. Two of them are “…almost certainly lying”. A man who is 1.8 meters tall (about 5’11″) has an nFFMI of 24.69 (if he gains about 5 pounds, he crosses the line). A matter of inches confers “…almost certain…” guilt (71″ okay, 69″ a liar). This is the problem with such pronouncements, and it is disconcerting in a culture where alleged AAS users are guilty even if they prove otherwise.
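The height sensitivity in this example can be checked with a short Python sketch. It assumes the nFFMI formula given earlier and an exact conversion of 0.0254 meters per inch; the helper `ffm_at_cutoff`, which inverts the formula, is my own addition and not part of the published method. Small differences in the second decimal can arise from how heights are converted and rounded.

```python
INCH_TO_M = 0.0254       # exact inch-to-meter conversion
STANDARD_HEIGHT_M = 1.8  # "standard height" in the nFFMI formula
SLOPE = 6.1              # height-correction coefficient in the nFFMI formula
CUTOFF = 25.0            # cut-off from the quoted assertion


def nffmi(ffm_kg, height_m):
    """Normalized FFMI: FFM / H^2 + 6.1 * (1.8 - H), with H in meters."""
    return ffm_kg / height_m ** 2 + SLOPE * (STANDARD_HEIGHT_M - height_m)


def ffm_at_cutoff(height_m, cutoff=CUTOFF):
    """Hypothetical helper: FFM (kg) at which nFFMI equals the cut-off."""
    return (cutoff - SLOPE * (STANDARD_HEIGHT_M - height_m)) * height_m ** 2


# Three men with the same 80 kg of FFM, as in the example above
for inches in (66, 69, 72):
    h = inches * INCH_TO_M
    score = nffmi(80.0, h)
    verdict = "above" if score > CUTOFF else "below"
    print(f"{inches} in: nFFMI = {score:.2f} ({verdict} {CUTOFF:g}); "
          f"cut-off reached at {ffm_at_cutoff(h):.1f} kg FFM")
```

Read the other way around, the same cut-off of 25 corresponds to a noticeably different amount of fat-free mass at each height, which is what makes a few inches decisive.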

Can the “Normalized” FFMI Detect Lies?

The Normalizing Correction. Concerns about the nFFMI as “lie detector” start with a correction that raises the nFFMI above the observed FFMI in some instances. Whenever observed data are “adjusted”, the result must be evaluated – and replicated in new samples. As noted above, this correction was derived in a small sub-sample (the n is not reported). Deriving a correction this way and then using it to support allegations across an entire population is not sufficient. The correction may be sample-specific; it must be replicated and its generality examined. If the population’s height/FFMI relationship, or the appropriate “standard height”, differs from the sample’s, then neither the correction nor the validity of the resulting nFFMI in that population is supported.

Cut-off Scores and Prediction. Suppose you want to develop a measure to discriminate clandestine drug users from non-users. You might recruit a sample split evenly between admitted users and non-users, compute your measure, and compare the groups. A good measure will likely reveal that users have a wider range of scores and a significantly higher mean score and, hence, that many users score higher than every non-user. You might then assert that anyone (even those not in your study) above a certain score on the measure has to be a user. This seems logical, but the “known groups” approach is only step one.

Kouri et al. took 83 users (53% of the sample) and 74 non-users, created their correction in a subset of this same sample, then computed the nFFMI and noted that only users (45%; n = 37) scored above 25. This shows that members of a known group obtained certain scores in this study; nothing more. The sample does not represent the base rate in the population – roughly equal group sizes are a matter of analytical assumptions – and these are not population values. So, AAS users and non-users differ on FFMI and, thus, nFFMI. This is not news; the positive effect of AAS on FFM is clear (e.g., Bhasin et al., 1996).

But their assertion that the nFFMI can detect AAS use requires more. It raises a question this study does not answer: “How likely is it that a man with a certain nFFMI belongs to a certain group?” That is, it requires predicting user/non-user status from the nFFMI. To assert this broadly requires replication of the correction and cross-validation of the cutting score (~ 25) in independent samples that accurately represent the population. Such prediction (what the nFFMI is alleged to do) requires consideration of base rates (e.g., Meehl & Rosen, 1955) – the prevalence of the behavior (e.g., AAS use) in the population. This sample of “…men in a large controlled study of athletes, recruited at gymnasiums…” (Kouri et al., 1995, p. 223) supplemented with “…23 additional men recruited for a placebo-controlled, double-blind study of testosterone cypionate in normal men…” (p. 224) does not reflect prevalence rates for AAS use in the population; it is a sample of convenience. In fact, the population of interest is not defined, although 15% of the sample were “normal” men.

Cutting scores (e.g., nFFMI ~ 25) may be sample-specific, and their predictive validity changes with the base rate of the behavior. The sample of users and non-users used to test the correction and cut-off should be independent and represent the real-world population. Nonetheless, it is suggested that this profile simply be accepted and applied broadly (e.g., Pope & Brower, 2008; Pope et al., 2000), despite such well-known considerations (e.g., Meehl & Rosen, 1955; Rosenfield, Sands, & Van Gorp, 2000). Other areas of research on detecting deception have devoted years to such studies (e.g., Rosenfield et al., 2000), yet this work seems to be considered less important when it comes to AAS use; the oft-repeated assertion (e.g., Pope & Brower, 2008) has seemingly never been subjected to such large-scale cross-validation.
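The base-rate point can be illustrated with Bayes' theorem in a few lines of Python. This is a sketch under stated assumptions, not a re-analysis: the sensitivity is taken from the 37 of 83 users above the cut-off in the Kouri et al. sample, but the specificities and base rates below are hypothetical values chosen for illustration. In the sample itself no non-user scored above 25; the sketch asks what would follow if a small share of non-users in the wider population did.

```python
def positive_predictive_value(sensitivity, specificity, base_rate):
    """P(user | score above cut-off), by Bayes' theorem.

    Assumes at least some scores fall above the cut-off, so that the
    denominator is not zero.
    """
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


# Share of users above the cut-off in the Kouri et al. (1995) sample
sensitivity = 37 / 83

# Hypothetical specificities and base rates (assumptions, not study values)
for specificity in (0.99, 0.95):
    for base_rate in (0.53, 0.10, 0.02):
        ppv = positive_predictive_value(sensitivity, specificity, base_rate)
        print(f"specificity = {specificity:.2f}, base rate = {base_rate:.2f}: "
              f"P(user | nFFMI > 25) = {ppv:.2f}")
```

Under these assumed values, the same cut-off that looks decisive when half the sample are users becomes far less trustworthy as the assumed prevalence of AAS use falls.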

Some validation was attempted by comparing nFFMIs of Mr. America winners from the “pre-steroid era” (1939 – 1959) with those of recent (as of 1995) competitors, using pictures (calculations of FFM by Dr. Pope; Kouri et al., 1995). First, an independent “rater” not connected with the study, and unaware of the hypothesis or the source of the material, should have been used to control for the possibility of bias, even if unintentional. Second, this is not validation of the nFFMI as lie detector; it relies on several potentially unwarranted assumptions. For instance, it assumes that diet and training technology and non-AAS performance aids have not evolved since 1939. In fact, another reason to replicate the derived correction and cross-validate the nFFMI is that such technologies have changed even more rapidly since 1995, when the Kouri et al. article appeared, although Pope (Pope & Brower, 2008) continues to suggest the cut-off. In 1995 and 2000 it was premature; now it is also outdated.


The FFMI and nFFMI describe physique. Indeed, they provide objective data on muscularity which could be useful in cases where body image is misperceived. However, assertions that nFFMIs > 25 expose clandestine AAS use, defining a “muscle profile” that can be used to cast suspicion, are not adequately supported and serve to further an unflattering caricature of research and science. While Kouri et al. acknowledged many (but not all) of the shortcomings highlighted here, they also suggested that they “…could ultimately follow an analogous procedure in forensic situations with individuals displaying an abnormally elevated FFMI” (p. 228). Indeed, by 2000, the shortcomings were considered insufficient to forestall a blanket accusation of prevarication among those deemed overly muscled. Unfortunately, the well-muscled are a constituency without representation; their plight evokes little sympathy.

IFBB pro bodybuilder Toney Freeman arrested in Sweden as victim of muscle profiling


Bhasin, S., Storer, T.W., Berman, N., Callegari, C., Clevenger, B., Phillips, J., Bunnell, T., Tricker, R., Shirazi, A., & Casaburi, R. (1996). The effects of supraphysiological doses of testosterone on muscle size and strength in normal men. The New England Journal of Medicine, 335, 1-7.

Kouri, E.M., Pope, H.G., Katz, D.L., & Oliva, P. (1995). Fat-free mass index in users and non-users of anabolic-androgenic steroids. Clinical Journal of Sport Medicine, 5, 223-228.

Meehl, P.E., & Rosen, A. (1955). Antecedent probability and the efficiency of psychometric signs, patterns, or cutting scores. Psychological Bulletin, 52, 194-216.

Pope, H.G., & Brower, K.J. (2008). Treatment of anabolic-androgenic steroid-related disorders. In M. Galanter & H.D. Kleber (Eds.), The American Psychiatric Publishing Textbook of Substance Abuse Treatment.

Pope, H.G., Phillips, K.A., & Olivardia, R. (2000). The Adonis Complex: The Secret Crisis of Male Body Obsession. New York, NY: Free Press.

Rosenfield, B., Sands, S.S., & Van Gorp, W.G. (2000). Have we forgotten the base rate problem? Methodological issues in the detection of distortion. Archives of Neuropsychology, 15, 348-359.

Sheldon, W.H. (1940). The Varieties of Human Physique. New York, NY: Harper and Brothers.

VanItallie, T.B., Yang, M.U., Heymsfield, S.B., Funk, R.C., & Boileau, R.A. (1990). Height-normalized indices of the body’s fat-free mass and fat mass: Potentially useful indicators of nutritional status. American Journal of Clinical Nutrition, 52, 953-959.
