What Are Z-Scores Used For in Health and Statistics

Z-scores are used to measure how far a value sits from the average of its group, expressed in standard deviations. This simple concept powers a surprisingly wide range of real-world applications, from tracking a child’s growth to flagging fraudulent transactions in a dataset. At its core, a z-score of 0 means a value is perfectly average, a z-score of +1 means it’s one standard deviation above average, and a z-score of -2 means it’s two standard deviations below.

How Z-Scores Work

The formula is straightforward: subtract the mean from your data point, then divide by the standard deviation. If a class has an average test score of 75 with a standard deviation of 10, and you scored 85, your z-score is +1.0. You’re exactly one standard deviation above the class average.

What makes this useful is standardization. Raw numbers from different scales become directly comparable once converted to z-scores. Say you scored 501 on both the math and reading sections of the SAT. Those scores might look equal, but if the math section has a higher average and tighter spread than the reading section, your z-scores would reveal that your reading performance was actually stronger relative to other test-takers. Without z-scores, that comparison is impossible to make accurately.

Z-scores also map neatly onto percentiles. A z-score of 0 corresponds to the 50th percentile, meaning you’re right in the middle. A z-score of +1.645 puts you at the 95th percentile, while -1.645 lands at the 5th. About 68% of all values in a normal distribution fall between z-scores of -1 and +1, roughly 95% fall between -2 and +2, and 99.7% fall between -3 and +3.

Tracking Children’s Growth

One of the most common medical uses of z-scores is pediatric growth monitoring. When a pediatrician plots your child’s height, weight, or BMI on a growth chart, they’re using z-scores behind the scenes. The World Health Organization recommends z-scores as the standard method for interpreting children’s body measurements, partly because they work across sex and age groups without needing separate scales.

A child with a weight-for-age z-score of 0 is exactly average for their age and sex. A z-score between -2 and -3 indicates moderate malnutrition, while a z-score below -3 signals severe malnutrition. The same thresholds apply to height-for-age (where low scores indicate stunting) and weight-for-height (where low scores indicate wasting). These cut-offs give clinicians around the world a consistent, objective way to identify children who need nutritional intervention, regardless of what country they’re in or what units they use.

Measuring Bone Density

If you’ve ever had a bone density scan (DEXA), your results likely came back as either a T-score or a z-score, depending on your age. For premenopausal women, men under 50, and children, the result is reported as a z-score. This compares your bone mineral density to healthy people of your same age, ethnicity, and sex, rather than to a young adult baseline.

A bone density z-score of -2.0 or lower means your bones are significantly less dense than expected for someone like you. That finding often prompts investigation into underlying causes, since unusually low bone density in a younger person may point to a medication side effect or an undiagnosed condition rather than typical aging. Postmenopausal women and men over 50 receive T-scores instead, which compare against peak bone density in healthy young adults and are used to diagnose osteoporosis directly.

Evaluating Heart Size in Children

Pediatric cardiologists rely heavily on z-scores when measuring structures like the aortic root, the base of the body’s largest artery. A child’s heart grows as their body grows, so a raw measurement in millimeters is nearly meaningless without context. Z-scores solve this by comparing the measurement against what’s expected for a child of that body size.

Established reference charts correlate body surface area with expected aortic dimensions to generate z-scores. A z-score well above +2 might indicate abnormal dilation that needs monitoring or intervention. For certain populations, like individuals with Turner syndrome, condition-specific z-score calculations are used to more accurately assess risk, since standard charts don’t reflect their typical anatomy.

Comparing Scores Across Different Tests

In psychology and education, raw test scores are nearly useless for comparison. A score of 42 on one cognitive test and 118 on another tells you nothing about which performance was stronger. Z-scores fix this by placing every score on a common scale relative to the test’s reference population.

Psychologists use z-scores (and related scales like T-scores, which are just z-scores rescaled to have a mean of 50 and a standard deviation of 10) to interpret performance on IQ tests, memory assessments, attention measures, and achievement tests. This standardization process controls for differences in test difficulty, scoring range, and the population that took the test. It’s how a clinician can look at a battery of six different cognitive tests and identify which abilities fall meaningfully below expectations.

Detecting Outliers in Data

In statistics and data science, z-scores are a standard tool for flagging outliers, data points that are unusually far from the rest. The logic is intuitive: if most values cluster within two or three standard deviations of the mean, anything beyond that range deserves a closer look.

A common rule of thumb treats any value with a z-score beyond +3 or -3 as a potential outlier. The National Institute of Standards and Technology notes, however, that standard z-scores can be misleading with small datasets, since extreme values pull the mean and standard deviation toward them. For more robust detection, statisticians sometimes use a modified z-score based on the median rather than the mean. With that approach, values with a modified z-score beyond 3.5 are flagged as potential outliers.

This technique shows up in fraud detection, quality control in manufacturing, and cleaning messy datasets before analysis. If a factory produces bolts that are typically 10mm long with very little variation, a bolt that generates a z-score of +4 is almost certainly defective.

Z-Scores and Percentiles

Z-scores and percentiles convey the same information in different formats. A z-score tells you how many standard deviations from the mean, while a percentile tells you what proportion of the population falls below that point. The CDC’s growth chart data files spell out the exact correspondence: a z-score of -0.674 equals the 25th percentile, 0 equals the 50th, +1.036 equals the 85th, and +1.881 equals the 97th.

Percentiles are often easier for people to grasp intuitively. Telling a parent their child is at the 10th percentile for height communicates more immediately than saying the z-score is -1.282. But z-scores are more useful mathematically, because they can be averaged, compared across groups, and plugged into further calculations. In practice, most medical and educational reports present both, or convert freely between them depending on the audience.