Is a Likert Scale Ordinal or Interval? A Clear Answer

A Likert scale is technically ordinal, not interval. The response options (such as “strongly disagree” through “strongly agree”) have a clear rank order, but the psychological distance between each point isn’t guaranteed to be equal. That said, the practical answer is more nuanced: how you should treat Likert data depends on whether you’re working with a single survey question or a composite score built from multiple questions.

Why Likert Data Is Technically Ordinal

Ordinal data has a meaningful order but no guaranteed equal spacing between values. When someone moves from “disagree” to “neutral” on a 5-point scale, that shift doesn’t necessarily represent the same magnitude as moving from “neutral” to “agree.” The labels correspond to subjective judgments, and people interpret the gaps between them differently. Interval data, by contrast, requires that each unit of measurement is identical: the difference between 70°F and 71°F is the same as between 90°F and 91°F. Likert responses don’t meet that standard in a strict mathematical sense.

This distinction matters because it determines which statistics are appropriate. For true ordinal data, you’d report the median or mode rather than the mean, and you’d use non-parametric tests like the Mann-Whitney U or Kruskal-Wallis test for comparisons between groups.
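As an illustrative sketch, these ordinal-appropriate summaries and tests can be applied directly to coded responses using SciPy and the standard library; the data below are hypothetical:

```python
from statistics import median
from scipy.stats import mannwhitneyu

# Hypothetical single-item responses (1 = strongly disagree ... 5 = strongly agree)
group_a = [1, 2, 2, 2, 2, 3, 3, 3, 3, 4]
group_b = [3, 3, 4, 4, 4, 4, 4, 5, 5, 5]

# Ordinal-appropriate summaries: medians rather than means
print("medians:", median(group_a), median(group_b))

# Non-parametric comparison of the two independent groups
stat, p = mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"Mann-Whitney U = {stat}, p = {p:.4f}")
```

For three or more groups, `scipy.stats.kruskal` plays the analogous role.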

Single Items vs. Composite Scales

One of the most important distinctions in this debate is the difference between a single Likert item and a Likert-type scale. A Likert item is one question with an ordered response format. A Likert-type scale combines several related items, typically 4 to 10, into a composite score by summing or averaging the responses. These two things behave very differently as data.

A single Likert item is clearly ordinal. It has only a handful of possible values, the spacing between them is ambiguous, and a mean of 3.4 on a single “strongly disagree to strongly agree” question is hard to interpret meaningfully. For individual items, the median or mode is the more appropriate summary statistic, and non-parametric tests are the safer analytical choice.
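A small, hypothetical example shows why the mean can mislead on a single item: a polarized set of responses can produce a mean that falls on a value no respondent actually chose, while the median and mode reflect the dominant category.

```python
import statistics

# Hypothetical responses to one 5-point item: a polarized pattern
responses = [1, 1, 2, 3, 5, 5, 5, 5, 5, 5]

print("mean:", statistics.mean(responses))      # 3.7, a value no one selected
print("median:", statistics.median(responses))  # 5
print("mode:", statistics.mode(responses))      # 5
```

The mean of 3.7 suggests mild agreement, yet not a single respondent picked 4; the median and mode of 5 better convey that most respondents strongly agreed while a minority strongly disagreed.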

Composite scores are a different story. When you average or sum responses across multiple related items, the resulting data begins to approximate a continuous distribution. A person’s total score across eight items can take on many possible values, and the data tends to behave much more like interval-level measurement. This is why researchers routinely apply parametric tests (t-tests, ANOVA, Pearson correlations, regression) to composite Likert scale scores. The composite score also provides a more reliable measure of whatever you’re trying to assess, because combining items reduces the random error that comes with any single question.
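A minimal sketch of this workflow, using simulated (not real) data: each respondent answers eight 5-point items, scores are averaged into a composite, and the composites are compared with a t-test.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)

# Simulated data: two groups of 30 respondents, each answering 8 Likert items (1-5)
group_a = rng.integers(1, 6, size=(30, 8))
group_b = np.clip(rng.integers(1, 6, size=(30, 8)) + 1, 1, 5)  # shifted upward

# Composite score: mean across the 8 items for each respondent
comp_a = group_a.mean(axis=1)
comp_b = group_b.mean(axis=1)

# Parametric comparison on the near-continuous composite scores
t, p = ttest_ind(comp_a, comp_b)
print(f"t = {t:.2f}, p = {p:.4f}")
```

Note how each composite can take many values between 1 and 5 (any multiple of 1/8 in this case), which is what makes the approximately-interval treatment defensible.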

When Parametric Tests Are Acceptable

Even with individual Likert items, a growing body of evidence supports using parametric tests under the right conditions. Research published in the medical education literature demonstrates that parametric tests are “sufficiently robust to yield largely unbiased answers that are acceptably close to the truth” when applied to Likert scale responses. Parametric tests have been shown to perform well, and in many cases outperform non-parametric alternatives, when two conditions are met: the sample size is adequate (at least 5 to 10 observations per group) and the data are approximately normally distributed.
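A toy simulation (with made-up response probabilities) illustrates the point: running both a t-test and a Mann-Whitney U on the same single-item data lets you compare the p-values directly, and under the conditions described above the two tests usually support the same conclusion.

```python
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(0)

# Hypothetical single-item responses: group b's distribution is shifted slightly upward
a = rng.choice([1, 2, 3, 4, 5], size=40, p=[0.10, 0.30, 0.30, 0.20, 0.10])
b = rng.choice([1, 2, 3, 4, 5], size=40, p=[0.05, 0.15, 0.30, 0.30, 0.20])

# Parametric and non-parametric tests on the same data
t_p = ttest_ind(a, b).pvalue
u_p = mannwhitneyu(a, b, alternative="two-sided").pvalue
print(f"t-test p = {t_p:.3f}, Mann-Whitney p = {u_p:.3f}")
```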

This pragmatic position has gained traction across fields. Many researchers treat Likert data as interval when the sample is large enough and the distribution isn’t heavily skewed, even if the measurement level is technically ordinal. The key insight is that “ordinal” describes the measurement properties of the scale itself, while the choice of statistical test depends on how the data actually behave in your dataset.

The Role of the Neutral Midpoint

The middle option on an odd-numbered Likert scale, often labeled “neither agree nor disagree,” adds another wrinkle. From a theoretical standpoint, this midpoint represents a genuinely neutral position between the positive and negative ends of the scale. But in practice, people select it for different reasons: true neutrality, ambivalence, confusion, or simple disengagement.

Research suggests the midpoint matters most on shorter scales. On a 5-point scale, the neutral category plays an important role because there are so few response options. On a 9-point scale, the neutral option exerts much less influence on the overall data because respondents have more room to place themselves along the continuum. Even-numbered scales (4-point or 6-point) eliminate the neutral option entirely, forcing respondents to lean one direction or the other. These are sometimes called “forced-choice scales.”

The presence and behavior of the midpoint are one reason the equal-interval assumption is hard to defend for individual items. If “neutral” functions as a catch-all category rather than a precise midpoint, the distances between scale points are uneven by definition.

How to Choose Your Approach

The ordinal-versus-interval question ultimately comes down to what you’re analyzing and how cautious you want to be. Here’s a practical framework:

  • Single Likert items: Treat as ordinal. Report the median or mode. Use non-parametric tests for group comparisons.
  • Composite scores from multiple items: Can generally be treated as approximately interval. Reporting means is standard practice, and parametric tests like t-tests and ANOVA are widely accepted.
  • Small samples with skewed distributions: Stick with non-parametric methods regardless of whether you’re using single items or composites.
  • Large samples with roughly normal distributions: Parametric tests perform well even on individual Likert items, though reporting this choice transparently is good practice.
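The framework above can be sketched as a toy decision helper. The function name and the numeric threshold are illustrative only, not a standard rule:

```python
def recommended_approach(composite: bool, n_per_group: int, roughly_normal: bool) -> str:
    """Toy decision rule mirroring the framework above (thresholds are illustrative)."""
    # Small or skewed samples: stay non-parametric regardless of item type
    if n_per_group < 10 or not roughly_normal:
        return "non-parametric: median/mode, Mann-Whitney U or Kruskal-Wallis"
    # Composite scores behave approximately interval
    if composite:
        return "parametric: means, t-tests, ANOVA on composite scores"
    # Large, roughly normal samples of single items
    return "parametric acceptable for single items; report the choice transparently"

print(recommended_approach(composite=True, n_per_group=50, roughly_normal=True))
```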

Visualizing Likert Data

The way you display Likert results should reflect the ordinal nature of the responses. Standard bar charts or pie charts can obscure the directional structure of the data. More effective options include 100% stacked bar charts, which show the proportion of respondents at each level and make it easy to read the combined “agree plus strongly agree” percentages at a glance.

Diverging stacked bar charts are particularly useful because they split responses around the midpoint, visually separating positive from negative sentiment. If the neutral category is large or difficult to interpret, you can pull it out and display it separately rather than forcing it into one side. Small multiples, where each question gets its own mini-chart, work well when you need to compare the distribution of “strongly agree” or “strongly disagree” across many items without the visual clutter of a single dense chart.
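A minimal sketch of the data preparation behind these charts, using hypothetical responses: the per-category percentages feed a 100% stacked bar, and grouping them on either side of neutral gives the diverging layout.

```python
from collections import Counter

# Hypothetical responses to one item (1 = strongly disagree ... 5 = strongly agree)
responses = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5]

# Percentage of respondents at each level (the segments of a 100% stacked bar)
counts = Counter(responses)
total = len(responses)
pct = {k: 100 * counts[k] / total for k in range(1, 6)}
print(pct)

# Grouped for a diverging layout: disagree side, neutral, agree side
negative = pct[1] + pct[2]
positive = pct[4] + pct[5]
print(f"disagree {negative:.0f}% | neutral {pct[3]:.0f}% | agree {positive:.0f}%")
```

From here, the neutral share can be plotted as its own segment or pulled out entirely, as described above.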

Whichever format you choose, avoid plotting individual Likert items on a continuous axis with decimal-point precision. A mean of 3.7 on a single item implies a level of measurement precision that the scale doesn’t actually provide. Save those continuous summaries for composite scores, where the math is on firmer ground.