# What Is a Good AIC Value for Model Selection?

The Akaike Information Criterion (AIC) is a foundational statistical tool used in data analysis for model selection. Developed by Japanese statistician Hirotsugu Akaike, the criterion provides a numerical measure for estimating the quality of a statistical model relative to others in a candidate set. Its purpose is to guide researchers toward the model that offers the most effective balance between explanatory power and simplicity for a given set of observed data. By quantifying the information lost when a model is used to represent the process that generated the data, AIC offers a standardized way to compare competing statistical explanations.

## Balancing Accuracy and Simplicity

The core challenge in statistical modeling is finding a model that accurately represents the data without being unnecessarily complex. This is the trade-off between model fit and parsimony. A model with many parameters, such as predictor variables, often appears to fit the initial training data better. This apparent improvement can be misleading: the extra parameters may simply be capturing random noise specific to that dataset, a phenomenon known as overfitting.

Overfit models lack generalizability and perform poorly when presented with new data. AIC addresses this by adding a penalty for each estimated parameter in the model, which directly discourages the selection of overly complicated models.

The AIC calculation combines a measure of the model’s goodness-of-fit (derived from its maximum likelihood estimate) with the number of estimated parameters. By factoring in this complexity penalty, the criterion ensures that a more intricate model is only preferred over a simpler one if the increase in explanatory power justifies the added complexity. The goal is to identify the model that maximizes explanatory power while minimizing the required number of variables.
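
Concretely, the criterion is defined as \(\mathrm{AIC} = 2k - 2\ln(\hat{L})\), where \(k\) is the number of estimated parameters and \(\hat{L}\) is the model's maximized likelihood. Below is a minimal Python sketch of this calculation for an ordinary least-squares fit with Gaussian errors; `gaussian_aic` is a hypothetical helper written for this article, not part of any library.

```python
import numpy as np

def gaussian_aic(y, y_hat, n_params):
    """AIC = 2k - 2*ln(L_hat) for a least-squares fit with Gaussian errors.

    n_params (k) should count every estimated parameter,
    including the error variance.
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    rss = np.sum((y - y_hat) ** 2)
    # Maximized Gaussian log-likelihood, with sigma^2 = RSS / n plugged in
    log_lik = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)
    return 2 * n_params - 2 * log_lik
```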

## Interpreting the AIC Score

The raw AIC score itself is arbitrary and holds no intrinsic meaning regarding a model’s quality in isolation. An AIC value of 100, or even a negative value like -500, is neither inherently good nor bad. The score’s magnitude is heavily influenced by the dataset size and the specific likelihood function, making it impossible to establish an absolute threshold for a “good” score.

The utility of the Akaike Information Criterion lies purely in its ability to facilitate relative comparison among a set of candidate models. AIC scores are only meaningful when compared against the scores of other models that have all been fit to the exact same dataset. The guiding principle is straightforward: a lower AIC value indicates a better, more plausible model relative to the others in the set.

This lower score suggests that the model has a smaller estimated distance from the unknown true model that generated the data. The model with the minimum AIC achieves the most favorable trade-off between maximizing fit and minimizing the penalty for complexity. Meaningful interpretation therefore requires calculating the differences between the candidate models' AIC scores.
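
As an illustration, here is a sketch comparing two candidate models fit to the same simulated data with statsmodels, whose fitted OLS results expose AIC via the `aic` attribute; the data-generating process and model choices are invented for the example.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=100)
y = 1.5 * x + rng.normal(scale=1.0, size=100)  # truly linear process

# Candidate 1: linear model
X1 = sm.add_constant(np.column_stack([x]))
m1 = sm.OLS(y, X1).fit()

# Candidate 2: cubic model (extra, unneeded parameters)
X2 = sm.add_constant(np.column_stack([x, x**2, x**3]))
m2 = sm.OLS(y, X2).fit()

# Lower AIC wins; the cubic model's slightly better fit
# rarely repays its two extra parameters here.
print(f"linear AIC: {m1.aic:.1f}, cubic AIC: {m2.aic:.1f}")
```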

## Practical Application: Comparing Candidate Models

Effective model selection relies on calculating the difference between the AIC scores of all candidate models. The process begins by identifying the model with the lowest AIC value, designated as the best model. For every other model, the difference, known as Delta AIC (\(\Delta_i\)), is calculated by subtracting the minimum AIC from that model's score: \(\Delta_i = \mathrm{AIC}_i - \mathrm{AIC}_{\min}\). The best-performing model will always have a \(\Delta_i\) of zero.
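
In code, this subtraction is a one-liner; the AIC scores below are invented purely for illustration.

```python
import numpy as np

# Hypothetical AIC scores for four candidate models fit to the same data
aic = np.array([412.3, 410.1, 415.8, 423.0])

delta = aic - aic.min()   # Delta_i = AIC_i - AIC_min
# The best model has delta == 0; the others are measured against it.
print(delta)              # [ 2.2  0.   5.7 12.9]
```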

Interpreting these difference scores provides guidance on the strength of evidence for each model.

#### Interpreting Delta AIC (\(\Delta_i\))

A model with a \(\Delta_i\) value of less than 2 has substantial support and is a plausible alternative to the top model; two models whose AIC scores differ by less than 2 are often viewed as essentially equivalent in quality and explanatory power.

Models with a \(\Delta_i\) between 4 and 7 have considerably less support, suggesting they are distinctly weaker than the best model. Any model exhibiting a \(\Delta_i\) value greater than 10 has negligible support, meaning it is highly unlikely to be the best approximating model.
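
One hypothetical way to encode these rules of thumb is a small lookup function. The labels for the in-between ranges (2–4 and 7–10) are gray zones in the literature, so the strings here are illustrative only.

```python
def support_level(delta: float) -> str:
    """Qualitative reading of a Delta AIC value, using the
    rule-of-thumb thresholds described above."""
    if delta < 2:
        return "substantial support"
    if delta < 4:
        return "gray zone: still plausible"   # literature is ambiguous here
    if delta <= 7:
        return "considerably less support"
    if delta <= 10:
        return "very little support"          # approaching negligible
    return "essentially no support"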

#### Akaike Weights

For a more quantitative measure of plausibility, researchers can calculate Akaike weights (\(w_i\)). These weights transform the Delta AIC values into a set of probabilities that sum to one, each indicating the probability that the corresponding model is the best among the entire candidate set. The model with the highest Akaike weight is the most probable candidate.
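
The standard formula is \(w_i = \exp(-\Delta_i/2) \,/\, \sum_r \exp(-\Delta_r/2)\). A minimal NumPy sketch, reusing the invented AIC scores from earlier:

```python
import numpy as np

def akaike_weights(aic_scores):
    """Akaike weights: w_i = exp(-Delta_i / 2) / sum_r exp(-Delta_r / 2)."""
    aic = np.asarray(aic_scores, dtype=float)
    delta = aic - aic.min()
    rel_lik = np.exp(-0.5 * delta)   # relative likelihood of each model
    return rel_lik / rel_lik.sum()   # normalize so the weights sum to 1

print(akaike_weights([412.3, 410.1, 415.8, 423.0]))
# approx. [0.24 0.72 0.04 0.00] -- the second model is most probable
```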

## When Not to Rely Solely on AIC

While the Akaike Information Criterion is a powerful tool for comparative model selection, it is not without its limitations. The criterion selects the model that is the best approximation of the unknown truth, but it does not provide any test of the overall quality or absolute goodness-of-fit of the model. A low AIC score only means a model is the best among the candidates tested, even if all models are poorly suited to the data.

AIC assumes a large sample size for its theoretical properties to hold. When working with small datasets, the corrected version, AICc, is often preferred because it applies a stronger penalty for complexity, reducing the risk of selecting an overfit model. When the dataset is very large, the Bayesian Information Criterion (BIC) may be favored because its complexity penalty grows with the sample size, leading it to select simpler models than AIC.
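
For reference, the usual small-sample correction and the BIC can be written as short helpers; both sketches assume \(k\) counts all estimated parameters and \(n\) is the sample size.

```python
import numpy as np

def aicc(aic, k, n):
    """AICc = AIC + 2k(k + 1) / (n - k - 1).
    Requires n > k + 1; converges to plain AIC as n grows."""
    return aic + 2 * k * (k + 1) / (n - k - 1)

def bic(log_lik, k, n):
    """BIC = k * ln(n) - 2 * ln(L_hat). The ln(n) factor exceeds
    AIC's factor of 2 once n > e^2 (about 8 observations)."""
    return k * np.log(n) - 2 * log_lik
```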

The theoretical assumptions of the model must also be verified using traditional statistical diagnostics. Researchers should always examine model residuals, check for violations of assumptions like normality or homoscedasticity, and ensure the model makes theoretical sense within the context of the subject matter. AIC is a guide for selection, but it should always be supplemented with a thorough examination of the model’s internal statistical properties and its scientific plausibility.
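
As a closing illustration, here is a sketch of two common residual checks using scipy and statsmodels; the simulated data and variable names are invented for the example.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=80)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=80)

X = sm.add_constant(x)
model = sm.OLS(y, X).fit()

# Normality of residuals (Shapiro-Wilk): small p suggests non-normal errors
_, p_norm = stats.shapiro(model.resid)

# Homoscedasticity (Breusch-Pagan): small p suggests non-constant variance
_, p_bp, _, _ = het_breuschpagan(model.resid, X)

print(f"Shapiro-Wilk p = {p_norm:.3f}, Breusch-Pagan p = {p_bp:.3f}")
```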