What Is a Piecewise Linear Function? Definition & Uses

A piecewise linear function is a function made up of straight-line segments, where different linear formulas apply over different intervals of the input. Instead of one equation describing the entire function, you get several linear equations, each governing its own portion of the domain. The “breakpoints” where one segment ends and another begins give the function its characteristic look: a series of connected (or sometimes disconnected) straight lines that together can approximate curves, model tiered pricing, or power modern machine learning.

How the Notation Works

The standard way to write a piecewise linear function is with a brace that groups each sub-function next to the interval where it applies. For example:

y = { 2x when x < 2, and x + 1 when x ≥ 2 }

This tells you: for any input less than 2, double it. For any input of 2 or greater, add 1 instead. At x = 2, the function switches from one rule to the other. That switching point is called a breakpoint (or, in regression contexts, a “knot”). Notice that the two rules disagree there: just below 2 the first rule gives outputs approaching 4, while at exactly 2 the second rule gives 3, so this particular function jumps at its breakpoint. Any piecewise linear function with more than one piece has at least one breakpoint, and more complex versions can have dozens or hundreds.
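That two-rule function translates directly into code. A minimal Python sketch (the name `f` is just for illustration):

```python
def f(x):
    # The example function: 2x below the breakpoint, x + 1 at and above it
    if x < 2:
        return 2 * x
    return x + 1

print(f(1))  # 2, via the 2x rule
print(f(2))  # 3, via the x + 1 rule (since 2 >= 2)
print(f(5))  # 6
```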

The Absolute Value Function

The simplest and most familiar piecewise linear function is the absolute value, f(x) = |x|. Written in piecewise form, it’s just two rules: f(x) = x when x is zero or positive, and f(x) = −x when x is negative. The graph forms a V shape with a single breakpoint at the origin. Each arm of the V is a straight line with a different slope, which is exactly what makes it piecewise linear rather than a single line.
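The two-rule definition maps directly onto an if/else in Python (the name `absolute` is arbitrary; Python's built-in `abs` does the same thing):

```python
def absolute(x):
    # |x| written out as its two-piece definition
    if x >= 0:
        return x   # right arm of the V, slope +1
    return -x      # left arm of the V, slope -1

print(absolute(3), absolute(-3), absolute(0))  # 3 3 0
```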

Continuous vs. Discontinuous

In a continuous piecewise linear function, each segment’s endpoint lines up exactly with the starting point of the next segment. The absolute value function is continuous: at x = 0, both rules give the same output (zero), so there’s no gap.

But piecewise linear functions don’t have to be continuous. A discontinuous version has a “step” at one or more breakpoints, where the endpoint of one segment and the starting point of the next share the same x-coordinate but produce different y-values. Think of a graph that suddenly jumps up or down. At such a step, breakpoints and slopes alone no longer determine the function; you also need to specify the y-values on either side of the jump, and which of the two the function actually takes at the breakpoint itself. A function can have at most one step at any given breakpoint, and isolated points (a single dot floating away from any segment) aren’t permitted.
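Because each piece is linear, checking continuity at a breakpoint only requires evaluating the two neighboring formulas there. A small Python sketch (the helper `is_continuous_at` is hypothetical, not a standard library function):

```python
def is_continuous_at(left_rule, right_rule, x0, tol=1e-9):
    # For linear pieces, the one-sided limits at breakpoint x0 are just
    # the two formulas evaluated at x0; continuous iff they agree.
    return abs(left_rule(x0) - right_rule(x0)) < tol

# Absolute value: both rules give 0 at x = 0, so there is no gap
print(is_continuous_at(lambda x: -x, lambda x: x, 0))         # True
# The earlier y = {2x, x + 1} example: 2*2 = 4 vs 2 + 1 = 3, a jump
print(is_continuous_at(lambda x: 2 * x, lambda x: x + 1, 2))  # False
```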

Everyday Example: Shipping Costs

Tiered pricing is a natural piecewise linear function. A shipping company might charge a flat $4.50 for packages up to 3 pounds, then $0.50 per additional pound from 3 to 10 pounds, making the cost $4.50 + $0.50 × (weight − 3) in that range. Each weight range follows its own simple formula, and the breakpoints sit at the weight thresholds where the pricing rule changes. Tax brackets, electricity rates, and overtime pay all work the same way: one linear rule per tier, stitched together at defined thresholds.
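That tiered schedule might look like this in code (the rates and tiers are the hypothetical ones from the example above):

```python
def shipping_cost(pounds):
    # Hypothetical tiered rate: flat $4.50 up to 3 lb,
    # then $0.50 per pound over 3, defined up to 10 lb
    if pounds <= 3:
        return 4.50
    if pounds <= 10:
        return 4.50 + 0.50 * (pounds - 3)
    raise ValueError("rate not defined above 10 lb")

print(shipping_cost(2))  # 4.5: flat tier
print(shipping_cost(7))  # 6.5: $4.50 plus 4 extra pounds at $0.50 each
```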

Piecewise Linear Regression

When data follows different trends in different regions, fitting a single straight line through all of it produces a poor model. Piecewise linear regression solves this by fitting separate lines to separate portions of the data, joined at knot values you choose (or estimate). For instance, Penn State’s statistics program illustrates this with shipping cost data where the relationship between shipment size and cost changes slope at 250 units. Below 250, the cost per unit follows one rate. Above 250, it follows a steeper or shallower rate. The two lines meet at x = 250, ensuring the model doesn’t have an awkward gap.

The key decision in piecewise linear regression is where to place the knots. If you put a knot at the wrong location, you’ll split the data in a place where the trend didn’t actually change, and the model won’t improve much over a single line. In practice, knot placement is guided by visual inspection of the data, domain knowledge (you know the pricing structure changes at a certain volume), or automated selection methods that test multiple candidates.
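One common way to fit a continuous two-piece model with a known knot is to add a “hinge” term, max(x − knot, 0), to an ordinary least-squares regression. A sketch with NumPy on synthetic data (the slopes of 2 and 5, the knot at 250, and the noise level are all made up for illustration):

```python
import numpy as np

def fit_piecewise(x, y, knot):
    # Design matrix: intercept, x, and the hinge max(x - knot, 0).
    # The hinge lets the slope change at the knot while keeping the
    # two lines joined (continuous) there.
    X = np.column_stack([np.ones_like(x), x, np.maximum(x - knot, 0.0)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta  # [intercept, slope below knot, slope *change* above knot]

# Synthetic data whose trend changes slope at x = 250
rng = np.random.default_rng(0)
x = np.linspace(0, 500, 200)
y = 10 + 2 * x + 3 * np.maximum(x - 250, 0) + rng.normal(0, 5, x.size)

b0, b1, b2 = fit_piecewise(x, y, knot=250)
print(b1, b1 + b2)  # slope below the knot (~2) and above it (~5)
```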

Interpolation and Computer Graphics

Piecewise linear functions are the backbone of linear spline interpolation, one of the most straightforward ways to estimate values between known data points. Given a set of data points, you simply connect each consecutive pair with a straight line. Each segment uses only its two neighboring points to compute a slope, and the result is a continuous piecewise linear curve that passes through every data point exactly.
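NumPy’s `np.interp` implements exactly this kind of linear spline interpolation. For example, with four made-up data points:

```python
import numpy as np

# Known data points to interpolate between
xs = np.array([0.0, 1.0, 3.0, 4.0])
ys = np.array([0.0, 2.0, 2.0, 5.0])

# np.interp joins each consecutive pair of points with a straight segment
print(np.interp(0.5, xs, ys))  # 1.0: halfway up the first segment
print(np.interp(2.0, xs, ys))  # 2.0: on the flat middle segment
print(np.interp(3.5, xs, ys))  # 3.5: halfway from 2.0 up to 5.0
```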

This approach is fast and easy to compute, which is why it shows up constantly in computer graphics (connecting vertices of a polygon, rendering low-resolution curves), sensor data processing, and any situation where you need a quick estimate between measurements. The tradeoff is that the result has sharp corners at every data point. If you need smoother curves, you’d move to quadratic or cubic splines, but for many applications the simplicity and speed of straight-line segments is more than enough.

ReLU and Neural Networks

One of the most important modern uses of piecewise linear functions is the ReLU (Rectified Linear Unit) activation function used in neural networks. ReLU is almost comically simple: it outputs zero for any negative input and passes positive inputs through unchanged. That’s a two-piece linear function with a single breakpoint at zero.

What makes ReLU powerful is what happens when you stack layers of it. Each neuron applies its own ReLU, and the combination of many neurons across many layers produces an overall function that is still piecewise linear, just with a huge number of segments. Training a neural network with ReLU activations is essentially performing a regression that’s piecewise linear in nature. The network learns where to place the breakpoints and what slope each segment should have, allowing it to approximate complex, nonlinear relationships using nothing but joined straight-line pieces.
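You can see the “many joined segments” effect even in a tiny one-hidden-layer network. The weights below are hand-picked for illustration (not trained), chosen so that each hidden neuron contributes one breakpoint:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Hand-picked weights: each bias places a breakpoint at x = 0, 1, 2
W1 = np.array([1.0, 1.0, 1.0])    # hidden-layer weights
b1 = np.array([0.0, -1.0, -2.0])  # hidden-layer biases
w2 = np.array([1.0, -2.0, 2.0])   # output weights combining the hinges

def net(x):
    # One hidden ReLU layer: the output is a continuous piecewise linear
    # function of x, with up to one breakpoint per hidden neuron
    return float(w2 @ relu(W1 * x + b1))

print(net(-1.0))  # 0.0: all hinges inactive, flat segment
print(net(0.5))   # 0.5: only the first hinge active, slope 1
print(net(1.5))   # 0.5: slopes 1 and -2 combine to -1 on [1, 2]
```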

This property also has a practical engineering benefit: because ReLU networks are continuous piecewise linear functions, they can be reformulated as optimization problems that solvers handle efficiently. That makes them especially attractive as surrogate models in process engineering and operations research, where you need to embed a trained neural network inside a larger optimization framework.

Why Piecewise Linear Functions Are So Useful

Straight lines are the easiest functions to compute, differentiate, and reason about. Piecewise linear functions inherit all of that simplicity while being flexible enough to approximate virtually any shape. Add enough breakpoints and you can trace a curve as closely as you like, segment by segment. This combination of computational cheapness and expressive power is why piecewise linear models appear everywhere, from tax code calculations to the activation functions driving image recognition systems.

The core tradeoff is always the same: more segments give you a better approximation of complex behavior, but each additional breakpoint adds complexity to the model. In regression, more knots risk overfitting. In neural networks, more neurons increase training time and memory. Choosing the right number of segments for your problem is where the real modeling skill comes in.