AI and the Decision Moment

James Noon

08 May 2026 • 6 min read

Part 2 of 5 — Wrong but useful

TL;DR: AI doesn't deliver smarter segmentation; it eliminates segmentation as the primary inference method. The individual × moment × context granularity that demand modelling has always needed is now feasible. This post explains the mechanism and why rail's scale makes it compelling.

This is part 2 of a series called Wrong but useful - named for George Box's observation that all models are wrong, but some are useful.

Each post retires one wrong model of rail retail and proposes a better one. The argument builds in a single direction: from treating the ticket transaction as the unit of analysis, toward the relationship that precedes any transaction.

Post 1 established the foundation: rail demand is not homogeneous. The same passenger's willingness to pay for the same journey can vary by a factor of 30 depending on context — whether they're commuting or travelling for leisure, whether they're anxious or relaxed, whether they have alternatives or don't. The demand curve, which averages this variation into a single elasticity estimate, produces a model that is systematically wrong at the individual level.

Post 2 asks what can replace it. This post retires the segmentation model. Not because segmentation is wrong - it isn't - but because it is structurally bounded in a way that individual-level AI inference is not.

Post 1	The homogeneous demand model - one elasticity per segment
Post 2	The segmentation model - AI isn't smarter segmentation, it eliminates it

The barrier has moved

The demand curve framework caps inference at the group level - a 30-fold range in elasticity within the same person cannot be captured by any segmentation model, no matter how granular. The solution has existed academically since the 1970s but the problem was always operationalisation. The UK rail industry's current standard - the Passenger Demand Forecasting Handbook, used by every TOC and referenced by the ORR - provides price elasticities by journey type, season, and purpose. It is a carefully calibrated instrument. But by design it operates at the category level: a leisure traveller on this route in this season has this elasticity. Individual variation within that category is averaged away. That averaging is not a refinement of the model — it is the model.

McFadden's Random Utility Model, the framework that earned the 2000 Nobel Prize in Economics, makes individual-level variation formally tractable. It doesn't require the analyst to impose group structure in advance; the model discovers the latent distribution. What it couldn't do was run at the scale and speed of a live retail system.
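For readers who want the formal shape, the standard textbook statement of a random utility model with individual-level tastes can be sketched as follows. The notation (U, β, x, ε, f, θ) is introduced here for illustration and does not come from the series; the mixed logit form shown last is one common way of letting tastes vary across individuals, not necessarily the specification the post has in mind.

```latex
% Utility that individual i derives from alternative j
% (x_{ij}: observed attributes such as price and journey time; \beta_i: individual tastes)
U_{ij} = \beta_i^{\top} x_{ij} + \varepsilon_{ij}

% With i.i.d. type-I extreme value errors, the choice probability is logit
P_{ij}(\beta_i) = \frac{\exp(\beta_i^{\top} x_{ij})}{\sum_{k} \exp(\beta_i^{\top} x_{ik})}

% Mixed logit: tastes follow a population distribution f with parameters \theta,
% which is estimated from data rather than fixed by predefined segments
P_{ij} = \int P_{ij}(\beta)\, f(\beta \mid \theta)\, d\beta
```

The integral in the last line has no closed form and is usually approximated by simulation, which is part of why running this per search in a live retail system was historically impractical.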

AI changes that. Not by being smarter segmentation, but by being a different kind of inference altogether.

The mechanism

Segmentation works by grouping people and assigning group-level characteristics.

Given that this person looks like X, what price does X typically accept?

Individual-level demand modelling asks a different question:

Given that this specific person is searching for a ticket right now, in this context, at this moment — what is their willingness to pay?

The difference is not cosmetic. Segmentation is a snapshot of populations. Individual inference is a real-time signal about a decision.
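The contrast between the two questions can be made concrete with a minimal Python sketch. Everything in it is hypothetical: the segment table, the `SearchContext` fields, and the `WEIGHTS` coefficients are invented for illustration, and a real system would learn such weights from data rather than hard-code them.

```python
import math
from dataclasses import dataclass

# Hypothetical segment-level elasticities (illustrative numbers only).
SEGMENT_ELASTICITY = {"leisure": -1.4, "commuter": -0.4}

# Hypothetical coefficients; in practice these would be fitted, not hand-set.
WEIGHTS = {
    "intercept": 2.0,
    "price": -0.05,
    "days_to_departure": -0.03,
    "searches_last_72h": 0.25,
    "has_alternative_mode": -0.8,
}


@dataclass(frozen=True)
class SearchContext:
    segment: str
    days_to_departure: int
    searches_last_72h: int
    has_alternative_mode: bool


def segment_elasticity(ctx: SearchContext) -> float:
    """Segmentation: every member of a segment gets the same answer."""
    return SEGMENT_ELASTICITY[ctx.segment]


def purchase_probability(ctx: SearchContext, price: float) -> float:
    """Individual inference: a logistic score over this search's context."""
    z = (
        WEIGHTS["intercept"]
        + WEIGHTS["price"] * price
        + WEIGHTS["days_to_departure"] * ctx.days_to_departure
        + WEIGHTS["searches_last_72h"] * ctx.searches_last_72h
        + WEIGHTS["has_alternative_mode"] * float(ctx.has_alternative_mode)
    )
    return 1.0 / (1.0 + math.exp(-z))


# Two searches that a segment model cannot tell apart.
relaxed = SearchContext("leisure", days_to_departure=21,
                        searches_last_72h=1, has_alternative_mode=True)
urgent = SearchContext("leisure", days_to_departure=1,
                       searches_last_72h=6, has_alternative_mode=False)

for ctx in (relaxed, urgent):
    print(segment_elasticity(ctx), round(purchase_probability(ctx, 40.0), 3))
```

Both searches receive the same segment elasticity, while the context-level score separates them. A single logistic function is far simpler than the recommendation-style models discussed here; it only illustrates where the extra resolution comes from.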

AI - specifically large-scale recommendation and personalisation systems - is the first class of tool capable of operating at this granularity. Not because it's theoretically superior to Mixed Logit or hierarchical Bayes, but because it can learn individual behaviour patterns directly from the data, without imposing group structure in advance. The model discovers the latent distribution rather than requiring the analyst to specify it.

This is what changes with AI: not the theory, but the direction of inference. From population → individual to individual → distribution.

The cold-start problem, inverted

The standard objection to individual-level modelling is the cold-start problem. If you need a history of individual behaviour to make predictions about that individual, new users are invisible. Rail inverts this.

The UK generates roughly 1.4 billion rail journeys a year. Individual purchase records, linked to journey context - route, time, fare class, advance window, network state - constitute one of the richest behavioural datasets in any industry.

The cold-start problem in rail is not data volume. It's data linkage. Most rail data exists in operational silos. Fares data, reservations data, delay data, search data - separate systems. The opportunity is not more data collection. It's inference across existing signals at the moment a specific person initiates a search.

A model trained on behavioural sequences, not demographics, can infer context without asking for it. Someone who regularly buys Anytime tickets and whose searches show departure-time flexibility reads differently from someone who always purchases the cheapest available advance three weeks out. These behavioural signatures exist in current data. They're not being used at the point of offer construction.
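As a rough illustration of what a behavioural signature might look like, the sketch below reduces a purchase history to two simple features. The `Purchase` record and the feature names are assumptions made for this example, not a description of any real rail data schema, and it covers purchases only: search-side signals such as departure-time flexibility would need linked search logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Purchase:
    fare_type: str        # e.g. "Anytime" or "Advance"
    days_in_advance: int  # days between purchase and travel


def behavioural_signature(purchases: list[Purchase]) -> dict[str, float]:
    """Summarise a purchase history as features a model could consume."""
    if not purchases:
        # An empty history is the classic cold-start case.
        raise ValueError("no purchase history available")
    n = len(purchases)
    anytime_count = sum(1 for p in purchases if p.fare_type == "Anytime")
    return {
        "anytime_share": anytime_count / n,
        "mean_days_in_advance": sum(p.days_in_advance for p in purchases) / n,
    }


# Two hypothetical histories matching the contrast described above.
flexible_buyer = [Purchase("Anytime", 0), Purchase("Anytime", 1), Purchase("Anytime", 0)]
advance_planner = [Purchase("Advance", 21), Purchase("Advance", 20), Purchase("Advance", 22)]

print(behavioural_signature(flexible_buyer))
print(behavioural_signature(advance_planner))
```

Hand-built features like these are only a stand-in. A model trained on raw behavioural sequences would learn its own representation, but the inputs it draws on are the same records.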

The full decision arc

Rail retail currently captures one moment: the purchase.

The search → comparison → decision sequence that precedes it is largely invisible to retailers. This matters because willingness to pay isn't formed at the point of purchase; it's formed upstream.

The ticket is the final step in a chain that includes mode choice, time-of-day choice, companion decisions, and a prior assessment of what the journey is worth.

Traditional retail models are blind to this. A static pricing engine doesn't know whether you've been searching for three days and are about to give up, or whether you're booking with certainty for the first time. It assigns a price, and the price either converts or doesn't.

AI-native journey retail can be present earlier. A journey companion - an application that earns presence before the booking moment - accumulates context that a one-off purchase interaction cannot. It observes hesitation patterns, route comparisons, and the frequency and urgency of search. None of this requires explicit data collection. It's behavioural inference.

The decision moment isn't a single point. It's a window. The closer to that window the offer is constructed, the more accurately it reflects actual willingness to pay.

Why rail specifically

The case for individual-level AI inference isn't unique to rail. Recommendation engines do this for content - Netflix, Spotify - and increasingly for e-commerce. What makes rail distinctive is the interaction of scale, context richness, and structural constraint.

Scale: 1.4 billion journeys a year in a mid-sized market generates enough signal to train meaningful individual-level models. This is not a sparse dataset problem.

Context richness: Rail generates unusually rich contextual metadata. Timetable structure, platform assignments, typical lateness distributions, network disruption patterns - these exist in operational data and have proven predictive relevance to demand. A model that incorporates real-time network state isn't asking what the nominal journey time is. It's asking what this specific journey is actually worth, today, given known conditions.

Structural constraint: Capacity on rail is fixed in a way hotel rooms are not. Once the 17:23 departs, those seats are gone. This creates an asymmetry: sophisticated engines can delay purchase, but they cannot create inventory. The value of accurate individual-level demand inference is higher when supply cannot flex.

The technical path is not without precedent. American Airlines' DINAMO system, deployed in the 1980s, was the first large-scale application of dynamic demand-responsive pricing - replacing fixed fares with real-time inventory allocation. The transition from route-level averages to segment-level inference took roughly two decades and required both theoretical foundations and the computational capacity to run them at scale. Rail is at an analogous inflection point. The theory has existed since McFadden. The computing capacity is now available. The constraint is architecture, not capability.

What this looks like in practice

The practical form is not a smarter yield management algorithm. It's a shift in the unit of analysis — from route × segment × advance-window to individual × moment × context.

An AI-native retail system doesn't ask "what is the advance purchase demand curve for this route?" It asks: "for this user, with this search pattern, at this point in the planning arc, what offer clears with high probability?"

The output is not personalised pricing in the current (legally and reputationally fraught) sense. It's offer construction - in real time, from available fare combinations, reflecting inferred need. The innovation is not arbitrary price discrimination. It's precision in what to show, when, and with what framing.

This distinction matters for implementation and for trust. Individual inference doesn't require charging different people different prices for the same product. It requires knowing which product - which combination of flexibility, timing, and features - best matches inferred willingness to pay. The offer does the work, not the price.
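A minimal sketch of that distinction, under stated assumptions: the fare catalogue, the prices, the `flexibility` scale, and the scoring rule below are all invented for illustration. What it shows is that published prices stay identical for every user; only the order in which offers are presented responds to the inferred need.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    name: str
    price: float        # published fare, the same for every user
    flexibility: float  # 0.0 = fixed train, 1.0 = fully flexible


# Hypothetical fare catalogue for one journey (illustrative prices).
OFFERS = [
    Offer("Advance", 25.0, 0.0),
    Offer("Off-Peak", 45.0, 0.6),
    Offer("Anytime", 90.0, 1.0),
]


def rank_offers(offers: list[Offer], flex_need: float,
                price_sensitivity: float) -> list[Offer]:
    """Order offers by fit to inferred need; prices are never modified."""
    max_price = max(o.price for o in offers)

    def score(o: Offer) -> float:
        mismatch = abs(o.flexibility - flex_need)
        relative_cost = o.price / max_price
        return -mismatch - price_sensitivity * relative_cost

    return sorted(offers, key=score, reverse=True)


# Same catalogue, two inferred profiles (both sets of values hypothetical).
for_flexible = rank_offers(OFFERS, flex_need=0.9, price_sensitivity=0.1)
for_planner = rank_offers(OFFERS, flex_need=0.1, price_sensitivity=0.8)

print([o.name for o in for_flexible])
print([o.name for o in for_planner])
```

The scoring rule is deliberately crude. A production system would presumably score offers with a learned model, but the separation between choosing what to show and setting what to charge would look much the same.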

The model is still wrong

George Box's aphorism holds.

The AI model inferred from behavioural sequences is not a correct model of individual demand — there is no such thing. Context is partially observable, preferences shift, and the model is always working from incomplete signal.

But the relevant question isn't whether the model is right. It's whether it's more useful than the alternative.

Segmentation-based inference is wrong and bounded. It caps at group-level estimates by design. Individual-level AI inference is wrong and unbounded — it can get closer to the decision moment without a structural limit on its resolution.

For an industry with a 30-fold range in elasticity within the same person, that's not a marginal improvement. It's a different class of tool.

But individual demand inference, however precise, is only half the model. The retailer who understands the asset as well as the passenger has an asymmetric advantage no pricing algorithm can replicate.

Post 3 examines what that looks like — and why it compounds.