## Research Question and Hypotheses

Reduced travel time variability can be an important benefit when road congestion is reduced by policies or projects. A considerable literature on the valuation of reliability has emerged over recent decades, together with a smaller literature on predicting travel time variability for road traffic (de Jong and Bliemer 2015). This article focuses on the latter issue, specifically estimating relationships between the mean and standard deviation of travel time.

Several different functional specifications have been used in the past, and there appears to be little consensus about what works best. This article therefore aims to compare the performance of a number of different model specifications on a particularly interesting data set: travel time measurements made before and after the introduction of congestion charges in Stockholm. The charges caused traffic to and from the inner city to decrease by approximately 20%, resulting in substantial congestion reductions (Eliasson 2008). The two data sets hence represent significantly different traffic situations on the exact same set of links, making them ideal for evaluating and comparing the predictive performance of the models.

The fundamental question of concern in this article is whether models for forecasting travel time variability are sufficiently good to be useful for applied policymaking. This is explored by estimating models on data collected before the congestion charges and using them to forecast travel time variability in the situation with the charges. Forecasts and outcomes can then be compared to see which (if any) of the different model specifications give a sufficiently correct magnitude of the reliability benefits.

## Methods and Data

The data consist of average travel times for 15-minute periods, collected from 41 major urban streets and arterials in and around Stockholm’s inner city during six weeks in the spring of 2005 (before the charges) and the corresponding weeks in the spring of 2006 (with charges). Only data from Monday–Thursday, 6:00 a.m.–8:00 p.m., are used. Estimations are based on 41 × 14 × 4 = 2,296 observations of standard deviations and average travel times (41 links, 14 hours per day, four 15-minute periods per hour). The standard deviation and average travel time for each link/time combination are calculated from 6 × 4 = 24 measurements (six weeks, four weekdays per week).
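The aggregation described above can be sketched as follows. This is a minimal illustration, not the article's actual processing code: the record layout, the function name `summarize`, and the sample values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev

# Sample sizes implied by the text.
N_LINKS, HOURS, PERIODS_PER_HOUR = 41, 14, 4
WEEKS, DAYS_PER_WEEK = 6, 4
n_observations = N_LINKS * HOURS * PERIODS_PER_HOUR   # link/time combinations
n_per_observation = WEEKS * DAYS_PER_WEEK             # measurements behind each one


def summarize(measurements):
    """Group (link, period, travel_time) records by link and period.

    Returns {(link, period): (mean travel time, standard deviation)}.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    groups = defaultdict(list)
    for link, period, travel_time in measurements:
        groups[(link, period)].append(travel_time)
    # stdev needs at least two values, so smaller groups are skipped.
    return {key: (mean(v), stdev(v)) for key, v in groups.items() if len(v) >= 2}


# Hypothetical travel times in minutes (not taken from the Stockholm data).
records = [
    ("link_A", "07:30", 4.0), ("link_A", "07:30", 6.0), ("link_A", "07:30", 5.0),
    ("link_A", "12:00", 3.0), ("link_A", "12:00", 3.0),
]
summary = summarize(records)
```

In the study, each group would hold 24 measurements rather than the two or three used here.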

The links are between 0.38 and 4.82 km long, most of them with two lanes per direction, and with speed limits of either 50 km/h or 70 km/h. Most of the links also include at least one intersection. Traffic volumes vary between 15,000 and 50,000 vehicles per day.

When congestion charges were introduced in Stockholm in the spring of 2006, mean travel time per km (weighted by traffic volumes per link and 15-minute time period) decreased from 3.31 min/km to 2.91 min/km on the links in the sample. The corresponding mean standard deviation decreased from 1.10 min/km to 0.82 min/km. Assuming a typical reliability ratio (the ratio of the value of standard deviation to the value of travel time) of 0.8, this indicates that reliability benefits are approximately 60% of conventional travel time savings. This is considerably higher than in the cost-benefit analysis reported in Eliasson (2009a), which used model calculations rather than actual measurements to estimate reliability benefits of the Stockholm charges.

Table 1 presents nine different model specifications to be compared, drawing from the surveys in de Jong & Bliemer (2015) and Kouwenhoven & Warffemius (2016). The specifications originate from the references given, but some have been simplified to allow comparison; there are no controls for road type or queueing buildup/dissipation phases, and all models are estimated using conventional linear regression rather than more advanced error specifications (as in Kim & Mahmassani [2014]). The only variables used are mean travel time, free-flow travel time, standard deviation, and link length.
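As a check on the figure of roughly 60% quoted for the sample, the ratio of reliability benefits to travel time savings can be worked out from the reported volume-weighted means, taking the reliability ratio of 0.8 as given:

$$
\frac{0.8 \times (1.10 - 0.82)}{3.31 - 2.91} = \frac{0.8 \times 0.28}{0.40} = \frac{0.224}{0.40} = 0.56
$$

That is, the reduction in standard deviation is worth about 56% of the reduction in mean travel time, consistent with the approximate figure stated in the text.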

## Findings

Table 2 summarizes the essential results (full estimation results are given in the Appendix). Columns 3 and 5 present $R^2$ goodness-of-fit values for the models estimated on data from each year, calculated as

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},$$

where $y_i$ are observed values, $\bar{y}$ is their mean, and $\hat{y}_i$ are model-predicted values. The first four models (1–4) stand out in terms of goodness-of-fit to the estimation sample, with $R^2$ values around 0.5–0.6 for both samples. Columns 4 and 6 list parameter values, ignoring intercepts. Nonsignificant parameter values are marked as “–”.

Column 7 lists $R^2$ values when the 2005 models are used to predict 2006 values, i.e.,

$$R^2 = 1 - \frac{\sum_i \left(y_i^{06} - \hat{y}_i^{06|05}\right)^2}{\sum_i \left(y_i^{06} - \bar{y}^{06}\right)^2},$$

where $\hat{y}_i^{06|05}$ means that the models estimated on 2005 data are used to predict observed 2006 values using measured travel times from 2006 as input. Models 1, 2 and 4 perform remarkably well. The common factor of the best-performing models is that the standard deviation and the travel time are normalized in some sense, by dividing by distance or by free-flow travel time, or by estimating parameter(s) for distance (which can be seen as a more flexible normalization). Of course, this also means that they have the inherent benefit of having more parameters (especially model 1 with its five parameters, some of which are not significant). The parameters of models 3–4 are fairly stable across years, while the parameters of models 1–2 vary more between years. This might be a signal of potential problems with overspecification, or it may be due to uncertainties in the measured variables.
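The estimate-then-predict procedure can be illustrated with a small sketch. It assumes a two-parameter distance-normalized linear form, with standard deviation per km as a linear function of mean travel time per km. That form is an assumption made for illustration, since the exact specifications are given in Table 1, and the function names and all numbers below are hypothetical rather than taken from the study.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(observed, predicted):
    """1 - SSE / SST, with SST taken around the mean of the observed values."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / sst


# Hypothetical values in min/km (not the Stockholm measurements).
time_2005 = [1.5, 2.0, 3.0, 4.0, 5.0]   # mean travel time per km, estimation year
sd_2005 = [0.2, 0.5, 0.9, 1.6, 1.8]     # standard deviation per km, estimation year
time_2006 = [1.4, 1.8, 2.6, 3.4, 4.2]   # measured travel time per km, prediction year
sd_2006 = [0.2, 0.3, 0.8, 1.1, 1.6]     # observed standard deviation per km

# Estimate on the earlier year, then predict the later year from its measured travel times.
a, b = fit_line(time_2005, sd_2005)
r2_in_sample = r_squared(sd_2005, [a + b * t for t in time_2005])
r2_out_of_sample = r_squared(sd_2006, [a + b * t for t in time_2006])
```

Unlike the in-sample value, an out-of-sample $R^2$ computed this way is not bounded below by zero: it turns negative whenever the predictions fit the new data worse than the mean of the new observations would.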

Column 8 presents the most interesting result: how accurately the reliability benefits are predicted. Benefits are normalized to min/km and weighted by traffic volumes by link and 15-minute time period. Models 1–4 perform well; they are close enough to the truth to pass a heuristic “usefulness” test. Models 6 and 8 also perform well (model 8 remarkably so, considering the comparatively low model fit in the estimation).
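The comparison of predicted and actual reliability benefits can be sketched along the following lines. The text does not specify which year's traffic volumes serve as weights, so a single set of weights is assumed here. The function names and all figures are hypothetical and serve only to show the calculation.

```python
def weighted_mean(values, weights):
    """Weighted average of values, e.g. weighted by traffic volume."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def reliability_benefit(sd_before, sd_after, volumes):
    """Volume-weighted reduction in standard deviation (min/km)."""
    return weighted_mean(sd_before, volumes) - weighted_mean(sd_after, volumes)


# Hypothetical standard deviations (min/km) for three link/period combinations.
sd_2005_observed = [1.2, 0.8, 1.0]
sd_2006_observed = [0.9, 0.7, 0.8]
sd_2006_predicted = [1.0, 0.7, 0.8]   # e.g. from a model estimated on 2005 data
volumes = [300, 100, 200]             # assumed weights: vehicles per period

actual = reliability_benefit(sd_2005_observed, sd_2006_observed, volumes)
predicted = reliability_benefit(sd_2005_observed, sd_2006_predicted, volumes)
share_captured = predicted / actual   # 1.0 would mean a perfect benefit forecast
```

A model can score well on this aggregate measure even with a modest fit per observation, because errors on individual links and periods partly cancel in the weighted mean.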

Results indicate that it is indeed possible to predict reliability benefits with an accuracy that is sufficiently good to be useful for policymaking. However, note that travel times need to be predicted well for this to work, since travel times are essential inputs to the variability models. The comparisons in this article use actual travel time measurements from the prediction year (2006) in order to avoid confounding the performance of the variability forecasting model with the performance of a travel time forecasting model.

There are substantial differences between the models’ performances. Having enough flexibility in the model seems to pay off, giving remarkably good predictions of total benefits. The slight instability of parameters across different samples does not seem to be a real problem, since the out-of-sample predictions are so good. Of the simpler models, the distance-normalized linear model stands out for performing well with only two parameters.