1. Questions
We approach disaggregate travel demand models from a choice perspective, considering the short-term to long-term decisions of individuals. Activity-based models (ABMs) and synthetic population generation are two approaches to modeling these choice dimensions. Both approaches rely on creating agents but differ in how they handle activity-travel decisions. In ABMs, individuals or households are decision-making agents in activity scheduling. Population synthesis generates realistic datasets that represent the demographic and socio-economic characteristics of these agents within a target population, which are then used in ABMs to simulate the daily activities and travel behavior of individuals. Thus, the functionality of ABMs depends on the structure of the synthetic population (Board and National Academies of Sciences 2014). These models offer a framework for understanding how people make decisions about their activities and travel, considering the interactions among multiple individuals and the evolution of their behaviors over time.
This paper reviews the literature on ABMs and synthetic population generation, focusing on the following questions:
- What are the short- and long-term choice dimensions of activity-travel decisions, and which approaches address them?
- What are the individual- and multi-person level activity-travel choice dimensions, and what advancements capture them?
- How can advanced population synthesis address fundamental gaps in ABMs?
2. Methods
This paper follows the systematic literature review methodology by Wee and Banister (2016), which includes defining the review purpose, selecting the database, and determining keywords. We chose Google Scholar to conduct a comparative analysis of publications on ABMs and synthetic populations, spanning from 1980, the approximate time ABMs first emerged, to the present. We conducted two searches: one with “activity-based models”, and another with “synthetic populations” and “transportation”. Figure 1 illustrates similar growth trends in both fields, particularly from 2000 to 2010, suggesting that growing interest in ABMs has increased the focus on synthetic population methods. To refine our review, we target studies addressing two issues:
- treatment of inter-personal relationships (e.g., intra-household), and
- correlations over time (e.g., multi-day, dynamic models).
Table 1 classifies ABM and synthetic population publications along these two axes. Most research examines individuals at a single point in time, with fewer studies analyzing households over time. This motivates our exploration of ABM and synthetic population integration for multi-period and multi-person modeling.
3. Findings
ABMs, rooted in behavioral theories, address the limitations of traditional trip-based models by considering the interdependence of trips. They focus on short-term decisions that are influenced by individual preferences, social influence, and context-specific constraints (e.g., activity time windows, transport mode availability, and time budgets). Short-term choice dimensions are decisions on activity participation, scheduling, transport mode, and itinerary selection (McNally and Rindt 2007; Castiglione, Bradley, and Gliebe 2014; Davidson, Vovsha, and Freedman 2014). Limited longitudinal data make it difficult for ABMs to model multi-period dynamics, leading to an overemphasis on past behaviors without adequately considering future planning (Cirillo and Axhausen 2010; W. Zhang et al. 2021). Given these data limitations, deriving indicators from single-day data has been considered; however, it risks biases due to the behavioral assumptions involved (Arentze, Ettema, and Timmermans 2011; Hilgert et al. 2017). Moreover, most multi-day ABM studies focus on specific aspects of scheduling, such as activity generation (Nurul Habib and Miller 2008; Arentze, Ettema, and Timmermans 2011), neglecting a comprehensive examination of the entire scheduling process (i.e., activity generation, timing, sequence, location choice, and mode choice).
Synthetic data projection methods can make long-term decision information available by evolving populations over time, using various methods such as static projection (Lomax et al. 2022), dynamic projection (Geard et al. 2013), and resampling (Prédhumeau and Manley 2023). Examples of long-term decisions are spatial (e.g., residential location), lifestyle (e.g., household composition), and mobility choices (e.g., vehicle ownership). However, these methods often struggle with biases and non-representativeness due to their reliance on past trends, limiting their effectiveness in forecasting transport demand across multiple periods. Additionally, the focus on individual evolution rather than household dynamics can further reduce the accuracy of these projections.
Given the importance of household interactions in travel behavior, recent ABM literature has emphasized the need to incorporate group decision-making processes within households. This includes dividing maintenance responsibilities, escort duties, joint activity participation, and managing resources like vehicle allocation, all of which require coordinated scheduling decisions. Research has explored group decision-making in areas such as activity generation (Arentze and Timmermans 2009; Bradley and Vovsha 2005), time allocation (J. Zhang and Fujiwara 2006), and sequential household-level activity pattern generation (Bhat et al. 2013). For instance, Household Activity Pattern Problem (HAPP) models (Recker 1995) integrate operations research techniques into household activity scheduling. Rezvany, Bierlaire, and Hillel (2023) use a mathematical programming algorithm to simulate multiple intra-household interaction dimensions within the same scheduling framework, capturing the coordination of the activity scheduling decisions among all household members. Table 2 summarizes example ABMs in the literature, ranging from individual to multi-person models and from single-day to multi-day analysis.
In parallel, the field of synthetic population generation has shifted from individual- (Beckman, Baggerly, and McKay 1996; Abraham, Stefan, and Hunt 2012; Barthelemy and Toint 2013; Farooq et al. 2013; Lederrey, Hillel, and Bierlaire 2022) to household-centered (Ye et al. 2009; Kukic, Li, and Bierlaire 2024; Aemmer and MacKenzie 2022) generation, creating multi-level hierarchical data that better represent inter-individual interactions within households. However, existing datasets often overlook interpersonal relationships within social groups such as educational and work networks (Jiang et al. 2022).
A notable gap exists between the synthetic population and ABM communities. While synthetic data is intended to support ABMs (Board and National Academies of Sciences 2014), most ABM studies rely on real data (Tajaddini et al. 2020). This limited use may stem from:
- lack of consensus on generation algorithms, which vary based on resources and data (Yaméogo et al. 2021), limiting reproducibility (Ramadan and Sisiopiku 2019);
- unclear data requirements (e.g., socio-demographics only or full activity chains?); and
- questions about synthetic data reliability, as validation often covers marginals only (Kukic, Li, and Bierlaire 2024).
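To illustrate why validating marginals alone can be insufficient, the following minimal sketch compares two toy populations. All records, attribute names, and the choice of total variation distance as a joint-distribution metric are assumptions made for illustration; they are not taken from the cited studies.

```python
from collections import Counter

def marginal(population, attribute_index):
    """Count how often each value of one attribute occurs."""
    return Counter(person[attribute_index] for person in population)

def total_variation(pop_a, pop_b):
    """Total variation distance between the joint distributions of two populations."""
    joint_a, joint_b = Counter(pop_a), Counter(pop_b)
    cells = set(joint_a) | set(joint_b)
    return 0.5 * sum(
        abs(joint_a[c] / len(pop_a) - joint_b[c] / len(pop_b)) for c in cells
    )

# Hypothetical (age_group, has_car) records: car ownership is strongly
# correlated with age in the reference, but independent in the synthetic data.
reference = (
    [("young", "no")] * 40 + [("young", "yes")] * 10
    + [("old", "no")] * 10 + [("old", "yes")] * 40
)
synthetic = (
    [("young", "no")] * 25 + [("young", "yes")] * 25
    + [("old", "no")] * 25 + [("old", "yes")] * 25
)

# Both one-dimensional marginals match exactly...
same_marginals = all(
    marginal(reference, k) == marginal(synthetic, k) for k in (0, 1)
)
# ...while the joint distributions differ substantially.
joint_gap = total_variation(reference, synthetic)
print(same_marginals, round(joint_gap, 3))
```

In this toy case a marginal-only check would accept the synthetic population even though it loses the age and car-ownership correlation entirely.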
Although open-source synthetic data is promoted (Hörl and Balac 2021), different, arbitrarily chosen algorithms are often used at each generation stage, which affects data quality. For instance, iterative proportional fitting (IPF) is used to generate socio-demographics despite available alternatives addressing IPF’s limitations (Sun and Erath 2015; Farooq et al. 2013; Aemmer and MacKenzie 2022; Lederrey, Hillel, and Bierlaire 2022). Thus, to advance research in multi-person and multi-period contexts, better integration of synthetic population generation methods with ABMs is needed (Ramadan and Sisiopiku 2019; Garrido et al. 2020).
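For readers unfamiliar with IPF, a minimal two-dimensional sketch is given below. The seed table, the target marginals, and the function name `ipf` are hypothetical; operational synthesizers typically handle more dimensions, zero cells, and household-level controls, which this sketch does not attempt.

```python
def ipf(seed, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Rescale a 2-D seed table until its row and column sums match the targets."""
    table = [row[:] for row in seed]
    for _ in range(max_iter):
        # Row step: rescale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [value * target / row_sum for value in table[i]]
        # Column step: rescale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns now match exactly; stop once the rows match as well.
        error = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if error < tol:
            break
    return table

# Hypothetical sample cross-tabulation (rows: age group, columns: employment).
seed = [[10.0, 20.0], [30.0, 40.0]]
row_targets = [400.0, 600.0]  # assumed census totals by age group
col_targets = [450.0, 550.0]  # assumed census totals by employment status

fitted = ipf(seed, row_targets, col_targets)
for row in fitted:
    print([round(value, 1) for value in row])
```

The fitted table reproduces the target marginals while preserving the association structure (the cross-product ratio) of the seed. This also hints at one known limitation: IPF can only reweight the dependence structure already present in the sample.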
Current research lacks methods for generating multi-period synthetic populations that include social group information, which is crucial for developing sophisticated ABMs that simulate multi-day scheduling with interpersonal interactions. Challenges persist in capturing the heterogeneous nature of intra-household dynamics and ensuring consistency of choices across household members. For multi-period ABMs, scheduling correlations arising from forward-looking behavior, past experiences, and latent behaviors should be studied. ABMs with interactions are difficult to calibrate due to the combinatorial size of choice sets, requiring consistent choice set generation techniques (Rezvany, Hillel, and Bierlaire 2024). The increased complexity and dimensionality in multi-person and multi-period synthetic populations and ABMs pose challenges in computational time, data representativity, and constraint management. Decomposition methods offer potential solutions to handle this complexity.
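The combinatorial growth of household-level choice sets can be illustrated with a toy enumeration. The candidate schedules and the escort constraint below are hypothetical and serve only to show the scale of the problem; they do not reproduce any model from the cited literature.

```python
from itertools import product

# Hypothetical candidate daily schedules for each household member.
candidates = {
    "adult_1": ["home-work-home", "home-work-shop-home", "home-escort-work-home"],
    "adult_2": ["home-work-home", "home-shop-home", "home-escort-home"],
    "child": ["home-school-home", "home-school-leisure-home"],
}

# The joint choice set is the Cartesian product of the individual sets,
# so its size is the product of the individual set sizes.
joint_choice_set = list(product(*candidates.values()))

def escort_consistent(joint_schedule):
    """Toy coordination rule (an assumption): exactly one adult escorts the child."""
    adult_schedules = joint_schedule[:2]
    return sum("escort" in schedule for schedule in adult_schedules) == 1

feasible = [joint for joint in joint_choice_set if escort_consistent(joint)]
print(len(joint_choice_set), len(feasible))
```

With K candidate schedules per person and n household members, the joint set has K**n alternatives before any coordination constraint is applied, and extending the horizon to several days multiplies the set further.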
In summary, integrating ABMs and synthetic populations, while considering interpersonal interactions over multiple time periods, is a promising research avenue.
Credit authorship contribution statement
Negar Rezvany: Conceptualization, Formal analysis, Methodology, Investigation, Writing – original draft, Visualization, Project administration. Marija Kukic: Conceptualization, Formal analysis, Methodology, Investigation, Writing – original draft, Visualization, Project administration. Michel Bierlaire: Conceptualization, Methodology, Supervision, Project administration, Writing – review & editing.
Acknowledgements
We would like to thank Janody Pougala for her valuable insights on the literature review for multi-period activity-based models.