RESEARCH QUESTION AND HYPOTHESIS
Bike sharing systems (BSSs) suffer from a central recurring problem: imbalance. Many bike stations become either empty or full during their daily operation. We hypothesize that we can reduce the cost of balancing bike stations by optimizing the number of bikes at each station at the start of the day, thus reducing the need for a dynamic balancing system (Schuijbroek, Hampshire, and van Hoeve 2017; Raviv and Kolka 2013; Lu 2016). We formulate our hypothesis by modeling each station using a Markov chain.
METHODS AND DATA
This study uses Ford GoBike’s BSS docking station data collected from August 2013 to August 2015 in the San Francisco Bay Area, as shown in Figure 1 (Bay Area Bikeshare 2016). The data provide the number of bikes at each station in one-minute intervals.
We used the discrete time-homogeneous Markov chain on a finite state space to model the system. We defined the state space as being all the possible states a station could be in. That is to say, if station $s$ had $C_s$ docks, then the number of states for that station would be $C_s + 1$, where the “empty station” is counted as one possible state.
A matrix $X$ was constructed for each station, day of the week, and hour of the day (i.e., a total of $S \times 7 \times 24$ $X$ matrices were constructed, where $S$ is the number of stations). Using a specific $X$ matrix, the transition frequency matrix $F_{s,d,h}$ was created by computing the elements $f_{ij}$, where $i, j \in \{0, 1, \ldots, C_s\}$. The elements $f_{ij}$ represent the number of times a transition occurred from state $i$ to state $j$ over a one-minute interval at a specific station, for a specific day of the week, and within a specific hour of the day. The transition probability matrix $P_{s,d,h}$ for a specific station, $s$, hour of the day, $h$, and day of the week, $d$, was then computed as
$$p_{ij} = \frac{f_{ij}}{\sum_{k=0}^{C_s} f_{ik}}$$
The calculated transition matrices above are the one-step transition matrices for a specific station, day of the week, and hour of the day. Each transition (i.e., the time tick) is conducted per minute, making the movement between states as smooth as possible throughout the hour.
The probability distribution of available bikes at the end of the day at a particular station is shown in Equation 1.
$$\pi_{s,d}(m) = e_m^{\top} \prod_{h=1}^{H} P_{s,d,h}^{(60)} \tag{1}$$
Here, $P_{s,d,h}^{(60)}$ is the 60-minute transition matrix obtained from simulating the corresponding one-step transition matrix for station $s$, day of the week $d$, and hour $h$; $e_m$ is the unit vector that places probability 1 on the initial state of $m$ bikes; $H$ is the number of hours in the day; and the product is taken in chronological order.
Equation 1 finds the probability distribution of the available bikes at the end of the day given that the station started the day with $m$ bikes. We count all possible paths from $m$ at the very first hour of the day to all possible states at the end of the day. We use the corresponding transition matrix to simulate the Markov chain in order to produce a probability distribution that describes the likelihood of each state at the end of the hour. This yields the probability distribution of available bikes at the end of the first hour. We then use this distribution as the initial state probabilities for the following hour and apply that hour's 60-step transition matrix to obtain the next probability distribution. This procedure is repeated until we reach our target hour, yielding the final probability distribution as a function of each initial condition.
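The estimation and hour-by-hour propagation described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names `estimate_transition_matrix` and `end_of_hour_distributions` and the toy data are hypothetical, exact matrix powers are used in place of simulation, and a state that was never observed in a given hour is assumed to remain unchanged.

```python
import numpy as np


def estimate_transition_matrix(minute_series, capacity):
    """Estimate a one-minute transition matrix from observed bike counts.

    minute_series: iterable of integer sequences, each holding consecutive
    one-minute bike counts for the same station, weekday, and hour.
    """
    n_states = capacity + 1  # states 0..capacity; "empty" counts as a state
    freq = np.zeros((n_states, n_states))
    for series in minute_series:
        series = np.asarray(series, dtype=int)
        # Count each observed move i -> j between consecutive minutes.
        np.add.at(freq, (series[:-1], series[1:]), 1)
    row_sums = freq.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums > 0, row_sums, 1)
    # Assumption: rows for states never observed fall back to "stay put".
    return np.where(row_sums > 0, freq / safe_sums, np.eye(n_states))


def end_of_hour_distributions(hourly_matrices, initial_bikes, steps_per_hour=60):
    """Return the state distribution at the end of each hour.

    hourly_matrices: one-step matrices in chronological order, one per hour.
    """
    n_states = hourly_matrices[0].shape[0]
    dist = np.zeros(n_states)
    dist[initial_bikes] = 1.0  # the day starts with exactly `initial_bikes`
    results = []
    for one_step in hourly_matrices:
        # Sixty one-minute steps carry the distribution across the hour.
        dist = dist @ np.linalg.matrix_power(one_step, steps_per_hour)
        results.append(dist.copy())
    return np.array(results)


# Toy example: a hypothetical station with two docks and two observed hours.
P = estimate_transition_matrix([[0, 1, 1, 2], [2, 1, 0, 0]], capacity=2)
dists = end_of_hour_distributions([P, P], initial_bikes=1)
```

Each row of `dists` is the distribution at the end of one hour, and it serves as the starting distribution for the next hour. Calling the second function once per possible initial count gives the end-of-day distribution as a function of the initial condition.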
When running the Markov chain, our objective was to find the initial conditions that maximize the probability of the station operating at a bike-to-capacity ratio (number of bikes relative to the capacity of the station) within the range of 0.25 to 0.75 at the end of each hour, as shown in Equation 2.
$$\max_{m_s} \sum_{h} w_h \sum_{L_s \le n_h \le U_s} \Pr(m_s, n_h) \tag{2}$$
where $m_s$ is the initial condition of station $s$; $h$ is the hour of the day (considered only the hours from 6:00 a.m. to 8:00 p.m. in our case); $w_h$ is the weight assigned to hour $h$ (assumed to be 1.0); $n_h$ is the expected state of the station at the end of hour $h$; $L_s$ and $U_s$ are the lower and upper desired bounds of the station status (in our case, $L_s = 0.25 \times C_s$ and $U_s = 0.75 \times C_s$); $C_s$ is the capacity of station $s$; and $\Pr(m_s, n_h)$ is the probability of having an initial state $m_s$ and a resulting state $n_h$ at the end of hour $h$.
FINDINGS
We used the BSS data to build the Markov chain for each station and day of the week combination to investigate the daily imbalances and identify the optimal inventory level that would minimize the probability of a station reaching an empty or full state. When analyzing the results, we first looked at all 70 stations, considering different initial conditions to identify stations that would benefit most from optimizing the initial station state. We grouped stations into three categories:
(1) Those that have an imbalance issue but a small probability (≤10%) for <25% of the initial conditions;
(2) Those that have an imbalance issue with a medium probability (11% to 25%) for 25% to 45% of initial conditions; and
(3) Those that have an imbalance issue with a large probability (>25%) for >45% of the initial conditions.
In Table 1, we present each category’s percentage separately by city, as a previous study showed that there were close to no trips between the five cities (Ashqar et al. 2017).
As shown in Table 1, San Francisco has the highest percentage of category 3 stations, followed by San Jose. This indicates that San Francisco stations experience high bike demand and are thus more likely to have an imbalance problem during the day. Our proposed approach would therefore be less effective for San Francisco and more effective for the other cities, given that the daily evolution of states at San Francisco stations varies considerably.
Our analysis shows that the optimal initial conditions vary from day to day at the same station, and thus we present the optimal initial conditions for each day of the week for one selected station in Mountain View and one in San Francisco. Note that we made two assumptions when choosing the optimal initial conditions: (1) the bikes are taken from an infinite pool, meaning we have no constraints on the available inventory, and (2) there is no interaction between stations. The optimal station state is assumed to occur when the bike-to-capacity ratio ranges between 0.25 and 0.75 over the entire day, thus minimizing the probability of reaching either an empty or full state. Table 2 presents the three best initial states for stations 26 and 59, that is, those that result in the highest probability of maintaining a bike-to-capacity ratio between 0.25 and 0.75 for the entire day. Consistent with the city-level results discussed earlier, Table 2 shows that the San Francisco station has a lower probability of remaining in the optimum range over the entire day.
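A ranking of initial states of the kind reported in Table 2 could be produced along the following lines. This is a minimal sketch under stated assumptions rather than the code used in the study: the function names `in_range_score` and `best_initial_states` are hypothetical, hourly matrices are assumed to be one-minute transition matrices in chronological order, exact matrix powers stand in for simulation, and the score is the weighted sum of hourly in-range probabilities from Equation 2 (not the joint probability of staying in range all day).

```python
import numpy as np


def in_range_score(hourly_matrices, initial_bikes, capacity,
                   weights=None, low=0.25, high=0.75, steps_per_hour=60):
    """Weighted sum over hours of the probability that the station ends
    the hour with a bike-to-capacity ratio between `low` and `high`."""
    states = np.arange(capacity + 1)
    in_band = (states >= low * capacity) & (states <= high * capacity)
    if weights is None:
        weights = np.ones(len(hourly_matrices))  # every hour weighted 1.0
    dist = np.zeros(capacity + 1)
    dist[initial_bikes] = 1.0
    score = 0.0
    for weight, one_step in zip(weights, hourly_matrices):
        # Advance the distribution by one hour of one-minute steps.
        dist = dist @ np.linalg.matrix_power(one_step, steps_per_hour)
        score += weight * dist[in_band].sum()
    return score


def best_initial_states(hourly_matrices, capacity, top_k=3, **kwargs):
    """Rank every possible initial bike count by its in-range score.

    Assumes an unconstrained inventory and no interaction between stations,
    so each station can be optimized on its own.
    """
    scores = np.array([
        in_range_score(hourly_matrices, m, capacity, **kwargs)
        for m in range(capacity + 1)
    ])
    order = np.argsort(-scores, kind="stable")[:top_k]
    return [(int(m), float(scores[m])) for m in order]


# Toy example: a hypothetical four-dock station whose state never changes,
# evaluated over three hours.
static_hours = [np.eye(5)] * 3
ranking = best_initial_states(static_hours, capacity=4)
```

In this toy case only the initial counts of 1, 2, and 3 bikes keep the station inside the 0.25 to 0.75 band, so they receive the full score and the empty and full starts receive none. With estimated matrices, running the ranking once per day of the week would show how the preferred initial counts shift from day to day.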
ACKNOWLEDGMENTS
This effort was funded by the University Mobility and Equity Center (UMEC).