An Agent-Based Simulation Model of Pedestrian Evacuation Based on Bayesian Nash Equilibrium

Yiyu Wang; Jiaqi Ge; Alexis Comber

doi:10.18564/jasss.5037

Introduction

Large public gatherings or crowds are commonplace and have been the subject of simulation research in many studies related to crowd management, disaster management and evacuation planning (Babojelić & Novacko 2020). However, in-depth research on pedestrians has been hindered by difficulties such as complex individual behaviours, different disaster characteristics, and varying environmental factors (Wijermans & Templeton 2022). As evacuee behaviour and movement vary between scenarios, a number of field observations and simulation experiments have been conducted to explore pedestrian flows, movement patterns and potential factors affecting evacuation under different types of emergencies (Feng et al. 2021; Rozo et al. 2019; Sevtsuk & Kalvo 2022). Despite many simulation studies of pedestrian behaviours, few common behavioural features of pedestrian flows have been explored (Babojelić & Novacko 2020; Vermuyten et al. 2016). One of the main obstacles is the lack of experimental datasets that closely match individual movements during real-world evacuations. Consequently, a more intelligent evacuation simulation model of pedestrian flow is needed, one that realistically replicates the movement and behaviours of evacuees and is easily adaptable to various evacuation scenarios.

Simulation models of pedestrian flow are generally classified into one of two main categories: macroscopic models and microscopic models. Macroscopic simulation models consider pedestrian flows as a single unit such that evacuees in these models are homogeneous during simulation (Jiang et al. 2010; Piccoli & Tosin 2011). In these cases, it is difficult to observe the interactions of individual pedestrians and their (micro-level) behaviours. To address this, a number of approaches have been developed, such as cellular automata, lattice gas automata, social force models, and other simulation tools, allowing pedestrians to be partially heterogeneous during simulation. Pedestrians determine their own actions according to their surrounding environment, but the probability distributions of their decisions are still controlled at the macroscopic level (Lu et al. 2017; Teknomo 2016). Agent-based modelling (ABM) can fully capture individual behavioural heterogeneity in pedestrian models, and it is one of the main individual-based models used to simulate pedestrian movement in different scenarios. It has the capability to reveal the aggregated patterns of individual actions and environments from the bottom up (Bar‐Haim 2010).

Game theory has attracted much attention from researchers in the fields of pedestrian behaviours and evacuation simulation. Current research on pedestrian flow pays much attention to whether and how a simulation can closely match the reactions of pedestrians in different real-world scenarios by predicting pedestrians’ next move. Game theory provides an effective approach to realistically reproduce individual decision-making processes. Specifically, a game-theoretic approach assumes individuals make decisions based on their beliefs, which are updated in response to their surroundings. The best pedestrian strategies or responses can be derived using different game theories. For instance, Rigos et al. (2019) introduced Sequential Equilibrium and perfect Bayesian Equilibrium into their response model to simulate evacuee actions after receiving the order to evacuate. Liao et al. (2019) incorporated Bayesian Nash Equilibrium into their simulation model to discover the relationships between safe pedestrian flow rates and public space area. There are many other examples of research on pedestrian behaviours that have introduced game theory into their models to more realistically simulate pedestrian decision-making and behaviours (Bouzat & Kuperman 2014; Lo et al. 2006; Mesmer & Bloebaum 2016). However, the main objectives of the studies focusing on both game theory and ABMs have generally been to compare simulations based on game theory with agent-based approaches (Noori et al. 2021), or to combine ABMs with simple game theory such as a zero-sum 2-player game (Levy et al. 2018; Lo et al. 2006). Research using Bayesian game theory for pedestrian evacuation has tended to simulate individuals’ selection of a final exit rather than their actions during the evacuation process (Bouzat & Kuperman 2014; Mesmer & Bloebaum 2014). The focus of these simulations was the interaction between a small number of agents rather than the mutual influences of a large number of agents with the environment. In summary, game theory has been widely adopted in the context of individual evacuation decisions such as exit selection and route optimization (Levy et al. 2018; Mesmer & Bloebaum 2014) and is regarded as an appropriate behavioural model to simulate pedestrians’ actions under emergencies.

Few studies have sought to simulate pedestrian behaviours during emergency evacuation using game-theoretic approaches. One of the main barriers is that the interactions between individual behaviours and macro-phenomena of pedestrian flow are complex, and general game theories cannot account for the complexity of interactions between and among pedestrians, as well as with their environment. Early game theories place a number of restrictions on agents and environments, such as the Nash game with complete information (Rosen 1965) and team-based decision making (Radner 1962). The refined Bayesian Nash Equilibrium (BNE) proposed by Ui (2016) relaxes the complete information constraint and considers a game with incomplete information, which is more realistic in the context of evacuation, when complete real-time information is often unavailable to individuals. Compared with the monotonic payoff gradient of the traditional Nash Equilibrium, BNE defines a correlated equilibrium with payoff gradients that vary with the game conditions, and it includes the Nash Equilibrium as a particular case. As a result, BNE is more suitable for scenarios with incomplete information, multiple equilibria, and varied payoff gradients, suggesting opportunities to use it to simulate pedestrian movement in an ABM. BNE has been used mainly in advertising and other economics fields (Gomes & Sweeney 2014), with little research using BNE to simulate pedestrian evacuation.

This research develops an agent-based evacuation simulation model of pedestrian flow by introducing BNE to realistically simulate individual decision-making processes and pedestrian behaviours in an emergency evacuation. The model combines Bayesian game theory and an agent-based approach at an individual level, to provide an experimental environment of pedestrian flows to support research on crowd management and evacuation planning. It was hypothesized that a BNE approach would positively affect individual evacuations during model simulation because of its capacity to predict future congestion levels. A series of simulation experiments was conducted with different parameter configurations to understand whether and how BNE affects pedestrian emergency evacuations.

BNE Model

The initial model was developed in NetLogo. The source code and experimental data have been published on COMSES and are available at https://doi.org/10.25937/75wf-aa82. The full technical details of the model are given in the Appendix, following the ODD+D protocol. This section mainly describes the BNE behavioural model.

In order to translate the rationality of BNE theory into specific decision-making rules, a series of utility functions are introduced in this model to realize the BNE behavioural model. Individual decision-making depends on the value of “Total Utility” for the candidate patches. Three utilities are involved: Distance Utility (\(U_{d}\)), Comfort Utility (\(U_{c}\)) and Expected Comfort Utility (\(U_{ec}\)). Total utility is the sum of \(U_{d}\) and \(U_{ec}\), with \(U_{c}\) entering through \(U_{ec}\). Specifically, the decision made by each BNE agent considers the distance from its current position to the exit, the number of neighbours who may move to the same patch as itself, and the possible surrounding situations in the next time step. The agent then moves to the patch with the maximum total utility (i.e., the sum of \(U_{ec}\) and \(U_{d}\)). In other words, agents use BNE to predict the congestion level in the next time step and then avoid the most clogged patches during their movement, in order to determine an evacuation route with a shorter exit time and a higher comfort level. The choice criterion is the value of total utility in neighbouring patches, which agents evaluate to decide where to go. In this model, all BNE-related utilities were set as patch attributes and are described in detail below.

Distance utility

This represents the distance from the current location to the exit. Since we assume that agents tend to choose the patch with the largest value of total utility, \(U_{d}\) should increase as a patch gets closer to the exits. Because the evacuation space has two exits, two sets of distance utility are determined, one for agents moving to the right exit and one for agents moving to the left exit (i.e., parameters \(U_{d, rt}\) and \(U_{d, lf}\)). The equation is:

\[ U_{d} = \frac{D - d}{D}\]

\[(1)\]

where \(d\) represents the distance from the current patch to the exit, and \(D\) refers to the diagonal of the evacuation space.
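As an illustration of Equation 1, the following minimal Python sketch computes \(U_{d}\) for a patch. It is not taken from the NetLogo source: the function name `distance_utility`, the use of Euclidean distance, the rectangular space described by `width` and `height`, and the treatment of the exit as a single point are all assumptions made for the example.

```python
import math

def distance_utility(patch_xy, exit_xy, width, height):
    """Equation 1: U_d = (D - d) / D.

    Assumptions (not stated in the paper): d is the Euclidean distance
    from the patch to a single exit point, and the evacuation space is
    a width x height rectangle whose diagonal is D.
    """
    D = math.hypot(width, height)      # diagonal of the evacuation space
    d = math.dist(patch_xy, exit_xy)   # distance from patch to exit
    return (D - d) / D

# A patch on the exit scores 1; the farthest corner scores 0.
print(distance_utility((0.0, 0.0), (0.0, 0.0), 3.0, 4.0))
print(distance_utility((3.0, 4.0), (0.0, 0.0), 3.0, 4.0))
```

With two exits, the same function would be evaluated once per exit to obtain the two sets of values corresponding to \(U_{d, rt}\) and \(U_{d, lf}\).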

Comfort utility

Comfort Utility, \(U_{c}\), is a set of coefficients that form a crucial component of Expected Comfort Utility, reflecting the comfort level of agents in any one patch. According to the speed-density relation associated with the Spatial-Grid Evacuation Model (SGEM) (Lo et al. 2004), the value is set to 1 when two or fewer agents occupy the patch. It decreases as the number of agents on the patch increases, by setting the value to be a proportion of the free-moving speed (i.e., 1.4 m/s) relative to the number of agents on the patch. Considering the limited space capacity in reality, \(U_{c}\) is set to zero when more than four agents occupy the same patch. The equation is as follows, with full details in the Speed Calibration Section (Section 3.1):

\[ U_{c} = \begin{cases} 1.00, & n \leq 2\\ 0.51, & n = 3\\ 0.07, & n = 4\\ 0.00, & n \geq 5 \end{cases}\]

\[(2)\]

where, \(n\) represents the number of agents in one patch.
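Equation 2 is a simple lookup, which could be sketched in Python as follows. The function name `comfort_utility` is hypothetical and the coefficients are copied directly from Equation 2; rejecting negative counts is an added safeguard, not something the paper specifies.

```python
def comfort_utility(n):
    """Equation 2: comfort coefficient for n agents on one patch."""
    if n < 0:
        raise ValueError("number of agents cannot be negative")
    if n <= 2:
        return 1.00
    if n == 3:
        return 0.51
    if n == 4:
        return 0.07
    return 0.00  # five or more agents: the patch is treated as full

for n in range(6):
    print(n, comfort_utility(n))
```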

Expected comfort utility

According to the definition of BNE, individual decision-making in this model is independent, which means that no account is taken of the agent’s previous actions in each time step. The main factor determining where agents go is the number of agents who may move to the candidate patches in the next time step. In other words, the probability of the neighbours’ next actions has an impact on the decision-making process of the agent. Figure 1 illustrates how the BNE agents on the blue patch compete with other agents on the eight surrounding patches; the probabilities of agents entering the blue patch are also marked.

It is assumed that the probability of reverse movement during evacuation is extremely low, which means that each agent has six optional directions (i.e., candidate patches) \(P_{0}, P_{1}, \dots, P_{5}\) in each time step (see Figure 1). The probability of entering these candidate patches \(P_{m}\) is set to the same value (i.e., 16.7%) by default, which could be adjusted using the Probability-competing slider in further studies.

Thus, the patch variable named Expected Comfort Utility (\(U_{ec}\)) for each patch is dynamic in this model and reflects the interaction between agents. It is defined as the comfort utility \(U_{c}(n)\) weighted by the probability \(p(n)\) that \(n\) agents move to this patch in the next time step, summed over \(n\) (see Equation 3):

\[ \begin{split} U_{ec} = & \sum_{n = 0}^{4}p(n)U_{c}(n) \\ = & \sum_{n = 0}^{4}C_{N}^{n} P_{m}^{n}(1 - P_{m})^{N - n}U_{c}(n) \end{split}\]

\[(3)\]

where, \(n\) represents the number of agents in this patch; and \(P_{m}\) refers to the probability of agents entering the candidate patches, which is set to 16.7% by default. In this way, the calculation of \(U_{ec}\) takes into account the agents on both the patch and its eight neighbouring patches (i.e., Moore neighbourhood) (see Figure 1).
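The binomial sum in Equation 3 can be evaluated with a few lines of Python. This is a sketch under stated assumptions rather than the model's own code: \(N\) is not defined explicitly in the text, and it is taken here to be the number of agents on the patch and its eight neighbouring patches that could enter the patch; the names `expected_comfort_utility` and `COMFORT` are hypothetical.

```python
from math import comb

# Comfort coefficients U_c(n) from Equation 2 for n = 0..4 (U_c is 0 beyond 4).
COMFORT = {0: 1.00, 1: 1.00, 2: 1.00, 3: 0.51, 4: 0.07}

def expected_comfort_utility(N, p_m=0.167):
    """Equation 3: sum over n = 0..4 of C(N, n) p^n (1 - p)^(N - n) U_c(n).

    Assumption: N is the number of agents that could move to the patch
    (agents on the patch and its Moore neighbourhood). p_m is the
    Probability-competing value, 16.7% by default.
    """
    total = 0.0
    for n in range(5):
        # comb(N, n) is 0 when n > N, so small N needs no special case.
        p_n = comb(N, n) * p_m ** n * (1 - p_m) ** (N - n)
        total += p_n * COMFORT[n]
    return total

for N in (0, 3, 9, 20):
    print(N, round(expected_comfort_utility(N), 4))
```

Under these assumptions the value stays close to 1 while few agents compete for the patch and falls as \(N\) grows, which is the behaviour a congestion-avoiding agent needs.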

The relationships between these utilities are illustrated in Figure 2.
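The decision rule built from these utilities, moving to the candidate patch with the largest total utility, might be sketched as below. The tuple layout, the function name `choose_patch`, and the way the `weight-U_d` input multiplies \(U_{d}\) are assumptions for illustration; the paper states only that total utility is the sum of \(U_{d}\) and \(U_{ec}\) and that the weight is 1 in all experiments.

```python
def choose_patch(candidates, weight_ud=1.0):
    """Return the id of the candidate patch with maximum total utility.

    candidates: list of (patch_id, u_d, u_ec) tuples for the candidate
    patches. Total utility is taken as weight_ud * u_d + u_ec; with
    weight_ud = 1 this reduces to U_d + U_ec. Ties go to the first
    candidate in the list (an arbitrary choice for this sketch).
    """
    best = max(candidates, key=lambda c: weight_ud * c[1] + c[2])
    return best[0]

# Hypothetical values: P2 is closest to the exit but heavily contested.
candidates = [("P0", 0.80, 0.50), ("P1", 0.75, 0.98), ("P2", 0.90, 0.10)]
print(choose_patch(candidates))
```

In this made-up example the agent prefers P1, which is slightly farther from the exit than P2 but expected to be far less crowded.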

**Figure 1.** The schema of agent decision-making.

**Figure 2.** The calculation of BNE utilities.

Calibration, Simulation Experiments

Speed calibration

To achieve a more realistic evacuation simulation, it is assumed that individual speeds in the model should change over time rather than being a static attribute. The variation of moving speed is closely associated with the number of surrounding agents, which means that the speed parameter should be calibrated accordingly. After comparing several of the main pedestrian speed-crowd density models adopted in recent years (Luo et al. 2018; Mesmer & Bloebaum 2016; Zhou et al. 2019), the Spatial-Grid Evacuation Model (SGEM) proposed by Lo et al. (2004) was considered the most appropriate relation model for this research, as it takes into account the interconnections between surrounding pedestrians, as well as the potential effects of short-term contact among pedestrians on individual evacuation speeds.

The general trend in the speed-density relation model remains consistent when the crowd density is less than 4 person/\(m^{2}\), with pedestrians in a free motion state with a speed of about 1.4 \(m\)/s. However, when crowd density is greater than 8 person/\(m^{2}\), pedestrians are considered to be in a state of constrained motion and move at around 0.1 \(m\)/s. When crowd density lies between 4 person/\(m^{2}\) and 8 person/\(m^{2}\), pedestrian movement starts to be restricted and movement speed declines with increasing numbers of people. As the average step size of adults is around 0.7 \(m\) and their mean response time is about 0.5 s (Chang et al. 2021), several parameters are adjusted in the SGEM model to fit the current environment. In this case, the initial speed in the model is adjustable through the move-speed slider instead of imposing a fixed value (i.e., 1.4 \(m\)/s), and individual speed is set to be inversely proportional to the number (density) of agents in the agent’s Moore neighbourhood. High density has a decay effect on agents’ moving speed, which means that agents encircled by a crowd of people cannot hop large distances to a free patch next to the exit. The resulting speed-density relationship for this model is given in Equation 4:

\[ V = \begin{cases} 1.4, & 0 < \rho \leq 4\\ 0.03\rho^{2} - 0.64\rho + 3.36, & 4 < \rho < 8\\ 0.1 (\approx 0), & \rho \geq 8 \end{cases}\]

\[(4)\]

where \(\rho\) is the density of pedestrians (person/\(m^{2}\)).
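For readers who want to experiment with Equation 4 outside NetLogo, a minimal Python sketch follows. The function name `walking_speed` is hypothetical, the coefficients are those printed in Equation 4, and raising an error for non-positive densities is an added choice because the equation is only stated for \(\rho > 0\).

```python
def walking_speed(rho):
    """Equation 4: walking speed (m/s) as a function of density (person/m^2)."""
    if rho <= 0:
        raise ValueError("Equation 4 is defined for rho > 0 only")
    if rho <= 4:
        return 1.4                                   # free motion
    if rho < 8:
        return 0.03 * rho ** 2 - 0.64 * rho + 3.36   # restricted motion
    return 0.1                                       # constrained motion

for rho in (2, 5, 6, 7, 8):
    print(rho, round(walking_speed(rho), 2))
```

Evaluating the middle branch as printed gives about 1.28 m/s just above \(\rho = 4\) and about 0.16 m/s just below \(\rho = 8\), so the sketch reproduces the small jumps at both boundaries rather than smoothing them.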

Simulation experiments

A series of simulation experiments was conducted in NetLogo BehaviorSpace to evaluate the role that BNE plays in pedestrian evacuation. To determine whether and how BNE affects individual evacuation, three simulation scenarios were evaluated: (1) Singleton Pattern with a fixed number of persons: evacuations in which all pedestrians follow one of three behavioural models (i.e., Shortest Route (SR), Random Follow (RF), or BNE), (2) Mixed Pattern with a fixed number of persons: evacuations in which a specific proportion of agents follow BNE and the rest evacuate by one of the other two models (i.e., BNE mixed with SR/RF), and (3) Mixed Pattern with a varied number of persons: evacuations in which the number of persons varies from 1100 to 3000 and two types of agent movement take part in the simulation (i.e., BNE mixed with SR/RF). The evacuations were evaluated in terms of evacuation time and expected comfort utility to assess the performance of BNE.

The first experiments simulated evacuations in a tunnel space with all agents following one of BNE, RF, and SR models, with 2000 or 3000 pedestrians. The experiments were replicated 50 times for each parameter configuration and stopped when all of the agents evacuated successfully.

The second set of experiments evaluated how different proportions of agents following BNE would affect evacuation. The model was initialized with 2000 or 3000 agents, in which a varying proportion of agents followed BNE, and the rest SR or RF. Here the percentage of BNE users was set to vary from 0% to 100% in intervals of 2%. Simulations were replicated 50 times for each parameter configuration to evaluate the variations of both exit time and expected comfort utility.

The third set of experiments then simulated different numbers of pedestrians in the same tunnel space to explore the influence of BNE on evacuation scenarios with different crowd densities. Here the agents were randomly scattered over the simulation space with selected moving combinations (i.e., BNE mixed with SR or BNE mixed with RF). The number of people varied from 1100 to 3000 in intervals of 100, and the proportion of BNE users was set to vary from 0% to 100% in increments of 2%. In this case, 30 repetitions were undertaken for each parameter configuration.

The values of all the other variables remain unchanged in all experiments. The list of model inputs is shown in Table 1.

**Table 1:** The list of parameter settings in experiments.

| Parameters | Values (Experiment 1) | Values (Experiment 2) | Values (Experiment 3) | State |
| --- | --- | --- | --- | --- |
| number-persons | 2000/3000 | 2000/3000 | 1100 to 3000 (+100) | Dynamic |
| Percentage-of-agents-with-BNE | 100% | 0% to 100% (+2%) | 0% to 100% (+2%) | Dynamic |
| Probability-competing | 16.7% | 16.7% | 16.7% | Static |
| door-width | 6 | 6 | 6 | Static |
| move-speed | 2 \(m\)/s | 2 \(m\)/s | 2 \(m\)/s | Static |
| Step-length | 0.7 \(m\) | 0.7 \(m\) | 0.7 \(m\) | Static |
| follow-radius | 3 | 3 | 3 | Static |
| weight-\(U_{d}\) | 1 | 1 | 1 | Static |
| moving-pattern | Shortest Route; Random Follow; BNE mixed with SR/RF | BNE mixed with SR/RF | BNE mixed with SR/RF | Dynamic |
| Repetitions | 50 simulations for each behavioural model | 50 simulations at each percentage fraction of BNE agents | 30 simulations at each percentage fraction of BNE agents | |
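To give a sense of the size of the experimental design, the following Python sketch counts the simulation runs implied by the settings above. It assumes a full factorial design in which every combination of population size, BNE percentage and behavioural combination is run for the stated number of repetitions; the helper `count_runs` and the labels used for the combinations are hypothetical.

```python
from itertools import product

def count_runs(populations, bne_percentages, patterns, repetitions):
    """Number of runs for a full factorial design (an assumption)."""
    configurations = list(product(populations, bne_percentages, patterns))
    return len(configurations) * repetitions

bne_steps = range(0, 101, 2)           # 0% to 100% in steps of 2%
mixed = ["BNE+SR", "BNE+RF"]           # the two mixed combinations

exp1 = count_runs([2000, 3000], [None], ["BNE", "RF", "SR"], 50)
exp2 = count_runs([2000, 3000], bne_steps, mixed, 50)
exp3 = count_runs(range(1100, 3001, 100), bne_steps, mixed, 30)

print(exp1, exp2, exp3)
```

Under this assumption the third experiment dominates the computational cost, which may explain why its repetitions were reduced from 50 to 30.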

Results

To evaluate the effects of BNE on pedestrian evacuation, evacuation time and expected comfort utility were determined for each set of runs. Evacuation time is the time taken for all agents to successfully evacuate from the simulation space. Expected comfort utility refers to the overall mean of \(U_{ec}\) recorded in each time step during a simulation.

For the first set of experiments, the evacuation times for each of the three behavioural models (i.e., BNE, RF, SR) were evaluated for 2000 and 3000 persons (Figure 3). Each simulation was run with all agents following the same behavioural model (i.e., one of the three models above), and 50 repetitions were undertaken for each behavioural model. The results show that BNE exit time was nearly half that of Random Follow and around two-thirds of the Shortest Route time. Similar findings were obtained for both groups, demonstrating the impact that BNE has on reducing evacuation times compared with other more general behavioural models.

**Figure 3.** Evacuation time of 3 movement models: BNE, RF, SR (Number of Agents: 2000 and 3000; 50 simulations conducted for each behavioural model).

Additionally, specific stages of the behavioural models were recorded during simulation. The model views were exported every 20 ticks and the stages of the models in the first 100 ticks are illustrated in Figure 4. As shown, the congestion levels were more severe in the Shortest Route than in the Random Follow model, which were in turn more severe than in BNE. This potentially explains the poorer performance of SR compared to RF. SR agents move towards the exits all the time, causing jams, especially as more agents try to evacuate through the exits (which may not be wide enough). RF agents randomly follow one of their neighbours, resulting in smaller-scale crowd groupings during evacuation. So, congestion is present in both the RF and SR models, but the relatively smaller scale of congestion in RF results in a quicker evacuation. BNE pedestrians were able to forecast the level of congestion in the next time step to avoid jams.

**Figure 4.** The stages of the flow of agents following three behavioural models: 100% BNE, RF and SR.

The model was first simulated with 2000 persons and varying levels of BNE mixed with SR and BNE mixed with RF. Figures 5 & 6 illustrate the variations in exit time and mean expected comfort utility of these two combinations respectively, and a LOESS (Locally Estimated Scatterplot Smoothing) trend line with a 95% confidence interval was created to indicate the association between the percentage of BNE agents and evacuation time. As shown in Figure 5, evacuation time decreases as the number of BNE users increases in both experiments and flattens at about 80% BNE users. Figure 6 shows the variations of average expected comfort utility in the two experiments. Higher expected comfort utility values indicate that evacuating pedestrians feel more comfortable during simulation. Figure 6 illustrates a decreasing trend of \(U_{ec}\) in the BNE-RF combination and an increasing one for the BNE-SR combination, with a peak at around 40% of agents using BNE and 60% using the SR model. That is, BNE shows a significant positive effect on improving individual comfort level when the BNE and SR models are mixed.

**Figure 5.** Evacuation time against percentage BNE with RF and with SR, with a local line of fit, for 2000 agents, 50 simulations for each percentage fraction.

**Figure 6.** Expected comfort utility against percentage BNE with RF and with SR, with a local line of fit, for 2000 agents, 50 simulations for each percentage fraction.

To further confirm the influence of BNE, the number of agents was set to 3000 and the experiment repeated. Figure 7 shows a strong downtrend in evacuation time with increasing proportions of BNE and upward trends after around 60%-70% with RF and 70%-80% with SR. The average \(U_{ec}\) showed an increasing trend with BNE with both RF and SR (Figure 8), indicating the positive effects of BNE on reducing evacuation time, as well as improving individual comfort during evacuations.

**Figure 7.** Evacuation time against percentage BNE with RF and with SR, with a local line of fit, for 3000 agents, 50 simulations for each percentage fraction.

**Figure 8.** Expected comfort utility against percentage BNE with RF and with SR, with a local line of fit, for 3000 agents, 50 simulations for each percentage fraction.

A potential reason for the contradictory trends of average \(U_{ec}\) in the BNE-RF combination (Figures 6 & 8) may be the difference in crowd density. With higher density (i.e., 3000 persons), BNE has a notable influence on improving pedestrian comfort. To unpick this further, a third set of experiments was conducted with the number of agents varying from 1100 to 3000 in increments of 100, and 30 repetitions were undertaken at each percentage fraction of BNE users. Both exit time and average \(U_{ec}\) were recorded to further evaluate the effects of BNE on pedestrian evacuation with different crowd densities. The full results are shown in the Appendix. Figure 9 shows the mean exit times for different numbers of agents and different BNE percentages. When BNE is combined with RF, there is little advantage in using BNE at low densities (i.e., around 1500 agents). However, a distinct decrease of evacuation time with increasing BNE proportion can be observed in the scenarios with over 1500 agents. Thus, the positive influence of BNE on reducing exit time becomes increasingly evident with increased crowd density. Figure 9 also indicates the non-monotonicity of the mean values, with the reduction in exit time bottoming out at around 60% BNE agents and 40% RF agents. Similar trends are evident in BNE with SR. The exit time declines with increasing BNE proportions as the densities increase, but with a greater positive effect at lower crowd densities than in BNE with RF. In this case the reduction in mean exit time bottoms out at around 80% BNE agents with 20% SR agents.

**Figure 9.** Contour plot of exit time vs number of agents and percentage of BNE users (BNE-RF/SR combinations; 1100 to 3000 agents; 30 simulations for each percentage fraction).

The changes in mean expected comfort utility, \(U_{ec}\), with different numbers of agents and different BNE percentages are shown in Figure 10. Since BNE appeared to have no influence on improving pedestrians’ comfort when the number of agents was set to 2000 in the BNE-RF combination (Figure 6), the range of initial pedestrian numbers was extended, and \(U_{ec}\) showed an upward tendency when the number was set to 2300, increasing with agent density. Figure 10 shows gradual increases in comfort utility with increasing BNE with RF at higher densities (greater than 2000) up to around 30% BNE, followed by declines, except at the highest densities (greater than 2500). For BNE with SR, comfort utility improves with increased BNE up to around 50%, and again increases at higher densities.

A plausible explanation for this variation of mean values with density and BNE percentage is that BNE agents occupying the same patch select the same target patch, especially when almost all agents are BNE, which leads to a low evacuation speed. Specifically, since BNE agents select the patch with maximum total utility as the target, they may simultaneously choose the same targets in the latter part of the simulation, causing congestion.

**Figure 10.** Contour plot of mean expected comfort utility vs number of agents and percentage of BNE users (BNE-SR/RF combinations; 1100 to 3000 persons; 30 simulations for each percentage fraction).

In summary, BNE has been shown to have a beneficial influence on shortening evacuation time as well as improving pedestrian comfort during emergency evacuation. The advantages of BNE were more pronounced in the scenarios with high densities than in those with low ones. BNE agents were observed to display more refined behaviours during evacuation and were able to avoid the clogged areas, whilst considering the distance to the exits and the probability distribution of their neighbours’ movements in the next time step. It was also found that the introduction of a proportion of non-BNE agents could speed up the evacuation process to a certain extent.

Discussion and Conclusions

This paper evaluated evacuation models that incorporate game theory (i.e., Bayesian Nash Equilibrium) within multi-agent systems with the aim of providing more realistic simulations of the movements and behaviours of evacuating pedestrians. Pedestrian agents adopting BNE were found to be more representative than agents with other behavioural models (SR and RF). At each step, they predict the next move of their neighbours and then avoid the most congested patches on their way to the exits, resulting in a relatively high comfort level. It was hypothesised that agents adopting BNE may provide a more forward-looking and representative behavioural model for pedestrian evacuation, as BNE agents sought to avoid congestion instead of directly moving to the exits or blindly following others.

A series of simulation experiments was undertaken to evaluate the role of BNE in pedestrian evacuation. The results demonstrate the positive impacts of BNE on reducing evacuation time and improving individual comfort during pedestrian evacuation, which was consistent regardless of the number (density) of agents. Agents with BNE displayed more efficient and intelligent behaviours during simulation compared with RF and SR agents, suggesting that simulation models that incorporate such behaviours for pedestrians during evacuation could better represent real-world scenarios. The individual decision-making process based on BNE is easily adaptable to other pedestrian simulations relating to flooding or fire and has the potential to fill the gap left by the lack of a forward-looking, intelligent individual behavioural model in ABMs for pedestrian evacuation.

The BNE agents in this model were assumed to be able to independently determine their next actions after considering environmental factors, as well as the probabilities of their neighbours’ movements and decisions, which in turn affect the subsequent steps of other agents. That is, agent decisions and movement with BNE varied based on their interactions with their surroundings, predictions of neighbours’ actions and their own decisions, making the model outputs more realistic. This is different to previous simulation models of pedestrian flow (e.g., Jiang et al. 2010; Lu et al. 2017; Teknomo 2016). Thus far, research incorporating game theory and agent-based modelling has mainly focused on comparing game theory and ABM approaches (Noori et al. 2021) and on the combination of ABMs and simple game theory (Levy et al. 2018). Studies of cooperation under Bayesian game theory and pedestrian evacuation have generally focused on individual decision-making over exit choice rather than pedestrian movements during the evacuation process (Mesmer & Bloebaum 2014). The models described in this paper address this gap by incorporating complex game theory within an ABM approach at the individual agent level. Rather than simulating exit selection, this model simulated pedestrian decisions using BNE at each time step, which more closely matches the reality of people avoiding crowded spaces during evacuation by taking routes that might not be the shortest path. Under BNE, the expected comfort utility of patches was constantly updated by the varying distributions of other nearby agents at each time step. This allowed BNE agents to predict the next move of other agents and then to avoid the most congested areas during evacuation, in contrast to SR agents, who move directly to the exit, causing heavy congestion and longer evacuation times, and RF agents, who randomly follow a neighbour, resulting in smaller groups gathering but still slower evacuation times. The order of the effects of the three behavioural models from high to low is: BNE, RF, SR.

The proposed BNE model has a number of limitations: 1) The non-monotonicity of the mean values in Figures 5 to 8 reveals an issue with the current version of the BNE implementation: BNE agents occupying the same patch may choose the same target in the next time step in the latter half of the simulation, which may cause small-scale congestion and low evacuation speed. This problem will be addressed in further studies, for example by adjusting the distribution of strategy selection from 100% optimal choice (i.e., patch with maximum total utility) to 50% optimal choice, 45% suboptimal choice (i.e., patch with the second-highest total utility) and 5% worst choice (i.e., patch with minimum total utility) to disperse the aggregations of some BNE agents; 2) Jumping from a completely clueless Shortest Route strategy to a new BNE strategy obscures the specific efficiency gain, making it difficult to determine the magnitude of the improvement. For example, the SR strategy used for comparison does not account for congestion costs, which makes it unsurprising that the BNE strategy performs better. This will be addressed in our future work by replacing the weak SR strategy with a more complex and efficient alternative (e.g., A* or Dijkstra’s search algorithm) so that costs are taken into account in pathfinding; 3) Some model attributes such as moving speed, comfort utility and ‘Probability-competing’ need to be further calibrated, and a related sensitivity analysis is also required to match evacuation movements in the real world; 4) Further simulation experiments need to be conducted with a wider range of parameter configurations and different static variables (e.g., exit size) to further understand the effects of BNE on the evacuation process from various perspectives; 5) Real datasets, or datasets from other existing evacuation models, are needed to validate this model; 6) The role that space plays in the decision-making process could be further explored by adding blockades to the simulation space before or during evacuation; 7) Greater self-organizing behaviours among pedestrians (e.g., competitive behaviour) need to be considered to make the model easily adaptable to a broader range of pedestrian studies. Such research is necessary to determine the full potential of the use of Bayesian Nash Equilibrium in pedestrian decision making in evacuations and to further advance the application of Bayesian game theory in agent-based pedestrian modelling.
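The softened selection rule proposed under limitation 1 (50% optimal, 45% suboptimal, 5% worst) has not been implemented in the model described here. Purely as an illustration of what it could look like, the following Python sketch draws a target patch with those weights. The function name `pick_target`, the data layout, and the fallback to the best patch when fewer than three candidates exist are assumptions.

```python
import random

def pick_target(candidates, rng=random):
    """Pick a target patch with a 50/45/5 split over best/second/worst.

    candidates: non-empty list of (patch_id, total_utility) pairs.
    rng: the random module or a random.Random instance.
    Assumption: with fewer than three candidates the best one is chosen.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if len(ranked) < 3:
        return ranked[0][0]
    options = [ranked[0][0], ranked[1][0], ranked[-1][0]]
    return rng.choices(options, weights=[0.50, 0.45, 0.05], k=1)[0]

# Hypothetical total utilities for four candidate patches.
rng = random.Random(0)
candidates = [("a", 0.9), ("b", 0.8), ("c", 0.5), ("d", 0.1)]
picks = [pick_target(candidates, rng) for _ in range(10000)]
print({p: picks.count(p) for p in "abcd"})
```

Because agents sharing a patch would no longer all choose the same target, a rule of this kind could disperse the small clusters described above, although its effect on exit time would need to be tested.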

Appendix

**Figure 11.** Facets: Evacuation time against percentage BNE with RF, with a local line of fit, for 1100 to 3000 agents, 30 simulations conducted for each percentage fraction.

**Figure 12.** Facets: Evacuation time against percentage BNE with SR, with a local line of fit, for 1100 to 3000 agents, 30 simulations conducted for each percentage fraction.

**Figure 13.** Facets: Expected comfort utility against percentage BNE with RF, with a local line of fit, for 1100 to 3000 agents, 30 simulations conducted for each percentage fraction.

**Figure 14.** Facets: Expected comfort utility against percentage BNE with SR, with a local line of fit, for 1100 to 3000 agents, 30 simulations conducted for each percentage fraction.

Model description (ODD+D protocol)

A complete and detailed description of the initial model, following the ODD+D protocol (Müller et al. 2013), is provided in this section.

1. Overview

1.1 Purpose. The purpose of this model is to introduce a new individual decision-making method, BNE, into the ABM of pedestrian evacuation to simulate individual behaviours and movements. The model was built to balance fast evacuation against high comfort, two goals that commonly conflict in pedestrian research. The interactions of pedestrians with their neighbours and with their surroundings were also considered in order to simulate a more realistic pedestrian evacuation. The model ultimately aims to explore the influence of BNE on pedestrian flows from various perspectives, especially pedestrian comfort and exit time in an emergency evacuation under different parameter configurations.

1.2 Entities, state variables and scales. The model contains two main types of entities: Patches (i.e., the evacuation space) and Agents representing evacuating pedestrians. The variable names are the same as those implemented in NetLogo.

The Global Environment is defined as model parameters at the system level, controlling all of the global variables representing the simulation environment. Its state variables are shown in Table 2.

**Table 2:** Global environment state variables.
Variable Name	Variable Type and Units	Brief Description
number-persons	Person	Total number of agents in the model
Percentage-of-agents-with-BNE	Percent	The proportion of agents who are using BNE to evacuate
Probability-competing	Percent	The probability of agents entering one patch
door-width	Patch	The size of exits
move-speed	\(m\)/s	The speed at which agents move when unobstructed; it decreases as crowd density increases
Step-length	\(m\)	The length of a single agent step
follow-radius	Patch	The distance that agents could see; it is used in Random Follow model
weight-\(U_{d}\)	Numeric	A coefficient to balance the influence of distance utility and expected comfort utility on determining agent movement directions.
Moving-pattern	Chooser	4 combinations are available: Shortest Route (SR), Random Follow (RF), BNE mixed with SR, and BNE mixed with RF

Patches refer to the areas in the simulation space. The evacuation environment in this model was divided into 1360 (68*20) patches. Because the values of the different utilities control the directions of agents, these patch attributes are treated as state variables. Details of the patch state variables are shown in Table 3.

**Table 3:** Patch state variables.
Variable Name	Variable Type and Units	Brief Description
\(U_{ec}\)	Numeric	Expected Comfort Utility
\(U_{d, lf}\)	Static; Numeric	Distance Utility, used by the agents moving to the left exit
\(U_{d, rt}\)	Static; Numeric	Distance Utility, used by the agents moving to the right exit
\(U_{total}\)	Numeric	Total Utility, the sum of distance utility and expected comfort utility
patch-target	Patch	Patches with maximum total utility

Agents represent the individual evacuees with different behavioural models, and the related state variables are shown in Table 4.

**Table 4:** Agent state variables.
Variable Name	Variable Type and Units	Brief Description
speed	\(m\)/s	The speed of each agent during evacuation
left?	True/False	Whether or not the agent moves to the left exit
follow?	True/False	Whether or not the agent follows another agent
door	Location	The location of exit that the agent chooses
BNE-type	Boolean	“1” – this agent uses BNE to evacuate; “0” – agent follows SR/RF models
nearby-leaders	Agentset	The candidate neighbours from which the agent chooses one to follow; only used in RF
Leader	Agent	The neighbour followed by the agent

Scales. The spatial extent of this model is a rectangular region of 68 * 20 square patches (see Figure 15). The model space is bounded and agents can only evacuate through the exits on either side. The model runs until all the agents evacuate from the simulation space. That is, the temporal scale in this model was not absolute and the number of time steps depended on the initial environmental conditions and the agents themselves. Three behavioural models were evaluated: Shortest Route (SR), Random Follow (RF) and BNE. The behavioural models were used to generate four moving combinations (i.e., model configurations): SR, RF, BNE mixed with SR, and BNE mixed with RF. The moving combination is selected by the user at the beginning of the simulation.

**Figure 15.** The interface of the simulation model, with agents in green and exits in red.

1.3 Process Overview and Scheduling. The model simulates the complete process of pedestrian evacuation and demonstrates the detailed decision-making process of agents, especially those using BNE. Over each simulation run, patches and agents update the relevant state variables in each time step.

The schedule of the model is shown as follows:

The simulation begins with a series of initial settings for global environment by the user, including the percentage of BNE agents, exit size, the number of agents, and other state variables. The type of moving combination is also selected, with all the environmental attributes static until the end of this run.
The patches calculate their distance utility and expected comfort utility, which underpin the implementation of BNE in the model. The expected comfort utility is continually updated until the end of the simulation.
The agents choose a new direction (i.e., repeat their decision-making process) every time step in response to the new environmental conditions.
The state variables, plots and model interface are updated.

2. Design concepts

2.1 Theoretical Background. Bayesian Nash Equilibrium (BNE) was used in the individual decision-making process of this model in order to augment the rationality of pedestrian evacuation simulation. BNE extends Nash equilibrium to static games with incomplete information. It is generally defined as a strategy profile in which each participant maximizes their own expected utility given the probability distributions of the strategies chosen by other players, so that no player has an incentive to select another strategy (Ui 2016). That is, the primary element of individual decision-making is the probability distribution of strategies played by other nearby participants, especially the probabilities of neighbours choosing the same strategy as the player. In this model, this is reflected in the probability distributions of the next actions of nearby agents and a series of utility calculations for candidate patches. The relevant underlying equations are described below.

2.2 Individual Decision-making. Decision-making is modelled at the agent level in this research. The decision-making processes of agents differ across the three behavioural models (i.e., SR, RF and BNE). BNE agents move to the neighbouring patch with maximum total utility every time step, which depends on the probabilities of nearby agents’ next actions and on the surrounding environment. Each agent using RF follows a randomly chosen neighbour in sight and repeats this selection process until the end of the simulation. Agents using SR make only one decision (i.e., which exit to move towards) at the beginning of the simulation, which means that their directions remain unchanged until they succeed in evacuating from the simulation space. Only BNE agents use utility values to determine target patches during the simulation.

2.3 Individual Sensing. Agents following the RF model are assumed to be able to sense their neighbours within a specific radius of their current locations. The value of this radius remains constant during the simulation and can be adjusted with the corresponding slider “follow-radius”. Agents following BNE are assumed to be able to sense surrounding conditions and select a target from nearby patches to move to in each time step. Specifically, a BNE agent can sense and potentially move to all neighbouring patches located between its current patch and the exit. The choice criterion is the total of the distance utility and the expected comfort utility of each patch.
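The RF sensing step can be illustrated with a short sketch. It is written in Python rather than NetLogo and rests on stated assumptions: distances are Euclidean and measured in patches, the leader is drawn uniformly from the visible neighbours, and the restriction to neighbours heading for the same exit is omitted. The function names and coordinates are hypothetical.

```python
import math
import random

def nearby_leaders(agent_xy, others_xy, follow_radius):
    """Indices of other agents within follow_radius (Euclidean, in patches)."""
    ax, ay = agent_xy
    return [i for i, (x, y) in enumerate(others_xy)
            if math.hypot(x - ax, y - ay) <= follow_radius]

def pick_leader(agent_xy, others_xy, follow_radius, rng):
    """Choose one visible neighbour uniformly at random, or None if none is visible."""
    visible = nearby_leaders(agent_xy, others_xy, follow_radius)
    return rng.choice(visible) if visible else None

# Made-up positions of three other agents around an agent at the origin.
others = [(1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
leader = pick_leader((0.0, 0.0), others, follow_radius=3.0, rng=random.Random(1))
```

With a radius of 3 patches, only the first two agents are visible, so the leader is one of them; with a very small radius, no leader is available and the sketch returns `None`.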

2.4 Individual Prediction. Individuals in this model use the information on neighbours’ current locations and the probability distribution of their moving directions to predict future situations. Specifically, expected comfort utility of neighbouring patches is estimated over a time step by using the explicit prediction of nearby agents \(p(n)\) and comfort coefficient of each patch \(U_{c}(n)\).
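One plausible reading of this prediction step is an expectation of the comfort coefficient over the predicted occupancy, i.e. the sum of \(p(n)\,U_{c}(n)\) over \(n\). The Python sketch below illustrates that reading only. It assumes that each nearby agent enters the patch independently with its own probability, and the comfort function `comfort` is a hypothetical placeholder, not the coefficient used in the model.

```python
def occupancy_distribution(entry_probs):
    """Distribution p(n) of the number of agents entering a patch,
    assuming each neighbour enters independently with its own probability."""
    dist = [1.0]  # with no neighbours, n = 0 with certainty
    for q in entry_probs:
        new = [0.0] * (len(dist) + 1)
        for n, p in enumerate(dist):
            new[n] += p * (1.0 - q)   # this neighbour stays away
            new[n + 1] += p * q       # this neighbour enters
        dist = new
    return dist

def expected_comfort(entry_probs, comfort):
    """Expectation of comfort(n) under the occupancy distribution p(n)."""
    p = occupancy_distribution(entry_probs)
    return sum(p_n * comfort(n) for n, p_n in enumerate(p))

def comfort(n):
    """Hypothetical comfort coefficient: decreases as occupancy n grows."""
    return 1.0 / (1.0 + n)

# Two neighbours, each entering the patch with probability 0.5 (made-up values).
p = occupancy_distribution([0.5, 0.5])
u_ec = expected_comfort([0.5, 0.5], comfort)
```

Any decreasing comfort function could be substituted for `comfort`; the structure of the expectation is the point of the sketch.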

2.5 Interaction. Interactions among agents are mediated through variations in the expected comfort utilities of neighbouring patches. In the BNE model, each patch’s expected comfort utility depends on the expected number of agents who will be on the patch in the next step, which in turn depends on the number of agents in the nearby patches. Each agent then determines its direction by comparing the total utilities of the six patches within its optional directions (i.e., candidate patches \(P_{0}\), \(P_{1}, \dots\) \(P_{5}\)) (details in Section 2.3.4). That is, the current position and expected next move of agents influence the expected comfort utility of patches, which, in turn, affects the next move of nearby agents.

2.6 Heterogeneity. Agents are heterogeneous in their decision-making process according to their behavioural type. Agents following the Shortest Route model make a single decision: which exit to move to. Agents following the Random Follow model choose a neighbour in their view each time step and repeat this process until the end of the simulation. Agents following the BNE model select the patch with maximum total utility to move to and repeat this decision-making procedure every time step until they evacuate successfully.

2.7 Stochasticity. Stochasticity is introduced in two ways in this model. Firstly, the model is initialized randomly based on the settings configured by the user. Specifically, (a) the locations of agents, (b) the allocation of behavioural types, and (c) the initial directions of agents are stochastic at the beginning of a simulation. Secondly, when an agent following the RF model determines where to move, its choice of which agent to follow is partly stochastic, as it is limited by its view and the exit selected. Similarly, when two patches share the same maximum total utility, an agent using the BNE model randomly selects one of them to move to. This decision is stochastic but not completely random, since the choice is restricted by the location and direction of the agent.
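The tie-breaking rule for BNE agents can be sketched as a maximum search followed by a uniform draw among the tied patches. The Python snippet below is illustrative only: the function name `select_target`, the tolerance used to compare floating-point utilities and the example values are assumptions, not part of the NetLogo model.

```python
import math
import random

def select_target(total_utility, rng, tol=1e-9):
    """Return the patch with maximum total utility, breaking ties at random.

    total_utility: dict mapping a (hypothetical) candidate patch id to its utility.
    """
    best = max(total_utility.values())
    # Collect every candidate whose utility equals the maximum (within tol).
    tied = [patch for patch, u in total_utility.items()
            if math.isclose(u, best, abs_tol=tol)]
    return rng.choice(tied)

# Two candidates share the maximum, so either may be chosen (made-up values).
candidates = {"P0": 0.8, "P1": 0.8, "P2": 0.5}
rng = random.Random(7)
picks = {select_target(candidates, rng) for _ in range(200)}
```

Over repeated draws both tied patches are selected, while the lower-utility patch never is, which matches the description of a decision that is stochastic but restricted.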

2.8 Observation. The purpose of this model is to explore whether and how BNE affects pedestrian evacuation procedures in the case of emergency, with two main measurements: evacuation time and pedestrian comfort level. The exit time and average expected comfort utility of each run are collected at the end of simulation in order to compare evacuations with varying proportions of agents following the BNE model in the simulation.

3. Details

3.1 Implementation Details. The initial model was developed in NetLogo. The source code and experimental data are available at https://doi.org/10.25937/75wf-aa82.

3.2 Initialization. The initial state of the model is a hypothetical evacuation space with emergency exits located on either side. The agents are initialized by setting the total number of persons (i.e., the number-persons global parameter) and the percentage of BNE users through the slider Percentage-of-agents-with-BNE. The agents are randomly scattered over the simulation environment. The initial speed of each agent is adjusted according to the number of agents in its Moore neighbourhood (i.e., the agent’s own patch and the eight neighbouring patches), and all adjustments are based on the reference speed assigned by the global parameter move-speed. The agent moving combination is selected through the Moving-pattern chooser. The moving combinations in this model consist of Random Follow (RF), Shortest Route (SR), BNE mixed with RF, and BNE mixed with SR. For the first two combinations, all the agents are set to the same behavioural model during the simulation. For the last two combinations, a specific proportion of agents, defined by the global parameter Percentage-of-agents-with-BNE, use BNE to evacuate, and the rest follow one of the two other models (i.e., SR or RF) according to the selected option.
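The density-dependent initial speed can be illustrated by counting the agents in each agent’s Moore neighbourhood and scaling the reference speed accordingly. In the Python sketch below, the counting step follows the description above, whereas the linear reduction rule and the parameters `capacity` and `min_factor` are hypothetical placeholders, since the exact adjustment is not specified here.

```python
from collections import Counter

def moore_count(patch, occupancy):
    """Number of agents on the patch itself and its eight neighbouring patches."""
    x, y = patch
    return sum(occupancy.get((x + dx, y + dy), 0)
               for dx in (-1, 0, 1) for dy in (-1, 0, 1))

def initial_speed(patch, occupancy, move_speed, capacity=9, min_factor=0.2):
    """Scale the reference speed down as the Moore neighbourhood fills up.

    The linear rule, capacity and min_factor are illustrative assumptions.
    """
    others = moore_count(patch, occupancy) - 1  # exclude the agent itself
    factor = max(min_factor, 1.0 - others / capacity)
    return move_speed * factor

# Made-up agent positions (patch coordinates); one agent per listed patch.
agents = [(0, 0), (0, 1), (1, 1), (5, 5)]
occupancy = Counter(agents)
speeds = [initial_speed(a, occupancy, move_speed=1.5) for a in agents]
```

In this example the isolated agent at (5, 5) keeps the full reference speed, while the three clustered agents start more slowly.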

Each patch can be occupied by one or more agents and calculates its own distance utility and expected comfort utility at the initialization stage. BNE agents compare the total utility (i.e., the sum of \(U_{d}\) and \(U_{ec}\)) of the candidate patches and select the one with the maximum value to move to in each time step. That is, the patch attributes are continuously updated every time step to provide up-to-date information to agents using BNE for determining their next actions. At present, a series of global parameters (e.g., door-width, follow-radius, etc.) are fixed because the main research focus is the exploration of whether and how BNE affects pedestrian emergency evacuation, but variations in these global parameters could be evaluated in further research.

3.3 Input data. So far, no input is read in this initial model.

References

BABOJELIĆ, K., & Novacko, L. (2020). Modelling of driver and pedestrian behaviour - A historical review. Promet-Traffic & Transportation, 32(5), 727–745.

BAR-HAIM, Y. (2010). Research review: Attention bias modification (ABM): A novel treatment for anxiety disorders. Journal of Child Psychology and Psychiatry, 51(8), 859–870. [doi:10.1111/j.1469-7610.2010.02251.x]

BOUZAT, S., & Kuperman, M. N. (2014). Game theory in models of pedestrian room evacuation. Physical Review E, 89(3), 032806. [doi:10.1103/physreve.89.032806]

CHANG, C. L., Tsai, Y. L., CHANG, C. Y., & Chen, S. T. (2021). Emergency evacuation planning via the point of view on the relationship between crowd density and moving speed. Wireless Personal Communications, 119(3), 2577–2602. [doi:10.1007/s11277-021-08345-y]

FENG, Y., Duives, D., Daamen, W., & Hoogendoorn, S. (2021). Data collection methods for studying pedestrian behaviour: A systematic review. Building and Environment, 187, 107329. [doi:10.1016/j.buildenv.2020.107329]

GOMES, R., & Sweeney, K. (2014). Bayes-Nash equilibria of the generalized second-price auction. Games and Economic Behavior, 86, 421–437. [doi:10.1016/j.geb.2012.09.001]

JIANG, Y. Q., Zhang, P., Wong, S. C., & Liu, R. X. (2010). A higher-order macroscopic model for pedestrian flows. Physica A: Statistical Mechanics and Its Applications, 389(21), 4623–4635. [doi:10.1016/j.physa.2010.05.003]

LEVY, N., Klein, I., & Ben-Elia, E. (2018). Emergence of cooperation and a fair system optimum in road networks: A game-theoretic and agent-based modelling approach. Research in Transportation Economics, 68, 46–55. [doi:10.1016/j.retrec.2017.09.010]

LIAO, C., Guo, H., Zhu, K., & Shang, J. (2019). Enhancing emergency pedestrian safety through flow rate design: Bayesian-Nash equilibrium in multi-agent system. Computers & Industrial Engineering, 137, 106058. [doi:10.1016/j.cie.2019.106058]

LO, S. M., Fang, Z., Lin, P., & Zhi, G. S. (2004). An evacuation model: The SGEM package. Fire Safety Journal, 39(3), 169–190. [doi:10.1016/j.firesaf.2003.10.003]

LO, S. M., Huang, H. C., Wang, P., & Yuen, K. K. (2006). A game theory based exit selection model for evacuation. Fire Safety Journal, 41(5), 364–369. [doi:10.1016/j.firesaf.2006.02.003]

LU, L., Chan, C. Y., Wang, J., & Wang, W. (2017). A study of pedestrian group behaviors in crowd evacuation based on an extended floor field cellular automaton model. Transportation Research Part C: Emerging Technologies, 81, 317–329. [doi:10.1016/j.trc.2016.08.018]

LUO, L., Fu, Z., Cheng, H., & Yang, L. (2018). Update schemes of multi-velocity floor field cellular automaton for pedestrian dynamics. Physica A: Statistical Mechanics and Its Applications, 491, 946–963. [doi:10.1016/j.physa.2017.09.049]

MESMER, B. L., & Bloebaum, C. L. (2014). Incorporation of decision, game, and Bayesian game theory in an emergency evacuation exit decision model. Fire Safety Journal, 67, 121–134. [doi:10.1016/j.firesaf.2014.05.010]

MESMER, B. L., & Bloebaum, C. L. (2016). Modeling decision and game theory based pedestrian velocity vector decisions with interacting individuals. Safety Science, 87, 116–130. [doi:10.1016/j.ssci.2016.03.018]

MÜLLER, B., Bohn, F., Dreßler, G., Groeneveld, J., Klassert, C., Martin, R., Schlüter, M., Schulze, J., Weise, H., & Schwarz, N. (2013). Describing human decisions in agent-based models - ODD + D, an extension of the ODD protocol. Environmental Modelling & Software, 48, 37–48.

NOORI, M., Emadi, A., & Fazloula, R. (2021). An agent-based model for water allocation optimization and comparison with the game theory approach. Water Supply, 21(7), 3584–3601. [doi:10.2166/ws.2021.124]

PICCOLI, B., & Tosin, A. (2011). Time-evolving measures and macroscopic modeling of pedestrian flow. Archive for Rational Mechanics and Analysis, 199(3), 707–738. [doi:10.1007/s00205-010-0366-y]

RADNER, R. (1962). Team decision problems. The Annals of Mathematical Statistics, 33(3), 857–881. [doi:10.1214/aoms/1177704455]

RIGOS, A., Mohlin, E., & Ronchi, E. (2019). The cry wolf effect in evacuation: A game-theoretic approach. Physica A: Statistical Mechanics and Its Applications, 526, 12089. [doi:10.1016/j.physa.2019.04.126]

ROSEN, J. B. (1965). Existence and uniqueness of equilibrium points for concave N-person games. Econometrica: Journal of the Econometric Society, 33(3), 520–534. [doi:10.2307/1911749]

ROZO, K. R., Arellana, J., Santander-Mercado, A., & Jubiz-Diaz, M. (2019). Modelling building emergency evacuation plans considering the dynamic behaviour of pedestrians using agent-based simulation. Safety Science, 113, 276–284.

SEVTSUK, A., & Kalvo, R. (2022). Predicting pedestrian flow along city streets: A comparison of route choice estimation approaches in downtown San Francisco. International Journal of Sustainable Transportation, 16(3), 222–236. [doi:10.1080/15568318.2020.1858377]

TEKNOMO, K. (2016). Microscopic pedestrian flow characteristics: Development of an image processing data collection and simulation model. arXiv preprint. Available at: https://arxiv.org/ftp/arxiv/papers/1610/1610.00029.pdf

UI, T. (2016). Bayesian Nash equilibrium and variational inequalities. Journal of Mathematical Economics, 63, 139–146. [doi:10.1016/j.jmateco.2016.02.004]

VERMUYTEN, H., Beliën, J., De Boeck, L., Reniers, G., & Wauters, T. (2016). A review of optimisation models for pedestrian evacuation and design problems. Safety Science, 87, 167–178. [doi:10.1016/j.ssci.2016.04.001]

WIJERMANS, N., & Templeton, A. (2022). Towards more realism in pedestrian behaviour models: First steps and considerations in formalising social identity. In M. Czupryna & B. Kamiński (Eds.), Advances in Social Simulation. Proceedings of the 16th Social Simulation Conference, 20–24 September 2021 (pp. 53–64). Springer. [doi:10.1007/978-3-030-92843-8_5]

ZHOU, X., Hu, J., Ji, X., & Xiao, X. (2019). Cellular automaton simulation of pedestrian flow considering vision and multi-velocity. Physica A: Statistical Mechanics and Its Applications, 514, 982–992. [doi:10.1016/j.physa.2018.09.041]