An agent-based simulation model of pedestrian evacuation based on Bayesian Nash Equilibrium

This research incorporates Bayesian game theory into pedestrian evacuation in an agent-based model. Three pedestrian behaviours were compared: Random Follow, Shortest Route and Bayesian Nash Equilibrium (BNE), as well as combinations of these. The results showed that BNE pedestrians were able to evacuate more quickly as they predict congestion levels in their next step and adjust their directions to avoid congestion, closely matching the behaviours of evacuating pedestrians in reality. A series of simulation experiments were conducted to evaluate whether and how BNE affects pedestrian evacuation procedures. The results showed that: 1) BNE has a large impact on reducing evacuation time; 2) BNE pedestrians displayed more intelligent and efficient evacuating behaviours; 3) As the proportion of BNE users rises, average evacuation time decreases, and average comfort level increases. A detailed description of the model and relevant experimental results is provided in this paper. Several limitations as well as further works are also identified.


Introduction
1.1 Large public gatherings or crowds are commonplace and have been the subject of simulation research in many studies related to crowd management, disaster management and evacuation planning (Babojelić & Novacko 2020). However, in-depth research on pedestrians has been hindered by difficulties such as complex individual behaviours, different disaster characteristics, and varying environmental factors (Wijermans & Templeton 2022). As evacuee behaviour and movement vary in different scenarios, a number of field observations and simulation experiments have been conducted to explore pedestrian flows, movement patterns and potential factors affecting evacuation under different types of emergencies (Rozo et al. 2019; Feng et al. 2021; Sevtsuk & Kalvo 2022). Despite many simulation studies of pedestrian behaviours, few common behavioural features of pedestrian flows have been explored (Vermuyten et al. 2016; Babojelić & Novacko 2020). One of the main obstacles is the lack of experimental datasets that closely match individual movements during evacuations in the real world. Consequently, a more intelligent evacuation simulation model of pedestrian flow is needed, one that realistically replicates the movement and behaviours of evacuees and is easily adaptable to various evacuation scenarios.
1.2 Simulation models of pedestrian flow are generally classified into one of two main categories: macroscopic models and microscopic models. Macroscopic simulation models consider pedestrian flows as a single unit such that evacuees in these models are homogeneous during simulation (Jiang et al. 2010; Piccoli & Tosin 2011). In these cases, it is difficult to observe the interactions of individual pedestrians and their (micro-level) behaviours. To address this, a number of approaches have been developed, such as cellular automata, lattice gas automata, social force models, and other simulation tools, allowing pedestrians to be partially heterogeneous during simulation. Pedestrians determine their own actions according to their surrounding environment, but the probability distributions of their decisions are still controlled at the macroscopic level (Teknomo 2016; Lu et al. 2017). Agent-based modelling (ABM) can fully capture individual behavioural heterogeneity in pedestrian models, and it is one of the main individual-based models used to simulate pedestrian movement in different scenarios. It has the capability to reveal the aggregated patterns of individual actions and environments from the bottom up (Bar-Haim 2010).
1.3 Game theory has attracted much attention from researchers in the fields of pedestrian behaviours and evacuation simulation. Current research on pedestrian flow pays much attention to whether and how the simulation could closely match the reactions of pedestrians in different scenarios in the real world by predicting pedestrians' next move. Game theory provides an effective approach to realistically reproduce individual decision-making processes. Specifically, a game-theoretic approach assumes individuals make decisions based on their beliefs, which are updated in response to their surroundings. The best pedestrian strategies or responses can be derived using different game theories. For instance, Rigos et al. (2019) introduced Sequential Equilibrium and perfect Bayesian Equilibrium into their response model to simulate the evacuee actions after receiving the order to evacuate. Liao et al. (2019) incorporated Bayesian Nash Equilibrium into their simulation model to discover the relationships between safe pedestrian flow rates and public space area. There are many other examples of research on pedestrian behaviours that have introduced game theory into their models to more realistically simulate pedestrian decision-making and behaviours (Bouzat & Kuperman 2014; Lo et al. 2006; Mesmer & Bloebaum 2016). However, the main objectives of the studies focusing on both game theory and ABMs have generally been to compare simulations based on game theory with agent-based approaches (Noori et al. 2021), or to combine ABMs with simple game theory such as a zero-sum 2-player game (Lo et al. 2006; Levy et al. 2018). Research using Bayesian game theory for pedestrian evacuation has tended to simulate individual selection of the final exit rather than actions during the evacuation process (Bouzat & Kuperman 2014; Mesmer & Bloebaum 2014). The focus of these simulations was the interaction between a small number of agents rather than the mutual influences of a large number with the environment. In summary, game theory has been widely adopted in the context of individual evacuation decisions such as exit selection and route optimization (Levy et al. 2018; Mesmer & Bloebaum 2014) and is regarded as an appropriate behavioural model to simulate pedestrians' actions under emergencies.

1.4 Few studies have sought to simulate pedestrian behaviours during emergency evacuation using game-theoretic approaches. One of the main barriers is that the interactions between individual behaviours and macro-phenomena of pedestrian flow are complex, and general game theories cannot account for the complexity of interactions between and among pedestrians, as well as with their environment. Early game theories place a number of restrictions on agents and environments, such as the Nash game with complete information (Rosen 1965) and team-based decision making (Radner 1962). The refined Bayesian Nash Equilibrium (BNE) proposed by Ui (2016) relaxes the complete information constraint and considers a game with incomplete information, which is more realistic in the context of evacuation when complete real-time information is often missing for individuals. BNE defines a correlated equilibrium with varied payoff gradients according to different game conditions, which includes the Nash Equilibrium as one particular case, compared to the monotonic payoff gradient in traditional Nash Equilibrium. As a result, BNE is more suitable for scenarios with incomplete information, multiple equilibria, and varied payoff gradients, suggesting opportunities to use it to simulate pedestrian movement in an ABM. BNE has been used mainly in advertising and other economics fields (Gomes & Sweeney 2014), with little research using BNE to simulate pedestrian evacuation.

BNE Model
2.1 The initial model was developed in NetLogo. The source code and experimental data have been published on COMSES and are available at https://doi.org/10.25937/75wf-aa82. The full technical details of the model are given in the Appendix, following the ODD+D protocol. This section mainly describes the BNE behavioural model.

2.2 In order to translate the rationality of BNE theory into specific decision-making rules, a series of utility functions are introduced in this model to realize the BNE behavioural model. Individual decision-making depends on the value of "Total Utility" for optional patches. Three utilities are involved: Distance Utility (U_d), Comfort Utility (U_c) and Expected Comfort Utility (U_ec). Total utility is the sum of U_d and U_ec, with U_c entering as a component of U_ec. Specifically, the decision made by each BNE agent considers the distance from its current position to the exit, the number of neighbours who may move to the same patch as itself, and the possible surrounding situations in the next time step. The agent then moves to the patch with maximum total utility (i.e., the largest sum of U_ec and U_d). In other words, agents use BNE to predict the congestion level in the next time step and then avoid the most clogged patches during their movement, in order to determine an evacuation route with less exit time and a higher comfort level. The choice criterion is the value of total utility in neighbouring patches, which is evaluated by agents to decide where to go. In this model, all BNE-related utilities were set as patch attributes and are described in detail below.
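The selection rule above can be sketched in a few lines of Python. This is only an illustration, not the published NetLogo code: the function name `choose_target`, the patch labels and the utility values are hypothetical, and ties are assumed to go to the first candidate listed.

```python
from typing import Dict


def choose_target(u_d: Dict[str, float], u_ec: Dict[str, float]) -> str:
    """Return the candidate patch with the largest total utility U_d + U_ec."""
    # Total utility of each candidate patch is the sum of its two components.
    total = {patch: u_d[patch] + u_ec[patch] for patch in u_d}
    # max() keeps the first maximal key, so ties go to the earliest candidate.
    return max(total, key=total.get)


# Hypothetical values for three candidate patches (illustrative only).
candidates_u_d = {"P0": 0.62, "P1": 0.64, "P2": 0.66}
candidates_u_ec = {"P0": 0.90, "P1": 0.70, "P2": 0.40}
target = choose_target(candidates_u_d, candidates_u_ec)
```

In this made-up case the patch nearest the exit (P2) loses to P0 because its expected comfort is much lower, which is the trade-off between distance and congestion that the total utility is meant to capture.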

Distance utility
2.3 This represents the distance from the current location to the exit. Since we assume that agents tend to choose the patch with the largest value of total utility, U_d is set to increase closer to the exits. As there are two exits in the evacuation space, two sets of distance utility are determined for the agents moving to the right or left exit respectively (i.e., parameters U_d,rt and U_d,lf). In the defining equation (Equation 1), d represents the distance from the current patch to the exit, and D refers to the diagonal of the evacuation space.
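As a rough illustration of a distance utility with the properties described here (larger nearer the exit, expressed in terms of d and D), the sketch below assumes the normalised form 1 - d/D. That functional form is an assumption made for illustration and is not taken from the model's Equation 1. The function name and arguments are hypothetical, and the 68 × 20 extent is the one given in the model description.

```python
import math


def distance_utility(patch_xy, exit_xy, width=68, height=20):
    """Assumed form U_d = 1 - d / D (illustrative, not the published equation)."""
    d = math.dist(patch_xy, exit_xy)   # distance from the patch to the exit
    D = math.hypot(width, height)      # diagonal of the evacuation space
    return 1.0 - d / D


# A patch closer to the (hypothetical) right-hand exit scores higher.
right_exit = (67, 10)
near = distance_utility((60, 10), right_exit)
far = distance_utility((10, 10), right_exit)
```

Two such fields would be computed, one per exit, to mirror the separate U_d,rt and U_d,lf attributes.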

Comfort utility
2.4 Comfort Utility, U_c, is a set of coefficients that form a crucial component of Expected Comfort Utility, reflecting the comfort level of agents in any one patch. According to the speed-density relation associated with the Spatial-Grid Evacuation Model (SGEM) (Lo et al. 2004), the value is set to 1 when two or fewer agents occupy the patch. It decreases as the number of agents on the patch increases, by setting the value to be a proportion of the free-moving speed (i.e., 1.4 m/s) relative to the number of agents on the patch. Considering the limited space capacity in reality, U_c stays at zero when more than 4 persons move to the same patch. In the defining equation (Equation 2; full details are given in the Speed calibration section, Section 3.1), n represents the number of agents in one patch.
Expected comfort utility
2.5 According to the definition of BNE, individual decision-making in this model is independent, which means that no account is taken of the agent's previous actions in each time step. The main factor determining where agents go is the number of agents who may move to the candidate patches in the next time step. In other words, the probability of the neighbours' next actions has an impact on the decision-making process of the agent. Figure 1 illustrates how the BNE agents on the blue patch compete with other agents on the eight surrounding patches; the probabilities of agents entering the blue patch are also marked.

2.6 It is assumed that the probability of reverse movement during evacuation is extremely low, which means that each agent has six optional directions (i.e., candidate patches) P_0, P_1, ..., P_5 in each time step (see Figure 1). The probability of entering these candidate patches, P_m, is set to the same value (i.e., 16.7%) by default, which could be adjusted using the Probability-competing slider in further studies.

2.7 Thus, the patch variable named Expected Comfort Utility (U_ec) for each patch is dynamic in this model and reflects the interaction between agents. It is defined as the product of comfort utility U_c and the probability p(n) that a certain number of agents move to this patch in the next time step (see Equation 3), where n represents the number of agents in this patch and P_m refers to the probability of agents entering the candidate patches, which is set to 16.7% by default. In this way, the calculation of U_ec takes into account the agents on both the patch and its eight neighbouring patches (i.e., Moore neighbourhood) (see Figure 1).
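One way to read this definition is as an expectation of U_c over the number of agents that arrive on the patch. The Python sketch below works through that reading under stated assumptions: each of k competing neighbours is assumed to enter the patch independently with probability P_m = 1/6, so p(n) is binomial, and the comfort coefficients for 3 and 4 agents are placeholder values interpolated between the stated end points (1 for two or fewer agents, 0 for more than four). Neither assumption is taken from Equations 2 or 3, and the function names are hypothetical.

```python
from math import comb

P_M = 1 / 6  # default probability of entering a candidate patch (16.7%)


def comfort(n: int) -> float:
    """Comfort coefficient U_c(n); values for n = 3, 4 are placeholders."""
    if n <= 2:
        return 1.0
    if n > 4:
        return 0.0
    return (5 - n) / 3  # assumed linear drop between the stated end points


def expected_comfort(k: int, p: float = P_M) -> float:
    """Sum of p(n) * U_c(n), with p(n) binomial over k competing neighbours."""
    return sum(
        comb(k, n) * p**n * (1 - p) ** (k - n) * comfort(n)
        for n in range(k + 1)
    )


few_competitors = expected_comfort(3)
many_competitors = expected_comfort(8)
```

Under these assumptions a patch contested by more neighbours has a lower expected comfort, which is the mechanism that steers BNE agents away from patches likely to become crowded.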

2.8 The relationships of these utilities are illustrated in Figure 2.

Speed calibration
3.1 To achieve a more realistic evacuation simulation, it is assumed that individual speeds in the model should change over time rather than remain a static attribute. The variation of moving speed has a close association with the number of surrounding agents, which means that the speed parameter should be calibrated in this way. After comparing several of the main pedestrian speed-crowd density models adopted in recent years (Mesmer & Bloebaum 2016; Luo et al. 2018; Zhou et al. 2019), the Spatial-Grid Evacuation Model (SGEM) proposed by Lo et al. (2004) was considered the most appropriate relation model for this research, as it takes into account the interconnections between surrounding pedestrians, as well as the potential effects of short-term contact among pedestrians on individual evacuation speeds.

3.2 The general trend in the speed-density relation model remains consistent when the crowd density is less than 4 person/m², with pedestrians in a free motion state with a speed of about 1.4 m/s. However, when crowd density is greater than 8 person/m², pedestrians are considered to be in a state of constrained motion and move at around 0.1 m/s. When crowd density lies between 4 person/m² and 8 person/m², pedestrian movement starts to be restricted and movement speed declines with increasing numbers of people.
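The three regimes described above can be sketched as a piecewise function. The thresholds and end-point speeds are those quoted in the text. The linear decline between 4 and 8 person/m² is an assumption made for illustration, since only the direction of the trend is stated here, and the function and parameter names are hypothetical.

```python
def walking_speed(density, v_free=1.4, v_min=0.1, rho_free=4.0, rho_jam=8.0):
    """Speed (m/s) as a function of crowd density (person/m^2)."""
    if density <= rho_free:
        return v_free      # free motion
    if density >= rho_jam:
        return v_min       # constrained motion
    # Assumed linear decline between the two thresholds.
    frac = (density - rho_free) / (rho_jam - rho_free)
    return v_free - frac * (v_free - v_min)


speeds = [walking_speed(rho) for rho in (2.0, 6.0, 10.0)]
```

This is a sketch of the general trend only; the calibrated relation used in the model additionally scales with the move-speed slider and the number of agents in the Moore neighbourhood.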
As the average step size of adults is around 0.7 m and their mean response time is about 0.5 s (Chang et al. 2021), several parameters are adjusted in the SGEM model to fit the current environment. In this case, the initial speed in the model is adjustable through the move-speed slider instead of imposing a fixed value (i.e., 1.4 m/s), and individual speed is set to be inversely proportional to the number (density) of agents in its Moore neighbourhood. High density has a decay effect on agents' moving speed, which means that agents encircled by a crowd cannot hop large distances to a free patch next to the exit. Consequently, the suitable speed-density relationship for this model is given in Equation 4, where ρ is the density of pedestrians (person/m²).

[…] evacuations in which the number of persons varies from 1100 to 3000 and two types of agent movement take part in the simulation (i.e., BNE mixed with SR/RF). The evacuations were evaluated in terms of evacuation time and expected comfort utility to assess the performance of BNE.

3.4 The first experiments simulated evacuations in a tunnel space with all agents following one of the BNE, RF, and SR models, with 2000 or 3000 pedestrians. The experiments were replicated 50 times for each parameter configuration and stopped when all of the agents had evacuated successfully.

3.5 The second set of experiments evaluated how different proportions of agents following BNE would affect evacuation. The model was initialized with 2000 or 3000 agents, in which a varying proportion of agents followed BNE, and the rest SR or RF. Here the percentage of BNE users was set to vary from 0% to 100% in intervals of 2%. Simulations were replicated 50 times for each parameter configuration to evaluate the variations of both exit time and expected comfort utility.

3.6 The third set of experiments then simulated different numbers of pedestrians in the same tunnel space to explore the influence of BNE on evacuation scenarios with different crowd densities. Here the agents were randomly scattered over the simulation space with selected moving combinations (i.e., BNE mixed with SR or BNE mixed with RF). The number of people varied from 1100 to 3000 in intervals of 100, and the proportion of BNE users was set to vary from 0% to 100% in increments of 2%. In this case, 30 repetitions were undertaken for each parameter configuration.
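To give a sense of the size of this third sweep, the parameter grid can be enumerated as follows. This is a minimal sketch: the variable names and the combination labels are hypothetical and not taken from the NetLogo model or its BehaviorSpace setup.

```python
from itertools import product

populations = range(1100, 3001, 100)   # 1100, 1200, ..., 3000 agents
bne_percentages = range(0, 101, 2)     # 0%, 2%, ..., 100% BNE users
combinations = ("BNE+SR", "BNE+RF")    # the two mixed moving combinations
repetitions = 30

# One configuration per (combination, population, BNE percentage) triple.
configs = list(product(combinations, populations, bne_percentages))
total_runs = len(configs) * repetitions
```

With 20 population levels, 51 BNE percentages and 2 combinations this gives 2040 configurations, or 61200 individual runs at 30 repetitions each.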

3.7 The values of all the other variables remain unchanged in all experiments. The list of model inputs is shown in Table 1.

4.1 To evaluate the effects of BNE on pedestrian evacuation, evacuation time and expected comfort utility were determined for each set of runs. Evacuation time is the time taken for all agents to successfully evacuate from the simulation space. Expected comfort utility refers to the overall mean of U_ec recorded in each time step during a simulation.

4.2 For the first set of experiments, the evacuation time for each of the three behavioural models (i.e., BNE, RF, SR) was evaluated for 2000 and 3000 persons (Figure 3). Each simulation was run with all agents following the same behavioural model (i.e., one of the three models above), and 50 repetitions were undertaken for each behavioural model. The results show that BNE exit time was nearly half that of Random Follow and around two-thirds of the Shortest Route time. Similar findings were obtained in both groups, demonstrating the impact that BNE has on reducing evacuation times compared with other more general behavioural models.

4.3 Additionally, specific stages of the behavioural models were also recorded during simulation. The model views were exported every 20 ticks and the stages of the models in the first 100 ticks are illustrated in Figure 4. As shown, the congestion levels were more severe in the Shortest Route than in the Random Follow model, which were in turn more severe than in BNE. This potentially explains the poorer performance of SR compared to RF. SR agents move forward to the exits all the time, causing jams, especially as more agents try to evacuate through the exits (which may not be wide enough). RF agents randomly follow one of their neighbours, resulting in smaller-scale crowd groupings during evacuation. So, congestion is present in both the RF and SR models, but the relatively smaller scale of congestion in RF results in a quicker evacuation. BNE pedestrians were able to forecast the level of congestion in the next time step to avoid jams.

[…] shown in Figure 5, evacuation time decreases as the number of BNE users increases in both experiments and flattens at about 80% BNE users. Figure 6 shows the variations of average expected comfort utility in the two experiments. Higher expected comfort utility values indicate that evacuating pedestrians feel more comfortable during simulation. Figure 6 illustrates a decreasing trend of U_ec in the BNE-RF combination and an increasing one for the BNE-SR combination, with a peak at around 40% of agents using BNE and 60% using the SR model. That is, BNE shows a significant positive effect on improving individual comfort level when the BNE and SR models are mixed.

4.5 To further confirm the influence of BNE, the number of agents was set to 3000 and the experiment repeated. Figure 7 shows a strong downtrend in evacuation time with increasing proportions of BNE and upward trends after around 60%-70% with RF and 70%-80% with SR. The average U_ec showed an increasing trend with BNE with both RF and SR (Figure 8), indicating the positive effects of BNE on reducing evacuation time, as well as improving individual comfort during evacuations.

[…] may be the difference in crowd density. With higher density (i.e., 3000 persons), BNE has a notable influence on improving pedestrian comfort. To unpick this further, a third set of experiments was conducted with the number of agents varying from 1100 to 3000 in increments of 100, and 30 repetitions were undertaken at each percentage fraction of BNE users. Both exit time and average U_ec were recorded to further evaluate the effects of BNE on pedestrian evacuation with different crowd densities. The full results are shown in the Appendix. Figure 9 shows the mean exit times for different numbers of agents and different BNE percentages. When BNE is combined with RF, there is little advantage in specifying BNE at low densities (i.e., around 1500 agents). However, a distinct decrease of evacuation time with increasing BNE proportion can be observed in the scenarios with over 1500 agents. Thus, the positive influence of BNE on reducing exit time becomes increasingly evident with increased crowd density. Figure 9 also indicates the non-monotonicity of the mean values, with the reduction in exit time bottoming out at around 60% BNE agents and 40% RF agents. Similar trends are evident in BNE with SR. The exit time declines with increasing BNE proportions as the densities increase, but with a greater positive effect at lower crowd densities than in BNE with RF. In this case the reduction in mean exit time bottoms out at around 80% BNE agents with 20% SR agents.

4.7 The changes in mean comfort utility, U_ec, with different numbers of agents and different BNE percentages are shown in Figure 10. Since BNE was perceived to have no influence on improving pedestrians' comfort when the number of agents was set to 2000 in the BNE-RF combination (Figure 6), the range of initial pedestrian numbers was extended, and U_ec showed an upward tendency when the number was set to 2300, increasing with agent density. Figure 10 shows gradual increases in comfort utility with increasing BNE with RF at higher densities (greater than 2000) to around 30% BNE, followed by declines, except at the highest densities (greater than 2500). For BNE with SR, comfort utility improves with increased BNE to around 50%, again increasing at higher densities.
4.8 A plausible explanation for this variation of mean values with density and BNE percentage is the coincident patch selection by BNE agents occupying the same patches, especially when almost all agents are BNE, leading to low evacuation speed. Specifically, since BNE agents tend to select the patch with maximum total utility as the target, it may be possible for them to choose the same targets simultaneously in the latter part of the simulation, causing congestion.
4.9 In summary, BNE has been shown to have a beneficial influence on shortening evacuation time as well as improving pedestrian comfort during emergency evacuation. The advantages of BNE were more pronounced in the scenarios with high densities, compared to those with low ones. BNE agents were observed to display more refined behaviours during evacuation and were able to avoid the clogged areas, whilst considering the distance to exits and the probability distribution of their neighbours' movements in the next time step. It was also found that the introduction of a proportion of non-BNE agents could speed up the evacuation process to a certain extent.

Discussion and Conclusions
5.1 This paper evaluated evacuation models that incorporate game theory (i.e., Bayesian Nash Equilibrium) within multi-agent systems with the aim of providing more realistic simulations of the movements and behaviours of evacuating pedestrians. Pedestrian agents adopting BNE were found to be more representative than agents with other behavioural models (SR and RF). At each step, they predict the next move of their neighbours and then avoid the most congested patches on their way to the exits, resulting in a relatively high comfort level. It was hypothesised that agents adopting BNE may provide a more forward-looking and representative behavioural model for pedestrian evacuation, as BNE agents sought to avoid congestion instead of directly moving to the exits or blindly following others.
5.2 A series of simulation experiments were undertaken to evaluate the role of BNE in pedestrian evacuation. The results demonstrate the positive impacts of BNE on reducing evacuation time and improving individual comfort during pedestrian evacuation, which was consistent regardless of the number (density) of agents. Agents with BNE displayed more efficient and intelligent behaviours during simulation compared with RF and SR agents, suggesting that simulation models that incorporate such behaviours for pedestrians during evacuation could better represent real-world scenarios. The individual decision-making process based on BNE is easily adaptable to other pedestrian simulations relating to flooding or fire and has the potential to fill the gap left by the lack of a forward-looking, intelligent individual behavioural model in ABMs for pedestrian evacuation.

5.3 The BNE agents in this model were assumed to be able to independently determine their next actions after considering environmental factors, as well as probabilities of their neighbours' movement and decisions, which in turn affect the subsequent steps of other agents. That is, agent decisions and movement with BNE varied based on their interactions with surroundings, predictions of neighbours' actions and their own decisions, making the model outputs more realistic. This is different to previous simulation models of pedestrian flow (e.g., Jiang et al. 2010; Teknomo 2016; Lu et al. 2017). Thus far, research incorporating game theory and agent-based modelling has mainly focused on comparing game theory and ABM approaches (Noori et al. 2021) and the combination of ABMs and simple game theory (Levy et al. 2018). Studies of cooperation under Bayesian game theory and pedestrian evacuation have generally focused on individual decision-making over exit choice rather than pedestrian movements during the evacuation process (Mesmer & Bloebaum 2014). The models described in this paper address this gap by incorporating complex game theory within an ABM approach at the individual agent level. Rather than simulating exit selection, this model simulated pedestrian decisions using BNE for each time step, which more closely matches the reality of people avoiding crowded spaces in evacuation with routes that might not be the shortest path. Under BNE, the expected comfort utility of patches was constantly being updated by the varying distributions of other nearby agents at each time step. This allowed BNE agents to predict the next move of other agents and then to avoid the most congested areas during evacuation, in contrast to SR agents, who move directly to the exit, causing large congestion and resulting in longer evacuation times, and RF agents, who randomly follow a neighbour, resulting in smaller groups gathering but still slower evacuation times. The order of the effects of the three behavioural models from high to low is: BNE, RF, SR.

5.4 The proposed BNE model has a number of limitations: 1) The non-monotonicity of the mean values in Figures 5 to 8 reveals an internal issue in the current version of the BNE implementation: BNE agents occupying the same patch may choose the same target in the next time step in the latter half of the simulation, which may cause small-scale congestion and low evacuation speed. This will be addressed in further studies, for example by adjusting the distribution of strategy selection from 100% optimal choice (i.e., the patch with maximum total utility) to 50% optimal choice, 45% suboptimal choice (i.e., the patch with the second-highest total utility) and 5% worst choice (i.e., the patch with minimum total utility), to disperse the aggregations of some BNE agents; 2) Jumping from a naive Shortest Route strategy to a new BNE strategy obscures the specific efficiency gain, making it difficult to determine the magnitude of the improvement. For example, the SR strategy used for comparison did not account for congestion costs, which makes it unsurprising that the BNE strategy performs better. In our following work, we intend to replace this weak SR strategy with a more complex and efficient alternative (e.g., A* or Dijkstra's search algorithm) so that costs are taken into account in pathfinding; 3) Some model attributes such as moving speed, comfort utility and 'Probability-competing' need to be further calibrated, and a related sensitivity analysis is also required, to match evacuation movements in the real world; 4) Further simulation experiments need to be conducted with a wider range of parameter configurations and different static variables (e.g., exit size) to further understand the effects of BNE on the evacuation process from various perspectives; 5) Real datasets or outputs from other existing evacuation models are needed to validate this model; 6) The role played by space in the decision-making process could be further explored by adding blockades into the simulation space before or during evacuation; 7) Greater self-organizing behaviours among pedestrians (e.g., competitive behaviour) need to be considered to make the model easily adaptable to a broader range of pedestrian studies. Such research is necessary to determine the full potential of the use of Bayesian Nash Equilibrium in pedestrian decision making in evacuations and to further advance the application of Bayesian game theory in agent-based pedestrian modelling.
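The softened selection rule proposed under limitation 1 (50% optimal, 45% second-best, 5% worst) could be prototyped along the following lines. This is a sketch of the proposal rather than anything implemented in the current model: the function name `pick_patch`, the patch labels and the utility values are hypothetical, and at least two candidate patches are assumed.

```python
import random


def pick_patch(total_utility, rng, weights=(0.50, 0.45, 0.05)):
    """Sample the best, second-best or worst patch with the given weights."""
    # Rank candidate patches from highest to lowest total utility.
    ranked = sorted(total_utility, key=total_utility.get, reverse=True)
    options = [ranked[0], ranked[1], ranked[-1]]
    return rng.choices(options, weights=weights, k=1)[0]


# Hypothetical total utilities for the six candidate patches.
utilities = {"P0": 1.6, "P1": 1.5, "P2": 1.3, "P3": 1.2, "P4": 1.0, "P5": 0.8}
rng = random.Random(42)  # seeded so a run can be reproduced
choice = pick_patch(utilities, rng)
```

Because co-located agents no longer all pick the single best patch, they should spread over at least two targets most of the time, which is the dispersal effect the proposal aims for.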

Model description (ODD+D protocol)
A complete and detailed description of the initial model following the ODD+D protocol (Müller et al. 2013) is provided in this section.

Overview
1.1 Purpose. The purpose of this model is to introduce a new individual decision-making method, BNE, into the ABM of pedestrian evacuation to simulate individual behaviours and movements. The model was built to balance fast evacuation against high comfort, which is a general conflict in the domain of pedestrian research. The interactions of pedestrians with their neighbours as well as their surroundings were also considered in order to simulate a more realistic pedestrian evacuation. This model ultimately aims to explore the influences of BNE on pedestrian flows from various perspectives, especially pedestrian comfort and exit time in an emergency evacuation with different parameter configurations.
1.2 Entities, state variables and scales. The model contains two main types of entities: Patches (i.e., evacuation space) and Agents representing evacuating pedestrians. The variable names are the same as the variables implemented in NetLogo. The Global Environment is defined as model parameters at the system level, controlling all of the global variables representing the simulation environment. Its state variables are shown in Table 2. Agents represent the individual evacuees with different behavioural models, and the related state variables are shown in Table 4.

Scales.
The spatial extent of this model is a rectangular region of 68 × 20 square patches (see Figure 15). The model space is bounded and agents can only evacuate through the exits on either side. The model runs until all the agents evacuate from the simulation space. That is, the temporal scale in this model was not absolute and the number of time steps depended on the initial environmental conditions and the agents themselves. Three behavioural models were evaluated: Shortest Route (SR), Random Follow (RF) and BNE. The behavioural models were used to generate four moving combinations (i.e., model configurations): SR, RF, BNE mixed with SR, and BNE mixed with RF. The moving combination is selected by the user at the beginning of the simulation.
[…] distributions of the strategies chosen by other players, and in this case, no player selects other strategies (Ui 2016). That is, the primary element of individual decision-making is the probability distributions of strategies played by other nearby participants, especially probabilities of neighbours choosing the same strategy as the player. In this model, this is reflected as the probability distributions of the next actions of nearby agents and a series of utility calculations for candidate patches. The relevant underlying equations are described below.

Individual Decision-making.
Decision-making is modelled at the agent level in this research. In this model, the decision-making processes of agents differ between the three behavioural models (i.e., SR, RF and BNE). BNE agents tend to move to a neighbouring patch with maximum total utility every time step, which depends on the probabilities of nearby agents' next actions and the surrounding environment. Each agent using RF follows a neighbour in sight at random and repeats this selection process until the end of the simulation. Agents using SR make only one decision (i.e., which exit to move towards) at the beginning of the simulation, which means that their directions remain unchanged until they succeed in evacuating from the simulation space. Only BNE agents use utility values to determine target patches during simulation.

Individual Sensing.
Agents following the RF model are assumed to be able to sense their neighbours within a specific radius of their current locations. The value of this radius remains constant during the simulation and can be adjusted with the corresponding slider "follow-radius". Agents following BNE are assumed to be able to sense surrounding conditions and to select a target patch from nearby patches to move to in each time step. Specifically, a BNE agent can sense and potentially move to all neighbouring patches that are located between its current patch and the exit. The choice criterion is the sum of the distance utility and the expected comfort utility of each patch.
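The RF sensing rule can be sketched as a radius query followed by a random pick. This is only an illustration under stated assumptions: positions are plain (x, y) tuples, distance is Euclidean, and the function names `neighbours_in_radius` and `pick_follow_target` are hypothetical rather than taken from the model's source code.

```python
import math
import random


def neighbours_in_radius(i, positions, follow_radius):
    """Indices of all other agents within follow_radius of agent i."""
    me = positions[i]
    return [
        j
        for j, other in enumerate(positions)
        if j != i and math.dist(me, other) <= follow_radius
    ]


def pick_follow_target(i, positions, follow_radius, rng):
    """Pick one visible neighbour at random, or None if nobody is in range."""
    visible = neighbours_in_radius(i, positions, follow_radius)
    return rng.choice(visible) if visible else None
```

Passing a seeded `random.Random` instance as `rng` makes individual runs reproducible, which is convenient when comparing behavioural models.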

Individual Prediction.
Individuals in this model use information on their neighbours' current locations and the probability distribution of their moving directions to predict future situations. Specifically, the expected comfort utility of each neighbouring patch is estimated one time step ahead by using the explicit prediction for nearby agents, p(n), and the comfort coefficient of the patch, U_c(n).
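One way to read this prediction step is as an expectation of U_c(n) under p(n). The sketch below is a hedged illustration, not the model's equations: it assumes p(n) is the probability that exactly n nearby agents enter the patch, that agents move independently, and that comfort falls linearly with occupancy. The `capacity` parameter and all function names are hypothetical.

```python
def occupancy_distribution(move_probs):
    """p(n) for n = 0..len(move_probs), assuming independent moves."""
    dist = [1.0]  # with no agents considered, n = 0 with certainty
    for q in move_probs:
        nxt = [0.0] * (len(dist) + 1)
        for n, p in enumerate(dist):
            nxt[n] += p * (1.0 - q)   # this agent stays away
            nxt[n + 1] += p * q       # this agent enters the patch
        dist = nxt
    return dist


def comfort(n, capacity=4):
    """Hypothetical U_c(n): 1 when empty, falling linearly to 0 at capacity."""
    return max(0.0, 1.0 - n / capacity)


def expected_comfort_utility(move_probs, comfort_fn=comfort):
    """Sum over n of p(n) * U_c(n)."""
    dist = occupancy_distribution(move_probs)
    return sum(p * comfort_fn(n) for n, p in enumerate(dist))
```

For example, two neighbours that each enter the patch with probability 0.5 give p = [0.25, 0.5, 0.25] and, with the assumed linear comfort function, an expected comfort utility of 0.75.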
2.5 Interaction. Interactions among agents are mediated through the variations of the expected comfort utilities of neighbouring patches. In the BNE model, each patch's expected comfort utility depends on the expected number of agents who will be on the patch in the next step, which in turn determines the number of agents in the nearby patches. Then, each agent determines its direction by comparing the total utilities of the six patches that lie within its optional directions (i.e., candidate patches P_0, P_1, ..., P_5) (details in Section 2.3.4). That is, the current position and expected next move of agents influence the expected comfort utility of patches, which, in turn, affects the next move of nearby agents.
2.6 Heterogeneity. Agents are heterogeneous in their decision-making process based on their own behavioural type. For agents following the Shortest Route model, their only decision is to move to the exit. Agents following the Random Follow model choose a neighbour in their view each time step and repeat this process until the end of the simulation. Agents following the BNE model select the patch with the maximum total utility to move to and repeat this decision-making procedure every time step until they evacuate successfully.

Stochasticity.
Stochasticity is introduced in two ways in this model. Firstly, the model is initialized randomly based on the settings configured by the user. Specifically, (a) the location of agents, (b) the allocation of behavioural type, and (c) the initial directions of agents are stochastic at the beginning of a simulation. Secondly, when an agent following the RF model determines where to move, its choice of a target to follow is partly stochastic, as it is limited by the agent's view and the exit selected. Similarly, when two patches share the same maximum total utility, a BNE agent randomly selects one of them to move to. This decision is stochastic but not completely random, since the choice is restricted by the location and direction of the agent.
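The tie-breaking rule for BNE agents can be sketched as follows. This is a minimal illustration, assuming the total utilities of the candidate patches are already available in a dictionary; the function name `choose_patch` and the tolerance `tol` used to compare floating-point utilities are assumptions, not part of the published model.

```python
import random


def choose_patch(total_utility, rng, tol=1e-9):
    """Return the candidate patch with maximum total utility.

    Ties (within tol) are broken uniformly at random, so the choice is
    stochastic only among the best candidates.
    """
    best = max(total_utility.values())
    tied = [patch for patch, u in total_utility.items() if best - u <= tol]
    return rng.choice(tied)
```

Because only the tied best candidates enter the random draw, the decision stays restricted to patches the agent would have preferred anyway.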
2.8 Observation. The purpose of this model is to explore whether and how BNE affects pedestrian evacuation procedures in the case of emergency, with two main measurements: evacuation time and pedestrian comfort level. The exit time and average expected comfort utility of each run are collected at the end of the simulation in order to compare evacuations with varying proportions of agents following the BNE model.
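As a sketch of how the two measurements might be compared across proportions of BNE agents, the per-run records can be grouped by percentage and averaged. The record layout (keys `percentage_bne`, `exit_time`, `comfort`) and the function name `summarise` are assumptions made for illustration; they do not describe the format of the published experimental data.

```python
from collections import defaultdict
from statistics import mean


def summarise(runs):
    """Average exit time and comfort for each percentage of BNE agents."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["percentage_bne"]].append(run)
    return {
        pct: {
            "mean_exit_time": mean(r["exit_time"] for r in rs),
            "mean_comfort": mean(r["comfort"] for r in rs),
        }
        for pct, rs in sorted(grouped.items())
    }
```

Each entry of the returned dictionary corresponds to one percentage fraction, averaged over however many repeated simulations were recorded for it.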

Details
3.1 Implementation Details. The initial model was developed in NetLogo. The source code and experimental data are available at https://doi.org/10.25937/75wf-aa82.

Initialization.
The initial state of the model is a hypothetical evacuation space with emergency exits located on either side. The agents are initialized by setting the total number of persons (i.e., the number-persons global parameter) and the percentage of BNE users through the slider Percentage-of-agents-with-BNE. The agents are randomly scattered over the simulation environment. The initial speed of each agent is tailored according to the number of agents in its Moore neighbourhood (i.e., the agent's own patch and the eight neighbouring patches), and all adjustments are based on the reference speed assigned by the global parameter move-speed. The agent moving combination is selected through the moving-pattern parameter. The moving combinations in this model consist of Random Follow (RF), Shortest Route (SR), BNE mixed with RF, and BNE mixed with SR. For the first two combinations, all agents are set to the same behavioural model during the simulation. For the last two combinations, a specific proportion of agents, defined by the global parameter Percentage-of-agents-with-BNE, use BNE to evacuate, and the rest follow one of the two other models (i.e., SR or RF) based on the selected option.
Each patch can be occupied by one or more agents and calculates its own distance utility and expected comfort utility at the initialization stage. BNE agents compare the total utility (i.e., the sum of U_d and U_ec) of the candidate patches and select the one with the maximum value to move to in each time period. That is, the patch attributes are continuously updated every time step to provide current information to agents using BNE for determining their next actions. At present, a series of global parameters (e.g., door-width, follow-radius, etc.) are fixed because the main research focus is the exploration of whether and how BNE affects pedestrian emergency evacuation, but variations in these global parameters could be evaluated in further research.
3.3 Input data. So far, no input is read in this initial model.
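The initialization described above can be sketched in a few lines. This is an illustrative reading rather than the NetLogo code: the dictionary-based agent representation and the function names are hypothetical, and the speed rule in `initial_speed` (reference speed divided by the Moore-neighbourhood count) is purely an assumption, since the text only states that speed is adjusted by that count.

```python
import random
from collections import Counter

WIDTH, HEIGHT = 68, 20  # grid size in patches


def initialise(number_persons, percentage_bne, other_model, rng):
    """Scatter agents at random and allocate behavioural types."""
    n_bne = round(number_persons * percentage_bne / 100)
    behaviours = ["BNE"] * n_bne + [other_model] * (number_persons - n_bne)
    rng.shuffle(behaviours)  # random allocation of behavioural type
    return [
        {"pos": (rng.randrange(WIDTH), rng.randrange(HEIGHT)), "behaviour": b}
        for b in behaviours
    ]


def moore_counts(agents):
    """Agents on each agent's own patch plus its eight neighbouring patches."""
    per_patch = Counter(a["pos"] for a in agents)
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    return [
        sum(per_patch[(a["pos"][0] + dx, a["pos"][1] + dy)] for dx, dy in offsets)
        for a in agents
    ]


def initial_speed(move_speed, count):
    """Hypothetical rule: the more crowded the neighbourhood, the slower."""
    return move_speed / count  # count >= 1 because it includes the agent itself
```

For instance, `initialise(2000, 25, "SR", random.Random(0))` yields 500 BNE agents and 1500 SR agents, matching the BNE-mixed-with-SR combination at 25% BNE.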

1.5 This research develops an agent-based evacuation simulation model of pedestrian flow by introducing BNE to realistically simulate individual decision-making processes and pedestrian behaviours in an emergency evacuation. The model combines Bayesian game theory and an agent-based approach at an individual level, to provide an experimental environment of pedestrian flows to support research on crowd management and evacuation planning. It was hypothesized that a BNE approach would positively affect individual evacuations during model simulation because of its capacity to predict future congestion levels. A series of simulation experiments were conducted with different parameter configurations to understand whether and how BNE affects pedestrian emergency evacuations.

3.3 A series of simulation experiments were conducted in NetLogo BehaviorSpace to evaluate the role BNE played in pedestrian evacuation. To determine whether and how BNE affects individual evacuation, three simulation scenarios were evaluated: (1) Singleton Pattern with a fixed number of persons: evacuations in which all pedestrians follow one of three behavioural models (i.e., SR, RF, BNE), (2) Mixed Pattern with a fixed number of persons: evacuations in which a specific proportion of agents follow BNE and the rest evacuate by one of the other two models (i.e., BNE mixed with SR/RF), and (3) Mixed Pattern with a varied number of persons:

Figure 4: The stages of the flow of agents following three behavioural models: 100% BNE, RF and SR.

Figure 5: Evacuation time against percentage BNE with RF and with SR, with a local line of fit, for 2000 agents, 50 simulations for each percentage fraction.

Figure 6: Expected comfort utility against percentage BNE with RF and with SR, with a local line of fit, for 2000 agents, 50 simulations for each percentage fraction.

Figure 7: Evacuation time against percentage BNE with RF and with SR, with a local line of fit, for 3000 agents, 50 simulations for each percentage fraction.

Figure 8: Expected comfort utility against percentage BNE with RF and with SR, with a local line of fit, for 3000 agents, 50 simulations for each percentage fraction.

Figure 9: Contour plot of exit time vs number of agents and percentage of BNE users (BNE-RF/SR combinations; 1100 to 3000 agents; 30 simulations for each percentage fraction).

Figure 10: Contour plot of mean expected comfort utility vs number of agents and percentage of BNE users (BNE-SR/RF combinations; 1100 to 3000 persons; 30 simulations for each percentage fraction).

Figure 12: Facets: Evacuation time against percentage BNE with SR, with a local line of fit, for 1100 to 3000 agents, 30 simulations conducted for each percentage fraction.

Figure 13: Facets: Expected comfort utility against percentage BNE with RF, with a local line of fit, for 1100 to 3000 agents, 30 simulations conducted for each percentage fraction.

Figure 14: Facets: Expected comfort utility against percentage BNE with SR, with a local line of fit, for 1100 to 3000 agents, 30 simulations conducted for each percentage fraction.

Table 1: The list of parameter settings in experiments.

Table 2: Global environment state variables.

Patches refer to the areas in the simulation space. The evacuation environment in this model was divided into 1360 (68 × 20) patches, and because the values of the different utilities can control the directions of agents, these patch attributes are considered state variables. Details of the patch state variables are shown in Table 3.