Comparing Mechanisms of Food Choice in an Agent-Based Model of Milk Consumption and Substitution in the UK

Substitution of food products will be key to realising widespread adoption of sustainable diets. We present an agent-based model of decision-making and influences on food choice, and apply it to historically observed trends of British whole and skimmed (including semi-skimmed) milk consumption from 1974 to 2005. We aim to give a plausible representation of milk choice substitution, and test different mechanisms of choice consideration. Agents are consumers that perceive information regarding the two milk choices, and hold values that inform their position on the health and environmental impact of those choices. Habit, social influence and post-decision evaluation are modelled. Representative survey data on human values and long-running public concerns empirically inform the model. An experiment was run to compare two model variants by how well they reproduce these trends, measured by mean weekly milk consumption per person. The variants differed in how agents became disposed to consider alternative milk choices: one followed a threshold approach, the other was probability-based. All other model aspects remained unchanged. An optimisation exercise via an evolutionary algorithm was used to calibrate the model variants independently to observed data. Following calibration, uncertainty analysis and global variance-based temporal sensitivity analysis were conducted. Both model variants were able to reproduce the general pattern of historical milk consumption. The probability-based approach gave a closer fit to the observed data, but over a wider range of uncertainty. This responds to, and further highlights, the need for research that compares different models of human decision-making in agent-based and simulation models. This study is the first to present an agent-based model of food choice substitution in the context of British milk consumption. It can serve as a valuable precursor to the modelling of dietary shift and sustainable product substitution to plant-based alternatives in Britain.


Introduction
A significant dietary shift for much of the Global North is needed to drastically reduce the planetary burden of the current global food system. Much of this burden stems from livestock and animal-sourced foods, with the sector responsible for .% of anthropogenic greenhouse gas emissions (GHGs), mainly through methane [...] space) and focus on factors associated with the individual (i.e. personal differences and a person's immediate social environment). Specifically, the set of factors influencing milk purchasing decisions is: cognitive perception of the health and environmental effects of milk choice; habit; social influence; and choice evaluation, conceived as value-based cognitive dissonance. The model explores how these influences on individual choices impact the consumption dynamics of a population at large (Bruch & Atwell ). The model's individual decision-making processes are based upon utility maximisation, following a general approach common to agent-based models of innovation diffusion and consumer adoption (Zhang & Vorobeychik ).
Background: Agent-Based Modelling of Food Choice
Agent-based models (ABMs) are simulated 'worlds' that contain a set of entities (agents) that exist and interact with each other and their environment. They behave according to a set of rules, 'empowering' them to make autonomous decisions at an individual level. These decisions, behaviours and interactions lead to emergent outcomes that cannot be understood simply by the constituent components of the simulation. Agent-based approaches are part of a suite of tools and methods that can help provide insight in a world where problems, and their solutions, are rarely simple and, increasingly, complex. Their ability to represent heterogeneity and incorporate diverse empirical data, for example in attitudes, preferences, biases, habits and demographics across populations, makes them a valuable tool in studying social systems in transition, where behaviour change is necessary and populations may not act rationally.
ABMs have been used in behavioural change research on health, diet, environmental and sustainable behaviours. Some have focused on the food environment as an influence on healthy food choice (Auchincloss et al. ), or on the social influence of peers and marketing campaigns on healthy eating (Zhang et al. ). These and other studies are part of a growing number of ABMs focusing on health and diet (see a recent review of their use in public health by Tracy et al. ). However, there are fewer ABM studies concerned with the non-health impact of diets. One such study looked at meat consumption of UK consumers, focusing on social eating networks and testing responses to marketing strategies or price increases (Scalco et al. ). Another constructed a number of consumption profiles (food, energy, transport) of Italian households, and tested how the associated GHGs could change under a number of policy interventions designed to affect food choice (Bravo et al. ). A recent study explored consumer behaviour and policy interventions with an ABM of Australian organic wine purchasing (Taghikhah et al. ). Further analysis, using the same agent-based model, compared a theory-driven with a data-driven modelling approach and the relative merits of each (Taghikhah et al. ).
We build on this previous literature of consumer behaviour of (more) sustainable food, and model food choice influence and decision-making in the context of historical UK milk consumption, assessing the performance of two different model approaches.
Model Description - ODD
The model is described using the ODD (Overview, Design concepts, Details) protocol, which is a standard approach to describe, share, and compare agent-based models (Grimm et al. , , ).

Overview
Purpose
The overall objective of the model is to reproduce adoption behaviours of milk consumption by the UK public, by replicating individual preferences and decision factors. Specifically, the goal of the model is to explore the influence of perception, habits, social influence and choice evaluation in an individual's decision-making process of milk choice. The model looks to replicate the substitution of whole milk for skimmed and semi-skimmed milk from the s onwards. The main outcome reported here is the average weekly consumption of each milk type per person. The simulation uses both a theoretical grounding and empirical data to inform the ABM, with calibration performed against observed macro-level data.

In particular, the study conducts an experiment to compare the performance of two model variants in reproducing observed milk consumption trends. These variants present different mechanisms for how agents become disposed to consider their choices, representing a threshold-based and a probability-based approach.
The agents in the model represent adult consumers who occupy a random position in an information environment. Each agent has a disposition to consider alternative milk choices. Two disposition mechanisms are tested in the model: a threshold-based approach and a probability-based approach. Each agent makes a choice of milk selection based on a function for each alternative, made up of the perceived health and environmental characteristics of each choice. These are computed at the initialisation of the simulation and then calculated at each time step. An agent's milk choice function is modified by habit, social influence, and evaluation of previous choices. Agents ascribe different relative importance to each constituent part of the choice function (health factors and environmental factors). If the disposition requirement has been met, consumption of each milk type is split proportionately by the size of each choice function, modulated by the other influences. If not, agents keep their existing choice.
Agents (n = 1,000) start with an existing choice based on the dominant position of whole milk versus skimmed varieties in 1974 (start year of the data). All agents are part of a social network. Each agent in the network can sense and be influenced by the choice function of each milk alternative for other agents in their network. Links between agents are unidirectional, and influence occurs as a function of interaction probability, with the degree of influence characterised by agent susceptibility. Social norms are globally perceived by agents and impact the weightings of the choice function (see social influence sub-model).

Process overview and scheduling
At the start of each model run a set of agents is created, positioned and linked with other agents in a network (see social influence sub-model) in the information environment. Agents are initialised with a choice, reflecting the milk consumption split between whole and skimmed (including semi-skimmed) milk in 1974. Agents perceive information about milk choices in their immediate neighbourhood of grid cells and construct a choice function based on the average of values perceived. Agents have a memory and draw on previous information to inform the new choice functions. Some agents begin as habitual consumers to reflect the incumbent, long-standing milk choice (whole milk). Habit, social influence, and choice evaluation impact the final choice functions. Milk choice for each time step is determined, proportionally, by the scores of the final milk choice functions. The simulation runs at annual time steps from 1974 to 2005.

Basic principles
Decision-making follows a basic structure of: agent perception of choice characteristics; the triggering, or not, of disposition to consider alternatives; a set of scored choice functions made up of the perceived characteristics and modulated by habit and social influence; and finally, choice evaluation, where agents consider the impact of their choices and may adjust their future decisions.
Other properties that are known to be important in food choice, e.g. price and accessibility, are not explicitly considered here as we assume that whole milk and skimmed (including semi-skimmed) milk have similar profiles among these characteristics. The price component of this is supported in Family Food survey data by matching expenditure and consumption for both milk types. Excluding the influence of the first few years of data, as the very low skimmed milk consumption generates high pence/litre volatility, the average skimmed milk price between and was . pence/litre compared with whole milk at . pence/litre. Across the same price data range, mean absolute percentage error for these two curves was .%. Future work looking to incorporate plant-based alternatives would look to include and model price, accessibility and availability factors given the significant differences here with dairy milks.
The choice function is modulated by habit, social influence (peer effect and norms), and evaluation functions, from which an overall set of options is possible and agents seek to maximise the utility of their choice. These functions are somewhat akin to Epstein's ( ) agent formalism of rational, emotional and social components but differ in the decision-making mechanism. Each of these functions is described in the sub-model section. [...]) present an ABM of diffusion dynamics where agents are triggered to an 'aware' state by imitation of their neighbours in a social network. Another ABM, looking at urban water demand, contains agents whose binary state (environmentalist or non-environmentalist) in part depends on the relative proportion of agent neighbours in each state, with a logistic function governing the probability of transitioning between the two (Galán et al. ). The disposition sub-model follows these previous studies and makes use of agent social networks (neighbours) to form the basis of the disposition mechanism. Here, an agent will only consider changing their milk choice if they are in a state of disposition to do so.
Given its significance to how the model operates, we explore different model conceptualisations of how to represent this. The first conceptualisation is a threshold disposition approach. Here, agents each have a threshold ( to ) below which they remain indisposed, i.e. remain with their existing choice. Thresholds are not random but are informed by European Social Survey data on human values of UK respondents. The values component of the survey is based on Schwartz's theory of basic human values (Schwartz ). We operationalise answers to a survey question associated with openness to change ( Schwartz) to indicate the level at which an agent would consider alternatives. Agents in the model replicate the distribution of answers from UK respondents on a scale of , very low threshold, to , very high threshold. Agent disposition is calculated by the proportion of agents that have chosen whole milk or skimmed/semi-skimmed milk. In this mechanism, there is also a small random probability ( %) that agents will become disposed. This is to reflect the sometimes spontaneous or impulsive nature of food choice.

The second model variant is a probability-based mechanism, and follows an approach by Wang et al. ( ), adapted and developed for this model. Rather than a discrete threshold, disposition is expressed as a likelihood given by Equation : where k is the gradient of the probability logistic function; h is a measure of the uncertainty in aggregate neighbour choice and uses information entropy as an indicator (Equation ) of how 'settled' or 'unsettled' the collective neighbour decision is; h_max, given by −log2(0.5) (equal to 1), is the maximum value this can take; and . is a corrective coefficient to ensure that the probability is bounded between 0 and 1. As h approaches h_max (i.e. maximum 'unsettled' agents), the function tends towards (i.e. toward % probability); as h approaches 0 (i.e. maximum 'settled' agents), the function also tends towards (i.e. toward % probability). The measure of neighbour choice uncertainty is expressed by Equation : where f_whole and f_skim are the frequencies of agent neighbours choosing whole milk or skimmed milk, and f_all is the total number of neighbours.
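To make the uncertainty measure concrete, the following is a minimal Python sketch of the information entropy of neighbour choices, assuming the standard two-outcome Shannon entropy in base 2 (consistent with the stated maximum of −log2(0.5) = 1). The function name and the handling of agents without neighbours are assumptions made for illustration; the logistic mapping from h to a disposition probability is not reproduced here.

```python
import math

H_MAX = -math.log2(0.5)  # maximum entropy for two options, equal to 1.0


def neighbour_entropy(f_whole: int, f_skim: int) -> float:
    """Shannon entropy (base 2) of the neighbours' split between two milk types.

    f_whole and f_skim are counts of neighbours choosing each milk type.
    Returns 0.0 when all neighbours agree ('settled') and 1.0 when they are
    split evenly ('unsettled').
    """
    f_all = f_whole + f_skim
    if f_all == 0:
        # Assumption: an agent without neighbours receives no uncertainty signal.
        return 0.0
    h = 0.0
    for f in (f_whole, f_skim):
        if f > 0:  # the term 0 * log2(0) is taken as 0
            p = f / f_all
            h -= p * math.log2(p)
    return h
```

For example, an agent whose ten neighbours are split five and five sits at h = h_max, whereas an agent whose neighbours all drink whole milk has h = 0.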
Cognitive perception. The cognitive perception sub-model represents the processes of how information regarding food choices operates and is perceived by agents. To reduce computation time, information is held directly by agents rather than by the environment, with values drawn from a normal distribution for each of the components. The means of these distributions are determined by the health and environmental perception parameters. Values are given relative to the incumbent choice, i.e. whole milk. That is, whole milk has a mean value of for each component and alternative values are set relative to this. At each time step, the information points are redistributed stochastically. This is to mimic the reality that information perceived by people is often transient in everyday life.
Memory effects are included, with agents able to store a limited amount of averaged information from a set number of previous time steps. At each new time step, new information is added to a rolling average of previous time steps and information is removed beyond a threshold governed by the memory length parameter. The upper and lower memory bounds ( , ) follow the El Farol NetLogo model by Rand & Wilensky ( ).
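A rolling memory of this kind can be sketched in Python as below. This is an illustrative sketch only: the class name is hypothetical, and the equal weighting of remembered time steps is an assumption rather than a detail confirmed by the model description.

```python
from collections import deque


class PerceptionMemory:
    """Stores perceived values for the most recent `memory_length` time steps."""

    def __init__(self, memory_length: int) -> None:
        if memory_length < 1:
            raise ValueError("memory_length must be at least 1")
        # A bounded deque discards the oldest value once the limit is reached.
        self._values = deque(maxlen=memory_length)

    def add(self, value: float) -> None:
        """Record the value perceived at the current time step."""
        self._values.append(value)

    def average(self) -> float:
        """Return the mean over the remembered time steps."""
        if not self._values:
            raise ValueError("no information has been perceived yet")
        return sum(self._values) / len(self._values)
```

With a memory length of three, adding a fourth observation drops the first, so the average always reflects only the three most recent time steps.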
The choice function is made up of the memory-averaged components, weighted by the importance an agent ascribes to a particular component. This is given by Equation : where β1 and β2 are the weights assigned to the perception of health and environmental impact of the milk choices. Initially, weights are assigned randomly. As the model runs, agents can change their weights to align (or not) with the prevailing social norms (see social influence sub-model).
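As a hedged illustration, the sketch below assumes the choice function is a simple weighted sum of the two memory-averaged components, and pairs it with the proportional split of consumption described in the model overview. Both function names and the linear form are assumptions; habit, social influence and evaluation modifiers are omitted.

```python
def choice_score(health: float, environment: float,
                 beta_health: float, beta_env: float) -> float:
    """Weighted sum of the memory-averaged health and environmental perceptions.

    Assumption: a linear combination, with beta_health and beta_env playing
    the roles of the weights beta_1 and beta_2.
    """
    return beta_health * health + beta_env * environment


def consumption_split(score_whole: float, score_skim: float) -> tuple[float, float]:
    """Share of total weekly milk allocated to each type, proportional to score."""
    total = score_whole + score_skim
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    return score_whole / total, score_skim / total
```

For instance, scores of 1.0 for whole milk and 3.0 for skimmed milk would allocate a quarter of an agent's weekly consumption to whole milk and three quarters to skimmed milk.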
Habit. Food behaviours are habitual (Riet et al. ). We model this as a multiplicative bonus on the milk score that repeatedly turns out as the highest-scored choice, i.e. repeat behaviour makes it easier to continue this behaviour. The form of this is adapted from the empirical mathematical function of habit formation (automaticity) for simple healthy activities from Lally et al. ( ). It is scaled to between and , the latter being the upper habit 'multiplier' an agent can experience. The value of the upper 'multiplier' is a modelling choice to prevent this model component unduly dominating the decision-making process. The value that this takes should be subjected to robustness analysis in future work. This habitual inertia to change resets at each new choice, so repeat behaviours are reinforced but can be overcome if the choice function of the other option is sufficiently large. The equations governing habit are given by the following: where peak habit (fixed at ) is the maximum choice function multiplier due to the influence of habit, consecutive choices is the number of uninterrupted same milk choices made by an agent, and habit threshold is the number at which the influence of habit affects the milk choice functions. The coefficient of the exponent ( . ) is taken directly from the empirical modelling work on habit formation by Lally et al. ( ).
Social influence. This sub-model represents the process of how agents interact with each other, and how the influence of social norms is modelled. Each node (agent) on the network has an average number of connections (degree) and strength of connection. The structure used in the model follows a small-world network (Watts & Strogatz ). We refer to this network as an agent's neighbours. They represent social interaction and peer influence in a broad sense, for example within households, food shopping, and retail, and do not simply represent real-world physical 'neighbours'. The process is as follows: each agent has an incumbent choice at model initialisation and a new set of choices based on the latest information, and has a probability of sensing the choice functions of the other agents in their network. Influence is modelled as the mean of neighbour choice function values, with the effect on an agent modulated by social susceptibility. The equation governing peer influence is given by the following: where f(cog.) is the cognitive component of the choice function (Equation ), social susceptibility is the degree to which an agent is influenced by its neighbours, and f(cog.)neighbour is the mean neighbour cognitive component of the choice function.
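One plausible reading of this peer influence step is a convex blend of an agent's own cognitive score with the mean score of its neighbours, weighted by social susceptibility. The Python sketch below illustrates that reading only; the blend form, the function name, and the assumption that susceptibility lies between 0 and 1 are illustrative choices and may differ from the model's governing equation.

```python
from statistics import mean


def peer_influenced_score(own_score: float,
                          neighbour_scores: list[float],
                          social_susceptibility: float) -> float:
    """Blend an agent's own cognitive score with the mean neighbour score.

    Assumption: social_susceptibility in [0, 1], where 0 ignores neighbours
    entirely and 1 adopts the neighbour mean.
    """
    if not 0.0 <= social_susceptibility <= 1.0:
        raise ValueError("social_susceptibility must lie in [0, 1]")
    if not neighbour_scores:
        # No neighbours were sensed this time step, so the score is unchanged.
        return own_score
    neighbour_mean = mean(neighbour_scores)
    return ((1.0 - social_susceptibility) * own_score
            + social_susceptibility * neighbour_mean)
```

Under this reading, an agent with a score of 1.0, neighbours averaging 3.0 and a susceptibility of 0.5 ends the step with a score of 2.0.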
In addition to social influence through an agent's network, social norms also play a role in choice. Here, social norms are modelled as the relative weights (β1 and β2 in Equation ) between the two components (health, environmental). This is conceptualised as the population-level view of how important health and environmental matters are in general. This is a global variable that all agents can perceive. It is informed by empirical data from long-running longitudinal surveys on concerns and issues perceived by the UK public (Ipsos Mori and YouGov).
Each agent has a degree of conformity (between and ) to the changing importance of health and the environment. Conformers (i.e. factor of ) will look to wholly align their individual weightings with the social norms. Non-conforming agents look to do the opposite, and all other agents in between see a proportionate effect. At each time step the agents can shift their weights by at most % to more closely reflect the weighting implied by the public concern data.
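The norm-following step can be illustrated with the Python sketch below. Several details are assumptions made purely for illustration: conformity is taken to range from -1 (non-conformer) to 1 (conformer), the cap on the per-step shift is treated as an absolute change in the weight, weights are kept within [0, 1], and the function and parameter names are hypothetical.

```python
def update_health_weight(beta_health: float, norm_health: float,
                         conformity: float, max_shift: float) -> float:
    """Move an agent's health weight towards (or away from) the social norm.

    conformity: assumed range [-1, 1]; positive values move the weight
    towards the norm, negative values away from it.
    max_shift: assumed absolute cap on the change in a single time step.
    """
    gap = norm_health - beta_health
    step = conformity * gap
    # Cap the size of the move made in a single time step.
    step = max(-max_shift, min(max_shift, step))
    # Keep the weight within [0, 1].
    return min(1.0, max(0.0, beta_health + step))
```

If the environmental weight is taken as one minus the health weight (a further assumption), a single call updates both. A full conformer facing a large gap moves by exactly the cap, while an agent with zero conformity keeps its weights unchanged.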
Evaluation. Agents reflect on and evaluate the choices they make. Evaluation offers a mechanism for agents to learn from prior experience and use this to inform future decisions. Cognitive dissonance in food choice is one theory of evaluation (see Ong et al. for a review), and we model this through a conceptualisation of tension between an agent's human values and their milk choice behaviour. Here we use the theory of basic human values to assign two values (security and universalism) to each agent, reflecting, broadly, their position on health and their position on the environment. High importance for universalism values is associated with a deeper concern for, and action on, environmental issues (Schultz et al. ; Schwartz ). Within Schwartz's theory of basic human values, health is orientated to the security value (Schwartz ). The European Social Survey includes questions from Schwartz's basic human values. We take UK responses for universalism and security questions and operationalise them to give a distribution of values relating to environmental and health impacts of agent choice (based on data in Table ).
Each milk choice has an associated health and environmental impact. Note that, in the model, agents are free to consume any combination of each type up to the total average consumption, with a minimum consumption of pint ( ml). At each time step, the aggregate impact of the choices is calculated and then compared against the agent's values on a relative basis. If this relative impact is within a given proximity to their value position, determined by the 'cognitive dissonance threshold' parameter, no feedback is sent. However, if the difference is sufficiently large, the agent enters a state of cognitive dissonance whereby their actions are incongruent with the values they hold. Here, agents pursue the least costly path to try to escape this uncomfortable state. They will either reconsider their behaviour (next choice) and become spontaneously disposed, or they will alter their value base slightly to better fit the choices they make. The change of values is fixed at % each time step. If the difference between impact and value base is too large, given by the 'justification' parameter, agents simply rationalise this dissonance and once again no feedback occurs.

Calibration, Simulation Experiments, and Model Analysis
[...] The NetLogo model and Python were linked via the NL4Py package ( Gunaratne), which allows the remote control, execution, and analysis of the model from within a Python environment (in our case Jupyter). The DEAP Python package was used to execute the EA (Fortin et al. ). Specifically, we employed the "Mu Plus Lambda" algorithm, using a simulated binary crossover, polynomial bounded mutation and the NSGA-II selection algorithm. Candidate parameter sets were drawn from a uniform distribution over upper and lower bounds and an initial population of individual sets was created, with the algorithm running over generations. Prior to running the EA, analysis was conducted to narrow the parameter space of k, the gradient of the probability logistic function, given that it is the sole additional parameter in the probability-based model variant, and its possible importance in model performance and comparison. The results of this pre-calibration analysis are given in Figure of the Appendix.

Initial conditions were set at 1,000 agents, with a starting average whole milk consumption of , . ml and skimmed (including semi-skimmed) consumption of . ml per person per week. The model ran at yearly intervals from 1974 to 2005.
Post-calibration, uncertainty analysis sampled the optimised parameter sets and ran the full model output against observed data. Given the bi-objective function for the calibration, it is expected that more than one parameter set will be found, i.e. multiple non-dominated candidate sets. Saltelli sampling was used to generate parameter sets to run, with sample size given by the expression n(2p + 2), where n is the baseline sample size and p the number of model parameters. Total sample size for each parameter set was fixed at to ensure that each model variant had the same number of runs. This figure represents a compromise between sufficient sampling of the optimised parameter space and computational resource constraints. The baseline sample size (n) was reduced in the probability-based approach, given the extra parameter in this model variant.
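The relationship between the run budget, the number of parameters and the baseline sample size follows directly from the expression n(2p + 2). The Python sketch below illustrates the arithmetic; the function names and the numbers in the example are hypothetical and are not the values used in the study.

```python
def saltelli_total_runs(n_base: int, n_params: int) -> int:
    """Total model runs implied by Saltelli sampling: n(2p + 2)."""
    return n_base * (2 * n_params + 2)


def largest_base_sample(total_budget: int, n_params: int) -> int:
    """Largest baseline sample size n whose total run count fits the budget."""
    return total_budget // (2 * n_params + 2)
```

With a hypothetical budget of 100 runs, a model with four parameters allows n = 10, whereas adding a fifth parameter reduces this to n = 8. This mirrors why the baseline sample size has to be smaller for the variant with the extra parameter when the total number of runs is held fixed.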

Sensitivity analysis
A temporal global variance-based sensitivity analysis was performed on the skimmed milk output curve from the best-performing parameter sets of each model variant (see Figure c) from the calibration exercise. Samples (n = 1,040) were drawn from a bounded ±2 % range of the central value and the analysis repeated seven times, at -year intervals, for each model variant. This constrains the parameter search space to allow a more manageable exploration, given the computational expense of a single model run.
We use Sobol' analysis with Saltelli sampling, implemented in Python using the SALib package (Herman & Usher ). Sobol' analysis is a technique that estimates the relative contribution a parameter makes to a summary statistic (here, average weekly milk consumption) within different parameter sets (Sobol ). Saltelli sampling extends Sobol' analysis and is robust to non-linearity between model inputs and outputs, and can give a quantified assessment of the contribution to overall model uncertainty attributed to the interactions between inputs, as well as the individual inputs themselves (Saltelli ; Saltelli et al. ).

Simulation experiments for model analysis
The aim of this experiment was to compare two different model structures for how agents come to consider alternative choices. Here we model two modes of agent disposition: a discrete threshold approach and a continuous probability approach. The rest of the model structure remains unchanged (see Figure of the Appendix). Initial conditions were set at 1,000 agents, with a starting average whole milk consumption of . ml and skimmed (including semi-skimmed) consumption of . ml per person per week. The model ran at yearly intervals from 1974 to 2005. Parameter values and ranges are those given in Table . The only difference between the two variant sets is an additional parameter governing the gradient of the logistic function in the probability-based disposition approach. The parameter space of each model is explored via the calibration exercise detailed in Section . .

Calibration
The choice of objective function is key in calibrating model output to observed data, and we considered two approaches here. These were: to minimise the root mean square error between the pairs of modelled and observed curves; and to minimise the absolute difference at the point of consumption crossover and at the final value of the skimmed milk consumption curve. The latter was chosen as it performed better in pre-calibration tests. [...] It is not certain that this addition was directly responsible for these changes, or if simply re-running the optimisation exercise resulted in a different set of values. We can point to the stability (pink dots, no or very little variation) of the threshold-based parameter sets, and the wider variation of parameter values under the probability model variant, as evidence of the former.
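The two candidate objectives can be sketched in Python as follows. This is an illustrative reading rather than the calibration code: the crossover error is interpreted here as the difference in the time step at which skimmed milk first matches or exceeds whole milk, which is an assumption, and all function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def rmse(simulated: Sequence[float], observed: Sequence[float]) -> float:
    """Root mean square error between two equally long curves."""
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("curves must be non-empty and of equal length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))


def crossover_index(whole: Sequence[float], skim: Sequence[float]) -> Optional[int]:
    """First time step at which skimmed consumption reaches whole consumption."""
    for i, (w, s) in enumerate(zip(whole, skim)):
        if s >= w:
            return i
    return None


def bi_objective(sim_whole: Sequence[float], sim_skim: Sequence[float],
                 obs_whole: Sequence[float], obs_skim: Sequence[float]
                 ) -> tuple[float, float]:
    """Return (crossover timing error, final skimmed value error), both to minimise."""
    sim_x = crossover_index(sim_whole, sim_skim)
    obs_x = crossover_index(obs_whole, obs_skim)
    if sim_x is None or obs_x is None:
        # Assumption: a curve pair that never crosses is penalised maximally.
        crossover_error = float("inf")
    else:
        crossover_error = float(abs(sim_x - obs_x))
    final_error = abs(sim_skim[-1] - obs_skim[-1])
    return crossover_error, final_error
```

Because the second objective returns two values, an optimiser treats it as a bi-objective problem, which is why several non-dominated parameter sets can emerge from calibration.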

Figure : Optimisation results of parameter sets from the evolutionary algorithm to calibrate model and observed output. Results are given for each model variant, blue dots for the probability approach and pink dots for the threshold approach, showing the mean parameter value and a measure of the spread across different parameter sets ( % CI). The space that parameters could occupy is given by the grey bars.

Simulation experiment
Model comparison and uncertainty analysis
The threshold approach on average has a higher root mean squared error ( and ml/week) than the probability approach ( and ml/week), but over a narrower range (interquartile range of and ml/week vs and ml/week).
Figure : Repeat model runs from the optimised parameter sets against observed data for each model variant considered.The colour pink gives the threshold approach for whole milk, blue shows the threshold approach for skimmed, gold gives the probability approach for whole milk, green shows the probability approach for skimmed.

To compare the simulated and empirical survey data (from the UK's Family Food survey; DEFRA ), the dynamics of the latter are briefly detailed. Very little skimmed (and semi-skimmed) milk consumption occurred until , when a period of increased consumption began that lasted for the next years, surpassing whole milk in and peaking in at . litres per person per week before plateauing. The simulated data was in general able to replicate the empirical data, with the probability approach performing better (lower RMSE). However, there are some differences. Focusing on the 'best' selected skimmed milk simulated model runs (Figure c), the threshold approach output (blue line) had no period of limited or gradual consumption increase, instead rising rapidly from initialisation ( ). The output for the probability approach (green) more closely resembled the initial period of low consumption, but once again entered a phase of rapid increase several time steps (years) before the empirical curve. The crossover with whole milk, although in good alignment in terms of milk quantity, occurred years earlier ( ) for the probability approach, and year later ( ) for the threshold approach. The probability approach output (green) was able to closely match the peak and plateau dynamics, whereas the output of the threshold approach (blue) was consistently less than that of the empirical data.

Sensitivity analysis
Figure shows temporal ( -year time step) variance-based global sensitivity analysis for each model variant. Left-hand figures represent the threshold model and right-hand figures the probability model. Overall, 'Initial perception of environmental component of alternative' had the highest mean (over the temporal interval) sensitivity ( . ) of parameters in the threshold model, and 'Initial habit of incumbent' the lowest ( .). Sensitivity was overall higher in the probability model, with fairly equal mean parameter sensitivity ranging from . to . . However, single time-step sensitivity was more variable, with 'Social blindness' having the highest sensitivity ( .) and 'No. of neighbours' the lowest ( .).

Discussion
This agent-based model aimed to give a plausible representation of historical UK milk consumption trends to explore choice substitution from an incumbent (whole milk) to an alternative (skimmed/semi-skimmed milk). We ground the model in theory from social psychology, behavioural economics and environmental psychology, and use empirical evidence (human values survey data and public concerns data) to generate a richer representation of possible agent food influences and decision-making. UK historical milk consumption was chosen as its trend exhibits a classic substitution and adoption curve between two product choices. Overall, the model succeeded in reproducing the consumption pattern observed during the years 1974-2005.
Comparing models of human decision-making has been described as a grand challenge for ABMs (An et al. ). We compared two model variants, changing the agent disposition mechanism while keeping the rest of the model unchanged, to give a threshold-based approach and a probability-based approach. The probability-based approach performed better than the threshold variant, characterised by a lower average root mean squared error against observed data. This could stem from the continuous versus discrete nature of each approach in deciding if an agent will become disposed to consider alternative choices, i.e. a less binary choice trigger mechanism could be more reflective of real-world decision-making.

Within global variance-based sensitivity analysis, most studies do not look at temporal dynamics, instead performing a snapshot of the final outcome that does not consider variability in parameter sensitivity over time (Ligmann-Zielinska et al. ). The temporal global variance-based sensitivity analysis that we conducted indicated that the probability-based approach showed greater output variance than the threshold approach. This is supported by the larger spread of model outputs given in Figure . The reason for this could be that a less deterministic agent choice disposition (the threshold approach does still allow for some randomness) feeds through the model structure and results in a larger variance of model outcome. Across parameters, the health perception of alternatives was the dominant source of variance in the threshold model. In this model, the environmental perception parameter had a small contribution to model uncertainty. This could be explained by the stark difference in nutrition (principally saturated fat) of the two milk options, but only the slight difference in environmental impacts (see Table ). If plant-based alternatives were considered as a third option in an extended model, it is plausible that this component would have a larger effect. No single parameter dominated in the probability approach, with variance instead more evenly distributed.

The model was able to reproduce, with quantified uncertainty, the observed pattern of historical milk consumption. However, the model curves differ from observed data in their rate of change. Both model variants produce a pair of curves that change more steeply than the real-world data. This could be due to a number of reasons, including the exclusion or inclusion of choice influences, and the effect size of fixed values in the model. The latter point was partly explored through analysis of k, the gradient of the probability function; however, a more complete and rigorous robustness analysis should be conducted in future model iterations. Second, the modelling choices and structure reflect just one of a great many different model components and possible combinations.
Although we explore this through different model variants, it is but one plausible representation of observed phenomena. Indeed, the question of model structure and model discovery is the central line of investigation of an emerging field within the ABM literature known as inverse generative social science (Vu et al. ).
Finally, the model did not look to predict agents' absolute consumption of milk; rather, it allowed for the relative split under a total average weekly milk consumption, e.g., from %/ % whole milk and %/ % skimmed (including semi) and every combination in between. The reason for this was that although skimmed and whole milk show a classic substitution curve, it is against a backdrop of declining overall consumption. The model does not aim to account for the factors involved in this decline, only the relative consumption between milk types. Neither did it allow for non-milk consumers in the population (but agents could choose to consume no whole or skimmed milk). We justify this modelling choice based on market research for the Agricultural and Horticultural Development Board that found % of UK households purchase liquid milk (AHDB ).
The study presented here serves as a foundation to extend the model. In particular, future work could give a forward-looking study of milk consumption with scenario analysis to investigate consumption dynamics under changes to milk choice influences. Future work should also include plant-based alternatives to understand their possible adoption trajectories, and the policy interventions that may influence this.

Conclusion
This paper described an agent-based model of food choice substitution, applied in the context of UK milk consumption. The model was structured around the perception of alternatives and the disposition to consider them, modulated by habit, social influence and choice evaluation. The model was informed empirically, and grounded in theories of social psychology, behaviour change, and consumer food acceptance. Simulation experiments were conducted to test the capability of the model to represent observed phenomena, the robustness of which was tested by comparing the performance of two model variants. Both model variants reproduced the general pattern of the observed data; however, the probability-based disposition approach performed better (smaller average error) than the threshold disposition approach.
We employed temporal global variance-based sensitivity analysis, an improvement on final snapshot global sensitivity analysis, to understand how the variance associated with model parameters changed over the simulation. In both model variants, interactions between parameters, rather than the parameters themselves, were the source of most variance. This is an expected feature of agent-based models; however, temporal analysis showed that the threshold approach converged toward a lower variance, whereas the probability-based approach remained at a higher and less stable variance.
This study is the first to present an ABM of food choice substitution in the context of UK milk consumption. The model takes a novel approach to the inclusion of basic human values and, to our knowledge, is the first ABM to incorporate values survey data and use it to inform agent decision-making on food choice. Analytically, this study contributes to widening the limited literature base of ABMs that conduct temporal sensitivity analysis.
Testing different human decision-making models in the context of real-world phenomena is an important area of investigation that can push the frontiers of agent-based modelling and research. This study contributes to this evidence base by exploring small differences in decision-making, and provides a basis for further testing of more detailed decision-making models. Finally, the results of this experimental analysis and simulation provide evidence for modelling choices going forward. Specifically, the probability-based approach performed better in the experiment, and this mechanism will inform the design choices for future modelling work that looks to incorporate plant-based milk alternatives.

Table : Model parameter descriptions.

| Parameter | Group | Description |
|---|---|---|
| Memory length | Cognitive perception | The size of an agent's memory, from which it can recall previous information. Cognitive perception is based on averaging values in the memory. |
| Initial perception of health component of alternative | Cognitive perception | The perception of the health impact of skimmed (inc. semi) milk given as a relative score against whole milk. |
| Initial perception of environmental component of alternative | Cognitive perception | The perception of the environmental impact of skimmed (inc. semi) milk given as a relative score against whole milk. |
| Habit threshold | Habit | The number of consecutive choices that return the same majority milk type consumption needed before the effects of habit take place. |
| Initial habit of incumbent | Habit | The initial number of consecutive choices that have returned the same majority milk type. |
| Probability of interacting | Social influence | The probability of an agent interacting (exchanging information on milk choice function scores) with other agents in its network. |
| Social susceptibility | Social influence | The susceptibility of an agent to modify its milk choice function scores based on information it receives from neighbours. |
| Social conformity | Social influence | The degree to which an agent will seek to conform its weighting between cognitive milk components (health, environment) to more closely reflect the general public concern for each of these issues. |
| No. of neighbours | Social influence | The number of neighbours in an agent's network. |
| Gradient probability disposition (probability model variant only) | Social influence | The slope of the function that determines how quickly the probability of being disposed to consider choice of milk changes as a function of the informational entropy of milk choices in an agent's neighbour network. |
| Social blindness | Evaluation | The probability that an agent has the ability to perceive the impact of its choice and therefore the option of evaluating it. |
| Post-choice justification | Evaluation | The threshold beyond which an agent will simply justify the discrepancy between its values and behaviour (milk choice impacts), rather than act to resolve it. |
| | Evaluation | The threshold below which any discrepancy between an agent's values and its behaviour (milk choice impacts) will not trigger a state of cognitive dissonance. |

The gradient (k) in Equation governs the rate of change of the logistic function. Physically, k represents the rate at which the set of an agent's neighbours' choices (expressed as information entropy) affects the probability that the agent will become disposed to consider alternatives (Equation ). To narrow the parameter space for calibration, we explored values of k from to and averaged the resulting model behaviour over a number (n = 520) of different runs. Figure shows that an increasing k flattens the curve pair and delays the point at which the curves undergo their rapid increase or decrease. This shows that the choice of gradient is significant, with the model's macro behaviour diverging from the desired trend as k tended toward higher values. From this analysis, we bound k between and as an allowable range for the calibration exercise.
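As an illustration of how a logistic disposition function of this kind behaves, the following is a minimal Python sketch. The equation itself is not reproduced in this section, so the functional form, the `midpoint` argument and all function names are assumptions made for illustration, not the model's implementation.

```python
import math
import random
from collections import Counter


def choice_entropy(neighbour_choices):
    """Shannon entropy (in bits) of the milk choices seen among neighbours."""
    counts = Counter(neighbour_choices)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def disposition_probability(entropy, k, midpoint=0.5):
    """Logistic probability of becoming disposed to consider alternatives.

    k is the gradient of the logistic curve; midpoint is a hypothetical
    entropy value at which the probability equals 0.5.
    """
    return 1.0 / (1.0 + math.exp(-k * (entropy - midpoint)))


def is_disposed(neighbour_choices, k, rng, midpoint=0.5):
    """Bernoulli draw: True if the agent becomes disposed this time step."""
    p = disposition_probability(choice_entropy(neighbour_choices), k, midpoint)
    return rng.random() < p


if __name__ == "__main__":
    rng = random.Random(1)
    neighbours = ["whole", "whole", "skimmed", "whole"]
    for k in (2, 10, 50):
        p = disposition_probability(choice_entropy(neighbours), k)
        print(k, round(p, 3), is_disposed(neighbours, k, rng))
```

With two milk types the entropy lies between 0 (all neighbours agree) and 1 bit (an even split), so under these assumptions a larger k makes the probability switch more sharply around the midpoint.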
Figure in the Appendix represents the model flow schematically. The choice function is made up of two cognitive components: the health and environmental impact of the two milk types.

Calibration and model output verification
Calibration was performed on the whole model across all parameters over the bounded range of values (see Table for ranges). The empirical pattern to be matched by the model was a set of two longitudinal curves of trends in the average weekly consumption of whole milk and skimmed (including semi) milk per person by the UK public from 1974 to 2005 (DEFRA ). The model looked to replicate the observed substitution of one food product with the adoption of an alternative. Therefore, we chose the fitting criteria to be the value of average skimmed milk consumption when the curves cross over ( - see Figure for crossover) and the final value of skimmed milk consumption in 2005. The optimisation looked to minimise the difference between the model and observed output at these time steps. The strategy used was a bi-objective evolutionary algorithm (EA) implemented in Python (van Rossum & Drake

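The two fitting criteria described above (the skimmed milk value at the crossover and at the final time step) can be sketched as a pair of objective functions. This is a hedged illustration only: the function names are hypothetical, the use of absolute differences is an assumption, and the series in the usage example are toy numbers rather than DEFRA data.

```python
def crossover_index(whole, skimmed):
    """Index of the first time step at which skimmed meets or exceeds whole."""
    for t, (w, s) in enumerate(zip(whole, skimmed)):
        if s >= w:
            return t
    return None


def calibration_objectives(model_skimmed, observed_whole, observed_skimmed):
    """Return the two errors a bi-objective optimiser would minimise.

    Objective 1: error in skimmed consumption at the observed crossover.
    Objective 2: error in skimmed consumption at the final time step.
    """
    t_cross = crossover_index(observed_whole, observed_skimmed)
    if t_cross is None:
        raise ValueError("observed curves never cross")
    err_cross = abs(model_skimmed[t_cross] - observed_skimmed[t_cross])
    err_final = abs(model_skimmed[-1] - observed_skimmed[-1])
    return err_cross, err_final


if __name__ == "__main__":
    # Toy series for illustration only (not observed consumption values).
    observed_whole = [4.0, 3.0, 2.0, 1.0]
    observed_skimmed = [0.0, 1.0, 2.0, 3.0]
    model_skimmed = [0.0, 1.5, 2.5, 2.0]
    print(calibration_objectives(model_skimmed, observed_whole, observed_skimmed))
```

Keeping the two errors separate, rather than summing them, is what lets an evolutionary algorithm trade one off against the other along a Pareto front.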
The calibration exercise produced optimisation results of parameter sets for the probability-based model variant, and sets for the threshold-based variant. Figure shows these results, comparing the mean and spread ( % CI) of the parameter values. Blue dots and ranges represent the probability-based variant; pink dots and ranges represent the threshold model variant. Parameters are grouped under their respective meta-parameters. A change in the optimised values of the parameter set occurs with the introduction of a th parameter, 'gradient probability disposition', in the probability-based approach.
Figure shows the comparison between outputs of the threshold and probability-based disposition models. Calibration was performed on each model variation with candidate parameter sets, n= and n= respectively, sampled (Saltelli) over a ±2% range of optimised values. Figure a and Figure b show model output against observed milk data for the threshold and probability-based disposition approaches respectively. Figure c combines the 'best' set of runs from each approach, defined here as the closest match to observed data by way of smallest error (root mean squared error). Figure d shows this error, calculated from Figure c.

Figure : Repeat model runs from the optimised parameter sets against observed data for each model variant considered. Pink gives the threshold approach for whole milk, blue the threshold approach for skimmed, gold the probability approach for whole milk, and green the probability approach for skimmed. Figure a shows milk substitution pairs for the threshold-based agent disposition approach and b shows the same for the probability-based approach. Figure a contains pairs and b pairs, and each of these model runs was repeated times. Figure c compares the best runs from a and b against each other and the observed milk data. Figure d quantifies this performance, giving the root mean square error for each approach.
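The error measure used to rank runs, root mean squared error against the observed series, can be sketched as follows. The helper names are hypothetical, and averaging the error over the whole and skimmed curves is an assumption about how a single score per run might be formed.

```python
import math


def rmse(model, observed):
    """Root mean squared error between two equal-length series."""
    if len(model) != len(observed) or not model:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(m - o) ** 2 for m, o in zip(model, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mean_rmse(model_curves, observed_curves):
    """Average RMSE over several output curves (e.g. whole and skimmed)."""
    errors = [rmse(m, o) for m, o in zip(model_curves, observed_curves)]
    return sum(errors) / len(errors)


def best_run(runs, observed_curves):
    """Return the run (a list of curves) with the smallest mean RMSE."""
    return min(runs, key=lambda curves: mean_rmse(curves, observed_curves))
```

A run here is a list of curves in the same order as the observed curves, so `best_run` simply selects the candidate whose curves sit closest to the data on average.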

Figures a and b show the total variance at each time-step split by parameter variance and the variance due to parameter interactions. In the threshold model (Figure a), the cumulative variance (across the entire time interval) was % due to parameters, and % due to interactions between parameters. In the probability model (Figure b) this split was even more marked at % to %. Figures c and d show this same output, but now split by parameter group and their interactions. The threshold disposition model shows a downward trend toward a stable output made up of variance overwhelmingly due to cognitive perception parameters. Output for the probability disposition model does not show this same trend, and does not converge over the analysis time frame. Figures e and f give the variance composition as a relative measure and show the increasing relative sensitivity dominance of cognitive perception parameters in the threshold model, and a reasonably stable variance attribution to the four parameter groupings in the probability-based approach.
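To make the split between parameter variance and interaction variance concrete, the sketch below estimates first-order Sobol indices at each time step of a toy model and attributes the remainder to interactions. It is a simplified illustration under stated assumptions: `toy_model` is a hypothetical two-parameter stand-in for the ABM, the estimator is the standard Saltelli-style first-order estimator written in plain Python, and none of this reproduces the analysis code used in the study.

```python
import random


def toy_model(x, n_steps):
    """Hypothetical stand-in for the ABM: one output per time step.

    The interaction term between x1 and x2 grows with time t.
    """
    x1, x2 = x
    return [x1 + t * 4.0 * (x1 - 0.5) * (x2 - 0.5) for t in range(n_steps)]


def temporal_first_order(model, n_params, n_steps, n_samples=5000, seed=0):
    """First-order Sobol indices and interaction share at each time step."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    yA = [model(a, n_steps) for a in A]
    yB = [model(b, n_steps) for b in B]
    # yAB[i][j]: output for row j of A with parameter i taken from B.
    yAB = [
        [model(a[:i] + [b[i]] + a[i + 1:], n_steps) for a, b in zip(A, B)]
        for i in range(n_params)
    ]
    results = []
    for t in range(n_steps):
        pooled = [y[t] for y in yA] + [y[t] for y in yB]
        mean = sum(pooled) / len(pooled)
        var = sum((v - mean) ** 2 for v in pooled) / len(pooled)
        first = []
        for i in range(n_params):
            num = sum(
                yB[j][t] * (yAB[i][j][t] - yA[j][t]) for j in range(n_samples)
            ) / n_samples
            first.append(num / var)
        # Variance not explained by single parameters is due to interactions.
        results.append({"first_order": first, "interactions": 1.0 - sum(first)})
    return results


if __name__ == "__main__":
    for t, row in enumerate(temporal_first_order(toy_model, 2, 3)):
        print(t, [round(s, 2) for s in row["first_order"]],
              round(row["interactions"], 2))
```

In this toy case the output at the first time step depends on one parameter only, so its first-order index is close to one and the interaction share is close to zero; as the interaction term grows over time the interaction share rises, which is the kind of pattern a final-snapshot analysis would miss.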

Figure : Temporal ( -year intervals) global sensitivity analysis applied to the skimmed (including semi) milk consumption output of the best set of model runs (Figure c) for each approach. Figures a and b show variance due to the parameters and interactions between parameters. Figures c and d show this same trace but split by parameter group. Figures e and f give this on a proportional rather than absolute basis for each model approach.

Figure : Flow chart of the agent-based model of milk choice. The dashed box contains details of the two approaches tested that model how an agent becomes activated to consider alternative choices. The orange box details the threshold approach, and the blue box details the probability-based approach.

Figure : The effect of changing parameter k, the gradient of the probability disposition function, on the model output. The figure shows the whole milk-skimmed milk curve pairs for values of k from to , against the observed data (black).
Table gives the model parameters that operate within these functions. A more detailed description is given in Table in the Appendix.
containing the empirically informed (via Ipsos Mori and YouGov long-run surveys) concerns of the UK public regarding health and the environment. We use this to represent the population-level observed social norms that agents perceive and seek to align with to a greater or lesser degree. See Table in the Appendix for details of the survey questions.

Table : Questions from survey data used in the model.

This question relates to the 'security' value which contains the health dimension. Note, the expanded item PVQ includes a direct question on health, 'She/he tries hard to avoid getting sick. Staying healthy is very important to her/him', but in the absence of this data in the ESS, we opt for the most relevant security value question.

She/he likes surprises and is always looking for new things to do. She/he thinks it is important to do lots of different things in life. This question relates to the 'stimulation' value which forms part of Schwartz's openness to change higher order value dimension. UK responses to this question inform the level at which agents would consider alternative milk choice in the threshold-based model approach.