Pitfalls in Spatial Modelling of Ethnocentrism: A Simulation Analysis of the Model of Hammond and Axelrod

There is an increasing interest in using search heuristics to test and find solutions to problems in the place of analytical solutions and deterministic algorithms, probably due to the accessibility and ease of studying emergent phenomena in complex models. However, what we gain in tractable complexity, we lose in clarity. This paper is a case study of an agent-based model by Hammond and Axelrod studying the evolution of tag-based cooperation ("ethnocentrism"), with a focus on the consequences of assuming a lattice structure in which agents are located and strict spatial rules for how they reproduce. It is shown that the spatial structure is what drives the results, rather than the phenomenon that is under study. These findings illustrate the importance of scrutinising the assumptions and verifying the robustness of each assumption in the algorithm when we move away from analytical tractability and use search heuristics rather than deterministic algorithms. Specifically, they illustrate how spatial assumptions can alter a model to the extent that it no longer describes the phenomenon under study. Given that the kind of lattice structure in the case study is common in agent-based evolutionary algorithms, the problems highlighted here may have far-reaching consequences. This paper suggests methods for verifying the validity of such algorithms.


Introduction
Modelling is a powerful tool to understand and identify mechanisms in the world around us when we need to find general rules or when we cannot simply measure the object under study using experiments and available data. With limited computational resources at hand, scientific models have typically been analytically tractable, and they have been particularly important in the natural sciences, above all in physics.
However, with the advent of electronic computers and the growing field of computer science, it is now possible to study complex systems, and computer modelling is finding its way also into disciplines such as biology and the social sciences. Through simulations, we can find outcomes of increasingly complex modelling assumptions and make predictions without understanding exactly what is going on in the model. When it comes to making predictions and forecasting events, such as weather reports and climate studies, computer simulations are major pillars.
While models should be evaluated on their ability to predict what we actually observe in the world, forecasting is not their only purpose; sometimes the purpose is instead to understand the mechanisms that lead to certain outcomes. Simple, analytical models are usually well-designed for this, but computer modelling enables us to find more intricate patterns, such as emergent systems where a population of individuals following simple rules together form a complex system that is not only an aggregation of frequencies of individual behaviours (Miller and Page, 2007; Stolk, 2005; Wilensky, 2002).
The focus here will be on agent-based models (ABMs), which work as evolutionary algorithms. These are based on the same mechanisms as biological evolution: heredity, variation and selection (Mitchell, 1976). An execution of an evolutionary algorithm typically starts with agents that can reproduce, whose offspring inherit the traits of their parent(s) subject to mutations, and the reproductive success of agents is related to their strategies in a changing population. The solutions are the strategies that are most successful in the long run, and these are thus (at least local) optima under the selection pressure.
Evolutionary algorithms have a vast number of applications across disciplines, for example in solving optimisation problems. But apart from finding strategies that lead to outcomes that are optimal, the aim can be to recreate an outcome that corresponds to some real-world phenomenon, and to find under what conditions we would arrive at that equilibrium. Such evolutionary algorithms are commonly referred to as ABMs. The purpose of these is to gain a better understanding of certain issues, rather than to solve technical problems (see for example Axelrod's tournament, Axelrod, 1987).
A problem with evolutionary algorithms is that the performance of the strategies depends heavily on their environment (Mitchell, 1976). High-scoring strategies may be good at exploiting weaknesses of the other strategies available, but they would not necessarily score well in a different environment. This becomes especially important when drawing conclusions about the world. How generalisable are the results? Are the assumptions sound? How does each assumption affect the result?
In this paper, we will see how seemingly innocent routine assumptions can drive the results almost completely, such that the outcome relates little to the phenomenon under study and all the more to very specific assumptions about the environment of the agents, assumptions that lack generalisability. This illustrates the importance of scrutinising the assumptions and verifying the robustness of the algorithms as a substitute for the full control we abandon when we move away from analytical tractability and use search heuristics rather than deterministic algorithms.
This paper is a case study of a well-cited ABM by Hammond and Axelrod (2006b) that uses computer simulations to address the issue of how preferential treatment towards people of the same group or with similar characteristics as oneself (sometimes referred to as ethnocentrism) may have evolved. There is a wealth of models addressing this issue, and a common trait among them is to model it through tag-mediated populations: agents have a number (a double or an integer) associated with them and base their behaviour on absolute distance or equality. This is a field of research that contributes to the debate on human co-operation, but is also of interest for improving the problem-solving performance of genetic programming systems (Spector and Klein, 2007; Spector et al., 2011).
The founding model in this tradition of tag-based genetic algorithms may have been a well-cited model by Riolo et al. (2001), where agents have a visible tag on a continuum and cooperate with sufficiently similar others. The paper in question draws bold conclusions on the evolution of co-operation, but may suffer from similar shortcomings as the model that will be analysed in detail here. In short, the number of offspring is determined by the success of the interactions, and offspring inherit tag and tolerance level, subject to mutations. The result is that co-operation is maintained within small tolerance levels, but as tolerance levels increase due to drift, mutants with lower tolerance levels invade and form new cooperative clusters consisting of their offspring. Thus, in this model, and typical for models in its wake, preferential treatment based on the tag is successful if and only if it correlates highly with relatedness, with signals being but proxies for kin recognition, something which the authors claim to go beyond. Another restriction in this model is that co-operation relies on the fact that agents are not given the possibility of cooperating with no one (Roberts and Sherratt, 2002).
It is common in evolutionary algorithms that the individuals reside in some spatial setting, such as a lattice or a chain-like network, and interact only locally, with their neighbours. This paper is a case study of one such ABM where the agents live on a lattice, a structure that is widely used in agent-based evolutionary algorithms (Sarker and Ray, 2010), and thus of particular importance. The aim here is to suggest methods for scrutinising and validating assumptions for such a model and to illustrate the importance of such procedures by showing that the results in this model are mainly driven by spatial assumptions rather than the strategic interactions it was designed to investigate.

An ABM of 'ethnocentrism'
In order to understand the research questions addressed by Hammond and Axelrod (2006b) and others, and how to evaluate the model to be analysed here, we will start with a short review of the research on tag-based cooperation, or ethnocentrism.
Human beings are adapted to living in groups. Ethnocentrism refers to the tendency to behave differently towards people depending on whether they belong to the same group, the ingroup, or some other group, the outgroup. Ethnocentrism is sometimes referred to as an ingroup bias (Brewer, 1999), defined as beneficial behaviour towards the ingroup but differentiating it from previous definitions by not necessarily including outgroup hostility (Sumner, 1906; Levine and Campbell, 1972). The two terms will here be used interchangeably. There is vast evidence for an ingroup bias, both from field studies and laboratory experiments (Brewer and Campbell, 1976; Kramer and Brewer, 1984; Yamagishi and Mifune, 2009), but it is not clearly understood why people are sometimes willing to take on a short-term cost based on group tags. Within small groups, apparently altruistic behaviour can often be explained by kin selection (Hamilton, 1964) or reciprocity (Trivers, 1971). However, people cooperate in groups large enough to expand beyond interactions with relatives and people they know.
In-group bias can be triggered by minimal cues from arbitrary group definitions (Tajfel et al., 1971; Doise et al., 1972; Ahmed, 2007). There is evidence that the ingroup bias works on an implicit level (Otten and Wentura, 1999) and the bias seems to be regulated by the hormone oxytocin, suggesting deep biological roots (De Dreu et al., 2011). The objective for an evolutionary model of ethnocentrism is thus to find minimal conditions for when group discrimination emerges from arbitrary group tags that are not subject to direct reciprocity or kin recognition.
The model that will be investigated here was presented by Hammond and Axelrod (2006b), and aimed at finding minimal conditions for ethnocentrism to emerge. In the model, agents have a group phenotype and one strategy for how to play in a prisoners' dilemma, co-operate or defect, towards members of the ingroup, and one for how to play towards members of the outgroup. That is, agents have a final integer variable and utilise different methods, with consequences for reproduction, depending on whether these variables match when coupled with another agent. Defection is always rational in a one-shot game, while cooperation is socially optimal. An ethnocentric agent is defined as one who cooperates with the ingroup and defects towards the outgroup. Agents populate a toroidal lattice where they communicate only with their immediate neighbours. Agents are born with a tag (group membership) and their strategies, and reproduce onto neighbouring patches according to how well they have performed in previous interactions.
The model is successful in the sense of providing a breeding ground for ethnocentric agents, which come to dominate the population. As was acknowledged already in a previous publication, however, the strict spatial structure is both a necessary and sufficient condition for co-operation to emerge in the model (Hammond and Axelrod, 2006a), and it causes such high degrees of relatedness in agents' neighbourhoods that the model was first launched as an illustration of the "armpit effect" (distinguishing strangers from unfamiliar kin) (Axelrod et al., 2004).
Nevertheless, the model has gained much interest among researchers of both genetic algorithms in general and ethnocentrism in particular, from pure replication studies (Wilensky and Rand, 2007) to further analyses and extensions of the model, and also further developments into broader applications in evolutionary computation (Grappiolo et al., 2013). It is also part of the simulation software NetLogo's Models Library (Wilensky, 2003). There are several simulation studies that have tried to explain the results of the model. It has been observed that humanism (indiscriminate co-operation) is the most successful strategy in the first rounds of the simulations, and that ethnocentrism triumphs over humanism only when the population is dense (Schultz et al., 2008). This was hypothesised to be a product of ethnocentric exploitation of humanitarians rather than free-rider suppression (Schultz et al., 2009), and then hypothesised to the contrary (Kaznatcheev, 2010).
In the words of Kaznatcheev (2010), the model "expand[s] beyond random interactions to facilitate the emergence of co-operation", and the strong effect of the spatial structure found already by Hammond and Axelrod (2006a) should raise concerns on its implications for ethnocentrism. In one study, viscosity is kept at high levels, but the structure is changed into Barabási-Albert networks (and sexual reproduction is added). The results "indicate that the spread of favoritism towards similar others highly depends on the network topology and the associated heterogeneity of the studied population" (Lima et al., 2009). Another simulation study concludes that kin selection may be a driving force in the model (Li, in preparation). Finally, it has been concluded that while group tags maintain co-operation, what creates it "is not the visible group tags of agents, but rather children residing close to their parents" and that spatial viscosity is a mechanism to increase the probability of self-interaction (Kaznatcheev and Shultz, 2011).
It could thus be that the model, through spatial viscosity, creates a setting for kin selection to initially favour co-operation towards similar others, and that this later on generalises into ethnocentrism. We will see, however, that in the model, spatial viscosity and tags are two sides of the same coin: they both function as a means for self-identification.

Overview of the paper
We will start out with a more detailed description of the Hammond-Axelrod model. Then the rest of the paper will scrutinise the model through simulations to provide a fuller understanding of what makes the ethnocentric strategy successful and draw conclusions on whether the ethnocentric strategy in the model has any bearing on the real-world phenomenon we want to describe.
First, the significance of the spatial structure will be examined, by replicating previous findings that the structure in itself generates co-operative interactions, and that with only tags and without spatial structure, both ingroup and outgroup will be defected against. We will also see how large the neighbourhood may be before the ethnocentric strategy fails. Second, it will be examined why ethnocentric agents are more successful in the model than universal cooperators, and how robust the strategy is against misidentifying kin. Third, the significance of the number of tags in the model will be investigated together with the resistance of the ethnocentric strategy against pure kin discriminators. Drawing conclusions on the limited applicability of the spatial model, an alternative approach will be suggested and tested. Finally, the main findings will be summarised and conclusions drawn for the applicability of the model and what can be learnt from it for computer modellers using spatial structures.

Description of the model
Below follows a description of the Hammond-Axelrod model (2006b) with standard parameter settings. For a more formal description, see the appendix of their paper.
Agents populate a toroidal lattice consisting of 50×50 patches, each with room for at most one individual. The lattice is empty in the beginning, becoming populated through immigration and reproduction as described below. Each agent has one out of four visible tags, one strategy for how to play towards an ingroup member (exhibiting the same tag), co-operate or defect, and one strategy towards outgroup members (that is, agents have a final integer variable each that is compared). The simulation runs for 2,000 rounds and the outline of a round in the simulation is the following:
Immigration. An agent with random traits enters the population. The potential to reproduce (PTR) of all agents is reset to 0.12.
Interaction. Each agent interacts once with each of its immediate horizontal and vertical neighbours in a prisoners' dilemma. The agent observes the tag of the partner and chooses a strategy accordingly. A co-operating agent reduces its PTR by 0.01 and increases that of its partner by 0.03.
Reproduction. Each agent is chosen in a random order and reproduces one offspring with the probability of their PTR onto an empty neighbouring patch, if there is one. The offspring inherits the traits of its parent, but may change tag, or any of the tag-based strategies, with a mutation probability of 0.005 per trait. An agent surrounded by populated patches cannot reproduce at that round.
Death. Each agent has a probability of 0.1 to die.
Simulations will also be run for a version of the model with random interactions (without spatial structure). The carrying capacity of the population will be kept at 50×50 = 2,500, implemented by multiplying the PTR of all agents by one minus the density (population size divided by 2,500). The more crowded the population, the less likely it is for each agent to reproduce.
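The four phases of a round can be sketched in a few lines of code. The following is a minimal Python illustration, not the Java implementation used for the simulations reported here. The dictionary-based agent representation and all function names are hypothetical, and three details are assumptions that the description above leaves open: the immigrant lands on a random empty patch, a mutated tag is drawn from the other tags, and a mutated strategy flips.

```python
import random

SIZE = 50                    # lattice side: 50 x 50 patches
N_TAGS = 4
BASE_PTR = 0.12              # potential to reproduce, reset every round
COST, BENEFIT = 0.01, 0.03   # PTR change for donor and recipient
MUTATION = 0.005             # per-trait mutation probability
DEATH = 0.1

def random_agent(rng):
    # Traits: tag, co-operate with ingroup?, co-operate with outgroup?
    return {"tag": rng.randrange(N_TAGS), "in": rng.random() < 0.5,
            "out": rng.random() < 0.5, "ptr": BASE_PTR}

def neighbours(x, y):
    # Four immediate neighbours on a torus: coordinates wrap around the edges.
    return [((x + 1) % SIZE, y), ((x - 1) % SIZE, y),
            (x, (y + 1) % SIZE), (x, (y - 1) % SIZE)]

def offspring(parent, rng):
    child = dict(parent, ptr=BASE_PTR)
    if rng.random() < MUTATION:      # assumption: a mutated tag is a different tag
        child["tag"] = rng.choice([t for t in range(N_TAGS) if t != parent["tag"]])
    for trait in ("in", "out"):
        if rng.random() < MUTATION:  # assumption: a mutated strategy flips
            child[trait] = not child[trait]
    return child

def density_scaled_ptr(ptr, population, capacity=SIZE * SIZE):
    # Non-spatial variant only: PTR is multiplied by one minus the density.
    return ptr * (1.0 - population / capacity)

def run_round(grid, rng):
    # grid maps (x, y) -> agent; a missing key is an empty patch.
    # 1. Immigration (assumption: onto a random empty patch), then PTR reset.
    empty = [(x, y) for x in range(SIZE) for y in range(SIZE) if (x, y) not in grid]
    if empty:
        grid[rng.choice(empty)] = random_agent(rng)
    for agent in grid.values():
        agent["ptr"] = BASE_PTR
    # 2. Interaction: each agent decides once per neighbour whether to donate.
    for pos, agent in grid.items():
        for npos in neighbours(*pos):
            other = grid.get(npos)
            if other is None:
                continue
            helps = agent["in"] if other["tag"] == agent["tag"] else agent["out"]
            if helps:
                agent["ptr"] -= COST
                other["ptr"] += BENEFIT
    # 3. Reproduction in random order onto an empty neighbouring patch.
    order = list(grid)
    rng.shuffle(order)
    for pos in order:
        free = [n for n in neighbours(*pos) if n not in grid]
        if free and rng.random() < grid[pos]["ptr"]:
            grid[rng.choice(free)] = offspring(grid[pos], rng)
    # 4. Death: every agent dies with probability DEATH.
    for pos in [p for p in grid if rng.random() < DEATH]:
        del grid[pos]
```

A full run under these assumptions would start from an empty lattice, `grid = {}`, and call `run_round(grid, random.Random(0))` 2,000 times with a single shared generator. Storing only occupied patches keeps the empty-neighbour test a dictionary lookup.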
The original simulations were run in NetLogo. As the advantages of using NetLogo, such as visualisation of spatial dynamics, will not be used here, all simulations will be programmed directly in Java for full flexibility. Replicating in a different language also serves to test the robustness of the implementation with respect to the conceptual model (as discussed in Wilensky, 2003). An important reason for choosing Java is that object-oriented programming languages are particularly apt for agent-based modelling, since the execution involves heterogeneous individuals rather than performing population-based computations, and the agents have a direct correspondence in software objects, with a general blueprint described by the class.
Results will be averaged over 100 runs, and frequencies of strategies will be represented by mean and standard error, as in the original study (Hammond and Axelrod, 2006b). Distributions of frequencies of strategies over all runs are slightly skewed due to boundary effects for rare and highly dominant strategies, but the differences between the mean and the median are small, usually less than one percentage point.
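As a small illustration of how such summary figures are obtained, the sketch below computes the mean and the standard error of the mean for a list of per-run strategy shares. The function name and the sample values are hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the text does not state which estimator is used.

```python
import math
import statistics

def mean_and_standard_error(shares):
    """Mean and standard error of the mean for per-run strategy shares."""
    n = len(shares)
    mean = statistics.mean(shares)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    standard_error = statistics.stdev(shares) / math.sqrt(n)
    return mean, standard_error

# Hypothetical shares (in per cent) from four runs, for illustration only.
mean, se = mean_and_standard_error([74.0, 77.5, 79.0, 76.5])
print(f"{mean:.1f} ± {se:.1f}")
```

Comparing `statistics.median(shares)` with the mean gives the skewness check mentioned above.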

Spatial structure
Agents populate a toroidal lattice and interact only with their four immediate neighbours. Admittedly, this setting bears little resemblance to social networks in the real world. Nor is the model (or minimal models in general) constructed to imitate the world, but rather to find minimal assumptions that can lead to observed phenomena. Is the assumption of the spatial structure credible, then, or is it an oversimplification creating the results? We will first see what impact the assumptions of spatial viscosity have on the results and then examine how generalisable they are to resemble actual social networks.

The lattice structure with neighbouring offspring induces co-operation
This section will replicate previous findings that the lattice structure causes co-operation to dominate, while only tags do not (Hammond and Axelrod, 2006a). In a null model, where all agents are equally likely to be selected as interaction partners and there are no tags, the share of co-operative agents averages 12% (±0.3 standard error). This is higher than in the previous simulation, but with four interactions per agent and round instead of one (four being the maximum number of neighbours in the spatial model), the share drops to 3.4%, comparable to the less than 5% found previously. Adding a lattice structure with neighbouring offspring increases the share of co-operators to 80% (80.4±0.3).
With random interactions and four tags, ingroup biased agents reach 23% (23.0±0.8). With four interactions per round, which reduces random drift and increases the selection pressure, the number drops to 10% (9.7±0.5) (see Table 1). Thus, the lattice structure changes the strategic structure of the interactions such that defection is no longer the successful strategy. With random interactions, co-operating with ingroup members is an exception, and an ingroup biased agent has a larger share than other (contingent) co-operators only because it co-operates with fewer agents. With only two tags, when in- and outgroup biased agents co-operate with the same number of agents, they are equally successful. In the spatial model, co-operation gives better payoff than defection and ingroup biased agents out-compete indiscriminate co-operators. The lattice structure is both necessary and sufficient for co-operation to dominate.
Is it then reciprocity from repeated interactions with the same agents or interactions with kin due to neighbouring offspring that induces the success of the ingroup bias? A previous study showed that it is the latter case. When offspring is located at a random patch on the lattice, the results are similar to the null model (Kaznatcheev and Shultz, 2011).

Co-operative strategies require a small neighbourhood
The specific assumptions of local interactions and neighbouring offspring changed the conditions from favouring defectors to favouring co-operators, and contingent co-operators if discrimination was possible. How specific need these assumptions be? Social networks generally comprise more than a few highly related individuals. How many neighbours are allowed before defectors take over and does the spatial structure need to take the form of a lattice? The question will be examined by varying the number of neighbours in the lattice structure, changing structure into a small-world network, and finally by looking into assorted interactions without an underlying structure.

Varying the number of neighbours
Varying the span of the neighbourhood shows that ingroup biased agents perform best with six neighbours (77%±0.5%) and that they tie with defectors at sixteen (48%±1%) (see Figure 1). With only two neighbours, all strategies are almost equally successful. The model is not restricted to exactly four neighbours, but still to an interval from four to fourteen, which is small for ethnocentrism to be expected to be at work.

Small-world networks
While a lattice structure is clearly not reminiscent of social networks, small-world networks (Watts and Strogatz, 1998) have been influential in theory and applications. In its regular form, such a network is a regular ring lattice, where agents populate a circle and are connected to their closest neighbours on the circle. It resembles the toroidal lattice structure, but without such artificial restrictions as disabling interactions between grandparents and grandchildren that come from the latter with four neighbours. The regular ring lattice can then be rewired into a small-world network by reconnecting each edge between two connected agents, with a probability p, to another agent chosen at random over the entire circle. The result is a network where the minimum number of nodes that need to be passed between two randomly selected nodes is greatly reduced and where traits spread more easily throughout the network. With p = 1, we obtain a random network. For the specifics on how this is implemented, see the Java documentation and code (Jansson, 2012).
On a regular ring lattice (p = 0) and greatly reconnected networks (p 0.3), ingroup biased agents perform worse than on the toroidal lattice. With values in between, however, they perform slightly better, peaking when there is slight reconnection (somewhere around p = 0.05). Ingroup biased agents are most numerous with four or six neighbours, and with p = 0.05, they constitute the largest type up to twenty neighbours, which thus seems to be the maximum on small-world networks with optimal settings (see

Non-spatial assortment
A fixed network structure with neighbouring offspring captures the phenomenon that interactions may take place more often among peers. However, instead of assuming specific social structures to induce assortment, agents could interact with anyone in the population, but more commonly with those sharing their tag. This has been implemented as a weight: all outgroup members have weight 1 and all ingroup members have weight w. Interaction partners are then selected randomly with a probability according to their weight. If the weight is one less than the number of tags, then ingroup members are selected in half of the interactions on average in the initial stages of the simulation. The results are presented in Table 3 and show that larger weights reduce the share of ingroup biased agents and increase that of outgroup biased agents, except when the number of tags is large. With increasing weights, co-operating only with outgroup members more and more resembles co-operating with no one. With many tags, however, the number of ingroup members is either small or constituted by a cluster of successful kin co-operators, more reminiscent of social networks with few neighbours.
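The weighting scheme can be illustrated with a short Python sketch. The function names are hypothetical and the code is not taken from the Java implementation. It shows why a weight of one less than the number of tags gives an even split: with T equally sized groups there are T - 1 times as many outgroup members as ingroup members, so w = T - 1 balances the two total weights.

```python
import random

def ingroup_probability(n_ingroup, n_outgroup, w):
    """Probability that a weighted draw selects an ingroup partner."""
    return w * n_ingroup / (w * n_ingroup + n_outgroup)

def pick_partner(agent_tag, other_tags, w, rng):
    """Index of a partner drawn with weight w for ingroup, 1 for outgroup."""
    weights = [w if tag == agent_tag else 1 for tag in other_tags]
    return rng.choices(range(len(other_tags)), weights=weights, k=1)[0]

# Four equally sized groups and w = 4 - 1 = 3: half of all draws are ingroup.
print(ingroup_probability(100, 300, 3))   # 0.5
```

As the population evolves and group sizes drift apart, the same weight no longer yields an even split, which is why the text restricts the statement to the initial stages of the simulation.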
Thus, for the model to produce a breeding ground for a co-operating strategy, it needs to make specific assumptions on small neighbourhoods populated by offspring, thus maintaining a large share of kin interactions. More general assumptions of non-spatial assortment are not sufficient. In the next section, we will see what makes tags and ingroup bias a successful combination.

Ethnocentrism vs. altruism
The spatial structure of the model transforms the underlying game such that co-operation is successful instead of defection. Still, when tags and an ability to discriminate based on them are introduced, contingent co-operators will outcompete those who co-operate indiscriminately. This section will provide an explanation as to why this is the case.

Tags are fairly accurate indicators of kinship
Tracking whether neighbouring agents are kin (that is, they have a common descent, in these simulations meaning they have the same immigrant as their ancestor) shows that 71% of the interactions are between kin with the same tag and that in only 12% of cases do tags and kinship not correspond (see Table 4).
If a neighbour belongs to the ingroup, then the probability that the agent is a relative (or, more correctly, a possibly mutated clone of the same immigrant) is 89%. Tags are thus not really "weak and potentially deceptive indicators" of kinship (Axelrod et al., 2004), but rather fairly accurate ones.
[Table caption: ... and ingroup (I), together with probabilities for a relative to belong to the ingroup and for an ingroup member to be related.]
An ingroup biased agent manages to avoid donations to nonkin, with 89% of their donations being to kin (out of which 61% are direct clones: parent or child). But being selective should come with a risk when your choice is based on crude distinctions, namely that of missing potentially fruitful co-operation with kin. However, such is not the case in this model. A neighbouring relative has a 95% probability of having the same tag. Both ethnocentric and altruistic (cooperating with everyone) agents interact mostly with kin, but an ethnocentric agent manages to exclude many non-related individuals without taking the cost of excluding kin.
What happens, then, if the tag actually becomes a deceptive indicator of kinship, such that it not only includes some nonkin, but also excludes some kin? One way to investigate this is to leave the mutation rate of the strategies intact, but increase that of the tag. The result is detrimental to the ingroup bias (see Figure 2). At a tag mutation rate of 30%, altruists surpass ethnocentrics. A mutation rate of 50% means that the offspring is equally likely to belong to the ingroup as to the outgroup. With three outgroup tags available, however, offspring and tag are still correlated. At 75%, tags are no indicators of descent, as the offspring has equal probabilities of acquiring any of the tags. In between these two rates, at 60%, outgroup biased agents also surpass those with an ingroup bias, finding more relatives in the other groups. At higher rates, tags are negatively correlated with parent-child relationships, and, although unrealistic, it may be noted that at 90%, outgroup biased agents exclude almost no kin and become more numerous than altruists.
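The thresholds quoted above follow from simple arithmetic, sketched below in Python. The function name is hypothetical, and the sketch assumes that a mutating tag changes to one of the other three tags with equal probability, which is the reading consistent with tags becoming uninformative at exactly 75%.

```python
def offspring_tag_distribution(mutation_rate, n_tags=4):
    """Return (P(same tag as parent), P(each specific other tag)).

    Assumption: a mutated tag is drawn uniformly from the other tags.
    """
    p_same = 1.0 - mutation_rate
    p_each_other = mutation_rate / (n_tags - 1)
    return p_same, p_each_other

for rate in (0.005, 0.5, 0.75, 0.9):
    p_same, p_other = offspring_tag_distribution(rate)
    print(f"rate {rate}: same tag {p_same:.3f}, each other tag {p_other:.3f}")
```

At a rate of 0.5 the parental tag is still three times as likely as any single other tag (0.5 against about 0.167), at 0.75 all four tags are equally likely (0.25 each), and above 0.75 the parental tag becomes the least likely outcome.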

Ethnocentrics exclude almost only nonkin
If the tag is instead difficult to perceive, such that there is a probability that someone with the same tag will falsely be perceived by an ingroup co-operator to belong to the outgroup, then unconditional co-operators will take over if ethnocentrics defect in every other interaction with an ingroup member (see Figure 3). Note that an outgroup member will still not be perceived as an ingroup member, so the outgroup will still safely be excluded at high error rates.
Thus, agents are surrounded mostly by kin. Tag-mediated co-operators outcompete unconditional co-operators since they are capable of excluding nonkin, while misidentifying kin less than five per cent of the time. The more distant the relative, the higher the probability of misidentification, but a distant relative is also more likely to have mutated strategies. Direct clones, which constitute 43% (43.4±0.2) of the neighbourhood, bear the same tag with a probability of 99.5%. Tags are deceptive mainly by including some nonkin, but only 11% of the time. In the next section, we will see what happens with improved kin recognition.
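The percentages reported in this section can be checked against each other with a few lines of arithmetic. The Python sketch below treats the rounded figures as probabilities over the same set of neighbour interactions, which is an assumption, and recovers the share of interactions in which tag and kinship disagree.

```python
# Rounded figures reported in the text.
p_kin_and_ingroup = 0.71     # interactions between kin with the same tag
p_kin_given_ingroup = 0.89   # an ingroup neighbour is a relative
p_ingroup_given_kin = 0.95   # a neighbouring relative has the same tag

# Marginal shares implied by the conditional probabilities.
p_ingroup = p_kin_and_ingroup / p_kin_given_ingroup
p_kin = p_kin_and_ingroup / p_ingroup_given_kin

# Tag and kinship disagree for ingroup nonkin and for outgroup kin.
p_ingroup_nonkin = p_ingroup - p_kin_and_ingroup
p_outgroup_kin = p_kin - p_kin_and_ingroup
mismatch = p_ingroup_nonkin + p_outgroup_kin

print(f"ingroup share {p_ingroup:.3f}, kin share {p_kin:.3f}")
print(f"mismatch {mismatch:.3f}")
```

Under this reading the mismatch comes out at roughly 12 to 13%, in line with the 12% reported, and most of it stems from ingroup members who are not kin rather than from kin carrying a different tag.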

Ethnocentrism vs. kin discrimination
Ingroup biased agents fare well from being fairly efficient kin discriminators. Is it then a robust strategy against less error-prone kin identifiers, or is kin identification all there is to the model, with crude tags being nothing but a substitute for signals of common descent when such are not made available? This will be tested by introducing a new strategy into the model and varying the potential number of tags in the population.

Kin discriminators outcompete ethnocentrics
Two new strategies are introduced into the model. Agents can discriminate based either on the group tag in the original model or on a kin marker, that is, whether they are of common descent. Thus, in contrast to previous models, some agents can now directly assess whether they are kin or not. As for the tag, there is a probability of 0.005 that the kin marker will mutate, meaning that child and parent (or any of its relatives) will not be perceived as kin, and that the offspring will become the ancestor of a new family tree. The kin marker is implemented as a reference to the highest node in the family tree (again, see Jansson, 2012), that is, the immigrant or mutated offspring that is the ancestor of the present agent. Note that agents are either group or kin discriminators, and will thus consider only the group tag or kin marker, respectively. The results show that kin discriminators practically eliminate all other strategies, except for a slight share of ingroup biased agents (see Table 5). Thus, ingroup bias is successful only in the absence of more refined discrimination.
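The kin marker described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Java code of Jansson (2012): the class and attribute names are hypothetical, and only the marker itself is modelled, without tags or strategies.

```python
import random

KIN_MUTATION = 0.005   # probability that the kin marker mutates at birth

class Agent:
    def __init__(self, parent=None, rng=random):
        if parent is None or rng.random() < KIN_MUTATION:
            # Immigrant or mutated offspring: the root of a new family tree.
            self.root = self
        else:
            # Inherit a reference to the highest node in the parent's tree.
            self.root = parent.root

    def is_kin(self, other):
        # Two agents are kin if they share the same family-tree root.
        return self.root is other.root

immigrant = Agent()
child = Agent(parent=immigrant)
stranger = Agent()
print(child.is_kin(immigrant), child.is_kin(stranger))
```

Storing a reference to the root, rather than walking up a chain of parents, makes the kin test a single identity comparison regardless of how many generations separate two agents.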

Ethnocentrics become kin discriminators with many group tags
Increasing the potential number of group tags in the population makes nonrelated agents less liable to share tags. Indeed, if tags are sufficiently many, then every new immigrant will import a new tag into the population, and group tags will coincide with kin markers. From this we will expect that with more tags, ethnocentrics will be increasingly indistinguishable from kin discriminators and converge to being equally successful. Simulations verify that this is the case and show that the difference goes below ten percentage points at 36 tags (see Figure 4). Ingroup biased agents are really approximate kin discriminators and do not resist invasion by kin identifiers. In particular, they are less successful than agents in an environment with more refined group tags, which makes the strategy unstable to slight improvements in kin recognition.

Kin discriminators manage without spatial viscosity
The spatial structure of the model with local interactions and neighbouring offspring produces clusters of relatives and works together with the group tags to avoid giving help to nonkin. This structure is useful also to a kin discriminator, who already avoids giving help to nonkin, but then also ends up in interactions without payoff less often. Is spatial viscosity then a necessary means for kin discrimination to prosper? Simulations with random interactions show that two strategies dominate the population: co-operating with no one and co-operating only with kin, with the two strategies being equally successful (see Table 6). The average share R of agents with which an agent has common descent is 28%, so kin discriminators differ from universal defectors in about every fourth interaction (as often as an ingroup biased agent). The difference between the two kinds of agents can be increased by reducing the base PTR and death rate to, say, one tenth of their standard values, while keeping the other parameters intact. Agents will then be more long-lived, compensated by less reproduction every round, while potential costs and benefits from interactions increase relatively. Note that parameter values in the original model are arbitrarily chosen and that it is not obvious what they correspond to in a well-mixed population, so with respect to showing existence of kin discriminative dominance, such a setup is equally valid, and shows the direction when agents encounter more interactions. The result is that kin discriminators increase considerably and that relatedness R at least doubles, both almost reaching levels obtained in the spatially structured standard model with ingroup biased agents.
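The effect of the rescaling can be made concrete with a short calculation, sketched here in Python. The function and key names are hypothetical, and the expected lifetime is computed as the mean of a geometric distribution, which assumes an independent death probability every round.

```python
def profile(base_ptr, death_rate, cost=0.01, benefit=0.03):
    """Relative weight of interaction payoffs under given base parameters."""
    return {
        # Mean number of rounds until death (geometric lifetime).
        "expected_lifetime": 1.0 / death_rate,
        # Payoffs from a single interaction relative to the base PTR.
        "benefit_to_base": benefit / base_ptr,
        "cost_to_base": cost / base_ptr,
    }

standard = profile(base_ptr=0.12, death_rate=0.1)
rescaled = profile(base_ptr=0.012, death_rate=0.01)
print(standard)
print(rescaled)
```

With the standard values an agent lives ten rounds on average and one received donation adds a quarter of the base PTR. With both values reduced to one tenth, the expected lifetime is a hundred rounds and a single donation is worth two and a half times the base PTR, so interactions weigh far more heavily on reproduction.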
Local interactions are not a prerequisite for kin discriminators. With spatial structure and local interactions, ethnocentrics are turned into efficient kin identifiers, and we have a case of kin selection. However, the spatial structure is not necessary for kin selection to be at work, and similar results can be achieved in random interactions when the efficiency of the kin identification does not depend on local interactions in small neighbourhoods. Ethnocentric agents can evolve gradually into kin discriminators by increasing the number of ethnic tags available. Not only is the strategy invasible by kin discriminators in both structured and unstructured environments, but the latter also prosper in both of them. An ingroup biased strategy based on crude distinctions can thus be numerous only if the crude distinctions cannot be improved upon and if very specific spatial assumptions are fulfilled. Without the first assumption, we would expect efficient kin discriminators to dominate in almost any environment.

A non-spatial approach that produces tag-based co-operation
We have seen that the specific spatial structure in the evolutionary algorithm analysed here transforms the strategic structure of the interactions between agents by facilitating clustering of agents of common descent. Since what we have then is no longer a prisoners' dilemma, we could abandon the prospect of finding parameter settings that would lead to adaptive environments for tag-based co-operation in such dilemmas. Instead, we likely need to extend the question beyond conditions for individually costly co-operative behaviour and ask, in general, in what strategic structures group discrimination gives an evolutionary advantage, also without kin selection. This brings us into the realm of evolutionary game theory. All symmetric two-by-two games can be described by a payoff matrix
\[ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \]
where $a_{ij}$ is the payoff to an agent choosing strategy $i$ after an interaction with an agent choosing strategy $j$. In order to find evolutionarily stable strategies in this game space, first note that evolutionarily stable strategies (ESS) are defined in terms of payoff differences, so what matters for stability is not the absolute payoffs, but rather the differences in payoff between two strategies.
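By subtracting $a_{21}$ from the left column (the payoffs for responses to strategy 1) and $a_{12}$ from the right column (the payoffs for responses to strategy 2), we get a simpler normalised form
\[ \begin{pmatrix} a_{11} - a_{21} & 0 \\ 0 & a_{22} - a_{12} \end{pmatrix} = \begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix}. \]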
With respect to ESSs, there are only four (or three, modulo renamed strategies) classes of games. Naming strategy 1 co-operate and strategy 2 defect, all games where $a_1$ is negative and $a_2$ positive are prisoners' dilemmas, with defection always being a rational response, while shifting the signs makes it into a harmony game, an identical game, only with renamed strategies. With positive payoffs on the diagonal, the best response is to do what the opponent does, resulting in a co-ordination game. With negative payoffs on the diagonal, the best response is to do the opposite of what the opponent does, an anti-co-ordination game.
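The classification by the signs of the normalised diagonal payoffs can be sketched in a few lines of Matlab. This is a minimal illustration and not part of the original simulation code; the example payoff matrices and all variable names are assumptions chosen for demonstration.

```matlab
% Sketch: classify symmetric 2x2 games by the signs of the normalised
% diagonal payoffs a1 = a11 - a21 and a2 = a22 - a12.
% The example matrices and variable names are illustrative assumptions.
games = {[3 0; 5 1], [4 0; 3 3], [0 3; 1 2], [5 1; 3 0]};
classes = cell(size(games));
for k = 1:numel(games)
    A  = games{k};
    a1 = A(1,1) - A(2,1);   % gain from answering strategy 1 with strategy 1
    a2 = A(2,2) - A(1,2);   % gain from answering strategy 2 with strategy 2
    if a1 < 0 && a2 > 0
        classes{k} = 'prisoners'' dilemma';
    elseif a1 > 0 && a2 < 0
        classes{k} = 'harmony';
    elseif a1 > 0 && a2 > 0
        classes{k} = 'co-ordination';
    elseif a1 < 0 && a2 < 0
        classes{k} = 'anti-co-ordination';
    else
        classes{k} = 'borderline';   % a1 or a2 equals zero
    end
    fprintf('a1 = %g, a2 = %g: %s\n', a1, a2, classes{k});
end
```

Only the signs of the two differences enter the classification, which is why the absolute payoff levels can be normalised away.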

Model specification and implementation
A large set of simulations were run using the Hammond-Axelrod model without a spatial structure, as in Section 5.1. Instead of filling a lattice, the population becomes saturated by adding a density parameter. The density of the population is the number of agents divided by 2,500. The PTR of all agents is multiplied by one minus the density. For each simulation, interactions are modelled by a specific game with payoffs held constant throughout the simulation. Different simulations were run for different payoff matrices. For each game reported on here, 500 simulations were run and the results averaged from the last round in each simulation. By removing the spatial structure, the model can now be analysed at a population level, with more homogeneous agents than in the spatial model. This means that a primarily object-oriented programming language does not have an obvious advantage. For these simulations, using a language in which the basic data structure is the two-dimensional array, such as Matlab, is more straightforward, and the model can be implemented with small chunks of code that allow for an easy overview. The code is included in the Appendix.
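The density rule can be illustrated with a small numerical sketch. The limit of 2,500 is taken from the text; the base PTR of 0.12 is an assumption here (the standard value of the Hammond-Axelrod model, consistent with the reduced value of 0.012 reported in Table 6), and the population sizes are made up for illustration. Payoffs from interactions, which also affect the PTR, are left out.

```matlab
% Sketch of density-limited reproduction in the non-spatial model.
limit   = 2500;                  % population size at which density equals one
basePtr = 0.12;                  % assumed standard PTR (see lead-in)
n       = [0 625 1250 2500];     % illustrative population sizes

density      = n / limit;                % share of the limit that is filled
effectivePtr = basePtr * (1 - density);  % PTR scaled by one minus the density

for k = 1:numel(n)
    fprintf('n = %4d, density = %.2f, effective PTR = %.3f\n', ...
            n(k), density(k), effectivePtr(k));
end
```

Reproduction thus slows down linearly as the population grows and stops entirely at 2,500 agents, which plays the role that a full lattice plays in the spatial model.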
In order to plot different games in a two-dimensional figure, note that payoffs can be shifted and rescaled provided that $a_{11} - a_{22} \neq 0$, so all games where the diagonal elements are not equal can be represented by the payoff matrix
\[ \begin{pmatrix} 1 & a \\ b & 0 \end{pmatrix}. \]
To fit the magnitude of the PTR, the matrix is multiplied by the scalar 0.06, so that the following payoff matrix is used, for $x \in [-2, 2]$, $y \in [-1, 3]$:
\[ 0.06 \begin{pmatrix} 1 & x \\ y & 0 \end{pmatrix}. \]
With this payoff matrix, prisoners' dilemmas are the games where x < 0, y > 1, harmony games where x > 0, y < 1, co-ordination games where x < 0, y < 1 and anti-co-ordination games where x > 0, y > 1. This set of games can be further divided into subclasses of games depending on whether x and y are smaller or greater than 0 and 1, y > x (which of the strategies gives the highest payoff when agents anti-co-ordinate), and 1 + x > y (which strategy is risk dominant).
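To connect these regions to the normalised form introduced earlier, the diagonal differences can be written out for the matrix in x and y. The common factor 0.06 is positive and therefore does not affect any signs. The following is a worked step added for clarity:

```latex
\[
a_1 = a_{11} - a_{21} = 1 - y, \qquad a_2 = a_{22} - a_{12} = 0 - x = -x .
\]
\[
\begin{array}{lll}
\text{prisoners' dilemma:}  & a_1 < 0,\ a_2 > 0 & \Longleftrightarrow\ y > 1,\ x < 0, \\
\text{harmony game:}        & a_1 > 0,\ a_2 < 0 & \Longleftrightarrow\ y < 1,\ x > 0, \\
\text{co-ordination:}       & a_1 > 0,\ a_2 > 0 & \Longleftrightarrow\ y < 1,\ x < 0, \\
\text{anti-co-ordination:}  & a_1 < 0,\ a_2 < 0 & \Longleftrightarrow\ y > 1,\ x > 0 .
\end{array}
\]
```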

Results
Figure 5 illustrates the average values, represented by colours, from 500 runs of the simulations for 17 × 17 different games in the range x ∈ [−2, 2], y ∈ [−1, 3]. Simulation results are given for populations consisting of two (panels a-d) and ten (panels e-h) groups, respectively. The values are the ratio of agents choosing strategy 1 (interpreted as co-operate in most games) in ingroup (a,e) and outgroup (b,f) interactions, and the prevalence of strategy set (1, 2) (interpreted as an ingroup bias in most games) (c,g). The figure also depicts the average size of the largest group in the last iteration (d,h).
In order to identify the different games more easily, each panel is divided by solid lines into four areas, each corresponding to one of the four classes of games (top left: prisoners' dilemmas, top right: anti-co-ordination games, bottom left: co-ordination games, bottom right: harmony games). Each area is further divided by dotted lines to identify subclasses of games, as described in the previous section.
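The grid of games and its division into four areas can be sketched as follows. The text gives only the number of games and the ranges of x and y, so the even spacing with step 0.25 is an assumption, as are all variable names and the helper `payoff`; this is an illustration and not the code used for Figure 5.

```matlab
% Sketch: a 17 x 17 grid of games and the class of each grid point.
xs = -2:0.25:2;              % assumed evenly spaced values of x
ys = -1:0.25:3;              % assumed evenly spaced values of y
[X, Y] = meshgrid(xs, ys);   % X varies along columns, Y along rows

cls = zeros(size(X));        % 0 = borderline (x = 0 or y = 1)
cls(X < 0 & Y > 1) = 1;      % prisoners' dilemmas
cls(X > 0 & Y > 1) = 2;      % anti-co-ordination games
cls(X < 0 & Y < 1) = 3;      % co-ordination games
cls(X > 0 & Y < 1) = 4;      % harmony games

% Hypothetical helper: payoff matrix of the game at grid position (i, j).
payoff = @(i, j) 0.06 * [1, X(i, j); Y(i, j), 0];

% Number of grid points that are borderline or fall in classes 1 to 4.
counts = arrayfun(@(c) sum(cls(:) == c), 0:4);
disp(counts);
```

Under this spacing, the solid lines x = 0 and y = 1 pass through grid points, so some of the simulated games are borderline cases that belong to none of the four classes.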
The results show that tag-based co-operation emerges in non-spatially structured populations in both co-ordination and anti-co-ordination games. In particular, the strategy is successful in what are often called stag hunts or assurance games. This is actually the type of game that a prisoners' dilemma is transformed into when played iteratively, such as in the Hammond-Axelrod model and models such as that of Riolo et al. (2001).
As advocated in this paper, simulation models should be accompanied by validations and robustness checks. Such analyses are provided in the Appendix.
These results are further explained and analysed in Jansson (in press). With a few simplifications, the model is also analytically tractable. For the present discussion we can note that evolutionary algorithms of tag-mediated co-operation can solve the problem more easily, and with more reliable conclusions, by changing the strategic structure rather than imposing restrictive spatial limitations.

Conclusions
There is an increasing interest in using search heuristics to test and find solutions to problems in the place of analytical solutions and deterministic algorithms, probably due to the accessibility and ease of studying phenomena in complex models. However, what we gain in ease of use, we lose in lucidity. This paper is a case study of a model studying the evolution of tag-based co-operation with a focus on the consequences of assuming a lattice structure in which agents are located and with strict spatial rules for how they reproduce.
Ethnocentrism is a phenomenon that leads to more favourable perceptions of and interactions with people exhibiting similar group tags along some line. What is puzzling is that it extends beyond kin selection and direct reciprocity, phenomena that we have a good understanding of, and affects behaviour towards perfect strangers. Seemingly, the model examined here captures this phenomenon: with only four tags in the population, and no built-in bias to favour anyone (initially, all strategies are equally likely), agents end up with an ingroup bias. However, as we have seen here, due to modelling design choices, the tags are not very crude at all.
For the model to favour ingroup biased agents, it needs to make the assumption that agents interact only locally in a small neighbourhood, into which offspring are reproduced. Translating this into the real world means that you will spend your life communicating mostly with your closest family members (who are clones in the model), and a few other people in repeated interactions. Intuitively, these assumptions seem to offer little breeding ground for ethnocentrism to emerge: recognising individuals seems a small matter for a human being in such small groups, and seems hard to generalise to using arbitrary tags among strangers. And indeed, the model fails to offer such generalisations. With local interactions, the tag becomes a fairly accurate proxy for kin recognition. If someone has the same tag as you, the probability that you are derived from the same clone is 89%. And by excluding agents with other tags, there is only a five per cent risk that you will defect against a relative, and only 0.5% against a first-generation clone (parent or child).
The resulting process is a very simple form of kin selection. Agents do not need to take on a personal cost to invest in kin for the benefit of a common gene. By being a co-operator, the chances that an interaction partner will be a co-operator increase significantly. What is being played is thus not really a prisoners' dilemma. The expected payoff for co-operating is higher than that of defecting, not only from the gene's eye view, but also for each individual. With an increased ability to target agents of common descent, the expected payoff increases.
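The argument can be made explicit with a short calculation. As an illustrative assumption, let b denote the benefit of receiving help and c the cost of giving it, and let p_C and p_D be the probabilities that an interaction partner co-operates with a co-operator and with a defector, respectively. These symbols are introduced here for illustration only and are not part of the model specification.

```latex
\[
E_C = p_C\, b - c, \qquad E_D = p_D\, b ,
\]
\[
E_C > E_D \;\Longleftrightarrow\; (p_C - p_D)\, b > c
\;\Longleftrightarrow\; p_C - p_D > \frac{c}{b} .
\]
```

When an agent's own strategy is uncorrelated with that of its partners, p_C = p_D and the condition fails for any positive cost, which is the prisoners' dilemma. Clustering of agents of common descent raises p_C relative to p_D, and once the gap exceeds c/b, co-operating gives the higher expected payoff to the individual.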
With random interactions, agents do play a prisoners' dilemma, and the defecting strategy dominates. Ingroup biased agents perform better than the other strategies, but the reason for this is that they co-operate with fewer individuals than outgroup biased agents, who in turn outperform universal co-operators. Hence, group tags do not induce co-operation. With local interactions, co-operation is rational, since most neighbours are kin, but the more you can target co-operation to relatives, the better you will do. If the model allows only for crude tags, but these do not exclude kin, then discriminating based on them will inevitably become a successful strategy.
Consequently, the model also seems to have little application to explaining the so-called green-beard effect in simple organisms (Hamilton, 1964; Dawkins, 1976), where individual recognition may not occur. A green-beard is a perceptible trait that can help individuals identify other individuals with a common gene and give preferential treatment to these, differing from ethnocentrism in humans in that it may actually function as a means for kin recognition. The green-beard effect is susceptible to invasion by individuals displaying green beards without taking the costs of giving preferential treatment to others displaying them. What needs to be explained is how costly co-operative behaviour can evolve towards individuals displaying green-beards with the threat of them being false-beards. However, the model starts out in a co-operative environment, and green-beards are used for exclusion of nonkin. Green-beard discrimination does not incur costs in the model; rather, it reduces them, compared to the universal co-operation that would otherwise take place.
The design choices of the model turned the ingroup bias into a kin bias, rendering ethnocentrism a deceptive description. This illustrates the importance of verifying consequences of modelling assumptions. A design choice that alters the outcome significantly, as did the local interaction structure, may also have changed the model so that it no longer describes the phenomenon that is being investigated. Local interactions both allowed for kin selection to be at work and, because of this, changed the strategic structure of the interactions such that agents were essentially not playing a prisoners' dilemma. However, it was also shown that if we accept other strategic structures, then an ingroup bias may evolve in certain games even where kin selection is eliminated through random interactions. This solution should more directly address the issue at hand and also illuminates what is actually taking place in the original model. The imposed spatial structure is an oblique way of transforming the strategic structure in the model into one for which a simpler model would already predict the desired outcome.
When designing an evolutionary algorithm to address specific issues, it is important to systematically analyse each assumption. Some common assumptions may force the desired results such that a model no longer describes what it was intended for; others (such as the assumed strategic structure, in this case) may prevent them from occurring.

Appendix A. Convergence and robustness checks of the non-spatial model
This section refers to the simulations presented in Section 8 and analyses the convergence, variance and robustness of the model.

A.1. Convergence and Variation
The results presented deal with average values after 2,000 rounds. Have the strategies converged to potential equilibria at this point and how do the results vary between simulations? First, for most of the games, the average frequency of an ingroup bias has converged within 500 rounds, and the results look roughly the same as after the complete simulation. Convergence is slightly slower for borderline games, where x is close to 0 or y is close to 1, but average ratios change very little towards the end of the simulations.
Looking at each of the simulations, instead of average results, the prevalence of the ingroup bias changes by less than ten percentage points during the last 1,000 rounds in almost all of the runs of most of the games, except for borderline and co-ordination games. In the latter, ratios change by more than ten percentage points in up to 50% of the runs, but this applies mainly to games where the largest group takes over the population, with there being less selection on the outgroup strategy.
In general, there is little variation between simulation runs of two groups in the success of ingroup favouritism, with a standard deviation (often well) below 0.1, except for some borderline and co-ordination games. The average frequency is close to zero in prisoners' dilemmas and harmony games, with a standard deviation consequently on par. (Note however that the outcomes in the borderline harmony games where there is an ingroup bias vary widely.) The outcomes in anti-co-ordination games fall closely to the mean, with a narrow, approximately normal distribution. The co-ordination games, instead, have wide distributions, with standard deviations up to 0.3. All of these results, except for borderline harmony games, are consistent with the analytical findings.
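The two checks described above, change within the last 1,000 rounds of each run and variation between runs, can be sketched as follows. The trajectories are synthetic and made up for illustration; they are not simulation results, and the variable names are assumptions. Change within the window is measured here as maximum minus minimum, which is one possible reading of the criterion.

```matlab
% Sketch of the convergence and variation checks on synthetic data.
% freq(r, t): share of ingroup biased agents in run r at round t.
rounds = 2000;
freq = [0.80 * ones(1, rounds);          % run that has converged
        linspace(0.50, 0.90, rounds);    % run that is still drifting
        0.75 * ones(1, rounds);          % run that has converged
        0.85 * ones(1, rounds)];         % run that has converged

% Change within the last 1,000 rounds of each run (maximum minus minimum).
window = freq(:, end-999:end);
change = max(window, [], 2) - min(window, [], 2);
shareUnstable = mean(change > 0.1);      % runs moving more than ten points

% Variation between runs in the final round.
finalMean = mean(freq(:, end));
finalStd  = std(freq(:, end));

fprintf('unstable runs: %.0f%%, final mean %.3f, std %.3f\n', ...
        100 * shareUnstable, finalMean, finalStd);
```

Reporting the share of unstable runs alongside the standard deviation separates two sources of spread: runs that have not settled and runs that have settled at different values.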
The pattern is similar for ten groups, but with some increased variation in ingroup co-operation for prisoners' dilemmas. As discussed earlier, co-operation towards the ingroup can spread in small groups with a high density of kin.

A.2. Robustness
In the simulations presented here, all parameter values except for the payoff matrix were held constant. The parameters were set to the same values as those in the spatial Hammond and Axelrod (2006b) model. Are the choices of these parameter values crucial to the results or is the model robust?
Keeping the PTR fixed, all other parameters can be modified to measure their relative impact. Increasing or decreasing the immigration rate has a similar effect to varying the mutation rate. It remains to investigate, then, whether the results are stable with respect to varying death and mutation rate.
The short answer is that extreme death and mutation rates can stimulate kin selection, but for other values, the results are robust against variation.
With two groups, the same pattern emerges independent of variable values as long as the death rate does not greatly exceed the PTR and the mutation rate is small (a few per cent). With a large death rate, discrimination can occur in any game due to random effects, while a high mutation rate eliminates discrimination, since the group marker can no longer correlate with strategy. Decreasing the mutation rate increases discrimination in the same games as in Figure 1.
Also with ten groups, the results are similar for small values of the parameters, except for death rates close to zero. Again, for large death rates, discrimination occurs randomly, while a high mutation rate eliminates it. For both small death rates and those slightly exceeding the PTR, discrimination is common in the prisoners' dilemma. These two extremes have a high relatedness coefficient in common: for small death rates, around one third of the population are related, and for high rates, more than one half. In the former case, since agents rarely die, the end result is highly dependent on initial conditions, for which the population consists of several small groups with high relatedness, in which ingroup discriminators are favoured. In the latter case, the explanation also spells small groups, since the high death rate maintains groups at a constant small size.

Appendix B. Programming code

B.1. Java code
The Java code and documentation (Jansson, 2012) for the simulation analysis of the model of Hammond and Axelrod (2006b) is available at: www.openabm.org/model/3391

B.2. Matlab code
The Matlab code for one simulation run of the non-spatial model presented in Section 8 is provided below. Please note that this code sacrifices speed for lucidity.
    % reproduction
    % agents: one row per agent; column 1 holds the PTR, column 2 the
    % group marker, columns 3 and 4 the two binary strategy traits
    density = n / limit;
    parents = agents(:,1) * (1 - density) > rand(n,1);
    children = agents(parents,:);
    if ~isempty(children)
        children(:,1) = ptr;    % offspring start with the base PTR
        % each of the three heritable traits mutates with probability mu
        mutants = rand(size(children,1),3) < mu;
        children(mutants(:,1),2) = randi(markers, sum(mutants(:,1)), 1);
        children(mutants(:,2),3) = ~children(mutants(:,2),3);
        children(mutants(:,3),4) = ~children(mutants(:,3),4);
    end
    % death
    killed = rand(n,1) < death;
    agents(killed,:) = [];
end    % closes an enclosing block whose opening is not part of this excerpt

Table 2: Maximum number of neighbours allowed for the ingroup biased strategy to constitute the majority of the population given different rewiring probabilities p.

Table 3: Population shares of all possible strategies given different number of tags and weighted probability for interacting with ingroup members.

Table 4: The share of the total number of connections between two neighbouring agents divided into non-relatives (N), relatives (R), outgroup (O) (agents have different tags)
Table 4 suggests why co-operators are so successful with the specific spatial structure of the model: 75% of the interactions are among kin. An ingroup
Population share of different strategies with respect to mutation rate of the tag.

Table 5: Population shares of all possible strategies.
Population share of ingroup biased and kin biased strategies with respect to number of group tags.

Table 6: Results from simulations without spatial structure, with standard settings and with reduced PTR = 0.012 and death rate = 0.01. Values are presented with standard error and are the size of the population, degree of relatedness in the population, share of agents co-operating with no one, only outgroup, ingroup, nonkin or kin, or with everyone.