Computer Simulation and Emergent Reliability in Science

While the popular image of scientists portrays them as objective, dispassionate observers of nature, actual scientists rarely are. It is not really known to what extent these individual departures from the scientific ideal affect the reliability of the scientific community. This paper suggests a number of concrete projects that would help to determine this relationship.


Introduction

1.1
Science aims at developing true, or at least very useful, theories about how the world works. It does this through the efforts of many individuals, each of whom objectively and dispassionately develops, tests, and refines theories about the world. Each scientist employs a method which ensures that neither personal bias nor political pressure will color her results. To make the process of science as efficient as possible, scientists communicate their results honestly to one another. They fairly evaluate one another's achievements, rewarding those who are particularly efficient and quickly removing those who do not adhere to the scientific ideal. Through their efforts, large and small, we slowly accumulate better and better theories and learn more and more about the world.

1.2
This description of science should seem familiar. No doubt we have all heard it, and many of us have reproduced parts of this characterization in unreflective moments. Perhaps we believe it to be at least approximately true of the scientific culture (if not accurate in its full generality). A version of this hopeful view of science was recorded in Robert Merton's seminal work The Sociology of Science (1973). Merton discusses the norms of scientific practice, but does little to describe the scientific method. This latter task has been taken up by a number of philosophers and statisticians who seek both to describe the scientific method as it is now practiced and to refine the practice to make individual scientists better guides to the truth.

1.3
Of course, scientists are not so perfect as this image suggests. They are not always objective, dispassionate, and fair. Bad science escapes punishment and good science goes unnoticed. Scientists are human and are subject to many of the same cognitive biases that plague all human reasoning. Because the pollyanna view of science is so pervasive, many view these departures from it as symptoms of pathology in science. Should a scientist fail to approach a problem dispassionately, we take this as a sign of science gone wrong. This diagnostic procedure has led a number of science commentators to conclude that science has no special claim to validity over any other knowledge-generating cultural practice, a conclusion disputed vehemently by a number of scientists and philosophers. Often these defenders of science seek to point out how science in general comes very close to the pollyanna view presented above.

1.4
There is a third way, which retains the hopeful image of science but acknowledges that many individuals depart from the traditional model of dispassionate seekers of the truth. This third way views the objectivity and reliability of science as an emergent property. It comes about not because every individual in the community is objective and reliable, but because the community is structured in such a way as to ensure that, in the long run at least, only the best theories survive. [1]

1.5
Taking this third path, one might view departures from the traditional model of perfect scientists in any number of ways. One might view them as unfortunate, but counteracted by a social system designed to minimize the impact of their imperfection. Alternatively, one might view these apparent imperfections as integral parts of the system of science. David Hull, perhaps the strongest advocate of this view, says,

... some of the behavior that appears to be the most improper actually facilitates the manifest goals of science. Mitroff ... remarks that the "problem is how objective knowledge results in science not despite bias and commitment but because of them." Although objective knowledge through bias and commitment sounds as paradoxical as bombs for peace, I agree that the existence and ultimate rationality of science can be explained in terms of bias, jealousy, and irrationality (1988, 32).

1.7
Hull's work, like that of others, has been primarily historical, focusing on specific cases in the history of science and generalizing from them. Unfortunately, historical observation is subject to the vicissitudes of a variety of cultural factors which constrain which alternative situations arise and how groups of scientists perform in their tasks. As an alternative, I suggest this set of questions is best tackled using mathematical and simulation modeling because this methodology allows for the exact comparison of a number of different possible individual behaviors.

1.8
Those who pursue the third way, between the traditional image of science and science skepticism, must grapple with one central question. What is the relationship between individual scientific behavior and the reliability of a scientific community? In order to convincingly argue that science is objective as a community even though its members are not objective, one must develop a theory of how this objectivity arises. Developing a detailed theory of this relationship serves two purposes. First, it allows us to determine the degree to which one ought to trust the results of the scientific enterprise. Second, it provides direction for science policy makers to improve the structure of science so as to maximize its ability to seek the truth despite the "imperfections" of its practitioners.

1.9
This very general question is likely far too broad to be of any real use. In the paragraphs that follow, I will suggest a number of particular questions that have already arisen in the philosophical literature but need further investigation.
Question 1: How can we make the best out of the limited abilities of individual scientists?

2.1
Sophisticated theories of proper scientific method are now common in philosophy and statistics. They often feature complex mathematical operations which require detailed background assumptions and significant computational power. While many scientific journals require rigorous treatment of the data presented in individual articles, a number of important scientific decisions are made more informally, utilizing the far less sophisticated decision making and inference tools of everyday life. While a scientist must do statistical analysis on an individual dataset, the choice of experiment is often made without any significant calculation, contrary to the recommendation of some theories of scientific method.

2.2
That scientists sometimes make decisions on the basis of simple heuristics rather than complex calculations might have a significant impact on what sorts of social arrangements are best for science. For example, if individuals choose their experiments optimally, more information is generally better. However, if scientists choose their experiments according to a simpler (and non-optimal) heuristic, significantly limiting information can be productive (Bala and Goyal 1998, Ellison and Fudenberg 1993, Zollman 2007, 2010).
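
The contrast drawn in this paragraph can be explored with a small simulation. The following is a minimal sketch, loosely inspired by the network learning models cited above rather than a reproduction of any of them. Every name and parameter in it is an assumption made for illustration: agents hold Beta beliefs about the success rate of a new method (`p_new`), compare it against a known old method (`p_old`), experiment with the new method only when they currently think it is better (a myopic heuristic), and see the results of their network neighbors. The networks `"complete"` and `"cycle"` stand in for unrestricted and restricted information sharing.

```python
import random


def neighbors(kind, n):
    """Neighbor lists (each agent sees itself) for a hypothetical network of n >= 3 agents."""
    if kind == "complete":
        return [list(range(n)) for _ in range(n)]
    if kind == "cycle":
        return [[(i - 1) % n, i, (i + 1) % n] for i in range(n)]
    raise ValueError(f"unknown network kind: {kind}")


def run_once(kind, n_agents=6, p_new=0.55, p_old=0.5, trials=10, rounds=200, rng=None):
    """Return True if every agent ends up believing the new method is better."""
    rng = rng or random.Random()
    nbrs = neighbors(kind, n_agents)
    # Beta(a, b) beliefs about the new method; the random priors are an assumption.
    beliefs = [[rng.uniform(0.1, 4.0), rng.uniform(0.1, 4.0)] for _ in range(n_agents)]
    for _ in range(rounds):
        results = {}
        for i, (a, b) in enumerate(beliefs):
            if a / (a + b) > p_old:  # myopic heuristic: test only what looks best now
                results[i] = sum(rng.random() < p_new for _ in range(trials))
        if not results:
            break  # nobody tests the new method any more, so beliefs are frozen
        for i in range(n_agents):
            for j in nbrs[i]:
                if j in results:  # update on every visible experiment
                    beliefs[i][0] += results[j]
                    beliefs[i][1] += trials - results[j]
    return all(a / (a + b) > p_old for a, b in beliefs)


def success_rate(kind, runs=200, seed=0):
    """Fraction of simulated communities that settle on the better method."""
    rng = random.Random(seed)
    return sum(run_once(kind, rng=rng) for _ in range(runs)) / runs
```

Calling `success_rate("complete")` and `success_rate("cycle")` with the same settings gives one way to compare the two communication structures. The outcome depends on the assumed priors, payoffs, and community size, so the sketch should be read as a starting point for experiments, not as evidence for any particular conclusion.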

2.3
Scientists are often required to make assessments about what theory a body of evidence supports. This task can be quite difficult because often not all relevant evidence is published. Results which are not statistically significant, but are nonetheless relevant, are usually left in a scientist's "file-drawer." A significant amount of work in statistics has been done on this problem, known as the file-drawer problem (Rosenthal 1979), but it is unclear whether scientists always employ proper statistical reasoning. What would be the impact if scientists ignored the file-drawer problem and treated the published evidence as if it were the only evidence? Does this affect the reliability of individual scientists, and should it impact the way we select papers to publish (Zollman 2009)?
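
To make the worry concrete, the following is a minimal sketch of how a reader who treats the published record as the whole record could be misled. It is an illustration under stated assumptions, not a model taken from the cited work: each study's effect estimate is drawn from a normal distribution around a hypothetical `true_effect` with a common `std_error`, and the assumed publication rule admits only estimates that are significantly positive at the conventional 1.96 threshold.

```python
import random
import statistics


def simulate_file_drawer(true_effect=0.2, std_error=0.2, n_studies=2000, seed=0):
    """Compare the mean of all estimates with the mean of the published ones."""
    rng = random.Random(seed)
    # Assumption: every study estimates the same effect with the same standard error.
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    # Assumed publication rule: only significantly positive results leave the file drawer.
    published = [e for e in estimates if e / std_error > 1.96]
    return statistics.mean(estimates), statistics.mean(published), len(published)


all_mean, published_mean, n_published = simulate_file_drawer()
print(f"mean of all studies:       {all_mean:.3f}")
print(f"mean of published studies: {published_mean:.3f} ({n_published} published)")
```

Under this rule every published estimate exceeds 1.96 times the standard error, so the published mean necessarily overstates a true effect that lies below that threshold. How large the distortion is in practice depends on the assumed selection rule and study sizes.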
Question 2: In what context and to what extent is heterogeneity in science beneficial?

3.1
Most accounts of scientific method found in both the philosophical and statistical literatures are individualistic: they focus on the proper behavior of an individual scientist. Often these theories leave little room for heterogeneity. Some heterogeneity is undoubtedly good in science; we wouldn't want everyone to work on the same problem. Beyond the division of labor among different disciplines, we might also prefer that individuals pursue different avenues for solving the same problem. In this way the community hedges its bets. Thomas Kuhn (1977) suggests that the only way this can be achieved is by allowing individual scientists to have different "scientific methods."

Before the group accepts [a scientific theory], a new theory has been tested over time by research of a number of [people], some working within it, others within its more traditional rival. Such a mode of development, however, requires a decision process which permits rational men to disagree, and such disagreement would be barred by the shared algorithm which philosophers have generally sought. If it were at hand, all conforming scientists would make the same decision at the same time (332).

3.2
Kuhn is perhaps too quick in assuming that only heterogeneous scientific standards can produce heterogeneity in a group, however. The study of symmetry breaking in complex systems has shown how groups of homogeneous individuals might nonetheless differentiate themselves. Closer to our topic, Kitcher (1993) and Strevens (2003a, 2003b) have shown how the drive to be the first discoverer of a scientific result can cause identical individuals to choose diverse scientific projects. Similarly, I have shown that in certain types of problems limiting access to information or encouraging stubbornness can help maintain diversity even when individuals share a single scientific method (Zollman 2007, 2010). There remain many questions, however. Is Kuhn right that the best way to maintain diversity is with individuals who have diverse standards? What standards? How diverse?

3.3
Beyond the issue of diversity in standards, one can consider diversity in educational background or conceptual "schemes." Hong and Page (2001, 2004) have argued that groups which are made up of worse, but more diverse, individuals are better at solving certain types of problems than groups that are made up of the best individuals. As applied to science, this suggests that one might not always want to choose the "best and brightest" without concern for diversity in the group.

3.4
All of the above results are limited to particular sets of problems with particular assumptions about the background information available to scientists. While many of us might see the benefit of pursuing several different fundamental physical theories (like string theory, loop quantum gravity, etc.), we would not see a similar benefit in entertaining the flat earth hypothesis. Even if diversity is beneficial, it will not be beneficial in all contexts (Zollman 2011). More investigation is needed to determine what types of diversity might be beneficial and under what conditions they should be sought out.

3.5
Heterogeneity in science raises an additional important question: how should scientists respond to diversity of opinion? There is an extensive literature which tackles this from the perspective of a single individual in economics (see Aumann 1976, and the resulting literature) and philosophy (see Feldman 2004, and the resulting literature). More recently there has been interest in using simulation to evaluate this question from the perspective of a scientific group (Hegselmann and Krause 2006, Douven and Riegler 2010).
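
One simple rule studied in this simulation literature is bounded-confidence averaging, in which each agent moves to the average of all opinions that lie within some distance of its own. The following is a minimal sketch of that basic rule only. It assumes opinions are numbers in [0, 1], leaves out any term that pulls agents toward the truth, and should not be read as the specification used in the cited papers. The names `hk_step`, `hk_run`, and `epsilon` are chosen here for illustration.

```python
def hk_step(opinions, epsilon):
    """One synchronous update: each agent averages the opinions within epsilon of its own."""
    updated = []
    for x in opinions:
        peers = [y for y in opinions if abs(x - y) <= epsilon]  # always includes x itself
        updated.append(sum(peers) / len(peers))
    return updated


def hk_run(opinions, epsilon, max_steps=1000, tol=1e-12):
    """Iterate hk_step until opinions stop changing or max_steps is reached."""
    current = list(opinions)
    for _ in range(max_steps):
        nxt = hk_step(current, epsilon)
        if max(abs(a - b) for a, b in zip(current, nxt)) < tol:
            return nxt
        current = nxt
    return current


start = [0.0, 0.1, 0.9, 1.0]
print(hk_run(start, epsilon=0.2))  # narrow confidence: two separate camps remain
print(hk_run(start, epsilon=1.0))  # wide confidence: a single consensus forms
```

Varying `epsilon` shows how the willingness to take distant opinions seriously determines whether the group fragments or converges, which is one way of posing the question of how a community ought to respond to disagreement.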
Question 3: How robust is the scientific community to intentional misconduct like selective reporting, misreporting, and falsification?

4.1
With some regularity, a case of serious misconduct attracts national attention: a scientist seriously fudges or outright fabricates data. In the most serious cases, these scientists have been influential. The social system of science is supposed to contain policing mechanisms to prevent this sort of behavior. Experiments are supposed to be repeatable and data is (at least occasionally) provided on request. Nonetheless, one might ask whether the extant mechanisms are sufficient to catch fudging and outright fabrication. If not, one might question the widespread assumption that these cases are anomalous. Even if the current system is sufficient, knowing what features of the system are most effective can ensure that future science policy changes do not (unintentionally) disrupt these parts of science.

4.2
Beyond serious misconduct, there are a number of small-scale deviations from perfect behavior that are undoubtedly widespread. Scientists will choose not to publish data which does not conform to their preferred theory. Individuals might make small changes in the design of an experiment to influence the results. Failures to replicate a result are rarely published, because negative results are regarded as less valuable than positive results. One might wonder how much these small-scale instances of misconduct influence the collective behavior of science. Is science designed to counteract these small fudges, and if so, how?
Question 4: What is the effect of social biases like conformist bias, social power, etc. on the outcome of scientific practice?

5.1
Volumes of psychological and sociological studies have demonstrated that individual humans are subject to all sorts of biases. We tend to pay attention to outliers, to data that confirms our beliefs, and to those in positions of power. These biases are seen, from the individual perspective, as erroneous. We do not perform appropriate statistical tests when thinking informally about random events, we don't give equally reliable data equal weight, and we don't question the reliability of those we trust.

5.2
David Hull, in the quote above, seems to suggest that these sorts of biases contribute to the effectiveness of the scientific society. Karl Popper, for instance, argued that refusing to abandon a theory in light of its refutation was inconsistent with his normative standard for individual scientific behavior, but was nonetheless useful from a community perspective (1975).

5.3
Different biases might influence individuals in different directions, and as a result it might be that biases "cancel out." Alternatively, social biases might help scientists make proper decisions because the bias points in the right direction (Brock and Durlauf 2002). Of course this need not be the case, and understanding what sorts of social circumstances lead to these nonharmful effects would help us determine when to trust science and what sorts of science policy ought to be adopted.
Question 5: How does the system of scientific reward (publication, tenure, grants, etc.) influence scientists' choices? Is it helpful or harmful in producing socially desirable outcomes?

6.1
Along with the professionalization of science in the 19th century came a new set of incentives for scientists. A scientist is not solely concerned with discovering truth, but must also worry about securing tenure, promotion, grants, and awards. The metric of scientific success is often the peer-reviewed scientific paper, which correlates imperfectly with the discovery of important truths. More recently, the ascent of citation metrics as a method for scientific evaluation has prompted calls for reconsidering how good science is evaluated and rewarded.

6.2
Of primary concern has been the degree to which rewards in the sciences discourage "high-risk/high-reward" scientific research. The fact that many great scientific successes of the past were of this type leads many to suggest that our current system for rewarding scientific success is broken. Much of this discussion has been informal and the examples tend to be anecdotal. There is, however, a significant theoretical apparatus developed in economics which can tackle these sorts of problems.

6.3
The work of Kitcher and Strevens provides one illuminating example. They suggest that the "priority rule" reward system, which gives sole credit to the first discoverer of a particular result, is socially optimal because it encourages an appropriate allocation of scientists among different research programs, each of which might achieve a desired result. Their model relies on a number of strong assumptions about the degree of information available to the scientists, and it has been shown via computer simulation that their results depend critically on these assumptions (Muldoon and Weisberg 2009). Regardless of how one views this particular model, these results illustrate how further studies of this research program might be carried out.
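
The kind of calculation behind such allocation arguments can be sketched in a few lines. What follows is a toy illustration under assumptions chosen here, not the model of Kitcher or Strevens: two research programs have success probabilities that grow with the number of workers according to an assumed saturating function (`success_prob`), at most one program can succeed, credit is shared equally among the members of the winning program, and scientists join one at a time, each choosing the program with the larger expected personal credit.

```python
import math


def success_prob(p_max, workers, rate=0.1):
    """Assumed success function: rises with the number of workers and saturates at p_max."""
    return p_max * (1.0 - math.exp(-rate * workers))


def community_success(n1, total, p_max1, p_max2):
    """Chance that some program succeeds, assuming at most one of the two can."""
    return success_prob(p_max1, n1) + success_prob(p_max2, total - n1)


def priority_allocation(total, p_max1, p_max2):
    """Scientists join sequentially, each maximizing expected personal credit."""
    n1 = n2 = 0
    for _ in range(total):
        # Assumption: credit goes to the winning program and is split equally.
        share1 = success_prob(p_max1, n1 + 1) / (n1 + 1)
        share2 = success_prob(p_max2, n2 + 1) / (n2 + 1)
        if share1 >= share2:
            n1 += 1
        else:
            n2 += 1
    return n1, n2


def best_allocation(total, p_max1, p_max2):
    """Number of workers on program 1 that maximizes community success."""
    return max(range(total + 1),
               key=lambda n1: community_success(n1, total, p_max1, p_max2))


total, p1, p2 = 20, 0.8, 0.4
n1, n2 = priority_allocation(total, p1, p2)
print("credit-driven allocation:", (n1, n2))
print("community-optimal workers on program 1:", best_allocation(total, p1, p2))
print("everyone on the more promising program:", community_success(total, total, p1, p2))
```

In this toy, credit seeking sends some scientists to the less promising program, which raises the community's chance of success relative to everyone working on the more promising one. Whether the resulting allocation coincides with the community optimum depends on the assumed success function and credit rule, which is the sort of sensitivity that simulation studies are suited to probe.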

Conclusion

7.1
Developing a full understanding of the relationship between individual scientific behaviors and the properties of scientific groups will undoubtedly take a significant amount of research by a number of individuals. Ultimately, however, undertaking such a project will help us to understand the importance (or lack thereof) of individual virtues like objectivity and neutrality. Knowing this relationship will guide both the study of science in philosophy and other social sciences, and it will help to guide science policy in productive directions by focusing attention on those pathological behaviors which serve to threaten science.