“Know Thyself” Is the Beginning of Science.

Socrates is generally credited with having originated the famous aphorism that “to know thyself is the beginning of wisdom.” If that is true for philosophers and people in general, it is arguably true also for scientists engaged in the scientific endeavour. But what does it mean for a scientist to know themselves as a scientist? How might such self-knowledge enhance their effectiveness as a scientist?

In a just-published preprint, “Distributed Science: The Scientific Process as Multi-Scale Active Inference,” Francesco Balzan et al. set out a theoretical understanding of the thought processes of a scientist participating in a scientific endeavour, and also of science as a community of such people engaged together in the furtherance of knowledge through the scientific process. Their model is based on the ideas of co-author Karl Friston, who around 2010 originated the theory of Active Inference in biophysics and cognitive science. I shall not attempt here to expound the theory in any detail, its foundations being highly technical and mathematical. Rather, I will set out some of its salient elements which I believe illuminate our understanding of the activity of science and scientists.

The basic idea of Balzan et al. is that science is built in the image of its makers, human beings, and that the traditional model of science as a process of conjecture and attempted falsification, associated particularly with Karl Popper in The Logic of Scientific Discovery (1959), was based on an oversimplified understanding of the internal dynamics both of the underlying scientific process and of the thought processes of scientists. It is suggested that a mode of inference referred to as abduction, working alongside deduction and induction, provides a richer model of how scientific understanding evolves:

Abduction involves a two-stage process in which one generates sets of hypotheses and then infers, based on data and specific constraints (e.g., simplicity, coherence), which proposed hypothesis is most likely… Abduction introduces new hypotheses into the scientific process, deduction determines the logical implications derivable from these hypotheses, and induction subjects these implications to testing by evidence in order to achieve a scientific generalisation.

Balzan et al. (2023)

The notion of abduction is underpinned by what is called Bayesian epistemology. Under the modernist (Popperian) view of science, theories are held to be valid to the extent they have not yet been falsified in experiments designed to test their validity. The essence of Popperian epistemology is that a theory or hypothesis is scientific to the extent it is amenable to falsification, and is either provisionally true (not yet falsified) or else false (falsified).

Bayesian epistemology, by contrast, considers multiple plausible theories side by side, assigning each a plausibility on the basis of factors such as elegance, coherence and favourable reception within the scientific community. There are thus prior probabilities of validity associated with each theory, or aspects thereof. Experiments are then done to see whether the phenomena described by a theory conform to its predictions. The probabilities assigned to those theories whose predictions offer a good match with the experimental results are increased, while those which perform poorly have their probabilities downgraded (possibly to zero). The new probabilities are called the posterior probabilities and serve as the prior probabilities for the next round of theorising and experimentation. The attractiveness of this account is that it conforms closely to how human cognition is now scientifically understood to work (in particular within the framework of active inference).
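
To make the mechanics concrete, here is a minimal sketch of a single round of such Bayesian updating. The theories, priors and likelihoods are entirely illustrative and are not drawn from the preprint; the point is simply that each theory’s probability is rescaled by how well it predicted the experimental outcome.

```python
# A minimal sketch of the Bayesian update described above: several competing
# theories start with prior probabilities, an experiment is run, and each
# theory's probability is rescaled by how well it predicted the outcome.
# The theories, priors and likelihoods are illustrative, not from the paper.

def bayes_update(priors, likelihoods):
    """Return posterior probabilities given priors and the likelihood each
    theory assigns to the observed experimental result."""
    unnormalised = {t: priors[t] * likelihoods[t] for t in priors}
    total = sum(unnormalised.values())
    return {t: p / total for t, p in unnormalised.items()}

# Prior plausibilities assigned to three hypothetical theories.
priors = {"theory_A": 0.5, "theory_B": 0.3, "theory_C": 0.2}

# How strongly each theory predicted the experimental result actually observed.
likelihoods = {"theory_A": 0.9, "theory_B": 0.4, "theory_C": 0.05}

posteriors = bayes_update(priors, likelihoods)
print(posteriors)
# Theory A's probability rises and C's falls; the posteriors then serve as
# the priors for the next round of theorising and experimentation.
```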

A key aspect of the Bayesian interpretation of the relationship between evidence and hypotheses is the notion that prior experiences, which inform prior beliefs, influence how an epistemic agent views an observation in relation to hypotheses.

ibid.

We are thus led to a model of how science works in practical terms, which is richer but, in the view of the authors, still incomplete.

Bayesian accounts provide a good formal story of how social and psychological factors come to shape the production of scientific knowledge, or context of justification. But how do scientists themselves shape their socio-material and historical context? How do institutions change through scientific activity? What are the top-down and bottom-up relations between the way scientists initialise and learn their priors and how their work comes to shape the institutions from which subsequent generations of scientists acquire their priors? We believe that to get the full Bayesian picture of how science is progressed by individuals and how it is shaped by scientific communities—which themselves are shaped by the collective operations of their constituents—the Bayesian approach to scientific knowledge needs to detail the belief-based (cognitive) mechanisms whereby the context of discovery and the context of justification interact.

ibid.

This is where the specific theories of active inference come into play. Notice first that, as implied by the name, agents engaging in such activity are not merely theorising about the world and making sense of their experience but are actively shaping that experience through the actions they take and the disposition they adopt in their interaction with it. For example, we often imagine that when we “see” something like a bird, it presents itself to us as such through the visual image it projects onto our retina. In reality, cognition is now understood to work in the reverse direction to this intuitive picture: we project the idea that we are seeing a bird onto the visual data and only then make sense of it. Prior to that, we use a Bayesian process to construct plausible hypotheses, with associated probabilities, about what we are seeing: if it is flying it might be a bird, but possibly a plane or a bat. We might look for a vapour trail, which would significantly increase the probability that it is a plane. We might then check for wings to discount the possibility that it is a helicopter, and so on. Only when the probability of all other interpretations of the visual data has been reduced to a sufficiently low level do we “see” a bird. It might be remarked how close this process is to our characterisation of scientific reasoning under the Bayesian paradigm.
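
That little sequence can be rendered as a toy calculation (the numbers are entirely made up): competing interpretations of the visual data are updated cue by cue, and we only report “a bird” once every alternative has become sufficiently improbable.

```python
# An illustrative rendering of the "is it a bird?" example: beliefs over
# interpretations are updated cue by cue, and "bird" is only reported once
# all rival interpretations fall below a small threshold.

def update(beliefs, cue_likelihood):
    posterior = {h: beliefs[h] * cue_likelihood[h] for h in beliefs}
    z = sum(posterior.values())
    return {h: p / z for h, p in posterior.items()}

beliefs = {"bird": 0.4, "plane": 0.3, "bat": 0.2, "helicopter": 0.1}

cues = [
    # No vapour trail: counts against a high-flying plane.
    {"bird": 0.9, "plane": 0.2, "bat": 0.9, "helicopter": 0.8},
    # Flapping wings: counts heavily against plane and helicopter.
    {"bird": 0.95, "plane": 0.01, "bat": 0.7, "helicopter": 0.01},
    # Broad daylight: counts against a bat.
    {"bird": 0.9, "plane": 0.5, "bat": 0.05, "helicopter": 0.5},
]

for cue in cues:
    beliefs = update(beliefs, cue)
    rivals = {h: p for h, p in beliefs.items() if h != "bird"}
    if max(rivals.values()) < 0.05:  # all rivals sufficiently improbable
        print("I see a bird:", {h: round(p, 3) for h, p in beliefs.items()})
        break
```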

At the heart of the theory of active inference is the idea of a probabilistic generative model possessed by an agent or body of agents engaging in such inference. This model allows them to infer future states of the world (external states) based on the evidence and understanding at their disposal (internal states), which of course includes the generative model itself. Part of that model is an idea of what constitutes prospering and well-being for the individual or group concerned. In this way the agent envisages and looks to facilitate future states where such prospering occurs. This is possible to the extent that the agent is able to simplify and make sense of the huge amount of information available to them from the world around them and their past experience in it.

This in practice requires use of what is called the Markov blanket trick. The blanket is a conceptual wrapper around the agent, limiting the interaction of its internal states, and the variables representing them, with the external state variables in the world beyond (the environment), which are not directly accessible. Such interaction occurs only through the agency of blanket states. In other words, internal and external states do not interact with each other but only with the blanket which separates them. These blanket states are relatively few in number: those which impact on the agent are called sensory states (mediated through the five senses) and those which impact on the environment, active states. This allows the representation of interactions with “the world” in the agent’s generative model to be reduced to a conceptually manageable level. Attaining well-being and prospering is then synonymous with extending and updating the generative model to minimise unwanted surprise, and with obtaining the evidence needed to maintain its accuracy in a changing world/environment, a.k.a. model validation.
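
The following toy simulation, which is my own sketch rather than anything from the preprint, puts the last two paragraphs together: an agent whose internal states never touch the external (hidden) states directly, but only through the blanket’s sensory and active states, and whose generative model encodes a preferred state of affairs it acts to bring about.

```python
# A toy sketch of the Markov blanket partition: all traffic between the
# agent's internal states and the environment's external states passes
# through the blanket's sensory and active states. Numbers are illustrative.

import random

class Environment:
    """External states, hidden from the agent."""
    def __init__(self):
        self.hidden_state = 20.0  # some quantity the agent cares about

    def step(self, action):
        # Active states perturb the external states.
        self.hidden_state += action + random.gauss(0, 0.5)

    def sense(self):
        # Sensory states: a noisy reading of the hidden state.
        return self.hidden_state + random.gauss(0, 1.0)

class Agent:
    """Internal states: a toy generative model with a preferred observation."""
    def __init__(self, preferred=25.0):
        self.belief = 0.0           # internal estimate of the hidden state
        self.preferred = preferred  # what "prospering" looks like to this agent

    def perceive(self, sensory):
        # Perception: nudge the internal model towards the sensory evidence,
        # reducing surprise about what is observed.
        self.belief += 0.5 * (sensory - self.belief)

    def act(self):
        # Action: choose an active state expected to move future observations
        # towards the preferred value under the internal model.
        return 0.3 * (self.preferred - self.belief)

env, agent = Environment(), Agent()
for _ in range(50):
    agent.perceive(env.sense())  # blanket: sensory state -> internal update
    env.step(agent.act())        # blanket: active state  -> external update

print(round(agent.belief, 2), round(env.hidden_state, 2))
```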

Clearly, active inference understood in this way meshes well with our idea of how a scientist proceeds when carrying out a scientific investigation.

Interestingly, the implication here is that the scientist is no longer viewed as an objective observer but as an active creator of scientific knowledge through selection and design of experiments and interpretation of results. As Balzan et al. explain,

within the active inference framework, the process of model updating (perception and learning) constitutes just half of the narrative. Active inference agents not only update their internal generative models but also actively orchestrate the production of evidence to substantiate these models (self-evidencing through action and selective sampling). For an individual [scientist] engaged in active inference, this translates to the deliberate selection of experimental configurations capable of supplying evidence that effectively diminishes uncertainty about competing hypotheses entertained under their generative model.

ibid.
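
One way to picture this “deliberate selection of experimental configurations” is as choosing, among candidate experiments, the one with the greatest expected information gain, i.e. the one whose outcome is expected to shrink uncertainty over the competing hypotheses the most. The sketch below uses that standard criterion with made-up numbers; it is not the authors’ own formalism.

```python
# Hedged sketch: pick the experiment expected to reduce uncertainty over the
# hypotheses the most (maximum expected information gain). Illustrative only.

import math

def entropy(p):
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def posterior(prior, likelihood_given_outcome):
    unnorm = {h: prior[h] * likelihood_given_outcome[h] for h in prior}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}

def expected_info_gain(prior, experiment):
    """experiment maps each outcome to {hypothesis: P(outcome | hypothesis)}."""
    gain = 0.0
    for outcome, lik in experiment.items():
        p_outcome = sum(prior[h] * lik[h] for h in prior)
        if p_outcome > 0:
            gain += p_outcome * (entropy(prior) - entropy(posterior(prior, lik)))
    return gain

prior = {"H1": 0.5, "H2": 0.5}

experiments = {
    # Experiment A barely distinguishes the hypotheses...
    "A": {"pos": {"H1": 0.55, "H2": 0.45}, "neg": {"H1": 0.45, "H2": 0.55}},
    # ...while experiment B distinguishes them sharply.
    "B": {"pos": {"H1": 0.95, "H2": 0.10}, "neg": {"H1": 0.05, "H2": 0.90}},
}

best = max(experiments, key=lambda name: expected_info_gain(prior, experiments[name]))
print(best, {n: round(expected_info_gain(prior, e), 3) for n, e in experiments.items()})
# Experiment B is selected: its outcomes are far more informative about H1 vs H2.
```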

So the main difference between science and learning processes in general lies not so much in the way information is obtained (active inference in both cases). Rather it is, first, that in science the internal generative model of the phenomena under investigation is expressed externally, alongside the evidence on which it is based, and, secondly, that this information is shared with and validated by the scientific community. It is thus in the working of the scientific community, considered as an agent in its own right with its own Markov blanket and a shared internal model (“the science”), that the real difference lies. This conceptualisation of the scientific community and the scientific process as a higher-level ensemble of Markov-blanketed systems, whose behaviour both constrains and is constrained by the behaviour of its constituent parts, is referred to by Balzan et al. as collective or distributed scientific cognition.

To the extent that the generative model to which this process gives rise is coherent and genuinely shared by the scientists investigating a phenomenon, the result can be interpreted as a consensus scientific view or theory. But of course it is not guaranteed that scientists engaging as bona fide participants in a scientific investigation will arrive at a consensus view, i.e. a shared model. Under the Bayesian view of science, the most that can be guaranteed in a contested situation is that, when evidence which tends to support one view rather than another (without outright disproving the latter) is shared within the scientific community, scientists will adjust upwards their assessed posterior probability of the supported theory’s validity. However, neither the prior nor the posterior probabilities associated with any scientific theory are intrinsic properties of the shared model. Each scientist or research group will weigh evidence, new and old, from the perspective of their own Bayesian model. While the perspective of the scientific community in general constrains their views and their activities, it does not determine them.
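
A small illustration of this point (again with made-up numbers): two scientists updating on the same shared piece of evidence both move towards the supported theory, but because they started from different priors they do not end up with the same posterior.

```python
# Same shared evidence, different priors: both posteriors shift towards
# theory X, but they remain different. The numbers are illustrative only.

def update(prior, likelihood):
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}

# The same piece of evidence, twice as likely under theory X as under Y.
shared_evidence = {"X": 0.8, "Y": 0.4}

sceptic    = {"X": 0.2, "Y": 0.8}  # starts doubtful of X
enthusiast = {"X": 0.7, "Y": 0.3}  # already favours X

print("sceptic:   ", update(sceptic, shared_evidence))
print("enthusiast:", update(enthusiast, shared_evidence))
# Both raise the probability of X, yet their posteriors still differ:
# the probabilities are not intrinsic properties of the shared model.
```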

Indeed it is this capacity of distributed scientific cognition to allow different researchers and groups to operate with significantly different levels of trust in the “consensus” model, to the extent that some will invest time and effort in investigating a competitor model even with no evidential base, that allows science to progress. So the existence of a degree of dissent around a consensus scientific view should be seen not as a shortcoming or failing of the scientific process but as a sign of its health and, if you like, of the capability of its underlying generative model to continue developing new or improved theories. Indeed, in line with the Popperian view, the persuasiveness of the consensus view is in direct proportion to the effort and energy seen to be invested in seeking contradictory evidence (falsification).

But a note of caution needs to be sounded. The Bayesian generative model a scientist uses to develop and validate scientific theories and hypotheses is not independent of the higher-order generative model which governs their behaviour as a whole. This provides the overarching prescription of what constitutes, and best contributes to, prospering and well-being in their life generally. Indeed this perspective determines whether and to what degree an individual commits to a scientific endeavour in the first place. Clearly not everything taken into consideration in such an overall personal assessment is aligned optimally with the interests of science. Concerns with finances, reputation and career stability will clearly have an influence, both at the level of the individual and of the research group, and may give rise to conflicts of interest. In this respect, individual scientists should not be considered intrinsically different from career professionals engaged in any other sphere of activity. As Balzan et al. conclude:

scientific communities are cultural communities, and they have a vested interest in confirming the hypotheses to which they have committed their careers. Scientists and their research groups routinely (almost on a yearly or biyearly basis) compete for resources in the space of research (namely, things like grant money, attention from the public and other scientists, room for publication in journals, etc.). Science is an evolutionary process of model selection.

ibid.

In fact one can go further here and observe that the cultural communities of science are embedded in turn in wider national and international communities, which increasingly these days place expectations on science based on a “consensus” of priorities arrived at not by a scientific process but typically by a political one. This will have a knock-on effect in shaping the “vested interests” of scientists and research groups cited by Balzan et al. above. Such impact would not a priori be expected to align with the interests of good science, as defined by the scientific community itself.

Given that the impact of these wider communities on the scientific community can itself be seen as a form of (bi-directional) active inference in relation to the latter’s self-understanding, it is not a straightforward matter to isolate and evaluate such impact. Two good examples here are the contentious debates about “science-based” policies designed to address climate change and epidemics such as Covid-19. One of the problems is that policy decisions tend to involve mutually exclusive choices, meaning that the evaluation of alternative policies involves the consideration of counterfactuals. Such analysis done post hoc may result in the policies previously favoured being discredited (based on adjusted posterior probabilities of success), but by then the opportunity to follow a different course will usually be gone.

Clearly there are no easy answers here, but it is well to be aware, rather than ignorant, of the (active inference) dynamics shaping the policy discussion in order to make the best-informed assessment of the views being expressed. A corollary of this would of course be that one should be cautious when either side in a debate asserts that it is backed up by “The Science” or that “The Science is settled.” Active inference, like any evolutionary process, always remains a work in progress.

By Colin Turfus

Colin Turfus is a quantitative risk manager with 16 years' experience in investment banking. He has a PhD in applied mathematics from Cambridge University and has published research in fluid dynamics, astronomy and quantitative finance.
