Noise and bias – some controversies raised by the book ‘Noise: A Flaw in Human Judgment’, written by Daniel Kahneman, Olivier Sibony and Cass R. Sunstein

The paper reviews and discusses the statistical aspects of the phenomenon called ‘noise’, which Daniel Kahneman, the Nobel Prize-winning psychologist, and his colleagues present in their new book entitled ‘Noise: A Flaw in Human Judgment’. The authors understand noise as unexpected and undesirable variation in people’s judgments. The variability of judgments influences the decisions made on the basis of those judgments and, consequently, may have a negative impact on the operations of various institutions. This is the main concern presented and analyzed in the book. The objective of this paper is to look at the relationship between bias and noise, the two components of the mean squared error (MSE), from a perspective which is absent in the book. Although the author agrees that each of the two components contributes equally to MSE, he claims that in some circumstances a reduction of noise can make accurate inference not less, but more difficult. It is argued that the actual impact of noise cannot be accurately determined without considering both bias and noise simultaneously.


Introduction
Events, estimates or judgments which occur repetitively in large numbers are viewed by statisticians in the way presented, in simplified form, in Figure 1. They look at the set of population units (or sample units which represent the parent population) and try to identify patterns and regularities, which are then described by summary measures (numbers) and their proper interpretation. Quantitative description of patterns derived from observed events or people's judgments is the main goal of statistics. Daniel Kahneman, Olivier Sibony and Cass R. Sunstein, authors of a new book on flaws in human judgment (Kahneman et al., 2021), have decided to look further or, to be more precise, to seek possible implications of given statistical characteristics of a population for the activities of institutions and the behavior of the people who constitute this population. They are particularly interested in the consequences of variability among people's judgments, described by measures of deviation, which may be regarded as problems and challenges in certain institutions. Daniel Kahneman, the Nobel Prize-winning psychologist, a prominent and widely respected researcher of the human mind and author of the bestselling book 'Thinking, Fast and Slow' (Kahneman, 2011), has explained in a number of scientific papers the typical cognitive errors which many people share, and the errors which they make in judgments and decision-making. These errors can be observed in both the professional and personal lives of people. One of their consequences is variation in people's judgments of the same event. Although a certain level of variation may not disturb many people, it is likely to become more challenging as it increases.
As the main reasons for variation among people's judgments have already been discovered and explained (see e.g., Kahneman, 2011; Kahneman et al., 1982; Morvan & Jenkins, 2017), it seems reasonable to ask about the impact of variability of judgments on the activities of various institutions, in particular those in which consistency of judgments is expected and desirable. This question is raised in the reviewed book and investigated from different perspectives. The authors look at how decisions based on diverse judgments affect people's lives and the reputation of institutions. Additionally, they suggest some ways of reducing the level of deviation observed in judgments.
The title term 'noise' is understood by the authors as unexpected and undesirable variation present in people's judgments, and they add that 'there is too much of it' (p. 361). Explaining their motives for studying this problem, the authors write: 'The surprises that motivated this book are the sheer magnitude of system noise and the amount of damage that it does' (p. 365). From the very beginning of the book the reader will find examples of the negative consequences of noise, that is, of variability of judgments and estimates among professionals. Convincing examples relate to divergent judgments of doctors in the diagnosis of illness, variation in judicial judgments related to the same case, and also deviation in administrative decisions (e.g. related to asylum status), economic forecasts, decisions of patent offices, and actuarial estimates of insurance premiums.
On the one hand, it is quite natural that people differ in their personal judgments. The basis for their judgments is formed not only by professional knowledge, whose range and quality may differ among individuals, but also by their overall experience and a usually unique way of combining and processing information or, in other words, a unique way of thinking. However, judgment is a category which should be regarded as narrower than thinking. According to the authors: 'Judgment is a form of measurement in which the instrument is a human mind' (p. 361). The measure applied does not have to use numbers; it may employ other scales. If the dispersion of judgments is not large, it does not attract public attention, even if the differences occur among professionals. It could be argued that we are all used to the presence of a certain level of noise in judgments related to political issues and to economic or environmental events or processes. It is not surprising that such judgments differ from one another, as long as the differences are not too large. Also, in academic communities there seems to be common acceptance of a certain level of diversity among judgments concerning students' achievements or the research outputs of scientists.
On the other hand, it would be much more difficult to obtain acceptance for diversity (noise) in judgments, and in the decisions that follow them, if they seriously affect people's lives or future careers. Noise tends to be considered unwanted in institutions representing the judicial system, the health care system, the vocational advice system, and others. If doctors present a wide range of judgments on the choice of treatment in a particular case, or judges announce vastly different views on the severity of punishment, it will inevitably lead to confusion and disorientation. Additionally, it may undermine confidence in the competence and expertise of those professionals. These are the main reasons why we should seek information about the sources and nature of noise, as well as accessible ways of monitoring and reducing it. This is what the book is about.

Bias and noise in judgments
The initial claim presented by the authors of the book is that noise, as one of the two components of the total error which accompanies every judgment, attracts less attention than the other one, bias. Noise tends to be overlooked and neglected. 'This book is our attempt to redress the balance', declare the authors (p. 6). In statistical terms, this view could be expressed as follows: systematic errors and bias in estimates attract more interest than an equally or more crucial factor of inaccuracy, namely noise. This claim, however, seems disputable. There is no doubt that a high level of noise in the judgments of doctors, judges and many decision-makers in public administration is harmful, and sometimes has painful consequences for people. On the other hand, there are several reasons why bias, not noise, ought to remain at the center of our concern. Two of the reasons are explained below, and the final one later, together with measures of errors.
Firstly, bias is a systematic error or tendency toward a distorted judgment, and it may manifest itself in dangerous social phenomena, such as various kinds of inequality or discrimination. Bias is responsible for false judgments which form the basis for racial, religious, gender or wage discrimination. None of these kinds of prejudice and discrimination is caused by noise (variation of judgments). What accounts for these phenomena is an irreducible constant bias present in the judgments of groups of people and representatives of institutions, and in the decisions based on those judgments. Alleged racial bias in police activities in some countries, gender gaps, and biased (prior) assessments of productivity by age or sex in the labor market are examples of such discrimination.
Secondly, unlike noise, which can be reduced by increasing the number of independent judgments and averaging them, bias does not exhibit that or any similar property. One cannot reduce bias by simply increasing the number of judgments collected. The authors of the reviewed book also confirm that bias is not a decreasing function of the number of judgments or evaluations. Statistical methods and techniques are less helpful in reducing systematic errors than in reducing noise. If bias occurs in judgments, one of the most efficient ways to deal with it is to incorporate other relevant sources of information, which in practice may be difficult. Reducing noise seems to be an easier task. Readers may of course not share this view. And it seems that the authors of this excellent book do not, either.
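The contrast between the two components can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: judgments are drawn independently from a normal distribution, and the true value (100), the bias (5) and the noise (10) are hypothetical numbers chosen for the example, not figures from the book. The function name simulate_averages is likewise introduced here for illustration.

```python
import random
import statistics


def simulate_averages(true_value, bias, noise_sd, n_judges,
                      n_trials=2000, seed=0):
    """Estimate bias and noise of the AVERAGE of n_judges independent judgments.

    Assumption (not from the book): each judgment is drawn from a normal
    distribution with mean true_value + bias and standard deviation noise_sd.
    """
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    averages = []
    for _ in range(n_trials):
        judgments = [rng.gauss(true_value + bias, noise_sd)
                     for _ in range(n_judges)]
        averages.append(statistics.fmean(judgments))
    estimated_bias = statistics.fmean(averages) - true_value
    estimated_noise = statistics.pstdev(averages)
    return estimated_bias, estimated_noise


if __name__ == "__main__":
    # Hypothetical setting: true value 100, bias 5, noise (sd) 10.
    for n in (1, 4, 25, 100):
        b, s = simulate_averages(100.0, 5.0, 10.0, n)
        print(f"judges={n:3d}  bias of average={b:6.2f}  noise of average={s:6.2f}")
```

Under these assumptions the standard deviation of the averaged judgment falls roughly in proportion to the square root of the number of judges, while the estimated bias stays near its original value however many judgments are pooled.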

In-depth analysis of noise
The content of the book is split into six parts, each consisting of three to eight chapters. Every chapter is accompanied by a short recapitulation and conclusions. The total number of chapters is 28. The last one is followed by 'Review and Conclusions: Taking Noise Seriously'. The final part of the book, called 'Epilogue', is entitled 'A Less Noisy World'. The book also includes three appendices in which practical rules and procedures designed for dealing with noise, including an audit of noise, are proposed and discussed. One of the advantages of the book worth pointing out is its clear and precise language, as well as a large number of examples which help the reader to follow the interesting and original considerations in all its parts.
The first part of the book which consists of three chapters begins with a number of persuasive examples of undesirable variations which can be found in judges' decisions in courts and in estimates and judgments of underwriters employed by insurance companies. Such variations may evoke a sense of injustice among people and additionally incur financial losses, sometimes of considerable volumes. In this part of the book the notion of 'system noise' appears for the first time. It is defined as undesirable variation existing among judgments of different people assessing the same case. In further parts of the book system noise is divided into 'level noise' and 'pattern noise'. An interesting point is presented by the authors in relation to noise which can exist, although it tends to be overlooked, in singular events or unique decisions. It is proposed that a unique decision should be regarded as a potentially recurrent decision, even if it is taken only once. The decision-maker should follow the same rules aimed at reducing bias and noise, which are applied in the case of repetitive events. In statistics one will find more analogies to this kind of logical approach.
One of the first stages of the approach designed to reduce undesirable variability of judgments involves measurement of variation, that is, measurement of noise. 'Your Mind Is a Measuring Instrument' is the title of the second part of the book. Evaluating judgments in order to improve the ways of making them does not seem to be easy, especially if the judgments cannot be verified with regard to their accuracy and precision, for example in hypothetical scenarios or long-term forecasts. Therefore, the authors propose to look both at the accuracy of judgments, when it is possible, and simultaneously at the process of formulating judgments. In other words, it is recommended to compare judgments ex post with actual outcomes, if possible, and additionally to assess the quality of the process of making judgments.
The problem of measuring the two principal components of the total error, bias and noise, is extensively discussed in Chapter 5. Recalling the well-known statistical formula for mean squared error (MSE), which can be expressed as the sum of squared bias and squared standard deviation, the authors claim that each of the two components contributes equally to MSE. They call this formula 'the error equation' and recognize it as 'the intellectual foundation of this book' (p. 66). The authors emphasize that a given change (increase or decrease) in bias or in noise has the same impact on MSE. They write: 'Reducing noise or reducing bias by the same amount has the same effect on MSE' (p. 65). This statement can be regarded as disputable. The influence of bias and noise on MSE is described correctly; however, the matter is less clear-cut if other measures of the quality of judgments are taken into account. It would be reasonable, for instance, to consider the consequences of reducing noise in terms of how frequently less dispersed judgments will be equal to the true value or fall close to it. In other words, it may be important to look at how the probability that the next judgment will be close to the true parameter changes when the noise is reduced, given that a certain level of bias is involved in all judgments. Figure 2 presents two distributions of judgments which have the same bias (the difference between the mean of the judgments and the true value) but different values of noise (different standard deviations). It is clear that the smaller dispersion, although desirable in other circumstances, results in a smaller probability of obtaining judgments which fall within the unit interval around the true parameter. If one could reduce noise even further, the corresponding probability would get smaller and smaller. This means that the probability of obtaining judgments close to the true value will, under this assumption, approach zero. This is the real consequence of bias.
A reduction of noise, if bias is present in judgments, makes it less likely that the true value is going to be discovered. This is because biased judgments will absorb an increasingly large share of the probability. Ultimately, the probabilities around the true value will be smaller than in the distribution that has larger dispersion (larger noise). Whether or not this happens in a particular case depends mainly on the mutual relation between the size of bias and the size of noise. They need to be considered simultaneously. Otherwise, one cannot tell whether a reduction of noise will have a positive or a negative impact on the quality of judgments. Although the authors are aware of this problem, they do not explain it in the context suggested above. They confine themselves to claiming that 'Reducing noise would be less of a priority if bias were much larger than noise' (p. 66). It is difficult to agree that this is just a question of priorities. It is a problem of the benefits, or lack of benefits, that one could expect given different quantities of bias present in judgments. Here, benefits are expressed in terms of the probability of getting judgments close to the true value being assessed. Emphasizing (more than once in the book) that bias and noise contribute equally to MSE, which is true, while paying little attention to the various consequences of the mutual relation between the two components, may be misleading.
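The argument can be checked numerically. The following sketch rests on assumptions which are not taken from the book: judgments are treated as normally distributed, 'close to the true value' is read as falling within a unit interval centered on it (half-width 0.5), and the values of bias and noise are hypothetical. The function names mse and hit_probability are introduced only for this illustration.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def mse(bias, noise):
    """Error equation: MSE = bias^2 + noise^2 (noise = standard deviation)."""
    return bias ** 2 + noise ** 2


def hit_probability(bias, noise, half_width=0.5):
    """Probability that a judgment falls within +/- half_width of the true value.

    Assumption: the judgment is normally distributed with mean equal to
    (true value + bias) and standard deviation equal to noise.
    """
    upper = (half_width - bias) / noise
    lower = (-half_width - bias) / noise
    return normal_cdf(upper) - normal_cdf(lower)


if __name__ == "__main__":
    # Hypothetical values: a large bias (2.0) and a small bias (0.2).
    for bias in (2.0, 0.2):
        for noise in (2.0, 1.0, 0.5):
            print(f"bias={bias:3.1f}  noise={noise:3.1f}  "
                  f"MSE={mse(bias, noise):5.2f}  "
                  f"P(close to truth)={hit_probability(bias, noise):.4f}")
```

With the large bias, each reduction of noise lowers MSE and at the same time lowers the probability of a judgment landing near the true value; with the small bias, the same reduction of noise raises that probability. This is the sense in which the two components need to be considered jointly.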
All the other chapters of the second part of the book discuss the factors which generate noise and present further classifications of noise. The psychological background which can often be identified behind a given kind of noise is discussed widely and in an interesting way. 'Occasion noise' is perhaps the kind most commonly observed by people in their everyday social activities. However, the authors argue that this is not the most important source of system noise. Regardless of the factors which influence noise in individual assessments and judgments, it should not be assumed that a group discussion seeking consensus is always an efficient way to reduce noise. The discussion and its outcome may be affected, as the authors show, by a kind of social influence. Group polarization can be recognized as a special case of such influence and simultaneously as a source of noise. After a thorough discussion of these issues and of the results of the accompanying experiments, the authors conclude that groups of discussants looking for a consensus need to be properly managed in order to avoid a high level of noise. Without such management, the noise present in some individual judgments may be amplified.
The third part of the book covers various issues of noise which can arise in predictive judgments. The initial assumption stated and justified by the authors is that, in forecasting, human abilities are inferior to statistical models, including correlation and regression models, and additionally to artificial intelligence (AI). They indicate that one of the main factors which accounts for this is the noise which commonly affects people's judgments. In these circumstances even simple rules may be superior to human judgments. 'The combination of personal patterns and occasion noise weighs so heavily on the quality of human judgment that simplicity and noiselessness are sizable advantages' (p. 133). Having access to the same information, models and algorithms tend to be more efficient and more accurate than humans. In further chapters the reader will find descriptions of rules (algorithms) free from noise.
The authors prefer using the term 'rules', which in their interpretation has broader meaning than models and algorithms. Unlike people, rules are not overconfident, which is one of the reasons why they are generally more reliable in prediction. Moreover, some people tend to deny their own lack of knowledge: 'The denial of ignorance is all the more tempting when ignorance is vast' (p. 145). Assuming that a causal mechanism of a phenomenon of interest has been discovered, it remains difficult to predict accurately its future development. People tend to neglect not only uncertainty but also noise. 'The future seems as predictable as the past. And noise is neither heard nor seen' (p. 158) write the authors in a slightly metaphorical mode at the end of the third part of the book.
The content of the fourth part is well reflected by its title: 'How Noise Happens'. The authors focus on psychological aspects of noise in human judgments. This is an area of science in which Professor Daniel Kahneman is an internationally renowned and respected expert. Readers of his previous book (Kahneman, 2011) will find once again a clear presentation and examples of some common cognitive errors and biases which people tend to make. They are in particular: heuristics of substitution, conclusion biases, and excessive coherence (i.e. forming coherent views quickly and slowly changing them). Each of them can generate noise and additionally bias. The analysis of sources of noise is complemented by discussion of the statistical and psychological aspects of the formulation of judgments. Concentrating on statistical issues the authors stress how important it is to adopt proper scales in order to help avoid noise in predictive judgments. Some psychological circumstances are discussed even more extensively. Special attention is given to sources of 'what may be the most intriguing type of noise: the patterns of responses that different people have to different cases' (p. 159).
Pattern noise, defined by the authors as 'an error in an individual's judgment of a case that cannot be explained by the sum of the separate effects of the case and the judge' (p. 203), is studied thoroughly in the book. It is disaggregated into two components: occasion noise and stable pattern noise. Both are explained with regard to their sources and specific features. In the last chapter of the fourth part of the book one will find a list of all the components of noise presented earlier and ways of measuring each of them. However, one thing may be found slightly unclear in this chapter. While the graphical interpretation of MSE in Figure 16 raises no ambiguities, as it refers directly to the analytical formula for MSE (the sum of squared bias and squared standard deviation), the decomposition of stable pattern noise depicted in Figure 15 does not seem clear. This applies also to the formula which precedes the graph. The squaring of all elements in that equation may be perceived as not satisfactorily explained. It is intuitively meaningful; however, an analytical justification of this relation may not be straightforward.
Organizations which face the actual or potential consequences of noise in their activities may want to take measures to improve the judgments which account for the noise. How to do this is outlined in detail in the fifth part of the book. A part of this presentation is an original procedure for the audit of noise (Appendix A) and a checklist for the decision observer, a person who is in charge of searching for symptoms of cognitive errors in organizations (Appendix B). Sometimes, undesirable variability in judgments (noise) may be the result of a lack of knowledge or insufficient expertise of employees. However, very often it has not just one cause but a combination of different causes, which require specially designed strategies to reduce the noise. These include, for instance, aggregation of independent judgments. Such strategies may be effective in reducing particular cognitive errors, and consequently also noise, but will not be very useful in establishing which of the errors account most for the observed noise. 'Noise is an invisible enemy', say the authors (p. 244).
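The quoted definition of pattern noise can be made concrete with a small two-way decomposition. The sketch below is one conventional way of reading that definition, assumed here for illustration and not quoted from the book: the judge effect is the judge's mean minus the grand mean, the case effect is the case's mean minus the grand mean, and pattern noise is the standard deviation of what remains. The table of judgments is hypothetical, as is the function name decompose_system_noise. With a single judgment per judge and case, occasion noise cannot be separated from stable pattern noise, so the sketch does not attempt it.

```python
from statistics import fmean


def decompose_system_noise(judgments):
    """Split system noise into level noise and pattern noise.

    judgments[i][j] is the judgment of judge i on case j (a full table with
    one judgment per judge and case is assumed).
    Returns standard deviations computed with population (1/n) weights.
    """
    n_judges = len(judgments)
    n_cases = len(judgments[0])
    cells = [(i, j) for i in range(n_judges) for j in range(n_cases)]

    grand_mean = fmean(judgments[i][j] for i, j in cells)
    judge_means = [fmean(row) for row in judgments]
    case_means = [fmean(judgments[i][j] for i in range(n_judges))
                  for j in range(n_cases)]

    # System noise: spread of judgments around the mean judgment of each case.
    system_var = fmean((judgments[i][j] - case_means[j]) ** 2 for i, j in cells)
    # Level noise: spread of the judges' average levels.
    level_var = fmean((m - grand_mean) ** 2 for m in judge_means)
    # Pattern noise: what the separate judge and case effects do not explain.
    pattern_var = fmean(
        (judgments[i][j] - judge_means[i] - case_means[j] + grand_mean) ** 2
        for i, j in cells
    )
    return {
        "system": system_var ** 0.5,
        "level": level_var ** 0.5,
        "pattern": pattern_var ** 0.5,
    }


if __name__ == "__main__":
    # Hypothetical table: three judges (rows) assessing three cases (columns).
    table = [[3, 5, 7],
             [4, 8, 6],
             [6, 7, 11]]
    result = decompose_system_noise(table)
    print(result)
    # In this two-way layout the squared components add up exactly.
    print(result["system"] ** 2, result["level"] ** 2 + result["pattern"] ** 2)
```

In this layout the squared components add up because the judge effects and the residuals are uncorrelated by construction: the residuals of every judge sum to zero across cases. This gives at least a partial analytical reading of why such decompositions are written in terms of squares.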
The elimination of cognitive errors is not easy. They tend to be overlooked by a person who commits them, in spite of his/her ability to recognize them in others. To illustrate this phenomenon (blind spot) the authors refer to the survey of 400 professional forensic scientists from 21 countries, in which 71% agreed that cognitive bias is a cause for concern in the forensic sciences as a whole, but only 26% said that their own judgments are influenced by cognitive errors.
Criminology is one of several areas for which the authors propose strategies and methods for improving judgments and reducing noise. Other areas include anticipation, judgment formulation and decision-making in healthcare, assessment of staff performance ('rank but not force', p. 294), and the structuring of complex judgments in recruitment processes. A more developed approach to the issue of recruitment is a procedure which the authors have called MAP, the Mediating Assessments Protocol. Details of the protocol and examples of its implementation, based on both the authors' experience and their research, are discussed in the last chapter of the fifth part of the book (Chapter 25).

Do we need certain noise in judgments?
The content of the fifth part, but also of other chapters of the book, may suggest that the authors seek ways not only to reduce but to eliminate noise in all institutions which face its consequences. This could eventually be achieved by employing artificial intelligence (AI) or algorithms which would produce judgments for making decisions in judicial institutions, hospitals, centers of vocational counselling and others. The rationality of neural networks and various other forms of AI may be perceived as a remedy for human wavering and lack of certainty. It should be stressed, however, that the authors are aware of the major risks which such a substitution would probably generate. Some of them have been identified and well described by O'Neil (2017). The risks are not confined to the possible deepening of discrimination on various grounds, which is convincingly demonstrated by O'Neil (2017). They seem to be of a more fundamental nature: they consist in a profound standardization of all those characteristics of the units of interest (people, events) that are unknown to algorithms. For example, many of us would presumably feel uneasy if a doctor's diagnosis were based solely on algorithms, excluding medical experience and the conclusions of the patient interview. Similarly, professional judges and court jurors are commonly expected to take into account not only the output of algorithms but also other personal or social circumstances of the defendant. In other words, like the authors of the book, people are ready to accept a certain level of noise in the judgments of professionals.
What the acceptable level of noise is, and whether optimal noise exists, is explored in the final part of the book. One will find there several objections to efforts aimed at reducing or eliminating noise. The major ones are the prohibitive costs or even infeasibility of such efforts, the risk that the reduction of noise can introduce other errors, and the difficulty of adopting new values in a society where there is no room for noise. Moreover, the authors point out that some noise-reduction strategies might squelch people's creativity. The last risk is well exemplified by algorithms, which in principle tend to replicate patterns from the past and do not indicate the need for change. All these objections and arguments are discussed in an interesting way in this part of the book. The discussion leaves room for the reader's own reflections and views.
Apart from 'Review and Conclusion' (18 pages) the authors have decided to include the main findings and key messages of the book in a separate one-page long part called 'Epilogue: A Less Noisy World'. They argue that the world less affected by noise would bring large savings of money, improve public safety and health, increase fairness, and prevent many errors. In the process of transition to such a world they see a role for AI to play.

Summary and conclusions
The book offers a broad view of human judgments affected by noise, analyzed from various perspectives. It is also a competent and interesting discussion of the human weaknesses that underlie the mistakes we make when forming judgments. The authors did their best to be objective in evaluating the consequences of noise for institutions and society, as well as in discussing the reasons for reducing or eliminating noise. The conclusions presented in the book are particularly important nowadays, when AI offers increasing support in decision-making and gradually replaces human judgments with its own assessments.
The only thing that may be considered unclear and ambiguous in the book is the aforementioned relationship between bias and noise, particularly in the context of the frequency of wrong decisions based on judgments affected by both of these errors. The actual consequences of noise cannot be accurately determined without considering both bias and noise simultaneously. Noise reduction, despite its positive effect on MSE values, should not always be considered beneficial. If we all say the same thing and it turns out that we were all wrong, some may correctly conclude that more dispersed views, had they been available, would have suggested that the truth might be different.