
Machines and Society

A growing guide on the latest in data-driven research and emerging technologies at the intersection of society, information and technology.


LLMs are applied across domains and adopted by various organizations. With their rising popularity comes their increasing downstream impact on individuals and society. One significant aspect of their influence is the biases introduced throughout the AI pipeline, originating from both data and algorithms.

Therefore, understanding these biases is crucial before we engage in meaningful discussions about their implications and take steps to mitigate and account for them. This includes understanding the concept of bias, acknowledging the sources, processes, and histories that produce, propagate and perpetuate these biases, and recognizing the potential risks and harms they may pose to our society. 

Notions of Bias

Bias is an overloaded term. For instance, "social bias" refers to prejudices, inclinations, or discriminatory attitudes toward an individual or a group. It typically covers biases based on inherent or acquired characteristics, such as gender, age, sexuality, physical appearance, disability, ethnicity, race, nationality, socioeconomic status, profession, religion, and culture (the tendency to interpret a word or phrase according to the meaning a given culture assigns to it; e.g., eating meat).

Bias may also refer to "statistical bias", the systematic difference between a model's predictions and the ground truth. In machine learning, minimizing statistical bias is one component of reducing error. In language models, "bias" generally refers to distributional skews that lead to unfavorable impacts on particular individuals or social groups.
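To make the statistical sense concrete, the bias of an estimator is the gap between its expected value and the true quantity. The sketch below simulates this with synthetic data; the estimators and the 0.9 shrinkage factor are made up purely for illustration:

```python
import random

random.seed(42)
TRUE_MEAN = 5.0

def sample_mean(xs):
    return sum(xs) / len(xs)

def shrunk_mean(xs):
    # A deliberately biased estimator that shrinks the sample mean toward zero.
    return 0.9 * sample_mean(xs)

def estimate_bias(estimator, n_trials=20000, n=30):
    # Statistical bias = E[estimator] - true value, approximated by simulation.
    total = 0.0
    for _ in range(n_trials):
        xs = [random.gauss(TRUE_MEAN, 1.0) for _ in range(n)]
        total += estimator(xs)
    return total / n_trials - TRUE_MEAN

print(round(estimate_bias(sample_mean), 2))  # ≈ 0.0 (unbiased)
print(round(estimate_bias(shrunk_mean), 2))  # ≈ -0.5 (systematically low)
```

Note that this kind of bias is a property of the estimator relative to the data-generating process; the social sense of bias discussed above concerns what that process itself encodes.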

Navigli, R., Conia, S., & Ross, B. (2023). Biases in Large Language Models: Origins, Inventory and Discussion. ACM Journal of Data and Information Quality.


Weidinger, L., Mellor, J., Rauh, M., Griffin, C., Uesato, J., Huang, P. S., ... & Gabriel, I. (2021). Ethical and Social Risks of Harm from Language Models. arXiv preprint arXiv:2112.04359.


In practice, the boundaries between the social and the statistical senses of the term are not always clear. Social and historical biases rooted in world knowledge enter ML/AI systems: they are encoded in text corpora and perpetuated in a cascading fashion. For instance, word embeddings trained on large text corpora containing human biases can lead a large language model to produce undesirable outcomes, even if the data is perfectly sampled and measured.
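One way such encoded bias is measured is by comparing how strongly a word vector associates with contrasting attribute words, in the spirit of the association tests cited below. The toy three-dimensional "embeddings" here are fabricated to exhibit a gendered skew; real studies use embeddings trained on large corpora:

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Tiny made-up vectors that encode a gendered skew, for illustration only.
emb = {
    "doctor": [0.9, 0.3, 0.1],
    "nurse":  [0.2, 0.9, 0.1],
    "he":     [1.0, 0.1, 0.0],
    "she":    [0.1, 1.0, 0.0],
}

def association(word, attr_a, attr_b):
    # Positive => the word sits closer to attr_a than to attr_b.
    return cosine(emb[word], emb[attr_a]) - cosine(emb[word], emb[attr_b])

print(association("doctor", "he", "she"))  # > 0: skewed toward "he"
print(association("nurse", "he", "she"))   # < 0: skewed toward "she"
```

Because downstream models inherit these geometric associations, a skew measurable at the embedding level can surface later as biased completions or classifications, which is the cascading effect described above.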

Campolo, Alex, Madelyn Sanfilippo, Meredith Whittaker, and Kate Crawford. “AI Now 2017 Report.” Edited by Andrew Selbst and Solon Barocas. AI Now Institute, October 18, 2017.


Suresh, H., & Guttag, J. (2021). A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle. Equity and Access in Algorithms, Mechanisms, and Optimization, 1–9. 


Bias per se is not necessarily bad. Biases can be morally neutral, such as those concerning insects or flowers, or those mirroring the existing distribution of gender across professions or first names. Biases become problematic when, as input to predictive systems, they exacerbate existing inequalities between users. Moreover, it has been argued that many of these biases can serve as a diagnostic tool for the state of society.

Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–186.


Hovy, D., & Prabhumoye, S. (2021). Five Sources of Bias in Natural Language Processing. Language and Linguistics Compass, 15(8), e12432.


Schramowski, P., Turan, C., Andersen, N., Rothkopf, C. A., & Kersting, K. (2022). Large pre-trained language models contain human-like biases of what is right and wrong to do. Nature Machine Intelligence, 4(3), Article 3. 

Sources of Bias

LLMs depend heavily on data that is either generated by humans or collected through systems designed by humans. Data is central to the performance, fairness, robustness, and safety of AI systems. The quality of data carries significant scientific implications for the generalizability of results, as well as ethical implications that can lead to real-world harm.

This section focuses on sources of bias introduced throughout the data-for-AI pipeline. Each stage of data and model development involves choices, practices and contexts that can lead to undesirable downstream consequences. 

Bender, E. M., & Friedman, B. (2018). Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science. Transactions of the Association for Computational Linguistics, 6, 587-604.


Liang, W., Tadesse, G. A., Ho, D., Fei-Fei, L., Zaharia, M., Zhang, C., & Zou, J. (2022). Advances, Challenges and Opportunities in Creating Data for Trustworthy AI. Nature Machine Intelligence, 4(8), 669-677.


Whang, S. E., Roh, Y., Song, H., & Lee, J.-G. (2023). Data collection and quality challenges in deep learning: A data-centric AI perspective. The VLDB Journal, 32(4), 791–813.


In addition, literature from the fields of Human-Computer Interaction (HCI) and Science and Technology Studies (STS) provides complementary perspectives on the data-for-AI pipeline. For instance, HCI scholars examine the data-work practices that shape the design and use of data-driven systems, while STS researchers highlight the sociopolitical contexts in which data, people, and things are situated. Readers are encouraged to explore the literature from these areas further.

Sambasivan, N., Kapania, S., Highfill, H., Akrong, D., Paritosh, P., & Aroyo, L. M. (2021). “Everyone Wants to Do the Model Work, Not the Data Work”: Data Cascades in High-Stakes AI. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-15.

Data Collection

Data labeling

Data processing

Model development

Model evaluation

Model deployment


Risks and Harms

* This part is under construction and coming soon.


Weidinger, L., Mellor, J., Rauh, M., Griffin, C., Uesato, J., Huang, P. S., ... & Gabriel, I. (2021). Ethical and Social Risks of Harm from Language Models. arXiv preprint arXiv:2112.04359.


Blodgett, S. L., Barocas, S., Daumé III, H., & Wallach, H. (2020). Language (Technology) is Power: A Critical Survey of “Bias” in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, (pp. 5454–5476), Association for Computational Linguistics.


Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., ... & Liang, P. (2021). On the Opportunities and Risks of Foundation Models. arXiv preprint arXiv:2108.07258.