By Dennis Nguyen & Aletta Smits
Data bias in the design process is a considerable challenge for an inclusive and fair digital society, as it can put at a disadvantage those users who are not considered part of an assumed “mainstream”. In the worst case, data bias not only excludes certain demographics from specific services and the value these offer to others but may even cause severe harm. Examples are, unfortunately, plentiful: facial recognition systems that are blind to people of colour, image recognition solutions that reproduce racism, voice recognition software that cannot understand minority accents (forcing some users to change the way they talk), or user interfaces that are virtually inaccessible to disabled people. Other areas include architecture and building design (e.g., work environments designed mainly for “healthy” occupants), product design (e.g., cars developed primarily for an assumed male driver), fraud detection that is biased against migrants, or medical testing that excludes certain age groups.
Simply put, data bias is a problem throughout society and a challenge for practitioners in diverse professional domains. Its impact on individuals ranges from barely noticeable to downright disastrous. There are two general types of data bias, both of which are forms of discrimination: minorities and/or marginalised groups are either not considered part of the target group for a design, or they are particularly vulnerable to the harmful effects of a design. Negative consequences result either from having no “data identity” in a design at all or from having a one-sided, disadvantageous data identity that reflects stereotypes and prejudices. Often, the unfair treatment is connected to questions of race, gender, health, and age, while digital technologies reinforce existing inequalities and create new ones.
The people affected by data bias face several disadvantages: 1) designs (e.g. services, products, systems) are simply inaccessible to them, and data bias thus may further solidify already disadvantageous positions in society; 2) designs may force them to adjust their behaviour to a perceived mainstream in order to gain access; and 3) designs can cause distress and harm to already vulnerable demographic groups. For example, gender biases in the design process can lead to both social discrimination and actual physical harm, while racial bias in healthcare exposes minorities to greater risks (Healthaffairs.org). Another example is data-driven systems that frequently punish stigmatised groups. The respective systems are either built for this very purpose (e.g., policing systems biased against specific ethnic groups) or are “abused” for discrimination (e.g., “Corona apps” used to identify and track marginalised groups such as the LGBTQI+ community in some countries).
This project’s first aim is to provide a concise definition of data bias that considers the problem’s different social and technical dimensions. While data bias and algorithmic discrimination are acknowledged challenges for an inclusive digital society, there is considerable disagreement among academics and professionals over the exact nature, extent, and especially causes of the problem. Different definitions lead to different proposed solutions, which are potentially too narrow to be effective in any meaningful sense. A systematic, critical review of the data bias discourse is needed to chart the challenge and take stock of approaches to the issue.
The second aim is to explore how data bias happens in concrete professional contexts. The empirical part sets out to chart “risk moments” in design processes and organisational decision-making. The guiding research questions are:
- How do organisations decide what data to consider in developing as well as validating an idea?
- What are the concrete origins of misconceptions and knowledge gaps about the targeted user group(s)?
- How can organisations expand their scope of preliminary investigation to avoid potential biases?
- What is practically feasible within the confinements of a given organisational context?
While awareness of the issue is growing, research on its causes, underlying processes, and impacts has only just begun, and the same holds for the development of practical solutions. The project strives to contribute to this effort through exploratory research that lays the foundation for further investigation and, ultimately, for empirically validated proposals for ethical and inclusive data practices in the professional field.