Representing location: subjectivity and bias in geospatial data
Our lives are increasingly shaped by hidden algorithms, and as we enter the 2020s, selection bias in the data that powers them will influence more aspects of those lives than ever before. The choices we make about how to represent data reflect our priorities, preconceptions and even ideologies.
Transparency and mitigating bias
Bias in the use of data, and geolocation data in particular, presents ethical challenges for governments and app developers. Even representative data presents problems, as statistical outliers at the fringes of society help us define our ethical priorities, according to Dr Jonathan Cave, a professor of Applied Game Theory and data ethics expert who advises the UK government: “Statistical representativeness may be less important than getting an adequate coverage of the things we’re concerned about — particularly in a political context, when it is the extreme things that drive policy.”
To address the risks presented by bias, organisations need to build technical tools to assess the unintended consequences of their technology and mitigate human error, according to Noor Mo’alla, an expert in artificial intelligence for development and director of Doteveryone, a think tank that promotes responsible use of technology. Doteveryone is developing such tools to help organisations incorporate “consequence scanning” into project management.
Transparency around the collection and use of data can help governments and developers spot unintended consequences, and can protect both from accusations of intentional bias. Organisations like openDemocracy are experimenting with tools to make transparency around user data the norm. Dr Hannah Fry, Associate Professor in the Mathematics of Cities at the UCL Centre for Advanced Spatial Analysis, advocates a Hippocratic Oath for data science that could mitigate the risks posed by data bias and prioritise privacy and transparency. Such pledges are already attracting signatories: the Safe Face Pledge, championed by computer scientist Joy Buolamwini, founder of the Algorithmic Justice League, is a public commitment to address bias in tech and prioritise transparency.
Algorithms that predict future crime seem to disproportionately target ethnic minority neighbourhoods. A landmark 2016 investigation published by ProPublica revealed how a risk assessment algorithm used by US judges to set probation conditions was biased against African Americans. While ethnicity was not directly factored into risk assessment scores, the same biases that disadvantaged African Americans in wider society were replicated in the data. Previous convictions, education levels and employment history acted as proxies for race. “Blacks are almost twice as likely as whites to be labeled a higher risk but not actually re-offend,” the report states.
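The proxy effect described above can be sketched with a toy simulation. Everything here is invented for illustration — the groups, the "prior police contact" feature and the score weights are hypothetical, not drawn from the ProPublica analysis — but it shows the mechanism: a score that never sees a protected attribute can still reproduce its influence through a correlated feature.

```python
import random

random.seed(0)

# Hypothetical synthetic population: the score never sees "group", but a
# proxy feature (recorded prior police contact) is unevenly distributed
# across groups for historical reasons unrelated to actual behaviour.
def make_person(group):
    # Heavier policing of group "B" inflates recorded prior contacts.
    contact_rate = 0.6 if group == "B" else 0.3
    prior_contact = 1 if random.random() < contact_rate else 0
    return {"group": group, "prior_contact": prior_contact}

population = [make_person("A") for _ in range(5000)] + \
             [make_person("B") for _ in range(5000)]

# A "group-blind" risk score that looks only at the proxy feature.
def risk_score(person):
    return 0.8 if person["prior_contact"] else 0.2

def mean_risk(group):
    scores = [risk_score(p) for p in population if p["group"] == group]
    return sum(scores) / len(scores)

print(f"mean risk, group A: {mean_risk('A'):.2f}")
print(f"mean risk, group B: {mean_risk('B'):.2f}")  # higher, despite no group input
```

Removing the protected attribute from the inputs does nothing here; the disparity survives in whatever correlates with it.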
Prediction models can fail individuals, even if they are broadly representative, according to Noor Mo’alla: “Data that’s representative of most people means that other people are being left behind and sometimes that’s what’s creating bias.”
It’s not just the collection of data that can introduce bias inadvertently. Bias can emerge as a result of how data is shared and linked — for example, when sensitive information is included in data unnecessarily.
Noor Mo’alla says she observed this kind of bias while working on a UN sanitation project in a refugee camp. Truck drivers tasked with collecting waste water were accidentally incentivised to discriminate against women after gender was included in a data set given to them to help locate water tanks. “The bias wasn’t necessarily in the data, but was in how it was collected and shared,” explains Mo’alla. “Sometimes it’s not just the data set that introduces bias, but how it’s been shared or stored afterwards.”
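One common mitigation for this failure mode is data minimisation: strip fields a downstream task does not need before a data set is shared. The sketch below is a minimal illustration of that idea — the record schema, field names and coordinates are all hypothetical, not taken from the UNHCR project.

```python
# Hypothetical sensitive attributes that should never pass through to
# operational data shared with third parties.
SENSITIVE_FIELDS = {"gender", "ethnicity", "religion"}

def minimise_for_sharing(records, needed_fields):
    """Keep only the fields a downstream task actually needs,
    refusing to pass through sensitive attributes even if requested."""
    allowed = set(needed_fields) - SENSITIVE_FIELDS
    return [{k: v for k, v in r.items() if k in allowed} for r in records]

# Illustrative water-tank records, as might be handed to drivers.
tank_locations = [
    {"tank_id": 17, "lat": 36.57, "lon": 38.02, "gender": "F"},
    {"tank_id": 18, "lat": 36.58, "lon": 38.01, "gender": "M"},
]
shared = minimise_for_sharing(tank_locations, ["tank_id", "lat", "lon", "gender"])
print(shared)  # gender is stripped even though it was requested
```

The design choice worth noting is the deny-list on the sharing boundary rather than at collection time: the full record may legitimately exist internally, but the export path enforces minimisation.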
Bias can emerge in our interpretation of data if we don’t understand its context. Research by the RAND Corporation shows that the crack cocaine epidemic in Los Angeles in the 1980s was a market-driven phenomenon. Targeting drug dealers only shifted the problem, as others would take their place to satisfy the latent demand. Market forces outweighed individual tendencies to commit crime. In social psychology, attributing behaviours to individual dispositions while ignoring the context in which they occur is known as correspondence bias.
When trying to make sense of data, context is everything. By monitoring the price of goats at different locations, an innovation lab at the UN Refugee Agency (UNHCR) was able to predict migration from Somalia into Ethiopia during 2019, allowing them to provide resources to refugee camps in advance. Goats were being used almost like a banking system, according to UNHCR data scientist Rebeca Moreno Jimenez. Cultural bias was a barrier to seeing this, and Jimenez only discovered this correlation by talking to people on the ground — gathering the qualitative data that provided the context.
Politics of place
All maps are political, as is illustrated by a long-running cartographical controversy. In 1973, German historian Arno Peters presented what he claimed was a new “egalitarian world map”. Peters argued that the Mercator projection, the predominant map of the world at the time, diminished the importance of what is now referred to as the Global South, by seemingly inflating the size of the countries in the northern hemisphere. He claimed that his map rectified this by representing the sizes of the continents accurately.
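Peters's complaint about apparent size has a precise basis. The Mercator projection stretches both east-west and north-south distances by the secant of the latitude, so apparent area grows by the secant squared relative to the equator. A short sketch (place latitudes are rounded for illustration):

```python
import math

def mercator_area_inflation(lat_deg):
    """Area scale factor of the Mercator projection at a given latitude,
    relative to the equator. Both axes stretch by sec(latitude), so the
    apparent area grows by sec(latitude) squared."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

for place, lat in [("Equator", 0.0), ("Cairo", 30.0),
                   ("London", 51.5), ("Southern Greenland", 60.0)]:
    print(f"{place:>18}: x{mercator_area_inflation(lat):.2f}")
```

At 60 degrees of latitude the inflation factor is exactly 4, which is why high-latitude landmasses dominate a Mercator map while equatorial countries appear compressed.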
Cartographers were well aware that any flat projection of a spherical world must distort it somewhere, but their pragmatic acceptance of these trade-offs allowed Peters to accuse the profession of reinforcing European colonial attitudes, and to position his map as part of a cultural movement in opposition to a Eurocentric worldview.
The Peters controversy illustrates how our models of the world reflect our biases and priorities, and that ethics cannot be overlooked in geography.
Intentional bias and targeting
It’s important to remember that not all bias is unintentional. In China’s Xinjiang region, the authorities have mandated the use of government-supplied navigation equipment in all vehicles in order to track the locations of the Muslim Uighur ethnic minority. Maintaining and strengthening international commitments to rights around personally identifiable information and protected characteristics is an urgent priority for governments and aid organisations seeking to mitigate the risk of this kind of targeting. Governments, aid organisations and app developers can also mitigate bias by embracing transparency. Open debate around the ethics of data collection can help us to recognise bias when it emerges and work towards mitigating its impacts. More importantly, transparency around the use of data can help to ensure that the tools we design to protect minorities and vulnerable groups, like Myanmar’s Rohingya, are not used against them.