Unequal challenges of Deep Learning for mapping all schools in the world
Schools are critical infrastructure for the development of every child. Not only are they a fundamental asset in a child's education, but they also often provide unique access to connectivity and new technologies, act as hubs for critical news and information, and serve as vaccination centers or as shelters and reunification points after disasters.
Despite this importance, and perhaps unsurprisingly, many governments, especially those in poorer countries, lack good records of where their schools are. This limits their capacity to deploy programs at scale that can bring opportunity and development to the most vulnerable children.
Progress in Deep Learning and Transfer Learning techniques, along with extensive new datasets designed for training image recognition models and, more recently, satellite image recognition models, offers new hope and opportunities for solving critical infrastructure mapping challenges, such as mapping all schools in the world.
Nevertheless, on closer inspection, these datasets are biased towards labels, locations and places far from the reality of the most vulnerable children. Furthermore, AI solutions trained on these datasets report their performance measures only in aggregate. Together, these two facts make it very difficult to obtain the results needed to address the needs of the most vulnerable children, and to understand the performance bias these solutions might show when applied to different socio-economic backgrounds or diverse contexts, even along a split as basic as rural versus urban.
In this paper we examine these issues in detail. We present results from a project on mapping all schools in the world, and our experience using a Deep-Learning-in-the-loop approach to map schools in Colombia and the Caribbean, which produced a curated dataset of 40,000 schools and led to the semiautomatic discovery of 7,000 unmapped schools. In addition, we provide an initial approach towards a fairer analysis of the performance of such methods, examining both the results obtained and some of the most recent state-of-the-art alternatives. Finally, we present a discussion and offer alternatives to facilitate the development of fairer Deep Learning based infrastructure detection technologies that tackle the problems of the most vulnerable children.
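To illustrate why aggregate reporting can hide the kind of bias discussed above, the following is a minimal sketch of disaggregated evaluation, not the method used in this work. The function names, the rural/urban grouping key, and the toy records are all hypothetical and invented for illustration; they are not results from the project.

```python
from collections import defaultdict

def overall_recall(records):
    """Recall over all true schools, ignoring context.

    Each record is a (group, is_school, predicted_school) tuple.
    """
    positives = sum(1 for _, is_school, _ in records if is_school)
    hits = sum(1 for _, is_school, pred in records if is_school and pred)
    return hits / positives

def recall_by_group(records):
    """Recall computed separately for each group (e.g. rural vs. urban)."""
    positives = defaultdict(int)
    hits = defaultdict(int)
    for group, is_school, pred in records:
        if is_school:
            positives[group] += 1
            if pred:
                hits[group] += 1
    return {group: hits[group] / positives[group] for group in positives}

# Hypothetical toy data (NOT from the paper): 8 urban schools of which
# 7 are detected, and 4 rural schools of which only 1 is detected.
toy_records = (
    [("urban", True, True)] * 7
    + [("urban", True, False)] * 1
    + [("rural", True, True)] * 1
    + [("rural", True, False)] * 3
)

print("overall:", overall_recall(toy_records))
print("by group:", recall_by_group(toy_records))
```

In this made-up example the aggregate recall is about 0.67, while recall is 0.875 for the urban group and 0.25 for the rural group, so a single aggregate figure would mask the gap between contexts.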