Timely and accurate data on local-level food security outcomes are key both to targeting interventions to populations in need and to understanding whether interventions had their intended benefit. Unfortunately, reliable data on local-level food security and related outcomes are often scarce.
Organizations requiring this information have two main options. They can rely on coarse, publicly available data, e.g. national-level data on incomes or agricultural performance, whose reliability is often questioned and which become available only months or years after the fact. Or they can collect the needed data themselves, which is expensive and time-consuming and requires designated staff.
And even when organizations have the resources and capacity to collect their own data, they inevitably must sample a small number of intended beneficiaries or clients. Such samples provide an accurate picture of average outcomes – e.g. average food security levels or average agricultural productivity across a population – but they are rarely large enough to allow organizations to really drill down to understand what outcomes look like in specific villages or on specific farms, or how they change over time.
While most villages and households in developing countries are rarely if ever visited by data collectors, satellites are collecting data on these locations all the time, with increasing frequency and resolution. New research seeks to understand whether this wealth of new information from satellites can be used to measure, understand, and improve food security outcomes around the developing world. Our work at Stanford University's School of Earth, Energy & Environmental Sciences is a collaboration with USAID's Bureau for Food Security, the College of William & Mary, and the Global Innovation Fund, as well as a number of organizations mentioned below.
We use satellites to measure smallholder agricultural productivity – a key component of food security for the majority of households in the developing world. Researchers have long used satellite-based remote sensing to measure agricultural outcomes, but the coarseness of the imagery historically available meant that this approach was only relevant for the much larger fields in developed countries. Only in the last few years have sensors been launched that have both the spatial resolution (e.g. 10 meters or less) and the overpass frequency (e.g. daily or weekly) needed to make accurate measurements of productivity on the often very small plots of smallholders. Key here are both publicly available satellite imagery, such as that from the Sentinel constellation of the European Space Agency, and imagery from important private-sector partners like Planet and Digital Globe.
A scalable, satellite-based approach to agricultural productivity measurement needs to be able to do at least two things: tell you what crops are being grown where, and tell you the productivity of these crops. The constraint in accomplishing these tasks is now rarely the availability of the satellite input, but instead the need for accurate, georeferenced ground data on which models must be trained and validated. We either collect these data ourselves, or we work with trusted partners such as the World Bank LSMS team and organizations like One Acre Fund. (If you’re reading this and have well-georeferenced agricultural data, please get in touch!)
How well can satellite-based models do at these two tasks? Our research – which includes work done with multiple excellent students and postdocs over the years – has mainly focused on maize thus far, the main African staple. Our findings from multiple countries across Africa have shown that satellites can predict with very high accuracy (>80%) whether a given plot or image pixel is planted with maize (e.g. see here and here).
We have also found that we can make an accurate prediction of the productivity on maize plots. Below we show the relationship between satellite-predicted and ground-measured maize yields across different settings, with satellite-based estimates explaining about half of the overall variation in ground-measured yields at the field level. Each dot represents a field where we have both a ground-based measurement and a satellite-based measurement.
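To make "explaining about half of the variation" concrete: one common way to summarize agreement between two sets of field-level measurements is the squared Pearson correlation (R²), where a value near 0.5 corresponds to roughly half the variation explained. The sketch below is only an illustration of that statistic, not our actual analysis pipeline; the function name and the yield values are made up for the example.

```python
def r_squared(predicted, observed):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sums of cross-products and squared deviations from the means
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov ** 2 / (var_p * var_o)


# Hypothetical maize yields in tonnes per hectare, one pair per field
satellite_yield = [1.2, 2.5, 1.8, 3.1, 2.0, 2.7]
ground_yield = [1.0, 2.9, 2.4, 2.6, 1.5, 3.2]

print(f"R^2 = {r_squared(satellite_yield, ground_yield):.2f}")
```

Each pair of values plays the role of one dot in the scatter plots: one field, measured twice.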
You might be thinking: explaining half the variation is okay, but what about all that scatter we see in these graphs? Why aren’t satellites doing better? We’ve found that a big part of the answer is the ground data themselves: even the most carefully collected ground data come with their own sources of error. For self-reported data, farmers frequently mis-estimate their field size or forget (or never measure) exactly how much they harvested. And “objective” ground-based measures collected by surveyors are typically based on crop cuts that estimate yield on an entire plot by collecting grain on only a tiny part of the plot (typically just a few square meters, or <1% of the plot). When scaled to the entire plot, these estimates then have substantial error as well.
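A toy simulation can show why scaling up a tiny crop cut is noisy. The sketch below assumes each plot is divided into small cells whose yields vary around the plot average, then compares the yield in a few randomly chosen cells with the true whole-plot yield. Every number here (cells per plot, the amount of within-plot variation, the yield range) is a hypothetical choice for illustration, not a value from our field data.

```python
import random
import statistics


def median_crop_cut_error(cuts_per_plot, n_plots=200, cells_per_plot=500,
                          within_plot_cv=0.4, seed=0):
    """Median relative error of a crop-cut estimate versus true plot yield.

    Each plot is split into `cells_per_plot` equal cells; a crop cut
    harvests `cuts_per_plot` of them and uses their mean as the estimate.
    """
    rng = random.Random(seed)
    rel_errors = []
    for _ in range(n_plots):
        plot_mean = rng.uniform(1.0, 4.0)  # assumed average yield, t/ha
        cells = [max(0.0, rng.gauss(plot_mean, within_plot_cv * plot_mean))
                 for _ in range(cells_per_plot)]
        true_yield = statistics.fmean(cells)
        estimate = statistics.fmean(rng.sample(cells, cuts_per_plot))
        rel_errors.append(abs(estimate - true_yield) / true_yield)
    return statistics.median(rel_errors)


# Two cells out of 500 is 0.4% of the plot, i.e. under 1% of its area
print(f"2 cuts:  {median_crop_cut_error(2):.1%} median error")
print(f"50 cuts: {median_crop_cut_error(50):.1%} median error")
```

Under these assumptions, harvesting more of the plot shrinks the error, but doing so at scale is exactly the expensive, time-consuming field work that limits ground data in the first place.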
So where’s more of the noise coming from: ground or sky? If most of the noise is coming from the sky, then we might expect satellite-based measures to correlate poorly with other agricultural factors that we think should improve yields – e.g. farmers’ use of fertilizer or hybrid seeds. Instead, in settings where we have data on these other inputs, we find that satellite-based yield estimates are roughly as good as ground-based yield measures at predicting responses to these inputs (see here and here). This suggests that satellite-based estimates are probably no noisier than most ground-based estimates, and that for many applications you’d be just as well off using satellite-based yield estimates as you’d be using ground-based yield estimates. This is good news, both because we have imagery everywhere and because a plot-level yield estimate from a satellite is orders of magnitude cheaper, and much faster, than doing a survey.
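The logic of this test can be illustrated with simulated data: if true yield responds to an input such as fertilizer, then the noisier a yield measure is, the weaker its correlation with that input will be. The sketch below is a simplified stand-in for the comparison, not the analysis we ran; the fertilizer response, noise levels, and function names are all assumed for the example.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_with_fertilizer(noise_sd, n_fields=5000, seed=1):
    """Correlation between fertilizer use and a noisy yield measure."""
    rng = random.Random(seed)
    fertilizer, measured_yield = [], []
    for _ in range(n_fields):
        kg_per_ha = rng.uniform(0, 100)  # assumed range of fertilizer use
        # Assumed true response: yield rises with fertilizer, plus
        # field-to-field variation unrelated to fertilizer
        true_yield = 1.5 + 0.015 * kg_per_ha + rng.gauss(0, 0.5)
        fertilizer.append(kg_per_ha)
        # Measurement noise is added on top of the true yield
        measured_yield.append(true_yield + rng.gauss(0, noise_sd))
    return pearson(fertilizer, measured_yield)


for noise_sd in (0.3, 0.8, 1.5):
    r = correlation_with_fertilizer(noise_sd)
    print(f"measurement noise sd={noise_sd}: correlation={r:.2f}")
```

Read in reverse, this is the argument in the text: when two yield measures track inputs about equally well, their noise levels are probably comparable.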
This does not mean surveys are no longer important. On the contrary, ground surveys have a critical role to play, as they can measure countless things that satellites can’t, and they provide key training data for satellite-based approaches. We believe the way forward lies in combining high-quality, geo-referenced ground-based surveys with the wealth of new satellite data and computational methods becoming available.
Using this powerful combination, we’re now working to scale these crop yield estimates across the continent and over time. Below is an example of these sorts of scaled estimates, showing estimated field-level maize yields across multiple years and hundreds of thousands of fields in Western Kenya.
Again, unlike most survey data, these estimates are available for all fields in a region – basically an annual census of fields, rather than an intermittent sample of just a few of them. And reliable estimates can be generated as soon as the season is over, without the long time lag often associated with ground-based estimates.
In our view, academia is not necessarily the right place to continually serve and update these types of datasets and help them get incorporated into the operations of the many organizations that see value in them. Incentives – and core competencies – in academia are to innovate on new methods rather than to take existing methods, scale them, and make them usable to non-specialists. And so, with initial support from the Rockefeller Foundation, we’ve founded (with Stefano Ermon, a computer science colleague) a public-benefit corporation named Atlas AI, which will generate and serve these and related estimates going forward. If these sorts of data can be useful to you, please get in touch and check out our work at Atlas (http://www.atlasai.co).
This post was written by Marshall Burke and David Lobell, with Stanford’s School of Earth, Energy & Environmental Sciences, originally published on Agrilinks in May 2019.