Reviewing a Decade of Research on Satellite-Based Yield Modeling in Smallholder Agriculture
Reliable crop production estimates shape many decisions in food security planning. Governments use them to anticipate shortages, guide imports, and coordinate response programs. Yet in many low- and lower-middle-income countries dominated by smallholder agriculture, those estimates remain difficult to produce. Reliable ground measurements of crop yields are limited in many smallholder regions, making it difficult to train and evaluate yield models.
Satellite observations allow crop conditions to be monitored across large regions where field surveys are limited. Researchers have increasingly used these data to estimate and forecast yields. In practice, this usually means stitching together several imperfect signals: satellite vegetation indicators, rainfall estimates, and whatever yield estimates exist for the area. A recent review by NASA Harvest researchers, published in the International Journal of Applied Earth Observation and Geoinformation, examines how these methods are being applied across smallholder farming systems and where they still struggle.
Smallholder landscapes present challenges that many yield models were not originally designed to handle. Fields are small and fragmented, and a single satellite pixel may include multiple crops, trees, or non-agricultural land. Farming practices vary widely between neighboring plots, and reliable ground measurements of crop yields are often scarce or available only at coarse administrative scales.
“Smallholder agriculture introduces a level of variability that many yield models were not built for,” said Fangjie Li, a NASA Harvest researcher and lead author of the study. “Understanding how researchers are adapting Earth observation methods to these environments helps clarify both the progress that has been made and the gaps that still remain.”
To examine those patterns, the team reviewed 268 peer-reviewed studies published between 2012 and 2022 that used Earth observation data to estimate or forecast crop yields in 73 countries dominated by smallholder agriculture. The studies covered 25 crops, with most focusing on staple cereals such as maize, wheat, and rice.
Caption: Figure 7. Modeling approaches used in yield estimation studies over time.
Number of studies by year using statistical regression, machine learning, or process-based crop models. The figure shows the rapid growth of machine learning approaches in recent years alongside continued reliance on statistical models in data-limited environments.
Across the selected studies, the researchers found that three modeling approaches appear repeatedly. Statistical regression models remain common because they can operate with relatively limited input data. Machine learning approaches have expanded quickly in recent years and often improve predictive accuracy when satellite observations are combined with weather variables. Process-based crop models are less common. They simulate crop growth using environmental conditions and plant physiology, but they require detailed information about soils and farm management that is rarely available in smallholder systems.
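To make the appeal of statistical regression in data-limited settings concrete, here is a minimal sketch of a one-variable least-squares fit relating a satellite vegetation indicator to reported yield. It is an illustration only, not a method taken from the review: the function name `fit_line`, the choice of peak-season NDVI as the single predictor, and all numeric values are hypothetical and synthetic.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and cross-deviations of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, made-up values for five districts (NOT real data):
# peak-season NDVI and reported yield in tonnes per hectare.
peak_ndvi = [0.42, 0.55, 0.61, 0.70, 0.48]
yield_t_ha = [1.1, 1.8, 2.0, 2.6, 1.3]

slope, intercept = fit_line(peak_ndvi, yield_t_ha)

# Estimate yield for a hypothetical district with peak NDVI of 0.65
estimate = slope * 0.65 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate={estimate:.2f} t/ha")
```

A model like this needs only a handful of paired observations, which is one reason simple regressions persist where ground data are scarce. The trade-off is that a single linear term cannot capture the field-to-field variability described above.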
Most studies rely on freely available satellite imagery such as MODIS, Landsat, and Sentinel-2. Vegetation indices like NDVI are widely used to track crop growth, often alongside rainfall and temperature data. Figure 7 illustrates how the use of different modeling approaches has evolved across the literature over time.
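For readers unfamiliar with NDVI, the index is computed from near-infrared (NIR) and red reflectance as (NIR - Red) / (NIR + Red). The sketch below shows that calculation for single pixels and, under the simplifying assumption that reflectance mixes linearly by area, how a pixel covering both crop and bare soil produces an intermediate value. The reflectance numbers and the helper names `ndvi` and `mix` are hypothetical, chosen only for illustration.

```python
def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red), or None if the sum is zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def mix(a, b, fraction_a):
    """Area-weighted linear mix of two reflectance values (an assumption)."""
    return fraction_a * a + (1.0 - fraction_a) * b


# Hypothetical surface reflectance values on a 0-1 scale (NOT real data)
crop_nir, crop_red = 0.50, 0.10
soil_nir, soil_red = 0.30, 0.20

crop_ndvi = ndvi(crop_nir, crop_red)
soil_ndvi = ndvi(soil_nir, soil_red)

# A pixel that is half crop and half bare soil
mixed_nir = mix(crop_nir, soil_nir, 0.5)
mixed_red = mix(crop_red, soil_red, 0.5)
mixed_ndvi = ndvi(mixed_nir, mixed_red)

print(f"crop={crop_ndvi:.3f}, soil={soil_ndvi:.3f}, mixed={mixed_ndvi:.3f}")
```

In this toy case the mixed pixel's NDVI falls between the crop and soil values, so the crop signal is diluted. This is the kind of effect that small, fragmented fields introduce when pixels are larger than the plots being observed.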
Despite rapid growth in satellite-based yield modeling, several structural constraints appear consistently across studies. Reliable ground yield data remain limited in many regions, making model calibration and validation difficult. Small field sizes create mixed pixels in satellite imagery that obscure crop signals. Research coverage also remains uneven across countries and crops, as shown in Figure 4.
Caption: Figure 4. Geographic distribution of yield forecasting and estimation studies included in the review.
Map showing where satellite-based yield modeling studies have been conducted across low- and lower-middle-income countries dominated by smallholder agriculture. The distribution highlights regions where research activity is concentrated and where evidence remains limited.
“The literature shows real progress, but it also highlights where the biggest uncertainties remain,” said Sheila Baber, a NASA Harvest researcher and co-author of the study. “Recognizing those gaps is an important step toward building yield models that can actually work in smallholder systems.”
Mapping where research has concentrated and where it remains sparse can help guide future work. Improving yield estimation in smallholder landscapes will likely depend not only on new algorithms, but also on better ground data, improved methods for fragmented agricultural landscapes, and stronger data infrastructure.