Machine Learning Datasets for Crop Diseases: Imagery and Spectrometry Data
The project aims to deliver open, accessible, and high-quality machine learning datasets for crop pest and disease diagnosis, based on crop imagery and spectrometry data from Uganda, Tanzania, Namibia, and Ghana. Developing beneficial and effective real-world machine learning applications requires localized and labeled pest and disease datasets. This project will provide such datasets for food security crops grown in sub-Saharan Africa: cassava, maize, beans, bananas, pearl millet, and cocoa. In collaboration with national agricultural experts, this study will deliver two datasets on crop pests and diseases: (a) a field-level, geo-coded and time-stamped dataset of 145,000 images representing diseased and healthy cassava, maize, beans, banana, pearl millet, and cocoa crops; and (b) a dataset of 8,160 cassava spectra and 2,000 spectral points of maize and pearl millet representing disease manifestations before symptoms are visible to the human eye.
Image phenotyping for necrosis in cassava roots
We automate necrosis phenotyping more efficiently than current methods.
We use artificial intelligence to mine data from local village radio stations to generate timely data on crop pests and disease in sub-Saharan Africa. Crop loss due to pests and disease threatens the economic survival of smallholder farmers, and access to surveillance data is critically important yet often not affordable. Local radio shows are a powerful source of information flow in rural African villages: they cover topics including politics, policy, climate, and social circumstances, in addition to crop concerns. Collectively, this information provides a holistic representation of current events in these communities. We will analyze local broadcasts to generate crop surveillance data that is linked to the local community situation. Radio content will be collected at low cost through a collaboration with Pulse Labs Kampala, and we will build artificial intelligence models based on deep neural networks and keyword identification to mine the data. The results will be combined with photographs of diseased crops provided by local farmers and used to train machine learning models that can ultimately extract radio information in multiple languages and with diverse accents. This project will provide near real-time crop surveillance data and allow for timely responses to threats.
Computational prediction of famine
Food shortages are increasing in many areas of the world. We are looking at how to infer the probability of households experiencing famine, based on demographic and geographical features. We are also interested in using structure learning techniques to understand the causal relationships between these factors and famine risk.
Causal discovery in disease data

It is sometimes thought to be impossible to discover causes of events without any background knowledge or the ability to do experiments. However, the field of inferring causes and effects with purely observational data is developing. Correlation does not directly imply causation, but some patterns of association make particular causal relationships more likely than others.
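One such pattern can be illustrated with a small simulation. The sketch below is an illustration on synthetic data, not the method used in this work: the variables x, y and z and the helper names are hypothetical. Two independent variables x and y both influence z. The marginal correlation between x and y is close to zero, but once z is controlled for, a strong association appears. That signature is consistent with x and y being separate causes of z, and hard to explain if z caused both of them.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic data only: x and y are independent causes of z.
rng = random.Random(0)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
z = [xi + yi + rng.gauss(0, 0.5) for xi, yi in zip(x, y)]

marginal = pearson(x, y)             # near zero: x and y are unrelated
conditional = partial_corr(x, y, z)  # strongly negative once z is known

print(f"corr(x, y)     = {marginal:.3f}")
print(f"corr(x, y | z) = {conditional:.3f}")
```

Practical causal discovery methods combine many such conditional independence checks, and need to account for sampling noise and non-linear relationships, which this sketch ignores.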

This work is focused on developing fast methods to find strong causes and effects related to a target variable from a large set of covariates. This is useful (1) for gaining insight into a domain, and (2) for predicting the effects of interventions. We are particularly interested in applying this to data collected in Uganda concerning the prevalence of disease and the outbreak of epidemics such as cholera and Ebola. This analysis could confirm or disconfirm our ideas about the climatic, demographic and environmental factors that are thought to influence such events. An indication of the relative strengths of different causes can also help in predicting the efficacy of different eradication policies. Our entry to the NIPS 2008 causal discovery competition received an honourable mention for “significant advance on the REGED dataset”.
Mobile monitoring of crop disease
Cassava is the world’s third-largest source of carbohydrate and can grow in hostile conditions where other crops cannot, but it has one major weakness: susceptibility to viral disease. Monitoring the spread of disease is essential in countries that depend on cassava as a staple crop, but the processes currently employed are expensive and slow. We are working on an automated system that uses $100 smartphones to capture images, diagnose disease with computer vision techniques and provide real-time map information, with extensions to banana diseases and automated pest surveys. For more information about this project, see the blog. Work supported by the Bill & Melinda Gates Foundation.
Automated malaria diagnosis with digital microscopy

The most reliable test for malaria is microscopic examination of blood films for the presence of the parasite. The problem is that this requires equipment and an on-site expert to use it. Some researchers have recently indicated the promise of combining microscopy with mobile phones, so that an expert need not be physically present, and others have investigated computer vision techniques for automatic classification, so that a human expert need not be available at all. However, all of this work has been undertaken in ideal laboratory conditions. We are developing these ideas and trialling an automated diagnosis system in the field, intended for use by non-experts. We work with thick blood film slides.

For more information about this project, see here. Work supported by Microsoft Research.
Data generation and language technology for low-resourced African languages

Developing natural language processing (NLP) techniques for tasks such as machine translation (MT) requires monolingual and cross-lingual resources. Currently, exploring advances in NLP for low-resource languages and language pairs in the developing world is hampered by the lack of data resources. For example, in Uganda, where there are over 40 independent languages, there are neither monolingual nor bilingual or multilingual resources for developing NLP systems of the kind that significantly benefit well-resourced languages. We are now using both manual and existing automated methods to build bilingual corpora for several language pairs involving low-resourced African languages. We plan to use the corpora to explore several NLP applications involving these languages.

Work supported by a Google Research Award.

Robust traffic flow monitoring
Traffic monitoring systems usually make assumptions about the movement of vehicles, such as that they drive in dedicated lanes, and that those lanes rarely include non-vehicle clutter. Urban settings within developing countries often present extremely chaotic traffic scenarios which make these assumptions unrealistic. We are working on robust techniques for traffic congestion monitoring. Instead of tracking individual vehicles we treat a lane of traffic as a fluid and estimate the rate of flow.
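One standard way to make the fluid analogy concrete is the fundamental relation of traffic flow, which says that flow equals density times mean speed. The sketch below only illustrates that textbook identity; it is not necessarily the estimator used in this project, and the function name, parameter names and units are assumptions made for the example.

```python
def flow_rate(density_veh_per_km: float, mean_speed_km_per_h: float) -> float:
    """Vehicles per hour passing a point, using flow = density * speed."""
    if density_veh_per_km < 0 or mean_speed_km_per_h < 0:
        raise ValueError("density and speed must be non-negative")
    return density_veh_per_km * mean_speed_km_per_h


# Hypothetical readings for one lane: a dense, slow stream and a sparse,
# fast stream can carry the same flow.
congested = flow_rate(density_veh_per_km=80, mean_speed_km_per_h=15)
free_flowing = flow_rate(density_veh_per_km=20, mean_speed_km_per_h=60)
print(congested, free_flowing)
```

Because very different traffic states can yield the same flow, a congestion monitor along these lines would need to report density or speed alongside the flow rate.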
Spatiotemporal models for biosurveillance
It is useful to know the geographical density of a transmittable disease in order to plan interventions and to predict its future developments. This can be difficult where there is a lack of co-ordinated statistics, which is often the case where diseases like malaria or tuberculosis are endemic. In such situations it is possible to combine irregular updates from a variety of less consistent sources. We are looking at the use of spatiotemporal state space models for biosurveillance, for use when there are irregular updates about disease counts.
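As a minimal sketch of how a state space model copes with irregular updates, the example below runs a one-dimensional Kalman filter with a random-walk state for a single location. It is an illustration under simplifying assumptions, not the model used in this project: the function name, the variance parameters and the toy reports are all hypothetical, and spatial coupling between locations is omitted. At every time step the uncertainty grows; it shrinks again only when a report arrives.

```python
def filter_counts(observations, n_steps, process_var=1.0, obs_var=4.0,
                  init_mean=0.0, init_var=100.0):
    """Track a latent disease level from sparse reports.

    observations maps a time step to a reported count; steps with no
    report are simply absent. Returns one (mean, variance) per step.
    """
    mean, var = init_mean, init_var
    estimates = []
    for t in range(n_steps):
        # Predict: under a random walk the mean is unchanged and the
        # uncertainty grows by the process variance.
        var += process_var
        if t in observations:
            # Update: blend the prediction with the report, weighted by
            # their relative uncertainties (the Kalman gain).
            gain = var / (var + obs_var)
            mean += gain * (observations[t] - mean)
            var *= 1 - gain
        estimates.append((mean, var))
    return estimates


# Toy reports arriving only at steps 0 and 3.
reports = {0: 10.0, 3: 16.0}
for step, (m, v) in enumerate(filter_counts(reports, n_steps=5)):
    print(f"t={step}: mean={m:.2f}, variance={v:.2f}")
```

A spatiotemporal version would replace the scalar state with a vector over locations, so that a report from one district also informs the estimates for its neighbours.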
Kudu: Auction design for agricultural commodity trading

We are trialling an auction system called Kudu, which is designed for trading agricultural commodities in Uganda by phone or web. This is a double auction, meaning that buyers and sellers submit their information separately, and we computationally find the best matches.
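The matching step of a double auction can be sketched as follows. This is a simplified greedy illustration, not Kudu's actual algorithm: the Order class, the match_orders function, the midpoint pricing rule and the example orders are assumptions, and a real system would also need to weigh factors such as crop type and location. Bids are sorted from highest to lowest and asks from lowest to highest, and orders are paired for as long as the buyer's price is at least the seller's.

```python
from dataclasses import dataclass


@dataclass
class Order:
    trader: str
    price: float   # bid: most a buyer will pay; ask: least a seller accepts
    quantity: int  # units of the commodity, e.g. kilograms


def match_orders(bids, asks):
    """Greedily pair the highest bids with the lowest asks.

    Returns a list of (buyer, seller, quantity, price) trades, priced at
    the midpoint between the bid and the ask.
    """
    bids = sorted(bids, key=lambda o: o.price, reverse=True)
    asks = sorted(asks, key=lambda o: o.price)
    bid_left = [b.quantity for b in bids]
    ask_left = [a.quantity for a in asks]
    trades = []
    i = j = 0
    while i < len(bids) and j < len(asks) and bids[i].price >= asks[j].price:
        qty = min(bid_left[i], ask_left[j])
        if qty > 0:
            price = (bids[i].price + asks[j].price) / 2
            trades.append((bids[i].trader, asks[j].trader, qty, price))
        bid_left[i] -= qty
        ask_left[j] -= qty
        # Move past any order that has been completely filled.
        if bid_left[i] == 0:
            i += 1
        if ask_left[j] == 0:
            j += 1
    return trades


# Hypothetical orders for a single crop.
bids = [Order("buyer_1", 1000, 50), Order("buyer_2", 800, 30)]
asks = [Order("seller_1", 700, 40), Order("seller_2", 900, 60)]
for trade in match_orders(bids, asks):
    print(trade)
```

Because buyers and sellers only submit a price and a quantity, orders of this shape could be collected through a short text message as easily as through a web form.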

This approach seems more promising than both single auction systems (i.e. listings sites, which can’t be used with a basic phone or anywhere bandwidth is scarce) and price advisory systems, which have problems with accuracy and timeliness (wholesale market prices in Kampala change in the course of hours, hence a weekly price bulletin is of limited use). By matching buyers and sellers algorithmically we can overcome these problems. The prototype web interface to the auction system can be tried here (requires a Ugandan mobile phone number for registration), or text BUY or SELL to 8228. The crops we are currently supporting are coffee, beans, sweet banana and watermelon. Work supported by a Google Research Award.