Funder

Lacuna Fund

Keywords (Technologies and Domain)

Language Technology, Personally Identifiable Information (PII)

Lacuna PII

The Lacuna PII project developed high-quality, privacy-aware text datasets for Luganda, Lumasaaba, Hausa, and Kanuri, with particular attention to protecting Personally Identifiable Information (PII) and addressing gender bias. Text was collected from diverse sources, including radio transcripts, newspapers, and publicly available datasets, and then annotated. Annotators identified and labeled entities such as names, locations, organizations, dates, and contact details, while maintaining grammatical accuracy and preserving the linguistic integrity of the text.
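As an illustration of what span-level PII annotation can look like, the sketch below stores each labeled entity as a character span and replaces it with a placeholder. It is not the project's actual schema or tooling: the label names, the `Span` and `mask_pii` helpers, and the English example sentence are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical label set, based only on the entity types named in the description.
LABELS = {"NAME", "LOCATION", "ORGANIZATION", "DATE", "CONTACT"}


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def span_of(text: str, phrase: str, label: str) -> Span:
    """Build a Span for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return Span(start, start + len(phrase), label)


def mask_pii(text: str, spans: list[Span]) -> str:
    """Replace every annotated span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        if span.label not in LABELS:
            raise ValueError(f"unknown label: {span.label}")
        if span.start < cursor or span.end > len(text) or span.start >= span.end:
            raise ValueError(f"overlapping or invalid span: {span}")
        pieces.append(text[cursor:span.start])  # untouched text before the entity
        pieces.append(f"[{span.label}]")
        cursor = span.end
    pieces.append(text[cursor:])  # untouched text after the last entity
    return "".join(pieces)


# Invented example sentence; not taken from the dataset.
text = "Amina Bello visited Kampala on 3 May 2023."
spans = [
    span_of(text, "Amina Bello", "NAME"),
    span_of(text, "Kampala", "LOCATION"),
    span_of(text, "3 May 2023", "DATE"),
]
masked = mask_pii(text, spans)
print(masked)
```

Keeping annotations as offsets into the unmodified text, rather than editing the text in place, leaves the original wording intact, which is one way to preserve linguistic integrity while still marking PII.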

Outputs (Datasets, publications, models)

A high-quality multilingual dataset for training privacy-preserving AI models