{"id":3159,"date":"2025-03-21T18:12:01","date_gmt":"2025-03-21T18:12:01","guid":{"rendered":"https:\/\/air.ug\/?page_id=3159"},"modified":"2025-03-21T18:14:06","modified_gmt":"2025-03-21T18:14:06","slug":"project-waxal-voice-collection","status":"publish","type":"page","link":"https:\/\/air.ug\/?page_id=3159","title":{"rendered":"Project : Waxal Voice collection"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-page\" data-elementor-id=\"3159\" class=\"elementor elementor-3159\">\n\t\t\t\t<div class=\"elementor-element elementor-element-d297aae e-flex e-con-boxed e-con e-parent\" data-id=\"d297aae\" data-element_type=\"container\" data-e-type=\"container\">\n\t\t\t\t\t<div class=\"e-con-inner\">\n\t\t<div class=\"elementor-element elementor-element-1c72a05 e-con-full e-flex e-con e-child\" data-id=\"1c72a05\" data-element_type=\"container\" data-e-type=\"container\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t<div class=\"elementor-element elementor-element-9ab7503 e-con-full e-flex e-con e-child\" data-id=\"9ab7503\" data-element_type=\"container\" data-e-type=\"container\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t\t\t<div class=\"elementor-element elementor-element-384f51f elementor-widget elementor-widget-heading\" data-id=\"384f51f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Funder<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e39babb elementor-widget elementor-widget-text-editor\" data-id=\"e39babb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Google<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element 
elementor-element-2459449 elementor-widget elementor-widget-heading\" data-id=\"2459449\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Duration<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ec5eeae elementor-widget elementor-widget-text-editor\" data-id=\"ec5eeae\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>2023-2024<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-39dbb44 elementor-widget elementor-widget-heading\" data-id=\"39dbb44\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Keywords (Technologies and Domain)<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-619c7ec elementor-widget elementor-widget-text-editor\" data-id=\"619c7ec\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Language Technologies, Automatic Speech Recognition<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-abc782c e-con-full e-flex e-con e-child\" data-id=\"abc782c\" data-element_type=\"container\" data-e-type=\"container\">\n\t\t\t\t<div class=\"elementor-element elementor-element-0889ff8 elementor-widget elementor-widget-heading\" data-id=\"0889ff8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Waxal Voice collection<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-69bc5c2 elementor-widget__width-initial elementor-widget elementor-widget-text-editor\" data-id=\"69bc5c2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Uganda is home to over 52 local languages. However, many of these lack adequate speech datasets. This project aimed to address this gap by collecting crowdsourced speech and text data for diverse African languages. Our goal was to leverage this data to research innovative architectures and deep learning algorithms for multilingual NLP systems (including speech, NMT, Q&amp;A, and LM) that are robust to variations in accent. Ultimately, Waxal strives to make NLP systems more inclusive through the development and inclusion of richer image-prompt and transcribed datasets for Ugandan languages.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t<div class=\"elementor-element elementor-element-14914bc e-con-full e-flex e-con e-child\" data-id=\"14914bc\" data-element_type=\"container\" data-e-type=\"container\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t<div class=\"elementor-element elementor-element-aa2e99e e-con-full e-flex e-con e-child\" data-id=\"aa2e99e\" data-element_type=\"container\" data-e-type=\"container\" data-settings=\"{&quot;background_background&quot;:&quot;classic&quot;}\">\n\t\t\t\t<div class=\"elementor-element elementor-element-2dcd67d elementor-widget elementor-widget-heading\" data-id=\"2dcd67d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title 
elementor-size-default\">Outputs (Datasets, publications, models)<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-517005a elementor-widget elementor-widget-text-editor\" data-id=\"517005a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><strong>Image prompt speech datasets.<\/strong> Collected 200 hours of speech and transcribed 20 hours for Luganda, Runyankole, Lumasaaba, Lusoga and Acholi.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Funder Google Duration 2023-2024 Keywords (Technologies and Domain) Language Technologies, Automatic Speech Recognition Waxal Voice collection Uganda is home to over 52 local languages. However, many of these lack adequate speech datasets. This project aimed to address this gap by collecting crowdsourced speech and text data for diverse African languages. 
Our goal was to leverage [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"site-sidebar-layout":"no-sidebar","site-content-layout":"page-builder","ast-site-content-layout":"full-width-container","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"disabled","ast-breadcrumbs-content":"","ast-featured-img":"disabled","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"class_list":["post-3159","page","type-page","status-publish","hentry"],"rttpg_featured_image_url":null,"rttpg_author":{"display_name":"airlab","author_link":"https:\/\/air.ug\/?author=1"},"rttpg_comment":0,"rttpg_category":null,"rttpg_excerpt":"Funder Google Duration 2023-2024 Keywords (Technologies and Domain) Language Technologies, Automatic Speech Recognition Waxal Voice collection Uganda is home to over 52 local languages. However, many of these lack adequate speech datasets. This project aimed to address this gap by collecting crowdsourced speech and text data for diverse African languages. 
Our goal was to leverage&hellip;","_links":{"self":[{"href":"https:\/\/air.ug\/index.php?rest_route=\/wp\/v2\/pages\/3159","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/air.ug\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/air.ug\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/air.ug\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/air.ug\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3159"}],"version-history":[{"count":4,"href":"https:\/\/air.ug\/index.php?rest_route=\/wp\/v2\/pages\/3159\/revisions"}],"predecessor-version":[{"id":3163,"href":"https:\/\/air.ug\/index.php?rest_route=\/wp\/v2\/pages\/3159\/revisions\/3163"}],"wp:attachment":[{"href":"https:\/\/air.ug\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3159"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}