The CoRal Project - Danish Speech Dataset

Speech technology is a growing industry, with an estimated compound annual growth rate of 16-19% over the next five years. However, most speech tech development targets high-resource languages such as English. Danish, a small and low-resource language, risks falling behind and missing out on revenue from this technology. Modern speech tech is based on machine learning algorithms, which require large amounts of data, and the lack of Danish data is the main obstacle to moving the industry forward.

During this webinar, Dan Saattrup Nielsen will outline the purpose of the Danish Conversational and Read-aloud Speech Dataset (CoRal) and the main problems it solves. He will also give insight into the data collection process and some of the challenges that arise when working with language and different dialects.

Martin Carsten Nielsen will introduce the machine learning side of the project, explain how the models are trained, and share some of the results achieved. Finally, Lars Maaløe will share his perspective on the use of the Danish speech dataset within healthcare and possibly other sectors, and offer his prediction of the dataset's impact and of what practice will look like in 10 years.

You will meet:

Dan Saattrup Nielsen has a PhD in Mathematics and has been working actively with natural language processing since 2019, within academia, startups, governmental institutions, as well as consulting. He is currently working as a Senior AI Specialist at the Alexandra Institute, where he leads the machine learning model development on the CoRal project, which aims to build open-source speech datasets and models for the Danish language.

Martin Carsten Nielsen, Co-Founder at Alvenir, Natural Language Processing.

Lars Maaløe, Co-Founder & CTO at Corti, Adj. Assoc. Professor of Machine Learning.

Registration
Please note that this webinar is organized and run by Danish Sound Cluster. When you sign up, you will be redirected to their site.

Information

When: 7 Dec 2023, 15:00 - 16:30

Where: Webinar, at your own PC, tablet, or smartphone

Registration Deadline: 6 Dec 2023, 23:59

Organizer: IDA Fremtidsteknologi

Available Seats: 17

Event Number: 352601
