About us

The Laboratory of Language Technology focuses on the following topics:

  • Speech recognition
  • Speaker recognition
  • Language and accent identification
  • Speech corpora
  • Phonetics (Estonian prosody and sound system, L2 speech)
  • Various subtopics in natural language processing

We also work on making speech technology more accessible to the general public by creating end-user speech recognition applications and by packaging speech recognition software components in a more accessible form. Our main focus is Estonian speech recognition, but most of the components are not specific to Estonian. We are firm supporters of open source.

The lab was formerly part of the Institute of Cybernetics. Our old web pages are available here.

 Laboratory of Language Technology

Past members

  • Asadullah
  • Ottokar Tilk
  • Kairit Sirts
  • Rena Nemoto

Studies

Courses

ITS8040 - Natural Language and Speech Processing

ITS8035 - Speech Processing by Humans and Computers (Kõnetöötlus inimeses ja arvutis)

  • Lecturer: Einar Meister
  • Language: Estonian
  • Level: Master
  • Course details

Master Theses

We offer supervision of Master theses on topics that are related to our research.

Here is a selection of already supervised theses:

  • Fred-Eric Kirsi, Tanel Alumäe (sup). End-to-end Phoneme Segmentation. 2020.
  • Jörgen Valk, Tanel Alumäe (sup). Using Web Scraping for Building Spoken Language Identification Models. 2020. 
  • Hendrik Kivi, Tanel Alumäe (sup). Identification and Localization of Foreign Accent in Speech. 2020.
  • Aivo Olev, Tanel Alumäe (sup). Web Application for Authoring Speech Transcriptions. 2019.
  • Siim Kaspar Uustalu, Tanel Alumäe (sup). Automated Detection and Sentiment Analysis of Registered Entity Mentions in Estonian Language News Media. 2019. Nominated as one of the best MSc theses of the School of IT.
  • Siim Talts, Tanel Alumäe (sup). Analysing Election Candidate Exposure in Broadcast Media Using Weakly Supervised Training. 2019.
  • Leo Kristopher Piel, Tanel Alumäe (sup). Speech-based Identification of Children's Gender and Age with Neural Networks. 2018. Nominated as one of the best MSc theses of the School of IT.
  • Margus Baumann, Tanel Alumäe (sup). Identification of Foreign Language Accent from Speech Using Neural Networks. 2018.
  • Martin Talimets, Tanel Alumäe (sup). End-to-End Speech Recognition for Estonian. 2018.
  • Martin Väljaots, Einar Meister (sup). Computer Aided Pronunciation Training Tool for Estonian. 2018.
  • Roman Hrushchak, Einar Meister (sup). Visualization of Tongue and Lip Movements. 2018. 
  • Evgeniia Rykova, Einar Meister (sup). Perceptual and acoustic similarities between the voices of family members: an approach to synthesize a voice based on family-shared F0 characteristics. 2018. 
  • Thales Santos Ribeiro, Einar Meister (sup). Online Recording of Speech Corpora. 2018. 
  • Lasha Amashukeli, Einar Meister (sup). Online Perception Experiments. 2018.
  • Martin Karu, Tanel Alumäe (sup). Weakly Supervised Training of Speaker Identification Models. 2017. Best MSc thesis of the School of IT.

PhD Studies

We are looking for talented and hardworking people to do their doctoral studies on topics that are related to our research.

All PhD students at our lab become members of our team. You will be hired as an Early Stage Researcher and will receive a salary from the university in addition to the doctoral scholarship. The full compensation depends on the person (better skills and better research output result in a better salary), but the minimum is 1500 EUR (after taxes). This is about 25% more than the average salary in Estonia. Living costs in Estonia are significantly lower than in Western European countries.

We can admit new PhD students at any time.

The proposed topics are described below. However, other topics in the field of speech recognition and speaker recognition are also possible (the exact topic can be determined based on the student and her/his interests and skills). 

Requirements:

  • Interest in scientific research (and an understanding of what research is)
  • Master's degree in computer science (or a related field)
  • Good background in mathematics, statistics, probability theory and linear algebra
  • Good background in some subfield of speech technology (e.g., speech recognition, speaker recognition)
  • Knowledge of modern approaches in machine learning (incl. deep learning)
  • Excellent programming skills (Python, C++, bash scripting)
  • Experience with modern deep learning toolkits (PyTorch, TensorFlow)
  • Excellent academic writing skills
  • Some experience with the speech recognition toolkit Kaldi is a plus
  • Previous academic or industry experience in speech or language processing is beneficial (but not strictly needed)

We currently have two open PhD positions:

Current PhD Students

  • Martiv Verrev, supervisors Tanel Alumäe and Tanel Tammet. Knowledge extraction from natural language using both machine learning and common sense knowledge systems.
  • Andrus Paats, supervisors Einar Meister and Ivo Fridolin. Development and implementation of voice recognition system in medicine for radiology.

Graduated PhD Students

Projects

Current Projects

Estonian Speech Recognition (2018-2022)

Speech recognition is a technology for converting natural speech to text. It is used for dictating documents and for automatically transcribing speech recordings. Estonian speech recognition has improved significantly in recent years: on broadcast speech data, a word error rate of 10% has been reached. These improvements were made possible by our work on collecting and transcribing new speech corpora and by recent advances in deep neural networks. The goal of this project is to further improve the state of Estonian speech recognition. We focus on the kinds of speech data that currently cause many recognition errors: noisy data, multi-speaker meetings, speech from seniors, and speech with a high proportion of code-switching. To achieve this goal, we will improve the speech recognition methods and algorithms currently in use and transcribe new speech corpora. We will also improve the flexibility and usability of our open-source Estonian speech recognition systems.
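For readers unfamiliar with the metric, word error rate (WER) is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference transcript, divided by the number of words in the reference. The following is a minimal illustrative sketch of that standard definition, not the scoring tool used in the project; the function name `word_error_rate` and the whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: words are split on whitespace and compared
    exactly, with no text normalization (an assumption of this sketch).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word in a ten-word reference gives a WER of 10%.
    print(word_error_rate("a b c d e f g h i j", "a b c d e x g h i j"))
```

Under this definition, a WER of 10% corresponds to roughly one word error for every ten reference words.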

Rich Transcription System for the Estonian Parliament (2018-2020)

Within this project, we implemented a system for producing transcripts for the Estonian Parliament using speech recognition, automatic punctuation and speaker identification technology. The system outputs fully punctuated and "nicely" formatted text and identifies members of the parliament based on their voice. The system is currently in the deployment phase.

Publications

Software

End-user applications

Web-based Estonian speech transcription system

A web application for transcribing long speech recordings, such as interviews and conference speeches. It uses our latest Estonian speech recognition technology. It also performs automatic punctuation and identifies Estonian public figures based on their voice.

The application can be used via an old interface (transcripts are sent by e-mail) or via a new, fully web-based interface that also provides post-editing capabilities.

Source code:

Estonian speech recognition for Android (Kõnele)

For many years, Estonian speech recognition was not natively available for Android. Therefore, we developed the Android application Kõnele. Kõnele works as a virtual keyboard: when the user wants to dictate text in any application (e.g. Gmail), she/he can switch to the Kõnele keyboard and use speech recognition to input text.

Kõnele uses client-server speech recognition: it records the user's speech and sends it to the lab's server, and the server sends the recognized text back to the user's device.

Links:

Automatic phonetic segmentation for Estonian speech

The web application uses speech recognition models to generate phoneme boundaries for Estonian speech, based on a provided orthographic transcript. It is primarily used by phoneticians to generate initial phonetic segmentations for phonetic corpus annotation.

Links:

Transcribed Speech Archive Browser (TSAB)

An interface to a large collection of automatically transcribed Estonian radio broadcasts and some podcasts. Serves mostly as a showcase of our Estonian speech recognition technology.

Links: