IA-SPEAK – Artificial Intelligence for Speech Rehabilitation

General description of the project

The IA-SPEAK project focuses on developing a comprehensive, intelligent assistance system for people who have lost the ability to communicate orally with ease as a result of Acquired Brain Injury (ABI), neurodegenerative diseases or processes associated with ageing. One of the most common communication disorders following ABI is dysarthria, an impairment of the motor control of the organs involved in speech articulation. Although those affected can produce coherent and well-structured speech, poor articulation makes it difficult for others to understand them. This limitation creates significant barriers to everyday communication, harming both the quality of life of those affected and their full social integration.

The proposed system integrates two main components: on the one hand, an intelligent rehabilitation platform, designed to facilitate speech improvement from home through personalised exercises and remote monitoring; and on the other hand, a real-time translation device, capable of transforming the user’s non-standard voice into speech that is understandable to any interlocutor. In this way, the system acts as a “personal interpreter” that deciphers the communicative intention and converts it into a clear and accessible message.

The relevance of this project lies in the current lack of effective technological solutions for this group. Most speech rehabilitation therapies require frequent visits to specialised centres, which limits accessibility and places a high demand on healthcare resources. Furthermore, even when patients achieve a certain degree of recovery, their speech often remains difficult for others to understand, which maintains significant communication barriers and causes social isolation.

The project will be developed with the participation of 100 people with speech disorders at different stages of recovery, in collaboration with ADACEN centres and the Ubarmin Clinic. Throughout the process, voice samples will be collected and analysed to identify specific patterns of dysarthria and other speech disorders. This data will serve as the basis for the design and training of artificial intelligence algorithms capable of recognising and translating each person’s particular form of communication.

Objectives

The overall objective of IA-SPEAK is to design, develop and validate an intelligent system providing comprehensive, personalised assistance for people with speech disorders resulting from brain damage. This system seeks to reduce the communication barriers faced by these patients, facilitating their daily interaction, promoting their social integration and contributing significantly to improving their quality of life.

By combining home rehabilitation with real-time translation, IA-SPEAK addresses both the improvement of speech ability and effective, immediate communication in everyday life.

Within this framework, the following specific objectives are proposed:

Develop agile vocal characterisation through brief sessions between speech therapists and patients to identify individual speech patterns and adapt both rehabilitation processes and translation algorithms to the specific needs of each user.
Design and validate real-time voice translation algorithms based on artificial intelligence techniques, capable of interpreting non-standard speech and generating clear and understandable messages while preserving the user’s vocal identity as much as possible.
Integrate biometric verification and facial recognition systems, adapted to unconventional voices and complemented by the analysis of lip movements and expressions, in order to increase the accuracy of interpretation and enrich the rehabilitation process.
Create intelligent multimodal recommendation systems that combine vocal, facial and contextual information to automatically propose personalised exercises tailored to the progress and specific needs of each patient.
Develop a portable, lightweight and accessible device with sufficient battery life for daily use, incorporating criteria of durability, resistance and ergonomics, thus facilitating its natural integration into the daily lives of users.
Validate the complete solution with end users through a pilot programme involving 100 people in different stages of recovery, in collaboration with specialised clinical centres, which will allow its real effectiveness to be evaluated and information to be gathered for future improvements.

Contribution of Nair Center

Nair Center leads the development of advanced components for audio processing and the implementation of deep learning models aimed at recognising and translating non-standard speech.

Its main contribution is the design of the voice processing architecture, which handles variable and complex speech patterns through automated pre-processing pipelines that apply filtering, normalisation and segmentation to optimise signal quality. Within this framework, it develops diarisation algorithms and extracts spectrograms and mel-frequency cepstral coefficients (MFCCs), which make it possible to identify each user's particularities and generate the personalised voice profiles that underpin the entire system.
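As an illustration only, the three pre-processing steps named above (filtering, normalisation and segmentation) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the project's pipeline: the function names, the pre-emphasis filter, peak normalisation, the 160-sample frame length and the 0.1 energy threshold are all hypothetical choices made for the example.

```python
import math


def pre_emphasis(samples, coeff=0.97):
    """Filtering step (assumed): first-order high-pass, y[n] = x[n] - coeff * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - coeff * samples[i - 1] for i in range(1, len(samples))
    ]


def peak_normalise(samples, target=1.0):
    """Normalisation step (assumed): scale so the largest absolute sample equals target."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s * target / peak for s in samples]


def segment_by_energy(samples, frame_len=160, threshold=0.1):
    """Segmentation step (assumed): group consecutive frames whose RMS exceeds threshold.

    Returns a list of (start, end) sample indices, end exclusive.
    """
    segments = []
    start = None
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        if rms > threshold:
            if start is None:
                start = f * frame_len
        elif start is not None:
            segments.append((start, f * frame_len))
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments


def preprocess(samples):
    """Run the three steps in order and return (clean_samples, speech_segments)."""
    clean = peak_normalise(pre_emphasis(samples))
    return clean, segment_by_energy(clean)
```

Calling `preprocess` on a list of float samples returns the cleaned signal together with the sample ranges judged to contain speech. A production system would more likely rely on an audio library for these steps and would go on to compute spectrograms and MFCCs for each segment.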

A notable contribution is the non-standard speech translation module, structured in four interconnected blocks: data input, transcription, cloning and optimised output. Transcription uses models such as Whisper and WhisperX, fine-tuned to learn the pronunciation characteristics of each patient, similar to adapting to a new accent or dialect. The voice cloning module, based on models such as F5-TTS adapted to Spanish, generates synthetic audio that preserves the user's vocal identity with greater clarity and comprehensibility, ensuring authenticity in communication.
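The four-block structure described above (data input, transcription, cloning and optimised output) can be pictured as a small pipeline with pluggable stages. The sketch below is hypothetical: the class name, the type alias and the stub back-ends are assumptions for illustration, and the stubs merely stand in for where a fine-tuned Whisper model and an F5-TTS voice would be connected.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Audio = List[float]  # assumed representation: mono samples as floats


def identity(audio: Audio) -> Audio:
    """Default output stage: pass the audio through unchanged."""
    return audio


@dataclass
class TranslationPipeline:
    """Hypothetical wiring of the four blocks; each stage is an injected callable."""
    transcribe: Callable[[Audio], str]         # block 2: speech -> text
    synthesise: Callable[[str, Audio], Audio]  # block 3: text + reference voice -> speech
    postprocess: Callable[[Audio], Audio] = identity  # block 4: output clean-up

    def run(self, utterance: Audio, reference_voice: Audio) -> Tuple[str, Audio]:
        # Block 1: data input, with a basic sanity check.
        if not utterance:
            raise ValueError("empty utterance")
        text = self.transcribe(utterance)
        cloned = self.synthesise(text, reference_voice)
        return text, self.postprocess(cloned)


# Stub back-ends for demonstration only; they do no real recognition or synthesis.
def fake_transcribe(audio: Audio) -> str:
    return "hola"


def fake_synthesise(text: str, reference_voice: Audio) -> Audio:
    return [0.5] * len(text)


def clip(audio: Audio) -> Audio:
    """Example output stage: keep samples within [-0.25, 0.25]."""
    return [max(-0.25, min(0.25, s)) for s in audio]


pipeline = TranslationPipeline(fake_transcribe, fake_synthesise, clip)
text, audio = pipeline.run([0.1, -0.2, 0.3], reference_voice=[0.0])
```

Keeping each stage behind a plain callable means a per-patient transcription model or a different synthesis voice can be swapped in without touching the rest of the chain.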

Finally, Nair Center applies its expertise in multimodal recommendation systems, integrating voice analysis, facial recognition, the user profile and progress history. Using deep reinforcement learning techniques, the system suggests personalised exercises in real time, adapted to each patient's progress.
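To give a feel for how feedback from completed exercises can steer later suggestions, the sketch below uses an epsilon-greedy multi-armed bandit. This is a deliberately simplified stand-in, not the deep reinforcement learning approach the project describes: it ignores the multimodal context entirely, and the class name, exercise names and reward scale are hypothetical.

```python
import random


class ExerciseRecommender:
    """Epsilon-greedy bandit over a fixed set of exercises (illustrative only)."""

    def __init__(self, exercises, epsilon=0.1, seed=None):
        self.exercises = list(exercises)
        self.epsilon = epsilon
        self.counts = {e: 0 for e in self.exercises}
        self.values = {e: 0.0 for e in self.exercises}  # running mean reward
        self._rng = random.Random(seed)

    def recommend(self):
        """Explore with probability epsilon, otherwise pick the best-scoring exercise."""
        if self._rng.random() < self.epsilon:
            return self._rng.choice(self.exercises)
        return max(self.exercises, key=lambda e: self.values[e])

    def update(self, exercise, reward):
        """Fold an observed reward (assumed to lie in [0, 1]) into the running mean."""
        self.counts[exercise] += 1
        n = self.counts[exercise]
        self.values[exercise] += (reward - self.values[exercise]) / n


# Hypothetical session: rewards could come from a measured gain in intelligibility.
recommender = ExerciseRecommender(["vowels", "breathing", "syllables"], epsilon=0.0)
recommender.update("vowels", 0.2)
recommender.update("breathing", 0.8)
recommender.update("breathing", 0.6)
suggestion = recommender.recommend()
```

With `epsilon` set to zero the recommender always exploits its current estimates; a small positive value lets it occasionally try other exercises so that estimates for rarely used ones do not go stale.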

Partners

Financing
