ECA-SIMM - Artículos de revista

https://uvadoc.uva.es/handle/10324/27505
Mon, 13 Apr 2026 11:43:26 GMT

Prosodic Feature Analysis for Automatic Speech Assessment and Individual Report Generation in People with Down Syndrome
https://uvadoc.uva.es/handle/10324/82062
Evaluating prosodic quality poses unique challenges due to the intricate nature of prosody, which encompasses multiple form–function profiles. These challenges are more pronounced when analyzing the voices of individuals with Down syndrome (DS) due to increased variability. This paper introduces a procedure for selecting informative prosodic features based on both the disparity between human-rated DS productions and their divergence from the productions of typical users, using a corpus constructed through a video game. Individual reports for five speakers with DS are created by comparing the selected features of each user with recordings of individuals without intellectual disabilities. The selected features primarily relate to the temporal domain, which reduces dependence on pitch detection algorithms; these have more difficulty with pathological voices than with typical ones. The individual reports can help identify specific issues for each speaker, assisting therapists in defining training sessions tailored to the speaker’s profile.
Mon, 01 Jan 2024 00:00:00 GMT

Adaptación de ASR al habla de personas con síndrome de Down [Adapting ASR to the speech of people with Down syndrome]
https://uvadoc.uva.es/handle/10324/82053
The speech of people with intellectual disabilities (ID) poses enormous challenges for automatic speech recognition (ASR) systems, which hinders access to information services for an especially vulnerable population. This work studies the difficulties ASR systems have in recognizing the speech of people with ID and shows how this limitation can be countered with model fine-tuning strategies. The performance of Whisper-based ASR (v2 and v3) is measured on a reference corpus of typical speech and speech from people with ID, confirming that the differences are large and significant. With fine-tuning, performance for speakers with ID improves by at least 30 percentage points. Our results show that including the voices of people with ID in training corpora is essential to improving the effectiveness of ASR.
Mon, 01 Jan 2024 00:00:00 GMT

Evaluating the impact of an autonomous playing mode in a learning game to train oral skills of users with Down syndrome
https://uvadoc.uva.es/handle/10324/82051
ICT tools are widely used among people with intellectual disabilities, as are, to a lesser degree, learning tools including learning games. Although learning games are widely accepted due to their high engagement capacity, few studies analyze their usability for people with intellectual disabilities. This work presents an evaluation of the impact of adding an autonomous playing mode on the usability of a learning game designed to aid the training of oral skills in people with Down syndrome. We carry out a study that compares the effectiveness, efficiency and user satisfaction of a learning game when the system is used with different degrees of teacher supervision. A learning game originally designed to train oral competencies for people with Down syndrome in a teacher-supervised scenario is adapted to allow its autonomous use by including a module that provides automatic assessment of oral productions.
The use of the tool is thus compared in three different scenarios: a supervised environment, autonomous use, and laboratory use with multiple users working in parallel. The usability evaluation instruments reveal that, although there are no differences in the degree of engagement, there may be important differences in session performance: the quality of the audio recordings is lower in the laboratory sessions and the number of errors increases in the autonomous sessions. We conclude that the autonomous use of learning games by users with intellectual disabilities is possible and can lead to considerable savings in human resources. However, if the feedback provided by the game is not comparable with that provided by the teacher, performance may drop considerably even though the degree of engagement is maintained.
Fri, 01 Jan 2021 00:00:00 GMT

Acoustic characterization and perceptual analysis of the relative importance of prosody in speech of people with Down syndrome
https://uvadoc.uva.es/handle/10324/51995
Many studies identify important deficits in the voice production of people with Down syndrome. These deficits affect not only the spectral domain, but also intonation, accent, rhythm and speech rate. The main aim of this work is the identification of the acoustic features that characterize the speech of people with Down syndrome, taking into account the frequency, energy, temporal and spectral domains. A further aim is to compare the relative weight of these features in characterizing the speech of people with Down syndrome. The openSMILE toolkit with the GeMAPS feature set was used to extract acoustic features from a speech corpus of utterances from typically developing individuals and individuals with Down syndrome. Then, the most discriminant features were identified using statistical tests.
Moreover, three binary classifiers were trained using these features. The best classification rate is 87.33% using only spectral features, and 91.83% using frequency, energy and temporal features. Finally, a perception test was performed using recordings created with a prosody transfer algorithm: the prosody of utterances from one group of speakers was transferred to utterances of another group. The results of this test show the importance of intonation and rhythm in the identification of a voice as non-typical. In conclusion, the results point to prosody training as a way to improve the quality of the speech production of people with Down syndrome.
Mon, 01 Jan 2018 00:00:00 GMT

Analysis of atypical prosodic patterns in the speech of people with Down syndrome
https://uvadoc.uva.es/handle/10324/48103
The speech of people with Down syndrome (DS) shows prosodic features that are distinct from those observed in the oral productions of typically developing (TD) speakers. Although a different prosodic realization does not necessarily imply wrong expression of prosodic functions, atypical expression may hinder communication skills. The focus of this work is to ascertain whether this is the case in individuals with DS. To do so, we analyze the acoustic features that best characterize the utterances of speakers with DS when expressing prosodic functions related to emotion, turn-end and phrasal chunking, comparing them with those used by TD speakers. An oral corpus of speech utterances has been recorded using the PEPS-C prosodic competence evaluation tool. We use automatic classifiers to show that the prosodic features that best predict prosodic functions in TD speakers are less informative in speakers with DS.
Although atypical features are observed in speakers with DS when producing prosodic functions, the intended prosodic function can be identified by listeners and, in most cases, the features correctly discriminate the function with analytical methods. However, a greater difference between the minimal pairs presented in the PEPS-C test is found for TD speakers than for DS speakers. The proposed methodological approach provides, on the one hand, an identification of the set of features that distinguish the prosodic productions of DS and TD speakers and, on the other, a set of target features for therapy with speakers with DS.
Fri, 01 Jan 2021 00:00:00 GMT

PRAUTOCAL corpus: a corpus for the study of Down syndrome prosodic aspects
https://uvadoc.uva.es/handle/10324/47128
Oral productions of speakers with Down syndrome exhibit special characteristics that have been the target of study for decades. In spite of this attention, rich resources for their analysis are still scarce. In this paper, we present the definition and compiling procedure of a corpus of semi-controlled oral productions of speakers with Down syndrome that aims to allow the analysis of how these speakers produce functional and linguistic aspects of speech. The PRAUTOCAL corpus has been recorded while using a video game for training oral competences. Utterances are related to well-defined communicative tasks recorded by both speakers with Down syndrome and typically developing speakers. We present the procedure for human experts to evaluate the recordings and the transcription criteria followed for enriching the utterances of the corpus. PRAUTOCAL permits the analysis of the clear contrast in voice and speech between individuals with Down syndrome and typically developing speakers, taking into account the high heterogeneity of the speech problems characteristic of the syndrome.
This material allows the analysis of speech problems in Down syndrome, generating knowledge that therapists could use in future work to prepare specific training or to enrich diagnosis of possible speech and language disorders.
Fri, 01 Jan 2021 00:00:00 GMT

Analysis of inter-transcriber consistency in the Cat_ToBI prosodic labeling system
https://uvadoc.uva.es/handle/10324/41016
A set of tools to analyze inconsistencies observed in a Cat_ToBI labelling experiment is presented. We formalize and use the metrics that are commonly used in inconsistency tests. The metrics are systematically applied to analyze the robustness of every symbol and every pair of transcribers. The results reveal agreement rates for this study that are comparable to previous ToBI inter-reliability tests. The inter-transcriber confusion rates are transformed into distance matrices so that multidimensional scaling can be used to visualize the confusion between the different ToBI symbols and the disagreement between the raters. Potentially different labelling criteria are identified, and subsets of symbols that are candidates to be merged are proposed.
Sun, 01 Jan 2012 00:00:00 GMT

Automatic assessment of non-native prosody by measuring distances on prosodic label sequences
https://uvadoc.uva.es/handle/10324/41015
The aim of this paper is to investigate how automatic prosodic labeling systems contribute to the evaluation of non-native pronunciation. In particular, it examines the efficiency of a group of metrics to evaluate the prosodic competence of non-native speakers, based on the information provided by sequences of labels in the analysis of both native and non-native speech. A group of Sp_ToBI labels was obtained by means of an automatic labeling system for the speech of native and non-native speakers who read the same texts.
The metrics assessed the differences in the prosodic labels for both speech samples. The results showed that the metrics effectively separate the two groups of speakers. Furthermore, they showed how non-native speakers (American and Japanese speakers) improved their Spanish productions after doing a set of listening and repeating activities. Finally, this study also shows that the results provided by the metrics are correlated with the scores given by human evaluators on the productions of the different speakers.
Sun, 01 Jan 2017 00:00:00 GMT

Automatic assessment of prosodic quality in Down syndrome: Analysis of the impact of speaker heterogeneity
https://uvadoc.uva.es/handle/10324/41013
Prosody is a fundamental speech element responsible for communicative functions such as intonation, accent and phrasing, and prosodic impairments of individuals with intellectual disabilities reduce their communication skills. Yet, technological resources have paid little attention to prosody. This study aims to develop an automatic classifier to predict the prosodic quality of utterances produced by individuals with Down syndrome, and to analyse how inter-individual heterogeneity affects assessment results. A therapist and an expert in prosody judged the prosodic appropriateness of a corpus of utterances by speakers with Down syndrome, collected through a video game. The judgments of the expert were used to train an automatic classifier that predicts prosodic quality by using a set of fundamental frequency, duration and intensity features. The classifier accuracy was 79.3% and its true positive rate 89.9%.
We analyzed how informative each of the features was for the assessment and studied relationships between participants’ developmental level and results: interspeaker variability conditioned the relative weight of prosodic features for automatic classification and participants’ developmental level was related to the prosodic quality of their productions. Therefore, since speaker variability is an intrinsic feature of individuals with Down syndrome, it should be considered to attain an effective automatic prosodic assessment system.
Tue, 01 Jan 2019 00:00:00 GMT

Using challenges to enhance a learning game for pronunciation training of English as a second language
https://uvadoc.uva.es/handle/10324/41012
Learning games have a remarkable potential for education. They provide an emergent form of social participation that deserves the assessment of their usefulness and efficiency in learning processes. This study describes a novel learning game for foreign pronunciation training in which players can challenge each other. Native Spanish speakers performed several pronunciation activities during a one-month competition using a mobile application, designed under a minimal pairs approach, to improve their pronunciation of English as a foreign language. This game took place in a competitive scenario in which students had to challenge other participants in order to get high scores and climb up a leaderboard. Results show intense practice supported by a significant number of activities and playing regularity, so the most active and motivated players in the competition achieved significant pronunciation improvement results. The integration of automatic speech recognition (ASR) and text-to-speech (TTS) technology allowed users to improve their pronunciation while being immersed in a highly motivational game.
Wed, 01 Jan 2020 00:00:00 GMT

Engaging adolescents with Down syndrome in an educational video game
https://uvadoc.uva.es/handle/10324/27527
This article describes the design, implementation and evaluation of an educational video game that helps individuals with Down syndrome to improve their speech skills, specifically those related to prosody. Special attention has been paid to the design of the user interface, taking into account the cognitive, learning, and attentional limitations of people with Down syndrome. The learning content is conveyed by activities of production and perception of prosodic phenomena, aimed at increasing their communicative competence. These activities are introduced within the narrative of a video game so that the players do not conceive the tool as a mere succession of learning activities, but so that they learn and improve their speech while playing. The evaluation strategy that has been followed involves real users and combines different evaluation activities. Results show a high level of acceptance by participants and also by professionals, speech therapists, and special education teachers.
Sun, 01 Jan 2017 00:00:00 GMT