ECA-SIMM - Comunicaciones a congresos, conferencias, etc.

https://uvadoc.uva.es/handle/10324/27507
Fri, 17 Apr 2026 22:33:40 GMT

The effects of spontaneous speech on disfluencies assessment of Spanish speakers with Down syndrome
https://uvadoc.uva.es/handle/10324/82148
The aim of this study is to investigate the phonetic and fluency characteristics of spontaneous speech produced by Spanish speakers with Down syndrome (DS) compared to nonspontaneous speech modes (read, elicited and imitation), and to assess the impact of these differences both on expert speech quality assessment and on automatic speech recognition (ASR) performance. The PRAUTOCAL corpus includes utterances spoken by people with DS in four different speech generation modes. The results show that there are minor differences in some features between spontaneous speech and other modes, but specific types of disfluencies and phonetic errors are more prevalent in spontaneous speech. The Whisper model showed improved performance on spontaneous speech, achieving a significantly lower Word Error Rate (WER) and fewer substitution errors. By contrast, the Wav2Vec phoneme recognition model performed significantly worse, showing a higher phoneme error rate (PER), more substitutions, and greater total errors, regardless of the automatic segmentation tool used (MFA or WebMAUS).
Wed, 01 Jan 2025 00:00:00 GMT

Pronunciation assessment and automated analysis of speech in individuals with Down syndrome: phonetic and fluency dimensions
https://uvadoc.uva.es/handle/10324/82147
In this study, we analyze the potential use of an annotated corpus to identify various dimensions of speech quality, including phonetics and fluency, in individuals with Down syndrome, enabling the development of automated assessment systems.
Two experiments were conducted. For phonetic evaluation, we used the Goodness of Pronunciation (GoP) metric with an automatic segmentation system and correlated the results with a speech therapist’s evaluations, which showed a positive trend although the correlation values were not notably high. For fluency assessment, deep learning models such as wav2vec were used to extract audio features, and an SVM classifier trained on a fluency-focused corpus categorized the samples. The outcomes highlight the complexities of evaluating such phenomena, with variability depending on the specific type of disfluency detected.
Mon, 01 Jan 2024 00:00:00 GMT

Integration of generative LLMs into the new generation of chatbots to enhance human-computer interaction
https://uvadoc.uva.es/handle/10324/82144
Most conventional chatbots rely on strategies that extract information from databases and use predefined templates to generate responses, which poses a significant limitation in maintaining natural, rich, and contextually adapted dialogues. This study examines the enhancement of chatbots through the integration of application programming interfaces (APIs) from large pretrained language models (LLMs), focusing particularly on the GPT architecture. First, the conventional architectural paradigm of chatbots is described, followed by a description of the integration of GPT-based components. As a proof of concept, this enhanced architecture is implemented in a controlled environment, evaluating coherence, contextual relevance, and adaptability. Results, based on user opinions, indicate a significant improvement in the quality of interactions with the enhanced chatbot compared to its conventional counterpart. In conclusion, the integration of LLM APIs, in this case GPT, represents a notable advancement in dialogue systems, offering more contextual and adaptive responses.
This study anticipates a relevant leap in chatbot technology, suggesting a paradigm shift towards more humanized and effective human-computer interactions in the coming years.
Mon, 01 Jan 2024 00:00:00 GMT

Japañol, a Computer Assisted Pronunciation Tool for Japanese Students of Spanish Based on Minimal Pairs
https://uvadoc.uva.es/handle/10324/46517
Many software tools in the field of Computer Assisted Pronunciation Training (CAPT) rely on speech technologies to provide users with L2 pronunciation training [1]. Currently, the most popular mobile and desktop operating systems grant users free access to several Text-To-Speech (TTS) and Automatic Speech Recognition (ASR) systems. The combination of adequate teaching methods and gamification strategies is expected to increase user engagement, provide adequate feedback and, at the same time, keep users active and comfortable. This study describes the "Japañol" mobile application, a specific and controlled version of TipTopTalk!, a serious game for anywhere, anytime self-learning. It is especially designed for Japanese learners of Spanish as a foreign language and allows users to train and test their pronunciation skills using their own Android mobile phones or Windows PCs.
Mon, 01 Jan 2018 00:00:00 GMT

TipTopTalk! Mobile application for speech training using minimal pairs and gamification
https://uvadoc.uva.es/handle/10324/27857
This demonstration describes the TipTopTalk! mobile application, a serious game for foreign language (L2) pronunciation training, based on the minimal-pairs technique. Multiple Spoken Language Technologies (SLT), such as speech recognition and text-to-speech conversion, are integrated in our system. User interaction consists of a sequence of challenges over time, such as exposure, discrimination and production exercises.
The application implements gamification resources with the aim of promoting continued practice. Specific feedback is also given to the user in order to avoid the performance drop detected after protracted use of the tool. The application can be used in different languages, such as Spanish, Portuguese (European and Brazilian), English, Chinese, and German.
Fri, 01 Jan 2016 00:00:00 GMT

Tiptoptalk!: A game to improve the perception and production of L2 sounds
https://uvadoc.uva.es/handle/10324/27856
Swain’s (1985) Comprehensible Output Hypothesis considers that input alone may not be enough for second/foreign language (L2) learners to acquire new language forms. The Hypothesis claims that producing an L2 will facilitate L2 learning due to the mental processes related to language production. Thus, learners are more likely to notice discrepancies and gaps between linguistic aspects of their native language (L1) and those of their L2 when producing language than when only perceiving it.
Fri, 01 Jan 2016 00:00:00 GMT

Nuevas Propuestas Tecnológicas para la Práctica y Evaluación de la Pronunciación del Español como Lengua Extranjera [New Technological Proposals for the Practice and Assessment of the Pronunciation of Spanish as a Foreign Language]
https://uvadoc.uva.es/handle/10324/27850
Pronunciation, despite being one of the most important aspects of mastering a foreign language, has not been studied in the same depth as other linguistic aspects such as grammar or vocabulary, which is reflected in the fact that pronunciation exercises rarely appear in teaching textbooks. At the same time, technological advances make it possible to incorporate information technologies into foreign language learning. This article describes the state of the art in computer-assisted pronunciation teaching, based on a review of the activity models included in a set of selected applications and programs whose specific purpose is to improve pronunciation: use of recorded voice to improve the perception of foreign-language sounds; use of synthesized speech to replace recordings; recording and playback of the student's own voice, whether in isolation, combined with a native voice, or manipulated to sound more native; use of speech recognition with predefined words or sentences; and use of open-domain speech recognition, embedded in a dialogue system or a dictation system. The ability to adapt to technological change, to define suitable methodologies, and to demonstrate their usefulness and effectiveness are the keys to the future success of these tools.
Thu, 01 Jan 2015 00:00:00 GMT

Playing around Minimal Pairs to improve pronunciation training
https://uvadoc.uva.es/handle/10324/27847
Computer Assisted Pronunciation Training (CAPT) apps are becoming widespread as an aid to learning new languages. However, they are still widely criticized for lacking the irreplaceable direct feedback of a human expert. The combination of the right learning methodology with a gamification design strategy can, nevertheless, increase engagement and provide adequate feedback while keeping users active and comfortable.
Thu, 01 Jan 2015 00:00:00 GMT

Implementation and test of a serious game based on minimal pairs for pronunciation training
https://uvadoc.uva.es/handle/10324/27533
This paper introduces the architecture and interface of a serious game intended for pronunciation training and assessment for Spanish students of English as a second language.
Users will confront a challenge consisting of the pronunciation of a minimal-pair word battery. Android ASR and TTS tools will prove useful in discerning three different pronunciation proficiency levels, ranging from basic to native. Results also provide evidence of the weaknesses and limitations of present-day technologies. These must be taken into account when defining game dynamics for pedagogical purposes.
Thu, 01 Jan 2015 00:00:00 GMT

Exploratory use of automatic prosodic labels for the evaluation of Japanese speakers of L2 Spanish
https://uvadoc.uva.es/handle/10324/27532
An automatic labeling system using Sp_ToBI annotation conventions has been applied both to a non-native corpus of Japanese speakers using Spanish and to a reference corpus of Spanish speakers. A set of metrics based on conditional entropy is computed from the output of the automatic labeler; these metrics turn out to be highly correlated with the ratings assigned by a team of human evaluators. An analysis of the relative frequencies in the use of each of the Sp_ToBI symbols makes it possible to identify the recurrent mistakes in the productions of non-native speakers. The discussion of the results argues that the majority of the observed prosodic deficits can be explained by prosodic transfer between the Japanese and Spanish systems, as previously reported in the state of the art.
Fri, 01 Jan 2016 00:00:00 GMT

Measuring pronunciation improvement in users of CAPT tool TipTopTalk!
https://uvadoc.uva.es/handle/10324/27531
We present an L2 pronunciation training serious game based on the minimal-pairs technique, incorporating sequences of exposure, discrimination and production, and using text-to-speech and speech recognition systems. We have measured the quality of users’ production over a period of time in order to assess improvement after using the application.
Substantial improvement is found among users with poorer initial performance levels. The program’s gamification resources manage to engage a high percentage of users. Future versions should include feedback for users in order to increase their performance and avoid the performance drop detected after protracted use of the tool.
Fri, 01 Jan 2016 00:00:00 GMT

The magic stone: a video game to improve communication skills of people with intellectual disabilities
https://uvadoc.uva.es/handle/10324/27530
"The Magic Stone" is a video game whose main aim is to help people with Down syndrome improve communication skills that have been affected by their disability, especially those related to prosody. The interface of the video game includes a number of elements to motivate users to practice and train their pronunciation. Usability tests of the system show high degrees of satisfaction among users and trainers. Perception tests have confirmed that players improve their use of prosody as they use the game.
Fri, 01 Jan 2016 00:00:00 GMT

Evaluation different non-native pronunciation scoring metrics with the Japanese speakers of the sample corpus
https://uvadoc.uva.es/handle/10324/27529
This work presents an analysis of the set of results derived from the goodness of pronunciation (GOP) algorithm for the evaluation of pronunciation at phoneme level on the SAMPLE corpus of non-native speech. This corpus includes several recordings of sentences uttered by different speakers, which have been rated in terms of quality by a group of linguists. The utterances have also been rated automatically with the GOP algorithm. The phoneme dependence of the scores is discussed in order to suggest a normalization of intermediate results that could enhance the metric's performance.
As a result, new scoring proposals are presented, based on the log-likelihood values obtained from the GOP algorithm and the application of a set of rules. These new scores correlate better with the human ratings than the original GOP metric.
Fri, 01 Jan 2016 00:00:00 GMT

Improving L2 production with a gamified computer-assisted pronunciation training tool, TipTopTalk!
https://uvadoc.uva.es/handle/10324/27528
We present a foreign language (L2) pronunciation training serious game, TipTopTalk!, based on the minimal-pairs technique. We carried out a three-week test experiment in which participants had to overcome several challenges including exposure, discrimination and production, while using Text-To-Speech (TTS) and Automatic Speech Recognition (ASR) systems in a mobile application. The quality of users’ production is measured in order to assess their improvement. The application implements gamification resources with the aim of promoting continued practice. Preliminary results show that users with poorer initial performance levels make relatively more progress than the rest. However, it is desirable to include specific and individualized feedback in future versions so as to avoid the performance drop detected after protracted use of the tool.
Fri, 01 Jan 2016 00:00:00 GMT