Please use this identifier to cite or link to this item: https://uvadoc.uva.es/handle/10324/82077
Title
Concatenation Augmentation for Improving Deep Learning Models in Finance NLP with Scarce Data
Authors
Vaca, C.; Román-Gallego, J.-Á.; Barroso-García, V.; Tejerina, F.; Sahelices, B.
Document Year
2025
Publisher
MDPI Electronics
Source Document
Vaca, C.; Román-Gallego, J.-Á.; Barroso-García, V.; Tejerina, F.; Sahelices, B. Concatenation Augmentation for Improving Deep Learning Models in Finance NLP with Scarce Data. Electronics 2025, 14, 2289. https://doi.org/10.3390/electronics14112289
Abstract
Nowadays, financial institutions increasingly leverage artificial intelligence to enhance decision-making and optimize investment strategies. A specific application is the automatic analysis of large volumes of unstructured textual data to extract relevant information through deep learning (DL) methods. However, the effectiveness of these methods is often limited by the scarcity of high-quality labeled data. To address this, we propose a new data augmentation technique, Concatenation Augmentation (CA). This is designed to overcome the challenges of processing unstructured text, particularly in analyzing professional profiles from corporate governance reports. Based on Mixup and Label Smoothing Regularization principles, CA generates new text samples by concatenating inputs and applying a convex additive operator, preserving its spatial and semantic coherence. Our proposal achieved hit rates between 92.4% and 99.7%, significantly outperforming other data augmentation techniques. CA improved the precision and robustness of the DL models used for extracting critical information from corporate reports. This technique offers easy integration into existing models and incurs low computational costs. Its efficiency facilitates rapid model adaptation to new data and enhances overall precision. Hence, CA would be a potential and valuable data augmentation tool for boosting DL model performance and efficiency in analyzing financial and governance textual data.
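The abstract describes CA only at a high level: inputs are concatenated and a convex combination is applied, in the spirit of Mixup and Label Smoothing. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the authors' implementation: the function names `concat_augment` and `augment_dataset` are hypothetical, labels are assumed to be one-hot or soft probability vectors, and the mixing weight is assumed to be each text's share of the combined token count, which the abstract does not specify.

```python
import random


def concat_augment(sample_a, sample_b, sep=" "):
    """Build one augmented sample from two labeled texts.

    Each sample is a (text, label_vector) pair. The mixing weight is an
    ASSUMPTION made for this sketch (token-count share), not a detail
    taken from the paper.
    """
    text_a, label_a = sample_a
    text_b, label_b = sample_b

    len_a = len(text_a.split())
    len_b = len(text_b.split())
    total = len_a + len_b
    # Fall back to an even split if both texts are empty.
    lam = len_a / total if total else 0.5

    # Concatenate the inputs, keeping each text intact and in order.
    text = text_a + sep + text_b
    # Convex combination of the labels: weights are non-negative and sum to 1.
    label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return text, label


def augment_dataset(samples, n_new, seed=0):
    """Draw n_new random pairs of distinct samples and concatenate each pair."""
    rng = random.Random(seed)
    return [concat_augment(*rng.sample(samples, 2)) for _ in range(n_new)]
```

Because the label weights form a convex combination, a model trained on such samples sees soft targets that still sum to one, which is the property shared with Mixup. Other weighting schemes (a fixed 0.5, or a value drawn from a Beta distribution as in Mixup) would fit the same skeleton.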
Peer Reviewed
Yes
Language
eng
Version Type
info:eu-repo/semantics/publishedVersion
Rights
openAccess
This item is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International