    Citation

    Please use this identifier to cite or link to this item: https://uvadoc.uva.es/handle/10324/67944

    Title
    A MapReduce opinion mining for COVID-19-related tweets classification using enhanced ID3 decision tree classifier
    Authors
    Es-Sabery, Fatima
    Es-Sabery, Khadija
    Qadir, Junaid
    Sainz de Abajo, Beatriz
    Hair, Abdellatif
    García Zapirain, Begoña
    Torre Díez, Isabel de la
    Document Year
    2021
    Publisher
    IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC.
    Description
    Scientific Production
    Source Document
    IEEE Access, April 2021, vol. 9, pp. 58706-58739.
    Abstract
    Opinion Mining (OM) is a field of Natural Language Processing (NLP) that aims to capture human sentiment in a given text. With the continued spread of online purchasing websites, micro-blogging sites, and social media platforms, OM on online social media platforms has piqued the interest of thousands of scientific researchers, because the reviews, tweets, and blogs acquired from these networks are a significant source for enhancing the decision-making process. The obtained textual data (reviews, tweets, or blogs) are classified into three class labels (negative, neutral, and positive) in order to analyze and extract relevant information from the given dataset. In this contribution, we introduce an innovative MapReduce improved weighted ID3 decision tree classification approach for OM, which consists mainly of three aspects. First, we use several feature extractors to efficiently detect and capture the relevant data from the given tweets, including N-grams or character-level features, Bag-of-Words, word embeddings (GloVe, Word2Vec), FastText, and TF-IDF. Second, we apply multiple feature selectors to reduce the high dimensionality of the features, including Chi-square, Gain Ratio, Information Gain, and Gini Index. Finally, we use the obtained features to carry out the classification task with an improved ID3 decision tree classifier, which calculates the weighted information gain instead of the information gain used in traditional ID3. To measure the weighted information gain for the current conditioned feature, we follow two steps: first, we compute the weighted correlation function of the current conditioned feature; second, we multiply the obtained weighted correlation function by the information gain of that feature. This work is implemented in a distributed environment using the Hadoop framework, with its programming framework MapReduce and its distributed file system HDFS. Its primary goal is to enhance the performance of the well-known ID3 classifier in terms of accuracy, execution time, and ability to handle massive datasets. We carried out several experiments that aim to assess the effectiveness of our suggested classifier compared to other contributions chosen from the literature. The experimental results demonstrate that our ID3 classifier works better on the COVID-19_Sentiments dataset than other classifiers in terms of recall (85.72 %), specificity (86.51 %), error rate (11.18 %), false-positive rate (13.49 %), execution time (15.95 s), kappa statistic (87.69 %), F1-score (85.54 %), classification rate (88.82 %), false-negative rate (14.28 %), precision (86.67 %), convergence (it converges at around iteration 90), stability (it is more stable, with a mean standard deviation of 0.12 %), and complexity (it requires much lower time and space computational complexity).
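The two-step weighting described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the formula of the weighted correlation function, so the hypothetical `weight` argument below simply stands in for its value, and the function names and toy data are invented for the example. The sketch also runs on a single machine rather than on Hadoop MapReduce.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Information gain of one categorical feature, as used in traditional ID3."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Entropy remaining after splitting on the feature, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def weighted_information_gain(feature_values, labels, weight):
    """Multiply a feature's information gain by its weight.

    ASSUMPTION: `weight` is a placeholder for the value of the weighted
    correlation function of the feature; its definition is not stated in
    the abstract, so the caller must supply it.
    """
    return weight * information_gain(feature_values, labels)


# Toy data (hypothetical): four tweets, two candidate features.
labels = ["positive", "positive", "negative", "negative"]
separating = ["yes", "yes", "no", "no"]    # splits the classes perfectly
uninformative = ["a", "b", "a", "b"]       # tells us nothing about the class

print(information_gain(separating, labels))
print(information_gain(uninformative, labels))
print(weighted_information_gain(separating, labels, weight=0.5))
```

In a tree builder, the feature with the largest weighted information gain would be chosen as the split at each node, in the same way that traditional ID3 chooses the feature with the largest unweighted gain.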
    Keywords
    Big data
    Opinion Mining
    ISSN
    2169-3536
    Peer Reviewed
    Yes
    DOI
    10.1109/ACCESS.2021.3073215
    Sponsor
    This work was funded through grant IT 905-16 of the eVIDA Research Group at the Universidad de Deusto.
    Publisher's Version
    https://ieeexplore.ieee.org/document/9404185
    Rights Holder
    "© All rights reserved". Rights holder: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC.
    Language
    eng
    URI
    https://uvadoc.uva.es/handle/10324/67944
    Version Type
    info:eu-repo/semantics/publishedVersion
    Rights
    openAccess
    Appears in Collections
    • DEP71 - Artículos de revista [358]
    Files in This Item
    Name:
    A_MapReduce OM.pdf
    Size:
    3.921Mb
    Format:
    Adobe PDF
    Description:
    Main article
    Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International

    Universidad de Valladolid
