
dc.contributor.authorEs-Sabery, Fatima
dc.contributor.authorEs-Sabery, Khadija
dc.contributor.authorQadir, Junaid
dc.contributor.authorSainz de Abajo, Beatriz 
dc.contributor.authorHair, Abdellatif
dc.contributor.authorGarcía Zapirain, Begoña
dc.contributor.authorTorre Díez, Isabel de la
dc.date.accessioned2024-06-01T18:25:14Z
dc.date.available2024-06-01T18:25:14Z
dc.date.issued2021
dc.identifier.citationIEEE Access, April 2021, vol. 9, p. 58706-58739.es
dc.identifier.issn2169-3536es
dc.identifier.urihttps://uvadoc.uva.es/handle/10324/67944
dc.descriptionScientific Productiones
dc.description.abstractOpinion Mining (OM) is a field of Natural Language Processing (NLP) that aims to capture the human sentiment expressed in a given text. With the continued spread of online shopping websites, micro-blogging sites, and social media platforms, OM on social media has attracted the interest of thousands of researchers, because the reviews, tweets, and blogs collected from these networks are a significant source of information for improving decision making. The collected textual data (reviews, tweets, or blogs) are classified into three class labels (negative, neutral, and positive) in order to analyze and extract relevant information from the given dataset. In this contribution, we introduce an innovative MapReduce improved weighted ID3 decision tree classification approach for OM, which consists mainly of three aspects. First, we use several feature extractors to efficiently detect and capture the relevant data from the given tweets, including N-grams or character-level, Bag-of-Words, word embedding (GloVe, Word2Vec), FastText, and TF-IDF. Second, we apply multiple feature selectors to reduce the high dimensionality of the features, including Chi-square, Gain Ratio, Information Gain, and Gini Index. Finally, we use the resulting features to carry out the classification task with an improved ID3 decision tree classifier, which computes a weighted information gain instead of the information gain used in traditional ID3. Specifically, to measure the weighted information gain of the current conditioned feature, we follow two steps: first, we compute the weighted correlation function of the current conditioned feature; second, we multiply the obtained weighted correlation function by the information gain of that feature.
This work is implemented in a distributed environment using the Hadoop framework, with its programming framework MapReduce and its distributed file system HDFS. Its primary goal is to enhance the performance of the well-known ID3 classifier in terms of accuracy, execution time, and ability to handle massive datasets. We carried out several experiments to assess the effectiveness of our proposed classifier compared with other contributions chosen from the literature. The experimental results show that our ID3 classifier performs better on the COVID-19_Sentiments dataset than the other classifiers in terms of recall (85.72 %), specificity (86.51 %), error rate (11.18 %), false-positive rate (13.49 %), execution time (15.95 s), kappa statistic (87.69 %), F1-score (85.54 %), classification rate (88.82 %), false-negative rate (14.28 %), precision (86.67 %), convergence (it converges at around iteration 90), stability (it is more stable, with a mean standard deviation of 0.12 %), and complexity (it requires much lower time and space complexity).es
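The two-step weighted information gain described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the weighted correlation function, so the `weight` argument below is a hypothetical stand-in supplied by the caller, and the function names are illustrative rather than taken from the paper. The sketch runs on a single machine and does not reproduce the MapReduce implementation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Classic ID3 information gain of one categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy remaining after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def weighted_information_gain(feature_values, labels, weight):
    """Multiply the information gain by a caller-supplied weight.

    `weight` stands in for the paper's weighted correlation function,
    whose formula is not given in the abstract (assumption).
    """
    return weight * information_gain(feature_values, labels)


# Toy data: one feature that separates the classes and one that does not.
labels = ["positive", "positive", "negative", "negative"]
separating = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]

gain_separating = information_gain(separating, labels)
gain_uninformative = information_gain(uninformative, labels)
weighted = weighted_information_gain(separating, labels, weight=0.5)
```

In this toy example the separating feature has a higher gain than the uninformative one, and the weight simply scales that gain; how the actual weight is derived is left to the full article.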
dc.format.mimetypeapplication/pdfes
dc.language.isoenges
dc.publisherIEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC.es
dc.rights.accessRightsinfo:eu-repo/semantics/openAccesses
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/4.0/*
dc.subject.classificationBig dataes
dc.subject.classificationOpinion Mininges
dc.titleA MapReduce opinion mining for COVID-19-related tweets classification using enhanced ID3 decision tree classifieres
dc.typeinfo:eu-repo/semantics/articlees
dc.rights.holder"© All rights reserved". Rights holder: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC.es
dc.identifier.doi10.1109/ACCESS.2021.3073215es
dc.relation.publisherversionhttps://ieeexplore.ieee.org/document/9404185es
dc.identifier.publicationfirstpage58706es
dc.identifier.publicationlastpage58739es
dc.identifier.publicationtitleIEEE Accesses
dc.identifier.publicationvolume9es
dc.peerreviewedYeses
dc.description.projectThis work was funded through grant IT 905-16 of the eVIDA Research Group of the Universidad de Deusto.es
dc.identifier.essn2169-3536es
dc.rightsAttribution-NonCommercial-NoDerivatives 4.0 International*
dc.type.hasVersioninfo:eu-repo/semantics/publishedVersiones

