    Citation

    Please use this identifier to cite or link to this item: http://uvadoc.uva.es/handle/10324/32896

    Title
    HDFS File Formats: Study and Performance Comparison
    Author
    Alonso Isla, Álvaro
    Advisor or Tutor
    Martínez Prieto, Miguel Angel
    Bregón Bregón, Aníbal
    Publisher
    Universidad de Valladolid. Escuela Técnica Superior de Ingenieros de Telecomunicación
    Document Year
    2018
    Degree
    Máster en Investigación en Tecnologías de la Información y las Comunicaciones
    Abstract
    The distributed system Hadoop has become very popular for storing and processing large amounts of data (Big Data). Because it is composed of many machines, its file system, HDFS (Hadoop Distributed File System), is also distributed. Since HDFS is not a traditional storage system, many new file formats have been developed to take advantage of its features. In this work we study these new formats to determine their characteristics and to be able to decide which ones best suit the needs of our data. To that end, we have built a theoretical framework to compare them and easily recognize which formats fit our needs. We have also carried out an experimental study to find out how the formats behave in specific situations, selecting two very different datasets and a set of simple queries, resolved with MapReduce jobs written in Java or run using the Hive tool. The final goal of this work is to identify the different strengths and weaknesses of the file formats.
    Keywords
    Big Data
    Hadoop
    HDFS
    MapReduce
    Department
    Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos)
    Language
    eng
    URI
    http://uvadoc.uva.es/handle/10324/32896
    Rights
    openAccess
    Appears in collections
    • Trabajos Fin de Máster UVa [7002]
    Files in this item
    Name:
    TFM-G932.pdf
    Size:
    5.371Mb
    Format:
    Adobe PDF
    Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International

    Universidad de Valladolid