A COMPREHENSIVE STUDY ON BIG DATA FRAMEWORKS

Nagham A. SULTAN, Dhuha B. ABDULLAH

Abstract: With the advent of cloud computing technology, the generation of data from various sources has increased during the last few years. Current data processing technology must handle the enormous volumes of newly created data. The literature has therefore concentrated on big data, which comprises enormous volumes of largely unstructured data. Dealing with such data requires well-designed frameworks that fulfil developers’ needs and suit a variety of purposes. These frameworks can be used for storing, processing, structuring, and analyzing data. The main problem facing cloud computing developers is selecting the most suitable framework for their applications. The literature includes many works on these frameworks; however, there is still a significant gap in comprehensive studies of this crucial area of research. Hence, this article presents a novel comprehensive comparison of the most popular frameworks for big data, such as Apache Hadoop, Apache Spark, Apache Flink, Apache Storm, and MongoDB. The main characteristics of each framework, in terms of advantages and drawbacks, are also investigated in depth. Our research analyzes various metrics related to data processing, including data flow, computational model, overall performance, fault tolerance, scalability, interval processing, language support, latency, and processing speed. To our knowledge, no previous research has conducted a detailed study of all these characteristics simultaneously. Our study therefore contributes to the understanding of the factors that impact data processing and provides valuable insights for practitioners and researchers in the field.

Keywords: Big data; Frameworks; Metrics of Big Data.

http://dx.doi.org/10.47832/2717-8234.14.4

ISSN (Online): 2717-8234

This work is licensed under a Creative Commons Attribution 4.0 International License.

Year: 2023 Volume: 5 Issue: 1 Area: Computer Science and Engineering

MINAR International Journal of Applied Sciences and Technology
