Abstract

This study analyzes research trends in Hadoop MapReduce within the field of Big Data analytics using a bibliometric approach. The rapid expansion of digital data has driven the development of distributed computing frameworks, with Hadoop MapReduce playing a foundational role in large-scale data processing. Despite the extensive body of research in this area, a structured evaluation of publication trends, thematic development, and collaboration networks remains essential to understand its intellectual evolution. Using bibliometric analysis supported by VOSviewer, this study examines publication growth, influential countries and institutions, keyword co-occurrence, and emerging research themes from 2005 to 2025. The findings indicate significant publication growth between 2012 and 2018, followed by thematic diversification. Major research clusters focus on distributed computing, Big Data analytics, performance optimization, and cloud integration. The analysis also reveals a shift toward integration with machine learning, cloud computing, and newer frameworks such as Apache Spark. While Hadoop MapReduce remains a fundamental technology in distributed data processing, research trends suggest increasing attention to efficiency, scalability, and hybrid analytical frameworks. This study contributes to a clearer understanding of the evolution, current landscape, and future directions of Hadoop MapReduce research in Big Data analytics.

Keywords

Hadoop MapReduce; Big Data Analytics; Bibliometric Analysis

Article Details

How to Cite
Awaluddin, M., & Windiarti, I. S. (2026). Research Trends in Hadoop MapReduce for Big Data Analytics: A Bibliometric Analysis. Sang Pencerah: Jurnal Ilmiah Universitas Muhammadiyah Buton, 12(1), 48–66. https://doi.org/10.35326/pencerah.v12i1.8195
