Experimental environment for distributed data processing integrating DevOps into the software delivery cycle

DOI:

https://doi.org/10.15649/2346030X.3011

Keywords:

data processing, big data, DevOps, Hadoop, Spark, data, cloud, AWS, cluster, infrastructure as code

Abstract

The generation of large volumes of data from diverse sources has allowed organizations to extract value and knowledge from that data. Companies therefore need specialists who can digest this data and turn it into useful information. An important question is how students can put theoretical knowledge into practice in big data environments, using cloud technologies and the tools in demand in the market, while avoiding lengthy configuration.

This article creates an experimental big data environment. It describes the concept itself, its reference architectures, and its components, then designs and implements an architecture for a distributed data processing cluster, integrating DevOps into a continuous software delivery flow through an automated deployment of big data processing infrastructure as code in the cloud.
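As a rough illustration of the infrastructure-as-code idea, the following sketch generates a declarative definition for a small cluster with one master and several workers. It is not the deployment used in the article: it assumes Terraform's JSON configuration syntax and the AWS provider's `aws_instance` resource as the target format, and the function name `build_cluster_config`, the instance type, the tag names, and the placeholder AMI ID are all hypothetical.

```python
import json


def build_cluster_config(worker_count, ami_id, instance_type="t3.medium"):
    """Return a Terraform-JSON-style dict for one master and N worker nodes.

    Assumption: the output follows Terraform's JSON configuration syntax
    with the AWS provider's aws_instance resource. All values are illustrative.
    """
    if worker_count < 1:
        raise ValueError("a distributed cluster needs at least one worker")

    def node(role, count):
        # One resource block per role; `count` controls how many instances exist.
        return {
            "ami": ami_id,
            "instance_type": instance_type,
            "count": count,
            "tags": {"Name": f"bigdata-{role}", "Role": role},
        }

    return {
        "resource": {
            "aws_instance": {
                "master": node("master", 1),
                "worker": node("worker", worker_count),
            }
        }
    }


if __name__ == "__main__":
    # The AMI ID is a placeholder, not a real image.
    config = build_cluster_config(worker_count=3, ami_id="ami-PLACEHOLDER")
    print(json.dumps(config, indent=2))
```

Keeping the cluster definition as generated data means a delivery pipeline can version it, review it, and apply it automatically, so changing the number of workers becomes a one-parameter change instead of a manual reconfiguration.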

Published

01/01/2023

How to cite

[1] C. Angulo-Angulo, “Entorno experimental de procesamiento de datos distribuidos integrando devops en el ciclo de entrega de software,” AiBi Revista de Investigación, Administración e Ingeniería, vol. 11, no. 1, pp. 20–38, Jan. 2023.

Section

Research Articles
