Data mining models to predict timber production across Colombian departments
DOI:
https://doi.org/10.15649/2346075X.5883
Keywords:
Timber production, Data mining, Machine learning, Colombia, Forecasting, Time series
Abstract
Introduction. Timber production in Colombia is strategic for economic development and environmental conservation, yet reliable predictive tools remain scarce. Objective. To evaluate the performance of statistical and machine-learning models for forecasting department-level timber mobilization volumes in Colombia using open data from the Colombian Agricultural Institute (2012–2022). Materials and Methods. Following the CRISP-DM framework, we performed data cleaning and preprocessing, imputed missing values via KSSA, and implemented five model families (ARIMA, Prophet, GLMNET, Random Forest, and Prophet Boost). Models were trained on 90% of the historical series and evaluated with RMSE, MAE, and MAPE. Results. ARIMA and Random Forest achieved the best performance depending on the stability or variability of each series, enabling reliable four-quarter-ahead forecasts. Departments such as Antioquia, Valle del Cauca, and Cauca are projected to maintain high production levels, whereas Meta and Casanare exhibit greater instability. Conclusions. These findings underscore the value of integrating open data and machine-learning techniques to support the sustainable management of Colombia’s forest resources.
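The evaluation protocol named in the abstract (a chronological 90% training split scored with RMSE, MAE, and MAPE) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the sample volumes, and the naive last-value forecast are all hypothetical stand-ins chosen only to show how the three error metrics are computed on a held-out tail of a series.

```python
import math


def chronological_split(series, train_fraction=0.9):
    """Split a time-ordered sequence into train and test parts without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so such series need
    separate handling before this metric is applied.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical quarterly volumes for one department (illustrative numbers only).
volumes = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0, 160.0, 170.0, 180.0, 200.0]

train, test = chronological_split(volumes, train_fraction=0.9)

# Naive baseline (an assumption for illustration, not one of the paper's models):
# repeat the last observed training value across the test horizon.
forecast = [train[-1]] * len(test)

scores = {
    "RMSE": rmse(test, forecast),
    "MAE": mae(test, forecast),
    "MAPE": mape(test, forecast),
}
print(scores)
```

Splitting by position rather than at random keeps the test observations strictly later than the training observations, which is what makes the reported errors a fair proxy for out-of-sample forecasting accuracy.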
References
1. Martínez-Cortés ÓG, Kant S, Isufllari H. An analysis of wood availability under six policy scenarios of commercial forest plantations in Colombia. Forest Policy and Economics. 2022;138:102722. https://doi.org/10.1016/j.forpol.2022.102722
2. Fedemaderas. Boletín Estadístico Forestal 2022 – Federación Nacional de Industriales de la Madera [Internet]. 2022 [cited 2023 May 13]. Available from: https://fedemaderas.org.co/boletin-estadistico-forestal-2022/
3. Fondo Mundial para la Naturaleza - WWF. Madera legal: un mercado lleno de oportunidades para Colombia [Internet]. 2023 [cited 2023 Sep 22]. Available from: https://www.wwf.org.co/?380872/Madera-legal-un-mercado-lleno-de-oportunidades-para-Colombia
4. Scientific Reports. Deforestation in Colombian protected areas increased during post-conflict periods [Internet]. [cited 2026 Apr 20]. Available from: https://www.nature.com/articles/s41598-020-61861-y
5. Verkerk PJ, Levers C, Kuemmerle T, Lindner M, Valbuena R, Verburg PH, et al. Mapping wood production in European forests. Forest Ecology and Management. 2015;357:228–38. https://doi.org/10.1016/j.foreco.2015.08.007
6. FAO. Global Forest Resources Assessment 2020. Key findings [Internet]. 1st ed. Rome: FAO; 2020 [cited 2026 Apr 20]. Available from: https://openknowledge.fao.org/items/ac91b7b4-87eb-41eb-bdb1-d1c31fe249a8
7. Mazon B, Jaramillo M, Romero O, Borja A, Aguirre M, Contento M. Tecnologías de Inteligencia de Negocios y Minería de datos para el análisis de la producción y comercialización de cacao. Revista ESPACIOS [Internet]. 2018 [cited 2023 Jun 13];39(32). Available from: https://www.revistaespacios.com/a18v39n32/18393206.html
8. Wu P, Yi X, Jin K. A study on Chinese output of timber prediction model based on PSO-SVM. Advances in Information Sciences and Service Sciences. 2012;4(2):227–33. https://doi.org/10.4156/aiss.vol4.issue2.28
9. Pereira Martins Silva J, Luiza Marques da Silva M, Ribeiro de Mendonça A, Fernandes da Silva G, Almeida de Barros Junior A, Ferreira da Silva E, et al. Prognosis of forest production using machine learning techniques. Information Processing in Agriculture. 2023;10(1):71–84. https://doi.org/10.1016/j.inpa.2021.09.004
10. Yasar K. What is data analytics? | Definition from TechTarget [Internet]. Search Data Management. 2024 [cited 2025 Sep 16]. Available from: https://www.techtarget.com/searchdatamanagement/definition/data-analytics
11. Espinosa-Zúñiga JJ. Aplicación de metodología CRISP-DM para segmentación geográfica de una base de datos pública. Ingeniería, investigación y tecnología. 2020;21(1). https://doi.org/10.22201/fi.25940732e.2020.21n1.008
12. Ministerio de Agricultura y Desarrollo Rural. Base de datos relacionada con madera movilizada proveniente de Plantaciones Forestales Comerciales | Datos Abiertos Colombia [Internet]. 2024 [cited 2025 Sep 3]. Available from: https://www.datos.gov.co/Agricultura-y-Desarrollo-Rural/Base-de-datos-relacionada-con-madera-movilizada-pr/9aan-wm8m/about_data
13. Mahmoudvand R, Rodrigues PC. Missing value imputation in time series using Singular Spectrum Analysis. Int J Energy Stat. 2016;4(1):1650005. https://doi.org/10.1142/S2335680416500058
14. Hyndman RJ, Athanasopoulos G. Forecasting: Principles and Practice. 2nd ed [Internet]. Melbourne: OTexts; 2018 [cited 2025 Sep 3]. Available from: https://otexts.com/fpp2/
15. Yadav S, Shukla S. A comparative study of ARIMA, Prophet and LSTM for time series prediction. J Artif Intell Mach Learn Data Sci. 2022;1(1):1813–6. https://doi.org/10.51219/JAIMLD/sandeep-yadav/402
16. Engebretsen S, Bohlin J. Statistical predictions with glmnet. Clin Epigenetics. 2019;11:123. https://doi.org/10.1186/s13148-019-0730-1
License
Copyright (c) 2026 Innovaciencia

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
All articles published in this scientific journal are protected by copyright. The authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which permits sharing the work with authorship recognition and without commercial purposes.
Readers may copy and distribute the material from this journal issue for non-commercial purposes in any medium, provided the original work is cited and credit is given to the authors and the journal.
Any commercial use of the material from this journal is strictly prohibited without written permission from the copyright holder.









