Predicting Economic Insolvency in Colombian SMEs Using Machine Learning Models.
Abstract
Economic insolvency among Small and Medium Enterprises (SMEs) is a significant challenge in Colombia, where SMEs make up 99% of the business fabric, generate 80% of employment, and contribute 35% of national GDP. This research therefore focused on developing a machine learning-based predictive model to anticipate economic insolvency, using financial and sociodemographic data from Colombian SMEs. The data came from the Superintendencia de Sociedades (Superintendence of Companies) for 2021-2022 and covered 10,952 non-insolvent and 477 insolvent SMEs in 2021, and 11,030 non-insolvent and 470 insolvent SMEs in 2022. Two Random Forest models were compared under different sampling techniques, specifically undersampling and oversampling, and undersampling was found to mitigate the class imbalance effectively. The first Random Forest model correctly classified 85% of the solvent companies and 81% of the insolvent ones. The model's results were also replicated on a database from a different year, which supports its robustness and ability to generalize. These findings reinforce confidence in the model's effectiveness and applicability across contexts and periods, positioning it as a sound tool for anticipating economic insolvency in Colombian SMEs and as valuable support for Chambers of Commerce and other economic actors in the early identification and mitigation of insolvency risk.
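The sampling and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `undersample` and `per_class_recall` are hypothetical, the Random Forest itself is omitted, and only the 2021 class counts are taken from the abstract. The sketch assumes plain random undersampling (keep every insolvent firm and draw an equally sized random sample of non-insolvent firms) and reads the reported 85% and 81% figures as per-class recall.

```python
import random


def undersample(labels, seed=0):
    """Return sorted row indices with every class cut down to the minority-class size."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        # The minority class is kept whole; larger classes are randomly sampled.
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals, hits = {}, {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (t == p)
    return {c: hits[c] / totals[c] for c in totals}


# 2021 class counts from the abstract: 0 = non-insolvent, 1 = insolvent.
labels_2021 = [0] * 10952 + [1] * 477
balanced_idx = undersample(labels_2021)
balanced_labels = [labels_2021[i] for i in balanced_idx]
print(len(balanced_idx), sum(balanced_labels))  # 954 477
```

In practice only the training split would be undersampled, while per-class recall would be computed on an untouched test set (or on the other year's data), so that the reported rates reflect the real class proportions.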