Hybrid Modeling for Sales Prediction Using SARIMA, CNN, LSTM, and Stacking Ensemble
Keywords:
Demand forecasting, SARIMA, CNN, LSTM, stacking ensemble, supply chain, predictive accuracy, hybrid modeling, seasonal trends, time-series forecasting, machine learning, deep learning, statistical modeling, XGBoost, Walmart sales data, external variables, Consumer Price Index (CPI), unemployment rate, public holidays, weather data, economic indicators, hierarchical forecasting, feature engineering, data preprocessing, rolling statistics, lag features, ensemble learning, sales prediction, operational efficiency, inventory management, retail analytics, climate impact, promotional analysis, holiday effects, resource allocation
Abstract
Accurately forecasting sales in dynamic supply chain environments is essential for optimizing inventory management, resource allocation, and operational efficiency. Ineffective sales forecasting often leads to overstocking, understocking, increased operational costs, and diminished customer satisfaction, adversely affecting supply chain stakeholders worldwide. This study addresses the challenge by developing a hybrid modeling framework that integrates SARIMA, CNN, LSTM, and stacking ensemble methodologies, and evaluates how effectively traditional statistical models can be combined with advanced machine learning techniques for demand forecasting.
SARIMA models effectively captured seasonal and linear trends, while CNN and LSTM architectures identified non-linear and temporal dependencies. However, SARIMA was fitted on weekly aggregated data, whereas the CNN and LSTM models were trained on daily data, and reconciling the two granularities posed significant challenges. Because of this mismatch, SARIMA was excluded from the initial XGBoost-based stacking ensemble. To address this limitation, a hybrid SARIMA-XGBoost model was subsequently developed and evaluated. A second constraint was the limited time available for fine-tuning the CNN and LSTM models; as a result, SARIMA outperformed both in predictive accuracy. The SARIMA-XGBoost hybrid model outperformed the standalone CNN and LSTM models but was slightly less accurate than SARIMA alone. The hybrid model excelled at capturing seasonal patterns, external variables, and irregular trends within the dataset.
Historical sales data from 45 Walmart stores, augmented with external variables such as the Consumer Price Index (CPI), unemployment rates, temperature, and holiday indicators, formed the basis of the study.
The findings revealed SARIMA’s robustness in handling linear and seasonal trends under constrained conditions, while the SARIMA-XGBoost hybrid model improved predictive accuracy over the standalone CNN and LSTM models.
This study concludes that hybrid frameworks hold substantial potential for improving demand forecasting, particularly in addressing diverse temporal granularities and resource constraints. Future research should focus on integrating additional external variables and optimizing deep learning models to refine the hybrid framework’s applicability across industries. Such advancements can empower supply chain managers with actionable insights to reduce costs, enhance operational efficiency, and improve customer satisfaction.
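The granularity mismatch described above (weekly SARIMA forecasts versus daily CNN and LSTM forecasts) can be illustrated with a minimal sketch. It shows one possible way to align the two, by summing daily forecasts into weekly totals so that both model families share a weekly index a meta-learner could consume. This is an assumption for illustration, not the procedure used in the study; the function names, the Monday week boundary, and all numbers are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Monday of the week containing `day` (assumed week boundary)."""
    return day - timedelta(days=day.weekday())


def aggregate_daily_to_weekly(daily: dict[date, float]) -> dict[date, float]:
    """Sum daily values into weekly totals keyed by each week's Monday."""
    weekly: dict[date, float] = defaultdict(float)
    for day, value in daily.items():
        weekly[week_start(day)] += value
    return dict(weekly)


def build_stacking_rows(
    weekly_stat: dict[date, float], daily_dl: dict[date, float]
) -> list[tuple[date, float, float]]:
    """Pair the weekly statistical forecast with the aggregated daily
    deep-learning forecast for every week present in both sources."""
    weekly_dl = aggregate_daily_to_weekly(daily_dl)
    shared_weeks = sorted(set(weekly_stat) & set(weekly_dl))
    return [(w, weekly_stat[w], weekly_dl[w]) for w in shared_weeks]


# Hypothetical forecasts, for illustration only.
start = date(2024, 1, 1)  # a Monday
daily_forecast = {start + timedelta(days=i): 10.0 for i in range(14)}
weekly_forecast = {start: 68.0, start + timedelta(days=7): 72.0}

# Each row: (week start, weekly-model forecast, aggregated daily-model forecast)
rows = build_stacking_rows(weekly_forecast, daily_forecast)
```

Each row could serve as one feature vector for a stacking meta-learner. Weeks with missing days would yield understated totals under this scheme, so a real pipeline would need to drop or flag incomplete weeks.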
License
Copyright (c) 2024 Adekunle O. Ajiboye

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
