Proactive resource scaling in container orchestration platforms such as Kubernetes is essential for maintaining application responsiveness under fluctuating workloads. This study explores predictive auto-scaling by forecasting CPU utilization from service request volumes using statistical and neural-network-based time series models. We applied 29 forecasting techniques to production traces from a FinTech system, evaluated each model’s accuracy using nine metrics, including MAE, RMSE, SMAPE, MASE, RMSSE, $R^2$, NMAE, and NRMSE, and additionally assessed the actual prediction distance. Our results show that several statistical models, specifically AutoTheta, FFT, and Exponential Smoothing, achieved higher accuracy than the neural approaches and completed inference within one second, although the neural approaches showed more consistent accuracy. These findings demonstrate the viability of lightweight forecasting-driven scaling and suggest practical improvements for auto-scaling reliability in Kubernetes environments.