This paper introduces an adaptive test-healing framework that delivers resolver-level resilience for GraphQL service-oriented architectures by unifying three telemetry streams (semantic log embeddings obtained from large language models, structural dependencies encoded with graph neural networks, and statistically grounded operational metrics) into a single reinforcement-learning state vector. A deep Q-network learns context-aware recovery actions, including selective retry, safe skip, dependency reordering, and escalation, enabling automated mitigation of cloud-induced flakiness without obscuring root causes. Evaluated over more than one thousand episodes that inject realistic cloud uncertainty, the approach raises test and runtime success rates from 68.7% to 92%, reduces mean time to recovery from 687 ms to 203 ms, trims continuous-integration compute time by 61% through a KL-stability early-stop rule, and preserves tail-latency accuracy within a 5% error bound while adding only 11.8 ms of median overhead per healed request. These results demonstrate that statistically principled, reinforcement-learning-driven healing can provide practical, fine-grained self-recovery for modern cloud applications.
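To make the pipeline concrete, the sketch below illustrates the two mechanisms the abstract names: fusing the three telemetry streams into one state vector consumed by a Q-network over the four recovery actions, and a KL-based stability check for early-stopping training episodes. This is a minimal illustration, not the paper's implementation; the embedding dimensions, network sizes, epsilon, and stopping tolerance are all assumed placeholders.

```python
import torch
import torch.nn as nn

# Assumed dimensions for the three telemetry streams (placeholders).
LOG_EMB_DIM = 64   # LLM-derived semantic log embedding
GNN_EMB_DIM = 32   # GNN encoding of resolver dependency structure
METRIC_DIM = 8     # statistical operational metrics (latency, error rate, ...)

# The four recovery actions named in the abstract.
ACTIONS = ["retry", "skip", "reorder", "escalate"]


class HealingDQN(nn.Module):
    """Q-network over the fused telemetry state (illustrative architecture)."""

    def __init__(self) -> None:
        super().__init__()
        state_dim = LOG_EMB_DIM + GNN_EMB_DIM + METRIC_DIM
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128),
            nn.ReLU(),
            nn.Linear(128, len(ACTIONS)),  # one Q-value per recovery action
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def fuse_state(log_emb: torch.Tensor,
               gnn_emb: torch.Tensor,
               metrics: torch.Tensor) -> torch.Tensor:
    """Concatenate the three telemetry streams into one RL state vector."""
    return torch.cat([log_emb, gnn_emb, metrics], dim=-1)


def select_action(q_net: HealingDQN, state: torch.Tensor,
                  epsilon: float = 0.05) -> str:
    """Epsilon-greedy choice of a recovery action for the current resolver."""
    if torch.rand(1).item() < epsilon:
        return ACTIONS[torch.randint(len(ACTIONS), (1,)).item()]
    with torch.no_grad():
        return ACTIONS[q_net(state).argmax().item()]


def kl_divergence(p: torch.Tensor, q: torch.Tensor,
                  eps: float = 1e-8) -> float:
    """KL(p || q) between two discrete action distributions."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return (p * (p / q).log()).sum().item()


def kl_stable(prev_dist: torch.Tensor, curr_dist: torch.Tensor,
              tol: float = 1e-3) -> bool:
    """Early-stop rule: halt further CI episodes once the learned action
    distribution has stabilized (KL below an assumed tolerance)."""
    return kl_divergence(curr_dist, prev_dist) < tol


# Usage: fuse one step of telemetry and pick a recovery action.
if __name__ == "__main__":
    q_net = HealingDQN()
    state = fuse_state(torch.randn(LOG_EMB_DIM),
                       torch.randn(GNN_EMB_DIM),
                       torch.randn(METRIC_DIM))
    print(select_action(q_net, state))
```

Under these assumptions, the early-stop check compares successive empirical action distributions across training batches and halts once they agree within the tolerance, which is how a rule of this form can cut continuous-integration compute without further policy change.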