Does moving a game’s first “gate” from level 30 to level 40 improve player retention? Using an A/B test dataset from the mobile game Cookie Cats, I analyzed how a design change impacted player behavior, going beyond top-level metrics to understand the full user experience.
Problem
Retention is a key metric for the long-term success of mobile games. A change to the game’s design was proposed: move the first gate—a point where progression slows down—from level 30 to level 40.
The business question: Would this change increase short-term or long-term player retention without negatively impacting overall engagement?
Approach
- Data Preparation — Loaded and inspected the dataset, checked data quality, and ran a sample ratio mismatch (SRM) check to confirm that players were split between the control and treatment groups as intended.
- A/B Test Analysis — Used a two-proportion z-test to compare Day-1 and Day-7 retention rates between the two groups.
- Guardrail Analysis — Assessed the impact on a key guardrail metric, sum_gamerounds, to ensure the change didn’t harm overall player engagement.
- Subgroup Analysis — Segmented players into “Casual” and “Heavy” users based on median game rounds played to identify any heterogeneous treatment effects.
- Survival Analysis — Employed Kaplan-Meier curves to visualize and compare the “survival” (retention) of both groups over a 7-day period, providing a more continuous view of the impact than snapshot metrics alone.
- Bayesian Analysis — Modeled retention with Beta–Binomial distributions to estimate the posterior probability that the treatment outperformed control and to compute credible intervals for lift.
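To make the SRM check and the two-proportion z-test more concrete, here is a minimal sketch using only the Python standard library. It is not the project's actual code (which relied on scipy and statsmodels). The function names `srm_chi2` and `two_proportion_ztest` are hypothetical, and the counts in the example call are made up for illustration, not taken from the Cookie Cats data.

```python
import math


def srm_chi2(n_control, n_treatment, expected_ratio=0.5):
    """Chi-squared goodness-of-fit test (1 degree of freedom) for sample ratio mismatch.

    expected_ratio is the share of users the design intended for control
    (assumed 50/50 here).
    """
    total = n_control + n_treatment
    exp_c = total * expected_ratio
    exp_t = total * (1 - expected_ratio)
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def two_proportion_ztest(x_c, n_c, x_t, n_t):
    """Two-sided pooled z-test for the difference in retention rates.

    x_* are retained-player counts, n_* are group sizes.
    z is positive when treatment retains better than control.
    """
    p_c, p_t = x_c / n_c, x_t / n_t
    pooled = (x_c + x_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# Illustrative, made-up counts (NOT the real dataset).
chi2, srm_p = srm_chi2(n_control=5000, n_treatment=5100)
z, p = two_proportion_ztest(x_c=950, n_c=5000, x_t=900, n_t=5100)
print(f"SRM: chi2={chi2:.3f}, p={srm_p:.3f}")
print(f"Day-7 retention: z={z:.3f}, p={p:.4f}")
```

A very small SRM p-value would suggest the assignment itself is skewed, in which case the retention comparison should not be trusted until the cause is understood.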
Results
- Overall A/B Test: Moving the gate to level 40 produced no statistically conclusive change in Day-1 retention but significantly decreased Day-7 retention.
- Guardrail Check: The change had no significant impact on the total number of game rounds played.
- Subgroup Analysis: The negative effect on long-term retention was driven entirely by heavy players, the most valuable segment of the user base. Casual players were unaffected.
- Survival Analysis: The survival curves reinforced the findings, showing that the treatment group’s players churned faster.
- Bayesian Analysis: The posterior probability strongly favored control over treatment for Day-7 retention, with a credible interval showing negative lift.
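The Bayesian comparison can be sketched along the following lines. This is an illustrative, standard-library-only version, not the project's actual code. It assumes a uniform Beta(1, 1) prior, measures lift as the absolute difference in retention rates (treatment minus control), and uses Monte Carlo draws. The function name `posterior_comparison` is hypothetical and the counts in the example call are made up.

```python
import random


def posterior_comparison(x_c, n_c, x_t, n_t, draws=20000, seed=0, prior=(1.0, 1.0)):
    """Beta-Binomial comparison of two retention rates.

    Returns (P(treatment > control), 95% credible interval for absolute lift).
    x_* are retained-player counts, n_* are group sizes.
    """
    rng = random.Random(seed)
    a0, b0 = prior  # assumed Beta(1, 1) prior, i.e. uniform on [0, 1]
    wins = 0
    lifts = []
    for _ in range(draws):
        # A Beta prior with a binomial likelihood gives a Beta posterior.
        p_c = rng.betavariate(a0 + x_c, b0 + n_c - x_c)
        p_t = rng.betavariate(a0 + x_t, b0 + n_t - x_t)
        wins += p_t > p_c
        lifts.append(p_t - p_c)
    lifts.sort()
    lower = lifts[int(0.025 * draws)]
    upper = lifts[int(0.975 * draws) - 1]
    return wins / draws, (lower, upper)


# Illustrative, made-up counts (NOT the real dataset).
prob, (lower, upper) = posterior_comparison(x_c=950, n_c=5000, x_t=900, n_t=5100)
print(f"P(treatment > control) = {prob:.3f}")
print(f"95% credible interval for absolute lift: [{lower:.4f}, {upper:.4f}]")
```

A posterior probability close to zero, together with an interval that sits entirely below zero, is the pattern that would favor control over treatment.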
Impact
The analysis revealed that while the new design didn’t hurt casual players’ retention, it reduced Day-7 retention among the most engaged players, who are crucial for the game’s long-term health. The results provide a clear, data-driven recommendation: Do not ship the new gate design.
Skills & Tools
- Python (pandas, NumPy, scipy, statsmodels)
- Statistical Analysis (two-proportion z-test, chi-squared test, survival analysis, Bayesian Beta–Binomial modeling)
- Data Visualization (Matplotlib, Kaplan-Meier plots)
- Experimentation (A/B testing, guardrail metrics, subgroup analysis)
Check out the full code on GitHub
