
Predicting Music Subscription Churn at Scale with PySpark

How can streaming platforms use big data to predict churn? Using the KKBox dataset (tens of millions of transactions and listening log records), I built a scalable churn prediction pipeline with PySpark to show how distributed computing enables machine learning on massive data. Problem: Churn is a major threat for subscription services. The business question: can …

Predicting & Explaining Music Subscription Churn

How do music platforms keep users subscribed in a competitive streaming market? Using the KKBox churn dataset, I combined machine learning and causal inference to explore not just who is likely to churn, but why. Problem: Subscription churn is a major revenue challenge for streaming platforms. The business question: can we predict which users are at risk, …

Employee Attrition Prediction

Using the IBM HR Analytics Employee Attrition & Performance dataset, I built interpretable machine learning models to understand why employees leave and to help HR teams identify at-risk staff. This project combines EDA, class imbalance handling, and predictive modeling with a focus on interpretability. Problem: Employee turnover is costly, and HR teams need tools to …