DATA ANALYTICS: ML ATTRITION PREDICTOR (MAP)

Discover the transformative power of data analytics as it reaches beyond historical insights. Data analytics doesn't just narrate the past; it crafts a narrative for what lies ahead.

In the realm of workforce dynamics, I delve into a project that redefines the future for a leading retirement home company. Uncover the nuances of predictive attrition modeling, where data speaks not only of the present but also foretells the workforce's future journey.

Welcome to a realm where insights predict, guide, and shape the path forward.

GOAL

How can we nurture lasting commitments? What propels talent out the door? Can we forecast the company's strength in retaining employees with precision? The answers are woven into the fabric of data analytics techniques.

RESULT

Through analysis, Python programming, and machine learning, I uncovered hidden patterns in the data. The result is an algorithm that is now integral to decision-making and guides employee retention efforts. This project transforms how we decipher data, empowering the company to navigate the present and forge a future where employee retention is a realized strength.

Want to know more about my experience with Data Analytics & Machine Learning? Please get in touch with me here.

CASE

PREDICTING EMPLOYEE ATTRITION: A STRATEGIC DIVE

Unlocking the ability to foresee employee attrition is a game-changer for any organization. Attrition, a crucial metric defined by the number of departures within a time frame divided by the active employee count, holds the key to understanding and mitigating workforce challenges. In this case study, we delve into a project centred around a senior housing company grappling with post-pandemic employee retention amidst a growing industry shortage.
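The attrition definition above can be sketched as a small calculation. The function name and the figures are hypothetical and chosen only for illustration; they are not taken from the project's data.

```python
def attrition_rate(departures: int, active_employees: int) -> float:
    """Departures within a time frame divided by the active employee count."""
    if active_employees <= 0:
        raise ValueError("active_employees must be positive")
    return departures / active_employees


# Hypothetical figures for a single period, for illustration only.
rate = attrition_rate(departures=12, active_employees=240)
print(f"Attrition rate: {rate:.1%}")
```

How the active employee count is measured (start of period, end of period, or an average) is a choice the definition leaves open, so the same departures can yield slightly different rates.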

DECODING EMPLOYEE CHARACTERISTICS TO ATTRITION

Our journey begins with a comprehensive Exploratory Data Analysis (EDA), where we scrutinize both employee and behavioural characteristics to unearth predictors of attrition. The EDA, showcased through visualizations, becomes the compass guiding our modeling journey. Around 25 features come under scrutiny, with charts shedding light on the interplay between 34 attributes such as Performance Rating, Num Companies Worked, Distance From Home, Total Working Years, Job Level, Years In Current Role, and Age. Some examples are shown below.

Age Distribution by Attrition Status
Ex-Employees by Marital Status
Attrition count
Correlation Map

Some Key Findings from the EDA Expedition:

- The dataset is imbalanced, which calls for Area Under the ROC Curve (AUC) rather than plain accuracy for model evaluation.

- Identification of redundant features: EmployeeCount, EmployeeNumber, StandardHours, and Over18.

- Insight: Single employees exhibit a higher proportion of attrition compared to their married or divorced counterparts.

- Around 10% of ex-employees departed at their 2-year work anniversary.

- Noteworthy: Nurses exhibit a significant percentage of attrition.
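Two of the findings above, the redundant features and the attrition share by marital status, can be sketched with pandas on a tiny made-up sample. The column names `MaritalStatus` and `Attrition`, and every value in the table, are assumptions for illustration; the real dataset is not reproduced here.

```python
import pandas as pd

# Made-up sample rows; only the four redundant column names come from the findings.
df = pd.DataFrame({
    "EmployeeNumber": [1, 2, 3, 4, 5, 6],
    "EmployeeCount": [1] * 6,
    "StandardHours": [80] * 6,
    "Over18": ["Y"] * 6,
    "MaritalStatus": ["Single", "Single", "Married", "Married", "Divorced", "Single"],
    "Attrition": ["Yes", "Yes", "No", "No", "No", "No"],
})

# Columns with a single distinct value carry no signal;
# EmployeeNumber is only an identifier.
constant_cols = [col for col in df.columns if df[col].nunique() == 1]
redundant = constant_cols + ["EmployeeNumber"]
features = df.drop(columns=redundant)

# Share of each class, which reveals the imbalance.
class_share = df["Attrition"].value_counts(normalize=True)

# Proportion of leavers within each marital status.
attrition_by_status = df["Attrition"].eq("Yes").groupby(df["MaritalStatus"]).mean()

print(redundant)
print(class_share)
print(attrition_by_status)
```

Comparing proportions within each group, rather than raw counts, keeps a large group from looking worse simply because it has more employees.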

CRAFTING THE MODEL: A SYMPHONY OF ALGORITHMS

Before plunging into machine learning, a careful separation of training and testing datasets sets the stage. Baseline algorithms (Logistic Regression, Random Forest, SVM, KNN, Decision Tree Classifier, and Gaussian Naive Bayes) are compared on ROC AUC. The three chosen on that comparison, Logistic Regression, Random Forest, and SVM, are then fine-tuned, leading to the emergence of the final model.
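The split and baseline comparison described above might look roughly like the following scikit-learn sketch. It runs on synthetic data from `make_classification`, and the class weights, split ratio, fold count, and default hyperparameters are all assumptions, not the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the employee data (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5,
    weights=[0.84, 0.16], random_state=0,
)

# Hold out a test set first; stratify to preserve the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale-sensitive models get a StandardScaler inside a pipeline.
baselines = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(random_state=0)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gaussian NB": GaussianNB(),
}

# Mean cross-validated ROC AUC on the training data only.
scores = {
    name: cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc").mean()
    for name, model in baselines.items()
}

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20} {score:.3f}")
```

Scoring the baselines by cross-validation on the training portion keeps the test set untouched until the final comparison.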

ROC AUC Comparison
ROC Graph

The fine-tuned Logistic Regression and SVM shine with higher scores. A combined model is also attempted and yields an AUC of around 0.864, yet the standalone Logistic Regression steals the spotlight as the final model. A strategic decision is made to monitor its real-world performance until an updated dataset warrants revisiting the combined model.
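One way to picture the fine-tuning and the combined-model attempt is the sketch below. The case study does not say how the models were tuned or combined, so the grid over `C` and the soft-voting ensemble are assumptions, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the employee data (assumption).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5,
    weights=[0.84, 0.16], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Tune the regularization strength of Logistic Regression on ROC AUC.
logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
search = GridSearchCV(
    logreg,
    param_grid={"logisticregression__C": [0.01, 0.1, 1, 10]},
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train)
tuned_logreg = search.best_estimator_

# Hypothetical combined model: average the predicted probabilities
# of Logistic Regression and SVM (soft voting).
svm = make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))
combined = VotingClassifier(
    estimators=[("logreg", tuned_logreg), ("svm", svm)],
    voting="soft",
)
combined.fit(X_train, y_train)

# Compare both candidates on the held-out test set.
auc_logreg = roc_auc_score(y_test, tuned_logreg.predict_proba(X_test)[:, 1])
auc_combined = roc_auc_score(y_test, combined.predict_proba(X_test)[:, 1])
print(f"Logistic Regression AUC: {auc_logreg:.3f}")
print(f"Combined model AUC:      {auc_combined:.3f}")
```

When the two test scores are close, preferring the single, more interpretable model is a reasonable default.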

NAVIGATING THE FUTURE: PREDICTIVE POWER UNLEASHED

Armed with a meticulously crafted statistical model, we unleash the power to calculate attrition probabilities for each employee. The model's algorithm, coupled with a user-friendly interface on the company server, ranks employees by likelihood of departure. This actionable insight empowers the company to proactively intervene, reshaping the workforce narrative and fostering a culture where attrition is not merely predicted but strategically managed.
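The ranking step can be sketched as follows: fit a model, read the predicted probability of the "leaver" class for each employee, and sort. The employee IDs, the column names, and the synthetic data are hypothetical; the company's actual interface and scoring pipeline are not shown in this case study.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for employee features and attrition labels (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.84, 0.16], random_state=0
)
employee_ids = np.arange(1001, 1001 + len(X))  # hypothetical identifiers

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# For brevity the training rows are scored here; in practice the model
# would score current employees it has not been trained on.
ranking = (
    pd.DataFrame({
        "employee_id": employee_ids,
        "attrition_probability": model.predict_proba(X)[:, 1],
    })
    .sort_values("attrition_probability", ascending=False)
    .reset_index(drop=True)
)

print(ranking.head(10))
```

The top rows of such a table are the employees a retention team would review first.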