Salifort Motors — Employee Turnover Prediction

Core Finding: Employees are systematically overworked

Across all models, the same four features dominate turnover prediction: last evaluation score, number of projects, tenure, and the overwork flag. The data reveal two distinct high-risk groups: employees with too few projects (disengaged) and employees with too many (burned out). Neither extreme retains talent. The company's evaluation system appears to reward overwork, creating a perverse incentive structure that drives departures.

Model Comparison — Test Set Performance

Model	Accuracy	Precision	Recall	F1 Score	AUC	Note
Logistic Regression	83%	80%	83%	80%	—	Most interpretable
Decision Tree	96.2%	87.0%	90.4%	88.7%	93.8%	Strong baseline
Random Forest ★	96.2%+	87%+	90%+	88.7%+	93.8%+	Best performer
XGBoost	~96%	~87%	~90%	~88%	~93%	Comparable to RF

Figure: Model Accuracy Comparison. Test set accuracy across all 4 models (%).

Figure: Feature Importance (Random Forest). Top predictors of employee departure.

Figure: Precision vs Recall (All Models). Trade-off between false positives and false negatives.

Figure: Turnover Rate by Number of Projects. U-shaped risk: both extremes drive departures.

Key Findings & Business Recommendations

🔴 Overwork is the primary driver

Employees working 200+ hours/month leave at high rates. High evaluation scores are disproportionately awarded to overworked employees, creating a perverse incentive. Recommend capping monthly hours and rebalancing evaluation criteria.

📊 Project load has a U-shaped risk

Both extremes are dangerous — employees with 2 projects leave (disengaged), and employees with 6–7 leave (burned out). The sweet spot is 3–5 projects. Recommend capping projects at 5 per employee.
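The U-shape described above comes from grouping employees by project count and computing the share who left in each group. The sketch below shows that calculation in plain Python. The function name and the toy records are hypothetical and made up for illustration; they are not taken from the Salifort dataset and do not reproduce its rates.

```python
from collections import defaultdict


def turnover_rate_by_projects(records):
    """Return {number_of_projects: share of employees who left}.

    `records` is an iterable of (number_of_projects, left) pairs,
    where `left` is 1 if the employee departed and 0 otherwise.
    """
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for projects, left in records:
        totals[projects] += 1
        leavers[projects] += left
    # Sort by project count so the result reads left to right like the chart.
    return {p: leavers[p] / totals[p] for p in sorted(totals)}


# Made-up toy records for illustration only (NOT the real dataset).
toy_records = [
    (2, 1), (2, 1), (2, 0),
    (3, 0), (3, 0),
    (4, 0), (4, 1), (4, 0), (4, 0),
    (7, 1), (7, 1),
]

rates = turnover_rate_by_projects(toy_records)
for projects, rate in rates.items():
    print(f"{projects} projects: {rate:.0%} left")
```

With real data, high rates at both ends of the project range and low rates in the middle would show up as the U-shape.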

📅 4-year tenure is a critical inflection point

Employees at exactly 4 years show unusually high departure rates, possibly linked to promotion timelines. Recommend investigating promotion policies for this cohort specifically.

💬 Satisfaction score is a leading indicator

Self-reported satisfaction strongly predicts departure even when controlling for workload. Recommend regular pulse surveys and acting on results — not just collecting them.

Methodology

Framework

Google PACE framework — Plan, Analyze, Construct, Execute. EDA first to understand distributions and correlations, then feature engineering (overwork flag, tenure buckets), then model building and comparison.

Feature Engineering

Created a binary overworked flag (average monthly hours > 175), tenure buckets, and interaction features. Removed features suspected of data leakage before final model training.
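A minimal sketch of the overworked flag follows, using the 175-hour threshold stated above. The field name `average_monthly_hours` and the helper name `add_overworked_flag` are assumptions for illustration; the project's actual column names and code may differ. Tenure buckets are omitted because their boundaries are not given here.

```python
OVERWORK_THRESHOLD_HOURS = 175  # threshold stated in the write-up


def add_overworked_flag(employee, threshold=OVERWORK_THRESHOLD_HOURS):
    """Return a copy of the record with a binary 'overworked' field.

    `employee` is a dict with an 'average_monthly_hours' key
    (hypothetical field name). The flag is 1 when hours are strictly
    above the threshold, matching "avg monthly hours > 175".
    """
    flagged = dict(employee)  # copy so the input record is not mutated
    flagged["overworked"] = int(employee["average_monthly_hours"] > threshold)
    return flagged


# Illustrative records, not real employees.
at_threshold = add_overworked_flag({"average_monthly_hours": 175})
above_threshold = add_overworked_flag({"average_monthly_hours": 200})
print(at_threshold["overworked"], above_threshold["overworked"])
```

The comparison is strict, so an employee at exactly 175 hours is not flagged.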

Evaluation Metrics

Accuracy, Precision, Recall, F1-Score, AUC-ROC. Prioritised Recall — in an HR context, missing a true leaver (false negative) is more costly than a false alarm.
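To make the trade-off concrete, the sketch below computes precision, recall, and F1 from confusion-matrix counts using their standard definitions. The function names and the toy counts are illustrative assumptions, not the project's code or results. The last lines check that the Decision Tree row in the comparison table is internally consistent: precision 87.0% and recall 90.4% should give the reported F1 of 88.7%.

```python
def precision(tp, fp):
    """Share of predicted leavers who actually left."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual leavers the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts for illustration: 90 leavers caught, 10 false alarms,
# 30 leavers missed (the false negatives this project treats as costlier).
toy_precision = precision(tp=90, fp=10)
toy_recall = recall(tp=90, fn=30)
toy_f1 = f1_score(toy_precision, toy_recall)

# Consistency check on the Decision Tree row reported in the table.
decision_tree_f1 = round(f1_score(0.870, 0.904), 3)
print(toy_precision, toy_recall, round(toy_f1, 3), decision_tree_f1)
```

Prioritising recall means accepting a lower `toy_precision` (more false alarms) in exchange for fewer missed leavers.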

Dataset

14,999 employee records · 10 features · Binary target (left = 1/0) · Multinational vehicle manufacturer · Google Advanced Data Analytics Certificate capstone dataset.
