People Analytics - IBM Watson's simulated dataset


Presented here is the fictional dataset from IBM Watson’s People Analytics module, created by IBM’s data science team.

Let’s uncover the factors that lead to employee attrition and explore questions such as ‘How does distance from home break down by job role and attrition?’ or ‘How does average monthly income compare by education and attrition?’.
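
Both example questions are grouped averages. A minimal sketch in plain Python follows, assuming the data has been read into a list of dicts (for example with csv.DictReader). The column names Education, Attrition and MonthlyIncome are assumptions, and the rows are toy values for illustration, not records from the dataset.

```python
from collections import defaultdict
from statistics import mean


def grouped_mean(rows, group_cols, value_col):
    """Average value_col for every combination of the group_cols values."""
    buckets = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_cols)
        buckets[key].append(float(row[value_col]))
    return {key: mean(values) for key, values in buckets.items()}


# Toy rows for illustration only; the column names are assumed.
rows = [
    {"Education": "1", "Attrition": "Yes", "MonthlyIncome": "2000"},
    {"Education": "1", "Attrition": "Yes", "MonthlyIncome": "3000"},
    {"Education": "1", "Attrition": "No", "MonthlyIncome": "5000"},
    {"Education": "2", "Attrition": "No", "MonthlyIncome": "6000"},
]

income_by_group = grouped_mean(rows, ["Education", "Attrition"], "MonthlyIncome")
for key, avg in sorted(income_by_group.items()):
    print(key, avg)
```

The same helper would answer the distance-from-home question by passing a job-role column and a distance column instead, again under whatever names the file actually uses.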

Please find the referenced dataset here: Dropbox

Original link for the dataset:

Dataset Snapshot:
N = 1470
Variables = 35
Attrition = 237 left / 1233 stayed, i.e. 237 / 1470 ~ 16.1 % attrition rate (leaver-to-stayer ratio 237 / 1233 ~ 19.2 %)

Structure of this post:
     a)  Conclusions & Recommendations.
     b)  Exploratory data analysis (EDA) of the key variables related to income, promotion, performance and work-life balance, which are the usual triggers for attrition.

Conclusions and Recommendations for the workplace in context:
a)  Effort-reward imbalance seems to be the main reason for attrition in this organization. It mostly affects people who work overtime and who, in many cases, have a relatively low salary, have not been promoted, or both. The company should check whether it has an effective overtime policy.

b)  Work-life balance is also an issue for the workforce. Low work-life balance is associated with overtime, a longer commute from home, or both. The availability of flexible-work and flexible-timing policies for the workforce should be checked.

c)  Introduce high-potential programs and raise earning potential at the lower job levels through variable pay linked to performance. The highest attrition occurs in the early years after joining and is attributable about equally to low job level and low earning potential. A high-potential program gives employees a way to fast-track their progression, and performance-linked variable pay at the lower job levels ties recognition and tangible rewards to performance. Together they can also improve psychological safety across the organization and give people a stimulus to go beyond what is expected.

d)  Fix the promotion policy: the data suggest that the chances of promotion diminish as tenure increases; in other words, it takes tenured employees longer to reach a position than lateral hires.

e)  Certain departments and positions carry much higher attrition risk, notably the Sales function and the Sales Representative role. This needs a deep dive that brings together performance ratings, satisfaction indexes, exit interviews, the talent-sourcing strategy, and the job description and job evaluation framework documents.

Exploratory Data Analyses:

        a)      Income Levels across Attrited and Non – Attrited workforce:

From the above plots, a large majority of those who left had a relatively low monthly income and daily rate; the differences fade, however, when comparing the monthly rate. A possible reason for this anomaly is that the data are simulated, so the income levels cannot pinpoint what triggered attrition; they give only a rough estimate when mapping the compensation factors.

From the Monthly Income plot, the spread suggests that the organization balances its compensation strategy with respect to job level and loyalty, as the scatter is more evenly spaced beyond the 10-year mark. In fact, the region below 10 years of tenure and below 10,000 in income is densely populated, and most of the attrition falls there. The Salary Hike plot supports the view that the firm awards hikes evenly regardless of loyalty; the Performance Ratings plot shows the same pattern.
The Promotion graph opens up an interesting perspective: many of the points are spread toward the left-hand side, indicating a positive correlation between years at the company and years since last promotion (the longer you are in the company, the less chance you have of being promoted, so to speak). This may mean that people are not really growing within the company. Let’s see this more closely:

Here we note two things. First, a relatively higher percentage of people in the group who left were working overtime. Second, while things seem to be going in the right direction for the people who stay with the firm (a higher correlation between years since last promotion and years at company for those who don't work overtime), the opposite is happening in the group who left. There may be a pattern of people leaving because they are not promoted although they work hard. The graph below supports the inference that people working overtime are at higher risk of attrition, with promotion history playing a part as well.
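
The correlation comparison described here can be checked per group. The sketch below computes a Pearson correlation between years at company and years since last promotion for each attrition and overtime combination. The column names (Attrition, OverTime, YearsAtCompany, YearsSinceLastPromotion) are assumptions, and the rows are toy values chosen only to show the mechanics, not results from the dataset.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / sqrt(var_x * var_y)


def correlation_by_group(rows, group_cols, x_col, y_col):
    """Pearson correlation of x_col and y_col within each group."""
    pairs = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_cols)
        pairs[key].append((float(row[x_col]), float(row[y_col])))
    return {
        key: pearson([p[0] for p in vals], [p[1] for p in vals])
        for key, vals in pairs.items()
    }


# Toy rows for illustration only; the column names are assumed.
rows = [
    {"Attrition": "No", "OverTime": "No", "YearsAtCompany": 2, "YearsSinceLastPromotion": 0},
    {"Attrition": "No", "OverTime": "No", "YearsAtCompany": 4, "YearsSinceLastPromotion": 1},
    {"Attrition": "No", "OverTime": "No", "YearsAtCompany": 6, "YearsSinceLastPromotion": 2},
    {"Attrition": "Yes", "OverTime": "Yes", "YearsAtCompany": 1, "YearsSinceLastPromotion": 0},
    {"Attrition": "Yes", "OverTime": "Yes", "YearsAtCompany": 2, "YearsSinceLastPromotion": 2},
    {"Attrition": "Yes", "OverTime": "Yes", "YearsAtCompany": 3, "YearsSinceLastPromotion": 1},
]

corr = correlation_by_group(
    rows, ["Attrition", "OverTime"], "YearsAtCompany", "YearsSinceLastPromotion"
)
for key, r in sorted(corr.items()):
    print(key, round(r, 3))
```

Groups with very few rows give unstable coefficients, so on real data it would be worth printing the group sizes next to each value.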

Let’s have a look (below) at Job Satisfaction and Performance Rating by Attrition, faceted into columns by Overtime. From the Work Life Balance plot, the situation is more or less similar for most people, except for those with a higher performance rating who do not work overtime. On Job Satisfaction, those who left tend to have lower job satisfaction than their respective peers, whether grouped by overtime or by performance rating. This suggests that job satisfaction carries more weight than work-life balance in the workforce’s decision making.
The influence of overtime is evident in the graphs below (the ‘Yes’ columns). For those not working overtime, even though performance is rated on the higher side, work-life balance seems to be the key driver of lower job satisfaction, resulting in attrition.

[Figure: two panels, x-axis: Overtime]
Let’s also see the impact of Distance from Home and Overtime on Work Life Balance and Attrition:
From the above, those who rated their work-life balance relatively low were commuting from somewhat farther away than those who rated it as very good; the same holds for overtime. This difference is more visible in the group of those who left, suggesting that the interplay of overtime and distance from home is a very likely influence on attrition.

From the above plots, we can conclude the following:
    1)  Males are at higher attrition risk than females (17 % vs. 14.7 %).
    2)  With respect to job levels, most attrition happens at levels 1 and 2; incidentally, the organization’s pay scales are directly proportional to job level.
    3)  Attrition is seen at all levels of job satisfaction.
    4)  Singles are the riskiest group, while married employees seem to be the most stable.
    5)  The highest risk of attrition is in the first 3 years, followed by lower levels at around 6 years and then at around 10. This is the period in which a person becomes accustomed to the organization’s way of working.
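
The percentages in the list above are attrition rates within each category, that is, leavers divided by headcount for that category. A minimal sketch of that calculation follows, assuming rows are dicts with an Attrition column holding "Yes" or "No" and a MaritalStatus column; the column names and the toy rows are assumptions for illustration, not dataset values.

```python
from collections import Counter


def attrition_rate_by(rows, group_col, attrition_col="Attrition", left_value="Yes"):
    """Share of rows marked as having left, within each value of group_col."""
    totals = Counter()
    leavers = Counter()
    for row in rows:
        totals[row[group_col]] += 1
        if row[attrition_col] == left_value:
            leavers[row[group_col]] += 1
    return {group: leavers[group] / totals[group] for group in totals}


# Toy rows for illustration only; the column names are assumed.
rows = [
    {"MaritalStatus": "Single", "Attrition": "Yes"},
    {"MaritalStatus": "Single", "Attrition": "Yes"},
    {"MaritalStatus": "Single", "Attrition": "No"},
    {"MaritalStatus": "Single", "Attrition": "No"},
    {"MaritalStatus": "Married", "Attrition": "Yes"},
    {"MaritalStatus": "Married", "Attrition": "No"},
    {"MaritalStatus": "Married", "Attrition": "No"},
    {"MaritalStatus": "Married", "Attrition": "No"},
]

rates = attrition_rate_by(rows, "MaritalStatus")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.1%}")
```

The same function could be pointed at gender, job level, or a tenure bucket derived from years at company to reproduce the other comparisons.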

Summarizing the inferences and observations from above, the top factors leading to attrition are:
      a) Overtime
      b) Job Level
      c)  Monthly Income
      d) Work Life Balance
      e)  Distance from Home

We should prioritize these areas and organize an HR audit for each, taking a deep dive into what needs improvement as opposed to what is not broken or misaligned.