Correlation of work hours and pay

Ray Hu
2 min read · Feb 29, 2020

Find knowledge behind data

It may seem counterintuitive that working more can lead to earning less, but sometimes data reveals intriguing patterns that challenge our assumptions. I delved into the United States Census Income Dataset and used TensorFlow to train a model that predicts whether a person earns more than $50K. One result piqued my curiosity: the model's predictions for a 25-year-old Black individual in the workforce.

When this individual works a standard 40-hour workweek, the model puts their likelihood of earning more than $50K at 5.64%. Doubling the workload to 80 hours per week sees this likelihood drop to 2.21%, and when pushed to an astonishing 120 hours per week, the probability plummets to a mere 0.29%. According to the model, it appears that more work can indeed yield less pay!

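However, let’s clarify that correlation is not causation. What the model outputs is an estimate of a conditional probability: how likely a high income is among people in the data who share this profile. It does not say that adding hours would cause this person's pay to fall. It’s easy to confuse causation with statistical inference when conducting data analysis.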

Another intriguing facet of this model is its lack of worldly knowledge. For instance, how can a person possibly work 120 hours per week? That is roughly 17 hours a day, seven days a week. The model, devoid of ethical or practical constraints, accommodates such hypothetical scenarios without hesitation.
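The model will not object to an unrealistic input, so the caller has to. Below is a minimal sketch of such a sanity check, written in Python. The function name `check_hours` and the 99-hour plausibility threshold are assumptions made for illustration; in practice the threshold would come from the range of hours actually seen in the training data.

```python
HOURS_IN_WEEK = 7 * 24  # 168 hours, a hard physical limit

# Assumption: an illustrative cut-off for "plausible" weekly hours.
# Replace it with the maximum observed in the training data.
PLAUSIBLE_MAX_HOURS = 99


def check_hours(record, plausible_max=PLAUSIBLE_MAX_HOURS):
    """Classify a record's hours_per_week as 'ok', 'implausible' or 'impossible'."""
    hours = record["hours_per_week"]
    if hours < 0 or hours > HOURS_IN_WEEK:
        return "impossible"   # cannot happen in a 168-hour week
    if hours > plausible_max:
        return "implausible"  # possible on paper, but outside the assumed range
    return "ok"


if __name__ == "__main__":
    for hours in (40, 80, 120, 200):
        print(hours, check_hours({"hours_per_week": hours}))
```

Running a check like this before calling the model flags the 120-hour scenario as one the prediction should not be trusted for.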

Herein lies the essence of a data scientist’s role: deciphering the ‘why’ behind data patterns and remaining vigilant for seemingly illogical situations. After all, data scientists are the gatekeepers of rational analysis, and their unique human perspective ensures that AI will not easily replace them.

In the snippets of raw data below, you can witness the model’s predictions based on different work hours:

{ "age": 25, "workclass": " Private", "education": " 11th", "education_num": 7, "marital_status": " Never-married", "occupation": " Machine-op-inspct", "relationship": " Own-child", "race": " Black", "gender": " Male", "capital_gain": 0, "capital_loss": 0, "hours_per_week": 40, "native_country": " United-States"}
- Likelihood of earning more than $50K: 5.64%

{ "age": 25, "workclass": " Private", "education": " 11th", "education_num": 7, "marital_status": " Never-married", "occupation": " Machine-op-inspct", "relationship": " Own-child", "race": " Black", "gender": " Male", "capital_gain": 0, "capital_loss": 0, "hours_per_week": 80, "native_country": " United-States"}
- Likelihood of earning more than $50K: 2.21%

{ "age": 25, "workclass": " Private", "education": " 11th", "education_num": 7, "marital_status": " Never-married", "occupation": " Machine-op-inspct", "relationship": " Own-child", "race": " Black", "gender": " Male", "capital_gain": 0, "capital_loss": 0, "hours_per_week": 120, "native_country": " United-States"}
- Likelihood of earning more than $50K: 0.29%
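The three records above differ only in `hours_per_week`, which makes this a simple "what-if" experiment. The Python sketch below shows one way to set it up. The trained TensorFlow model is not reproduced here: `reported_predict` is a hypothetical stand-in that just returns the three probabilities quoted in this post, and `with_hours` and `what_if_hours` are illustrative helper names, not part of any library.

```python
# Base record copied from the 40-hour example above. The leading spaces in
# the string values are kept exactly as they appear in the raw data.
BASE_RECORD = {
    "age": 25, "workclass": " Private", "education": " 11th",
    "education_num": 7, "marital_status": " Never-married",
    "occupation": " Machine-op-inspct", "relationship": " Own-child",
    "race": " Black", "gender": " Male", "capital_gain": 0,
    "capital_loss": 0, "hours_per_week": 40,
    "native_country": " United-States",
}

# Probabilities reported in this post, used in place of the real model.
REPORTED = {40: 0.0564, 80: 0.0221, 120: 0.0029}


def with_hours(record, hours):
    """Return a copy of the record with only hours_per_week changed."""
    variant = dict(record)
    variant["hours_per_week"] = hours
    return variant


def reported_predict(record):
    """Hypothetical stand-in for the model: look up the reported probability."""
    return REPORTED[record["hours_per_week"]]


def what_if_hours(record, hours_list, predict_fn):
    """Score one variant per value in hours_list; everything else stays fixed."""
    return {h: predict_fn(with_hours(record, h)) for h in hours_list}


if __name__ == "__main__":
    results = what_if_hours(BASE_RECORD, [40, 80, 120], reported_predict)
    for hours, p in results.items():
        print(f"{hours:>3} h/week -> P(income > $50K) = {p:.2%}")
```

Swapping `reported_predict` for a function that calls a real model would turn the same loop into an actual experiment. Holding every other field fixed is what makes the comparison readable, although it still describes the model's behavior, not a causal effect.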

In this world of data, unraveling the mysteries behind the numbers and contextualizing the insights is where the true power of data science lies.
