A Comprehensive Study of Capgemini Employee Reviews

Capgemini Project

This project aims to analyze Capgemini employee reviews to uncover trends and patterns in employee satisfaction. The objectives include identifying primary factors influencing satisfaction, developing an NLP model to extract keywords from reviews, and creating a recommendation system for tailored suggestions. Additionally, a multiclass sentiment classification model will be developed to categorize employee ratings as positive, negative, or neutral. By leveraging advanced data analysis and machine learning techniques, this project seeks to provide insights into employee sentiments and improve organizational strategies for enhancing employee satisfaction and engagement.

The Steps I Took

  • Data Collection and Cleaning
  • Exploratory Data Analysis (EDA)
  • Natural Language Processing (NLP) Model Development
  • Recommendation System Implementation
  • Multiclass Sentiment Classification Model

Why I Took These Steps

  • Ensured reliable data for analysis
  • Identified key factors influencing satisfaction
  • Extracted meaningful insights from feedback
  • Enhanced employee experience and engagement
  • Improved sentiment analysis accuracy

Tools I Used

  • Excel
  • Python
  • Power BI

Challenges

  • Missing Values: Certain columns contain missing values that cannot be reliably imputed, which may reduce the model's accuracy and reliability.
  • NLP Model Limitations: NLP models may struggle with understanding context in employee reviews, affecting the accuracy of keyword extraction, sentiment analysis, and recommendations.
  • Imbalanced Data: Data imbalance can negatively affect the performance of the sentiment analysis model. The usual remedies carry their own costs: oversampling can overfit to duplicated minority examples, and undersampling discards information.
  • Generalization Issues: The multiclass sentiment classification model may struggle to generalize if the training data lacks diverse sentiments across various organizational contexts, which can impact its accuracy on new or unseen data.
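
The missing-value and imbalance challenges can be made concrete with a small check. The sketch below is an illustration only: the rows and the field names (`likes`, `salary`, `sentiment`) are assumptions, not the project's actual schema. It reports the share of missing values per column and computes inverse-frequency class weights, one alternative to resampling that neither duplicates nor discards reviews.

```python
from collections import Counter

# Hypothetical sample rows; the field names are assumptions, not the project's schema.
reviews = [
    {"likes": "good work-life balance", "salary": 3, "sentiment": "positive"},
    {"likes": None, "salary": 4, "sentiment": "positive"},
    {"likes": "supportive team", "salary": None, "sentiment": "neutral"},
    {"likes": "learning opportunities", "salary": 2, "sentiment": "positive"},
    {"likes": None, "salary": 1, "sentiment": "negative"},
]


def missing_share(rows, columns):
    """Return the fraction of rows in which each column is None."""
    return {c: sum(r.get(c) is None for r in rows) / len(rows) for c in columns}


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


missing = missing_share(reviews, ["likes", "salary", "sentiment"])
weights = class_weights([r["sentiment"] for r in reviews])
print(missing)
print(weights)
```

Rare classes receive larger weights, so a classifier trained with them is penalized more for misclassifying minority sentiments.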

Here are some of my wonderful visuals from this project:

Count Plot of ratings for different features

Donut Plot of Employees in department

Correlation Heatmap

Word Cloud for Likes and Dislikes

Results

1. Employee Satisfaction Factors

Work-life balance, skill development, salary, benefits, job security, and career growth significantly contribute to employee satisfaction. Positive perceptions in these areas generally lead to higher job satisfaction.
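
One way to check which factors move with overall satisfaction is the Pearson correlation between each factor's rating and the overall rating, which is the quantity a correlation heatmap visualizes. The following is a minimal sketch on made-up ratings; the column names and values are assumptions for illustration, not data from the project.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 1-5 ratings; not taken from the Capgemini dataset.
ratings = {
    "work_life_balance": [5, 4, 2, 3, 1, 4],
    "salary": [3, 3, 2, 4, 3, 3],
    "overall": [5, 4, 2, 3, 1, 4],
}

# Correlate every factor with the overall rating.
correlations = {
    factor: pearson(values, ratings["overall"])
    for factor, values in ratings.items()
    if factor != "overall"
}

# Print factors from most to least correlated.
for factor, r in sorted(correlations.items(), key=lambda kv: -kv[1]):
    print(f"{factor}: {r:.2f}")
```

A high correlation shows that a factor tends to move with overall satisfaction; it does not by itself show that the factor causes it.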

2. Areas for Improvement

Employees express satisfaction with work-life balance, job security, and career growth. However, neutral ratings for salary and benefits suggest these aspects may need enhancement or closer attention from the company.

3. Keyword Extraction

Natural Language Processing (NLP) is used to extract key insights from Capgemini employee reviews. Techniques like TF-IDF identify and rank the words and phrases most characteristic of each review, which clarifies the main themes and sentiments expressed and refines the analysis of employee feedback.
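
As a rough illustration of how TF-IDF ranks keywords, the sketch below scores each term by its frequency within a review multiplied by the log of its inverse document frequency. The review snippets, the stopword list, and the helper names are hypothetical, and the project's actual pipeline may differ. Library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed variant of this formula by default.

```python
import math
import re
from collections import Counter

# Hypothetical review snippets, used only to illustrate the technique.
reviews = [
    "Good work life balance and supportive management",
    "Salary hike is low but job security is good",
    "Great learning opportunities and good work culture",
]

STOPWORDS = {"and", "is", "but", "the", "a"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf(docs):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(doc_scores, k=3):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


scores = tfidf(reviews)
for review, doc_scores in zip(reviews, scores):
    print(top_keywords(doc_scores), "<-", review)
```

A word such as "good" that appears in every review gets a score of zero, which is why TF-IDF surfaces distinctive terms rather than merely frequent ones.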

Keyword Extraction

4. Recommendation System

The recommendation system for Capgemini employee reviews provides personalized suggestions based on specific departments and locations. By analyzing review patterns, the system offers tailored insights to enhance employee satisfaction and address departmental needs, ensuring more relevant and actionable recommendations for improvement.
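
The write-up does not describe how the recommendations are produced, so the following is only one plausible sketch. It assumes that reviews are grouped by department and location, that each aspect's ratings are averaged within a group, and that the lowest-rated aspects are suggested as priorities. All rows, field names, and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows; departments, locations, and ratings are invented for illustration.
reviews = [
    {"department": "IT", "location": "Pune", "salary": 2, "work_life_balance": 4, "career_growth": 3},
    {"department": "IT", "location": "Pune", "salary": 3, "work_life_balance": 5, "career_growth": 3},
    {"department": "HR", "location": "Mumbai", "salary": 4, "work_life_balance": 2, "career_growth": 4},
    {"department": "HR", "location": "Mumbai", "salary": 4, "work_life_balance": 3, "career_growth": 5},
]
ASPECTS = ["salary", "work_life_balance", "career_growth"]


def average_ratings(rows):
    """Average each aspect's rating per (department, location) group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["department"], row["location"])].append(row)
    return {
        key: {a: sum(r[a] for r in members) / len(members) for a in ASPECTS}
        for key, members in groups.items()
    }


def recommend(rows, department, location, k=1):
    """Return the k lowest-rated aspects for a group, or [] if the group is unknown."""
    averages = average_ratings(rows).get((department, location))
    if averages is None:
        return []
    return sorted(averages, key=lambda a: (averages[a], a))[:k]


it_pune = recommend(reviews, "IT", "Pune")
hr_mumbai = recommend(reviews, "HR", "Mumbai")
print(it_pune)
print(hr_mumbai)
```

Under these assumptions, each department and location pair gets its own priority list, which matches the goal of recommendations tailored to a specific group.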

Recommendation System

5. LSTM Model Performance

The LSTM model achieved an accuracy of 79.05% in classifying Capgemini employee reviews as positive, negative, or neutral. This suggests the model captures much of the sentiment signal in the reviews, although accuracy alone does not show how well each class is handled when the data is imbalanced.
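
To look beyond a single accuracy figure, per-class precision and recall can be computed from the model's predictions. The sketch below uses hypothetical labels rather than the project's results, and shows how a model that leans towards the majority class can still reach a respectable accuracy.

```python
from collections import Counter

LABELS = ["negative", "neutral", "positive"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_report(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall)}; a metric is 0.0 when its denominator is 0."""
    pairs = Counter(zip(y_true, y_pred))
    report = {}
    for label in labels:
        true_pos = pairs[(label, label)]
        predicted = sum(c for (_, p), c in pairs.items() if p == label)
        actual = sum(c for (t, _), c in pairs.items() if t == label)
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        report[label] = (precision, recall)
    return report


# Hypothetical labels: a model that leans towards the majority "positive" class.
y_true = ["positive"] * 6 + ["neutral"] * 2 + ["negative"] * 2
y_pred = ["positive"] * 6 + ["positive", "neutral"] + ["positive", "negative"]

acc = accuracy(y_true, y_pred)
report = per_class_report(y_true, y_pred)
print(f"accuracy: {acc:.2f}")
for label, (precision, recall) in report.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the overall accuracy is 0.80, yet only half of the neutral and negative reviews are recovered, which is the kind of gap a per-class report makes visible.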

LSTM Model Performance