KPMG

Data Science Intern

May 2025 - November 2025

6 months

About the Role

As a Data Science Intern in KPMG’s Forensic Team, I worked on optimizing and deploying machine learning pipelines used in forensic data analysis.

I redesigned the feature engineering pipeline using vectorized operations in Pandas and optimized the extraction logic, reducing average processing time from 5 minutes to 5 seconds, a reduction of roughly 98% (about 60x faster).
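To illustrate the kind of change involved, here is a minimal sketch contrasting a row-by-row `apply` with the equivalent vectorized column operations. It is not the actual pipeline code: the `amount` column, the two features, and the sample data are hypothetical and chosen only to show the pattern.

```python
import numpy as np
import pandas as pd


def features_rowwise(df: pd.DataFrame) -> pd.DataFrame:
    """Slow baseline: compute each feature one row at a time."""
    out = df.copy()
    out["amount_log"] = out.apply(lambda r: np.log1p(r["amount"]), axis=1)
    out["is_round"] = out.apply(lambda r: r["amount"] % 100 == 0, axis=1)
    return out


def features_vectorized(df: pd.DataFrame) -> pd.DataFrame:
    """Same features, computed on whole columns at once."""
    out = df.copy()
    out["amount_log"] = np.log1p(out["amount"])
    out["is_round"] = out["amount"] % 100 == 0
    return out


# Hypothetical sample data, for illustration only.
transactions = pd.DataFrame({"amount": [100.0, 250.5, 3000.0, 47.25]})

rowwise = features_rowwise(transactions)
vectorized = features_vectorized(transactions)

# Both versions should agree; only the execution strategy differs.
pd.testing.assert_frame_equal(rowwise, vectorized)
```

The vectorized version pushes the loop into NumPy's compiled code instead of calling a Python function once per row, which is typically where speedups of this kind come from on large frames.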

I also developed and trained a Generative Adversarial Network (GAN) to generate synthetic data that augmented the training set. This led to a measurable 10% increase in downstream model accuracy.

To ensure scalability and reproducibility, I containerized ML models and applications using Docker and deployed them on Microsoft Azure. Infrastructure was managed through Infrastructure-as-Code (IaC) using Terraform.

Technical Stack

Frontend

JavaScript

HTML

CSS

React

Backend

Python

MySQL

Docker

Azure

Machine Learning

Pandas

Scikit-learn