6 months
As a Data Science Intern in KPMG’s Forensic Team, I worked on optimizing and deploying machine learning pipelines used in forensic data analysis.
I redesigned the feature engineering pipeline using vectorized operations in Pandas and optimized the extraction logic, reducing average processing time from 5 minutes to 5 seconds, a reduction of roughly 98%.
I also developed and trained a Generative Adversarial Network (GAN) to generate synthetic data that augmented the training set, which led to a measured 10% increase in downstream model accuracy.
To ensure scalability and reproducibility, I containerized ML models and applications using Docker and deployed them on Microsoft Azure, managing the infrastructure as code (IaC) with Terraform.