Work
NC State University
Graduate Teaching Assistant
Highlights
Graded assignments, evaluated projects, mentored students, and led tutorial sessions for 7 graduate and undergraduate courses, including Object-Oriented Design and Development, Operating Systems, and Data Structures. Guided over 400 students across 3 institutions.
NetApp
Artificial Intelligence & Software Engineering Intern
Highlights
Developed and deployed an MLOps pipeline for revenue forecasting using time-series modeling with Python, Pandas, and DataRobot. The pipeline integrated product capacity and usage data, CI/CD workflows, and quarterly backtesting (90/60/30-day rolling updates), reducing forecasting error to 1% (MAPE) for a $30M business unit. Presented insights and Tableau visualizations to business stakeholders.
Built and evaluated customer churn prediction models for a SaaS product using XGBoost, Dynamic Time Warping, Tableau, and survival analysis. Estimated potential loss and retention ROI, and uncovered data limitations that informed strategic pivots.
CANDLE Research Lab, IIT Roorkee
Machine Learning Research Intern
Highlights
Implemented and enhanced models for image dehazing, PCB component inspection, dental X-ray segmentation, and arrhythmia classification using novel neural networks and deep learning techniques, leading to scholarly publications.
Mahindra & Mahindra Financial Services
Data Science Intern
Highlights
Designed, built, and deployed a voting ensemble classification model using Python, SQL, and Azure ML to predict cases likely to default when extending loan offers for the Scorpio Z101, a newly launched SUV, improving recall by 30% to reach 72%. The model used features extracted from large-scale credit bureau data and the internal bank database.
Developed and executed ETL pipelines in Python and SQL to process retro scrub data from credit bureau sources such as CRIF, CIBIL, and Experian, reducing data processing time by 20 hours.
Capgemini
Data Science Intern
Highlights
Developed an interactive stock analysis dashboard using Python, Dash/Plotly, CSS, and the Yahoo Finance API, integrating real-time data, user interaction logic, and dynamic UI elements for technical indicators, risk modeling, and future projections (GBM, GARCH).
Education
North Carolina State University
Master of Computer Science
Computer Science
Grade: 4/4
Indian Institute of Information Technology, Raichur
Bachelor of Technology
Computer Science & Engineering
Grade: 8.8/10
Courses
Data Structures & Algorithms
Software Engineering
Database Management Systems
Object Oriented Design & Analysis
Operating Systems
Parallel Systems
High Performance Computing
Natural Language Processing
Neural Networks
Publications
Skills
Python
C++
SQL
NoSQL
MySQL
MariaDB
MongoDB
Windows
Linux (Ubuntu)
HTML/CSS
Flask
FastAPI
REST APIs
Dash/Plotly
PyTorch
TensorFlow
MLOps
LangChain
Microsoft Excel
Pandas
Tableau
Git
GitHub Actions
Docker
Azure
AWS
Spark
CUDA
Projects
Wikipedia-Based Language Model
Summary
Fine-tuned DistilGPT-2, a lightweight generative language model, on a curated subset of Wikipedia articles for text generation, leveraging tokenization and causal language modeling to generate coherent responses while optimizing for minimal computational resources.
Multimodal Knowledge Base
Summary
Engineered a Dockerized RAG system using FastAPI and React for document ingestion and querying, with LangChain for semantic and agentic chunking (Gemini), ChromaDB for vector-based retrieval, and prompt engineering. Integrated PDF parsing (LlamaParse, pymupdf4llm) and table/image extraction to handle diverse formats.
Wolf Parking Database System
Summary
Built a parking management system using Java, JDBC, and MariaDB with multi-role support for permits, vehicles, zones, and citations through dedicated modules for permit issuance, citation processing, payments, appeals, and automated reporting. Designed relational schemas and implemented transactional operations to ensure ACID compliance and real-time system updates.
Parallel PageRank for Large-Scale Web Graphs
Summary
Implemented and profiled parallel PageRank variants using CUDA, OpenMP, and hybrid CUDA-MPI in C++. Benchmarked performance on graphs of up to 16K nodes, achieving a 350x speedup over the sequential baseline.