Name: Vatsal Vinay Parikh

Education: MS in Data Science

GPA: 3.90

Address: Bloomington, IN, 47401

Skills

Python 95%
SQL 85%
Machine Learning 90%
Data Visualization 85%
Statistical Analysis 80%

About Me

A passionate and innovative data science and analytics practitioner. Proficient in data analysis, visualization, machine learning, database management, and Python programming, with a track record of leading 30+ projects. Brings strong interpersonal communication and problem-solving skills and a commitment to delivering actionable insights. Focused on answering complex day-to-day problems by applying theoretical knowledge through data-driven methods.

  • Programming: Python, SQL, R
  • Libraries: Pandas, NumPy, Matplotlib, OpenCV, scikit-learn, Streamlit, NLTK, TensorFlow, Keras, PyTorch, ggplot2, dplyr
  • Databases: MySQL, PostgreSQL, MongoDB, Neo4j, Cassandra, Snowflake
  • Other Tools: AWS, Google Cloud, Linux, Git & GitHub, PowerBI, Tableau, Kubernetes, Docker, Apache Hadoop

Resume

Data science professional with demonstrated success in using data analytics to inform strategic business decisions. Proficient in data analysis, statistical modeling, machine learning algorithms, and project management, with a focus on turning comprehensive analysis into actionable insights.

Experience


June 2025 - Present

AI & Data Analytics Intern

Rearc, New York City, United States

  • Engineered an end-to-end pipeline that parses 100+ MDX posts and generates podcast-grade MP3s (OpenAI TTS + ElevenLabs), GPT-4o summaries, and vector embeddings, persisting assets to S3 and a FAISS index, all scripted in Python, TypeScript, and AWS CDK.
  • Integrated the output into the Next.js and Tailwind site: each article now renders an auto-generated TL;DR, share-ready social copy for each platform, and top-K related-article links delivered through a lightweight API, improving content discoverability and cutting manual marketing steps.

June 2025 - Present

Senior Consultant

Heartland Community Network (Remote)

  • Worked directly with clients throughout Indiana to identify challenges and drive business improvements, and supported the implementation of practical, scalable solutions to improve performance and efficiency.
  • Conducted in-depth research, analyzed business operations, assessed areas for digital transformation and strategic development, and used data-driven insights to craft and recommend tailored solutions.
  • Engaged with 10+ business owners and teams across various industries to deliver meaningful and measurable impact.

Jan 2025 - May 2025

Graduate Teaching Assistant

Luddy School of Informatics, Engineering, and Computing, IU Bloomington

  • Associate Instructor for two graduate-level courses: INFO-I 590: Topics in Informatics - Data Visualization under Prof. Goren Gordon, and INFO-I 513: Usable Artificial Intelligence under Prof. Filipi Silva.
  • Assisted 100+ graduate students with advanced data visualization, artificial intelligence, and machine learning techniques using Python and its libraries, conducted weekly help sessions, and provided guidance on course projects, assignments, exams, and quizzes.

Aug 2024 - Dec 2024

Data Scientist

Indiana Energy Independence Fund, Indianapolis, IN, United States

  • Led geospatial analysis for the Indiana Energy Independence Fund using Python and GeoPandas to identify 745 solar sites from Indiana GIS Data Harvest and cluster 197 top locations based on economic impact and solar potential.
  • Developed an interactive mapping platform with Folium and C#, utilizing the Google Solar API to provide detailed solar insights, and projected energy savings of up to 25% per site, enabling optimized site selection for 12 key regions.

Apr 2024 - Dec 2024

Research Assistant

Kelley School of Business, IU Bloomington

  • Worked under the guidance of Prof. Lucy Yan, Operations and Decision Technologies, Kelley School of Business, IU Bloomington.
  • Performed temporal network and topic modeling analysis on over 145,000 posts and 623,000 comments from the r/depression subreddit to explore community structures and engagement trends.
  • Applied Latent Dirichlet Allocation (LDA) to identify 5 key themes in mental health discussions over a three-month period, revealing that 30% of user interactions focused on existential despair, social isolation, and emotional turmoil.
  • Created temporal networks to map community structures and developed a content-content network using the Louvain method and the Fruchterman-Reingold force-directed algorithm.

Jan 2023 - May 2023

Data Analyst Intern

IBM Corporation, Ahmedabad, India

  • Developed a "Heart Disease Prediction System" using machine learning, analyzing a dataset of 30,710 patient records with 14 medical attributes and achieving 93.44% accuracy with Support Vector Machines (SVM).
  • Conducted comprehensive exploratory data analysis (EDA) on the web-scraped dataset, employing statistical techniques for data cleaning and preprocessing, improving data quality by 98% and identifying 5 key predictive features.
  • Deployed an interactive web application using Streamlit and Python to predict heart disease risk based on user-input medical parameters, reducing diagnosis time by 60%. Presented findings at the internship pitch night, securing first place among 50+ participants and receiving the winner's certificate for innovative application of data science in healthcare.



Education


Aug 2023 - Present

Master of Science in Data Science

Indiana University, Bloomington

GPA: 3.90 / 4.00

Coursework:

  • CSCI-B 565: Data Mining
  • CSCI-P 556: Applied Machine Learning
  • STAT-S 520: Introduction to Statistics
  • ILS-Z 639: Social Media Mining
  • DSCI-D 532: Applied Database Technologies
  • INFO-I 513: Usable Artificial Intelligence
  • INFO-I 590: Data Visualization
  • DSCI-D 592: Data Science in Practice
  • CSCI-B 651: Natural Language Processing
  • CSCI-Y 799: Computer Science Colloquium

Jul 2019 - Jun 2023

Bachelor of Engineering in Information Technology

Gujarat Technological University, India

GPA: 9.31 / 10.00

Coursework:

  • 3130702: Data Structures
  • 3130703: Database Management Systems
  • 3140708: Discrete Mathematics
  • 3150703: Analysis and Design of Algorithms
  • 3161605: Software Engineering
  • 3161607: Big Data Analytics
  • 3161610: Data Warehousing and Mining
  • 3161613: Data Analysis and Visualization
  • 3171614: Computer Vision

Projects

Take a deep dive into my projects and case studies in Python, SQL and ML.

Community Solar Adoption in Indiana

Conducted geospatial analysis and developed an interactive mapping platform to identify and prioritize 745 community solar sites in Indiana, optimizing energy savings and environmental impact for tax-exempt organizations.

PoisonGPT: Editing Knowledge of LLMs

Conducted research on the vulnerabilities of large language models to knowledge editing attacks, using ROME to inject misinformation into GPT-2 and GPT-J-6B, and developed an interactive chatbot to demonstrate the impact on accuracy.

Predictive Modelling for Personalized Diabetes Care

Led data mining and developed machine learning models to predict diabetes risk with 75.47% accuracy, resulting in a real-time risk assessment web app.

Waste Management & Garbage Classification Model

Developed an AI-powered waste management system with real-time video processing capabilities for efficient waste classification and disposal guidance.

Heart Disease Prediction System

Developed a machine learning-based heart disease prediction system achieving 93.44% accuracy, leveraging SVM and Python libraries.

StayEase: Simplified Hotel Booking Solution

Created a user-friendly web application for hotel booking, integrating data from Goibibo to facilitate seamless search and booking, with a management dashboard for property owners.

100 Days of Data Science Challenge

Join me in embarking on a 100-day journey of data science exploration and learning!

Python

Explore Python programming language for data science.

SQL

Learn SQL for data manipulation and querying.

R

Discover R for statistical analysis and data visualization.

PowerBI

Master PowerBI for data visualization and business intelligence.

Tableau

Explore Tableau for interactive data visualization.

Case Studies

Analyze real-world case studies to apply data science techniques.

Calendar Progress

Each day in the calendar, which covers Feb 1, 2025 through May 11, 2025, represents a milestone in my personal 100 Days of Data Science Challenge, with the corresponding challenge available on my GitHub repository.

More projects on GitHub

I love to solve business problems & uncover hidden data stories


GitHub

Contact Me

Below are the details to reach out to me!

Address

Bloomington, IN, USA
