Data Scientist & Python Developer

Analytics | Machine Learning | Continuous Learner

📍 Meerut, Uttar Pradesh

3 Internships | 4 Projects | 15 Technical Skills | 3 Certifications

About Me

I'm a B.Tech student in Artificial Intelligence and Machine Learning at Meerut Institute of Technology with a strong foundation in data science, ML, and cloud technologies. I have practical experience in building AI-powered applications, data analytics, and automation solutions through internships at leading organizations.

My experience spans the full workflow, from data cleaning and exploratory analysis to building production-grade ML models and deploying them with modern web frameworks.

Why Choose Me

Adaptive Learning & Innovation

Building intelligent systems using NLP, semantic search, and predictive analytics while evolving technical skills and exploring emerging technologies

Data-Driven Solutions

Analyzing datasets of 2,000+ records, creating Power BI dashboards, and turning raw data into actionable business insights

Cloud & Deployment

Deploying ML models on AWS, GCP, and Azure with REST APIs, Flask, and Streamlit for real-world applications

Full-Stack Development

End-to-end project execution: data cleaning → model training → API development → interactive dashboards

Education

B.Tech - Artificial Intelligence and Machine Learning

Meerut Institute of Technology, Meerut

Duration: 2022 – 2026 | CGPA: 7.8

Internship Experience

AI Intern - Infosys Springboards

📅 Nov 2025 - Present

Role: AI Semantic Search Tool Development

Pipeline: Raw Data → Cleaning → NLP Processing → Text Embeddings → Context-Aware Search Results (Query Understanding → Semantic Similarity → Ranked Results), served through a user-friendly Streamlit dashboard
  • Built an AI-based Semantic Search Tool using Natural Language Processing (NLP) to retrieve context-aware search results
  • Cleaned and transformed raw datasets, enhancing data quality for model training and analysis
  • Implemented semantic similarity search using text embeddings, enabling accurate query-to-document matching
  • Developed a Streamlit-based web application to deploy the AI model, providing an interactive and user-friendly interface
  • Tools: Python (Pandas, scikit-learn), SQL, NLP, Git, GitHub, Streamlit
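
The query-to-document matching in the bullets above can be sketched as a cosine-similarity ranking over embedding vectors. This is only an illustration under stated assumptions, not the tool's actual code: the three-dimensional vectors are made-up values, the names `cosine_similarity` and `rank_documents` are hypothetical, and a real system would obtain its embeddings from a trained language model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scores = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed values, for illustration only).
doc_vecs = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_times": [0.1, 0.9, 0.2],
    "careers_page": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # stands in for an embedded user query

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Cosine similarity compares direction rather than magnitude, so a long document and a short query can still match closely when they point the same way in embedding space.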

Data Analyst Intern - Techsaksham (Edunet, Microsoft & SAP)

📅 December 2024 - March 2025

Role: Recruitment Analytics and Automation

Pipeline: 2000+ Resumes → TF-IDF & BERT → Match Ranking → Power BI Dashboards (Skill Gaps Analysis | Candidate-Job Fit Scores | Hiring Trends). Result: 75% improvement in resume-ranking effectiveness, supporting better recruitment decisions through HR analytics.
  • Conducted data cleaning and trend analysis on 2,000+ candidate resumes and job descriptions using Python and Excel
  • Built Power BI dashboards visualizing skill gaps, candidate-job fit scores, and hiring trends for HR analytics
  • Improved resume-ranking effectiveness by 75% through NLP-based preprocessing (TF-IDF + BERT)
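
As a rough sketch of the TF-IDF half of that preprocessing, the snippet below scores resumes against a job description with a small hand-written TF-IDF and cosine similarity. It is not the internship code: the BERT step is omitted, the sample texts are invented, and the helper names (`tokenize`, `tfidf_vectors`, `rank_resumes`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into simple word tokens (keeps '+' and '#')."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tfidf_vectors(texts):
    """Build one sparse TF-IDF vector (dict of term -> weight) per text."""
    token_lists = [tokenize(t) for t in texts]
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            # term frequency times a smoothed inverse document frequency
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def rank_resumes(job_description, resumes):
    """Return (candidate, score) pairs, best match first."""
    vectors = tfidf_vectors([job_description] + list(resumes.values()))
    job_vec, resume_vecs = vectors[0], vectors[1:]
    scores = {name: cosine(job_vec, vec) for name, vec in zip(resumes, resume_vecs)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented sample data.
job = "Data analyst with Python, SQL and Power BI experience"
resumes = {
    "candidate_a": "Python developer skilled in SQL, Power BI dashboards and data analysis",
    "candidate_b": "Graphic designer with Photoshop and illustration experience",
}
ranked = rank_resumes(job, resumes)
print(ranked[0][0])  # candidate_a shares the most job-specific terms
```

TF-IDF matches on exact terms only, which is one reason to pair it with a contextual model: two resumes can describe the same skill in different words.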

Projects

AI Data Cleaning and EDA Agent

October 2025

Highlights: Automation | 90% Time Saved
Tools: Python | Pandas | NumPy | Streamlit
Workflow: Data Upload → Data Clean → EDA → Visualize → Dashboard → Report
  • Built an AI-powered agent to automate data cleaning and exploratory data analysis (EDA) using Python, Pandas, and Streamlit
  • Implemented missing value handling, outlier detection, and automated visualization for dataset insights
  • Integrated a Streamlit dashboard to allow users to upload datasets and generate instant data reports
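
Two of the cleaning steps named above, missing-value handling and outlier detection, can be sketched as follows. This is a minimal illustration that assumes median imputation and the common 1.5 × IQR rule; the source does not say which methods the agent uses. It relies only on the standard library to stay self-contained (the project itself used Pandas), and the function names and sample values are hypothetical.

```python
import statistics


def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Invented column with two missing entries and one suspicious value.
raw = [12, 15, None, 14, 13, 95, 16, None, 14]
cleaned = fill_missing_with_median(raw)
outliers = iqr_outliers(cleaned)

print(cleaned)   # [12, 15, 14, 14, 13, 95, 16, 14, 14]
print(outliers)  # [95]
```

The median is used for imputation here because, unlike the mean, it is not pulled toward extreme values such as the 95 in this sample.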

House Price Prediction Model

January 2025

Highlights: Real-time Results | REST API Deployed
Tools: Python | Flask | scikit-learn | REST API
Workflow: Input Features (Area, Location, Rooms) → ML Model (Linear Regression, Random Forest) → Price Prediction → Flask REST API (/predict endpoint) → Web Interface (real-time input)
  • Developed a Machine Learning model to predict house prices based on parameters like area, location, and number of rooms
  • Performed data cleaning, feature engineering, and model training using Linear Regression and Random Forest algorithms
  • Deployed the trained model using a Flask REST API, enabling real-time predictions from user inputs via a web interface
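
The train-then-serve flow in these bullets can be sketched without any framework. The example below fits a one-feature least-squares line and wraps it in a function that behaves like a `/predict` handler (JSON in, status and JSON out). Everything here is an assumption for illustration: the training numbers are invented, only `area` is used as a feature, and `fit_line` and `handle_predict` are hypothetical names standing in for the scikit-learn model and Flask route used in the project.

```python
import json


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented training data: area in square feet, price in thousands.
areas = [500, 1000, 1500, 2000]
prices = [60, 110, 160, 210]
SLOPE, INTERCEPT = fit_line(areas, prices)


def handle_predict(body):
    """Mimic a /predict handler: JSON string in, (status, JSON string) out."""
    try:
        area = float(json.loads(body)["area"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing key, or a non-numeric value.
        return 400, json.dumps({"error": "expected JSON with a numeric 'area'"})
    prediction = round(SLOPE * area + INTERCEPT, 2)
    return 200, json.dumps({"predicted_price": prediction})


print(handle_predict('{"area": 1200}'))
print(handle_predict("not json"))
```

Fitting the model once at start-up and reusing it per request is what keeps predictions fast enough to feel real-time; validating the payload before predicting keeps bad input from surfacing as a server error.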

Bank Customer Churn Analysis

2024-2025

Highlights: 85% Accuracy | 2000+ Records
Tools: Python | Pandas | scikit-learn | Power BI
Workflow: Customer Demographics → Pattern Analysis → Churn Risk Detection → Reports (Logistic Regression and Random Forest at 85% accuracy; behavior segments; geography- and age-based churn probability)
  • Analyzed customer demographics and transactions to identify churn risk patterns using Python and Power BI
  • Built interactive reports highlighting customer behavior segments and churn probability by geography and age group
  • Achieved 85% classification accuracy using Logistic Regression and Random Forest
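
For context on how a classification accuracy figure is computed, here is a small sketch that derives accuracy, precision, and recall from true and predicted churn labels. The labels are invented and do not reproduce the 85% result; `churn_metrics` is a hypothetical helper, not part of the project.

```python
def churn_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Invented labels for ten customers.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

metrics = churn_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Churn data is usually imbalanced, so accuracy alone can look strong even when few churners are caught; reporting recall alongside it shows how many at-risk customers the model actually flags.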

E-Commerce Sales Analysis Dashboard

April 2026

Highlights: 1000+ Orders | $527K Revenue
Tools: Python | Pandas | Plotly | Jupyter | scikit-learn | SQL
Workflow: Dataset (1000+ orders) → EDA (10 insights) → Plotly Charts → RFM → K-Means Clustering → Segments (3-5 groups) → Strategy Recommendations
Statistics: Chi-Square Testing | A/B Test Results | Confidence Intervals | Geographic Insights
  • Conducted a comprehensive e-commerce analysis of 1,000+ orders covering EDA, customer segmentation, and statistical testing using Jupyter notebooks
  • Performed RFM analysis and K-Means clustering to identify 5 customer segments (Champions, Loyal, At Risk, etc.) with targeted marketing recommendations
  • Implemented A/B testing framework with chi-square tests, Wilson score confidence intervals, and geographic analysis across 50+ states
  • Built SQL queries for business intelligence and created interactive Plotly visualizations showing $527K revenue trends and customer lifetime value insights
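
The two statistical tools named in the A/B testing bullet can be sketched with the standard library alone. The snippet computes a Wilson score interval for a conversion rate and a Pearson chi-square test (one degree of freedom, no continuity correction) comparing two variants. The conversion counts are invented, and `wilson_interval` and `chi_square_2x2` are hypothetical names rather than the project's own functions.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - margin, center + margin


def chi_square_2x2(a_conv, a_total, b_conv, b_total):
    """Pearson chi-square statistic and p-value for a 2x2 conversion table.

    Assumes every expected cell count is non-zero.
    """
    observed = [[a_conv, a_total - a_conv], [b_conv, b_total - b_conv]]
    row_totals = [a_total, b_total]
    grand = a_total + b_total
    col_totals = [a_conv + b_conv, grand - (a_conv + b_conv)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed[i][j] - expected) ** 2 / expected
    # With one degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Invented A/B counts: conversions out of visitors for each variant.
low, high = wilson_interval(100, 1000)
stat, p_value = chi_square_2x2(100, 1000, 130, 1000)

print(f"Variant A rate: 95% CI [{low:.4f}, {high:.4f}]")
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

The Wilson interval stays inside [0, 1] and behaves better than the simple normal approximation when rates are near 0 or 1 or samples are small, which is why it is a common choice for conversion rates.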

Technical Skills

Languages

Python
C++
SQL

Data Engineering

ETL/ELT
Data Cleaning
Transformation
REST APIs

Databases & Cloud

MySQL
Amazon RDS
Google Cloud
BigQuery
AWS EC2

Data Science & ML

EDA
Feature Engineering
Model Training
scikit-learn
Data Analysis

Tools & Frameworks

Pandas
NumPy
Flask
Streamlit
Git/GitHub
Power BI

Certifications

Oracle Cloud Infrastructure (OCI) Data Science Professional Certificate

2025

Data Analyst - Microsoft

2024-2025

Power BI Course

2024-2025

Get In Touch

I'm always interested in hearing about new projects and opportunities. Send me a message!

Ready to Build Something Amazing?

Let's collaborate and create intelligent solutions together
