Pranjal Shukla's
Data Portfolio

Love telling stories and learning how to work with data to make them even more convincing.
@pranjalshukla98

Tableau Portfolio

Hi, I’m Pranjal. I am currently pursuing a Master of Science in Business Analytics at George Washington University.
I am a highly motivated individual with a strong foundation in technical skills and real-world experience in data analysis, business intelligence, and project management.
My Tableau Public repository contains multiple dynamic dashboards covering Amazon orders in India, the top 1,000 IMDb movies, flight delays, and more.

Netflix Recommendation System

Built a Netflix recommendation system using Python, leveraging NLP techniques and cosine similarity to provide personalized content recommendations based on user preferences and content features. Cleaned and processed data, implemented text vectorization, and built a function to recommend movies and TV shows, enhancing the user experience on the platform.
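
A minimal sketch of the approach described above: text vectorization followed by cosine similarity. The toy catalog, the column names, and the `recommend` helper are hypothetical stand-ins for illustration, and TF-IDF is assumed as the vectorization step; the actual project may have differed.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy catalog standing in for the real Netflix titles data.
catalog = pd.DataFrame({
    "title": ["Space Wars", "Galaxy Quest Returns", "Baking Masters", "Cake Battles"],
    "description": [
        "A rebel pilot fights an empire across space and distant galaxies",
        "A starship crew explores space and battles aliens in a distant galaxy",
        "Home bakers compete to bake the best bread and cakes",
        "Pastry chefs compete in a baking contest to bake elaborate cakes",
    ],
})

# Turn each description into a TF-IDF vector, ignoring common English words.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(catalog["description"])

# Pairwise cosine similarity between every pair of titles.
similarity = cosine_similarity(tfidf)


def recommend(title, top_n=2):
    """Return the top_n titles whose descriptions are most similar to `title`."""
    idx = catalog.index[catalog["title"] == title][0]
    scores = pd.Series(similarity[idx], index=catalog["title"]).drop(title)
    return scores.sort_values(ascending=False).head(top_n).index.tolist()


print(recommend("Space Wars", top_n=1))
```

On a real catalog, the same idea would typically combine several text fields (title, genre, description) into one string per item before vectorizing.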

Health Insurance Premium Predictions

Used Python's data stack, including Pandas for data manipulation, Plotly Express for visualizations, and Scikit-learn's Random Forest Regressor, to predict health insurance premiums. The process involved data cleaning, feature transformation, and correlation analysis. Delivered a strong predictive model that considers age, gender, BMI, and smoking status to forecast premiums, improving data understanding for pricing and risk assessment.
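
The workflow above can be sketched roughly as follows. The data here is synthetic and the column names (`age`, `sex`, `bmi`, `smoker`, `premium`) and the premium formula are assumptions made for illustration only, not the project's dataset or results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the real project used an insurance dataset.
rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "age": rng.integers(18, 65, n),
    "sex": rng.choice(["male", "female"], n),
    "bmi": rng.normal(28, 5, n).round(1),
    "smoker": rng.choice(["yes", "no"], n, p=[0.2, 0.8]),
})
# Made-up premium rule so the example has a signal to learn.
data["premium"] = (
    250 * data["age"]
    + 300 * data["bmi"]
    + 20000 * (data["smoker"] == "yes")
    + rng.normal(0, 1000, n)
)

# Feature transformation: encode the categorical columns as numbers.
data["sex"] = data["sex"].map({"male": 1, "female": 0})
data["smoker"] = data["smoker"].map({"yes": 1, "no": 0})

X = data[["age", "sex", "bmi", "smoker"]]
y = data["premium"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the random forest and evaluate on the held-out split.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))

# Which features the forest relied on most.
importances = pd.Series(model.feature_importances_, index=X.columns)
print(f"R^2 on held-out data: {r2:.3f}")
print(importances.sort_values(ascending=False))
```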

Fake News Detection

Processed a mixed Fake and True News dataset using text cleaning, TfidfVectorizer, and a Multinomial Naive Bayes classifier. The resulting model reached an accuracy of 95.31%, showing that this approach is effective at classifying news authenticity. The project underscores the importance of preprocessing and appropriate algorithm selection in accurate news classification.
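
A minimal sketch of the pipeline named above (text cleaning, `TfidfVectorizer`, Multinomial Naive Bayes). The six example headlines, the labels, and the `clean_text` helper are hypothetical; they only illustrate the mechanics and do not reproduce the reported accuracy.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def clean_text(text):
    """Lowercase, drop URLs and non-letter characters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Tiny made-up training set (assumption): the real project used a full news dataset.
texts = [
    "SHOCKING!!! Aliens secretly control the government, insiders reveal",
    "You won't believe this miracle cure doctors are hiding http://example.com",
    "Secret miracle pill melts fat overnight, shocking insiders claim",
    "The city council approved the annual budget on Tuesday",
    "The central bank held interest rates steady at its meeting",
    "Officials said the new bridge will open to traffic next month",
]
labels = ["fake", "fake", "fake", "true", "true", "true"]

# Cleaning runs inside the vectorizer, so training and prediction share it.
model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_text, stop_words="english"),
    MultinomialNB(),
)
model.fit(texts, labels)

print(model.predict(["Shocking miracle cure revealed by secret insiders"])[0])
```

Wrapping both steps in one pipeline keeps the cleaning and vocabulary consistent between training and prediction, which avoids a common source of silent errors.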

Analysis of Flight Delays & Cancellation Rates

Through extensive data cleaning and analysis of a vast dataset comprising 5.8 million domestic flights, this project delved into flight delay and cancellation patterns. Employing Tableau dashboards, key insights were extracted to evaluate airline performance, dissect route-specific delays, and examine social media's influence on customer sentiments. The project's outcomes emphasize its role in informed decision-making within the aviation industry.

Credit Line Predictive Model

Developed a Decision Tree model for predicting credit default probability, using demographic and payment history data. Evaluated model fairness, ethical considerations, and potential risks to ensure responsible implementation within an educational context. The model reached 98% accuracy, trained on 30,000 data points and tested on 7,500.
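
As a rough illustration of predicting a default probability with a decision tree, the sketch below uses synthetic data. The two features (`age`, `late_payments`), the rule that generates defaults, and the tree settings are all assumptions for the example, not the project's data or configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): one demographic and one payment-history feature.
rng = np.random.default_rng(1)
n = 2000
age = rng.integers(21, 70, n)
late_payments = rng.integers(0, 7, n)  # hypothetical count of late payments

# Made-up rule: default risk rises with the number of late payments.
default = (rng.random(n) < 0.05 + 0.12 * late_payments).astype(int)

X = np.column_stack([age, late_payments])
X_train, X_test, y_train, y_test = train_test_split(
    X, default, test_size=0.2, random_state=0, stratify=default
)

# A shallow tree with a minimum leaf size keeps probability estimates stable.
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=50, random_state=0)
tree.fit(X_train, y_train)

# Column 1 of predict_proba is the estimated probability of default (class 1).
default_probability = tree.predict_proba(X_test)[:, 1]
accuracy = tree.score(X_test, y_test)

low_risk, high_risk = tree.predict_proba([[40, 0], [40, 6]])[:, 1]
print(f"accuracy={accuracy:.2f}, P(default | 0 late)={low_risk:.2f}, "
      f"P(default | 6 late)={high_risk:.2f}")
```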

Stroke Prediction

Focused on stroke prediction using machine learning techniques. Data preprocessing, exploratory data analysis, and feature engineering were conducted to analyze attributes like age, gender, work type, smoking status, and more. Different models including Decision Trees, K-Nearest Neighbors, XGBoost, and Random Forest were trained and evaluated to predict stroke risks, yielding insights into correlation, data distribution, and model performance.
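
The model comparison described above might look something like this sketch. It uses a synthetic, imbalanced dataset from `make_classification` as a stand-in for the stroke data, and it leaves out XGBoost so the example depends only on scikit-learn; the metric (ROC AUC) and the hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data (assumption): roughly 10% positive cases.
X, y = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=4,
    weights=[0.9, 0.1],
    random_state=0,
)

models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    # KNN is distance-based, so features are scaled first.
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# 5-fold cross-validated ROC AUC, which is more informative than accuracy
# when one class is rare.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: ROC AUC = {score:.3f}")
```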

Upcoming project

Developed a personalized Netflix content recommendation system using Python libraries. User-specific suggestions were generated based on titles, descriptions, and genres through data preprocessing and cosine similarity. The outcome underscores the value of leveraging advanced techniques to enhance content discovery and user engagement.