Rutu Desai

- Data Science graduate student with an undergraduate degree in Computer Science and Engineering.
- Strong analytical and visualization skills, with proficiency in Microsoft Power BI, Tableau, and Microsoft Excel.
- Advanced programming skills in Python, R, and SQL.
- Passionate about developing meaningful insights from data analysis.

Work Experience

  • CS Energy

    Edison, New Jersey, USA

    Data Analyst Intern

    06/2022 - 12/2022

  • Accenture

    Bengaluru, India

    Application Development Associate

    05/2021 - 07/2021

  • InventGrid Technologies India Pvt. Ltd.

    Ahmedabad, India

    Software Developer Intern

    01/2021 - 05/2021

Education

Projects

I have led individual and group projects in the field of Data Analysis and Visualization.

  1. "CRISPY"-Automatic Text Summarizer



    09/2022-12/2022

    - "CRISPY" is an automatic text summarizer created to understand the concepts of how text summarization works.
    - I completed this project with 1 other group member for Statistical Software Class.
    - Implemented 2 text summarization approaches : 1) Extractive Summarization 2) Abstractive Summarization.
    - Applied transfer learning by taking pre trained transformer from Huggingface.
    - Got to play with the hyper parameters using the Trainer API and create a different model for the project.
    - Combined Extractive and Abstractive Summarization techniques to compare the results of the summaries generated.
    - Used ROUGE score (Recall Oriented Understudy for Gisting Evaluation) method to compare the Gold/Standard outputs with the application generated outputs.
    - Created an application to show the results using Streamlit(Python Framework).
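
    A minimal sketch, under stated assumptions, of the extractive-plus-abstractive pipeline and the ROUGE comparison described above. The model name, the naive frequency-based extractive step, and the helper functions are illustrative, not the project's exact code.

    ```python
    import re
    from collections import Counter

    from rouge_score import rouge_scorer
    from transformers import pipeline

    def extractive_summary(text: str, n_sentences: int = 5) -> str:
        """Naive frequency-based extractive step: keep the highest-scoring sentences."""
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        freq = Counter(re.findall(r"\w+", text.lower()))
        ranked = sorted(sentences,
                        key=lambda s: sum(freq[w] for w in re.findall(r"\w+", s.lower())),
                        reverse=True)
        keep = set(ranked[:n_sentences])
        return " ".join(s for s in sentences if s in keep)

    # Abstractive step: a pre-trained seq2seq transformer from Hugging Face.
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

    def combined_summary(text: str) -> str:
        """Run the extractive filter first, then rewrite what survives abstractively."""
        shortened = extractive_summary(text)
        return summarizer(shortened, max_length=120, min_length=30)[0]["summary_text"]

    def rouge_report(reference: str, generated: str) -> dict:
        """Compare a gold/standard summary against the application-generated one."""
        scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
        return scorer.score(reference, generated)
    ```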


  2. Online Purchasing Intention



    09/2022-12/2022

    - This project aimed to predict whether a visitor to an e-commerce website will generate revenue by making a purchase.
    - I completed this project with two other group members for the Statistical Learning class.
    - The data came from the UCI Machine Learning Repository and contained about 12,330 rows with 18 attributes to choose from.
    - Implemented Sequential Feature Selection (forward and backward) to find the significant variables for each type of classification model (see the sketch below).
    - Used AdaBoost to improve results by combining weak learners.
    - Created an application that helps e-commerce business owners see trends in their users and work toward maximizing revenue.
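
    A minimal sketch, under stated assumptions, of the feature-selection and boosting steps described above, using scikit-learn. The file path, encoding choices, and number of selected features are illustrative.

    ```python
    import pandas as pd
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # UCI Online Shoppers Purchasing Intention data (~12,330 rows, 18 attributes).
    df = pd.read_csv("online_shoppers_intention.csv")
    X = pd.get_dummies(df.drop(columns=["Revenue"]))      # one-hot encode categoricals
    y = df["Revenue"].astype(int)                         # target: purchase or not

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    base = AdaBoostClassifier(n_estimators=200, random_state=42)  # weak learners combined by boosting

    # Try forward and backward sequential selection and compare the resulting subsets.
    for direction in ("forward", "backward"):
        selector = SequentialFeatureSelector(base, n_features_to_select=8, direction=direction)
        selector.fit(X_train, y_train)
        cols = X_train.columns[selector.get_support()]
        model = base.fit(X_train[cols], y_train)
        print(direction, accuracy_score(y_test, model.predict(X_test[cols])))
    ```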


  3. Microsoft Power BI



    08/2022

    - Here is a link to my GitHub page with screenshots of the Microsoft Power BI Dashboard for HR Data Summary; below is a list of the steps taken to build it.


    - Imported text data into Power BI.
    - Transformed the data using Power Query as an extract, transform, and load (ETL) tool.
    - Created measures and calculated fields to produce the desired results.
    - Used conditional formatting to find duplicates across different columns of the sheet.
    - Combined a variety of Power BI visuals into a polished dashboard.
    - Used the Page Navigator to create buttons for moving between dashboard pages easily.
    - Created conditional columns in Power Query as per the requirements.
    - Merged data to produce the appropriate results.


  4. Absenteeism at Work



    02/2022-05/2022

    - I completed this project with one other group member for the Financial Data Mining class.
    - The goal was to classify how long an employee will be absent based on the different variables given in the dataset.
    - We inspected the dataset, found it was not very clean, and processed it with various pandas and NumPy methods to get the data into the desired format.
    - We then used the seaborn, plotly.express, and matplotlib libraries to create visuals that helped us understand the data better.
    - We applied different classification methods to compare results and tuned hyperparameters with randomized search and grid search cross-validation to improve them (a sketch follows below).
    - We also applied feature selection, using Forward Selection and Backward Elimination, to identify the most significant variables.
    - Created a poster for the project, presented it to the class, and explained the approach and the insights gained.
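
    A minimal sketch, under stated assumptions, of the hyperparameter-tuning step described above: a broad randomized search followed by a narrower grid search. The classifier, parameter grids, file name, and the bucketing of absence hours into classes are illustrative choices, not the project's exact settings.

    ```python
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split

    df = pd.read_csv("Absenteeism_at_work.csv", sep=";")          # UCI Absenteeism at Work data
    X = df.drop(columns=["Absenteeism time in hours"])
    y = pd.cut(df["Absenteeism time in hours"], bins=[-1, 0, 8, 120],
               labels=["none", "short", "long"])                  # bucket hours into classes

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # Broad randomized search first ...
    random_search = RandomizedSearchCV(
        RandomForestClassifier(random_state=0),
        param_distributions={"n_estimators": [100, 300, 500],
                             "max_depth": [None, 5, 10, 20],
                             "min_samples_leaf": [1, 2, 5]},
        n_iter=10, cv=5, random_state=0)
    random_search.fit(X_train, y_train)

    # ... then a tighter grid search around the best candidate.
    grid_search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid={"n_estimators": [random_search.best_params_["n_estimators"]],
                    "max_depth": [5, 10, None],
                    "min_samples_leaf": [1, 2]},
        cv=5)
    grid_search.fit(X_train, y_train)
    print(grid_search.best_params_, grid_search.score(X_test, y_test))
    ```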


  5. YouTube Video Classification and Comment Sentiment Analysis



    03/2022-05/2022

    - I completed this project with two other group members for the Data Wrangling and Husbandry class.
    - Used a YouTube API key with the "tuber" package in R to pull data from YouTube videos and their comment sections.
    - Found the overall public sentiment (positive, negative, or neutral) in the comment section and plotted it as a bar chart over time by applying different tidying methods in R.
    - Applied different Latent Dirichlet Allocation (LDA) methods to classify topics in the comments and compared the results of the VEM and Gibbs LDA methods (an illustrative sketch follows below).
    - Created a Shiny application and an R Markdown (.Rmd) report to showcase the application and findings.
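
    The project itself was built in R (tuber for data collection, tidy tooling, and LDA fitted with both VEM and Gibbs sampling). Purely to illustrate the topic-modelling step, here is a rough Python equivalent using scikit-learn's variational-Bayes LDA; the placeholder comments and topic count are assumptions.

    ```python
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    comments = [
        "loved the editing and the background music",
        "the tutorial was clear and easy to follow",
        "audio quality could be better in the next video",
    ]  # placeholder comments; the real data came from the YouTube API via R's tuber

    vectorizer = CountVectorizer(stop_words="english")
    dtm = vectorizer.fit_transform(comments)                          # document-term matrix

    lda = LatentDirichletAllocation(n_components=2, random_state=0)   # variational Bayes
    doc_topics = lda.fit_transform(dtm)                               # per-comment topic mixture

    # Top words per topic, roughly what the R report compared across VEM and Gibbs.
    terms = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top = terms[weights.argsort()[-5:][::-1]]
        print(f"topic {k}: {', '.join(top)}")
    ```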


  6. COVID-19 Twitter Data Search Application



    03/2022-05/2022

    - Worked on this project with two other group members for the Database Management for Data Science class.
    - The aim was to create an efficient search engine for COVID-19 Twitter data using a relational database and a non-relational database, made faster with caching.
    - To collaborate across different machines and operating systems (Windows and macOS), we used Docker to build and share containerized applications.
    - We inspected the data and cleaned it with the Natural Language Toolkit (NLTK) library before doing any further processing.
    - We used PostgreSQL as the relational database to store key information and MongoDB as the non-relational (NoSQL) database.
    - We used Redis for caching to make the application faster and more efficient (a sketch of the caching layer follows below).
    - Using the Streamlit framework, we created an application that takes any sentence as input and retrieves related tweets and other details in a fraction of a second thanks to this design.
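
    A minimal sketch, under stated assumptions, of the caching layer described above: check Redis first, fall back to PostgreSQL and MongoDB on a miss, then cache the result. The connection details, table and field names, key scheme, and helper function are illustrative.

    ```python
    import hashlib
    import json

    import psycopg2
    import redis
    from pymongo import MongoClient

    cache = redis.Redis(host="localhost", port=6379, db=0)                 # cache layer
    pg = psycopg2.connect("dbname=covid user=postgres")                    # relational store
    tweets = MongoClient("mongodb://localhost:27017")["covid"]["tweets"]   # NoSQL store

    def search_tweets(query: str, ttl_seconds: int = 3600) -> list:
        key = "search:" + hashlib.sha1(query.lower().encode()).hexdigest()
        cached = cache.get(key)
        if cached is not None:                           # cache hit: sub-second response
            return json.loads(cached)

        # Cache miss: look up matching tweet ids in PostgreSQL, full documents in MongoDB.
        with pg.cursor() as cur:
            cur.execute("SELECT tweet_id FROM tweets WHERE text ILIKE %s LIMIT 50",
                        (f"%{query}%",))
            ids = [row[0] for row in cur.fetchall()]
        docs = list(tweets.find({"tweet_id": {"$in": ids}}, {"_id": 0}))

        cache.setex(key, ttl_seconds, json.dumps(docs))  # store for next time
        return docs
    ```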


  7. Insurance Cost Predictor



    10/2021-12/2021

    - I led this project along with three other group members for the Regression and Time Series Analysis class.
    - Predicted insurance costs across different regions of the US given specific attributes.
    - Found correlations between the attributes and the insurance charges.
    - Compared the Backward Elimination and Forward Selection methods for choosing predictors.
    - Applied different regression techniques to find the best model.
    - Deployed the best-performing model in a Streamlit application built with Python (a sketch follows below).
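
    A minimal sketch, under stated assumptions, of the Streamlit predictor described above (run with `streamlit run app.py`). The column names mirror the commonly used "insurance.csv" dataset, and the linear model is an illustrative choice rather than the project's final model.

    ```python
    import pandas as pd
    import streamlit as st
    from sklearn.linear_model import LinearRegression

    @st.cache_data
    def train_model():
        # Assumed columns: age, sex, bmi, children, smoker, region, charges.
        df = pd.read_csv("insurance.csv")
        X = pd.get_dummies(df.drop(columns=["charges"]), drop_first=True)
        model = LinearRegression().fit(X, df["charges"])
        return model, X.columns

    model, columns = train_model()

    st.title("Insurance Cost Predictor")
    age = st.slider("Age", 18, 65, 30)
    bmi = st.number_input("BMI", 15.0, 50.0, 25.0)
    children = st.slider("Children", 0, 5, 0)
    sex = st.selectbox("Sex", ["female", "male"])
    smoker = st.selectbox("Smoker", ["no", "yes"])
    region = st.selectbox("Region", ["northeast", "northwest", "southeast", "southwest"])

    row = pd.DataFrame([{"age": age, "bmi": bmi, "children": children,
                         "sex": sex, "smoker": smoker, "region": region}])
    row = pd.get_dummies(row).reindex(columns=columns, fill_value=0)
    st.metric("Estimated annual charges", f"${model.predict(row)[0]:,.0f}")
    ```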


  8. Chance of Admission Predictor



    05/2020-07/2020

    - Created a web application that predicts the chance (as a percentage) of being admitted to an elite university's graduate programs; the dataset covered students applying to UCLA.
    - Performed exploratory data analysis (EDA) on the dataset using Python.
    - Applied different regression models and used the best-performing one as the final model behind the web application, built with the Streamlit framework in Python (see the sketch below).

    Link for Video Description
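
    A minimal sketch, under stated assumptions, of the model-selection step described above: fit several regression models and keep the one with the best cross-validated score for the web application. The file name and column names follow the public UCLA graduate admissions dataset and are assumptions.

    ```python
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import LinearRegression, Ridge
    from sklearn.model_selection import cross_val_score

    df = pd.read_csv("Admission_Predict.csv")
    X = df.drop(columns=["Chance of Admit", "Serial No."], errors="ignore")
    y = df["Chance of Admit"]

    candidates = {
        "linear": LinearRegression(),
        "ridge": Ridge(alpha=1.0),
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    }

    # Cross-validated R^2 for each candidate; the winner backs the Streamlit app.
    scores = {name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
              for name, model in candidates.items()}
    best_name = max(scores, key=scores.get)
    best_model = candidates[best_name].fit(X, y)
    print(scores, "->", best_name)
    ```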


  9. Sentiment Analysis



    01/2020-06/2020

    - The main aim of the project was to classify comments into three categories: 1) positive, 2) negative, and 3) sarcastic.
    - Used Natural Language Processing (NLP) techniques to achieve this.
    - Used Selenium to scrape comments from YouTube videos across different genres to test our model.
    - Created Word2Vec embeddings with both the CBOW and Skip-gram models, then applied different classification models to predict the output class of each comment in the test dataset (a sketch follows below).
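
    A minimal sketch, under stated assumptions, of the embedding step described above: train Word2Vec in both CBOW (sg=0) and Skip-gram (sg=1) modes, average the word vectors per comment, and feed them to a classifier. The example comments, labels, and classifier are placeholders.

    ```python
    import numpy as np
    from gensim.models import Word2Vec
    from sklearn.linear_model import LogisticRegression

    comments = [["loved", "this", "video"],
                ["worst", "tutorial", "ever"],
                ["oh", "great", "another", "ad"]]        # tokenized placeholder comments
    labels = ["positive", "negative", "sarcastic"]

    def comment_vectors(tokenized, sg):
        """Train Word2Vec (CBOW if sg=0, Skip-gram if sg=1) and average vectors per comment."""
        w2v = Word2Vec(sentences=tokenized, vector_size=50, window=3, min_count=1, sg=sg)
        return np.array([np.mean([w2v.wv[t] for t in toks], axis=0) for toks in tokenized])

    for sg, name in [(0, "CBOW"), (1, "Skip-gram")]:
        X = comment_vectors(comments, sg)
        clf = LogisticRegression(max_iter=1000).fit(X, labels)
        print(name, clf.predict(X))                      # toy in-sample check only
    ```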


  10. Analysis



    - This is my analysis in Python, using various libraries and methods, of OTT TV shows on different platforms including Netflix, Prime Video, Hulu, and Disney+ (a tiny sketch follows below).
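
    A tiny sketch of the kind of cross-platform comparison in this analysis. The "tv_shows.csv" file name and the per-platform flag columns follow a common public dataset and are assumptions.

    ```python
    import pandas as pd

    shows = pd.read_csv("tv_shows.csv")                    # one row per show, one 0/1 flag column per platform

    platforms = ["Netflix", "Prime Video", "Hulu", "Disney+"]
    counts = {p: int(shows[p].sum()) for p in platforms}   # how many titles each platform carries
    print(pd.Series(counts).sort_values(ascending=False))
    ```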


  11. Visualization



    - These are my visualizations in Python, using libraries such as matplotlib, plotly.express, and seaborn (a small sketch follows below).
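
    A small sketch of the kinds of plots in this collection; the seaborn sample dataset and column names are placeholders, not the actual notebooks.

    ```python
    import matplotlib.pyplot as plt
    import plotly.express as px
    import seaborn as sns

    tips = sns.load_dataset("tips")                      # sample dataset bundled with seaborn

    sns.boxplot(data=tips, x="day", y="total_bill")      # seaborn: static distribution plot
    plt.title("Total bill by day")
    plt.savefig("total_bill_by_day.png")

    fig = px.scatter(tips, x="total_bill", y="tip",      # plotly.express: interactive scatter
                     color="smoker", title="Tip vs total bill")
    fig.write_html("tip_vs_total_bill.html")
    ```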


  12. Tableau Visualizations



    - This is a link to all my Tableau visualizations, including interactive dashboards, stories, and small Tableau projects, some of which use Makeover Monday datasets.


Skills

  • Tableau

  • Microsoft Power BI

  • Python

  • R

  • SQL

  • Data Analysis

  • Data Visualization

  • Streamlit Framework

  • Shiny Framework

  • Data Manipulation and Cleaning

  • Microsoft Excel, Word, and PowerPoint

  • Problem Solving

  • Critical and Analytical Thinking

  • Teamwork

Certifications

  • Google Data Analytics Professional Certification - 8 Course Specialization

  • Data Visualization with Tableau - 5 Course Specialization

  • Tableau Desktop Specialist

  • Data Analysis with Python

  • Data Visualization with Python

  • SQL(Intermediate) Hackerrank

  • Pandas Kaggle

Hobbies

Cooking

Fond of cooking and eating; I love trying different types of cuisine and making those dishes at home.



Watching TV Shows

These are my top three favourite TV shows of all time, which I can watch again and again without getting bored.



UI/UX Designing

Using Adobe XD, I create different User Interface Designs just for exploring the field and to have some fun.



Business Cards Designing

Using Adobe Photoshop CC 2015, I try to create different patterns and styles for Business Cards.

Profiles