Hi, I'm Shreayan. I was born and brought up in Mumbai. I am 25 years old. I am a fun-loving person who likes to travel, eat delicious food, meet new people, and listen to all types of music. I graduated with a First Class with Distinction in B.Tech in Software Engineering from SRM University, Chennai. I love coding and my areas of interest include Machine Learning, NLP, and Data Science. I started coding in 2012 when my class teacher introduced me to it. My love for coding has only grown over the years. Call me a fanatic, but I'm crazy about all things related to Space and Astronomy. I love researching random topics in Astronomy and Space, which often gives me an existential crisis. My hobbies include playing video games, chess, football and cooking. In my free time, I also play various songs on the keyboard. Do check out my Instagram if you want to see how I play the keyboard.
Shreayan Chaudhary
+1 (551) 339-7724
shreayan98c@gmail.com
linkedin.com/in/shreayan98c/
Baltimore, Maryland
My goal is to pursue a career in Machine Learning and to work in a technological field where my continuous learning skills can be used to organize, manage, and analyze massive amounts of structured and unstructured data, and to build machine learning algorithms that meet specific business needs and goals, thus contributing to the growth of the organization.
A Machine Learning Engineer with 3 years of experience in fintech and edtech, working on end-to-end ML projects, from data ingestion to deploying ML models in production. As a Computer Science Master's student at Johns Hopkins University specializing in Machine Learning, I am highly skilled in multimodal ML, covering NLP and Computer Vision. I have two publications at established international conferences as well as one patent.
Machine Learning Intern • June 2023 - Present
• Worked as a Machine Learning Intern primarily focused on Generative AI and NLP
• Leveraged GPT-4, Falcon and Llama2 LLMs for text2SQL database querying using RAG and chain-of-thought prompting
• Reduced cost by ~67% through vector caching and by fine-tuning LLMs with PEFT and LoRA to optimize the generated SQL
• Extended support to unstructured data (PDF, PPT, DOC, MP4, TXT files) and enabled complex metric computations
• Technologies Used: PyTorch, JAX, LangChain, vLLM, Streamlit, Hugging Face, Azure, LLaMA, Falcon
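The vector caching mentioned in the bullets above can be pictured as a similarity lookup placed in front of the LLM call, so that a question close to an earlier one reuses the SQL already generated. The following is a minimal sketch under stated assumptions, not the production system: `embed` is a toy bag-of-words stand-in for a real embedding model, and `VectorCache`, `answer`, `generate_sql` and the 0.8 threshold are hypothetical names and values chosen for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: a real system would use a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorCache:
    """Hypothetical cache that reuses SQL generated for a similar earlier question."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, sql) pairs

    def lookup(self, question):
        """Return the SQL of the most similar stored question, or None below the threshold."""
        query = embed(question)
        best_sql, best_score = None, 0.0
        for vector, sql in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_sql, best_score = sql, score
        return best_sql if best_score >= self.threshold else None

    def store(self, question, sql):
        self.entries.append((embed(question), sql))


def answer(question, cache, generate_sql):
    """Return (sql, cache_hit); only call the costly generator on a cache miss."""
    cached = cache.lookup(question)
    if cached is not None:
        return cached, True
    sql = generate_sql(question)
    cache.store(question, sql)
    return sql, False
```

Passing any callable as `generate_sql` (for example a wrapper around an LLM client) keeps the cache independent of the model behind it; each cache hit avoids one generation call.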
Machine Learning Engineer • January 2021 - July 2022
• Worked as a Machine Learning Engineer primarily focused on Deep Learning and NLP
• Automated scanned invoices’ OCR and Information Extraction by building a rule-based AI with 92.7% F1 score and reduced processing time by 85%
• Developed PyTorch and TensorFlow backend APIs for RZTDL, the core no-code drag-and-drop deep learning library of the Razorthink Platform, which procedurally generates code and is used by 40+ corporations (Domain: Neural Networks, Backend Python services and server operations)
• Designed and executed a scalable end-to-end data pipeline handling 1.5+ GB/day, including Data Ingestion, Wrangling, Warehousing, ETL, and Model Feature Engineering
• Technologies Used: PyTorch, TensorFlow, NLTK, spaCy, CoreNLP, Gensim, GCP, OpenCV, Tesseract, Google Vision, Fuzzy ML Algorithms
Machine Learning Research Intern • December 2019 - January 2021
• Worked as an Applied Research Intern primarily focused on Deep Learning and NLP
• Transformed the Indian Navy’s Russian-English translation process by building a Bi-LSTM Attention Transformer that automates the translation of domain-specific documents, achieving a 0.29 BLEU score and reducing human involvement by 80%, under the guidance of Prof. Ganesh Ramakrishnan
• Led 7 members of 3 teams (ML, Dev, Mgmt) to develop a Sanskrit OCR + post-editing using CNN, Tesseract, GoogleOCR to digitize ancient Sanskrit manuscripts
• Pro bono engineered a feedback summarizer using BERT for IChangeMyCity and Civis NGOs, reducing processing time for 1.3M+ registered users and increasing resolution rate for 3.7M+ complaints by 94%
• Duties included studying and thoroughly understanding cutting-edge research papers in ML and conducting experiments based on them
• Technologies Used: Bi-LSTM Transformers, CNN, RNN, Tesseract, BERT, OpenCV, Hugging Face, Keras, TensorFlow, NLTK, spaCy, Django, GCP, AWS
Intern Data Scientist • June 2019 - July 2019
• Worked as a Data Scientist as a part of the Statistics and AI Team
• Conducted Statistical Analysis & EDA on banks’ loan data to visualize customers’ propensity and segment defaulters
• Devised and automated an SVM model, reducing staff hours in identifying potential loan defaulters by 70%
• Predicted the propensity of a loan defaulter and identified the most probable method for repayment
• Deployed the model to production on an AWS EC2 instance and automated it to run daily at 6 pm
• Technologies Used: Flask, NumPy, Scikit-Learn, AWS EC2, PostgreSQL, SVM, Neural Nets
Intern Software Engineer • June 2018 - July 2018
• Worked as an Intern Software Engineer as a part of the Software Engineering team
• Engineered a scalable JSP and Servlet based web app for A/B testing with emails, SMS and notifications sent to 1.2M+ users, helping the company’s customer relations team communicate effectively with customers
• Technologies Used: JSP, Servlets, jQuery, Datatables.js, HTML, CSS: MaterializeCSS
Intern Web Developer • January 2018 - June 2018
• Worked as the Web Developer at WisOpt, an education-focused SRM University startup.
• Built an analytics dashboard for web.wisopt.com, used by SRM University to serve 600+ professors and 15k+ students for all official communications
• Technologies Used: jQuery, Vue.js, HTML, CSS: Bootstrap
Social Media Head • June 2017 - June 2020
• Worked as the Social Media Head, managing the social media handles of the Software Engineering Association.
• Organized national level technical events such as Hackathons & Coding Competitions to promote various applications of coding and collaboration among CS undergrad students.
• Handled the Facebook, Twitter and Instagram accounts with over 6000 followers.
• Increased user engagement by 30% across all social handles (Facebook, Instagram and Twitter), from 4.6k to 6.1k.
MS in Computer Science • 2022 - 2024
• CGPA: 3.97/4
• Research Assistant at ARCADE (Advanced Robotics and Computationally AugmenteD Environments) Lab advised by Prof. Unberath - Spring '23, Fall '23, Spring '24
• Research Assistant at CLSP (Center for Language and Speech Processing) Lab advised by Prof. Yarowsky - Fall '22
• Teaching Assistant for the graduate level course Information Retrieval by Prof. Yarowsky - Spring '24
• Teaching Assistant for the graduate level course Software Engineering by Prof. Darvish - Spring '23
• Teaching Assistant for the graduate level course Databases by Prof. Yarowsky - Fall '22
B.Tech in Software Engineering • 2016 - 2020
• Grade: 89%
• First Class with Distinction: Ranked in the top 10% of the department
XIIth grade in CBSE Board • 2014 - 2016
• Obtained a Silver medal in the International Informatics Olympiad.
Xth grade in ICSE Board • 2004 - 2014
• Certificate of excellence in Computer Science and Mathematics.
I started my coding journey in 2012 with Java, when my class teacher introduced me to programming. My love for coding has only grown over the years. After Java, I learned the basics of C and C++. Then I moved on to HTML, CSS and JavaScript to create fluid, mobile-responsive and lightweight web applications. When a friend introduced me to Python, I immediately fell in love with the language, and it has been my favourite ever since. I have dived deep into deep learning (pun intended) and started learning machine learning (also, pun intended) in Python. My research interests lie in Recommender Systems, OCR and NLP.
Shreayan Chaudhary on ResearchGate
Chaudhary, Shreayan; Killeen, Benjamin; Osgood, Greg; Unberath, Mathias (2024). Take a Shot! Natural Language Control of Robotic X-ray Systems for Image-guided Surgery, International Conference on Information Processing in Computer-Assisted Interventions.
This paper proposes a natural-language-based communication protocol for controlling robotic C-arm devices with voice commands.
Chaudhary, Shreayan; Anupama, C. (2020). Ensemble Recommendation System using a hybrid decision level fusion of Popularity Model and Collaborative Filtering, International Conference for Artificial Intelligence and Evolutionary Computations in Engineering Systems, pp.551-559 DOI: 10.1007/978-981-15-0199-9_47.
This paper proposes a hybrid recommendation algorithm that combines a popularity-based model with collaborative filtering to improve performance metrics, address the cold-start problem, and overcome the drawbacks of each algorithm on its own.
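One way to picture decision-level fusion is as a weighted blend of the two models' item scores, with a fallback to popularity alone for users who have no rating history (the cold-start case). The sketch below is an illustration under assumptions and is not the fusion rule from the paper: the linear weighting, the `alpha` parameter, the overlap-based neighbour weighting, and all function names are hypothetical.

```python
def popularity_scores(ratings):
    """Score each item by its share of all ratings (a simple popularity model)."""
    counts = {}
    for _user, item, _rating in ratings:
        counts[item] = counts.get(item, 0) + 1
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()}


def collaborative_scores(ratings, user):
    """Score unseen items using other users, weighted by how many items they share with `user`."""
    by_user = {}
    for u, item, rating in ratings:
        by_user.setdefault(u, {})[item] = rating
    mine = by_user.get(user, {})
    scores = {}
    for other, theirs in by_user.items():
        if other == user:
            continue
        overlap = len(set(mine) & set(theirs))
        if overlap == 0:
            continue
        for item, rating in theirs.items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + overlap * rating
    top = max(scores.values(), default=0.0)
    # Normalise to [0, 1] so the scores are comparable with popularity shares.
    return {item: s / top for item, s in scores.items()} if top else {}


def hybrid_recommend(ratings, user, alpha=0.7, k=3):
    """Fuse both score lists; alpha is a hypothetical weight on the collaborative side."""
    pop = popularity_scores(ratings)
    collab = collaborative_scores(ratings, user)
    seen = {item for u, item, _ in ratings if u == user}
    if not collab:
        alpha = 0.0  # cold start: no usable neighbours, rely on popularity alone
    fused = {
        item: alpha * collab.get(item, 0.0) + (1 - alpha) * pop[item]
        for item in pop
        if item not in seen
    }
    # Highest fused score first; ties broken by item id for a stable order.
    return sorted(fused, key=lambda item: (-fused[item], item))[:k]
```

Each rating is a `(user, item, rating)` tuple. A user with no ratings still receives recommendations, because the fused score then reduces to the popularity score.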
Chaudhary, Shreayan; Ferni, U. (2020). Recommendation System for Establishing New Businesses using Geospatial Clustering for Multiple Reference Points, National Conference on Artificial Intelligence and Intelligent Information Processing; Patented under SRM University.
Created custom clustering algorithms and a recommender system to find the optimal location in any given city or place to help set up a business for entrepreneurs, thus saving time, money, and risk.
Engineered prototype glasses with camera and mic, and finetuned OpenFlamingo on real-time video to answer users’ audio-based questions (Whisper for ASR) conditioned on the image (CLIP ViT for images and LLaMA for text) YouTube Link
Data Pipeline: Video Stream from phone camera -> Capture Frames -> Record User Question -> Transcribe using Whisper -> Model Inference conditioned on image frames -> Return Answer
Technologies: DroidCam, OpenCV, OpenAI Whisper, OpenFlamingo (LLaMA 7B + CLIP ViT/L-14) Vision-Language Model.
Created an end-to-end project to help entrepreneurs and businessmen find the optimal location to set up their business. Built custom clustering algorithms on geospatial data, together with a recommender system, to find the optimal location in any given city or place, thus saving time, money, and risk. YouTube Link
Data Pipeline: Data Integration -> Data Inspection -> Data Visualizations -> Data Modeling -> Model Evaluation -> Django Web Application -> Cluster Analysis -> RESTful API creation -> Deployment on Heroku
Technologies: ML, Custom Clustering Algorithm, Folium, Heroku, Django.
Data Integrated from: Census of India 2011, Geopy Nominatim, Foursquare API and Google Places API.
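To make the idea of ranking locations against multiple reference points concrete, here is a minimal sketch that scores each candidate site by its mean great-circle distance to a set of reference points and picks the closest one. This is an assumption-laden illustration and not the project's custom clustering algorithm: the scoring rule, `best_location`, and the sample coordinates in the usage note are all hypothetical. Only the haversine formula itself is standard.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def best_location(candidates, reference_points):
    """Pick the candidate name with the smallest mean distance to all reference points."""
    def mean_distance(name):
        point = candidates[name]
        total = sum(haversine_km(point, ref) for ref in reference_points)
        return total / len(reference_points)
    return min(candidates, key=mean_distance)
```

For example, `best_location({"site_a": (19.08, 72.88), "site_b": (13.08, 80.27)}, [(19.07, 72.87), (19.10, 72.90)])` selects `"site_a"`, since both reference points lie next to it. A fuller system would weight reference points by category or footfall instead of treating them equally.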
Venter is a response categorization and document summarization tool. The goal is to create a platform that autonomously analyses, processes and summarizes large corpora of user feedback, inputs and responses.
Clients: Civis, ICMyC (I Change My City), SpeakUp.
Technologies: ML, Deep Learning, GCP, NLP, Django.
Created an end-to-end project to help students find prospective grad schools. It predicts a student's chance of being selected for a grad program abroad and finds the most suitable universities for them.
Data Pipeline: Data Collection -> Data Inspection -> Data Visualizations -> Data Modeling -> Model Evaluation -> Flask Web Application -> RESTful API creation -> Deployment on Heroku
Technologies: ML, Linear Regression, XGBoost, Heroku, Flask, AWS.
Created an automated machine learning model that predicts whether a borrower is likely to default on a bank loan and identifies the most probable form of repayment. Also automated it to run at 6 pm daily and deployed it on an AWS EC2 server for production.
Created a smart headband, connected to headphones, that guides visually impaired users via audio instructions about the objects and people in their path. This project was awarded a special mention at Conception, an IoT project expo.
Created a recommender system by applying collaborative filtering and simple popularity method algorithms on a movie dataset consisting of 27000 movies and 20 million ratings given by 138000 users.
Dataset: MovieLens 20 Million Dataset
Created a Smart Vehicle to Vehicle Technology using IoT to decrease commute times as well as enable multi-vehicle control by a single driver using GPS and Ultrasonic sensors.
Technologies Used: Keras, Arduino, NodeMCU, Bluetooth, Flask
Languages Used: Python, HTML, JavaScript, CSS, C
Created a local library application to manage books systematically. This project was built with Django and deployed on a Heroku server. Project on GitHub
Used SMOTE (Synthetic Minority Oversampling TEchnique), under-sampling and over-sampling to classify businesses into multiple categories on an unbalanced dataset, achieving a top 10 percentile F1 score across India.
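The core idea of SMOTE is to create new minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea with the standard library only; it is a simplified illustration, not the code used in the project, and the function name `smote`, its parameters and the default `k=2` are assumptions. In practice a maintained implementation such as the one in the imbalanced-learn package would normally be used.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating towards nearest neighbours.

    `minority` is a list of at least two distinct feature tuples of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((x - y) ** 2 for x, y in zip(base, p)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append(tuple(x + gap * (y - x) for x, y in zip(base, target)))
    return synthetic
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies instead of duplicating existing rows, which is what distinguishes SMOTE from plain over-sampling.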
A news blog that fetches the latest news via a news API. This project was built with Django and deployed on a Heroku server. Project on GitHub
Created a deep neural network using Keras to identify the type of clothing or footwear in a given image.
Uses the OpenCV library to detect the user's face in frames captured from the video camera.
A web application that sends emails, SMS and Android push notifications to all (or a selected few) employees in an organization. It uses HTML, CSS and JS for the frontend, with Java and Microsoft SQL Server as the backend that stores the employee details.
Software Requirements: Netbeans IDE 8.2, Java 8 EE, Microsoft SQL Server 2012
Languages Used: HTML, JavaScript, CSS, Java
Frameworks Used: Bootstrap, DataTables, jQuery
Predicts the price of a house in Boston using scikit-learn’s Boston dataset.
Given a set of features that describe a house in Boston, the machine learning model must predict the house price.
Languages Used: Python
Dataset: Boston Houses Dataset
Created a model using machine learning algorithms that can predict whether the cancer is benign or malignant.
Dataset: Breast Cancer Wisconsin Dataset
The objective was to understand the factors that influence attrition and to predict which employees are likely to leave the company in the future.
Dataset: IBM Employees Dataset
A very simple to-do list that lets you keep track of the activities and tasks you need to complete, so that you don't forget them.
Check it out!