Masters in Data Science
Texas A&M University
Department of Computer Science and Engineering
Student, Institute of Data Science
CURRENT PROJECT
Designing and Optimizing Scalable Data Pipelines for Precision Agriculture and
Environmental Monitoring: Predicting Crop Health, Greenhouse Gas Flux, and Climate Risk
Developing ML models for improving environmental data processing
through multi-source data integration - Dept. of Computer & Electrical Engineering (TAMU)
Visit my GitHub page to look at my work and collaborate.
Built a predictive maintenance system using Python (scikit-learn, TensorFlow), SQL, and Azure that uses machine learning to predict equipment failures in data centers. The system achieved 90% accuracy in predicting equipment failures, which reduced downtime and led to a 30% drop in maintenance costs.
Developed a video summarization and interactive quiz system using Mixtral, Whisper, and AWS, integrating language models and tools like GPT-3, BERT, and FFmpeg for efficient video processing. Set up AWS EC2 instances to run large models, reduce latency, and transcribe audio to text with Whisper. Implemented dynamic quiz generation and a feedback system using Flask, HTML, and JavaScript, storing user data in a database. The project highlighted the strengths and limitations of various language models and their practical application in video summarization and interactive quizzes.
Developed an AI-based system to detect financial fraud in real time, leveraging deep learning models like Transformers, CNNs, and GANs for fraud identification and simulation of forged transactions. Used EfficientNet for image-based fraud detection and integrated Explainable AI (XAI) techniques to enhance model transparency and support ethical decision-making. Visualized fraud patterns and risks using Tableau or Seaborn, helping stakeholders make informed, data-driven investment decisions. Achieved 95% accuracy in detecting fraudulent transactions while reducing false positives.
Built an MLOps pipeline for a Loan Eligibility Prediction model using Python, deployed on Google Cloud Platform (GCP). The pipeline involved creating a Flask API, containerizing it with Docker, and managing source code through Cloud Source Repositories and Git. Automated deployment was handled via Cloud Build, and the model was deployed using Cloud Run. This project demonstrated efficient cloud architecture, leveraging GCP services for scalable and automated machine learning operations.
Conducted statistical and multivariate analysis on customer data to identify key churn drivers. Developed predictive models using logistic regression and Random Forest, improving churn prediction accuracy by 85%. Provided actionable insights to stakeholders, enabling retention strategies that reduced churn rates by 20%.
Engineered a machine learning content recommendation system that tailors content suggestions to individual users. Personalized recommendations increased user engagement by 25%.
Built an ensemble model for automatic number plate recognition (ANPR) and vehicle tracking using YOLOv5, YOLOv8, DeepSORT, and EasyOCR, designed to perform well in low-light conditions. The model achieved an F1 score of 0.97 and is complemented by a user-friendly web interface.
Built a predictive model to assess credit risk using logistic regression and decision trees, achieving a risk classification accuracy of 92%. Conducted Time Series Analysis to identify trends in loan defaults, improving risk profiling. Developed dashboards in Power BI to visualize risk metrics and trends, enhancing transparency for stakeholders.
Internships
Data Science Intern (Graduate) - Texas A&M University, College Station
Student Researcher (FLAIR Lab) - Texas A&M University, College Station
Entropik Technologies (Machine Learning Engineer) - Remote, India
High Radius (Data Analyst) - Chennai, India
Publications and Awards
Mailing address: Unit 204, The Villas of Cherry Hollow, 503 Cherry Street, College Station, TX 77840
E-mail: raj2001@tamu.edu
or rajpurohitharjun58@gmail.com
LinkedIn: Reach me at LinkedIn