Suraj Thapa — Data Science & ML

About

About Me

I recently graduated with a Master's degree in Data Science from the University of New Haven (May 2026). I care about building robust, intelligent systems that solve real-world problems, and I am most interested in turning raw data into intelligence people can act on.

Rather than only training models, I enjoy the whole end-to-end process, from designing efficient data pipelines and writing software to deploying scalable machine learning solutions. I do my best work where rigorous mathematics meets practical, hands-on software engineering.

I am a strong advocate for the open-source community and value continuous learning, cross-disciplinary collaboration, and breaking down complex topics through detailed technical writing and tutorials.

Actively seeking full-time opportunities in Data Analytics, Machine Learning, and Computer Vision. Let's connect →

Core Skills

Python, PyTorch, TensorFlow, Computer Vision, NLP, SQL, Pandas, Data Visualization, Deep Learning, SLAM, C++, Git

Research

Research Interests

Exploring robust estimation and computer vision with a focus on accuracy, determinism, and reproducibility.

Robust Estimation

RANSAC and its deterministic variants for filtering extreme outliers.

Computer Vision

Homography, object tracking, and instance segmentation techniques.

Information Fusion

Fusing multi-modal sensor inputs such as LiDAR and RGB imagery for SLAM.

Machine & Deep Learning

Training specialized neural networks, studying optimization algorithms, and building NLP applications.

Environmental Science

Applying data-driven and geospatial techniques to ecological analysis and conservation.

Projects

Featured Projects

Selected academic and applied work across vision, ML, and systems.

01

FinRAG: Retrieval-Augmented Generation for Financial QA

Built a financial-domain RAG system that answers investment, risk-analysis, and forecasting queries with improved factuality and reduced hallucinations.

Python, PyTorch, FAISS, LLMs

Aug–Dec 2025

02

Instance Segmentation Using Mask R-CNN

Implemented a three-class instance segmentation pipeline on a custom dataset using Detectron2/Mask R-CNN. Trained and fine-tuned models, evaluating them with mAP and IoU.

Mask R-CNN, Detectron2, PyTorch

Aug–Dec 2025

03

Tango Puzzle Solver with AC-3 and A* Search

End-to-end solver for the Tango logic puzzle with interactive visualization via Pygame. Extended with Q-learning experiments.

CSP, A*, Pygame

Feb–May 2025

04

EDA and DE Pipeline

An exploratory data analysis and data engineering pipeline that cleans, transforms, and prepares raw datasets for scalable machine learning.

Python, Pandas, ETL

2024

05

ORB-SLAM3 Experiments

Experiments and integrations with the ORB-SLAM3 framework, covering visual, visual-inertial, and multi-map SLAM across diverse environments.

C++, Computer Vision, SLAM, ROS

2024

Blog

Latest Articles

Insights and tutorials on deep learning, machine learning algorithms, and optimization.

Forward & Backward Propagation Explained

A detailed walkthrough of how neural networks make predictions and learn from mistakes, with step-by-step math and Python code.

May 28, 2026 • Deep Learning

Visual Guide: 7 Activation Functions Explained

A comprehensive breakdown of ReLU, Sigmoid, Tanh, and other critical neural network activation functions with generated mathematical plots.

March 19, 2026 • Deep Learning

Gradient Descent & Its Variants (BGD vs SGD)

Understanding the calculus behind model optimization and why mini-batches strike a practical balance between speed and stability.

March 19, 2026 • Optimization

Stop Relying on Accuracy: Real Evaluation Metrics

Why the confusion matrix, precision, recall, and F1 score are more informative than raw accuracy, especially on imbalanced datasets.

March 19, 2026 • Evaluation

Principal Component Analysis (PCA) Intuition

Mitigating the curse of dimensionality by using variance-maximizing orthogonal transformations to compress high-dimensional datasets.

March 19, 2026 • Dimensionality Reduction

Law of Large Numbers vs Central Limit Theorem

Unpacking two of the most commonly confused fundamental theorems in classical probability and how they power data science inference.

March 19, 2026 • Statistics

CV

Curriculum Vitae

Download Resume

Contact

Get in Touch

I am a recent MS Data Science graduate actively looking for full-time roles in Data Analytics, Machine Learning, and Computer Vision. Whether you have an opportunity or a collaboration idea, or just want to connect, I'd love to hear from you!

Email

hi.surajthapa@gmail.com

Location

Connecticut, USA

LinkedIn

linkedin.com/in/mgrsuraz
