Sanjay Jyoti Dutta

Fun Projects

Following are a collection of practice projects, which sparks my interest in further explorations.

Tomato Leaf Disease Detection
Github

The Tomato Leaf Disease Detection dataset, available on Kaggle, comprises over 20,000 labeled images across 10 classes, facilitating the classification of various tomato leaf diseases. This extensive collection supports the development and evaluation of machine learning models aimed at accurately identifying and diagnosing tomato plant ailments, thereby contributing to advancements in agricultural disease management.

Video Classification using Vision Transformer (ViT)
Github

This project aimed to classify videos using a Vision Transformer (ViT) model applied to the UCF50 dataset, which contained videos of 50 different action classes. The process began with extracting and preprocessing frames from videos, resizing, and normalizing them to create a consistent dataset. The extracted features and labels were saved as .npy files for future use in training. The Vision Transformer model was then designed with custom layers for tubelet embedding and positional encoding to handle the spatial-temporal information in the video frames.

Image Classification using Vision Transformer (ViT)
Github

The Vision Transformer (ViT) stands out as a groundbreaking architecture that redefines how we approach computer vision tasks. By leveraging the Transformer model using TensorFlow, ViT processes images as sequences of patches, offering a compelling alternative to traditional Convolutional Neural Networks (CNNs). We walked through the implementation of ViT for classifying flower images.

Human Action Recognition Using Detectron2 and LSTM
Github

The goal was to combine Detectron2 for pose estimation and LSTM for action classification and we have built a powerful human action recognition system.

Training an SVM Regressor on the California Housing Dataset
Github

This code tunes the hyperparameters for an SVM regressor using the California housing dataset and evaluates the model's performance using Root Mean Squared Error (RMSE).

Train & Fine-Tune a Decision Tree on the Moons Dataset
Github

The objectiv was to train a Decision Tree on the moons dataset, fine-tune it using grid search, evaluate its performance, and visualize the resulting decision boundary.

Regression using a Decision Tree
Github

The goal was to perform regression using Decision Trees, visualize the dataset and the resulting tree structure, providing a clear understanding of the model's decision-making process.

Train and Visualize a Decision Tree
Github

The primary goal was to train and visualize a Decision Tree classifier using the Iris dataset. Decision Trees are intuitive and powerful models for classification and regression tasks. By visualizing the Decision Tree, you can gain insights into how the model makes decisions based on the features of the dataset.

Comparing LinearSVC, SVC and SGDClassifier
Github

The goal was to generates a linearly separable dataset, trains a LinearSVC, SVC with a linear kernel and SGDClassifier, and plots their decision boundaries to see if they produce roughly the same model.

Implement Batch Gradient Descent with early stopping for Softmax Regression (without using Scikit-Learn)
Github

The goal was to implement Batch Gradient Descent with early stopping for Softmax Regression. The early stopping mechanism monitors the cross-entropy loss and stops training if the validation error does not improve for a specified number of epochs (patience).

Titanic Dataset with Machine Learning (Kaggle Challenge!)
Github

The Titanic dataset is a classic machine learning problem. It provides information about the passengers aboard the Titanic, and the goal is to predict whether a passenger survived or not based on various features such as age, gender, class, and more. This project is an excellent introduction to data cleaning, feature engineering, and model building. We prepared the data for training and then train a RandomForestClassifier.

Data Augmentation Using KNeighborsClassifier
Github

In this project, we have done MNIST Classification with Data Augmentation Using KNeighborsClassifier

KNeighborsClassifier for MNIST
Github

In this project, we built a machine learning classifier for the MNIST dataset, aiming to achieve over 97% accuracy on the test set. The MNIST dataset is a classic dataset in the field of machine learning, consisting of 70,000 images of handwritten digits (0–9). Each image is 28x28 pixels, and the task is to classify each image into the corresponding digit.

Football Player Segmentation with U-Net
Github

In this project using TensorFlow library, we analyzed images using semantic image segmentation. Semantic segmentation's goal is to categorize each pixel in an image into a class or object. We analyzed football (or soccer) player positions on the field. There are 512 images in the set, together with JSON file containing image information.

Training a DCGAN in PyTorch
Github

We worked on how to train DCGAN Model using PyTorch to generate images.

PyTorch: Transfer Learning and Image Classification
Github

Here, the goal was to perform transfer learning for image classification using the PyTorch deep learning library.

PyTorch object detection with pre-trained networks
Github

Here, we used PyTorch to detect objects in input images using seminal, state-of-the-art image classification networks, including Faster R-CNN with ResNet, Faster R-CNN with MobileNet, and RetinaNet. Also performed real-time object detection in video streams.

PyTorch image classification with pre-trained networks
Github

Here we used PyTorch to classify input images using seminal, state-of-the-art image classification networks, including VGG, Inception, DenseNet, and ResNet.

Regression with CNNs
Github

In this project, the goal was to train a Convolutional Neural Network (CNN) for regression prediction with Keras and then train a CNN to predict house prices from a set of images.

PyTorch: Training your first Convolutional Neural Network (CNN)
Github

Here, A Convolutional Neural Network (CNN) is developed using the PyTorch deep learning library. This network will be able to recognize handwritten Hiragana characters.

Fashion MNIST with Keras and Deep Learning
Github

The objective of the project was to create a deep learning model to classify images of clothing from the Fashion MNIST dataset. The Fashion MNIST dataset is a collection of grayscale images of 10 different categories of clothing and accessories, like T-shirts, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags, and ankle boots.

Smile detection with OpenCV, Keras, and TensorFlow
Github

This project used Haar cascade face detector, extract the face region of interest (ROI) from the image and then pass the ROI through LeNet for smile detection.

Breaking captchas with deep learning, Keras, and TensorFlow
Github

This project demonstrated how to use deep learning techniques, specifically with frameworks like Keras and TensorFlow, to automatically solve CAPTCHA challenges.

Use Checkpoint Strategies with Keras and TensorFlow
Github

This amied to use Early Stopping and Model Checkpointing in training Keras models encapsulates a sophisticated approach to deep learning.

ImageNet: VGGNet, ResNet, Inception, and Xception with Keras
Github

The project was designed to classify an image by identifying the main subject in the image, leveraging pre-trained deep learning models available through TensorFlow's Keras library. It accepts an image file and a model name as input parameters. The script supports various state-of-the-art image classification models like VGG16, VGG19, ResNet50, InceptionV3, and Xception, which have been trained on the ImageNet dataset.

Visualize network architecture
Github

The goal was to visualize network architecture using Keras and TensorFlow.

MiniVGGNet Implementation
Github

The aim was to implement MiniVGGNet to work on CIFAR-10 data set.

LeNet: Recognizing Handwritten Digits
Github

The goal was for building, training, evaluating, and plotting the performance of a convolutional neural network (LeNet) for digit classification on the MNIST dataset.

First Deep Learning Project in Python
Github

The goal was to create the first deep learning neural network model in Python using Keras. Here, we started by loading and preparing our dataset, followed by defining and compiling a Keras neural network model. We trained the model on our data, evaluate its performance, and then use it to make predictions on new data. We used Pima Indians onset of diabetes dataset.

Implementing Convolutions with Python
Github

We explored hands-on code that illustrates how to implement and apply convolution operations and kernels to images. This insight aided in understanding the internal workings of Convolutional Neural Networks (CNNs) during their training phase.

Backpropagation from Scratch with Python
Github

Mastering Backpropagation: A Step-by-Step Guide to Implementing it with Python

Perceptron Neural Network
Github

The project demonstrated how a perceptron model could learn bitwise operations through a basic machine learning process involving training with input features and corresponding labels, followed by testing to evaluate the model's predictions.

Pedestrian Detection with 4 Different Computer Vision Techniques
Github

This project explores pedestrian detection using four different computer vision techniques.
Method 1: Background Subtraction + Contour Extraction
Method 2: Haar Cascades (Viola-Jones Classifiers)
Method 3: Histogram of Oriented Gradients (HOG) and Support Vector Machine (SVM)
Method 4: Single Shot Detector (SSD) with MobileNet

Object Detection in a Video
Github

This project showcased the implementation of Haar Cascade classifiers for object detection in video streams. Haar Cascades are a popular method for object detection due to their efficiency and effectiveness, particularly in detecting faces and other predefined objects. Using OpenCV, this project demonstrates how to apply Haar Cascades to real-time video data to identify and track objects.

Hand Gesture Recognition
Github

This project focused on counting fingers in a real-time video using OpenCV.

Smile Detection
Github

The Smile Detection Project aimed at identifying smiles real-time video feeds using a facial landmark detector to accurately determine the presence of a smile.

Face Detection
Github

The Face Detection project aimed to identify and locate human faces within a digital image utilizing Haar Cascades.

OpenCV Basics
Github

These project series provides an essential overview of computer vision techniques using OpenCV. It begins with fundamental image operations—loading, displaying, and pixel manipulation—then advances to drawing, translation, rotation, resizing, flipping, and cropping. Additionally, it explores arithmetic operations, bitwise manipulations, masking, and channel manipulation. Accompanied by downloadable source code for each tutorial, this series offers a practical and efficient way to grasp the key functionalities of OpenCV, making it perfect for beginners eager to learn quickly.

Research

Fun Projects