Sanjay Jyoti Dutta

I am a Ph.D. Research Scholar and a Research Assistant specializing in Computer Vision under the supervision of Prof Reyer Zwiggelaar and Prof Tossapon Boongoen at Aberystwyth University, United Kingdom. I am passionate about learning machine learning algorithms and solving practical challenges in their application to real-world problems.

My research concentrates on the use of machine learning (especially Deep Learning) methods for activity recognition from videos. I propose a deep learning architecture for egocentric video that integrates both action appearance and motion within a single model. The architecture is developed and evaluated on publicly available data.

In addition, I had the opportunity to collaborate with the Life Sciences and Veterinary Sciences departments at my university on several machine learning projects. This interdisciplinary work involved developing and applying advanced machine learning techniques to solve complex problems in biological and veterinary sciences, enhancing research outcomes and fostering innovation at the intersection of technology and life sciences.


Email  /  Google Scholar  /  ResearchGate  /  LinkedIn  /  GitHub  /  X (Twitter)  /  Medium

ORCID iD: https://orcid.org/0009-0007-6149-789X

Visit my fun projects


Research

I'm interested in Computer Vision, Deep Learning, Machine Learning, Image Processing and Video Analysis.

Human Activity Recognition, ongoing, 2024
Publication

Will be updated.

Sheep Lambing Detection, ongoing, 2024
Publication

Will be updated.

Noise Profiling for ANNs: A Bio-inspired Approach, 2023
Publication

We propose a novel approach to noise profiling for artificial neural networks (ANNs), inspired by the sensory systems of insects. The approach uses both Gaussian and chaotic noise to enhance the adaptability, learning, and generalization capabilities of ANNs.

Virtual Lab phase II (Integration and maintenance) (2015-2017)
Project page / Publication

Indian Institute of Technology Guwahati is one of the contributors to Virtual Labs. Virtual laboratories are an essential part of e-learning because not all students have sufficient lab facilities at their institutes, and the experiments can be accessed from anywhere at any time. The Ministry of Human Resource Development (MHRD), Govt. of India, therefore took the initiative to integrate virtual laboratories under the National Mission on Education through Information and Communication Technology (NME-ICT). The aim of the integration is to bring all developed projects into an open-source repository, so that the lab content is available to the community, students, and academic institutes for use and further development, and to convert all licensed content to a platform that does not depend on any licensed software.

Remote Triggered Digital System Laboratory (2012-2015)
Project page / Publication

We developed this project at the Indian Institute of Technology Guwahati. The Ministry of Human Resource Development (MHRD), Govt. of India, initiated the Remote Triggered Digital System Laboratory under the National Mission on Education through Information and Communication Technology (NME-ICT). This virtual laboratory helps students understand the theory of digital electronics by performing various experiments.

Human Resource Management System
Publication

The Human Resource Management System is a Java-based (J2EE) system that automates HR processes over an intranet. The paper is based on a project that supports the overall management of a company's employees. The proposed system contains all the information regarding employees in the company and is designed around good interaction and communication between the HR administrator and the employees.

Attended Conferences and Workshops
  • Postgraduate Research Conference 2024, Aberystwyth University, United Kingdom, April 2024.
  • The AI Research Hub Symposium, Aberystwyth University, United Kingdom, September 2023.
  • The 22nd UK Workshop on Computational Intelligence, Aston University, United Kingdom, September 2023.
  • Faculty of Business and Physical Sciences Postgraduate Research Conference 2023, Aberystwyth University, United Kingdom, July 2023.
  • 1st AI Summer School for Beginners, Aberystwyth University, United Kingdom, August 2022.
  • 5th Summer School on Artificial Intelligence, Indian Institute of Information Technology Hyderabad, India, August 2021.
  • IEEE 5th International Conference for Convergence in Technology, Pune, India, March 2019.
  • IEEE Fourteenth International Conference on Information Processing (IcInPro), Bangalore, December 2018.
  • IEEE 3rd International Conference for Convergence in Technology (I2CT), Pune, India, April 2018.
  • MATLAB workshop in Reflux 7.0, Indian Institute of Technology, Guwahati, India, 2019.
  • Workshop on Data Science in Financial Technology, Research Conclave, Indian Institute of Technology, Guwahati, India, 2019.
  • Workshop on Machine Learning, Research Conclave, Indian Institute of Technology, Guwahati, India, 2019.
  • Poster presentation in Research Conclave, Indian Institute of Technology, Guwahati, India, 2017.
  • A one-day workshop on virtual laboratory jointly organized by Assam Engineering College and Indian Institute of Technology, Guwahati, India, February 2017.
  • A one-day workshop on virtual laboratory jointly organized by NIT Meghalaya, Shillong and Indian Institute of Technology, Guwahati, India, November 2016.
  • A one-day workshop on virtual laboratory jointly organized by Central Institute of Technology, Kokrajhar and Indian Institute of Technology, Guwahati, India, August 2016.
  • Virtual Labs Summer Sprint Integration workshop, Indian Institute of Technology, Guwahati, India, 2015.
  • First Integration workshop, International Institute of Information Technology Hyderabad, India, 2014.
  • Worked as a volunteer, 33rd Foundations of Software Technology and Theoretical Computer Science, Indian Institute of Technology, Guwahati, India, 2013.

Fun Projects

The following is a collection of practice projects that sparked my interest in further exploration.

Tomato Leaf Disease Detection
Github

The Tomato Leaf Disease Detection dataset, available on Kaggle, comprises over 20,000 labeled images across 10 classes, facilitating the classification of various tomato leaf diseases. This extensive collection supports the development and evaluation of machine learning models aimed at accurately identifying and diagnosing tomato plant ailments, thereby contributing to advancements in agricultural disease management.

Video Classification using Vision Transformer (ViT)
Github

This project aimed to classify videos using a Vision Transformer (ViT) model applied to the UCF50 dataset, which contained videos of 50 different action classes. The process began with extracting and preprocessing frames from videos, resizing, and normalizing them to create a consistent dataset. The extracted features and labels were saved as .npy files for future use in training. The Vision Transformer model was then designed with custom layers for tubelet embedding and positional encoding to handle the spatial-temporal information in the video frames.

Image Classification using Vision Transformer (ViT)
Github

The Vision Transformer (ViT) stands out as a groundbreaking architecture that redefines how we approach computer vision tasks. By applying the Transformer model to images, ViT processes each image as a sequence of patches, offering a compelling alternative to traditional Convolutional Neural Networks (CNNs). We walked through a TensorFlow implementation of ViT for classifying flower images.
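
As a rough illustration of the "sequence of patches" idea, and not the code from the project itself, the sketch below splits a small grayscale image into flattened, non-overlapping patches in plain Python. The function name `image_to_patches` and the toy 4x4 image are assumptions made for this example. In an actual ViT, each patch would additionally be linearly projected and combined with a positional embedding.

```python
def image_to_patches(image, patch_size):
    """Split an H x W image (list of rows) into a sequence of flattened patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    # Walk over the image in row-major order, one patch at a time.
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[top + dy][left + dx]
                for dy in range(patch_size)
                for dx in range(patch_size)
            ]
            patches.append(patch)
    return patches


# Toy 4x4 "image" whose pixel values are simply 0..15 (illustrative data only).
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = image_to_patches(image, patch_size=2)
print(patches)
```

With a 4x4 image and 2x2 patches, the result is a sequence of four patches of four values each, which is the kind of token sequence a Transformer consumes.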

Human Action Recognition Using Detectron2 and LSTM
Github

The goal was to combine Detectron2 for pose estimation with an LSTM for action classification to build a human action recognition system.

Training an SVM Regressor on the California Housing Dataset
Github

This code tunes the hyperparameters for an SVM regressor using the California housing dataset and evaluates the model's performance using Root Mean Squared Error (RMSE).

Train & Fine-Tune a Decision Tree on the Moons Dataset
Github

The objective was to train a Decision Tree on the moons dataset, fine-tune it using grid search, evaluate its performance, and visualize the resulting decision boundary.

Regression using a Decision Tree
Github

The goal was to perform regression using a Decision Tree and to visualize the dataset and the resulting tree structure, providing a clear understanding of the model's decision-making process.

Train and Visualize a Decision Tree
Github

The primary goal was to train and visualize a Decision Tree classifier using the Iris dataset. Decision Trees are intuitive and powerful models for classification and regression tasks. By visualizing the Decision Tree, you can gain insights into how the model makes decisions based on the features of the dataset.

Comparing LinearSVC, SVC and SGDClassifier
Github

The goal was to generate a linearly separable dataset, train a LinearSVC, an SVC with a linear kernel, and an SGDClassifier, and plot their decision boundaries to see whether they produce roughly the same model.

Implement Batch Gradient Descent with early stopping for Softmax Regression (without using Scikit-Learn)
Github

The goal was to implement Batch Gradient Descent with early stopping for Softmax Regression. The early stopping mechanism monitors the cross-entropy loss and stops training if the validation error does not improve for a specified number of epochs (patience).
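
The sketch below illustrates the idea in plain Python, under stated assumptions: it is not the project's code, the function names (`softmax`, `predict_proba`, `cross_entropy`, `fit_softmax`) and the tiny three-class dataset are made up for this example, and the hyperparameter values are arbitrary. Each epoch performs one full-batch gradient step, then checks the validation cross-entropy and stops once it has failed to improve for `patience` consecutive epochs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """weights[k] holds [bias, w_1, ..., w_d] for class k."""
    return softmax([w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)) for w in weights])


def cross_entropy(weights, X, y):
    """Mean cross-entropy loss over a dataset."""
    return -sum(math.log(predict_proba(weights, x)[label]) for x, label in zip(X, y)) / len(X)


def fit_softmax(X_train, y_train, X_val, y_val, n_classes, lr=0.1, max_epochs=500, patience=10):
    n_features = len(X_train[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    best_loss, best_weights, stale_epochs = float("inf"), None, 0
    for epoch in range(max_epochs):
        # Batch gradient: accumulate (p - y) * x over the whole training set.
        grads = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for x, label in zip(X_train, y_train):
            probs = predict_proba(weights, x)
            for k in range(n_classes):
                error = probs[k] - (1.0 if k == label else 0.0)
                grads[k][0] += error
                for j, xj in enumerate(x):
                    grads[k][j + 1] += error * xj
        for k in range(n_classes):
            for j in range(n_features + 1):
                weights[k][j] -= lr * grads[k][j] / len(X_train)
        # Early stopping: keep the best weights seen on the validation set.
        val_loss = cross_entropy(weights, X_val, y_val)
        if val_loss < best_loss:
            best_loss, stale_epochs = val_loss, 0
            best_weights = [row[:] for row in weights]
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    return best_weights, best_loss, epoch + 1


# Illustrative toy data: three well-separated classes in 2D.
X_train = [[2, 0], [3, 0], [0, 2], [0, 3], [-2, -2], [-3, -3]]
y_train = [0, 0, 1, 1, 2, 2]
X_val = [[2.5, 0], [0, 2.5], [-2.5, -2.5]]
y_val = [0, 1, 2]

best_weights, best_loss, epochs_run = fit_softmax(X_train, y_train, X_val, y_val, n_classes=3)
print(f"best validation loss {best_loss:.4f} after {epochs_run} epochs")
```

Returning the weights from the best validation epoch, rather than the final ones, is one common design choice; on clean data like this toy set the validation loss may keep improving, in which case training simply runs to `max_epochs`.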

Titanic Dataset with Machine Learning (Kaggle Challenge!)
Github

The Titanic dataset is a classic machine learning problem. It provides information about the passengers aboard the Titanic, and the goal is to predict whether a passenger survived or not based on various features such as age, gender, class, and more. This project is an excellent introduction to data cleaning, feature engineering, and model building. We prepared the data for training and then trained a RandomForestClassifier.

Data Augmentation Using KNeighborsClassifier
Github

In this project, we performed MNIST classification with data augmentation using KNeighborsClassifier.

KNeighborsClassifier for MNIST
Github

In this project, we built a machine learning classifier for the MNIST dataset, aiming to achieve over 97% accuracy on the test set. The MNIST dataset is a classic dataset in the field of machine learning, consisting of 70,000 images of handwritten digits (0–9). Each image is 28x28 pixels, and the task is to classify each image into the corresponding digit.

Football Player Segmentation with U-Net
Github

In this project, we used the TensorFlow library to perform semantic image segmentation, whose goal is to assign each pixel in an image to a class or object. We analyzed football (soccer) player positions on the field. The dataset contains 512 images, together with a JSON file containing image information.

Training a DCGAN in PyTorch
Github

We trained a DCGAN model in PyTorch to generate images.

PyTorch: Transfer Learning and Image Classification
Github

Here, the goal was to perform transfer learning for image classification using the PyTorch deep learning library.

PyTorch object detection with pre-trained networks
Github

Here, we used PyTorch to detect objects in input images using seminal, state-of-the-art object detection networks, including Faster R-CNN with ResNet, Faster R-CNN with MobileNet, and RetinaNet. We also performed real-time object detection in video streams.

PyTorch image classification with pre-trained networks
Github

Here we used PyTorch to classify input images using seminal, state-of-the-art image classification networks, including VGG, Inception, DenseNet, and ResNet.

Regression with CNNs
Github

In this project, the goal was to train a Convolutional Neural Network (CNN) for regression prediction with Keras and then train a CNN to predict house prices from a set of images.

PyTorch: Training your first Convolutional Neural Network (CNN)
Github

Here, a Convolutional Neural Network (CNN) was developed using the PyTorch deep learning library. The network is trained to recognize handwritten Hiragana characters.

Fashion MNIST with Keras and Deep Learning
Github

The objective of the project was to create a deep learning model to classify images of clothing from the Fashion MNIST dataset. The Fashion MNIST dataset is a collection of grayscale images of 10 different categories of clothing and accessories, like T-shirts, trousers, pullovers, dresses, coats, sandals, shirts, sneakers, bags, and ankle boots.

Smile detection with OpenCV, Keras, and TensorFlow
Github

This project used a Haar cascade face detector to extract the face region of interest (ROI) from the image and then passed the ROI through LeNet for smile detection.

Breaking captchas with deep learning, Keras, and TensorFlow
Github

This project demonstrated how to use deep learning techniques, specifically with frameworks like Keras and TensorFlow, to automatically solve CAPTCHA challenges.

Use Checkpoint Strategies with Keras and TensorFlow
Github

This project aimed to use early stopping and model checkpointing when training Keras models.

ImageNet: VGGNet, ResNet, Inception, and Xception with Keras
Github

The project was designed to classify an image by identifying the main subject in the image, leveraging pre-trained deep learning models available through TensorFlow's Keras library. It accepts an image file and a model name as input parameters. The script supports various state-of-the-art image classification models like VGG16, VGG19, ResNet50, InceptionV3, and Xception, which have been trained on the ImageNet dataset.

Visualize network architecture
Github

The goal was to visualize network architecture using Keras and TensorFlow.

MiniVGGNet Implementation
Github

The aim was to implement MiniVGGNet for the CIFAR-10 dataset.

LeNet: Recognizing Handwritten Digits
Github

The goal was to build, train, evaluate, and plot the performance of a convolutional neural network (LeNet) for digit classification on the MNIST dataset.

First Deep Learning Project in Python
Github

The goal was to create a first deep learning neural network model in Python using Keras. We started by loading and preparing our dataset, followed by defining and compiling a Keras neural network model. We trained the model on our data, evaluated its performance, and then used it to make predictions on new data. We used the Pima Indians onset of diabetes dataset.

Implementing Convolutions with Python
Github

We explored hands-on code that illustrates how to implement and apply convolution operations and kernels to images. This insight aided in understanding the internal workings of Convolutional Neural Networks (CNNs) during their training phase.
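
For readers unfamiliar with the operation, here is a minimal plain-Python sketch of a "valid" 2D convolution. It is an illustration written for this page rather than the project's code, and the function name `convolve2d`, the toy image, and the choice of a Sobel kernel are assumptions made for the example. Note that the kernel is flipped before sliding it over the image; deep learning libraries typically skip the flip and compute cross-correlation instead.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution of a grayscale image (list of rows) with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flipping the kernel in both axes is what distinguishes
    # convolution from cross-correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    output = []
    for top in range(ih - kh + 1):
        out_row = []
        for left in range(iw - kw + 1):
            total = 0
            # Element-wise multiply the window by the kernel and sum.
            for dy in range(kh):
                for dx in range(kw):
                    total += image[top + dy][left + dx] * flipped[dy][dx]
            out_row.append(total)
        output.append(out_row)
    return output


# Toy image with a vertical edge between the dark left and bright right half.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
# Sobel kernel that responds to horizontal intensity changes.
sobel_x = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]
edges = convolve2d(image, sobel_x)
print(edges)
```

Every output position here overlaps the edge, so all responses have the same non-zero magnitude; on a constant image the same kernel would return zeros.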

Backpropagation from Scratch with Python
Github

Mastering Backpropagation: A Step-by-Step Guide to Implementing it with Python

Perceptron Neural Network
Github

The project demonstrated how a perceptron model could learn bitwise operations through a basic machine learning process involving training with input features and corresponding labels, followed by testing to evaluate the model's predictions.
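
A minimal sketch of this idea, assuming the classic perceptron update rule with a step activation, is shown below. It is not the project's code; the function names, the integer learning rate, and the epoch count are choices made for this illustration. A single perceptron can learn AND and OR because they are linearly separable, whereas XOR is not, so no single perceptron can fit it.

```python
def train_perceptron(samples, labels, lr=1, epochs=20):
    """Train a single perceptron with the classic perceptron update rule."""
    weights = [0] * len(samples[0])
    bias = 0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            prediction = predict(weights, bias, x)
            # error is +1, 0, or -1; weights only change on a mistake.
            error = target - prediction
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Step activation: output 1 when the weighted sum is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
truth_tables = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],  # not linearly separable
}

results = {}
for name, labels in truth_tables.items():
    weights, bias = train_perceptron(inputs, labels)
    results[name] = [predict(weights, bias, x) for x in inputs]
    print(name, "learned" if results[name] == labels else "not learned", results[name])
```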

Pedestrian Detection with 4 Different Computer Vision Techniques
Github

This project explores pedestrian detection using four different computer vision techniques.
Method 1: Background Subtraction + Contour Extraction
Method 2: Haar Cascades (Viola-Jones Classifiers)
Method 3: Histogram of Oriented Gradients (HOG) and Support Vector Machine (SVM)
Method 4: Single Shot Detector (SSD) with MobileNet

Object Detection in a Video
Github

This project showcased the implementation of Haar Cascade classifiers for object detection in video streams. Haar Cascades are a popular method for object detection due to their efficiency and effectiveness, particularly in detecting faces and other predefined objects. Using OpenCV, this project demonstrates how to apply Haar Cascades to real-time video data to identify and track objects.

Hand Gesture Recognition
Github

This project focused on counting fingers in a real-time video using OpenCV.

Smile Detection
Github

The Smile Detection project aimed to identify smiles in real-time video feeds, using a facial landmark detector to accurately determine the presence of a smile.

Face Detection
Github

The Face Detection project aimed to identify and locate human faces within a digital image utilizing Haar Cascades.

OpenCV Basics
Github

This project series provides an essential overview of computer vision techniques using OpenCV. It begins with fundamental image operations (loading, displaying, and pixel manipulation), then advances to drawing, translation, rotation, resizing, flipping, and cropping. Additionally, it explores arithmetic operations, bitwise manipulations, masking, and channel manipulation. Accompanied by downloadable source code for each tutorial, this series offers a practical and efficient way to grasp the key functionalities of OpenCV, making it perfect for beginners eager to learn quickly.
