Md Moniruzzaman

Stony Brook University · New York 11794 · USA · mmoniruzzama@cs.stonybrook.edu

I am currently a Ph.D. student in the Department of Computer Science at Stony Brook University (SBU), where I am a Research Assistant in the Computer Vision and Biomedical Imaging lab under the supervision of Dr. Zhaozheng Yin. During my Ph.D., I was a Research Intern at Peloton Interactive Inc., where I collaborated with Peloton’s machine learning team on the Repetitive Action Counting project. Before SBU, I received my Bachelor of Science (B.Sc.) degree in Electronics and Communication Engineering from Khulna University of Engineering & Technology, Bangladesh.

Research Interests: I am interested in designing and implementing machine learning and deep learning algorithms to solve computer vision problems, such as image and video classification. In particular, I work on human action analysis from cameras and wearable sensors: human action recognition, temporal action localization, future action anticipation, human pose estimation, online action detection, and repetitive action counting. During my Ph.D., my work has been accepted at top-tier journals and conferences such as IEEE-TMM, IEEE-TCSVT, IEEE-JBHI, and ACMMM.


Research Projects

Human Action Recognition

Our Proposal: Human Action Recognition by Discriminative Feature Pooling and Video Segment Attention Model (IEEE-TMM 2021)

Innovation: We introduce a simple yet effective network that embeds a novel Discriminative Feature Pooling (DFP) mechanism and a novel Video Segment Attention Model (VSAM), for video-based human action recognition from both trimmed and untrimmed videos. Our DFP module emphasizes the most critical spatial, temporal, and channel-wise features related to the actions within a video segment, while our VSAM emphasizes the most representative features across the video segments.

This research is supported by the National Science Foundation (NSF)

Temporal Action Localization

Our Proposal: Feature Weakening, Contextualization, and Discrimination for Weakly Supervised Temporal Action Localization (IEEE-TMM 2023)

Innovation: (1) We develop a Feature Weakening (FW) module for action completeness modeling; (2) We introduce a Feature Contextualization (FC) module to generate more representative contextualized features; and (3) We propose a Feature Discrimination (FD) module to highlight the most discriminative video segments/classes corresponding to each class/segment, respectively.

This research is supported by the National Science Foundation (NSF)

Our Proposal: Collaborative Foreground, Background, and Action Modeling Network for Weakly Supervised Temporal Action Localization (IEEE-TCSVT 2023)

Innovation: We introduce a novel Collaborative Foreground, Background, and Action Modeling Network (FBA-Net) that consists of a foreground modeling (FM) branch, a background modeling (BM) branch, and a class-specific action and background modeling (CM) branch. These branches collaborate with each other to localize both the discriminative and ambiguous action frames and suppress both the discriminative and ambiguous background frames.

This research is supported by the National Science Foundation (NSF)

Our Proposal: Action Completeness Modeling with Background Aware Networks for Weakly-Supervised Temporal Action Localization (ACMMM 2020)

Innovation: (1) We design a novel Background Aware Network (BANet) to suppress both highly discriminative and ambiguous background frames to significantly reduce the false positive rate; and (2) We propose an action completeness modeling framework that contains multiple BANets, where the BANets are forced to localize different but complementary action instances in both highly discriminative and ambiguous action frames.

This research is supported by the National Science Foundation (NSF)

Future Action Anticipation

Our Proposal: Jointly-Learnt Networks for Future Action Anticipation via Self-Knowledge Distillation and Cycle Consistency (IEEE-TCSVT 2022)

Innovation: (a) We propose a Jointly-learnt Action Anticipation Network (J-AAN) that anticipates future actions from the observed past actions in both direct and recursive ways; (b) We utilize a self-knowledge distillation mechanism to train the J-AAN, where the J-AAN gradually distills its own knowledge during training to soften the hard labels and thereby model the uncertainty of future action anticipation; and (c) We design forward and backward J-AANs with cycle consistency, where the backward J-AAN evaluates how well the forward J-AAN anticipates future actions by anticipating past actions from the anticipated future actions.
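As a rough illustration of the label-softening idea in (b), the sketch below blends a one-hot label with the model's own predicted distribution. It is a minimal sketch under stated assumptions, not the published J-AAN formulation: the blending rule, the weight `alpha`, and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soften_label(one_hot, own_probs, alpha):
    """Blend the hard label with the model's own prediction.

    alpha = 0 keeps the hard label; larger alpha trusts the model more.
    (Assumed blending rule, chosen for illustration only.)
    """
    return [(1.0 - alpha) * h + alpha * p for h, p in zip(one_hot, own_probs)]


def cross_entropy(target, probs, eps=1e-12):
    """Cross-entropy between a (soft) target and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))


# Toy example: three candidate future actions, ground truth is class 0.
logits = [2.0, 1.0, 0.1]
one_hot = [1.0, 0.0, 0.0]
probs = softmax(logits)
soft = soften_label(one_hot, probs, alpha=0.3)
loss = cross_entropy(soft, probs)
```

In a training loop, `alpha` would typically start near zero and grow as the model's predictions become more reliable; that schedule is likewise an assumption here.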

This research is supported by the National Science Foundation (NSF)

Human Pose Reconstruction and Prediction

Our Proposal: Wearable Motion Capture: Reconstructing and Predicting 3D Human Poses from Wearable Sensors (IEEE-JBHI 2023)

Innovation: We introduce a novel Attention-Oriented Recurrent Neural Network (AttRNet) that contains a sensor-wise attention-oriented recurrent encoder, a reconstruction module, and a dynamic temporal attention-oriented recurrent decoder, to reconstruct the 3D human pose over time and predict the 3D human poses at the following time steps from the wearable IMU sensors and wearable cameras.

This research is supported by the National Science Foundation (NSF)

Fire and Traffic Accident Scene Classification

Our Proposal: Spatial Attention Mechanism for Weakly Supervised Fire and Traffic Accident Scene Classification (SMARTCOMP 2019)

Innovation: We introduce a simple yet effective framework that integrates the convolutional feature maps of deep Convolutional Neural Networks with a spatial attention mechanism for fire and traffic accident scene classification. In addition to image-based traffic scene classification, the model is also applied to a set of collected videos for real-world applications.
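The general idea of spatial attention over convolutional feature maps can be sketched as follows: each spatial location receives a score, the scores are normalized with a softmax, and the feature map is pooled with those weights. This is a generic sketch and not the exact model from the paper; the linear scoring vector `w` and the function name are hypothetical.

```python
import math


def spatial_attention_pool(feature_map, w):
    """Attention-weighted pooling of an H x W x C feature map.

    feature_map: nested lists of shape H x W x C.
    w: length-C scoring vector (hypothetical; learned in a real model).
    Returns the pooled C-dimensional feature and the attention weights
    over the H*W locations in row-major order.
    """
    locations = [cell for row in feature_map for cell in row]
    # One scalar score per spatial location.
    scores = [sum(wi * xi for wi, xi in zip(w, cell)) for cell in locations]
    # Softmax across locations so the weights sum to one.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of location features, channel by channel.
    pooled = [
        sum(a * cell[c] for a, cell in zip(weights, locations))
        for c in range(len(w))
    ]
    return pooled, weights


# Toy 2 x 2 feature map with 2 channels; one location has a strong response.
feature_map = [
    [[0.1, 0.0], [0.2, 0.1]],
    [[3.0, 2.0], [0.0, 0.3]],
]
w = [1.0, 1.0]
pooled, weights = spatial_attention_pool(feature_map, w)
```

Because the attention weights are trained only through the image-level classification loss, they can indicate where the fire or accident evidence lies without region-level annotation, which is what makes such a setup weakly supervised.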

This research is supported by the Mid-America Transportation Center (MATC) and the Intelligent Systems Center (ISC) at Missouri University of Science and Technology.

Publications

Publications during Ph.D.:

1. Md Moniruzzaman, Zhaozheng Yin, Md Sanzid Bin Hossain, Hwan Choi, and Zhishan Guo. “Wearable Motion Capture: Reconstructing and Predicting 3D Human Poses from Wearable Sensors.” IEEE Journal of Biomedical and Health Informatics (IEEE-JBHI), 2023.

2. Md Moniruzzaman and Zhaozheng Yin. “Feature Weakening, Contextualization, and Discrimination for Weakly Supervised Temporal Action Localization.” IEEE Transactions on Multimedia (IEEE-TMM), Accepted, 2023. (Impact Factor: 8.182)

3. Md Moniruzzaman and Zhaozheng Yin. “Collaborative Foreground, Background, and Action Modeling Network for Weakly Supervised Temporal Action Localization.” IEEE Transactions on Circuits and Systems for Video Technology (IEEE-TCSVT), Accepted, 2023. (Impact Factor: 5.859)

4. Md Moniruzzaman, Zhaozheng Yin, Zhihai He, Ruwen Qin, and Ming C Leu. “Jointly-Learnt Networks for Future Action Anticipation via Self-Knowledge Distillation and Cycle Consistency.” IEEE Transactions on Circuits and Systems for Video Technology (IEEE-TCSVT), 2022. (Impact Factor: 5.859)

5. Md Moniruzzaman, Zhaozheng Yin, Zhihai He, Ruwen Qin, and Ming C Leu. “Human Action Recognition by Discriminative Feature Pooling and Video Segment Attention Model.” IEEE Transactions on Multimedia (IEEE-TMM), 2021. (Impact Factor: 8.182)

6. Md Al-Amin, Ruwen Qin, Md Moniruzzaman, Zhaozheng Yin, Wenjin Tao, and Ming C Leu. “An individualized system of skeletal data-based CNN classifiers for action recognition in manufacturing assembly.” Journal of Intelligent Manufacturing, 2021.

7. Wenjin Tao, Haodong Chen, Md Moniruzzaman, Ming C. Leu, Zhaozheng Yin, and Ruwen Qin. "Attention-based sensor fusion for human activity recognition using IMU signals." arXiv preprint arXiv:2112.11224, 2021.

8. Md Moniruzzaman, Zhaozheng Yin, Zhihai He, Ruwen Qin, and Ming C Leu. “Action Completeness Modeling with Background Aware Networks for Weakly-Supervised Temporal Action Localization.” In Proceedings of the 28th ACM International Conference on Multimedia (ACM-MM), 2020.

9. Md Moniruzzaman, Zhaozheng Yin, and Ruwen Qin. “Spatial Attention Mechanism for Weakly Supervised Fire and Traffic Accident Scene Classification.” In IEEE International Conference on Smart Computing, 2019.

Publications before Ph.D.:

10. Md Moniruzzaman, Md Abul Kayum Hawlader, and Md Foisal Hossain. "Wavelet based watermarking approach of hiding patient information in medical image for medical image authentication." In 17th International Conference on Computer and Information Technology (ICCIT), 2014.

11. Md Moniruzzaman, Md Abul Kayum Hawlader, and Md Foisal Hossain. "An image fragile watermarking scheme based on chaotic system for image tamper detection." In International Conference on Informatics, Electronics & Vision (ICIEV), 2014.

12. Md Moniruzzaman, Md Shafuzzaman, and Md Foisal Hossain. "Brightness preserving Bi-histogram equalization using edge pixels information." In International Conference on Electrical Information and Communication Technology (EICT), 2014.

13. Md Moniruzzaman, Md Abul Kayum Hawlader, and Md Foisal Hossain. "Robust RGB color image watermarking scheme based on DWT-SVD and chaotic system." In The 8th International Conference on Software, Knowledge, Information Management and Applications (SKIMA), 2014.

14. Md Moniruzzaman, Md Abul Kayum Hawlader, and Md Foisal Hossain. "Watermarking scheme based on game of life cellular automaton." In International Conference on Informatics, Electronics & Vision (ICIEV), 2014.

15. Md Moniruzzaman, Md Abul Kayum Hawlader, Md Foisal Hossain, and Mohd Abdur Rashid. "SVD and chaotic system based watermarking approach for recovering crime scene image." In International Conference on Electrical and Computer Engineering, 2014.


Coursework

- Analysis of Algorithms

- Advanced Topics in Data Mining

- Applied Spatial and Temporal Data Analysis

- Clustering Algorithm

- Computing with Logic

- Deep Learning Neural Networks

- Introduction to Machine Learning

- Introduction to Computer Vision

- Introduction to Deep Learning

- Machine Learning in Computer Vision

- Theory of Database Systems


Awards & Certifications

  • Academic Achievement Award, Department of Computer Science, Missouri University of Science and Technology, 2019.
  • NSF Travel Grant Award in IEEE International Conference on Smart Computing, 2019.
  • Academic Achievement Award, Department of Computer Science, Missouri University of Science and Technology, 2018.
  • 3rd Place - Best Poster Award, Poster Competition, Intelligent Systems Center (ISC), 2018.
  • 2nd Place - Best Paper Award, Graduate Research Symposium, Intelligent Systems Center (ISC), 2018.