Search | arXiv e-print repository

Ego-to-Exo: Interfacing Third Person Visuals from Egocentric Views in Real-time for Improved ROV Teleoperation

Authors: Adnan Abdullah, Ruo Chen, Ioannis Rekleitis, Md Jahidul Islam

Abstract: Underwater ROVs (Remotely Operated Vehicles) are unmanned submersible vehicles designed for exploring and operating in the depths of the ocean. Despite using high-end cameras, typical teleoperation engines based on first-person (egocentric) views limit a surface operator's ability to maneuver the ROV in complex deep-water missions. In this paper, we present an interactive teleoperation interface that enhances the operational capabilities via increased situational awareness. This is accomplished by (i) offering on-demand "third"-person (exocentric) visuals from past egocentric views, and (ii) facilitating enhanced peripheral information with augmented ROV pose information in real-time. We achieve this by integrating a 3D geometry-based Ego-to-Exo view synthesis algorithm into a monocular SLAM system for accurate trajectory estimation. The proposed closed-form solution only uses past egocentric views from the ROV and a SLAM backbone for pose estimation, which makes it portable to existing ROV platforms. Unlike data-driven solutions, it is invariant to applications and waterbody-specific scenes. We validate the geometric accuracy of the proposed framework through extensive experiments of 2-DOF indoor navigation and 6-DOF underwater cave exploration in challenging low-light conditions. A subjective evaluation on 15 human teleoperators further confirms the effectiveness of the integrated features for improved teleoperation.
We demonstrate the benefits of dynamic Ego-to-Exo view generation and real-time pose rendering for remote ROV teleoperation by following navigation guides such as cavelines inside underwater caves. This new way of interactive ROV teleoperation opens up promising opportunities for future research in subsea telerobotics.

Submitted 27 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

Comments: V2, 9 pages

arXiv:2405.12150 [pdf, other]

Bangladeshi Native Vehicle Detection in Wild

Authors: Bipin Saha, Md. Johirul Islam, Shaikh Khaled Mostaque, Aditya Bhowmik, Tapodhir Karmakar Taton, Md. Nakib Hayat Chowdhury, Mamun Bin Ibne Reaz

Abstract: The success of autonomous navigation relies on robust and precise vehicle recognition, hindered by the scarcity of region-specific vehicle detection datasets, impeding the development of context-aware systems. To advance terrestrial object detection research, this paper proposes a native vehicle detection dataset for the most commonly appeared vehicle classes in Bangladesh. 17 distinct vehicle classes have been taken into account, with fully annotated 81542 instances of 17326 images. Each image width is set to at least 1280px. The dataset's average vehicle bounding box-to-image ratio is 4.7036. This Bangladesh Native Vehicle Dataset (BNVD) has accounted for several geographical, illumination, variety of vehicle sizes, and orientations to be more robust on surprised scenarios. In the context of examining the BNVD dataset, this work provides a thorough assessment with four successive You Only Look Once (YOLO) models, namely YOLO v5, v6, v7, and v8. This dataset's effectiveness is methodically evaluated and contrasted with other vehicle datasets already in use. The BNVD dataset exhibits mean average precision (mAP) at 50% intersection over union (IoU) is 0.848 corresponding precision and recall values of 0.841 and 0.774. The research findings indicate a mAP of 0.643 at an IoU range of 0.5 to 0.95. The experiments show that the BNVD dataset serves as a reliable representation of vehicle distribution and presents considerable complexities.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 13 pages, 8 figures

arXiv:2404.11815 [pdf, other]

AquaSonic: Acoustic Manipulation of Underwater Data Center Operations and Resource Management

Authors: Jennifer Sheldon, Weidong Zhu, Adnan Abdullah, Sri Hrushikesh Varma Bhupathiraju, Takeshi Sugawara, Kevin R. B. Butler, Md Jahidul Islam, Sara Rampazzi

Abstract: Underwater datacenters (UDCs) hold promise as next-generation data storage due to their energy efficiency and environmental sustainability benefits. While the natural cooling properties of water save power, the isolated aquatic environment and long-range sound propagation in water create unique vulnerabilities which differ from those of on-land data centers. Our research discovers the unique vulnerabilities of fault-tolerant storage devices, resource allocation software, and distributed file systems to acoustic injection attacks in UDCs. With a realistic testbed approximating UDC server operations, we empirically characterize the capabilities of acoustic injection underwater and find that an attacker can reduce fault-tolerant RAID 5 storage system throughput by 17% up to 100%. Our closed-water analyses reveal that attackers can (i) cause unresponsiveness and automatic node removal in a distributed filesystem with only 2.4 minutes of sustained acoustic injection, (ii) induce a distributed database's latency to increase by up to 92.7% to reduce system reliability, and (iii) induce load-balance managers to redirect up to 74% of resources to a target server to cause overload or force resource colocation. Furthermore, we perform open-water experiments in a lake and find that an attacker can cause controlled throughput degradation at a maximum allowable distance of 6.35 m using a commercial speaker. We also investigate and discuss the effectiveness of standard defenses against acoustic injection attacks.
Finally, we formulate a novel machine learning-based detection system that reaches 0% False Positive Rate and 98.2% True Positive Rate trained on our dataset of profiled hard disk drives under 30-second FIO benchmark execution. With this work, we aim to help manufacturers proactively protect UDCs against acoustic injection attacks and ensure the security of subsea computing infrastructures.

Submitted 7 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: Accepted to IEEE S&P 2024

arXiv:2403.17407 [pdf, other]

Transcribing Bengali Text with Regional Dialects to IPA using District Guided Tokens

Authors: S M Jishanul Islam, Sadia Ahmmed, Sahid Hossain Mustakim

Abstract: Accurate transcription of Bengali text to the International Phonetic Alphabet (IPA) is a challenging task due to the complex phonology of the language and context-dependent sound changes. This challenge is even more for regional Bengali dialects due to unavailability of standardized spelling conventions for these dialects, presence of local and foreign words popular in those regions and phonological diversity across different regions. This paper presents an approach to this sequence-to-sequence problem by introducing the District Guided Tokens (DGT) technique on a new dataset spanning six districts of Bangladesh. The key idea is to provide the model with explicit information about the regional dialect or "district" of the input text before generating the IPA transcription. This is achieved by prepending a district token to the input sequence, effectively guiding the model to understand the unique phonetic patterns associated with each district. The DGT technique is applied to fine-tune several transformer-based models, on this new dataset. Experimental results demonstrate the effectiveness of DGT, with the ByT5 model achieving superior performance over word-based models like mT5, BanglaT5, and umT5. This is attributed to ByT5's ability to handle a high percentage of out-of-vocabulary words in the test set. The proposed approach highlights the importance of incorporating regional dialect information into ubiquitous natural language processing systems for languages with diverse phonological variations.
The following work was a result of the "Bhashamul" challenge, which is dedicated to solving the problem of Bengali text with regional dialects to IPA transcription https://www.kaggle.com/competitions/regipa/. The training and inference notebooks are available through the competition link.

Submitted 2 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: Updated missing references to the dataset and corrected some sentences in sections 1 and 2. This work became the champion of the Bhashamul challenge

ACM Class: F.2.2; I.2.7

arXiv:2401.04340 [pdf, other]

Online Allocation with Replenishable Budgets: Worst Case and Beyond

Authors: Jianyi Yang, Pengfei Li, Mohammad Jaminur Islam, Shaolei Ren

Abstract: This paper studies online resource allocation with replenishable budgets, where budgets can be replenished on top of the initial budget and an agent sequentially chooses online allocation decisions without violating the available budget constraint at each round. We propose a novel online algorithm, called OACP (Opportunistic Allocation with Conservative Pricing), that conservatively adjusts dual variables while opportunistically utilizing available resources. OACP achieves a bounded asymptotic competitive ratio in adversarial settings as the number of decision rounds T gets large. Importantly, the asymptotic competitive ratio of OACP is optimal in the absence of additional assumptions on budget replenishment. To further improve the competitive ratio, we make a mild assumption that there is budget replenishment every T^* >= 1 decision rounds and propose OACP+ to dynamically adjust the total budget assignment for online allocation. Next, we move beyond the worst-case and propose LA-OACP (Learning-Augmented OACP/OACP+), a novel learning-augmented algorithm for online allocation with replenishable budgets. We prove that LA-OACP can improve the average utility compared to OACP/OACP+ when the ML predictor is properly trained, while still offering worst-case utility guarantees when the ML predictions are arbitrarily wrong. Finally, we run simulation studies of sustainable AI inference powered by renewables, validating our analysis and demonstrating the empirical benefits of LA-OACP.

Submitted 8 January, 2024; originally announced January 2024.

Comments: Accepted by ACM SIGMETRICS 2024

arXiv:2309.11038 [pdf, other]

CaveSeg: Deep Semantic Segmentation and Scene Parsing for Autonomous Underwater Cave Exploration

Authors: A. Abdullah, T. Barua, R. Tibbetts, Z. Chen, M. J. Islam, I. Rekleitis

Abstract: In this paper, we present CaveSeg - the first visual learning pipeline for semantic segmentation and scene parsing for AUV navigation inside underwater caves. We address the problem of scarce annotated training data by preparing a comprehensive dataset for semantic segmentation of underwater cave scenes. It contains pixel annotations for important navigation markers (e.g. caveline, arrows), obstacles (e.g. ground plane and overhead layers), scuba divers, and open areas for servoing. Through comprehensive benchmark analyses on cave systems in USA, Mexico, and Spain locations, we demonstrate that robust deep visual models can be developed based on CaveSeg for fast semantic scene parsing of underwater cave environments. In particular, we formulate a novel transformer-based model that is computationally light and offers near real-time execution in addition to achieving state-of-the-art performance. Finally, we explore the design choices and implications of semantic segmentation for visual servoing by AUVs inside underwater caves. The proposed model and benchmark dataset open up promising opportunities for future research in autonomous underwater cave exploration and mapping.

Submitted 10 May, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: Accepted in ICRA 2024. 10 pages, 9 figures

arXiv:2303.03670 [pdf, other]

Weakly Supervised Caveline Detection For AUV Navigation Inside Underwater Caves

Authors: Boxiao Yu, Reagan Tibbetts, Titon Barua, Ailani Morales, Ioannis Rekleitis, Md Jahidul Islam

Abstract: Underwater caves are challenging environments that are crucial for water resource management, and for our understanding of hydro-geology and history. Mapping underwater caves is a time-consuming, labor-intensive, and hazardous operation. For autonomous cave mapping by underwater robots, the major challenge lies in vision-based estimation in the complete absence of ambient light, which results in constantly moving shadows due to the motion of the camera-light setup. Thus, detecting and following the caveline as navigation guidance is paramount for robots in autonomous cave mapping missions. In this paper, we present a computationally light caveline detection model based on a novel Vision Transformer (ViT)-based learning pipeline. We address the problem of scarce annotated training data by a weakly supervised formulation where the learning is reinforced through a series of noisy predictions from intermediate sub-optimal models. We validate the utility and effectiveness of such weak supervision for caveline detection and tracking in three different cave locations: USA, Mexico, and Spain. Experimental results demonstrate that our proposed model, CL-ViT, balances the robustness-efficiency trade-off, ensuring good generalization performance while offering 10+ FPS on single-board (Jetson TX2) devices.

Submitted 28 June, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

arXiv:2211.15013 [pdf, other]

Enhancing Data Security for Cloud Computing Applications through Distributed Blockchain-based SDN Architecture in IoT Networks

Authors: Anichur Rahman, Md. Jahidul Islam, Rafiqul Islam, Ayesha Aziz, Dipanjali Kundu, Sadia Sazzad, Md. Razaul Karim, Mahedi Hasan, Ziaur Rahman, Said Elnaffar, Shahab S. Band

Abstract: Blockchain (BC) and Software Defined Networking (SDN) are some of the most prominent emerging technologies in recent research. These technologies provide security, integrity, as well as confidentiality in their respective applications. Cloud computing has also been a popular comprehensive technology for several years. Confidential information is often shared with the cloud infrastructure to give customers access to remote resources, such as computation and storage operations. However, cloud computing also presents substantial security threats, issues, and challenges. Therefore, to overcome these difficulties, we propose integrating Blockchain and SDN in the cloud computing platform. In this research, we introduce the architecture to better secure clouds. Moreover, we leverage a distributed Blockchain approach to convey security, confidentiality, privacy, integrity, adaptability, and scalability in the proposed architecture. BC provides a distributed or decentralized and efficient environment for users. Also, we present an SDN approach to improving the reliability, stability, and load balancing capabilities of the cloud infrastructure. Finally, we provide an experimental evaluation of the performance of our SDN and BC-based implementation using different parameters, also monitoring some attacks in the system and proving its efficacy.

Submitted 27 November, 2022; originally announced November 2022.

Comments: 12 Pages 16 Figures 3 Tables

ACM Class: E.3

arXiv:2209.12358 [pdf, other]

UDepth: Fast Monocular Depth Estimation for Visually-guided Underwater Robots

Authors: Boxiao Yu, Jiayi Wu, Md Jahidul Islam

Abstract: In this paper, we present a fast monocular depth estimation method for enabling 3D perception capabilities of low-cost underwater robots. We formulate a novel end-to-end deep visual learning pipeline named UDepth, which incorporates domain knowledge of image formation characteristics of natural underwater scenes. First, we adapt a new input space from raw RGB image space by exploiting underwater light attenuation prior, and then devise a least-squared formulation for coarse pixel-wise depth prediction. Subsequently, we extend this into a domain projection loss that guides the end-to-end learning of UDepth on over 9K RGB-D training samples. UDepth is designed with a computationally light MobileNetV2 backbone and a Transformer-based optimizer for ensuring fast inference rates on embedded systems. By domain-aware design choices and through comprehensive experimental analyses, we demonstrate that it is possible to achieve state-of-the-art depth estimation performance while ensuring a small computational footprint. Specifically, with 70%-80% less network parameters than existing benchmarks, UDepth achieves comparable and often better depth estimation performance. While the full model offers over 66 FPS (13 FPS) inference rates on a single GPU (CPU core), our domain projection for coarse depth prediction runs at 51.5 FPS rates on single-board NVIDIA Jetson TX2s. The inference pipelines are available at https://github.com/uf-robopi/UDepth.

Submitted 2 February, 2023; v1 submitted 25 September, 2022; originally announced September 2022.

Comments: 10 pages, 6 figures

arXiv:2208.02177 [pdf, other]

On the Integration of Blockchain and SDN: Overview, Applications, and Future Perspectives

Authors: Anichur Rahman, Antonio Montieri, Dipanjali Kundu, Md. Razaul Karim, Md. Jahidul Islam, Sara Umme, Alfredo Nascita, Antonio Pescapè

Abstract: Blockchain (BC) and Software-Defined Networking (SDN) are leading technologies which have recently found applications in several network-related scenarios and have consequently experienced a growing interest in the research community. Indeed, current networks connect a massive number of objects over the Internet and in this complex scenario, to ensure security, privacy, confidentiality, and programmability, the utilization of BC and SDN have been successfully proposed. In this work, we provide a comprehensive survey regarding these two recent research trends and review the related state-of-the-art literature. We first describe the main features of each technology and discuss their most common and used variants. Furthermore, we envision the integration of such technologies to jointly take advantage of these latter efficiently. Indeed, we consider their group-wise utilization -- named BC-SDN -- based on the need for stronger security and privacy. Additionally, we cover the application fields of these technologies both individually and combined. Finally, we discuss the open issues of reviewed research and describe potential directions for future avenues regarding the integration of BC and SDN. To summarize, the contribution of the present survey spans from an overview of the literature background on BC and SDN to the discussion of the benefits and limitations of BC-SDN integration in different fields, which also raises open challenges and possible future avenues examined herein.
To the best of our knowledge, compared to existing surveys, this is the first work that analyzes the aforementioned aspects in light of a broad BC-SDN integration, with a specific focus on security and privacy issues in actual utilization scenarios.

Submitted 3 August, 2022; originally announced August 2022.

Comments: 42 pages, 14 figures, to be published in Journal of Network and Systems Management - Special Issue on Blockchains and Distributed Ledgers in Network and Service Management

ACM Class: C.2.3; C.2.4

arXiv:2206.14972 [pdf]

Machine Learning Approaches to Predict Breast Cancer: Bangladesh Perspective

Authors: Taminul Islam, Arindom Kundu, Nazmul Islam Khan, Choyon Chandra Bonik, Flora Akter, Md Jihadul Islam

Abstract: Nowadays, Breast cancer has risen to become one of the most prominent causes of death in recent years. Among all malignancies, this is the most frequent and the major cause of death for women globally. Manually diagnosing this disease requires a good amount of time and expertise. Breast cancer detection is time-consuming, and the spread of the disease can be reduced by developing machine-based breast cancer predictions. In Machine learning, the system can learn from prior instances and find hard-to-detect patterns from noisy or complicated data sets using various statistical, probabilistic, and optimization approaches. This work compares several machine learning algorithm's classification accuracy, precision, sensitivity, and specificity on a newly collected dataset. In this work Decision tree, Random Forest, Logistic Regression, Naive Bayes, and XGBoost, these five machine learning approaches have been implemented to get the best performance on our dataset. This study focuses on finding the best algorithm that can forecast breast cancer with maximum accuracy in terms of its classes. This work evaluated the quality of each algorithm's data classification in terms of efficiency and effectiveness. And also compared with other published work on this domain. After implementing the model, this study achieved the best model accuracy, 94% on Random Forest and XGBoost.

Submitted 29 June, 2022; originally announced June 2022.

Comments: 15 pages, 9 figures, accepted for publication as a book chapter to 2nd International Conference on Ubiquitous Computing and Intelligent Information Systems

arXiv:2112.03395 [pdf, other]

doi 10.1145/3510003.3510052

Manas: Mining Software Repositories to Assist AutoML

Authors: Giang Nguyen, Md Johir Islam, Rangeet Pan, Hridesh Rajan

Abstract: Today deep learning is widely used for building software. A software engineering problem with deep learning is that finding an appropriate convolutional neural network (CNN) model for the task can be a challenge for developers. Recent work on AutoML, more precisely neural architecture search (NAS), embodied by tools like Auto-Keras aims to solve this problem by essentially viewing it as a search problem where the starting point is a default CNN model, and mutation of this CNN model allows exploration of the space of CNN models to find a CNN model that will work best for the problem. These works have had significant success in producing high-accuracy CNN models. There are two problems, however. First, NAS can be very costly, often taking several hours to complete. Second, CNN models produced by NAS can be very complex that makes it harder to understand them and costlier to train them. We propose a novel approach for NAS, where instead of starting from a default CNN model, the initial model is selected from a repository of models extracted from GitHub. The intuition being that developers solving a similar problem may have developed a better starting point compared to the default model. We also analyze common layer patterns of CNN models in the wild to understand changes that the developers make to improve their models. Our approach uses commonly occurring changes as mutation operators in NAS. We have extended Auto-Keras to implement our approach.
Our evaluation using 8 top voted problems from Kaggle for tasks including image classification and image regression shows that given the same search time, without loss of accuracy, Manas produces models with 42.9% to 99.6% fewer number of parameters than Auto-Keras' models. Benchmarked on GPU, Manas' models train 30.3% to 641.6% faster than Auto-Keras' models.

Submitted 13 February, 2022; v1 submitted 6 December, 2021; originally announced December 2021.

arXiv:2112.01890 [pdf, other]

Fast Direct Stereo Visual SLAM

Authors: Jiawei Mo, Md Jahidul Islam, Junaed Sattar

Abstract: We propose a novel approach for fast and accurate stereo visual Simultaneous Localization and Mapping (SLAM) independent of feature detection and matching. We extend monocular Direct Sparse Odometry (DSO) to a stereo system by optimizing the scale of the 3D points to minimize photometric error for the stereo configuration, which yields a computationally efficient and robust method compared to conventional stereo matching. We further extend it to a full SLAM system with loop closure to reduce accumulated errors. With the assumption of forward camera motion, we imitate a LiDAR scan using the 3D points obtained from the visual odometry and adapt a LiDAR descriptor for place recognition to facilitate more efficient detection of loop closures. Afterward, we estimate the relative pose using direct alignment by minimizing the photometric error for potential loop closures. Optionally, further improvement over direct alignment is achieved by using the Iterative Closest Point (ICP) algorithm. Lastly, we optimize a pose graph to improve SLAM accuracy globally. By avoiding feature detection or matching in our SLAM system, we ensure high computational efficiency and robustness. Thorough experimental validations on public datasets demonstrate its effectiveness compared to the state-of-the-art approaches.

Submitted 3 December, 2021; originally announced December 2021.

arXiv:2012.10011 [pdf]

doi 10.14569/IJACSA.2020.0110980

DistB-SDoIndustry: Enhancing Security in Industry 4.0 Services based on Distributed Blockchain through Software Defined Networking-IoT Enabled Architecture

Authors: Anichur Rahman, Umme Sara, Dipanjali Kundu, Saiful Islam, Md. Jahidul Islam, Mahedi Hasan, Ziaur Rahman, Mostofa Kamal Nasir

Abstract: The concept of Industry 4.0 is a newly emerging focus of research throughout the world. However, it has lots of challenges to control data, and it can be addressed with various technologies like Internet of Things (IoT), Big Data, Artificial Intelligence (AI), Software Defined Networking (SDN), and Blockchain (BC) for managing data securely. Further, the complexity of sensors, appliances, sensor networks connecting to the internet and the model of Industry 4.0 has created the challenge of designing systems, infrastructure and smart applications capable of continuously analyzing the data produced. Regarding these, the authors present a distributed Blockchain-based security to industry 4.0 applications with SDN-IoT enabled environment. Where the Blockchain can be capable of leading the robust, privacy and confidentiality to our desired system. In addition, the SDN-IoT incorporates the different services of industry 4.0 with more security as well as flexibility. Furthermore, the authors offer an excellent combination among the technologies like IoT, SDN and Blockchain to improve the security and privacy of Industry 4.0 services properly. Finally, the authors evaluate performance and security in a variety of ways in the presented architecture.

Submitted 17 December, 2020; originally announced December 2020.

Comments: 8 Pages, 6 Figures

ACM Class: J.7

Journal ref: IJACSA, 11(9), 2020

arXiv:2012.09987 [pdf]

doi 10.1109/ACCESS.2020.3039113

DistB-Condo: Distributed Blockchain-based IoT-SDN Model for Smart Condominium

Authors: Anichur Rahman, Md. Jahidul Islam, Ziaur Rahman, Md. Mahfuz Reza, Adnan Anwar, M. A. Parvez Mahmud, Mostofa Kamal Nasir, Rafidah Md Noor

Abstract: Condominium network refers to intra-organization networks, where smart buildings or apartments are connected and share resources over the network. Secured communication platform or channel has been highlighted as a key requirement for a reliable condominium which can be ensured by the utilization of the advanced techniques and platforms like Software-Defined Network (SDN), Network Function Virtualization (NFV) and Blockchain (BC). These technologies provide a robust, and secured platform to meet all kinds of challenges, such as safety, confidentiality, flexibility, efficiency, and availability. This work suggests a distributed, scalable IoT-SDN with Blockchain-based NFV framework for a smart condominium (DistB-Condo) that can act as an efficient secured platform for a small community. Moreover, the Blockchain-based IoT-SDN with NFV framework provides the combined benefits of leading technologies. It also presents an optimized Cluster Head Selection (CHS) algorithm for selecting a Cluster Head (CH) among the clusters that efficiently saves energy. Besides, a decentralized and secured Blockchain approach has been introduced that allows more prominent security and privacy to the desired condominium network. Our proposed approach has also the ability to detect attacks in an IoT environment. Eventually, this article evaluates the performance of the proposed architecture using different parameters (e.g., throughput, packet arrival rate, and response time). The proposed approach outperforms the existing OF-Based SDN. DistB-Condo has better throughput on average, and the bandwidth (Mbps) much higher than the OF-Based SDN approach in the presence of attacks. Also, the proposed model has an average response time of 5% less than the core model.

Submitted 17 December, 2020; originally announced December 2020.

Comments: 17 Pages, 12 Tables, 17 Figures

ACM Class: H.1.1

Journal ref: IEEE Access, vol. 8, pp. 209594-209609, 2020

arXiv:2012.05990 [pdf, other]

A Generative Approach for Detection-driven Underwater Image Enhancement

Authors: Chelsey Edge, Md Jahidul Islam, Christopher Morse, Junaed Sattar

Abstract: In this paper, we introduce a generative model for image enhancement specifically for improving diver detection in the underwater domain. In particular, we present a model that integrates generative adversarial network (GAN)-based image enhancement with the diver detection task. Our proposed approach restructures the GAN objective function to include information from a pre-trained diver detector with the goal to generate images which would enhance the accuracy of the detector in adverse visual conditions. By incorporating the detector output into both the generator and discriminator networks, our model is able to focus on enhancing images beyond aesthetic qualities and specifically to improve robotic detection of scuba divers. We train our network on a large dataset of scuba divers, using a state-of-the-art diver detector, and demonstrate its utility on images collected from oceanic explorations of human-robot teams. Experimental evaluations demonstrate that our approach significantly improves diver detection performance over raw, unenhanced images, and even outperforms detection performance on the output of state-of-the-art underwater image enhancement algorithms. Finally, we demonstrate the inference performance of our network on embedded devices to highlight the feasibility of operating on board mobile robotic platforms.

Submitted 10 December, 2020; originally announced December 2020.

Comments: Under review for ICRA 2021

arXiv:2011.06252 [pdf, other]

SVAM: Saliency-guided Visual Attention Modeling by Autonomous Underwater Robots

Authors: Md Jahidul Islam, Ruobing Wang, Junaed Sattar

Abstract: This paper presents a holistic approach to saliency-guided visual attention modeling (SVAM) for use by autonomous underwater robots. Our proposed model, named SVAM-Net, integrates deep visual features at various scales and semantics for effective salient object detection (SOD) in natural underwater images. The SVAM-Net architecture is configured in a unique way to jointly accommodate bottom-up and top-down learning within two separate branches of the network while sharing the same encoding layers. We design dedicated spatial attention modules (SAMs) along these learning pathways to exploit the coarse-level and fine-level semantic features for SOD at four stages of abstractions. The bottom-up branch performs a rough yet reasonably accurate saliency estimation at a fast rate, whereas the deeper top-down branch incorporates a residual refinement module (RRM) that provides fine-grained localization of the salient objects. Extensive performance evaluation of SVAM-Net on benchmark datasets clearly demonstrates its effectiveness for underwater SOD. We also validate its generalization performance by several ocean trials' data that include test images of diverse underwater scenes and waterbodies, and also images with unseen natural objects. Moreover, we analyze its computational feasibility for robotic deployments and demonstrate its utility in several important use cases of visual attention modeling.

Submitted 14 April, 2022; v1 submitted 12 November, 2020; originally announced November 2020.

arXiv:2011.03106 [pdf, other]

IMU-Assisted Learning of Single-View Rolling Shutter Correction

Authors: Jiawei Mo, Md Jahidul Islam, Junaed Sattar

Abstract: Rolling shutter distortion is highly undesirable for photography and computer vision algorithms (e.g., visual SLAM) because pixels can be potentially captured at different times and poses. In this paper, we propose a deep neural network to predict depth and row-wise pose from a single image for rolling shutter correction. Our contribution in this work is to incorporate inertial measurement unit (IMU) data into the pose refinement process, which, compared to the state-of-the-art, greatly enhances the pose prediction. The improved accuracy and robustness make it possible for numerous vision algorithms to use imagery captured by rolling shutter cameras and produce highly accurate results. We also extend a dataset to have real rolling shutter images, IMU data, depth maps, camera poses, and corresponding global shutter images for rolling shutter correction training. We demonstrate the efficacy of the proposed method by evaluating the performance of Direct Sparse Odometry (DSO) algorithm on rolling shutter imagery corrected using the proposed approach. Results show marked improvements of the DSO algorithm over using uncorrected imagery, validating the proposed approach.

Submitted 14 September, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

arXiv:2005.00972 [pdf, other]

Repairing Deep Neural Networks: Fix Patterns and Challenges

Authors: Md Johirul Islam, Rangeet Pan, Giang Nguyen, Hridesh Rajan

Abstract: Significant interest in applying Deep Neural Network (DNN) has fueled the need to support engineering of software that uses DNNs. Repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand challenges to repairing and patterns that are utilized when manually repairing DNNs. What challenges should automated repair tools address? What are the repair patterns whose automation could help developers? Which repair patterns should be assigned a higher priority for building automated bug repair tools? This work presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from GitHub for five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns; the most common bug fix patterns are fixing data dimension and neural network connectivity; DNN bug fixes have the potential to introduce adversarial vulnerabilities; DNN bug fixes frequently introduce new bugs; and DNN bug localization, reuse of trained model, and coping with frequent releases are major challenges faced by developers when fixing bugs. We also contribute a benchmark of 667 DNN (bug, repair) instances.

Submitted 2 May, 2020; originally announced May 2020.

arXiv:2004.01241 [pdf, other]

Semantic Segmentation of Underwater Imagery: Dataset and Benchmark

Authors: Md Jahidul Islam, Chelsey Edge, Yuyang Xiao, Peigen Luo, Muntaqim Mehtaz, Christopher Morse, Sadman Sakib Enan, Junaed Sattar

Abstract: In this paper, we present the first large-scale dataset for semantic Segmentation of Underwater IMagery (SUIM). It contains over 1500 images with pixel annotations for eight object categories: fish (vertebrates), reefs (invertebrates), aquatic plants, wrecks/ruins, human divers, robots, and sea-floor. The images have been rigorously collected during oceanic explorations and human-robot collaborative experiments, and annotated by human participants. We also present a benchmark evaluation of state-of-the-art semantic segmentation approaches based on standard performance metrics. In addition, we present SUIM-Net, a fully-convolutional encoder-decoder model that balances the trade-off between performance and computational efficiency. It offers competitive performance while ensuring fast end-to-end inference, which is essential for its use in the autonomy pipeline of visually-guided underwater robots. In particular, we demonstrate its usability benefits for visual servoing, saliency prediction, and detailed scene understanding. With a variety of use cases, the proposed model and benchmark dataset open up promising opportunities for future research in underwater robot vision.

Submitted 13 September, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

arXiv:2002.01155 [pdf, other]

Simultaneous Enhancement and Super-Resolution of Underwater Imagery for Improved Visual Perception

Authors: Md Jahidul Islam, Peigen Luo, Junaed Sattar

Abstract: In this paper, we introduce and tackle the simultaneous enhancement and super-resolution (SESR) problem for underwater robot vision and provide an efficient solution for near real-time applications. We present Deep SESR, a residual-in-residual network-based generative model that can learn to restore perceptual image qualities at 2x, 3x, or 4x higher spatial resolution. We supervise its training by formulating a multi-modal objective function that addresses the chrominance-specific underwater color degradation, lack of image sharpness, and loss in high-level feature representation. It is also supervised to learn salient foreground regions in the image, which in turn guides the network to learn global contrast enhancement. We design an end-to-end training pipeline to jointly learn the saliency prediction and SESR on a shared hierarchical feature space for fast inference. Moreover, we present UFO-120, the first dataset to facilitate large-scale SESR learning; it contains over 1500 training samples and a benchmark test set of 120 samples. By thorough experimental evaluation on the UFO-120 and other standard datasets, we demonstrate that Deep SESR outperforms the existing solutions for underwater image enhancement and super-resolution. We also validate its generalization performance on several test cases that include underwater images with diverse spectral and spatial degradation levels, and also terrestrial images with unseen natural objects. Lastly, we analyze its computational feasibility for single-board deployments and demonstrate its operational benefits for visually-guided underwater robots. The model and dataset information will be available at: https://github.com/xahidbuffon/Deep-SESR.

Submitted 4 February, 2020; originally announced February 2020.

arXiv:1911.07623 [pdf, other]

Machine Vision for Improved Human-Robot Cooperation in Adverse Underwater Conditions

Authors: Md Jahidul Islam

Abstract: Visually-guided underwater robots are deployed alongside human divers for cooperative exploration, inspection, and monitoring tasks in numerous shallow-water and coastal-water applications. The most essential capability of such companion robots is to visually interpret their surroundings and assist the divers during various stages of an underwater mission. Despite recent technological advancements, the existing systems and solutions for real-time visual perception are greatly affected by marine artifacts such as poor visibility, lighting variation, and the scarcity of salient features. The difficulties are exacerbated by a host of non-linear image distortions caused by the vulnerabilities of underwater light propagation (e.g., wavelength-dependent attenuation, absorption, and scattering). In this dissertation, we present a set of novel and improved visual perception solutions to address these challenges for effective underwater human-robot cooperation. Specifically, we develop robust and efficient modules for Autonomous Underwater Vehicles (AUVs) to follow and interact with companion divers by accurately perceiving their surroundings while relying on noisy visual sensing alone. Moreover, our proposed perception solutions enable visually-guided robots to see better in noisy sensing conditions and do better with limited computational resources and real-time constraints. The research outcomes entail novel design and efficient implementation of the underlying vision and learning-based algorithms with extensive field experimental validations and feasibility analyses for single-board deployments. In addition to advancing the state-of-the-art, the proposed methodologies and systems take us one step closer toward bridging the gap between theory and practice for improved human-robot cooperation in the wild.

Submitted 29 July, 2021; v1 submitted 28 October, 2019; originally announced November 2019.

Comments: Doctoral dissertation document - University of Minnesota, Twin Cities (May 21, 2021)

arXiv:1909.09437 [pdf, other]

Underwater Image Super-Resolution using Deep Residual Multipliers

Authors: Md Jahidul Islam, Sadman Sakib Enan, Peigen Luo, Junaed Sattar

Abstract: We present a deep residual network-based generative model for single image super-resolution (SISR) of underwater imagery for use by autonomous underwater robots. We also provide an adversarial training pipeline for learning SISR from paired data. In order to supervise the training, we formulate an objective function that evaluates the perceptual quality of an image based on its global content, color, and local style information. Additionally, we present USR-248, a large-scale dataset of three sets of underwater images of 'high' (640x480) and 'low' (80x60, 160x120, and 320x240) spatial resolution. USR-248 contains paired instances for supervised training of 2x, 4x, or 8x SISR models. Furthermore, we validate the effectiveness of our proposed model through qualitative and quantitative experiments and compare the results with several state-of-the-art models' performances. We also analyze its practical feasibility for applications such as scene understanding and attention modeling in noisy visual conditions.

Submitted 24 February, 2020; v1 submitted 20 September, 2019; originally announced September 2019.

arXiv:1906.11940 [pdf, other]

What Do Developers Ask About ML Libraries? A Large-scale Study Using Stack Overflow

Authors: Md Johirul Islam, Hoan Anh Nguyen, Rangeet Pan, Hridesh Rajan

Abstract: Modern software systems are increasingly including machine learning (ML) as an integral component. However, we do not yet understand the difficulties faced by software developers when learning about ML libraries and using them within their systems. To that end, this work reports on a detailed (manual) examination of 3,243 highly-rated Q&A posts related to ten ML libraries, namely Tensorflow, Keras, scikit-learn, Weka, Caffe, Theano, MLlib, Torch, Mahout, and H2O, on Stack Overflow, a popular online technical Q&A forum. We classify these questions into seven typical stages of an ML pipeline to understand the correlation between the library and the stage. Then we study the questions and perform statistical analysis to explore the answer to four research objectives (finding the most difficult stage, understanding the nature of problems, nature of libraries and studying whether the difficulties stayed consistent over time). Our findings reveal the urgent need for software engineering (SE) research in this area. Both static and dynamic analyses are mostly absent and badly needed to help developers find errors earlier. While there has been some early research on debugging, much more work is needed. API misuses are prevalent and API design improvements are sorely needed. Last and somewhat surprisingly, a tug of war between providing higher levels of abstractions and the need to understand the behavior of the trained model is prevalent.

Submitted 27 June, 2019; originally announced June 2019.

arXiv:1906.01388 [pdf, other]

A Comprehensive Study on Deep Learning Bug Characteristics

Authors: Md Johirul Islam, Giang Nguyen, Rangeet Pan, Hridesh Rajan

Abstract: Deep learning has gained substantial popularity in recent years. Developers mainly rely on libraries and tools to add deep learning capabilities to their software. What kinds of bugs are frequently found in such software? What are the root causes of such bugs? What impacts do such bugs have? Which stages of deep learning pipeline are more bug prone? Are there any antipatterns? Understanding such characteristics of bugs in deep learning software has the potential to foster the development of better deep learning platforms, debugging mechanisms, development practices, and encourage the development of analysis and verification frameworks. Therefore, we study 2716 high-quality posts from Stack Overflow and 500 bug fix commits from GitHub about five popular deep learning libraries Caffe, Keras, Tensorflow, Theano, and Torch to understand the types of bugs, root causes of bugs, impacts of bugs, bug-prone stage of deep learning pipeline as well as whether there are some common antipatterns found in this buggy software. The key findings of our study include: data bug and logic bug are the most severe bug types in deep learning software appearing more than 48% of the times, major root causes of these bugs are Incorrect Model Parameter (IPS) and Structural Inefficiency (SI) showing up more than 43% of the times. We have also found that the bugs in the usage of deep learning libraries have some common antipatterns that lead to a strong correlation of bug types among the libraries.

Submitted 3 June, 2019; originally announced June 2019.

Journal ref: The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) (Aug. 2019)

arXiv:1905.13284 [pdf, other]

Identifying Classes Susceptible to Adversarial Attacks

Authors: Rangeet Pan, Md Johirul Islam, Shibbir Ahmed, Hridesh Rajan

Abstract: Despite numerous attempts to defend deep learning based image classifiers, they remain susceptible to the adversarial attacks. This paper proposes a technique to identify susceptible classes, those classes that are more easily subverted. To identify the susceptible classes we use distance-based measures and apply them on a trained model. Based on the distance among original classes, we create mapping among original classes and adversarial classes that helps to reduce the randomness of a model to a significant amount in an adversarial setting. We analyze the high dimensional geometry among the feature classes and identify the k most susceptible target classes in an adversarial attack. We conduct experiments using MNIST, Fashion MNIST, CIFAR-10 (ImageNet and ResNet-32) datasets. Finally, we evaluate our techniques in order to determine which distance-based measure works best and how the randomness of a model changes with perturbation.

Submitted 30 May, 2019; originally announced May 2019.

arXiv:1903.09766 [pdf, other]

Fast Underwater Image Enhancement for Improved Visual Perception

Authors: Md Jahidul Islam, Youya Xia, Junaed Sattar

Abstract: In this paper, we present a conditional generative adversarial network-based model for real-time underwater image enhancement. To supervise the adversarial training, we formulate an objective function that evaluates the perceptual image quality based on its global content, color, local texture, and style information. We also present EUVP, a large-scale dataset of a paired and unpaired collection of underwater images (of 'poor' and 'good' quality) that are captured using seven different cameras over various visibility conditions during oceanic explorations and human-robot collaborative experiments. In addition, we perform several qualitative and quantitative evaluations which suggest that the proposed model can learn to enhance underwater image quality from both paired and unpaired training. More importantly, the enhanced images provide improved performances of standard models for underwater object detection, human pose estimation, and saliency prediction. These results validate that it is suitable for real-time preprocessing in the autonomy pipeline by visually-guided underwater robots. The model and associated training pipelines are available at https://github.com/xahidbuffon/funie-gan.

Submitted 8 February, 2020; v1 submitted 23 March, 2019; originally announced March 2019.

arXiv:1903.00820 [pdf, other]

Robot-to-Robot Relative Pose Estimation using Humans as Markers

Authors: Md Jahidul Islam, Jiawei Mo, Junaed Sattar

Abstract: In this paper, we propose a method to determine the 3D relative pose of pairs of communicating robots by using human pose-based key-points as correspondences. We adopt a 'leader-follower' framework, where at first, the leader robot visually detects and triangulates the key-points using the state-of-the-art pose detector named OpenPose. Afterward, the follower robots match the corresponding 2D projections on their respective calibrated cameras and find their relative poses by solving the perspective-n-point (PnP) problem. In the proposed method, we design an efficient person re-identification technique for associating the mutually visible humans in the scene. Additionally, we present an iterative optimization algorithm to refine the associated key-points based on their local structural properties in the image space. We demonstrate that these refinement processes are essential to establish accurate key-point correspondences across viewpoints. Furthermore, we evaluate the performance of the proposed relative pose estimation system through several experiments conducted in terrestrial and underwater environments. Finally, we discuss the relevant operational challenges of this approach and analyze its feasibility for multi-robot cooperative systems in human-dominated social settings and feature-deprived environments such as underwater.

Submitted 6 September, 2020; v1 submitted 2 March, 2019; originally announced March 2019.

arXiv:1809.06849 [pdf, other]

Towards a Generic Diver-Following Algorithm: Balancing Robustness and Efficiency in Deep Visual Detection

Authors: Md Jahidul Islam, Michael Fulton, Junaed Sattar

Abstract: This paper explores the design and development of a class of robust diver-following algorithms for autonomous underwater robots. By considering the operational challenges for underwater visual tracking in diverse real-world settings, we formulate a set of desired features of a generic diver following algorithm. We attempt to accommodate these features and maximize general tracking performance by exploiting the state-of-the-art deep object detection models. We fine-tune the building blocks of these models with a goal of balancing the trade-off between robustness and efficiency in an onboard setting under real-time constraints. Subsequently, we design an architecturally simple Convolutional Neural Network (CNN)-based diver-detection model that is much faster than the state-of-the-art deep models yet provides comparable detection performances. In addition, we validate the performance and effectiveness of the proposed diver-following modules through a number of field experiments in closed-water and open-water environments.

Submitted 18 September, 2018; originally announced September 2018.

arXiv:1805.00105 [pdf, other]

A Cyberinfrastructure for BigData Transportation Engineering

Authors: Md Johirul Islam, Anuj Sharma, Hridesh Rajan

Abstract: Big Data-driven transportation engineering has the potential to improve utilization of road infrastructure, decrease traffic fatalities, improve fuel consumption, decrease construction worker injuries, among others. Despite these benefits, research on Big Data-driven transportation engineering is difficult today due to the computational expertise required to get started. This work proposes BoaT, a transportation-specific programming language, and its Big Data infrastructure that is aimed at decreasing this barrier to entry. Our evaluation, which uses over two dozen research questions from six categories, shows that research is easier to realize as a BoaT computer program, an order of magnitude faster when this program is run, and exhibits 12-14x decrease in storage requirements.

Submitted 30 April, 2018; originally announced May 2018.

arXiv:1804.02479 [pdf, other]

Understanding Human Motion and Gestures for Underwater Human-Robot Collaboration

Authors: Md Jahidul Islam

Abstract: In this paper, we present a number of robust methodologies for an underwater robot to visually detect, follow, and interact with a diver for collaborative task execution. We design and develop two autonomous diver-following algorithms, the first of which utilizes both spatial- and frequency-domain features pertaining to human swimming patterns in order to visually track a diver. The second algorithm uses a convolutional neural network-based model for robust tracking-by-detection. In addition, we propose a hand gesture-based human-robot communication framework that is syntactically simpler and computationally more efficient than the existing grammar-based frameworks. In the proposed interaction framework, deep visual detectors are used to provide accurate hand gesture recognition; subsequently, a finite-state machine performs robust and efficient gesture-to-instruction mapping. The distinguishing feature of this framework is that it can be easily adopted by divers for communicating with underwater robots without using artificial markers or requiring memorization of complex language rules. Furthermore, we validate the performance and effectiveness of the proposed methodologies through extensive field experiments in closed- and open-water environments. Finally, we perform a user interaction study to demonstrate the usability benefits of our proposed interaction framework compared to existing methods.

Submitted 6 April, 2018; originally announced April 2018.

Comments: arXiv admin note: text overlap with arXiv:1709.08772

arXiv:1804.01079 [pdf, other]

Robotic Detection of Marine Litter Using Deep Visual Detection Models

Authors: Michael Fulton, Jungseok Hong, Md Jahidul Islam, Junaed Sattar

Abstract: Trash deposits in aquatic environments have a destructive effect on marine ecosystems and pose a long-term economic and environmental threat. Autonomous underwater vehicles (AUVs) could very well contribute to the solution of this problem by finding and eventually removing trash. This paper evaluates a number of deep-learning algorithms performing the task of visually detecting trash in realistic underwater environments, with the eventual goal of exploration, mapping, and extraction of such debris by using AUVs. A large and publicly-available dataset of actual debris in open-water locations is annotated for training a number of convolutional neural network architectures for object detection. The trained networks are then evaluated on a set of images from other portions of that dataset, providing insight into approaches for developing the detection capabilities of an AUV for underwater trash removal. In addition, the evaluation is performed on three different platforms of varying processing power, which serves to assess these algorithms' fitness for real-time applications.

Submitted 21 September, 2018; v1 submitted 3 April, 2018; originally announced April 2018.

Comments: Under review for ICRA 2019

arXiv:1803.08202 [pdf, other]

Person Following by Autonomous Robots: A Categorical Overview

Authors: Md Jahidul Islam, Jungseok Hong, Junaed Sattar

Abstract: A wide range of human-robot collaborative applications in diverse domains such as manufacturing, health care, the entertainment industry, and social interactions require an autonomous robot to follow its human companion. Different working environments and applications pose diverse challenges by adding constraints on the choice of sensors, the degree of autonomy, and dynamics of a person-following robot. Researchers have addressed these challenges in many ways and contributed to the development of a large body of literature. This paper provides a comprehensive overview of the literature by categorizing different aspects of person-following by autonomous robots. Also, the corresponding operational challenges are identified based on various design choices for ground, underwater, and aerial scenarios. In addition, state-of-the-art methods for perception, planning, control, and interaction are elaborately discussed and their applicability in varied operational scenarios is presented. Then, some of the prominent methods are qualitatively compared, corresponding practicalities are illustrated, and their feasibility is analyzed for various use-cases. Furthermore, several prospective application areas are identified, and open problems are highlighted for future research.

Submitted 17 September, 2019; v1 submitted 21 March, 2018; originally announced March 2018.

arXiv:1801.04011 [pdf, other]

Enhancing Underwater Imagery using Generative Adversarial Networks

Authors: Cameron Fabbri, Md Jahidul Islam, Junaed Sattar

Abstract: Autonomous underwater vehicles (AUVs) rely on a variety of sensors - acoustic, inertial and visual - for intelligent decision making. Due to its non-intrusive, passive nature, and high information content, vision is an attractive sensing modality, particularly at shallower depths. However, factors such as light refraction and absorption, suspended particles in the water, and color distortion affect the quality of visual data, resulting in noisy and distorted images. AUVs that rely on visual sensing thus face difficult challenges, and consequently exhibit poor performance on vision-driven tasks. This paper proposes a method to improve the quality of visual underwater scenes using Generative Adversarial Networks (GANs), with the goal of improving input to vision-driven behaviors further down the autonomy pipeline. Furthermore, we show how recently proposed methods are able to generate a dataset for the purpose of such underwater image restoration. For any visually-guided underwater robot, this improvement can result in increased safety and reliability through robust visual perception. To that effect, we present quantitative and qualitative data which demonstrate that images corrected through the proposed approach are more visually appealing, and also provide increased accuracy for a diver tracking algorithm.

Submitted 11 January, 2018; originally announced January 2018.

Comments: Submitted to ICRA 2018

arXiv:1711.10377 [pdf]

Sentiment analysis of twitter data

Authors: Hamid Bagheri, Md Johirul Islam

Abstract: Social networks are the main resources for gathering information about people's opinions and sentiments towards different topics, as people spend hours daily on social media and share their opinions. In this technical paper, we show the application of sentiment analysis and how to connect to Twitter and run sentiment analysis queries. We run experiments on different queries, from politics to humanity, and show interesting results. We found that the proportion of tweets with neutral sentiment is significantly high, which clearly shows the limitations of current work.

Submitted 15 December, 2017; v1 submitted 15 November, 2017; originally announced November 2017.

Comments: 5 pages

arXiv:1709.08772 [pdf, other]

Dynamic Reconfiguration of Mission Parameters in Underwater Human-Robot Collaboration

Authors: Md Jahidul Islam, Marc Ho, Junaed Sattar

Abstract: This paper presents a real-time programming and parameter reconfiguration method for autonomous underwater robots in human-robot collaborative tasks. Using a set of intuitive and meaningful hand gestures, we develop a syntactically simple framework that is computationally more efficient than a complex, grammar-based approach. In the proposed framework, a convolutional neural network is trained to provide accurate hand gesture recognition; subsequently, a finite-state machine-based deterministic model performs efficient gesture-to-instruction mapping, and further improves robustness of the interaction scheme. The key aspect of this framework is that it can be easily adopted by divers for communicating simple instructions to underwater robots without using artificial tags such as fiducial markers, or requiring them to memorize a potentially complex set of language rules. Extensive experiments are performed both on field-trial data and through simulation, which demonstrate the robustness, efficiency, and portability of this framework in a number of different scenarios. Finally, a user interaction study is presented that illustrates the gain in usability of our proposed interaction framework compared to the existing methods for underwater domains.

Submitted 20 February, 2018; v1 submitted 25 September, 2017; originally announced September 2017.

arXiv:1709.08292 [pdf, other]

Underwater Multi-Robot Convoying using Visual Tracking by Detection

Authors: Florian Shkurti, Wei-Di Chang, Peter Henderson, Md Jahidul Islam, Juan Camilo Gamboa Higuera, Jimmy Li, Travis Manderson, Anqi Xu, Gregory Dudek, Junaed Sattar

Abstract: We present a robust multi-robot convoying approach that relies on visual detection of the leading agent, thus enabling target following in unstructured 3-D environments. Our method is based on the idea of tracking-by-detection, which interleaves efficient model-based object detection with temporal filtering of image-based bounding box estimation. This approach has the important advantage of mitigating tracking drift (i.e. drifting away from the target object), which is a common symptom of model-free trackers and is detrimental to sustained convoying in practice. To illustrate our solution, we collected extensive footage of an underwater robot in ocean settings, and hand-annotated its location in each frame. Based on this dataset, we present an empirical comparison of multiple tracker variants, including the use of several convolutional neural networks, both with and without recurrent connections, as well as frequency-based model-free trackers. We also demonstrate the practicality of this tracking-by-detection strategy in real-world scenarios by successfully controlling a legged underwater robot in five degrees of freedom to follow another robot's independent motion.

Submitted 24 September, 2017; originally announced September 2017.

Comments: Accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017

arXiv:1205.1428 [pdf]

High Velocity Penetration/Perforation Using Coupled Smooth Particle Hydrodynamics-Finite Element Method

Authors: S. Swaddiwudhipong, M. J. Islam, Z. S. Liu

Abstract: The finite element method (FEM) suffers from a serious mesh distortion problem when used for high velocity impact analyses. The smooth particle hydrodynamics (SPH) method is appropriate for this class of problems involving severe damage, but at considerable computational cost. It is beneficial if the latter is adopted only in severely distorted regions and FEM further away. The coupled smooth particle hydrodynamics - finite element method (SFM) has been adopted in the commercial hydrocode LS-DYNA to study the perforation of Weldox 460E steel and AA5083-H116 aluminum plates with varying thicknesses and various projectile nose geometries including blunt, conical and ogival noses. Effects of the SPH domain size and particle density are studied considering the friction effect between the projectile and the target materials. The simulated residual velocities and the ballistic limit velocities from the SFM agree well with the published experimental data. The study shows that SFM is able to emulate the same failure mechanisms of the steel and aluminum plates as observed in various experimental investigations for initial impact velocities of 170 m/s and higher.

Submitted 7 May, 2012; originally announced May 2012.

Comments: 18 pages; International Journal of Protective Structures 2010

MSC Class: 74C05 ACM Class: G.2.0

arXiv:1007.5129 [pdf]

doi 10.5121/ijaia.2010.1301

An Efficient Automatic Mass Classification Method In Digitized Mammograms Using Artificial Neural Network

Authors: Mohammed J. Islam, Majid Ahmadi, Maher A. Sid-Ahmed

Abstract: In this paper we present an efficient computer-aided mass classification method in digitized mammograms using an Artificial Neural Network (ANN), which performs benign-malignant classification on a region of interest (ROI) that contains a mass. One of the major mammographic characteristics for mass classification is texture. The ANN exploits this important factor to classify the mass as benign or malignant. The statistical textural features used in characterizing the masses are mean, standard deviation, entropy, skewness, kurtosis and uniformity. The main aim of the method is to increase the effectiveness and efficiency of the classification process in an objective manner to reduce the number of false positives for malignancy. A three-layer artificial neural network with seven features was proposed for classifying the marked regions as benign or malignant; it achieves 90.91% sensitivity and 83.87% specificity, which is very promising compared to the radiologist's sensitivity of 75%.

Submitted 29 July, 2010; originally announced July 2010.

Comments: 13 pages, 10 figures

Journal ref: International Journal of Artificial Intelligence & Applications 1.3 (2010) 1-13

Showing 1–39 of 39 results for author: Islam, M J