Search | arXiv e-print repository

Artificial Neural Networks to Recognize Speakers Division from Continuous Bengali Speech

Authors: Hasmot Ali, Md. Fahad Hossain, Md. Mehedi Hasan, Sheikh Abujar, Sheak Rashed Haider Noori

Abstract: Voice-based applications are ruling over the era of automation because speech has a lot of factors that determine a speaker's information as well as speech. Modern Automatic Speech Recognition (ASR) is a blessing in the field of Human-Computer Interaction (HCI) for efficient communication among humans and devices using Artificial Intelligence technology. Speech is one of the easiest mediums of communication because it has a lot of identical features for different speakers. Nowadays it is possible to determine speakers and their identity using their speech in terms of speaker recognition. In this paper, we present a method that provides a speaker's geographical identity in a certain region using continuous Bengali speech. We consider eight different divisions of Bangladesh as the geographical region. We applied the Mel Frequency Cepstral Coefficient (MFCC) and Delta features on an Artificial Neural Network to classify the speaker's division. We performed some preprocessing tasks like noise reduction and 8-10 second segmentation of raw audio before feature extraction. We used our dataset of more than 45 hours of audio data from 633 individual male and female speakers. We recorded the highest accuracy of 85.44%.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2402.17028 [pdf]

Separation of biocrude produced from hydrothermal liquefaction of faecal sludge without any solvent

Authors: H M Fairooz Adnan, Md Khalekuzzaman, Md. Atik Fayshal, Md. Mehedi Hasan

Abstract: In this study faecal sludge is used as raw biomass due to its abundance, low cost, and easy availability. After hydrothermal liquefaction (HTL) operation, product separation becomes challenging. Existing studies report that separation of the aqueous and biocrude oil products during the HTL process is most commonly achieved with an organic solvent, which is quite expensive. Focusing on this critical issue, this study aims to separate the biocrude and aqueous phase without using any solvent, by a gravity separation technique. FTIR analysis showed that centrifugation at 6000 rpm gave partial separation of the biocrude and aqueous phase (AP). However, at 9000 rpm, FTIR analysis showed that biocrude samples included aliphatic hydrocarbons, phenols, and esters, whereas no signs of any carbon chain were found in the AP, which indicated that the products were successfully separated. The separated crude portion had the higher A-Factor (0.68) and lower C-Factor (0.58) value, which indicates the oil quality was an immature grade of lower kerogen type II (i.e., moderate oil-prone). This low-cost technique can be economically advantageous for commercial-scale biocrude production.

Submitted 26 February, 2024; originally announced February 2024.

Comments: Conference: WasteSafe 2023, at Khulna University of Engineering & Technology, Khulna, Bangladesh; Volume: 8

arXiv:2401.12340 [pdf, other]

doi 10.1109/TAES.2023.3337768

Contrastive Learning and Cycle Consistency-based Transductive Transfer Learning for Target Annotation

Authors: Shoaib Meraj Sami, Md Mahedi Hasan, Nasser M. Nasrabadi, Raghuveer Rao

Abstract: Annotating automatic target recognition (ATR) is a highly challenging task, primarily due to the unavailability of labeled data in the target domain. Hence, it is essential to construct an optimal target domain classifier by utilizing the labeled information of the source domain images. The transductive transfer learning (TTL) method that incorporates a CycleGAN-based unpaired domain translation network has been previously proposed in the literature for effective ATR annotation. Although this method demonstrates great potential for ATR, it severely suffers from lower annotation performance, higher Fréchet Inception Distance (FID) score, and the presence of visual artifacts in the synthetic images. To address these issues, we propose a hybrid contrastive learning-based unpaired domain translation (H-CUT) network that achieves a significantly lower FID score. It incorporates both attention and entropy to emphasize the domain-specific region, a noisy feature mixup module to generate high variational synthetic negative patches, and a modulated noise contrastive estimation (MoNCE) loss to reweight all negative patches using optimal transport for better performance. Our proposed contrastive learning and cycle-consistency-based TTL (C3TTL) framework consists of two H-CUT networks and two classifiers. It simultaneously optimizes cycle-consistency, MoNCE, and identity losses. In C3TTL, two H-CUT networks have been employed through a bijection mapping to feed the reconstructed source domain images into a pretrained classifier to guide the optimal target domain classifier. Extensive experimental analysis conducted on three ATR datasets demonstrates that the proposed C3TTL method is effective in annotating civilian and military vehicles, as well as ship targets.

Submitted 22 January, 2024; originally announced January 2024.

Comments: This paper is accepted in IEEE Transactions on Aerospace and Electronic Systems. This arXiv version is older than the reviewed version.

arXiv:2212.14744

A Comparison Study of Deep CNN Architecture in Detecting of Pneumonia

Authors: Al Mohidur Rahman Porag, Md. Mahedi Hasan, Dr. Md Taimur Ahad

Abstract: Pneumonia, a respiratory infection brought on by bacteria or viruses, affects a large number of people, especially in developing and impoverished countries where high levels of pollution, unclean living conditions, and overcrowding are frequently observed, along with insufficient medical infrastructure. Pleural effusion, a condition in which fluids fill the lung and complicate breathing, is brought on by pneumonia. Early detection of pneumonia is essential for ensuring curative care and boosting survival rates. The approach most usually used to diagnose pneumonia is chest X-ray imaging. The purpose of this work is to develop a method for the automatic diagnosis of bacterial and viral pneumonia in digital X-ray pictures. This article first presents the authors' technique, and then gives a comprehensive report on recent developments in the field of reliable diagnosis of pneumonia. In this study, we tuned a state-of-the-art deep convolutional neural network to classify plant diseases based on images and tested its performance. Deep learning architectures are compared empirically. VGG19, ResNet with 152v2, Resnext101, Seresnet152, Mobilenettv2, and DenseNet with 201 layers are among the architectures tested. Experiment data consists of two groups, sick and healthy X-ray pictures. To take appropriate action against plant diseases as soon as possible, rapid disease identification models are preferred. DenseNet201 has shown no overfitting or performance degradation in our experiments, and its accuracy tends to increase as the number of epochs increases. Further, DenseNet201 achieves state-of-the-art performance with a significantly smaller number of parameters and within a reasonable computing time. This architecture outperforms the competition in terms of testing accuracy, scoring 95%. Each architecture was trained using Keras, using Theano as the backend.

Submitted 14 February, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

Comments: I have to remake the article because there was some accuracy problem.

arXiv:2207.09627 [pdf, other]

EVHA: Explainable Vision System for Hardware Testing and Assurance -- An Overview

Authors: Md Mahfuz Al Hasan, Mohammad Tahsin Mostafiz, Thomas An Le, Jake Julia, Nidish Vashistha, Shayan Taheri, Navid Asadizanjani

Abstract: Due to the ever-growing demand for electronic chips in different sectors, semiconductor companies have been mandated to offshore their manufacturing processes. This unwanted matter has made the security and trustworthiness of their fabricated chips a concern and has led to the creation of hardware attacks. In this condition, different entities in the semiconductor supply chain can act maliciously and execute an attack on the design computing layers, from devices to systems. Our attack is a hardware Trojan that is inserted during mask generation/fabrication in an untrusted foundry. The Trojan leaves a footprint in the fabricated chip through addition, deletion, or change of design cells. In order to tackle this problem, we propose the Explainable Vision System for Hardware Testing and Assurance (EVHA) in this work, which can detect the smallest possible change to a design in a low-cost, accurate, and fast manner. The inputs to this system are Scanning Electron Microscopy (SEM) images acquired from the Integrated Circuits (ICs) under examination. The system output is a determination of IC status in terms of having any defect and/or hardware Trojan through addition, deletion, or change in the design cells at the cell level. This article provides an overview of the design, development, implementation, and analysis of our defense system.

Submitted 19 July, 2022; originally announced July 2022.

Comments: Please contact Dr. Shayan Taheri for any questions and/or comments regarding the paper arXiv submission at: "www.shayan-taheri.com". The Paper Initial Submission: The ACM Journal on Emerging Technologies in Computing Systems (JETC)

arXiv:2111.03890 [pdf, other]

doi 10.1109/CSDE53843.2021.9718400

Demystifying Deep Learning Models for Retinal OCT Disease Classification using Explainable AI

Authors: Tasnim Sakib Apon, Mohammad Mahmudul Hasan, Abrar Islam, MD. Golam Rabiul Alam

Abstract: In the world of medical diagnostics, the adoption of various deep learning techniques is quite common as well as effective, and this statement is equally true when it comes to the retina Optical Coherence Tomography (OCT) sector, but (i) these techniques have black-box characteristics that prevent medical professionals from completely trusting the results generated from them; (ii) the lack of precision of these methods restricts their implementation in clinical and complex cases; and (iii) the existing works and models on OCT classification are substantially large and complicated and require a considerable amount of memory and computational power, reducing the quality of classifiers in real-time applications. To address these problems, this paper proposes a self-developed CNN model that is comparatively smaller and simpler, along with the use of LIME, which introduces Explainable AI to the study and helps to increase the interpretability of the model. This addition will be an asset to medical experts for getting major and detailed information, will help them in making final decisions, and will also reduce the opacity and vulnerability of conventional deep learning models.

Submitted 6 November, 2021; originally announced November 2021.

arXiv:2002.03130 [pdf]

doi 10.5120/17195-7390

Design and Implementation of Butterworth, Chebyshev-I and Elliptic Filter for Speech Signal Analysis

Authors: Prajoy Podder, Md. Mehedi Hasan, Md. Rafiqul Islam, Mursalin Sayeed

Abstract: In the field of digital signal processing, the function of a filter is to remove unwanted parts of the signal, such as random noise. Filters are necessary to remove noise from a transmitted speech signal or to extract useful parts of the signal, such as the components lying within a certain frequency range. Filters are broadly used in signal processing and communication systems in applications such as channel equalization, noise reduction, radar, audio processing, speech signal processing, video processing, biomedical signal processing (that is, filtering of noisy ECG, EEG, and EMG signals), electrical circuit analysis, and analysis of economic and financial data. In this paper, three types of infinite impulse response filter, i.e. Butterworth, Chebyshev type I, and elliptic filters, are discussed theoretically and experimentally. Butterworth, Chebyshev type I, and elliptic low pass, high pass, band pass, and band stop filters have been designed in this paper using MATLAB software. The impulse responses, magnitude responses, and phase responses of the Butterworth, Chebyshev type I, and elliptic filters for filtering the speech signal have been observed in this paper. By analyzing the speech signal, its sampling rate and spectrum response have also been found.

Submitted 27 May, 2020; v1 submitted 8 February, 2020; originally announced February 2020.

Journal ref: International Journal of Computer Applications 98(7):12-18, July 2014

arXiv:2001.09494 [pdf, other]

Efficient, Effective and Well Justified Estimation of Active Nodes within a Cluster

Authors: Md Mahmudul Hasan, Shuangqing Wei, Ramachandran Vaidyanathan

Abstract: Reliable and efficient estimation of the size of a dynamically changing cluster in an IoT network is critical in its nominal operation. Most previous estimation schemes worked with relatively smaller frame size and large number of rounds. Here we propose a new estimator named "Gaussian Estimator of Active Nodes" (GEAN), that works with large enough frame size under which testing statistics is well approximated as a Gaussian variable, thereby requiring less number of frames, and thus less total number of channel slots to attain a desired accuracy in estimation. More specifically, the selection of the frame size is done according to Triangular Array Central Limit Theorem which also enables us to quantify the approximation error. Larger frame size helps the statistical average to converge faster to the ensemble mean of the estimator and the quantification of the approximation error helps to determine the number of rounds to keep up with the accuracy requirements. We present the analysis of our scheme under two different channel models, i.e. $\{0,1\}$ and $\{0,1,e\}$, whereas all previous schemes worked only under the $\{0,1\}$ channel model. The overall performance of GEAN is better than the previously proposed schemes considering the number of slots required for estimation to achieve a given level of estimation accuracy.

Submitted 26 January, 2020; originally announced January 2020.

Comments: 15 pages, 11 figures. arXiv admin note: text overlap with arXiv:1701.05952

arXiv:1903.02189 [pdf, other]

Grid-Connected Emergency Back-Up Power Supply

Authors: Dhiman Chowdhury, Mohammad Sharif Miah, Md. Feroz Hossain, Md. Mostafijur Rahman, Md. Marzan Hossain, Md. Nazim Uddin Sheikh, Md. Mehedi Hasan, Uzzal Sarker, Abu Shahir Md. Khalid Hasan

Abstract: This paper documents a design and modelling of a grid-connected emergency back-up power supply for medium power applications. There are a rectifier-link boost derived battery charging circuit and a 4-switch push-pull power inverter circuit which are controlled by pulse width modulation (PWM) signals. This paper presents a state averaging model and Laplace domain transfer function of the charging circuit and a switching converter model of the power inverter circuit. A changeover relay based transfer switch controls the power flow towards the utility loads. During off-grid situations, loads are fed power by the proposed inverter circuit and during on-grid situations, battery is charged by an ac-link rectifier-fed boost converter. There is a relay switching circuit to control the charging phenomenon of the battery. The proposed design has been simulated in PLECS and the simulation results corroborate the reliability of the presented framework.

Submitted 6 March, 2019; originally announced March 2019.

arXiv:1808.10086 [pdf, other]

Artifacts Detection and Error Block Analysis from Broadcasted Videos

Authors: Md Mehedi Hasan, Tasneem Rahman, Kiok Ahn, Oksam Chae

Abstract: With the advancement of IPTV and HDTV technology, previously subtle errors in videos are now becoming more prominent because of structure-oriented and compression-based artifacts. In this paper, we focus on the development of a real-time video quality check system. Lightweight edge gradient magnitude information is incorporated to acquire the statistical information, and the distorted frames are then estimated based on the characteristics of their surrounding frames. Then we apply the prominent texture patterns to classify them into different block errors and analyze them not only in video error detection applications but also in error concealment, restoration, and retrieval. Finally, performance evaluation through experiments on prominent datasets and broadcasted videos shows that the proposed algorithm is very efficient at detecting errors for video broadcast and surveillance applications in terms of computation time and analysis of distorted frames.

Submitted 29 August, 2018; originally announced August 2018.

Showing 1–10 of 10 results for author: Hasan, M M