Search | arXiv e-print repository

Showing 1–27 of 27 results for author: Buduru, A B

  1. arXiv:2407.06110  [pdf, other]

    cs.CV

    FGA: Fourier-Guided Attention Network for Crowd Count Estimation

    Authors: Yashwardhan Chaudhuri, Ankit Kumar, Arun Balaji Buduru, Adel Alshamrani

    Abstract: Crowd counting is gaining societal relevance, particularly in domains of Urban Planning, Crowd Management, and Public Safety. This paper introduces Fourier-guided attention (FGA), a novel attention mechanism for crowd count estimation designed to address the inefficient full-scale global pattern capture in existing works on convolution-based attention networks. FGA efficiently captures multi-scale…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted to IJCNN'24

  2. arXiv:2406.10448  [pdf, other]

    eess.AS cs.SD

    AVR: Synergizing Foundation Models for Audio-Visual Humor Detection

    Authors: Sarthak Sharma, Orchid Chetia Phukan, Drishti Singh, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this work, we present AVR, an application for audio-visual humor detection. While humor detection has traditionally centered around textual analysis, recent advancements have spotlighted multimodal approaches. However, these methods lean on textual cues as a modality, necessitating the use of ASR systems for transcribing the audio data. This heavy reliance on ASR accuracy can pose challenges in re…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024 Show & Tell Demonstrations

  3. arXiv:2406.09156  [pdf, other]

    cs.LG cs.CV cs.MM cs.SD eess.AS

    Towards Multilingual Audio-Visual Question Answering

    Authors: Orchid Chetia Phukan, Priyabrata Mallick, Swarup Ranjan Behera, Aalekhya Satya Narayani, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this paper, we work towards extending Audio-Visual Question Answering (AVQA) to multilingual settings. Existing AVQA research has predominantly revolved around English and replicating it for addressing AVQA in other languages requires a substantial allocation of resources. As a scalable solution, we leverage machine translation and present two multilingual AVQA datasets for eight languages crea…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

    MSC Class: 68T45

  4. arXiv:2406.06798  [pdf, other]

    eess.AS cs.SD

    The Reasonable Effectiveness of Speaker Embeddings for Violence Detection

    Authors: Sarthak Jain, Orchid Chetia Phukan, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this paper, we focus on audio violence detection (AVD). AVD is necessary for several reasons, especially in the context of maintaining safety, preventing harm, and ensuring security in various environments. This calls for accurate AVD systems. Like many related applications in audio processing, the most common approach for improving the performance would be by leveraging self-supervised (SSL)…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 24 Show & Tell Demonstrations

  5. arXiv:2406.06781  [pdf, other]

    eess.AS cs.SD

    PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation

    Authors: Devyani Koshal, Orchid Chetia Phukan, Sarthak Jain, Arun Balaji Buduru, Rajesh Sharma

    Abstract: Emotion Recognition (ER), Gender Recognition (GR), and Age Estimation (AE) constitute paralinguistic tasks that rely not on the spoken content but primarily on speech characteristics such as pitch and tone. While previous research has made significant strides in developing models for each task individually, there has been comparatively less emphasis on concurrently learning these tasks, despite th…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024 Show & Tell Demonstrations

  6. arXiv:2406.06774  [pdf, other]

    eess.AS cs.SD

    ComFeAT: Combination of Neural and Spectral Features for Improved Depression Detection

    Authors: Orchid Chetia Phukan, Sarthak Jain, Shubham Singh, Muskaan Singh, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this work, we focus on the detection of depression through speech analysis. Previous research has widely explored features extracted from pre-trained models (PTMs) primarily trained for paralinguistic tasks. Although these features have led to sufficient advances in speech-based depression detection, their performance declines in real-world settings. To address this, in this paper, we introduce…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024 Show & Tell Demonstrations

  7. arXiv:2406.03205  [pdf, other]

    eess.AS

    CoLLAB: A Collaborative Approach for Multilingual Abuse Detection

    Authors: Orchid Chetia Phukan, Yashasvi Chaurasia, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this study, we investigate representations from paralingual Pre-Trained model (PTM) for Audio Abuse Detection (AAD), which has not been explored for AAD. Our results demonstrate their superiority compared to other PTM representations on the ADIMA benchmark. Furthermore, combining PTM representations enhances AAD performance. Despite these improvements, challenges with cross-lingual generalizabi…

    Submitted 5 June, 2024; originally announced June 2024.

  8. arXiv:2405.06049  [pdf, other]

    cs.CV cs.CR cs.LG

    BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order Optimization

    Authors: Satyadwyoom Kumar, Saurabh Gupta, Arun Balaji Buduru

    Abstract: Deep Learning has become popular due to its vast applications in almost all domains. However, models trained using deep learning are prone to failure for adversarial samples and carry a considerable risk in sensitive applications. Most of these adversarial attack strategies assume that the adversary has access to the training data, the model parameters, and the input during deployment, hence, focu…

    Submitted 9 May, 2024; originally announced May 2024.

  9. arXiv:2404.00827  [pdf, other]

    eess.SP

    SONIC: Synergizing VisiON Foundation Models for Stress RecogNItion from ECG signals

    Authors: Orchid Chetia Phukan, Ankita Das, Arun Balaji Buduru, Rajesh Sharma

    Abstract: Stress recognition through physiological signals such as Electrocardiogram (ECG) signals has garnered significant attention. Traditionally, research in this field predominantly focused on utilizing handcrafted features or raw signals as inputs for learning algorithms. However, there is now a burgeoning interest within the community in leveraging large-scale vision foundation models (VFMs) like Res…

    Submitted 31 March, 2024; originally announced April 2024.

  10. arXiv:2404.00809  [pdf, other]

    eess.AS

    Heterogeneity over Homogeneity: Investigating Multilingual Speech Pre-Trained Models for Detecting Audio Deepfake

    Authors: Orchid Chetia Phukan, Gautam Siddharth Kashyap, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this work, we investigate multilingual speech Pre-Trained models (PTMs) for Audio deepfake detection (ADD). We hypothesize that multilingual PTMs trained on large-scale diverse multilingual data gain knowledge about diverse pitches, accents, and tones during their pre-training phase, making them more robust to variations. As a result, they will be more effective for detecting audio deepfake…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted to NAACL (Findings) 2024

  11. arXiv:2402.01579  [pdf, other]

    eess.AS cs.CL cs.SD

    Are Paralinguistic Representations all that is needed for Speech Emotion Recognition?

    Authors: Orchid Chetia Phukan, Gautam Siddharth Kashyap, Arun Balaji Buduru, Rajesh Sharma

    Abstract: Availability of representations from pre-trained models (PTMs) has facilitated substantial progress in speech emotion recognition (SER). Particularly, representations from PTMs trained for paralinguistic speech processing have shown state-of-the-art (SOTA) performance for SER. However, such paralinguistic PTM representations haven't been evaluated for SER in linguistic environments other than Engl…

    Submitted 11 July, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted to INTERSPEECH 24

  12. arXiv:2401.05968  [pdf, other]

    cs.CV

    A Lightweight Feature Fusion Architecture For Resource-Constrained Crowd Counting

    Authors: Yashwardhan Chaudhuri, Ankit Kumar, Orchid Chetia Phukan, Arun Balaji Buduru

    Abstract: Crowd counting finds direct applications in real-world situations, making computational efficiency and performance crucial. However, most of the previous methods rely on a heavy backbone and a complex downstream architecture that restricts the deployment. To address this challenge and enhance the versatility of crowd-counting models, we introduce two lightweight models. These models maintain the s…

    Submitted 11 January, 2024; originally announced January 2024.

  13. arXiv:2307.09938  [pdf, other]

    astro-ph.EP physics.space-ph

    Tracking an Untracked Space Debris After an Inelastic Collision Using Physics Informed Neural Network

    Authors: Harsha M., Gurpreet Singh, Vinod Kumar, Arun Balaji Buduru, Sanat K. Biswas

    Abstract: With the sustained rise in satellite deployment in Low Earth Orbits, the collision risk from untracked space debris is also increasing. Often small-sized space debris (below 10 cm) are hard to track using the existing state-of-the-art methods. However, knowing such space debris' trajectory is crucial to avoid future collisions. We present a Physics Informed Neural Network (PINN)-based approach f…

    Submitted 25 January, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: 23 pages, 18 figures (consolidated into 13 figures by using sub-figures), accepted as a journal paper by Nature Scientific Report

  14. arXiv:2306.10338  [pdf, other]

    cs.CY

    Trauma lurking in the shadows: A Reddit case study of mental health issues in online posts about Childhood Sexual Abuse

    Authors: Orchid Chetia Phukan, Rajesh Sharma, Arun Balaji Buduru

    Abstract: Childhood Sexual Abuse (CSA) is a menace to society and has long-lasting effects on the mental health of the survivors. From time to time CSA survivors are haunted by various mental health issues in their lifetime. Proper care and attention towards CSA survivors facing mental health issues can drastically improve the mental health conditions of CSA survivors. Previous works leveraging online socia…

    Submitted 17 June, 2023; originally announced June 2023.

  15. arXiv:2305.18640  [pdf, other]

    eess.AS

    Transforming the Embeddings: A Lightweight Technique for Speech Emotion Recognition Tasks

    Authors: Orchid Chetia Phukan, Arun Balaji Buduru, Rajesh Sharma

    Abstract: Speech emotion recognition (SER) is a field that has drawn a lot of attention due to its applications in diverse fields. A current trend in methods used for SER is to leverage embeddings from pre-trained models (PTMs) as input features to downstream models. However, the use of embeddings from speaker recognition PTMs hasn't garnered much focus in comparison to other PTM embeddings. To fill this ga…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: Accepted to Interspeech 2023

  16. arXiv:2304.11472  [pdf, other]

    eess.AS cs.AI cs.LG

    A Comparative Study of Pre-trained Speech and Audio Embeddings for Speech Emotion Recognition

    Authors: Orchid Chetia Phukan, Arun Balaji Buduru, Rajesh Sharma

    Abstract: Pre-trained models (PTMs) have shown great promise in the speech and audio domain. Embeddings leveraged from these models serve as inputs for learning algorithms with applications in various downstream tasks. One such crucial task is Speech Emotion Recognition (SER) which has a wide range of applications, including dynamic analysis of customer calls, mental health assessment, and personalized lang…

    Submitted 22 April, 2023; originally announced April 2023.

  17. arXiv:2110.15923  [pdf, other]

    cs.SI

    Efficient Representation of Interaction Patterns with Hyperbolic Hierarchical Clustering for Classification of Users on Twitter

    Authors: Tanvi Karandikar, Avinash Prabhu, Avinash Tulasi, Arun Balaji Buduru, Ponnurangam Kumaraguru

    Abstract: Social media platforms play an important role in democratic processes. During the 2019 General Elections of India, political parties and politicians widely used Twitter to share their ideals, advocate their agenda and gain popularity. Twitter served as a ground for journalists, politicians and voters to interact. The organic nature of these interactions can be upended by malicious accounts on Twit…

    Submitted 1 November, 2021; v1 submitted 29 October, 2021; originally announced October 2021.

  18. arXiv:2107.05104  [pdf, other]

    cs.SI cs.HC

    "A Virus Has No Religion": Analyzing Islamophobia on Twitter During the COVID-19 Outbreak

    Authors: Mohit Chandra, Manvith Reddy, Shradha Sehgal, Saurabh Gupta, Arun Balaji Buduru, Ponnurangam Kumaraguru

    Abstract: The COVID-19 pandemic has disrupted people's lives driving them to act in fear, anxiety, and anger, leading to worldwide racist events in the physical world and online social networks. Though there are works focusing on Sinophobia during the COVID-19 pandemic, less attention has been given to the recent surge in Islamophobia. A large number of positive cases arising out of the religious Tablighi J…

    Submitted 25 July, 2021; v1 submitted 11 July, 2021; originally announced July 2021.

  19. arXiv:2009.13854  [pdf, other]

    cs.AI eess.SY

    Multi-objective Reinforcement Learning based approach for User-Centric Power Optimization in Smart Home Environments

    Authors: Saurabh Gupta, Siddhant Bhambri, Karan Dhingra, Arun Balaji Buduru, Ponnurangam Kumaraguru

    Abstract: Smart homes require every device inside them to be connected with each other at all times, which leads to a lot of power wastage on a daily basis. As the devices inside a smart home increase, it becomes difficult for the user to control or operate every individual device optimally. Therefore, users generally rely on power management systems for such optimization but often are not satisfied with th…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: 8 pages, 7 figures, Accepted at IEEE SMDS'2020

  20. arXiv:2009.13839  [pdf, other]

    cs.CV cs.AI cs.CR

    imdpGAN: Generating Private and Specific Data with Generative Adversarial Networks

    Authors: Saurabh Gupta, Arun Balaji Buduru, Ponnurangam Kumaraguru

    Abstract: Generative Adversarial Network (GAN) and its variants have shown promising results in generating synthetic data. However, the issues with GANs are: (i) the learning happens around the training samples and the model often ends up remembering them, consequently, compromising the privacy of individual samples - this becomes a major concern when GANs are applied to training data including personally i…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: 9 pages, 7 figures, Accepted at IEEE TPS'2020

  21. arXiv:2007.06078  [pdf, other]

    eess.AS cs.CL cs.SD

    Fine-grained Language Identification with Multilingual CapsNet Model

    Authors: Mudit Verma, Arun Balaji Buduru

    Abstract: Due to a drastic improvement in the quality of internet services worldwide, there is an explosion of multilingual content generation and consumption. This is especially prevalent in countries with large multilingual audiences, who are increasingly consuming media outside their linguistic familiarity/preference. Hence, there is an increasing need for real-time and fine-grained content analysis servi…

    Submitted 12 July, 2020; originally announced July 2020.

    Comments: 5 pages, 6 figures

  22. arXiv:1912.03298  [pdf, other]

    cs.AI

    Making Smart Homes Smarter: Optimizing Energy Consumption with Human in the Loop

    Authors: Mudit Verma, Siddhant Bhambri, Saurabh Gupta, Arun Balaji Buduru

    Abstract: Rapid advancements in the Internet of Things (IoT) have facilitated more efficient deployment of smart environment solutions for specific user requirements. With the increase in the number of IoT devices, it has become difficult for the user to control or operate every individual smart device to achieve some desired goal like optimized power consumption, scheduled appliance running time, etc. F…

    Submitted 4 May, 2020; v1 submitted 6 December, 2019; originally announced December 2019.

  23. arXiv:1912.01667  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    A Survey of Black-Box Adversarial Attacks on Computer Vision Models

    Authors: Siddhant Bhambri, Sumanyu Muku, Avinash Tulasi, Arun Balaji Buduru

    Abstract: Machine learning has seen tremendous advances in the past few years, which has led to deep learning models being deployed in varied applications of day-to-day life. Attacks on such models using perturbations, particularly in real-life scenarios, pose a severe challenge to their applicability, pushing research in a direction that aims to enhance the robustness of these models. After the intro…

    Submitted 7 February, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 33 pages

  24. arXiv:1909.10012  [pdf, other]

    cs.CL

    Is change the only constant? Profile change perspective on #LokSabhaElections2019

    Authors: Kumari Neha, Shashank Srikanth, Sonali Singhal, Shwetanshu Singh, Arun Balaji Buduru, Ponnurangam Kumaraguru

    Abstract: Users on Twitter are identified with the help of their profile attributes, which consist of username, display name, and profile image, to name a few. The profile attributes that users adopt can reflect their interests, beliefs, or thematic inclinations. Literature has proposed the implications and significance of profile attribute change for a random population of users. However, the use of profile attr…

    Submitted 22 September, 2019; originally announced September 2019.

    Comments: 8 pages, 11 figures, 4 tables

  25. arXiv:1909.07151  [pdf, other]

    cs.SI cs.IR

    Hashtags are (not) judgemental: The untold story of Lok Sabha elections 2019

    Authors: Saurabh Gupta, Asmit Kumar Singh, Arun Balaji Buduru, Ponnurangam Kumaraguru

    Abstract: Hashtags in online social media have become a way for users to build communities around topics, promote opinions, and categorize messages. In the political context, hashtags on Twitter are used by users to campaign for their parties, spread news, or to get followers and get a general idea by following a discussion built around a hashtag. In the past, researchers have studied certain types and spec…

    Submitted 28 April, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

  26. arXiv:1909.07144  [pdf, other]

    cs.SI

    Catching up with trends: The changing landscape of political discussions on twitter in 2014 and 2019

    Authors: Avinash Tulasi, Kanay Gupta, Omkar Gurjar, Sathvik Sanjeev Buggana, Paras Mehan, Arun Balaji Buduru, Ponnurangam Kumaraguru

    Abstract: The advent of 4G increased the usage of internet in India, which took a huge number of discussions online. Online Social Networks (OSNs) are the center of these discussions. During elections, political discussions constitute a significant portion of the trending topics on these networks. Politicians and political parties catch up with these trends, and social media then becomes a part of their pub…

    Submitted 18 September, 2019; v1 submitted 16 September, 2019; originally announced September 2019.

  27. arXiv:1810.11937  [pdf, other]

    cs.AI

    An approach to predictively securing critical cloud infrastructures through probabilistic modeling

    Authors: Satvik Jain, Arun Balaji Buduru, Anshuman Chhabra

    Abstract: Cloud infrastructures are being increasingly utilized in critical infrastructures such as banking/finance, transportation and utility management. Sophistication and resources used in recent security breaches including those on critical infrastructures show that attackers are no longer limited by monetary/computational constraints. In fact, they may be aided by entities with large financial and hum…

    Submitted 28 October, 2018; originally announced October 2018.