Search | arXiv e-print repository

Showing 1–50 of 216 results for author: Gupta, G

  1. arXiv:2407.08179  [pdf, other]

    cs.AI cs.LG cs.LO

    CoGS: Causality Constrained Counterfactual Explanations using goal-directed ASP

    Authors: Sopam Dasgupta, Joaquín Arias, Elmer Salazar, Gopal Gupta

    Abstract: Machine learning models are increasingly used in areas such as loan approvals and hiring, yet they often function as black boxes, obscuring their decision-making processes. Transparency is crucial, and individuals need explanations to understand decisions, especially for the ones not desired by the user. Ethical and legal considerations require informing individuals of changes in input attribute v…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2406.14901  [pdf, other]

    cs.IR

    IDentity with Locality: An ideal hash for gene sequence search

    Authors: Aditya Desai, Gaurav Gupta, Tianyi Zhang, Anshumali Shrivastava

    Abstract: Gene sequence search is a fundamental operation in computational genomics. Due to the petabyte scale of genome archives, most gene search systems now use hashing-based data structures such as Bloom Filters (BF). The state-of-the-art systems such as Compact bit-slicing signature index (COBS) and Repeated And Merged Bloom filters (RAMBO) use BF with Random Hash (RH) functions for gene representation…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 13 pages

  3. arXiv:2405.20622  [pdf, other]

    cs.LG

    Superfast Selection for Decision Tree Algorithms

    Authors: Huaduo Wang, Gopal Gupta

    Abstract: We present a novel and systematic method, called Superfast Selection, for selecting the "optimal split" for decision tree and feature selection algorithms over tabular data. The method speeds up split selection on a single feature by lowering the time complexity, from O(MN) (using the standard selection methods) to O(M), where M represents the number of input examples and N the number of unique va…

    Submitted 3 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  4. arXiv:2405.15956  [pdf, other]

    cs.AI cs.LG cs.LO

    CFGs: Causality Constrained Counterfactual Explanations using goal-directed ASP

    Authors: Sopam Dasgupta, Joaquín Arias, Elmer Salazar, Gopal Gupta

    Abstract: Machine learning models that automate decision-making are increasingly used in consequential areas such as loan approvals, pretrial bail approval, and hiring. Unfortunately, most of these models are black boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might also desire expl…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.04382

  5. arXiv:2405.15886  [pdf, other]

    cs.CV

    A Neurosymbolic Framework for Bias Correction in CNNs

    Authors: Parth Padalkar, Natalia Ślusarz, Ekaterina Komendantskaya, Gopal Gupta

    Abstract: Recent efforts in interpreting Convolutional Neural Networks (CNNs) focus on translating the activation of CNN filters into stratified Answer Set Programming (ASP) rule-sets. The CNN filters are known to capture high-level image concepts, thus the predicates in the rule-set are mapped to the concept that their corresponding filter represents. Hence, the rule-set effectively exemplifies the decisio…

    Submitted 24 May, 2024; originally announced May 2024.

  6. arXiv:2405.06712  [pdf, other]

    cs.CL cs.AI

    Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses

    Authors: Gaurav Kumar Gupta, Aditi Singh, Sijo Valayakkad Manikandan, Abul Ehtesham

    Abstract: The recent swift development of LLMs like GPT-4, Gemini, and GPT-3.5 offers a transformative opportunity in medicine and healthcare, especially in digital diagnostics. This study evaluates each model's diagnostic abilities by interpreting a user's symptoms and determining diagnoses that fit well with common illnesses, and it demonstrates how each of these models could significantly increase diagnostic…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 14 pages, 4 figures

  7. arXiv:2405.05852  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.RO stat.ML

    Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

    Authors: Gunshi Gupta, Karmesh Yadav, Yarin Gal, Dhruv Batra, Zsolt Kira, Cong Lu, Tim G. J. Rudner

    Abstract: Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs. Such capabilities are difficult to learn solely from task-specific data. This has led to the emergence of pre-trained vision-language models as a tool for transferring representations learned from internet-scale data to downstream tasks and new domains. However, commonly used…

    Submitted 9 May, 2024; originally announced May 2024.

  8. arXiv:2405.03637  [pdf, other]

    cs.LG

    Collage: Light-Weight Low-Precision Strategy for LLM Training

    Authors: Tao Yu, Gaurav Gupta, Karthick Gopalswamy, Amith Mamidala, Hao Zhou, Jeffrey Huynh, Youngsuk Park, Ron Diamant, Anoop Deoras, Luke Huan

    Abstract: Large models training is plagued by the intense compute cost and limited hardware memory. A practical solution is low-precision representation but is troubled by loss in numerical accuracy and unstable training rendering the model less useful. We argue that low-precision floating points can perform well provided the error is properly compensated at the critical locations in the training process. W…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  9. arXiv:2404.15120  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci physics.app-ph physics.optics

    Motion of 2D exciton in momentum space leads to pseudospin distribution narrowing on the Bloch Sphere

    Authors: Garima Gupta, Kenji Watanabe, Takashi Taniguchi, Kausik Majumdar

    Abstract: Motional narrowing implies narrowing induced by motion, for example, in nuclear resonance, the thermally induced random motion of the nuclei in an inhomogeneous environment leads to counter-intuitive narrowing of the resonance line. Similarly, the excitons in monolayer semiconductors experience magnetic inhomogeneity: the electron-hole spin-exchange interaction manifests as an in-plane pseudo-magn…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted in Nano Letters

    Journal ref: Nano Letters 24, 5413, 2024

  10. arXiv:2404.10630  [pdf, other]

    cs.CL cs.LG

    HLAT: High-quality Large Language Model Pre-trained on AWS Trainium

    Authors: Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan

    Abstract: Getting large language models (LLMs) to perform well on the downstream tasks requires pre-training over trillions of tokens. This typically demands a large number of powerful computational devices in addition to a stable distributed training framework to accelerate the training. The growing number of applications leveraging AI/ML had led to a scarcity of the expensive conventional accelerators (su…

    Submitted 16 April, 2024; originally announced April 2024.

  11. arXiv:2404.09403  [pdf, other]

    cs.LG

    Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning

    Authors: Xiongye Xiao, Gengshuo Liu, Gaurav Gupta, Defu Cao, Shixuan Li, Yaxing Li, Tianqing Fang, Mingxi Cheng, Paul Bogdan

    Abstract: Integrating and processing information from various sources or modalities are critical for obtaining a comprehensive and accurate perception of the real world in autonomous systems and cyber-physical systems. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of information bottleneck. Different from most tra…

    Submitted 22 April, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted by ICLR 2024. Camera Ready Version

  12. arXiv:2403.10642  [pdf, other]

    cs.LG math.NA

    Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs

    Authors: S. Chandra Mouli, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Andrew Stuart, Michael W. Mahoney, Yuyang Wang

    Abstract: Existing work in scientific machine learning (SciML) has shown that data-driven learning of solution operators can provide a fast approximate alternative to classical numerical partial differential equation (PDE) solvers. Of these, Neural Operators (NOs) have emerged as particularly promising. We observe that several uncertainty quantification (UQ) methods for NOs fail for test inputs that are eve…

    Submitted 12 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: ICML 2024

  13. arXiv:2403.05882  [pdf, other]

    cs.LG

    DiffRed: Dimensionality Reduction guided by stable rank

    Authors: Prarabdh Shukla, Gagan Raj Gupta, Kunal Dutta

    Abstract: In this work, we propose a novel dimensionality reduction technique, DiffRed, which first projects the data matrix, A, along first $k_1$ principal components and the residual matrix $A^{*}$ (left after subtracting its $k_1$-rank approximation) along $k_2$ Gaussian random vectors. We evaluate M1, the distortion of mean-squared pair-wise distance, and Stress, the normalized value of RMS of distortio…

    Submitted 9 March, 2024; originally announced March 2024.

  14. arXiv:2402.15968  [pdf, other]

    cs.LG cs.AI

    CoDream: Exchanging dreams instead of models for federated aggregation with heterogeneous models

    Authors: Abhishek Singh, Gauri Gupta, Ritvik Kapila, Yichuan Shi, Alex Dang, Sheshank Shankar, Mohammed Ehab, Ramesh Raskar

    Abstract: Federated Learning (FL) enables collaborative optimization of machine learning models across decentralized data by aggregating model parameters. Our approach extends this concept by aggregating "knowledge" derived from models, instead of model parameters. We present a novel framework called CoDream, where clients collaboratively optimize randomly initialized data using federated optimization in th…

    Submitted 27 February, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: 16 pages, 12 figures, 5 tables

  15. arXiv:2402.04382  [pdf, other]

    cs.AI

    Counterfactual Generation with Answer Set Programming

    Authors: Sopam Dasgupta, Farhad Shakerin, Joaquín Arias, Elmer Salazar, Gopal Gupta

    Abstract: Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail approval, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 16 Pages

  16. arXiv:2401.08588  [pdf]

    cs.CV

    Improved Pothole Detection Using YOLOv7 and ESRGAN

    Authors: Nirmal Kumar Rout, Gyanateet Dutta, Varun Sinha, Arghadeep Dey, Subhrangshu Mukherjee, Gopal Gupta

    Abstract: Potholes are common road hazards that cause damage to vehicles and pose a safety risk to drivers. The introduction of Convolutional Neural Networks (CNNs) is widely used in the industry for object detection based on Deep Learning methods and has achieved significant progress in hardware improvement and software implementations. In this paper, a unique better algorithm is proposed to warrant…

    Submitted 10 November, 2023; originally announced January 2024.

  17. arXiv:2401.04795  [pdf, other]

    cs.MA cs.LG cs.SI physics.soc-ph

    First 100 days of pandemic; an interplay of pharmaceutical, behavioral and digital interventions -- A study using agent based modeling

    Authors: Gauri Gupta, Ritvik Kapila, Ayush Chopra, Ramesh Raskar

    Abstract: Pandemics, notably the recent COVID-19 outbreak, have impacted both public health and the global economy. A profound understanding of disease progression and efficient response strategies is thus needed to prepare for potential future outbreaks. In this paper, we emphasize the potential of Agent-Based Models (ABM) in capturing complex infection dynamics and understanding the impact of intervention…

    Submitted 5 February, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: 12 pages, 12 figures, In Proc. of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024), Auckland, New Zealand, 2024

  18. arXiv:2312.17168  [pdf, other]

    cs.LG cs.AI

    Can Active Sampling Reduce Causal Confusion in Offline Reinforcement Learning?

    Authors: Gunshi Gupta, Tim G. J. Rudner, Rowan Thomas McAllister, Adrien Gaidon, Yarin Gal

    Abstract: Causal confusion is a phenomenon where an agent learns a policy that reflects imperfect spurious correlations in the data. Such a policy may falsely appear to be optimal during training if most of the training data contain such spurious correlations. This phenomenon is particularly pronounced in domains such as robotics, with potentially large gaps between the open- and closed-loop performance of…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 2nd Conference on Causal Learning and Reasoning (CLeaR 2023)

  19. arXiv:2312.02337  [pdf, other]

    cs.CL

    Measuring Distributional Shifts in Text: The Advantage of Language Model-Based Embeddings

    Authors: Gyandev Gupta, Bashir Rastegarpanah, Amalendu Iyer, Joshua Rubin, Krishnaram Kenthapadi

    Abstract: An essential part of monitoring machine learning models in production is measuring input and output data drift. In this paper, we present a system for measuring distributional shifts in natural language data and highlight and investigate the potential advantage of using large language models (LLMs) for this problem. Recent advancements in LLMs and their successful adoption in different domains ind…

    Submitted 4 December, 2023; originally announced December 2023.

  20. TiO2 multi-leg nanotubes for Surface-enhanced Raman scattering

    Authors: Harini S, Garima Gupta, Somnath C. Roy, Rambabu Yalavarthi

    Abstract: In the recent past, significant research efforts have been put forth to fabricate low-cost noble metal-free substrates for surface-enhanced Raman spectroscopy (SERS) applications. Here we propose semiconducting TiO2 multi-leg nanotubes (TiO2 MLNTs, with and without the gold nanoparticle coating) as SERS substrates. TiO2 MLNTs show unique multi-leg morphology compared to the conventional non-multi-…

    Submitted 6 November, 2023; originally announced November 2023.

    Journal ref: Journal of Physics D: Applied Physics, 2024-04-24

  21. arXiv:2311.02399  [pdf, ps, other]

    cs.LG cs.DC

    Entropy Aware Training for Fast and Accurate Distributed GNN

    Authors: Dhruv Deshmukh, Gagan Raj Gupta, Manisha Chawla, Vishwesh Jatala, Anirban Haldar

    Abstract: Several distributed frameworks have been developed to scale Graph Neural Networks (GNNs) on billion-size graphs. On several benchmarks, we observe that the graph partitions generated by these frameworks have heterogeneous data distributions and class imbalance, affecting convergence, and resulting in lower performance than centralized implementations. We holistically address these challenges and d…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: 8 pages, 3 figures, 5 tables, accepted at ICDM'23

    ACM Class: I.5.1; I.5.2

  22. arXiv:2311.00429  [pdf, other]

    eess.IV cs.LG

    Crop Disease Classification using Support Vector Machines with Green Chromatic Coordinate (GCC) and Attention based feature extraction for IoT based Smart Agricultural Applications

    Authors: Shashwat Jha, Vishvaditya Luhach, Gauri Shanker Gupta, Beependra Singh

    Abstract: Crops hold paramount significance as they serve as the primary provider of energy, nutrition, and medicinal benefits for the human population. Plant diseases, however, can negatively affect leaves during agricultural cultivation, resulting in significant losses in crop output and economic value. Therefore, it is crucial for farmers to identify crop diseases. However, this method frequently necessi…

    Submitted 6 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

  23. arXiv:2310.14497  [pdf, other]

    cs.AI

    Counterfactual Explanation Generation with s(CASP)

    Authors: Sopam Dasgupta, Farhad Shakerin, Joaquín Arias, Elmer Salazar, Gopal Gupta

    Abstract: Machine learning models that automate decision-making are increasingly being used in consequential areas such as loan approvals, pretrial bail, hiring, and many more. Unfortunately, most of these models are black-boxes, i.e., they are unable to reveal how they reach these prediction decisions. A need for transparency demands justification for such predictions. An affected individual might desire e…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: 18 Pages

  24. arXiv:2310.13073  [pdf, other]

    cs.LG cs.CV

    Using Logic Programming and Kernel-Grouping for Improving Interpretability of Convolutional Neural Networks

    Authors: Parth Padalkar, Gopal Gupta

    Abstract: Within the realm of deep learning, the interpretability of Convolutional Neural Networks (CNNs), particularly in the context of image classification tasks, remains a formidable challenge. To this end we present a neurosymbolic framework, NeSyFOLD-G that generates a symbolic rule-set using the last layer kernels of the CNN to make its underlying knowledge interpretable. What makes NeSyFOLD-G differ…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:2301.12667

  25. arXiv:2309.16202  [pdf]

    cs.CL

    Marathi-English Code-mixed Text Generation

    Authors: Dhiraj Amin, Sharvari Govilkar, Sagar Kulkarni, Yash Shashikant Lalit, Arshi Ajaz Khwaja, Daries Xavier, Sahil Girijashankar Gupta

    Abstract: Code-mixing, the blending of linguistic elements from distinct languages to form meaningful sentences, is common in multilingual settings, yielding hybrid languages like Hinglish and Minglish. Marathi, India's third most spoken language, often integrates English for precision and formality. Developing code-mixed language systems, like Marathi-English (Minglish), faces resource constraints. This re…

    Submitted 28 September, 2023; originally announced September 2023.

  26. arXiv:2309.15877   

    cs.LG cs.AI

    Neuro-Inspired Hierarchical Multimodal Learning

    Authors: Xiongye Xiao, Gengshuo Liu, Gaurav Gupta, Defu Cao, Shixuan Li, Yaxing Li, Tianqing Fang, Mingxi Cheng, Paul Bogdan

    Abstract: Integrating and processing information from various sources or modalities are critical for obtaining a comprehensive and accurate perception of the real world. Drawing inspiration from neuroscience, we develop the Information-Theoretic Hierarchical Perception (ITHP) model, which utilizes the concept of information bottleneck. Distinct from most traditional fusion models that aim to incorporate all…

    Submitted 23 April, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: I am requesting the withdrawal of this submission due to an inadvertent duplication. The paper was submitted twice under different IDs, which was not intentional. The other submission (arXiv:2404.09403) contains the most updated and comprehensive version of the paper, and I would like to retain that as the sole version on the platform

  27. arXiv:2309.02398  [pdf, other]

    astro-ph.SR

    Exploring magnetic coupling of solar atmosphere through frequency modulations of 3-min slow magnetoacoustic waves

    Authors: Ananya Rawat, Girjesh Gupta

    Abstract: Coronal fan loops rooted in sunspot umbra show outward propagating waves with subsonic phase speed and period around 3-min. However, their source region in the lower atmosphere is still ambiguous. We performed multi-wavelength observations of a clean fan loop system rooted in sunspot observed by Interface Region Imaging Spectrograph (IRIS) and Solar Dynamics Observatory (SDO). We utilised less exp…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: This is a slightly extended version of the paper accepted for publication in Bulletin of Liège Royal Society of Sciences (proceedings of the third BINA workshop). arXiv admin note: text overlap with arXiv:2308.03490

  28. arXiv:2308.15014  [pdf, other]

    cs.IR

    CAPS: A Practical Partition Index for Filtered Similarity Search

    Authors: Gaurav Gupta, Jonah Yi, Benjamin Coleman, Chen Luo, Vihan Lakshman, Anshumali Shrivastava

    Abstract: With the surging popularity of approximate near-neighbor search (ANNS), driven by advances in neural representation learning, the ability to serve queries accompanied by a set of constraints has become an area of intense interest. While the community has recently proposed several algorithms for constrained ANNS, almost all of these methods focus on integration with graph-based indexes, the predomi…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: 14 pages

  29. arXiv:2308.08637  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci physics.app-ph physics.optics

    Polarized and narrow excitonic emission from graphene-capped monolayer WS$_2$ through resonant phonon relaxation

    Authors: Garima Gupta, Kausik Majumdar

    Abstract: The broadening and polarization of excitonic luminescence in monolayer TMDs largely suffer from inhomogeneity and temperature - an unresolved problem to date. In this work, through few-layer-graphene encapsulation of monolayer WS$_2$, we reduce the inter-excitonic energy separation, which then can have a narrow resonance with a specific phonon mode of our choice. The resulting single-step exciton…

    Submitted 16 August, 2023; originally announced August 2023.

    Journal ref: Phys. Rev. B 108, 075436, 2023

  30. arXiv:2308.08141  [pdf]

    cond-mat.mtrl-sci

    Investigation of charge carrier dynamics in Ti3C2Tx MXene for ultrafast photonics applications

    Authors: Ankita Rawat, Nitesh K. Chourasia, Saurabh K. Saini, Gaurav Rajput, Aditya Yadav, Ritesh Kumar Chourasia, Govind Gupta, P. K. Kulriya

    Abstract: The rapid advancement of nanomaterials has paved the way for various technological breakthroughs, and MXenes, in particular, have gained substantial attention due to their unique properties such as high conductivity, broad-spectrum absorption strength, and tunable band gap. This article presents the impact of the process parameters on the structural and optical properties of Ti3C2Tx MXene for appl…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: 21 pages, 6 figures

  31. Exploring source region of 3-min slow magnetoacoustic waves observed in coronal fan loops rooted in sunspot umbra

    Authors: Ananya Rawat, Girjesh R. Gupta

    Abstract: Sunspots host various oscillations and wave phenomena like umbral flashes, umbral oscillations, running penumbral waves, and coronal waves. All fan loops rooted in sunspot umbra constantly show a 3-min period propagating slow magnetoacoustic waves in the corona. However, their origin in the lower atmosphere is still unclear. In this work, we studied these oscillations in detail along a clean fan l…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: Accepted for publication in MNRAS

  32. arXiv:2307.03898   

    cs.CV eess.IV

    StyleGAN3: Generative Networks for Improving the Equivariance of Translation and Rotation

    Authors: Tianlei Zhu, Junqi Chen, Renzhe Zhu, Gaurav Gupta

    Abstract: StyleGAN can use style to affect facial posture and identity features, and noise to affect hair, wrinkles, skin color and other details. Among these, the outcomes of the picture processing will vary slightly between different versions of styleGAN. As a result, the comparison of performance differences between styleGAN2 and the two modified versions of styleGAN3 will be the main focus of this study…

    Submitted 5 February, 2024; v1 submitted 8 July, 2023; originally announced July 2023.

    Comments: But now we feel we haven't fully studied our work and have found some new great results. So after careful consideration, we're going to rework this manuscript and try to give a more accurate model

  33. arXiv:2306.01460  [pdf, other]

    cs.LG

    ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages

    Authors: Andrew Jesson, Chris Lu, Gunshi Gupta, Angelos Filos, Jakob Nicolaus Foerster, Yarin Gal

    Abstract: This paper introduces an effective and practical step toward approximate Bayesian inference in on-policy actor-critic deep reinforcement learning. This step manifests as three simple modifications to the Asynchronous Advantage Actor-Critic (A3C) algorithm: (1) applying a ReLU function to advantage estimates, (2) spectral normalization of actor-critic weights, and (3) incorporating dropout as a Bay…

    Submitted 24 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  34. arXiv:2305.18404  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    Conformal Prediction with Large Language Models for Multi-Choice Question Answering

    Authors: Bhawesh Kumar, Charlie Lu, Gauri Gupta, Anil Palepu, David Bellamy, Ramesh Raskar, Andrew Beam

    Abstract: As large language models continue to be widely developed, robust uncertainty quantification techniques will become crucial for their safe deployment in high-stakes scenarios. In this work, we explore how conformal prediction can be used to provide uncertainty quantification in language models for the specific task of multiple-choice question-answering. We find that the uncertainty estimates from c…

    Submitted 7 July, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Updated sections on prompt engineering. Expanded sections 4.1 and 4.2 and appendix. Included additional references. Work published at the ICML 2023 (Neural Conversational AI TEACH) workshop

  35. arXiv:2305.18225  [pdf, other]

    cs.DC cs.AI

    Locksynth: Deriving Synchronization Code for Concurrent Data Structures with ASP

    Authors: Sarat Chandra Varanasi, Neeraj Mittal, Gopal Gupta

    Abstract: We present Locksynth, a tool that automatically derives synchronization needed for destructive updates to concurrent data structures that involve a constant number of shared heap memory write operations. Locksynth serves as the implementation of our prior work on deriving abstract synchronization code. Designing concurrent data structures involves inferring correct synchronization code starting wi…

    Submitted 20 May, 2023; originally announced May 2023.

  36. arXiv:2305.15786  [pdf, other]

    cs.LG math.ST stat.ML

    Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting

    Authors: Hilaf Hasson, Danielle C. Maddix, Yuyang Wang, Gaurav Gupta, Youngsuk Park

    Abstract: Ensembling is among the most popular tools in machine learning (ML) due to its effectiveness in minimizing variance and thus improving generalization. Most ensembling methods for black-box base learners fall under the umbrella of "stacked generalization," namely training an ML algorithm that takes the inferences from the base learners as input. While stacking has been widely applied in practice, i…

    Submitted 28 August, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: ICML 2023

  37. arXiv:2304.03431  [pdf, other]

    cs.LG cs.AI

    Domain Generalization In Robust Invariant Representation

    Authors: Gauri Gupta, Ritvik Kapila, Keshav Gupta, Ramesh Raskar

    Abstract: Unsupervised approaches for learning representations invariant to common transformations are used quite often for object recognition. Learning invariances makes models more robust and practical to use in real-world scenarios. Since data transformations that do not change the intrinsic properties of the object cause the majority of the complexity in recognition tasks, models that are invariant to t…

    Submitted 24 February, 2024; v1 submitted 6 April, 2023; originally announced April 2023.

    Comments: 7 pages, 5 figures, ICLR 2023 workshop

  38. arXiv:2303.10624  [pdf, other]

    cs.LG cs.DC

    PFSL: Personalized & Fair Split Learning with Data & Label Privacy for thin clients

    Authors: Manas Wadhwa, Gagan Raj Gupta, Ashutosh Sahu, Rahul Saini, Vidhi Mittal

    Abstract: The traditional framework of federated learning (FL) requires each client to re-train their models in every iteration, making it infeasible for resource-constrained mobile devices to train deep-learning (DL) models. Split learning (SL) provides an alternative by using a centralized server to offload the computation of activations and gradients for a subset of the model but suffers from problems of…

    Submitted 19 March, 2023; originally announced March 2023.

    Comments: To be published in: The 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing. Granted: Open Research Objects (ORO) and Research Objects Reviewed (ROR) badges. See https://www.niso.org/publications/rp-31-2021-badging for definitions of the badges. Code available at: https://github.com/mnswdhw/PFSL

  39. arXiv:2303.08941  [pdf, other]

    cs.AI cs.LO

    Automated Interactive Domain-Specific Conversational Agents that Understand Human Dialogs

    Authors: Yankai Zeng, Abhiramon Rajasekharan, Parth Padalkar, Kinjal Basu, Joaquín Arias, Gopal Gupta

    Abstract: Achieving human-like communication with machines remains a classic, challenging topic in the field of Knowledge Representation and Reasoning and Natural Language Processing. These Large Language Models (LLMs) rely on pattern-matching rather than a true understanding of the semantic meaning of a sentence. As a result, they may generate incorrect responses. To generate an assuredly correct response,…

    Submitted 17 March, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

  40. arXiv:2303.07537  [pdf, other]

    cs.LG q-bio.QM

    Fractional dynamics foster deep learning of COPD stage prediction

    Authors: Chenzhong Yin, Mihai Udrescu, Gaurav Gupta, Mingxi Cheng, Andrei Lihu, Lucretia Udrescu, Paul Bogdan, David M Mannino, Stefan Mihaicuta

    Abstract: Chronic obstructive pulmonary disease (COPD) is one of the leading causes of death worldwide. Current COPD diagnosis (i.e., spirometry) could be unreliable because the test depends on an adequate effort from the tester and testee. Moreover, the early diagnosis of COPD is challenging. We address COPD detection by constructing two novel physiological signals datasets (4432 records from 54 patients i…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Published in Advanced Science

  41. arXiv:2303.02304  [pdf, other]

    cs.LG

    Coupled Multiwavelet Neural Operator Learning for Coupled Partial Differential Equations

    Authors: Xiongye Xiao, Defu Cao, Ruochen Yang, Gaurav Gupta, Gengshuo Liu, Chenzhong Yin, Radu Balan, Paul Bogdan

    Abstract: Coupled partial differential equations (PDEs) are key tasks in modeling the complex dynamics of many physical processes. Recently, neural operators have shown the ability to solve PDEs by learning the integral kernel directly in Fourier/Wavelet space, so the difficulty for solving the coupled PDEs depends on dealing with the coupled mappings between the functions. Towards this end, we propose a \t…

    Submitted 8 December, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted to ICLR 2023

  42. arXiv:2302.11002  [pdf, other]

    cs.LG math.AP math.NA

    Learning Physical Models that Can Respect Conservation Laws

    Authors: Derek Hansen, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Michael W. Mahoney

    Abstract: Recent work in scientific machine learning (SciML) has focused on incorporating partial differential equation (PDE) information into the learning process. Much of this work has focused on relatively "easy" PDE operators (e.g., elliptic and parabolic), with less emphasis on relatively "hard" PDE operators (e.g., hyperbolic). Within numerical PDEs, the latter problem class requires control of a type…

    Submitted 10 October, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: ICML 2023, Physica D: Nonlinear Phenomena, Accepted

    Journal ref: Physica D: Nonlinear Phenomena, 457 (2024) 133952

  43. Reliable Natural Language Understanding with Large Language Models and Answer Set Programming

    Authors: Abhiramon Rajasekharan, Yankai Zeng, Parth Padalkar, Gopal Gupta

    Abstract: Humans understand language by extracting information (meaning) from sentences, combining it with existing commonsense knowledge, and then performing reasoning to draw conclusions. While large language models (LLMs) such as GPT-3 and ChatGPT are able to leverage patterns in the text to solve a variety of NLP tasks, they fall short in problems that require reasoning. They also cannot reliably explai…

    Submitted 30 August, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: In Proceedings ICLP 2023, arXiv:2308.14898

    Journal ref: EPTCS 385, 2023, pp. 274-287

  44. arXiv:2301.12667  [pdf, other]

    cs.LG cs.AI cs.CV

    NeSyFOLD: Neurosymbolic Framework for Interpretable Image Classification

    Authors: Parth Padalkar, Huaduo Wang, Gopal Gupta

    Abstract: Deep learning models such as CNNs have surpassed human performance in computer vision tasks such as image classification. However, despite their sophistication, these models lack interpretability which can lead to biased outcomes reflecting existing prejudices in the data. We aim to make predictions made by a CNN interpretable. Hence, we present a novel framework called NeSyFOLD to create a neuros…

    Submitted 20 August, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  45. arXiv:2212.10772  [pdf, other]

    cs.CV

    Low-Light Image and Video Enhancement: A Comprehensive Survey and Beyond

    Authors: Shen Zheng, Yiling Ma, Jinqian Pan, Changjie Lu, Gaurav Gupta

    Abstract: This paper presents a comprehensive survey of low-light image and video enhancement, addressing two primary challenges in the field. The first challenge is the prevalence of mixed over-/under-exposed images, which are not adequately addressed by existing methods. In response, this work introduces two enhanced variants of the SICE dataset: SICE_Grad and SICE_Mix, designed to better represent these…

    Submitted 1 January, 2024; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: 21 pages, 10 tables, and 17 figures

  46. arXiv:2212.08151  [pdf, other]

    cs.LG

    First De-Trend then Attend: Rethinking Attention for Time-Series Forecasting

    Authors: Xiyuan Zhang, Xiaoyong Jin, Karthick Gopalswamy, Gaurav Gupta, Youngsuk Park, Xingjian Shi, Hao Wang, Danielle C. Maddix, Yuyang Wang

    Abstract: Transformer-based models have gained large popularity and demonstrated promising results in long-term time-series forecasting in recent years. In addition to learning attention in time domain, recent works also explore learning attention in frequency domains (e.g., Fourier domain, wavelet domain), given that seasonal patterns can be better captured in these domains. In this work, we seek to unders…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: NeurIPS 2022 All Things Attention Workshop

  47. arXiv:2212.07477  [pdf, other]

    cs.LG math.AP math.OA

    Guiding continuous operator learning through Physics-based boundary constraints

    Authors: Nadim Saad, Gaurav Gupta, Shima Alizadeh, Danielle C. Maddix

    Abstract: Boundary conditions (BCs) are important groups of physics-enforced constraints that are necessary for solutions of Partial Differential Equations (PDEs) to satisfy at specific spatial locations. These constraints carry important physical meaning, and guarantee the existence and the uniqueness of the PDE solution. Current neural-network based approaches that aim to solve PDEs rely only on training…

    Submitted 2 March, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: Nadim and Gaurav contributed equally to this work. 31 pages, 7 figures, 16 tables

    Journal ref: ICLR 2023

  48. arXiv:2211.09855  [pdf, other]

    cs.CL

    ProtSi: Prototypical Siamese Network with Data Augmentation for Few-Shot Subjective Answer Evaluation

    Authors: Yining Lu, Jingxi Qiu, Gaurav Gupta

    Abstract: Subjective answer evaluation is a time-consuming and tedious task, and the quality of the evaluation is heavily influenced by a variety of subjective personal characteristics. Instead, machine evaluation can effectively assist educators in saving time while also ensuring that evaluations are fair and realistic. However, most existing methods using regular machine learning and natural language proc…

    Submitted 17 November, 2022; originally announced November 2022.

  49. arXiv:2209.09408  [pdf, other]

    cs.LG eess.IV

    Deep learning at the edge enables real-time streaming ptychographic imaging

    Authors: Anakha V Babu, Tao Zhou, Saugat Kandel, Tekin Bicer, Zhengchun Liu, William Judge, Daniel J. Ching, Yi Jiang, Sinisa Veseli, Steven Henke, Ryan Chard, Yudong Yao, Ekaterina Sirazitdinova, Geetika Gupta, Martin V. Holt, Ian T. Foster, Antonino Miceli, Mathew J. Cherukara

    Abstract: Coherent microscopy techniques provide an unparalleled multi-scale view of materials across scientific and technological fields, from structural materials to quantum devices, from integrated circuits to biological cells. Driven by the construction of brighter sources and high-rate detectors, coherent X-ray microscopy methods like ptychography are poised to revolutionize nanoscale materials charact…

    Submitted 19 September, 2022; originally announced September 2022.

  50. arXiv:2208.10488  [pdf]

    cs.HC cs.IR cs.LG

    Friendliness Of Stack Overflow Towards Newbies

    Authors: Aneesh Tickoo, Shweta Chauhan, Gagan Raj Gupta

    Abstract: In today's modern digital world, we have a number of online Question and Answer platforms like Stack Exchange, Quora, and GFG that serve as a medium for people to communicate and help each other. In this paper, we analyzed the effectiveness of Stack Overflow in helping newbies to programming. Every user on this platform goes through a journey. For the first 12 months, we consider them to be a newb…

    Submitted 21 August, 2022; originally announced August 2022.

    Comments: 12 pages, International Conference on Sustainable Future: Innovations in Education