-
Distance Measurement for UAVs in Deep Hazardous Tunnels
Authors:
Vishal Choudhary,
Shashi Kant Gupta,
Shaohui Foong,
Hock Beng Lim
Abstract:
The localization of unmanned aerial vehicles (UAVs) in deep tunnels is extremely challenging due to their inaccessibility and hazardous environment. Conventional outdoor localization techniques (such as using GPS) and indoor localization techniques (such as those based on WiFi, Infrared (IR), Ultra-Wideband, etc.) do not work in deep tunnels. We are developing a UAV-based system for the inspection of defects in the Deep Tunnel Sewerage System (DTSS) in Singapore. To enable UAV localization in the DTSS, we have developed a distance measurement module based on the optical flow technique. However, the standard optical flow technique does not work well in tunnels with poor lighting and a lack of features. Thus, we have developed an enhanced optical flow algorithm with prediction to improve the distance measurement for UAVs in deep hazardous tunnels.
Submitted 11 September, 2024;
originally announced September 2024.
-
LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation
Authors:
Archana Swaminathan,
Anubhav Gupta,
Kamal Gupta,
Shishira R. Maiya,
Vatsal Agarwal,
Abhinav Shrivastava
Abstract:
Neural Radiance Fields (NeRFs) have revolutionized the reconstruction of static scenes and objects in 3D, offering unprecedented quality. However, extending NeRFs to model dynamic objects or object articulations remains a challenging problem. Previous works have tackled this issue by focusing on part-level reconstruction and motion estimation for objects, but they often rely on heuristics regarding the number of moving parts or object categories, which can limit their practical use. In this work, we introduce LEIA, a novel approach for representing dynamic 3D objects. Our method involves observing the object at distinct time steps or "states" and conditioning a hypernetwork on the current state, using this to parameterize our NeRF. This approach allows us to learn a view-invariant latent representation for each state. We further demonstrate that by interpolating between these states, we can generate novel articulation configurations in 3D space that were previously unseen. Our experimental results highlight the effectiveness of our method in articulating objects in a manner that is independent of the viewing angle and joint configuration. Notably, our approach outperforms previous methods that rely on motion information for articulation registration.
Submitted 10 September, 2024;
originally announced September 2024.
-
Functional H_\infty Filtering for Descriptor Systems with Monotone nonlinearities
Authors:
Rishabh Sharma,
Mahendra Kumar Gupta,
Nutan Kumar Tomar
Abstract:
This paper introduces a novel approach to the design of functional H_\infty filters for a class of nonlinear descriptor systems subjected to disturbances. Departing from conventional assumptions regarding system regularity, we adopt a more inclusive approach by considering general descriptor systems that satisfy a rank condition on their coefficient matrices. Under this rank condition, we establish a linear matrix inequality (LMI) as a sufficient criterion ensuring the stability of the error system and constraining the L_2 gain of the mapping from disturbances to errors to a predetermined level. The efficacy of the proposed approach is demonstrated through a practical example involving a simple constrained mechanical system.
Submitted 9 September, 2024;
originally announced September 2024.
-
Effects of Interfacial Oxygen Diffusion on the Magnetic Properties and Thermal Stability of Pd/CoFeB/Pd/Ta Heterostructure
Authors:
Saravanan Lakshmanan,
Cristian Romanque,
Mario Mery,
Manivel Raja Muthuvel,
Nanhe Kumar Gupta,
Carlos Garcia
Abstract:
We investigated the effects of annealing temperatures (TA) on a Pd (5 nm)/CoFeB (10 nm)/Pd (3 nm)/Ta (10 nm) multilayer structure. The as-deposited sample showed an amorphous state with in-plane uniaxial magnetic anisotropy (UMA), resulting in low coercivity and moderate damping constant (α) values. Increasing TA led to crystallization, forming bcc-CoFe (110) crystals, which increased in-plane coercivity and introduced isotropic magnetic anisotropy, slightly reducing α. The two-fold UMA persists up to 600 C, and the thermal stability of the in-plane magnetic anisotropy remains intact even at TA = 700 C. The TA significantly influenced the magnetic properties such as in-plane saturation magnetization (Ms//), in-plane and out-of-plane coercivities, and in-plane effective magnetic anisotropy energy density (Keff). Above 600 C, Keff decreased, indicating a transition towards uniaxial perpendicular magnetic anisotropy. Interfacial oxidation and diffusion from the Ta capping layer to the Pd/CoFeB/Pd interfaces were observed, influencing chemical bonding states. Annealing at 700 C reduced oxygen within TaOx through a redox reaction involving Ta crystallization, forming TaB, PdO, and BOx states. Ferromagnetic resonance spectra analysis indicated variations in resonance field (Hr) due to local chemical environments. The α reduction, reaching a minimum at 300 C annealing, was attributed to reduced structural disorder from inhomogeneities. Tailoring magnetic anisotropy and spin dynamic properties in Pd/CoFeB/Pd/Ta structures through TA-controlled oxygen diffusion/oxidation highlights their potential for SOT, DMI, and magnetic skyrmion-based spintronic devices.
Submitted 9 September, 2024;
originally announced September 2024.
-
Infinite-Length Limit of Spectral Curves and Inverse Scattering
Authors:
Niklas Beisert,
Kunal Gupta
Abstract:
Integrability equips models of theoretical physics with efficient methods for the exact construction of useful states and their evolution. Relevant tools for classical integrable field models in one spatial dimension are spectral curves in the case of periodic fields and inverse scattering for asymptotic boundary conditions. Even though the two methods are quite different in many ways, they ought to be related by taking the periodicity length of closed boundary conditions to infinity.
Using the Korteweg-de Vries equation and the continuous Heisenberg magnet as prototypical classical integrable field models, we discuss and illustrate how data for spectral curves transform into asymptotic scattering data. In order to gain intuition and also for concreteness, we review how the elliptic states of these models degenerate into solitons at infinite length.
Submitted 8 September, 2024;
originally announced September 2024.
-
HYDRA: Hybrid Data Multiplexing and Run-time Layer Configurable DNN Accelerator
Authors:
Sonu Kumar,
Komal Gupta,
Gopal Raut,
Mukul Lokhande,
Santosh Kumar Vishvakarma
Abstract:
Deep neural networks (DNNs) offer plenty of challenges in executing efficient computation at edge nodes, primarily due to the huge hardware resource demands. The article proposes HYDRA, hybrid data multiplexing, and runtime layer configurable DNN accelerators to overcome the drawbacks. The work proposes a layer-multiplexed approach, which further reuses a single activation function within the execution of a single layer with improved Fused-Multiply-Accumulate (FMA). The proposed approach works in iterative mode to reuse the same hardware and execute different layers in a configurable fashion. The proposed architectures achieve reductions over 90% of power consumption and resource utilization improvements of state-of-the-art works, with 35.21 TOPSW. The proposed architecture reduces the area overhead (N-1) times required in bandwidth, AF and layer architecture. This work shows HYDRA architecture supports optimal DNN computations while improving performance on resource-constrained edge devices.
Submitted 8 September, 2024;
originally announced September 2024.
-
Operational Safety in Human-in-the-loop Human-in-the-plant Autonomous Systems
Authors:
Ayan Banerjee,
Aranyak Maity,
Imane Lamrani,
Sandeep K. S. Gupta
Abstract:
Control-affine assumptions, under which human inputs are treated as external disturbances, are frequently violated in operational deployment of certified safe controller synthesis approaches under causal human actions. This paper takes a human-in-the-loop human-in-the-plant (HIL-HIP) approach towards ensuring operational safety of safety-critical autonomous systems: the human and the real-world controller (RWC) are modeled as a unified system. A three-way interaction is considered: a) through personalized inputs and biological feedback processes between HIP and HIL, b) through sensors and actuators between RWC and HIP, and c) through personalized configuration changes and data feedback between HIL and RWC. We extend control Lyapunov theory by generating barrier function (CLBF) under human action plans, model the HIL as a combination of a Markov chain for spontaneous events and a fuzzy inference system for event responses, model the RWC as a black box, and integrate the HIL-HIP model with neural architectures that can learn CLBF certificates. We show that the synthesized HIL-HIP controller for automated insulin delivery in Type 1 Diabetes is the only controller to meet safety requirements for human action inputs.
Submitted 22 August, 2024;
originally announced September 2024.
-
Inhomogeneous hysteresis in local STM tunnel conductance with gate-voltage in single-layer MoS$_2$ on SiO$_2$
Authors:
Santu Prasad Jana,
Suraina Gupta,
Anjan Kumar Gupta
Abstract:
Randomly distributed traps at the MoS$_2$/SiO$_2$ interface result in non-ideal transport behavior, including hysteresis in MoS$_2$/SiO$_2$ field effect transistors (FETs). Thus, traps are mostly detrimental to the FET performance but they also offer some application potential. Our STM/S measurements on atomically resolved few-layer and single-layer MoS$_2$ on SiO$_2$ show n-doped behavior with the expected band gap close to 2.0 and 1.4 eV, respectively. The local tunnel conductance with gate-voltage $V_{\rm g}$ sweep exhibits a turn-on/off at a threshold $V_{\rm g}$ at which the tip's Fermi energy nearly coincides with the local conduction band minimum. This threshold value is found to depend on the $V_{\rm g}$ sweep direction, amounting to local hysteresis. The hysteresis is, expectedly, found to depend on both the extent and rate of the $V_{\rm g}$ sweep. Further, the spatial variation in the local $V_{\rm g}$ threshold and the details of tunnel conductance vs. $V_{\rm g}$ behavior indicate inhomogeneities in both the traps' density and their energy distribution. The latter even leads to the pinning of the local Fermi energy in some regions. Further, some rare locations exhibit p-doping with both p- and n-type $V_{\rm g}$ thresholds in local conductance and an unusual hysteresis.
Submitted 5 September, 2024;
originally announced September 2024.
-
UAV (Unmanned Aerial Vehicles): Diverse Applications of UAV Datasets in Segmentation, Classification, Detection, and Tracking
Authors:
Md. Mahfuzur Rahman,
Sunzida Siddique,
Marufa Kamal,
Rakib Hossain Rifat,
Kishor Datta Gupta
Abstract:
Unmanned Aerial Vehicles (UAVs) have greatly revolutionized the process of gathering and analyzing data in diverse research domains, providing unmatched adaptability and effectiveness. This paper presents a thorough examination of Unmanned Aerial Vehicle (UAV) datasets, emphasizing their wide range of applications and progress. UAV datasets consist of various types of data, such as satellite imagery, images captured by drones, and videos. These datasets can be categorized as either unimodal or multimodal, offering a wide range of detailed and comprehensive information. These datasets play a crucial role in disaster damage assessment, aerial surveillance, object recognition, and tracking. They facilitate the development of sophisticated models for tasks like semantic segmentation, pose estimation, vehicle re-identification, and gesture recognition. By leveraging UAV datasets, researchers can significantly enhance the capabilities of computer vision models, thereby advancing technology and improving our understanding of complex, dynamic environments from an aerial perspective. This review aims to encapsulate the multifaceted utility of UAV datasets, emphasizing their pivotal role in driving innovation and practical applications in multiple domains.
Submitted 5 September, 2024;
originally announced September 2024.
-
Anisotropic Spin Stripe Domains in Bilayer La$_3$Ni$_2$O$_7$
Authors:
N. K Gupta,
R. Gong,
Y. Wu,
M. Kang,
C. T. Parzyck,
B. Z. Gregory,
N. Costa,
R. Sutarto,
S. Sarker,
A. Singer,
D. G. Schlom,
K. M. Shen,
D. G. Hawthorn
Abstract:
The discovery of superconductivity in La$_3$Ni$_2$O$_7$ under pressure has motivated the investigation of a parent spin density wave (SDW) state which could provide the underlying pairing interaction. Here, we employ resonant soft x-ray scattering and polarimetry on thin films of bilayer La$_3$Ni$_2$O$_7$ to determine that the magnetic structure of the SDW forms unidirectional diagonal spin stripes with moments lying within the NiO$_2$ plane and perpendicular to $\mathbf{Q}_{SDW}$, but without the strong charge disproportionation typically associated with other nickelates. These stripes form anisotropic domains with shorter correlation lengths perpendicular versus parallel to $\mathbf{Q}_{SDW}$, revealing nanoscale rotational and translational symmetry breaking analogous to the cuprate and Fe-based superconductors, with Bloch-like antiferromagnetic domain walls separating orthogonal domains.
Submitted 4 September, 2024;
originally announced September 2024.
-
Physical Rule-Guided Convolutional Neural Network
Authors:
Kishor Datta Gupta,
Marufa Kamal,
Rakib Hossain Rifat,
Mohd Ariful Haque,
Roy George
Abstract:
The black-box nature of Convolutional Neural Networks (CNNs) and their reliance on large datasets limit their use in complex domains with limited labeled data. Physics-Guided Neural Networks (PGNNs) have emerged to address these limitations by integrating scientific principles and real-world knowledge, enhancing model interpretability and efficiency. This paper proposes a novel Physics-Guided CNN (PGCNN) architecture that incorporates dynamic, trainable, and automated LLM-generated, widely recognized rules integrated into the model as custom layers to address challenges like limited data and low confidence scores. The PGCNN is evaluated on multiple datasets, demonstrating superior performance compared to a baseline CNN model. Key improvements include a significant reduction in false positives and enhanced confidence scores for true detection. The results highlight the potential of PGCNNs to improve CNN performance for broader application areas.
Submitted 3 September, 2024;
originally announced September 2024.
-
Large Language Models for Automatic Detection of Sensitive Topics
Authors:
Ruoyu Wen,
Stephanie Elena Crowe,
Kunal Gupta,
Xinyue Li,
Mark Billinghurst,
Simon Hoermann,
Dwain Allan,
Alaeddin Nassani,
Thammathip Piumsomboon
Abstract:
Sensitive information detection is crucial in content moderation to maintain safe online communities. Assisting in this traditionally manual process could relieve human moderators from overwhelming and tedious tasks, allowing them to focus solely on flagged content that may pose potential risks. Rapidly advancing large language models (LLMs) are known for their capability to understand and process natural language and so present a potential solution to support this process. This study explores the capabilities of five LLMs for detecting sensitive messages in the mental well-being domain within two online datasets and assesses their performance in terms of accuracy, precision, recall, F1 scores, and consistency. Our findings indicate that LLMs have the potential to be integrated into the moderation workflow as a convenient and precise detection tool. The best-performing model, GPT-4o, achieved an average accuracy of 99.5% and an F1-score of 0.99. We discuss the advantages and potential challenges of using LLMs in the moderation workflow and suggest that future research should address the ethical considerations of utilising this technology.
Submitted 2 September, 2024;
originally announced September 2024.
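The four scores named in the abstract above (accuracy, precision, recall, F1) follow directly from confusion-matrix counts. The sketch below is a minimal illustration, not the authors' evaluation code: the function name `binary_metrics`, the 1 = sensitive / 0 = not sensitive encoding, and the toy labels are all assumptions made for this example.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption for this sketch: 1 marks a sensitive message, 0 a
    non-sensitive one. Undefined ratios (zero denominator) fall back to 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example (made-up labels, not data from the paper).
scores = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                        [1, 1, 0, 1, 0, 0, 0, 0])
```

Reporting precision and recall alongside accuracy matters when sensitive messages are rare, because a model that flags nothing can still score high accuracy.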
-
Simplicial degree $d$ self-maps on $n$-spheres
Authors:
Biplab Basak,
Raju Kumar Gupta,
Ayushi Trivedi
Abstract:
The degree of a map between orientable manifolds is a crucial concept in topology, providing deep insights into the structure and properties of the manifolds and the corresponding maps. This concept has been thoroughly investigated, particularly in the realm of simplicial maps between orientable triangulable spaces. In this paper, we concentrate on constructing simplicial degree $d$ self-maps on $n$-spheres. We describe the construction of several such maps, demonstrating that for every $d \in \mathbb{Z} \setminus \{0\}$, there exists a degree $d$ simplicial map from a triangulated $n$-sphere with $3|d| + n - 1$ vertices to $\mathbb{S}^n_{n+2}$. Further, we prove that, for every $d \in \mathbb{Z} \setminus \{0\}$, there exists a simplicial map of degree $3d$ from a triangulated $n$-sphere with $6|d| + n$ vertices, as well as a simplicial map of degree $3d+\frac{d}{|d|}$ from a triangulated $n$-sphere with $6|d|+n+3$ vertices, to $\mathbb{S}^{n}_{n+2}$. Furthermore, we show that for any $|k| \geq 2$ and $n \geq |k|$, a degree $k$ simplicial map exists from a triangulated $n$-sphere $K$ with $|k| + n + 3$ vertices to $\mathbb{S}^n_{n+2}$. We also prove that for $d = 2$ and 3, these constructions produce vertex-minimal degree $d$ self-maps of $n$-spheres. Additionally, for every $n \geq 2$, we construct a degree $n+1$ simplicial map from a triangulated $n$-sphere with $2n + 4$ vertices to $\mathbb{S}^{n}_{n+2}$. We also prove that this construction provides facet-minimal degree $n+1$ self-maps of $n$-spheres.
Submitted 1 September, 2024;
originally announced September 2024.
-
Building FKG.in: a Knowledge Graph for Indian Food
Authors:
Saransh Kumar Gupta,
Lipika Dey,
Partha Pratim Das,
Ramesh Jain
Abstract:
This paper presents an ontology design along with knowledge engineering, and multilingual semantic reasoning techniques to build an automated system for assimilating culinary information for Indian food in the form of a knowledge graph. The main focus is on designing intelligent methods to derive ontology designs and capture all-encompassing knowledge about food, recipes, ingredients, cooking characteristics, and most importantly, nutrition, at scale. We present our ongoing work in this workshop paper, describe in some detail the relevant challenges in curating knowledge of Indian food, and propose our high-level ontology design. We also present a novel workflow that uses AI, LLM, and language technology to curate information from recipe blog sites in the public domain to build knowledge graphs for Indian food. The methods for knowledge curation proposed in this paper are generic and can be replicated for any domain. The design is application-agnostic and can be used for AI-driven smart analysis, building recommendation systems for Personalized Digital Health, and complementing the knowledge graph for Indian food with contextual information such as user information, food biochemistry, geographic information, agricultural information, etc.
Submitted 1 September, 2024;
originally announced September 2024.
-
MaskCycleGAN-based Whisper to Normal Speech Conversion
Authors:
K. Rohith Gupta,
K. Ramnath,
S. Johanan Joysingh,
P. Vijayalakshmi,
T. Nagarajan
Abstract:
Whisper to normal speech conversion is an active area of research. Various architectures based on generative adversarial networks have been proposed in the recent past. In particular, a recent study shows that MaskCycleGAN, a mask-guided generative adversarial network that preserves cyclic consistency, performs very well for voice conversion from spectrogram representations. In the current work, we present a MaskCycleGAN approach for the conversion of whispered speech to normal speech. We find that tuning the mask parameters and pre-processing the signal with a voice activity detector provide superior performance when compared to the existing approach. The wTIMIT dataset is used for evaluation. Objective metrics such as PESQ and G-Loss are used to evaluate the converted speech, along with subjective evaluation using mean opinion score. The results show that the proposed approach offers considerable benefits.
Submitted 27 August, 2024;
originally announced August 2024.
-
HER2 and FISH Status Prediction in Breast Biopsy H&E-Stained Images Using Deep Learning
Authors:
Ardhendu Sekhar,
Vrinda Goel,
Garima Jain,
Abhijeet Patil,
Ravi Kant Gupta,
Amit Sethi
Abstract:
The current standard for detecting human epidermal growth factor receptor 2 (HER2) status in breast cancer patients relies on HER2 amplification, identified through fluorescence in situ hybridization (FISH) or immunohistochemistry (IHC). However, hematoxylin and eosin (H&E) tumor stains are more widely available, and accurately predicting HER2 status using H&E could reduce costs and expedite treatment selection. Deep Learning algorithms for H&E have shown effectiveness in predicting various cancer features and clinical outcomes, including moderate success in HER2 status prediction. In this work, we employed a customized weak supervision classification technique combined with MoCo-v2 contrastive learning to predict HER2 status. We trained our pipeline on 182 publicly available H&E Whole Slide Images (WSIs) from The Cancer Genome Atlas (TCGA), for which annotations by the pathology team at Yale School of Medicine are publicly available. Our pipeline achieved an Area Under the Curve (AUC) of 0.85 across four different test folds. Additionally, we tested our model on 44 H&E slides from the TCGA-BRCA dataset, which had an HER2 score of 2+ and included corresponding HER2 status and FISH test results. These cases are considered equivocal for IHC, requiring an expensive FISH test on their IHC slides for disambiguation. Our pipeline demonstrated an AUC of 0.81 on these challenging H&E slides. Reducing the need for FISH tests can have significant implications for cancer treatment equity in underserved populations.
Submitted 28 August, 2024; v1 submitted 25 August, 2024;
originally announced August 2024.
-
Few-Shot Histopathology Image Classification: Evaluating State-of-the-Art Methods and Unveiling Performance Insights
Authors:
Ardhendu Sekhar,
Ravi Kant Gupta,
Amit Sethi
Abstract:
This paper presents a study on few-shot classification in the context of histopathology images. While few-shot learning has been studied for natural image classification, its application to histopathology is relatively unexplored. Given the scarcity of labeled data in medical imaging and the inherent challenges posed by diverse tissue types and data preparation techniques, this research evaluates the performance of state-of-the-art few-shot learning methods for various scenarios on histology data. We have considered four histopathology datasets for few-shot histopathology image classification and have evaluated 5-way 1-shot, 5-way 5-shot and 5-way 10-shot scenarios with a set of state-of-the-art classification techniques. The best methods have surpassed an accuracy of 70%, 80% and 85% in the 5-way 1-shot, 5-way 5-shot and 5-way 10-shot cases, respectively. We found that, for histology images, popular meta-learning approaches are on par with standard fine-tuning and regularization methods. Our experiments underscore the challenges of working with images from different domains and highlight the significance of unbiased and focused evaluations in advancing computer vision techniques for specialized domains, such as histology images.
Submitted 25 August, 2024;
originally announced August 2024.
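The "5-way 1-shot", "5-way 5-shot" and "5-way 10-shot" scenarios in the abstract above refer to episodes: N classes are drawn, with K labeled support examples per class plus some held-out query examples. The sketch below shows one common way to sample such an episode. It is an illustration only, not the authors' pipeline: the function name `sample_episode`, the `n_query` parameter and its default, and the toy label list are assumptions introduced here.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way=5, k_shot=1, n_query=15, seed=None):
    """Sample one N-way K-shot episode from a list of class labels.

    Returns (support, query), each a list of (example_index, class) pairs.
    Only classes with at least k_shot + n_query examples are eligible, so
    support and query sets never share an example.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    needed = k_shot + n_query
    eligible = sorted(c for c, idxs in by_class.items() if len(idxs) >= needed)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with sufficient examples")

    support, query = [], []
    for cls in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[cls], needed)
        support += [(i, cls) for i in picked[:k_shot]]   # K labeled shots
        query += [(i, cls) for i in picked[k_shot:]]     # held-out queries
    return support, query


# Toy usage: 8 hypothetical tissue classes with 20 images each.
toy_labels = [i % 8 for i in range(160)]
support_set, query_set = sample_episode(toy_labels, n_way=5, k_shot=5,
                                        n_query=10, seed=0)
```

Accuracy for a scenario is then typically averaged over many such episodes, which is why fixing the seed per episode helps make comparisons between methods repeatable.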
-
Impact of Annealing on Perpendicular Magnetic Anisotropy in W/MgAl2O4/CoFeMnSi/W/CoFeMnSi/MgAl2O4/W. Double Storage Layers for Upcoming MTJs
Authors:
L. Saravanan,
Nanhe Kumar Gupta,
Vireshwar Mishra,
Sujeet Chaudhary,
Carlos Garcia
Abstract:
In this study, we achieved the improvement of uniaxial perpendicular magnetic anisotropy (PMA) in the W/MgAl2O4/CoFeMnSi/W/CoFeMnSi/MgAl2O4/W heterostructure by manipulating the annealing temperature (TA) [350 C, 450 C, and 550 C]. We observed a maximum effective PMA energy density (Keff) of 1.604 x 10^6 erg/cc with low saturation magnetization (Ms) at the specified TA. The enhancement of Keff with Ms is significantly influenced by structural variations at the interfaces of CoFeMnSi and MgAl2O4, attributed to sufficient interfacial oxidation dependent on the TA. The TA was identified as a critical factor affecting the surface morphology, grain size, and surface roughness of the multilayer. Fourier-transform infrared (FT-IR) measurements were employed to confirm the presence of Co-O or Fe-O bonds in the multilayer structures, elucidating the true origin of PMA. The control of interfacial oxidation at the interface during annealing is crucial for regulating the strength of PMA. Therefore, this double CoFeMnSi/MgAl2O4-based multilayer presents a promising avenue, serving as a favorable candidate for future p-MTJs-based spintronic devices with enhanced thermal stability.
Submitted 24 August, 2024;
originally announced August 2024.
-
SiTe CiM: Signed Ternary Computing-in-Memory for Ultra-Low Precision Deep Neural Networks
Authors:
Niharika Thakuria,
Akul Malhotra,
Sandeep K. Thirumala,
Reena Elangovan,
Anand Raghunathan,
Sumeet K. Gupta
Abstract:
Ternary Deep Neural Networks (DNN) have shown a large potential for highly energy-constrained systems by virtue of their low power operation (due to ultra-low precision) with only a mild degradation in accuracy. To enable an energy-efficient hardware substrate for such systems, we propose a compute-enabled memory design, referred to as SiTe-CiM, which features computing-in-memory (CiM) of dot products between signed ternary (SiTe) inputs and weights. SiTe CiM is based on cross-coupling of two bit cells to enable CiM of dot products in the signed ternary regime. We explore SiTe CiM with 8T-SRAM, 3T-embedded DRAM (3T-eDRAM) and 3T-ferroelectric metal FET (FEMFET) memories. We propose two flavors of this technique, namely SiTe CiM I/II. In SiTe CiM I, we employ two additional transistors per cell for cross-coupling, achieving fast CiM operations, albeit incurring an area overhead ranging from 18% to 34% (compared to standard ternary memories). In SiTe CiM II, four extra transistors are utilized for every 16 cells in a column, thereby incurring only 6% area cost (but leading to slower CiM than SiTe CiM I). Based on the array analysis, our designs achieve up to 88% lower CiM latency and 78% CiM energy savings across various technologies considered, as compared to their respective near-memory computing counterparts. Further, we perform system level analysis by incorporating SiTe CiM I/II arrays in a ternary DNN accelerator and show up to 7X throughput boost and up to 2.5X energy reduction compared to the near-memory ternary DNN accelerators.
Submitted 24 August, 2024;
originally announced August 2024.
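The signed ternary dot product that the SiTe CiM abstract above computes in memory can be written out arithmetically. The sketch below shows only the arithmetic, assuming inputs and weights take values in {-1, 0, +1}; it does not model the bit-cell cross-coupling or any circuit behavior, and the function name `ternary_dot` is hypothetical.

```python
def ternary_dot(inputs, weights):
    """Dot product of two equal-length vectors with entries in {-1, 0, +1}."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    for v in (*inputs, *weights):
        if v not in (-1, 0, 1):
            raise ValueError(f"non-ternary value: {v!r}")

    # Each product is itself ternary, so the sum reduces to counting:
    # (# of +1 products) - (# of -1 products). Zero products contribute nothing.
    pos = sum(1 for x, w in zip(inputs, weights) if x * w == 1)
    neg = sum(1 for x, w in zip(inputs, weights) if x * w == -1)
    return pos - neg
```

Because every partial product is -1, 0, or +1, the result is a difference of two counts, which is one reason ternary networks are attractive for low-precision hardware.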
-
Superconductor to metal quantum phase transition with magnetic field in Josephson coupled lead islands on Graphene
Authors:
Suraina Gupta,
Santu Prasad Jana,
Rukshana Pervin,
Anjan K. Gupta
Abstract:
The superconductor-to-metal transition with magnetic field and gate voltage is studied in a Josephson junction array comprising randomly distributed lead islands on exfoliated single-layer graphene with a back-gate. The low magnetic-field superconductivity onset temperature is fitted to the Werthamer-Helfand-Hohenberg theory to model the temperature dependence of the upper critical field. The magnetoresistance in the intermediate temperature and field regime is described using thermally activated flux flow dictated by a field-dependent activation barrier. The barrier also depends on the gate voltage, which dictates the inter-island Josephson coupling and disorder. The magnetoresistance near the upper critical field at low temperatures shows signatures of a gate-dependent continuous quantum phase transition between superconductor and metal. The finite-size scaling analysis shows that this transition belongs to the $(2+1)$D-XY universality class without disorder.
Submitted 24 August, 2024;
originally announced August 2024.
-
Deep Learning for Automated Wound Classification And Segmentation
Authors:
Md. Zihad Bin Jahangir,
Sumaiya Akter,
MD Abdullah Al Nasim,
Kishor Datta Gupta,
Roy George
Abstract:
Wounds, such as foot ulcers, pressure ulcers, leg ulcers, and infected wounds, pose substantial problems for healthcare professionals. Prompt and accurate segmentation is crucial for effective treatment. However, contemporary methods lack a comprehensive model capable of both classification and segmentation, especially a lightweight one. In this work, we tackle this issue by presenting a new architecture that incorporates U-Net and is optimized for both wound classification and effective segmentation. We curated four extensive and diverse collections of wound images, utilizing the publicly available Medetec Dataset supplemented with additional data sourced from the Internet. Our model performed well, with an F1 score of 0.929, a Dice score of 0.931 in segmentation, and an accuracy of 0.915 in classification, demonstrating its effectiveness in both tasks. This result highlights the potential of our approach for automating wound care management.
Submitted 10 August, 2024;
originally announced August 2024.
-
Imagen 3
Authors:
Imagen-Team-Google,
Jason Baldridge,
Jakob Bauer,
Mukul Bhutani,
Nicole Brichtova,
Andrew Bunner,
Kelvin Chan,
Yichang Chen,
Sander Dieleman,
Yuqing Du,
Zach Eaton-Rosen,
Hongliang Fei,
Nando de Freitas,
Yilin Gao,
Evgeny Gladchenko,
Sergio Gómez Colmenarejo,
Mandy Guo,
Alex Haig,
Will Hawkins,
Hexiang Hu,
Huilian Huang,
Tobenna Peter Igwe,
Christos Kaplanis,
Siavash Khodadadeh
, et al. (227 additional authors not shown)
Abstract:
We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.
Submitted 13 August, 2024;
originally announced August 2024.
-
Comparative Evaluation of Memory Technologies for Synaptic Crossbar Arrays- Part 2: Design Knobs and DNN Accuracy Trends
Authors:
Jeffry Victor,
Chunguang Wang,
Sumeet K. Gupta
Abstract:
Crossbar memory arrays have been touted as the workhorse of in-memory computing (IMC)-based acceleration of Deep Neural Networks (DNNs), but the associated hardware non-idealities limit their efficacy. To address this, cross-layer design solutions that reduce the impact of hardware non-idealities on DNN accuracy are needed. In Part 1 of this paper, we established the co-optimization strategies for various memory technologies and their crossbar arrays, and conducted a comparative technology evaluation in the context of IMC robustness. In this part, we analyze various design knobs such as array size and bit-slice (number of bits per device) and their impact on the performance of 8T SRAM, ferroelectric transistor (FeFET), Resistive RAM (ReRAM) and spin-orbit-torque magnetic RAM (SOT-MRAM) in the context of inference accuracy at the 7 nm technology node. Further, we study the effect of circuit design solutions such as Partial Wordline Activation (PWA) and custom ADC reference levels that reduce the hardware non-idealities, and comparatively analyze the response of each technology to such accuracy-enhancing techniques. Our results on ResNet-20 (with CIFAR-10) show that PWA increases accuracy by up to 32.56% while custom ADC reference levels yield up to 31.62% accuracy enhancement. We observe that compared to the other technologies, FeFET, by virtue of its small layout height and high distinguishability of its memory states, is best suited for large arrays. For higher bit-slices and a more complex dataset (ResNet-50 with CIFAR-100), we found that ReRAM matches the performance of FeFET.
Submitted 11 August, 2024;
originally announced August 2024.
-
Securing the Diagnosis of Medical Imaging: An In-depth Analysis of AI-Resistant Attacks
Authors:
Angona Biswas,
MD Abdullah Al Nasim,
Kishor Datta Gupta,
Roy George,
Abdur Rashid
Abstract:
Machine learning (ML) is a rapidly developing area of medicine that uses significant resources to apply computer science and statistics to medical issues. ML's proponents laud its capacity to handle vast, complicated, and erratic medical data. It is well known that attackers can cause misclassification by deliberately crafting inputs for machine learning classifiers. Adversarial examples have been studied extensively in computer vision applications. Healthcare systems are considered especially challenging because of the security and life-or-death considerations they involve, and performance accuracy is very important. Recent arguments have suggested that adversarial attacks could be made against medical image analysis (MedIA) technologies because of the accompanying technology infrastructure and powerful financial incentives. Since the diagnosis will be the basis for important decisions, it is essential to assess how robust medical DNN tasks are against adversarial attacks. Simple adversarial attacks have been taken into account in several earlier studies. However, DNNs are susceptible to more risky and realistic attacks. The present paper covers recently proposed adversarial attack strategies against DNNs for medical imaging as well as countermeasures. In this study, we review current techniques for adversarial imaging attacks and their detection. The review also covers various facets of these techniques and offers suggestions for improving the robustness of neural networks in the future.
Submitted 1 August, 2024;
originally announced August 2024.
-
The extremes of AGN variability: outbursts, deep fades, changing looks, exceptional spectral states, and semi-periodicities
Authors:
S. Komossa,
D. Grupe,
P. Marziani,
L. C. Popovic,
S. Marceta-Mandic,
E. Bon,
D. Ilic,
A. B. Kovacevic,
A. Kraus,
Z. Haiman,
V. Petrecca,
D. De Cicco,
M. S. Dimitrijevic,
V. A. Sreckovic,
J. Kovacevic Dojcinovic,
M. Pannikkote,
N. Bon,
K. K. Gupta,
F. Iacob
Abstract:
The extremes of Active Galactic Nuclei (AGN) variability offer valuable new insights into the drivers and physics of AGN. We discuss some of the most extreme cases of AGN variability; the highest amplitudes, deep minima states, extreme spectral states, Seyfert-type changes, and semi-periodic signals, including new X-ray observations. The properties of changing-look (CL) AGN are briefly reviewed and a classification scheme is proposed which encompasses the variety of CL phenomena; distinguishing slow and fast events, repeat events, and frozen-look AGN which do not show any emission-line response. Long-term light curves that are densely covered over multiple years, along with follow-up spectroscopy, are utilized to gain insight into the underlying variability mechanisms including accretion disk and broad-line region physics. Remarkable differences are seen, for instance, in the optical spectral response to extreme outbursts, implying distinct intrinsic variability mechanisms. Furthermore, we discuss methods for distinguishing between CL AGN and CL look-alike events (tidal disruption events or supernovae in dense media). Finally, semi-periodic light curve variability is addressed and the latest multiwavelength (MWL) light curve of the binary supermassive black hole (SMBH) candidate OJ 287 from the MOMO project is presented. Recent results from that project have clearly established the need for new binary SMBH modelling matching the tight new constraints from observations, including the measurement of a low (primary) SMBH mass of ~10^8 Msun which also implies that OJ 287 is no longer in the regime of near-future pulsar timing arrays.
Submitted 31 July, 2024;
originally announced August 2024.
-
A divide-and-conquer approach for spatio-temporal analysis of large house price data from Greater London
Authors:
Kapil Gupta,
Soudeep Deb
Abstract:
Statistical research in real estate markets, particularly in understanding the spatio-temporal dynamics of house prices, has garnered significant attention in recent times. Although Bayesian methods are common in spatio-temporal modeling, standard Markov chain Monte Carlo (MCMC) techniques are usually slow for large datasets such as house price data. To tackle this problem, we propose a divide-and-conquer spatio-temporal modeling approach. This method involves partitioning the data into multiple subsets and applying an appropriate Gaussian process model to each subset in parallel. The results from each subset are then combined using the Wasserstein barycenter technique to obtain the global parameters for the original problem. The proposed methodology allows for multiple observations per spatial and time unit, thereby offering added benefits for practitioners. As a real-life application, we analyze house price data of more than 0.6 million transactions from 983 middle layer super output areas in London over a period of eight years. The methodology provides insightful findings about the effects of various amenities, trend patterns, and the relationship between prices and carbon emissions. Furthermore, as demonstrated through a cross-validation study, it shows good predictive accuracy while balancing computational efficiency.
Submitted 22 July, 2024;
originally announced July 2024.
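The combination step in the abstract above (merging per-subset results with a Wasserstein barycenter) has a simple closed form in one special case, sketched here. The assumption, which is not taken from the paper, is that each subset yields a one-dimensional Gaussian summary of a parameter; the 2-Wasserstein barycenter of such Gaussians is again Gaussian, with the weighted average of the means and the weighted average of the standard deviations. The function name `gaussian_w2_barycenter` is hypothetical, and the paper's actual procedure may operate on posterior samples instead.

```python
def gaussian_w2_barycenter(means, stds, weights=None):
    """2-Wasserstein barycenter of 1D Gaussians N(means[i], stds[i]**2).

    Returns (mean, std) of the barycenter. Weights default to uniform
    and are normalized to sum to one.
    """
    if len(means) != len(stds) or not means:
        raise ValueError("means and stds must be non-empty and equal in length")
    if any(s < 0 for s in stds):
        raise ValueError("standard deviations must be non-negative")

    if weights is None:
        weights = [1.0] * len(means)
    if len(weights) != len(means):
        raise ValueError("weights must match the number of Gaussians")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w = [x / total for x in weights]

    # In 1D the barycenter averages quantile functions, which for Gaussians
    # means averaging the means and the standard deviations (not variances).
    mean = sum(wi * m for wi, m in zip(w, means))
    std = sum(wi * s for wi, s in zip(w, stds))
    return mean, std
```

Note that this averages standard deviations rather than variances, which differs from naively pooling the subset posteriors.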
-
Encouraging Responsible Use of Generative AI in Education: A Reward-Based Learning Approach
Authors:
Aditi Singh,
Abul Ehtesham,
Saket Kumar,
Gaurav Kumar Gupta,
Tala Talaei Khoei
Abstract:
This research introduces an innovative mathematical learning approach that integrates generative AI to cultivate structured learning rather than quick solutions. Our method combines chatbot capabilities and generative AI to offer interactive problem-solving exercises, enhancing learning through a step-by-step approach for varied problems and advocating for the responsible use of AI in education. Our approach emphasizes that immediate answers from ChatGPT can impede real learning. We introduce a reward-based system that requires students to solve mathematical problems effectively to receive the final answer. This encourages a progressive learning path from basic to complex problems, rewarding mastery with final solutions. The goal is to transition students from seeking quick fixes to engaging actively in a comprehensive learning experience.
Submitted 26 June, 2024;
originally announced July 2024.
-
Small Signal Capacitance in Ferroelectric HZO: Mechanisms and Physical Insights
Authors:
Revanth Koduru,
Atanu K. Saha,
Martin M. Frank,
Sumeet K. Gupta
Abstract:
This study presents a theoretical investigation of the physical mechanisms governing small signal capacitance in ferroelectrics, focusing on Hafnium Zirconium Oxide. Utilizing a time-dependent Ginzburg Landau formalism-based 2D multi-grain phase-field simulation framework, we simulate the capacitance of metal-ferroelectric-insulator-metal (MFIM) capacitors. Our simulation methodology closely mirrors the experimental procedures for measuring ferroelectric small signal capacitance, and the outcomes replicate the characteristic butterfly capacitance-voltage behavior. We delve into the components of the ferroelectric capacitance associated with the dielectric response and polarization switching, discussing the primary physical mechanisms - domain bulk response and domain wall response - contributing to the butterfly characteristics. We explore their interplay and relative contributions to the capacitance and correlate them to the polarization domain characteristics. Additionally, we investigate the impact of increasing domain density with ferroelectric thickness scaling, demonstrating an enhancement in the polarization capacitance component (in addition to the dielectric component). We further analyze the relative contributions of the domain bulk and domain wall responses across different ferroelectric thicknesses. Lastly, we establish the relation of polarization capacitance components to the capacitive memory window (for memory applications) and reveal a non-monotonic dependence of the maximum memory window on HZO thickness.
Submitted 20 July, 2024;
originally announced July 2024.
-
Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run
Authors:
Gayathri Raman,
Samuele Ronchini,
James Delaunay,
Aaron Tohuvavohu,
Jamie A. Kennea,
Tyler Parsotan,
Elena Ambrosi,
Maria Grazia Bernardini,
Sergio Campana,
Giancarlo Cusumano,
Antonino D'Ai,
Paolo D'Avanzo,
Valerio D'Elia,
Massimiliano De Pasquale,
Simone Dichiara,
Phil Evans,
Dieter Hartmann,
Paul Kuin,
Andrea Melandri,
Paul O'Brien,
Julian P. Osborne,
Kim Page,
David M. Palmer,
Boris Sbarufatti,
Gianpiero Tagliaferri
, et al. (1797 additional authors not shown)
Abstract:
We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalog (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum-likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than 10$^{-3}$ Hz, we compute the GW-BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.
Submitted 13 July, 2024;
originally announced July 2024.
-
Vortices on Cylinders and Warped Exponential Networks
Authors:
Kunal Gupta,
Pietro Longhi
Abstract:
We study 3d $\mathcal{N}=2$ $U(1)$ Chern-Simons-matter QFT on a cylinder $C\times\mathbb{R}$. The topology of $C$ gives rise to BPS sectors of low-energy solitons known as kinky vortices, which interpolate between (possibly) different vacua at the ends of the cylinder and at the same time carry magnetic flux.
We compute the spectrum of BPS vortices on the cylinder in an isolated Higgs vacuum, through the framework of \emph{warped} exponential networks, which we introduce. We then conjecture a relation between these and standard vortices on $\mathbb{R}^2$, which are related to genus-zero open Gromov-Witten invariants of toric branes. More specifically, we show that in the limit of large Fayet-Iliopoulos coupling, the spectrum of kinky vortices on $C$ undergoes an infinite sequence of wall-crossing transitions, and eventually stabilizes. We then propose an exact relation between a generating series of stabilized CFIV indices and the Gromov-Witten disk potential, and discuss its consequences for the structure of moduli spaces of vortices.
Submitted 11 July, 2024;
originally announced July 2024.
-
Effect of ground-state deformation on the Isoscalar Giant Monopole Resonance and the first observation of overtones of the Isoscalar Giant Quadrupole Resonance in rare-earth Nd isotopes
Authors:
M. Abdullah,
S. Bagchi,
M. N. Harakeh,
H. Akimune,
D. Das,
T. Doi,
L. M. Donaldson,
Y. Fujikawa,
M. Fujiwara,
T. Furuno,
U. Garg,
Y. K. Gupta,
K. B. Howard,
Y. Hijikata,
K. Inaba,
S. Ishida,
M. Itoh,
N. Kalantar-Nayestanaki,
D. Kar,
T. Kawabata,
S. Kawashima,
K. Khokhar,
K. Kitamura,
N. Kobayashi,
Y. Matsuda
, et al. (11 additional authors not shown)
Abstract:
The strength distributions of the Isoscalar Giant Monopole Resonance (ISGMR) and Isoscalar Giant Quadrupole Resonance (ISGQR) in $^{142,146-150}$Nd have been determined via inelastic alpha-particle scattering with the Grand Raiden (GR) Spectrometer at the Research Center for Nuclear Physics (RCNP), Japan. In the deformed nuclei $^{146-150}$Nd, the ISGMR strength distributions exhibit a splitting into two components, while the nearly spherical nucleus $^{142}$Nd displays a single peak in the ISGMR strength distribution. A noteworthy achievement in this study is the first-time detection of overtones in the Isoscalar Giant Quadrupole Resonance (ISGQR) strength distributions within Nd isotopes at an excitation energy around 25 MeV obtained through Multipole Decomposition Analysis (MDA).
Submitted 8 July, 2024;
originally announced July 2024.
-
The infrastructure powering IBM's Gen AI model development
Authors:
Talia Gershon,
Seetharami Seelam,
Brian Belgodere,
Milton Bonilla,
Lan Hoang,
Danny Barnett,
I-Hsin Chung,
Apoorve Mohan,
Ming-Hung Chen,
Lixiang Luo,
Robert Walkup,
Constantinos Evangelinos,
Shweta Salaria,
Marc Dombrowa,
Yoonho Park,
Apo Kayi,
Liran Schour,
Alim Alim,
Ali Sydney,
Pavlos Maniotis,
Laurent Schares,
Bernard Metzler,
Bengi Karacali-Akyamac,
Sophia Wen,
Tatsuhiro Chiba
, et al. (121 additional authors not shown)
Abstract:
AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering efficient and high-performing AI training requires an end-to-end solution that combines hardware, software and holistic telemetry to cater for multiple types of AI workloads. In this report, we describe IBM's hybrid cloud infrastructure that powers our generative AI model development. This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant and geographically distributed infrastructure for large-scale model training and other AI workflow steps and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment that is optimized to support our largest and most ambitious AI model training tasks. Vela provides IBM with the dual benefit of high performance for internal use along with the flexibility to adapt to an evolving commercial landscape. Blue Vela provides us with the benefits of rapid development of our largest and most ambitious models, as well as future-proofing against the evolving model landscape in the industry. Taken together, they provide IBM with the ability to rapidly innovate in the development of both AI models and commercial offerings.
Submitted 7 July, 2024;
originally announced July 2024.
-
Observation of near-scission "polar" and "equatorial" proton emission in heavy-ion induced fission
Authors:
Pawan Singh,
Y. K. Gupta,
G. K. Prajapati,
B. N. Joshi,
V. G. Prajapati,
N. Sirswal,
K. Ramachandran,
A. S. Pradeep,
V. S. Dagre,
M. Kumar,
A. Jhingan,
N. Deshmukh,
B. V. John,
B. K. Nayak,
D. C. Biswas,
R. K. Choudhury
Abstract:
Proton and $α$-particle energy spectra were measured in coincidence with fission fragments at different relative angles in the $^{16}$O (96 MeV) + $^{232}$Th reaction. The multiplicity spectra were analyzed within the framework of a Moving Source Disentangling Analysis (MSDA) to determine contributions from different emission stages. The MSDA conclusively shows "Near Scission Emission (NSE)" as an essential component in the multiplicity spectra. In contrast to NSE $α$ particles, which are emitted mainly perpendicular to the fission axis ("equatorial emission"), the NSE protons are observed to be emitted perpendicular as well as parallel ("polar emission") to the fission axis with similar intensities ($\sim$20\% for each). Thus, around 40\% of total pre-scission protons are emitted near the scission stage, whereas the same fraction for $α$ particles is only around 10\%. The inevitable presence of "polar" and "equatorial" NSE protons in heavy-ion induced fission has been observed for the first time. The present results open up a new avenue to study heavy-ion induced fission dynamics.
Submitted 22 August, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
Stokes' paradox in rarefied gases: A perspective through the method of fundamental solutions
Authors:
Himanshi,
Anirudh Singh Rana,
Vinay Kumar Gupta
Abstract:
In the realm of fluid dynamics, a curious and counterintuitive phenomenon is Stokes' paradox. While Stokes equations -- used for modeling slow and steady flows -- lead to a meaningful solution to the problem of slow and steady flow past a sphere, they fail to yield a non-trivial solution to the problem of slow and steady flow past an infinitely long cylinder (a two-dimensional problem essentially); this is referred to as Stokes' paradox. We revisit this paradox in the context of rarefied gas flows by means of the method of fundamental solutions (MFS). To this end, we adopt an extended hydrodynamic model, referred to as the CCR model, consisting of the balance equations for the mass, momentum and energy and closed with the coupled constitutive relations. We determine an analytic solution of the CCR model for the problem and compare it with the MFS-based numerical solution. Apart from addressing flow past a circular cylinder, we aim to showcase the capability of the MFS to predict the flow past other objects in two dimensions for which the analytic solutions do not exist. For that, we investigate the problem of rarefied gas flow past an infinitely long semicircular cylinder.
Submitted 26 June, 2024;
originally announced June 2024.
-
Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet
Authors:
Manish Dhakal,
Arman Chhetri,
Aman Kumar Gupta,
Prabin Lamichhane,
Suraj Pandey,
Subarna Shakya
Abstract:
This paper presents an end-to-end deep learning model for Automatic Speech Recognition (ASR) that transcribes Nepali speech to text. The model was trained and tested on the OpenSLR (audio, text) dataset. Most audio clips in the dataset have silent gaps at both ends, which are clipped during preprocessing for a more uniform mapping between audio frames and their corresponding text. Mel Frequency Cepstral Coefficients (MFCCs) are used as the audio features fed into the model. The model that pairs a bidirectional LSTM with ResNet and a one-dimensional CNN produces the best results on this dataset out of all the models (neural networks with variations of LSTM, GRU, CNN, and ResNet) trained so far. This novel model uses the Connectionist Temporal Classification (CTC) function for loss calculation during training and CTC beam search decoding to predict characters as the most likely sequence of Nepali text. On the test dataset, a character error rate (CER) of 17.06 percent is achieved. The source code is available at: https://github.com/manishdhakal/ASR-Nepali-using-CNN-BiLSTM-ResNet.
Submitted 25 June, 2024;
originally announced June 2024.
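For readers unfamiliar with the character error rate quoted in the abstract above, the sketch below shows one common way to compute it: the Levenshtein edit distance between the reference and the hypothesis, divided by the reference length. The function names are hypothetical and the code is not taken from the authors' repository. It counts Unicode code points, which is an assumption; for Devanagari text an evaluation might instead count grapheme clusters.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate as a fraction of the reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the figure reported in the abstract.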
-
Present and Future of AI in Renewable Energy Domain: A Comprehensive Survey
Authors:
Abdur Rashid,
Parag Biswas,
Angona Biswas,
MD Abdullah Al Nasim,
Kishor Datta Gupta,
Roy George
Abstract:
Artificial intelligence (AI) has become a crucial instrument for streamlining processes in various industries, including electrical power systems, as a result of recent digitalization. AI algorithms are data-driven models that are based on statistical learning theory and are used as a tool to make use of the data that the power system and its users generate. Initially, we perform a thorough literature analysis of AI applications related to renewable energy (RE). Next, we present a thorough analysis of renewable energy factories and assess their suitability, along with a list of the most widely used and appropriate AI algorithms. Nine AI-based strategies are identified here to assist RE in contemporary power systems. This survey paper comprises an extensive review of the several AI techniques used for renewable energy as well as a methodical analysis of the literature for the study of various intelligent system application domains across different disciplines of renewable energy. This literature review identifies the performance and outcomes of nine different research methods by assessing them, and it aims to distill valuable insights into their strengths and limitations. This study also addresses three main topics: using AI technology for renewable power generation, utilizing AI for renewable energy forecasting, and optimizing energy systems. Additionally, it explores AI's superiority over conventional models in controllability, data handling, cyberattack prevention, smart grid implementation, and robotics, underscoring AI's significance in shaping the future of the energy industry. Furthermore, this article outlines future directions in the integration of AI for renewable energy.
Submitted 22 June, 2024;
originally announced June 2024.
-
AI-Driven Approaches for Optimizing Power Consumption: A Comprehensive Survey
Authors:
Parag Biswas,
Abdur Rashid,
Angona Biswas,
Md Abdullah Al Nasim,
Kishor Datta Gupta,
Roy George
Abstract:
Reduced environmental effect, lower operating costs, and a stable and sustainable energy supply for current and future generations are the main reasons why power optimization is important. Power optimization ensures that energy is used more effectively, cutting down on waste and optimizing the utilization of resources. In today's world, power optimization and artificial intelligence (AI) integration are essential to changing the way energy is produced, used, and distributed. Real-time monitoring and analysis of power usage trends is made possible by AI-driven algorithms and predictive analytics, which enable dynamic modifications to effectively satisfy demand. Efficiency and sustainability are increased when power consumption is optimized in different sectors thanks to the use of intelligent systems. This survey paper comprises an extensive review of the several AI techniques used for power optimization as well as a methodical analysis of the literature for the study of various intelligent system application domains across different disciplines of power consumption. This literature review identifies the performance and outcomes of 17 different research methods by assessing them, and it aims to distill valuable insights into their strengths and limitations. Furthermore, this article outlines future directions in the integration of AI for power consumption optimization.
Submitted 22 June, 2024;
originally announced June 2024.
-
Average edge order of normal $3$-pseudomanifolds
Authors:
Biplab Basak,
Raju Kumar Gupta
Abstract:
In their work [9], Feng Luo and Richard Stong introduced the concept of the average edge order, denoted as $μ_0(K)$. They demonstrated that if $μ_0(K)\leq \frac{9}{2}$ for a closed $3$-manifold $K$, then $K$ must be a sphere. Building upon this foundation, Makoto Tamura extended similar results to $3$-manifolds with non-empty boundaries in [10, 11]. In our present study, we extend these findings to normal $3$-pseudomanifolds. Specifically, we establish that for a normal $3$-pseudomanifold $K$ with singularities, $μ_0(K)\geq\frac{30}{7}$. Moreover, equality holds if and only if $K$ is a one-vertex suspension of $\mathbb{RP}^2$ with seven vertices. Furthermore, we establish that when $\frac{30}{7}\leqμ_0(K)\leq\frac{9}{2}$, the $3$-pseudomanifold $K$ can be derived from some boundary complexes of $4$-simplices by a sequence of possible operations, including connected sums, bistellar $1$-moves, edge contractions, edge expansions, vertex folding, and edge folding. In the end, we discuss some normal $3$-pseudomanifolds exhibiting higher average edge orders.
Submitted 20 June, 2024;
originally announced June 2024.
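As a small worked example of the quantity discussed in the abstract above, the sketch below computes the average edge order of a simplicial 3-complex. It assumes the usual Luo-Stong definition, $μ_0(K) = 3F/E$, where $F$ is the number of triangles and $E$ the number of edges; the abstract itself does not restate the formula, and the function name is hypothetical. The test complex is the boundary of a 4-simplex, the building block named in the abstract.

```python
from fractions import Fraction
from itertools import combinations


def average_edge_order(tetrahedra):
    """Return 3 * (#triangles) / (#edges) for a complex given by its tetrahedra."""
    edges = set()
    triangles = set()
    for tet in tetrahedra:
        # Collect every edge and triangle face once, regardless of how many
        # tetrahedra share it.
        edges.update(frozenset(e) for e in combinations(tet, 2))
        triangles.update(frozenset(t) for t in combinations(tet, 3))
    return Fraction(3 * len(triangles), len(edges))


# Boundary of the 4-simplex: all 4-element subsets of 5 vertices.
boundary_of_4_simplex = list(combinations(range(5), 4))
mu = average_edge_order(boundary_of_4_simplex)  # 10 triangles, 10 edges
```

Here every edge lies in exactly three triangles, so the value is 3, well below the window $\frac{30}{7}\leq μ_0(K)\leq \frac{9}{2}$ that the abstract characterizes for normal $3$-pseudomanifolds with singularities.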
-
Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training
Authors:
Akul Malhotra,
Sumeet Kumar Gupta
Abstract:
Improving the hardware efficiency of deep neural network (DNN) accelerators with techniques such as quantization and sparsity enhancement has shown immense promise. However, their inference accuracy in non-ideal real-world settings (such as in the presence of hardware faults) is yet to be systematically analyzed. In this work, we investigate the impact of memory faults on activation-sparse quantized DNNs (AS QDNNs). We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults, with AS QDNNs exhibiting up to 11.13% lower accuracy than the standard QDNNs. We establish that the degraded accuracy correlates with sharper minima in the loss landscape for AS QDNNs, which makes them more sensitive to perturbations in the weight values due to faults. Based on this observation, we employ sharpness-aware quantization (SAQ) training to mitigate the impact of memory faults. The AS and standard QDNNs trained with SAQ have up to 19.50% and 15.82% higher inference accuracy, respectively, compared to their conventionally trained equivalents. Moreover, we show that SAQ-trained AS QDNNs show higher accuracy in faulty settings than standard QDNNs trained conventionally. Thus, sharpness-aware training can be instrumental in achieving sparsity-related latency benefits without compromising on fault tolerance.
Submitted 15 June, 2024;
originally announced June 2024.
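To illustrate why a memory fault can perturb a quantized weight so strongly, the sketch below flips a single bit of an 8-bit two's-complement weight. The fault model (one bit flip in an int8 value) and the function name are assumptions made for illustration; the abstract above does not specify the fault model or bit width used in the paper.

```python
def flip_bit(weight: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit weight and return the faulty signed value."""
    if not -128 <= weight <= 127:
        raise ValueError("weight must fit in a signed 8-bit integer")
    if not 0 <= bit <= 7:
        raise ValueError("bit index must be between 0 and 7")
    raw = weight & 0xFF                      # two's-complement byte pattern
    raw ^= 1 << bit                          # inject the fault
    return raw - 256 if raw >= 128 else raw  # reinterpret as signed


low_bit_fault = flip_bit(3, 0)   # least significant bit: 3 becomes 2
sign_bit_fault = flip_bit(3, 7)  # most significant bit: 3 becomes -125
```

A flip in the least significant bit changes the weight by one quantization step, while a flip in the sign bit moves it by 128 steps, which is the kind of large weight perturbation the abstract associates with accuracy loss.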
-
Gate voltage modulation of the superconducting state in a degenerate semiconductor
Authors:
Bikash C. Barik,
Himadri Chakraborti,
Buddhadeb Pal,
Aditya K. Jain,
Swagata Bhunia,
Sounak Samanta,
Apurba Laha,
Suddhasatta Mahapatra,
K. Das Gupta
Abstract:
In this work, we demonstrate that the modulation of carrier density can alter the superconducting transition temperature by up to $204$ mK in epitaxial Indium Nitride on Gallium Nitride, which amounts to $10$% of the transition temperature in ungated conditions. Our samples are likely free from strong localization effects and significant granularity, as indicated by $k_F l \gg 1$, suggesting that the primary determinant of the transition temperature in InN is carrier density, rather than disorder scattering. The observed behavior is consistent with BCS s-wave superconductivity, corroborated by the superconducting parameters we measured. Furthermore, we observed a $60$% bipolar suppression of the supercurrent in our experiments.
Submitted 13 June, 2024;
originally announced June 2024.
-
On Improving Error Resilience of Neural End-to-End Speech Coders
Authors:
Kishan Gupta,
Nicola Pia,
Srikanth Korse,
Andreas Brendel,
Guillaume Fuchs,
Markus Multrus
Abstract:
Error-resilient tools like Packet Loss Concealment (PLC) and Forward Error Correction (FEC) are essential to maintain reliable speech communication for applications like Voice over Internet Protocol (VoIP), where packets are frequently delayed and lost. In recent times, end-to-end neural speech codecs have seen a significant rise due to their ability to transmit speech signals at low bitrates, but little consideration has been given to their error resilience in a real system. The recently introduced Neural End-to-End Speech Codec (NESC) can reproduce high-quality natural speech at low bitrates. We extend its robustness to packet losses by adding a low-complexity network to predict the codebook indices in latent space. Furthermore, we propose a method to add an in-band FEC at an additional bitrate of 0.8 kbps. Both subjective and objective assessments indicate the effectiveness of the proposed methods and demonstrate that coupling PLC and FEC provides significant robustness against packet losses.
Submitted 13 June, 2024;
originally announced June 2024.
-
UVIS: Unsupervised Video Instance Segmentation
Authors:
Shuaiyi Huang,
Saksham Suri,
Kamal Gupta,
Sai Saketh Rambhatla,
Ser-nam Lim,
Abhinav Shrivastava
Abstract:
Video instance segmentation requires classifying, segmenting, and tracking every object across video frames. Unlike existing approaches that rely on masks, boxes, or category labels, we propose UVIS, a novel Unsupervised Video Instance Segmentation (UVIS) framework that can perform video instance segmentation without any video annotations or dense label-based pretraining. Our key insight comes from leveraging the dense shape prior from the self-supervised vision foundation model DINO and the open-set recognition ability from the image-caption supervised vision-language model CLIP. Our UVIS framework consists of three essential steps: frame-level pseudo-label generation, transformer-based VIS model training, and query-based tracking. To improve the quality of VIS predictions in the unsupervised setup, we introduce a dual-memory design. This design includes a semantic memory bank for generating accurate pseudo-labels and a tracking memory bank for maintaining temporal consistency in object tracks. We evaluate our approach on three standard VIS benchmarks, namely YoutubeVIS-2019, YoutubeVIS-2021, and Occluded VIS. Our UVIS achieves 21.1 AP on YoutubeVIS-2019 without any video annotations or dense pretraining, demonstrating the potential of our unsupervised VIS framework.
Submitted 10 June, 2024;
originally announced June 2024.
-
Realization of higher coordinated Er in high-pressure cotunnite phase of Er$_2$Ti$_2$O$_7$
Authors:
M. Modak,
Rahul Kaiwart,
Santosh K. Gupta,
A. Dwivedi,
K. K. Pandey,
A. K. Poswal,
H. K. Poswal
Abstract:
In this article we report the structural stability of Er$_2$Ti$_2$O$_7$ cubic pyrochlore with pressure using x-ray diffraction, Raman spectroscopy, photoluminescence, x-ray absorption and ab-initio calculations. Our studies establish a phase transformation in Er$_2$Ti$_2$O$_7$ from the ambient cubic phase to a high-pressure orthorhombic (cotunnite) phase, initiated at ~40 GPa. The transformation is sluggish and is not complete even at the highest pressure measured in our study, i.e., ~60.0 GPa. This is further supported by first-principles calculations, which reveal that the cotunnite phase is energetically more stable than the ambient phase above ~53 GPa. After complete release of pressure, the high-pressure cotunnite phase is retained while the fraction of untransformed pyrochlore phase becomes amorphous. Furthermore, the EXAFS data of the recovered sample at the L$_3$ edge of the Er$^{3+}$ ion show an increase in the coordination number of cations from eight at ambient to nine in the high-pressure phase. The mechanism of structural transformation is explained in terms of accumulation of cation antisite defects and subsequent disordering of cations and anions in their respective sublattices. The amorphization of the pyrochlore phase upon release is interpreted as the inability to accommodate, at ambient conditions, the point defects that are formed in the pyrochlore lattice under compression.
Submitted 6 June, 2024;
originally announced June 2024.
-
Coexistence of Topological Dirac and Dirac Nodal line semimetal in SrCaP belonging to Nodal line semimetal family SrCaX (X = Bi, Sb, As, P)
Authors:
Shivendra Kumar Gupta,
Ashish Kore,
Saurabh Kumar Sen,
Poorva Singh
Abstract:
Nodal line semimetals represent precursor states for various topological phases, exhibiting intrinsic topological characteristics and intriguing properties. These materials host rare and distinctive topological features, which can give rise to exotic phenomena, thereby garnering significant attention in both fundamental research and technological applications. In this study, we conduct ab-initio calculations to explore the properties of SrCaX (X = Bi, Sb, As, P), identifying these as multiple Dirac nodal line semimetals protected by Z2 quantized Berry phases and manifesting multiple drum-head-like surface states. The nodal lines in these compounds are situated at the M point when kz = 0 and at the A point when kz = π. Notably, the SrCaX family exhibits a unique characteristic wherein its members host both a type-II Dirac point and topological nodal line semimetal features within a single crystal structure, hence providing an excellent platform for studying the interplay between different topological properties. Additionally, in SrCaP, topological Dirac semimetal, type-II Dirac point, and topological nodal line semimetal features coexist in a single crystal. These special features make this series of materials ideal candidates for further experimental investigation.
Submitted 6 June, 2024;
originally announced June 2024.
-
Second-Order Algorithms for Finding Local Nash Equilibria in Zero-Sum Games
Authors:
Kushagra Gupta,
Xinjie Liu,
Ufuk Topcu,
David Fridovich-Keil
Abstract:
Zero-sum games arise in a wide variety of problems, including robust optimization and adversarial learning. However, algorithms deployed for finding a local Nash equilibrium in these games often converge to non-Nash stationary points. This highlights a key challenge: for any algorithm, the stability properties of its underlying dynamical system can cause non-Nash points to be potential attractors. To overcome this challenge, algorithms must account for subtleties involving the curvatures of players' costs. To this end, we leverage dynamical system theory and develop a second-order algorithm for finding a local Nash equilibrium in the smooth, possibly nonconvex-nonconcave, zero-sum game setting. First, we prove that this novel method guarantees convergence to only local Nash equilibria with a local linear convergence rate. We then interpret a version of this method as a modified Gauss-Newton algorithm with local superlinear convergence to the neighborhood of a point that satisfies first-order local Nash equilibrium conditions. In comparison, current related state-of-the-art methods do not offer convergence rate guarantees. Furthermore, we show that this approach naturally generalizes to settings with convex and potentially coupled constraints while retaining earlier guarantees of convergence to only local (generalized) Nash equilibria.
Submitted 5 June, 2024;
originally announced June 2024.
-
Aurora: A Foundation Model of the Atmosphere
Authors:
Cristian Bodnar,
Wessel P. Bruinsma,
Ana Lucic,
Megan Stanley,
Johannes Brandstetter,
Patrick Garvan,
Maik Riechert,
Jonathan Weyn,
Haiyu Dong,
Anna Vaughan,
Jayesh K. Gupta,
Kit Tambiratnam,
Alex Archibald,
Elizabeth Heider,
Max Welling,
Richard E. Turner,
Paris Perdikaris
Abstract:
Deep learning foundation models are revolutionizing many facets of science by leveraging vast amounts of data to learn general-purpose representations that can be adapted to tackle diverse downstream tasks. Foundation models hold the promise to also transform our ability to model our planet and its subsystems by exploiting the vast expanse of Earth system data. Here we introduce Aurora, a large-scale foundation model of the atmosphere trained on over a million hours of diverse weather and climate data. Aurora leverages the strengths of the foundation modelling approach to produce operational forecasts for a wide variety of atmospheric prediction problems, including those with limited training data, heterogeneous variables, and extreme events. In under a minute, Aurora produces 5-day global air pollution predictions and 10-day high-resolution weather forecasts that outperform state-of-the-art classical simulation tools and the best specialized deep learning models. Taken together, these results indicate that foundation models can transform environmental forecasting.
Submitted 28 May, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
Exploring Ordinality in Text Classification: A Comparative Study of Explicit and Implicit Techniques
Authors:
Siva Rajesh Kasa,
Aniket Goel,
Karan Gupta,
Sumegh Roychowdhury,
Anish Bhanushali,
Nikhil Pattisapu,
Prasanna Srinivasa Murthy
Abstract:
Ordinal Classification (OC) is a widely encountered challenge in Natural Language Processing (NLP), with applications in various domains such as sentiment analysis and rating prediction. Previous approaches to tackling OC have primarily focused on modifying existing loss functions or creating novel ones that explicitly account for the ordinal nature of labels. However, with the advent of Pretrained Language Models (PLMs), it became possible to tackle ordinality through the implicit semantics of the labels as well. This paper provides a comprehensive theoretical and empirical examination of both these approaches. Furthermore, we also offer strategic recommendations regarding the most effective approach to adopt based on specific settings.
Submitted 20 May, 2024;
originally announced May 2024.
-
CPS-LLM: Large Language Model based Safe Usage Plan Generator for Human-in-the-Loop Human-in-the-Plant Cyber-Physical System
Authors:
Ayan Banerjee,
Aranyak Maity,
Payal Kamboj,
Sandeep K. S. Gupta
Abstract:
We explore the usage of large language models (LLMs) in human-in-the-loop human-in-the-plant cyber-physical systems (CPS) to translate a high-level prompt into a personalized plan of actions, and subsequently convert that plan into a grounded inference of sequential decision-making automated by a real-world CPS controller to achieve a control goal. We show that it is relatively straightforward to contextualize an LLM so it can generate domain-specific plans. However, these plans may be infeasible for the physical system to execute or may be unsafe for human users. To address this, we propose CPS-LLM, an LLM retrained using an instruction tuning framework, which ensures that generated plans not only align with the physical system dynamics of the CPS but are also safe for human users. The CPS-LLM consists of two innovative components: a) a liquid time constant neural network-based physical dynamics coefficient estimator that can derive coefficients of dynamical models with some unmeasured state variables; b) an LLM trained with prompts that embed traces from the dynamical system together with the corresponding model coefficients. We show that when the CPS-LLM is integrated with a contextualized chatbot such as BARD, it can generate feasible and safe plans to manage external events such as meals for automated insulin delivery systems used by Type 1 Diabetes subjects.
Submitted 19 May, 2024;
originally announced May 2024.
-
Simple and Efficient Quantization Techniques for Neural Speech Coding
Authors:
Andreas Brendel,
Nicola Pia,
Kishan Gupta,
Guillaume Fuchs,
Markus Multrus
Abstract:
Neural audio coding has emerged as a vivid research direction by promising good audio quality at very low bitrates unachievable by classical coding techniques. Here, end-to-end trainable autoencoder-like models represent the state of the art, where a discrete representation that allows for efficient transmission of the input audio signal has to be learned in the bottleneck of the autoencoder. This discrete representation is typically generated by applying a quantizer to the output of the neural encoder. In almost all state-of-the-art neural audio coding approaches, this quantizer is realized as a Vector Quantizer (VQ), and a lot of effort has been spent on alleviating the drawbacks of this quantization technique when used together with a neural audio coder. In this paper, we propose simple alternatives to VQ, which are based on projected Scalar Quantization (SQ). These quantization techniques do not need any additional losses, scheduling parameters or codebook storage, thereby simplifying the training of neural audio codecs. Furthermore, we propose a new causal network architecture for neural speech coding that shows good performance at very low computational complexity.
Submitted 14 May, 2024;
originally announced May 2024.
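To make the contrast with vector quantization concrete, the sketch below shows a plain uniform scalar quantizer applied to one latent value. It is a generic textbook quantizer under assumed conventions (values clamped to [-1, 1], evenly spaced levels, hypothetical function name), not the projected SQ scheme proposed in the paper, whose details the abstract above does not give.

```python
def scalar_quantize(x: float, levels: int) -> tuple[int, float]:
    """Map x to the nearest of `levels` evenly spaced values in [-1, 1].

    Returns the integer index to transmit and the reconstructed value.
    """
    if levels < 2:
        raise ValueError("need at least two quantization levels")
    x = max(-1.0, min(1.0, x))         # clamp to the assumed input range
    step = 2.0 / (levels - 1)          # spacing between adjacent levels
    index = round((x + 1.0) / step)    # nearest level index
    return index, -1.0 + index * step  # reconstruction needs no stored codebook


index, value = scalar_quantize(0.3, 5)  # levels: -1.0, -0.5, 0.0, 0.5, 1.0
```

Each latent dimension would be quantized independently in this way, so the decoder can rebuild the value from the index and the level count alone, which is consistent with the abstract's remark that no codebook storage is needed.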
-
Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses
Authors:
Gaurav Kumar Gupta,
Aditi Singh,
Sijo Valayakkad Manikandan,
Abul Ehtesham
Abstract:
The recent swift development of LLMs like GPT-4, Gemini, and GPT-3.5 offers a transformative opportunity in medicine and healthcare, especially in digital diagnostics. This study evaluates each model's diagnostic abilities by interpreting a user's symptoms and determining diagnoses that fit well with common illnesses, and it demonstrates how each of these models could significantly increase diagnostic accuracy and efficiency. Through a series of diagnostic prompts based on symptoms from medical databases, GPT-4 demonstrates higher diagnostic accuracy from its deep and complete history of training on medical data. Meanwhile, Gemini performs with high precision as a critical tool in disease triage, demonstrating its potential to be a reliable model when physicians are trying to make high-risk diagnoses. GPT-3.5, though slightly less advanced, is a good tool for medical diagnostics. This study highlights the need to study LLMs for healthcare and clinical practices with more care and attention, ensuring that any system utilizing LLMs promotes patient privacy and complies with health information privacy laws such as HIPAA, and to consider the social consequences that affect the varied individuals in complex healthcare contexts. This study marks the start of a larger future effort to study the various ways in which assigning ethical concerns to LLMs' task of learning from human biases could unearth new ways to apply AI in complex medical settings.
Submitted 9 May, 2024;
originally announced May 2024.