Search | arXiv e-print repository

Showing 1–50 of 227 results for author: Mishra, M

  1. arXiv:2407.20016  [pdf, other]

    hep-th gr-qc

    Stability and topological nature of charged Gauss-Bonnet AdS black holes in five dimensions

    Authors: Imtak Jeon, Bum-Hoon Lee, Wonwoo Lee, Madhu Mishra

    Abstract: We examine the thermodynamic characteristics and phase structures of a black hole, where the black hole horizon could be a hypersurface with positive, zero, or negative constant curvature, within the framework of Einstein-Maxwell theory, incorporating a negative cosmological constant and a Gauss-Bonnet correction. Our research follows the topological approach to black hole thermodynamics where we…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: 29 pages, 21 figures, 3 Tables

  2. arXiv:2407.13739  [pdf, other]

    cs.AI cs.CL cs.SE

    Scaling Granite Code Models to 128K Context

    Authors: Matt Stallone, Vaibhav Saxena, Leonid Karlinsky, Bridget McGinn, Tim Bula, Mayank Mishra, Adriana Meza Soria, Gaoyuan Zhang, Aditya Prasad, Yikang Shen, Saptha Surendran, Shanmukha Guttula, Hima Patel, Parameswaran Selvam, Xuan-Hong Dang, Yan Koyfman, Atin Sood, Rogerio Feris, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda

    Abstract: This paper introduces long-context Granite code models that support effective context windows of up to 128K tokens. Our solution for scaling context length of Granite 3B/8B code models from 2K/4K to 128K consists of a light-weight continual pretraining by gradually increasing its RoPE base frequency with repository-level file packing and length-upsampled long-context data. Additionally, we also re…

    Submitted 18 July, 2024; originally announced July 2024.

  3. arXiv:2407.09105  [pdf, other]

    cs.LG cs.AI

    Enhancing Training Efficiency Using Packing with Flash Attention

    Authors: Achintya Kundu, Rhui Dih Lee, Laura Wynter, Raghu Kiran Ganti, Mayank Mishra

    Abstract: Padding is often used in tuning LLM models by adding special tokens to shorter training examples to match the length of the longest sequence in each batch. While this ensures uniformity for batch processing, it introduces inefficiencies by including irrelevant padding tokens in the computation and wastes GPU resources. Hugging Face SFT trainer has always offered the option to use packing to combin…

    Submitted 29 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

  4. arXiv:2407.06893  [pdf]

    cs.CL cs.CE

    Measuring Sustainability Intention of ESG Fund Disclosure using Few-Shot Learning

    Authors: Mayank Singh, Nazia Nafis, Abhijeet Kumar, Mridul Mishra

    Abstract: Global sustainable fund universe encompasses open-end funds and exchange-traded funds (ETF) that, by prospectus or other regulatory filings, claim to focus on Environment, Social and Governance (ESG). Challengingly, the claims can only be confirmed by examining the textual disclosures to check if there is presence of intentionality and ESG focus on its investment strategy. Currently, there is no r…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: This paper was presented at 'AI applications in ESG Conference' at IIM Bangalore, India (Nov, 2023)

  5. arXiv:2407.05467  [pdf, other]

    cs.DC cs.AI

    The infrastructure powering IBM's Gen AI model development

    Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla, Lan Hoang, Danny Barnett, I-Hsin Chung, Apoorve Mohan, Ming-Hung Chen, Lixiang Luo, Robert Walkup, Constantinos Evangelinos, Shweta Salaria, Marc Dombrowa, Yoonho Park, Apo Kayi, Liran Schour, Alim Alim, Ali Sydney, Pavlos Maniotis, Laurent Schares, Bernard Metzler, Bengi Karacali-Akyamac, Sophia Wen, Tatsuhiro Chiba, et al. (121 additional authors not shown)

    Abstract: AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering effi…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Corresponding Authors: Talia Gershon, Seetharami Seelam, Brian Belgodere, Milton Bonilla

  6. arXiv:2406.09318  [pdf, ps, other]

    cs.GT cs.AI cs.MA

    Characterising Interventions in Causal Games

    Authors: Manuj Mishra, James Fox, Michael Wooldridge

    Abstract: Causal games are probabilistic graphical models that enable causal queries to be answered in multi-agent settings. They extend causal Bayesian networks by specifying decision and utility variables to represent the agents' degrees of freedom and objectives. In multi-agent settings, whether each agent decides on their policy before or after knowing the causal intervention is important as this affect…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted to the 40th Conference on Uncertainty in Artificial Intelligence (UAI-2024)

  7. arXiv:2406.08301  [pdf, other]

    nucl-ex

    Jet modification via $\pi^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

    Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri, et al. (510 additional authors not shown)

    Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is obs…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

  8. arXiv:2406.03128  [pdf, ps, other]

    math.CA math.FA

    The Weyl Transform of a smooth measure on a real-analytic submanifold

    Authors: Mansi Mishra, M. K. Vemuri

    Abstract: If $\mu$ is a smooth measure supported on a real-analytic submanifold of $\mathbb{R}^{2n}$ which is not contained in any affine hyperplane, then the Weyl transform of $\mu$ is a compact operator.

    Submitted 5 June, 2024; originally announced June 2024.

    MSC Class: 22D10; 22E30; 43A05; 43A80; 53D55

  9. arXiv:2405.12981  [pdf, other]

    cs.LG cs.CL

    Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

    Authors: William Brandon, Mayank Mishra, Aniruddha Nrusimha, Rameswar Panda, Jonathan Ragan Kelly

    Abstract: Key-value (KV) caching plays an essential role in accelerating decoding for transformer-based autoregressive large language models (LLMs). However, the amount of memory required to store the KV cache can become prohibitive at long sequence lengths and large batch sizes. Since the invention of the transformer, two of the most effective interventions discovered for reducing the size of the KV cache…

    Submitted 21 May, 2024; originally announced May 2024.

  10. arXiv:2405.04324  [pdf, other]

    cs.AI cs.CL cs.SE

    Granite Code Models: A Family of Open Foundation Models for Code Intelligence

    Authors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, et al. (21 additional authors not shown)

    Abstract: Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabili…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Corresponding Authors: Rameswar Panda, Ruchir Puri; Equal Contributors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang

  11. arXiv:2404.06423  [pdf, other]

    cs.RO cs.AI cs.LG

    Deep Reinforcement Learning-Based Approach for a Single Vehicle Persistent Surveillance Problem with Fuel Constraints

    Authors: Manav Mishra, Hritik Bana, Saswata Sarkar, Sujeevraja Sanjeevi, PB Sujit, Kaarthik Sundar

    Abstract: This article presents a deep reinforcement learning-based approach to tackle a persistent surveillance mission requiring a single unmanned aerial vehicle initially stationed at a depot with fuel or time-of-flight constraints to repeatedly visit a set of targets with equal priority. Owing to the vehicle's fuel or time-of-flight constraints, the vehicle must be regularly refueled, or its battery mus…

    Submitted 2 May, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: 6 pages

    Report number: LA-UR-24-23186

  12. arXiv:2404.05567  [pdf, other]

    cs.LG cs.AI cs.CL

    Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models

    Authors: Bowen Pan, Yikang Shen, Haokun Liu, Mayank Mishra, Gaoyuan Zhang, Aude Oliva, Colin Raffel, Rameswar Panda

    Abstract: Mixture-of-Experts (MoE) language models can reduce computational costs by 2-4$\times$ compared to dense models without sacrificing performance, making them more efficient in computation-bounded scenarios. However, MoE models generally require 2-4$\times$ more parameters to achieve comparable performance to a dense model, which incurs larger GPU memory requirements and makes MoE models less…

    Submitted 8 April, 2024; originally announced April 2024.

  13. arXiv:2404.03605  [pdf, other]

    cs.LG cs.CL

    Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization

    Authors: Aniruddha Nrusimha, Mayank Mishra, Naigang Wang, Dan Alistarh, Rameswar Panda, Yoon Kim

    Abstract: We consider the problem of accurate quantization for language models, where both the weights and activations are uniformly quantized to 4 bits per parameter, the lowest bitwidth format natively supported by GPU hardware. In this context, the key challenge is activation quantization: it is known that language models contain outlier channels whose values on average are orders of magnitude higher tha…

    Submitted 4 April, 2024; originally announced April 2024.

  14. arXiv:2404.03177  [pdf]

    cond-mat.mtrl-sci

    Direct visualization of local magnetic domain dynamics in a 2D Van der Walls material/ferromagnet interface

    Authors: Joseph Vimal Vas, Rohit Medwal, Sourabh Manna, Mayank Mishra, Aaron Muller, John Rex Mohan, Yasuhiro Fukuma, Martial Duchamp, Rajdeep Singh Rawat

    Abstract: Exploring new strategies for controlling the magnetic domain propagation is the key to realize ultrafast, high-density domain wall-based memory and logic devices for next generation computing. These strategies include strain modulation in multiferroic devices, geometric confinement and area-selective pinning of domain wall. 2D Van der Waals materials introduce localized modifications to the interf…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Merged Manuscript and supplementary file. Submitted to Communications Physics (under review)

  15. arXiv:2404.02900  [pdf, other]

    cs.CV cs.AI cs.LG

    DeiT-LT Distillation Strikes Back for Vision Transformer Training on Long-Tailed Datasets

    Authors: Harsh Rangwani, Pradipto Mondal, Mayank Mishra, Ashish Ramayee Asokan, R. Venkatesh Babu

    Abstract: Vision Transformer (ViT) has emerged as a prominent architecture for various computer vision tasks. In ViT, we divide the input image into patch tokens and process them through a stack of self attention blocks. However, unlike Convolutional Neural Networks (CNN), ViT's simple architecture has no informative inductive bias (e.g., locality, etc.). Due to this, ViT requires a large amount of data for…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project Page: https://rangwani-harsh.github.io/DeiT-LT

  16. arXiv:2404.00399  [pdf, other]

    cs.CL cs.AI cs.LG

    Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

    Authors: Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, et al. (20 additional authors not shown)

    Abstract: Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and StarCoder aim to democratize access to pretrained models for collaborative community development. However, such existing models face challenges: limited multilingual capabilities, continual pretraining causing catastrophic forgetting, where…

    Submitted 23 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Preprint

  17. DataAgent: Evaluating Large Language Models' Ability to Answer Zero-Shot, Natural Language Queries

    Authors: Manit Mishra, Abderrahman Braham, Charles Marsom, Bryan Chung, Gavin Griffin, Dakshesh Sidnerlikar, Chatanya Sarin, Arjun Rajaram

    Abstract: Conventional processes for analyzing datasets and extracting meaningful information are often time-consuming and laborious. Previous work has identified manual, repetitive coding and data collection as major obstacles that hinder data scientists from undertaking more nuanced labor and high-level projects. To combat this, we evaluated OpenAI's GPT-3.5 as a "Language Data Scientist" (LDS) that can e…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: 5 pages, Submitted to International Conference on AI in Cybersecurity

  18. arXiv:2403.15305  [pdf, ps, other]

    astro-ph.HE hep-ph

    X-ray emission spectrum for axion-photon conversion in magnetospheres of strongly magnetized neutron stars

    Authors: Shubham Yadav, M. Mishra, Tapomoy Guha Sarkar

    Abstract: Detecting axionic dark matter (DM) could be possible in an X-ray spectrum from strongly magnetized neutron stars (NSs). We examine the possibility of axion-photon conversion in the magnetospheres of strongly magnetized NSs. In the current work, we investigate how the modified Tolman Oppenheimer Volkoff (TOV) system of equations (in the presence of a magnetic field) affects the energy spectrum of a…

    Submitted 20 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: Accepted in European Physical Journal C (15 pages, 17 figures)

  19. arXiv:2403.08936  [pdf, other]

    cs.MA cs.AI cs.RO

    Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning

    Authors: Peihong Yu, Manav Mishra, Alec Koppel, Carl Busart, Priya Narayan, Dinesh Manocha, Amrit Bedi, Pratap Tokekar

    Abstract: Multi-Agent Reinforcement Learning (MARL) algorithms face the challenge of efficient exploration due to the exponential increase in the size of the joint state-action space. While demonstration-guided learning has proven beneficial in single-agent settings, its direct applicability to MARL is hindered by the practical difficulty of obtaining joint expert demonstrations. In this work, we introduce…

    Submitted 13 March, 2024; originally announced March 2024.

  20. arXiv:2402.19173  [pdf, other]

    cs.SE cs.AI

    StarCoder 2 and The Stack v2: The Next Generation

    Authors: Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, et al. (41 additional authors not shown)

    Abstract: The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digital commons of their source code archive. Alongside the SWH repositories spanning 619 programming languages, we carefully select other high-quality data…

    Submitted 29 February, 2024; originally announced February 2024.

  21. arXiv:2402.13044  [pdf, ps, other]

    hep-ph astro-ph.HE

    Conversion of Emitted Axionic Dark Matter to Photons for Non-Rotating Magnetized Neutron Stars

    Authors: Shubham Yadav, M. Mishra, Tapomoy Guha Sarkar

    Abstract: We attempt to find the impact of a modified Tolman Oppenheimer Volkoff (TOV) system of equations on the luminosities of direct photons, neutrinos and axions for a particular axion mass in the presence of a magnetic field. We employ two different equations of state (EoSs), namely APR and FPS, to generate the profiles of mass and pressure for spherically symmetric and non-rotating neutron stars (NSs).…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 12 pages, 14 figures. arXiv admin note: text overlap with arXiv:2212.11652

  22. arXiv:2402.02479  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback

    Authors: Gaurav Pandey, Yatin Nandwani, Tahira Naseem, Mayank Mishra, Guangxuan Xu, Dinesh Raghu, Sachindra Joshi, Asim Munawar, Ramón Fernandez Astudillo

    Abstract: Distribution matching methods for language model alignment such as Generation with Distributional Control (GDC) and Distributional Policy Gradient (DPG) have not received the same level of attention in reinforcement learning from human feedback (RLHF) as contrastive methods such as Sequence Likelihood Calibration (SLiC), Direct Preference Optimization (DPO) and its variants. We identify high varia…

    Submitted 10 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML 2024 (main conference)

  23. arXiv:2401.04951  [pdf, ps, other]

    math.MG math.FA

    Conjugacy classes of automorphisms of the unit ball in a complex Hilbert space

    Authors: Rachna Aggarwal, Krishnendu Gongopadhyay, Mukund Madhav Mishra

    Abstract: In this article, we consider the ball model of an infinite dimensional complex hyperbolic space, i.e. the open unit ball of a complex Hilbert space centered at the origin equipped with the Caratheodory metric. We consider the group of holomorphic automorphisms of the ball and classify the conjugacy classes of automorphisms. We also compute the centralizers for elements in the group of automorphism…

    Submitted 10 January, 2024; originally announced January 2024.

    MSC Class: 51M10; 51F25

  24. arXiv:2312.07078  [pdf, ps, other]

    math.DG math.SP

    A generalization of a result of Minakshisundaram and Pleijel

    Authors: Mansi Mishra, Ankita Sharma, M. K. Vemuri

    Abstract: Minakshisundaram and Pleijel gave an asymptotic formula for the sum of squares of the pointwise values of the eigenfunctions of the Laplace-Beltrami operator on a compact Riemannian manifold, with eigenvalues less than a fixed number. Here, a generalization is given, where the pointwise values are replaced by the Fourier coefficients of a smooth measure supported on a compact submanifold.

    Submitted 16 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: 13 pages

    MSC Class: 58J50; 58J35

  25. arXiv:2309.12076  [pdf, other]

    quant-ph physics.optics

    Super-resolution and super-sensitivity of quantum LiDAR with multi-photonic state and binary outcome photon counting measurement

    Authors: Priyanka Sharma, Manoj K. Mishra, Devendra Kumar Mishra

    Abstract: Here we are investigating the enhancement in phase sensitivity and resolution in Mach-Zehnder interferometer (MZI) based quantum LiDAR. We are using multi-photonic state (MPS), superposition of four coherent states [1], as the input state and binary outcome parity photon counting measurement and binary outcome zero-nonzero photon counting measurement as the measurement schemes. We thoroughly inves…

    Submitted 3 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: We welcome comments

  26. arXiv:2309.02376  [pdf]

    physics.app-ph

    Performance analysis of InAlN/GaN HEMT and optimization for high frequency applications

    Authors: Jagori Raychaudhuri, Jayjit Mukherjee, Amit Malik, Sudhir Kumar, D. S. Rawal, Meena Mishra, Santanu Ghosh

    Abstract: An InAlN/GaN HEMT device was studied using extensive temperature dependent DC IV measurements and CV measurements. Barrier traps in the InAlN layer were characterized using transient analysis. Forward gate current was modelled using analytical equations. RF performance of the device was also studied and device parameters were extracted following small signal equivalent circuit model. Extensive sim…

    Submitted 5 September, 2023; originally announced September 2023.

  27. Investigation of RF performance of Ku-band GaN HEMT device and an in-depth analysis of short channel effects

    Authors: Jagori Raychaudhuri, Jayjit Mukherjee, Sudhir Kumar, D. S. Rawal, Meena Mishra, Santanu Ghosh

    Abstract: In this paper, we have characterized an AlGaN/GaN High Electron Mobility Transistor (HEMT) with a short gate length (Lg $\approx$ 0.15$\mu$m). We have studied the effect of short gate length on the small signal parameters, linearity parameters and gm-gd ratio in GaN HEMT devices. To understand how scaling results in the variation of the above-mentioned parameters a comparative study with higher gate…

    Submitted 9 April, 2024; v1 submitted 5 September, 2023; originally announced September 2023.

    Journal ref: Physica Scripta, Vol. 99, No. 4, 2024

  28. Existence and Uniqueness of Solution to Unsteady Darcy-Brinkman Problem with Korteweg Stress for Modelling Miscible Porous Media Flow

    Authors: Sahil Kundu, Surya Narayan Maharana, Manoranjan Mishra

    Abstract: The work investigates a model that combines a convection-diffusion-reaction equation for solute concentration with an unsteady Darcy-Brinkman equation for the flow field, including the Korteweg stress. Additionally, the flow field experiences an external body force term while the permeability fluctuates with solute concentration. Such models are used to describe flows in porous media such as frac…

    Submitted 24 May, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

    MSC Class: 76D03 (Primary) 76S05; 35D30; 35Q35 (Secondary)

  29. arXiv:2307.12139  [pdf]

    cond-mat.mtrl-sci

    Dense plasma irradiated platinum with improved spin Hall effect

    Authors: Sachin Kumar, Sourabh Manna, John Rex Mohan, Utkarsh Shashank, Jospeh Vimal, Mayank Mishra, Surbhi Gupta, Hironori Asada, Yasuhiro Fukuma, Rajdeep Singh Rawat, Rohit Medwal

    Abstract: The impurity incorporation in host high-spin orbit coupling materials like platinum has shown improved charge-to-spin conversion by modifying the up-spin and down-spin electron trajectories by bending or skewing them in opposite directions. This enables efficient generation, manipulation, and transport of spin currents. In this study, we irradiate the platinum with non-focus dense plasma to incorp…

    Submitted 22 July, 2023; originally announced July 2023.

  30. arXiv:2305.11790  [pdf, other]

    cs.CL

    Prompting with Pseudo-Code Instructions

    Authors: Mayank Mishra, Prince Kumar, Riyaz Bhat, Rudra Murthy V, Danish Contractor, Srikanth Tamilselvam

    Abstract: Prompting with natural language instructions has recently emerged as a popular method of harnessing the capabilities of large language models. Given the inherent ambiguity present in natural language, it is intuitive to consider the possible advantages of prompting with less ambiguous prompt styles, such as the use of pseudo-code. In this paper we explore if prompting via pseudo-code instruction…

    Submitted 19 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Published in EMNLP 2023 main track

  31. arXiv:2305.06161  [pdf, other]

    cs.CL cs.AI cs.PL cs.SE

    StarCoder: may the source be with you!

    Authors: Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, et al. (42 additional authors not shown)

    Abstract: The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large colle…

    Submitted 13 December, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  32. arXiv:2302.07440  [pdf]

    cs.CV eess.IV

    Road Redesign Technique Achieving Enhanced Road Safety by Inpainting with a Diffusion Model

    Authors: Sumit Mishra, Medhavi Mishra, Taeyoung Kim, Dongsoo Har

    Abstract: Road infrastructure can affect the occurrence of road accidents. Therefore, identifying roadway features with high accident probability is crucial. Here, we introduce image inpainting that can assist authorities in achieving safe roadway design with minimal intervention in the current roadway structure. Image inpainting is based on inpainting safe roadway elements in a roadway image, replacing acc…

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: 9 Pages, 6 figures, 4 tables

  33. arXiv:2301.03988  [pdf, other]

    cs.SE cs.AI cs.LG

    SantaCoder: don't reach for the stars!

    Authors: Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, et al. (16 additional authors not shown)

    Abstract: The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This tech report describes the progress of the collaboration until December 2022, outlining the current state of the Personally Identifiable Information (PII) redaction pipeline, the experiments conducted to de-risk the model architecture, and the experiments investigat…

    Submitted 24 February, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

  34. arXiv:2212.13827  [pdf, other]

    cs.LG cs.CV

    Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data

    Authors: Harsh Rangwani, Sumukh K Aithal, Mayank Mishra, R. Venkatesh Babu

    Abstract: Real-world datasets exhibit imbalances of varying types and degrees. Several techniques based on re-weighting and margin adjustment of loss are often used to enhance the performance of neural networks, particularly on minority classes. In this work, we analyze the class-imbalanced learning problem by examining the loss landscape of neural networks trained with re-weighting and margin-based techniq…

    Submitted 28 December, 2022; originally announced December 2022.

    Comments: NeurIPS 2022. Code: https://github.com/val-iisc/Saddle-LongTail

  35. arXiv:2212.11652  [pdf, ps, other]

    astro-ph.HE astro-ph.GA

    Thermal Evolution and Axion Emission Properties of Strongly Magnetized Neutron Stars

    Authors: Shubham Yadav, M. Mishra, Tapomoy Guha Sarkar, Captain R. Singh

    Abstract: Emission properties of compact astrophysical objects such as neutron stars (NSs) are associated with crucial astronomical observables. In the current work, we obtain the mass and pressure profiles of non-rotating NSs using the modified Tolman Oppenheimer Volkoff (TOV) system of equations in the presence of an intense magnetic field. We obtain the profiles by using a specific distance-dependent magne…

    Submitted 27 February, 2024; v1 submitted 22 December, 2022; originally announced December 2022.

    Comments: Accepted in European Physical Journal C. 19 pages, 34 figures

  36. arXiv:2212.09624  [pdf]

    q-fin.GN cs.AI cs.IR

    Holder Recommendations using Graph Representation Learning & Link Prediction

    Authors: Rachna Saxena, Abhijeet Kumar, Mridul Mishra

    Abstract: Lead recommendation for financial products such as funds or ETFs is potentially challenging in the investment space due to changing market scenarios and the difficulty of capturing financial holders' mindset and philosophy. Current methods surface leads based on certain product categorization and attributes like returns, fees, category etc. to suggest similar products to investors which may not capt…

    Submitted 10 November, 2022; originally announced December 2022.

    Comments: 6 pages, 6 figures, 2 tables. Presented at a workshop in ACM AI in Finance conference

  37. arXiv:2211.12729  [pdf, ps, other]

    math.CA

    The Weyl Transform of a measure

    Authors: Mansi Mishra, M. K. Vemuri

    Abstract: (1) Suppose $\mu$ is a smooth measure on a hypersurface of positive Gaussian curvature in $\mathbb{R}^{2n}$. If $n\ge 2$, then $W(\mu)$, the Weyl transform of $\mu$, is a compact operator, and if $p>n\ge 6$ then $W(\mu)$ belongs to the $p$-Schatten class. (2) There exist Schatten class operators with linearly dependent quantum translates.

    Submitted 23 November, 2022; originally announced November 2022.

    MSC Class: 22D10; 22E30; 43A05; 43A80; 47B10

  38. arXiv:2211.11395  [pdf, ps, other]

    math.RT

    Prasad's Conjecture about dualizing involutions

    Authors: Prashant Arote, Manish Mishra

    Abstract: Let $G$ be a connected reductive group defined over a finite field $\mathbb{F}_q$ with corresponding Frobenius $F$. Let $\iota_G$ denote the duality involution defined by D. Prasad under the hypothesis $2\mathrm{H}^1(F,Z(G))=0$, where $Z(G)$ denotes the center of $G$. We show that for each irreducible character $\rho$ of $G^F$, the involution $\iota_G$ takes $\rho$ to its dual $\rho^{\vee}$ if and only if for a su…

    Submitted 16 November, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Title changed. Final version. To appear in IMRN

  39. Higher derivative invariants in four dimensional N=3 Poincare supergravity

    Authors: Subramanya Hegde, Madhu Mishra, Debangshu Mukherjee, Bindusar Sahoo

    Abstract: In this paper, we use the superconformal approach to derive the higher derivative action for N = 3 Poincare supergravity in four space-time dimensions. We first study the coupling of N = 3 vector multiplets to conformal supergravity. Thereafter we combine it with the pure N = 3 conformal supergravity action and use a minimum of three vector multiplets as compensators to arrive at Poincare supergra…

    Submitted 27 January, 2023; v1 submitted 12 November, 2022; originally announced November 2022.

    Comments: 31 pages, minor changes

  40. arXiv:2211.05100  [pdf, other

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  41. Fixed points and normal automorphisms of the unit ball of bounded operators on $\mathbb{C}^n$

    Authors: Rachna Aggarwal, Krishnendu Gongopadhyay, Mukund Madhav Mishra

    Abstract: We examine the group of isometries of the open unit ball of a complex Banach space of certain bounded linear operators equipped with the Carathéodory metric. Therein we obtain a characterization of the normal isometries in terms of their special type of fixed points.

    Submitted 7 March, 2023; v1 submitted 30 October, 2022; originally announced October 2022.

    MSC Class: Primary 32M15; Secondary 47B02; 47B15; 47B91

  42. arXiv:2210.07295  [pdf, other

    cs.CL cs.LG

    Joint Reasoning on Hybrid-knowledge sources for Task-Oriented Dialog

    Authors: Mayank Mishra, Danish Contractor, Dinesh Raghu

    Abstract: Traditional systems designed for task oriented dialog utilize knowledge present only in structured knowledge sources to generate responses. However, relevant information required to generate responses may also reside in unstructured sources, such as documents. Recent state of the art models such as HyKnow and SeKnow aimed at overcoming these challenges make limiting assumptions about the knowledge…

    Submitted 7 February, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

  43. arXiv:2209.00574  [pdf, ps, other

    math.RT

    Harish-Chandra Induction and Jordan Decomposition of Characters

    Authors: Prashant Arote, Manish Mishra

    Abstract: We show that for any finite connected reductive group, a Jordan decomposition can always be chosen such that it commutes with Harish-Chandra induction. En route, we show that the endomorphism algebra of the Harish-Chandra induction of a cuspidal representation of a Levi subgroup is isomorphic to a unipotent counterpart. These results generalize the well known results for groups with connected cent…

    Submitted 20 December, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: Proof of Lemma 5.1 in previous version is corrected

    MSC Class: 20C33

  44. arXiv:2206.08213  [pdf, other

    cs.LG cs.CV

    A Closer Look at Smoothness in Domain Adversarial Training

    Authors: Harsh Rangwani, Sumukh K Aithal, Mayank Mishra, Arihant Jain, R. Venkatesh Babu

    Abstract: Domain adversarial training has been ubiquitous for achieving invariant representations and is used widely for various domain adaptation tasks. In recent times, methods converging to smooth optima have shown improved generalization for supervised learning tasks like classification. In this work, we analyze the effect of smoothness enhancing formulations on domain adversarial training, the objectiv…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: ICML 2022. Code: https://github.com/val-iisc/SDAT

  45. arXiv:2206.00485  [pdf, other

    cs.CY cs.HC

    Co-creation and ownership for AI radio

    Authors: Skylar Gordon, Robert Mahari, Manaswi Mishra, Ziv Epstein

    Abstract: Recent breakthroughs in AI-generated music open the door for new forms for co-creation and co-creativity. We present Artificial$.\!$fm, a proof-of-concept casual creator that blends AI-music generation, subjective ratings, and personalized recommendation for the creation and curation of AI-generated music. Listeners can rate emergent songs to steer the evolution of future music. They can also pers…

    Submitted 1 June, 2022; originally announced June 2022.

  46. Demonstration of Entanglement-Enhanced Covert Sensing

    Authors: Shuhong Hao, Haowei Shi, Christos N. Gagatsos, Mayank Mishra, Boulat Bash, Ivan Djordjevic, Saikat Guha, Quntao Zhuang, Zheshen Zhang

    Abstract: The laws of quantum physics endow superior performance and security for information processing: quantum sensing harnesses nonclassical resources to enable measurement precision unmatched by classical sensing, whereas quantum cryptography aims to unconditionally protect the secrecy of the processed information. Here, we present the theory and experiment for entanglement-enhanced covert sensing, a p…

    Submitted 27 June, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: 19 pages, 12 figures

    Journal ref: Phys. Rev. Lett. 129, 010501 (2022)

  47. arXiv:2204.04575  [pdf, other

    hep-ex nucl-ex physics.ins-det

    The COHERENT Experimental Program

    Authors: D. Akimov, S. Alawabdeh, P. An, A. Arteaga, C. Awe, P. S. Barbeau, C. Barry, B. Becker, V. Belov, I. Bernardi, M. A. Blackston, L. Blokland, C. Bock, B. Bodur, A. Bolozdynya, R. Bouabid, A. Bracho, J. Browning, B. Cabrera-Palmer, N. Chen, D. Chernyak, E. Conley, J. Daughhetee, J. Daughtry, E. Day, et al. (106 additional authors not shown)

    Abstract: The COHERENT experiment located in Neutrino Alley at the Spallation Neutron Source (SNS), Oak Ridge National Laboratory (ORNL), has made the world's first two measurements of coherent elastic neutrino-nucleus scattering (CEvNS), on CsI and argon, using neutrinos produced at the SNS. The COHERENT collaboration continues to pursue CEvNS measurements on various targets as well as additional studies o…

    Submitted 9 April, 2022; originally announced April 2022.

    Comments: 38 pages, 24 figures; Snowmass contribution

  48. Thermodynamics of BPS and Near-BPS AdS6 Black Holes

    Authors: Madhu Mishra, Amitabh Virmani

    Abstract: We develop the thermodynamics of BPS and near-BPS AdS6 black holes. We study the phase diagram of BPS black holes in the grand canonical ensemble. We highlight two distinct deformations orthogonal to the BPS surface: (i) increasing the temperature while keeping the charges fixed, (ii) changing the charges while maintaining extremality such that the BPS constraint is no longer satisfied. For both t…

    Submitted 24 March, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: 35 pages, 5 figures; v2: minor changes, reference added

    Journal ref: https://link.springer.com/article/10.1007/JHEP06(2022)087

  49. arXiv:2202.03734  [pdf, other

    cs.LG cs.AI cs.CY

    Cascaded Debiasing: Studying the Cumulative Effect of Multiple Fairness-Enhancing Interventions

    Authors: Bhavya Ghai, Mihir Mishra, Klaus Mueller

    Abstract: Understanding the cumulative effect of multiple fairness enhancing interventions at different stages of the machine learning (ML) pipeline is a critical and underexplored facet of the fairness literature. Such knowledge can be valuable to data scientists/ML practitioners in designing fair ML pipelines. This paper takes the first step in exploring this area by undertaking an extensive empirical stu…

    Submitted 22 August, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

    Comments: Accepted to ACM CIKM Conference 2022

  50. Using the non-hydrodynamic mode to study the onset of hydrodynamic behavior in ultraperipheral symmetric nuclear collisions

    Authors: Nikhil Hatwar, Madhukar Mishra

    Abstract: With the attempts of extending the hydrodynamic framework of heavy-ion collision to proton-proton and other small and low energy systems, we are confronted with the question of how small the system can get and still be safely modelled as a fluid. One of the transport coefficients required in the $2^{nd}$ order relativistic viscous hydrodynamics is the shear relaxation time, inclusion of which solv…

    Submitted 1 December, 2022; v1 submitted 1 February, 2022; originally announced February 2022.