Search | arXiv e-print repository

Showing 1–50 of 494 results for author: Kumar, R

Searching in archive cs. Search in all archives.
  1. arXiv:2407.08655  [pdf, other]

    eess.IV cs.AI cs.LG physics.med-ph

    SPOCKMIP: Segmentation of Vessels in MRAs with Enhanced Continuity using Maximum Intensity Projection as Loss

    Authors: Chethan Radhakrishna, Karthikesh Varma Chintalapati, Sri Chandana Hudukula Ram Kumar, Raviteja Sutrave, Hendrik Mattern, Oliver Speck, Andreas Nürnberger, Soumick Chatterjee

    Abstract: Identification of vessel structures of different sizes in biomedical images is crucial in the diagnosis of many neurodegenerative diseases. However, the sparsity of good-quality annotations of such images makes the task of vessel segmentation challenging. Deep learning offers an efficient way to segment vessels of different sizes by learning their high-level feature representations and the spatial…

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2407.08041  [pdf, other]

    cs.CV

    TACLE: Task and Class-aware Exemplar-free Semi-supervised Class Incremental Learning

    Authors: Jayateja Kalla, Rohit Kumar, Soma Biswas

    Abstract: We propose a novel TACLE (TAsk and CLass-awarE) framework to address the relatively unexplored and challenging problem of exemplar-free semi-supervised class incremental learning. In this scenario, at each new task, the model has to learn new classes from both (few) labeled and unlabeled data without access to exemplars from previous classes. In addition to leveraging the capabilities of pre-train…

    Submitted 10 July, 2024; originally announced July 2024.

  3. arXiv:2407.07786  [pdf, ps, other]

    cs.HC cs.AI cs.CY

    The Human Factor in AI Red Teaming: Perspectives from Social and Collaborative Computing

    Authors: Alice Qian Zhang, Ryland Shaw, Jacy Reese Anthis, Ashlee Milton, Emily Tseng, Jina Suh, Lama Ahmad, Ram Shankar Siva Kumar, Julian Posada, Benjamin Shestakofsky, Sarah T. Roberts, Mary L. Gray

    Abstract: Rapid progress in general-purpose AI has sparked significant interest in "red teaming," a practice of adversarial testing originating in military and cybersecurity applications. AI red teaming raises many questions about the human factor, such as how red teamers are selected, biases and blindspots in how tests are conducted, and harmful content's psychological effects on red teamers. A growing bod…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Workshop proposal accepted to CSCW 2024

  4. arXiv:2407.06093  [pdf, other]

    cs.AI

    Artificial Intuition: Efficient Classification of Scientific Abstracts

    Authors: Harsh Sakhrani, Naseela Pervez, Anirudh Ravi Kumar, Fred Morstatter, Alexandra Graddy Reed, Andrea Belz

    Abstract: It is desirable to coarsely classify short scientific texts, such as grant or publication abstracts, for strategic insight or research portfolio management. These texts efficiently transmit dense information to experts possessing a rich body of knowledge to aid interpretation. Yet this task is remarkably difficult to automate because of brevity and the absence of context. To address this gap, we h…

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2406.19040  [pdf, ps, other]

    cs.LG cs.CR cs.DS

    On Convex Optimization with Semi-Sensitive Features

    Authors: Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Raghu Meka, Chiyuan Zhang

    Abstract: We study the differentially private (DP) empirical risk minimization (ERM) problem under the semi-sensitive DP setting where only some features are sensitive. This generalizes the Label DP setting where only the label is sensitive. We give improved upper and lower bounds on the excess risk for DP-ERM. In particular, we show that the error only scales polylogarithmically in terms of the sensitive d…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: To appear in COLT 2024

  6. arXiv:2406.16305  [pdf, ps, other]

    cs.DS cs.CR

    On Computing Pairwise Statistics with Local Differential Privacy

    Authors: Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Adam Sealfon

    Abstract: We study the problem of computing pairwise statistics, i.e., ones of the form $\binom{n}{2}^{-1} \sum_{i \ne j} f(x_i, x_j)$, where $x_i$ denotes the input to the $i$th user, with differential privacy (DP) in the local model. This formulation captures important metrics such as Kendall's $\tau$ coefficient, Area Under Curve, Gini's mean difference, Gini's entropy, etc. We give several novel and generi…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Published in NeurIPS 2023

  7. arXiv:2406.16135  [pdf, other]

    cs.CL cs.LG

    Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models

    Authors: Lynn Chua, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chulin Xie, Chiyuan Zhang

    Abstract: Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora. But can these models relate corresponding concepts across languages, effectively being crosslingual? This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks. We observe that while these models show promising surface-level crosslingual abilities on machine translation…

    Submitted 23 June, 2024; originally announced June 2024.

  8. arXiv:2406.15470  [pdf, other]

    cs.CL cs.AI cs.SI

    Mental Disorder Classification via Temporal Representation of Text

    Authors: Raja Kumar, Kishan Maharaj, Ashita Saxena, Pushpak Bhattacharyya

    Abstract: Mental disorders pose a global challenge, aggravated by the shortage of qualified mental health professionals. Mental disorder prediction from social media posts by current LLMs is challenging due to the complexities of sequential text data and the limited context length of language models. Current language model-based approaches split a single data instance into multiple chunks to compensate for…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: RK and KM contributed equally to this work, 15 pages, 5 figures, 9 tables

  9. arXiv:2406.15335  [pdf, other]

    cs.CV cs.CY

    Keystroke Dynamics Against Academic Dishonesty in the Age of LLMs

    Authors: Debnath Kundu, Atharva Mehta, Rajesh Kumar, Naman Lal, Avinash Anand, Apoorv Singh, Rajiv Ratn Shah

    Abstract: The transition to online examinations and assignments raises significant concerns about academic integrity. Traditional plagiarism detection systems often struggle to identify instances of intelligent cheating, particularly when students utilize advanced generative AI tools to craft their responses. This study proposes a keystroke dynamics-based method to differentiate between bona fide and assist…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted for publication at The IEEE International Joint Conference on Biometrics (IJCB2024), contains 9 pages, 3 figures, 3 tables

    ACM Class: I.5.4

  10. arXiv:2406.14322  [pdf, other]

    cs.CL cs.CR cs.LG

    Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning

    Authors: Lynn Chua, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Daogao Liu, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang

    Abstract: Large language models (LLMs) have emerged as powerful tools for tackling complex tasks across diverse domains, but they also raise privacy concerns when fine-tuned on sensitive data due to potential memorization. While differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit, current evaluations on LLMs most…

    Submitted 3 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  11. arXiv:2406.11757  [pdf, other]

    cs.AI cs.CL cs.CY cs.HC

    STAR: SocioTechnical Approach to Red Teaming Language Models

    Authors: Laura Weidinger, John Mellor, Bernat Guillen Pegueroles, Nahema Marchal, Ravin Kumar, Kristian Lum, Canfer Akbulut, Mark Diaz, Stevie Bergman, Mikel Rodriguez, Verena Rieser, William Isaac

    Abstract: This research introduces STAR, a sociotechnical framework that improves on current best practices for red teaming safety of large language models. STAR makes two key contributions: it enhances steerability by generating parameterised instructions for human red teamers, leading to improved coverage of the risk surface. Parameterised instructions also provide more detailed insights into model failur…

    Submitted 10 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures, 5 pages appendix. * denotes equal contribution

  12. arXiv:2406.11409  [pdf, other]

    cs.CL cs.AI

    CodeGemma: Open Code Models Based on Gemma

    Authors: CodeGemma Team, Heri Zhao, Jeffrey Hui, Joshua Howland, Nam Nguyen, Siqi Zuo, Andrea Hu, Christopher A. Choquette-Choo, Jingyue Shen, Joe Kelley, Kshitij Bansal, Luke Vilnis, Mateo Wirth, Paul Michel, Peter Choy, Pratik Joshi, Ravin Kumar, Sarmad Hashmi, Shubham Agrawal, Zhitao Gong, Jane Fine, Tris Warkentin, Ale Jakse Hartman, Bin Ni, Kathy Korevec, et al. (2 additional authors not shown)

    Abstract: This paper introduces CodeGemma, a collection of specialized open code models built on top of Gemma, capable of a variety of code and natural language generation tasks. We release three model variants. CodeGemma 7B pretrained (PT) and instruction-tuned (IT) variants have remarkably resilient natural language understanding, excel in mathematical reasoning, and match code capabilities of other open…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: v1: 11 pages, 4 figures, 5 tables. v2: Update metadata

  13. arXiv:2406.10893  [pdf, other]

    eess.IV cs.AI cs.CV q-bio.QM q-bio.TO

    Development and Validation of Fully Automatic Deep Learning-Based Algorithms for Immunohistochemistry Reporting of Invasive Breast Ductal Carcinoma

    Authors: Sumit Kumar Jha, Purnendu Mishra, Shubham Mathur, Gursewak Singh, Rajiv Kumar, Kiran Aatre, Suraj Rengarajan

    Abstract: Immunohistochemistry (IHC) analysis is a well-accepted and widely used method for molecular subtyping, a procedure for prognosis and targeted therapy of breast carcinoma, the most common type of tumor affecting women. There are four molecular biomarkers namely progesterone receptor (PR), estrogen receptor (ER), antigen Ki67, and human epidermal growth factor receptor 2 (HER2) whose assessment is n…

    Submitted 16 June, 2024; originally announced June 2024.

  14. arXiv:2406.10764  [pdf, other]

    cs.CL

    GNOME: Generating Negotiations through Open-Domain Mapping of Exchanges

    Authors: Darshan Deshpande, Shambhavi Sinha, Anirudh Ravi Kumar, Debaditya Pal, Jonathan May

    Abstract: Language Models have previously shown strong negotiation capabilities in closed domains where the negotiation strategy prediction scope is constrained to a specific setup. In this paper, we first show that these models are not generalizable beyond their original training domain despite their wide-scale pretraining. Following this, we propose an automated framework called GNOME, which processes exi…

    Submitted 15 June, 2024; originally announced June 2024.

  15. arXiv:2406.07860  [pdf, other]

    cs.CL cs.AI cs.LG

    BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain

    Authors: Rahul Kumar, Amar Raja Dibbu, Shrutendra Harsola, Vignesh Subrahmaniam, Ashutosh Modi

    Abstract: Several large-scale datasets (e.g., WikiSQL, Spider) for developing natural language interfaces to databases have recently been proposed. These datasets cover a wide breadth of domains but fall short on some essential domains, such as finance and accounting. Given that accounting databases are used worldwide, particularly by non-technical people, there is an imminent need to develop models that co…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted at NAACL 2024; 20 Pages (main + appendix)

  16. arXiv:2406.05828  [pdf, other]

    cs.CV cs.AI eess.IV

    Multi-Stain Multi-Level Convolutional Network for Multi-Tissue Breast Cancer Image Segmentation

    Authors: Akash Modi, Sumit Kumar Jha, Purnendu Mishra, Rajiv Kumar, Kiran Aatre, Gursewak Singh, Shubham Mathur

    Abstract: Digital pathology and microscopy image analysis are widely employed in the segmentation of digitally scanned IHC slides, primarily to identify cancer and pinpoint regions of interest (ROI) indicative of tumor presence. However, current ROI segmentation models are either stain-specific or suffer from the issues of stain and scanner variance due to different staining protocols or modalities across m…

    Submitted 9 June, 2024; originally announced June 2024.

  17. arXiv:2406.04560  [pdf, other]

    cs.RO

    meSch: Multi-Agent Energy-Aware Scheduling for Task Persistence

    Authors: Kaleb Ben Naveed, An Dang, Rahul Kumar, Dimitra Panagou

    Abstract: This paper develops a scheduling protocol for a team of autonomous robots that operate in long-term persistent tasks. The proposed framework, called meSch, accounts for the robots' limited battery capacity and the presence of a single charging station, and achieves the following contributions: 1) First, it guarantees exclusive use of the charging station by one robot at a time; the approach is onl…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  18. arXiv:2405.18751  [pdf, other]

    cs.CV cs.AI

    On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization

    Authors: Jordi Armengol-Estapé, Vincent Michalski, Ramnath Kumar, Pierre-Luc St-Charles, Doina Precup, Samira Ebrahimi Kahou

    Abstract: Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples. Recent studies show that cross-modal learning can improve representations for few-shot classification. More specifically, language is a rich modality that can be used to guide visual learning. In this work, we experiment with a multi-modal architecture for few-shot learning that consists o…

    Submitted 30 May, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  19. arXiv:2405.18534  [pdf, ps, other]

    cs.DS cs.CR

    Individualized Privacy Accounting via Subsampling with Applications in Combinatorial Optimization

    Authors: Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Adam Sealfon

    Abstract: In this work, we give a new technique for analyzing individualized privacy accounting via the following simple observation: if an algorithm is one-sided add-DP, then its subsampled variant satisfies two-sided DP. From this, we obtain several improved algorithms for private combinatorial optimization problems, including decomposable submodular maximization and set cover. Our error guarantees are as…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: To appear in ICML 2024

  20. arXiv:2405.16048  [pdf, ps, other]

    cs.IT

    A New Construction of Optimal Symmetrical ZCCS

    Authors: Rajen Kumar, Prashant Kumar Srivastava, Sudhan Majhi

    Abstract: We propose new constructions for a two-dimensional ($2$D) perfect array, complete complementary code (CCC), and multiple CCCs as an optimal symmetrical $Z$-complementary code set (ZCCS). We propose a method to generate a two-dimensional perfect array and CCC. By utilising mutually orthogonal sequences, we developed a method to extend the length of a CCC without affecting the set or code size. Addi…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted in 'IEEE International Symposium on Information Theory (ISIT 2024)'

  21. arXiv:2405.13930  [pdf, other]

    cond-mat.mtrl-sci cs.RO cs.SE

    AlabOS: A Python-based Reconfigurable Workflow Management Framework for Autonomous Laboratories

    Authors: Yuxing Fei, Bernardus Rendy, Rishi Kumar, Olympia Dartsi, Hrushikesh P. Sahasrabuddhe, Matthew J. McDermott, Zheren Wang, Nathan J. Szymanski, Lauren N. Walters, David Milsted, Yan Zeng, Anubhav Jain, Gerbrand Ceder

    Abstract: The recent advent of autonomous laboratories, coupled with algorithms for high-throughput screening and active learning, promises to accelerate materials discovery and innovation. As these autonomous systems grow in complexity, the demand for robust and efficient workflow management software becomes increasingly critical. In this paper, we introduce AlabOS, a general-purpose software framework for…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 30 pages, 5 figures

  22. arXiv:2405.11295  [pdf]

    eess.IV cs.CV cs.LG cs.MM

    Medical Image Analysis for Detection, Treatment and Planning of Disease using Artificial Intelligence Approaches

    Authors: Nand Lal Yadav, Satyendra Singh, Rajesh Kumar, Sudhakar Singh

    Abstract: X-ray is one of the prevalent image modalities for the detection and diagnosis of the human body. X-ray provides an actual anatomical structure of an organ present with disease or absence of disease. Segmentation of disease in chest X-ray images is essential for the diagnosis and treatment. In this paper, a framework for the segmentation of X-ray images using artificial intelligence techniques has…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: 10 pages, 3 figures

    Journal ref: International Journal of Microsystems and IoT, Vol. 1, Issue 5, pp.278- 287, 2023

  23. arXiv:2405.09562  [pdf, other]

    eess.SP cs.LG

    MEET: Mixture of Experts Extra Tree-Based sEMG Hand Gesture Identification

    Authors: Naveen Gehlot, Ashutosh Jena, Rajesh Kumar, Mahipal Bukya

    Abstract: Artificial intelligence (AI) has made significant advances in recent years and opened up new possibilities in exploring applications in various fields such as biomedical, robotics, education, industry, etc. Among these fields, human hand gesture recognition is a subject of study that has recently emerged as a research interest in robotic hand control using electromyography (EMG). Surface electromy…

    Submitted 6 May, 2024; originally announced May 2024.

  24. Online Load and Graph Balancing for Random Order Inputs

    Authors: Sungjin Im, Ravi Kumar, Shi Li, Aditya Petety, Manish Purohit

    Abstract: Online load balancing for heterogeneous machines aims to minimize the makespan (maximum machine workload) by scheduling arriving jobs with varying sizes on different machines. In the adversarial setting, where an adversary chooses not only the collection of job sizes but also their arrival order, the problem is well-understood and the optimal competitive ratio is known to be $\Theta(\log m)$ where $m$…

    Submitted 20 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  25. arXiv:2405.02769  [pdf, other]

    cs.LG cs.MA math.OC

    Linear Convergence of Independent Natural Policy Gradient in Games with Entropy Regularization

    Authors: Youbang Sun, Tao Liu, P. R. Kumar, Shahin Shahrampour

    Abstract: This work focuses on the entropy-regularized independent natural policy gradient (NPG) algorithm in multi-agent reinforcement learning. In this work, agents are assumed to have access to an oracle with exact policy evaluation and seek to maximize their respective independent rewards. Each individual's reward is assumed to depend on the actions of all the agents in the multi-agent system, leading t…

    Submitted 4 May, 2024; originally announced May 2024.

  26. arXiv:2404.16326  [pdf, other]

    cs.LG

    NeuroKoopman Dynamic Causal Discovery

    Authors: Rahmat Adesunkanmi, Balaji Sesha Srikanth Pokuri, Ratnesh Kumar

    Abstract: In many real-world applications where the system dynamics has an underlying interdependency among its variables (such as power grid, economics, neuroscience, omics networks, environmental ecosystems, and others), one is often interested in knowing whether the past values of one time series influences the future of another, known as Granger causality, and the associated underlying dynamics. This pa…

    Submitted 25 April, 2024; originally announced April 2024.

  27. arXiv:2404.10881  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Differentially Private Optimization with Sparse Gradients

    Authors: Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Ravi Kumar, Pasin Manurangsi

    Abstract: Motivated by applications of large embedding models, we study differentially private (DP) optimization problems under sparsity of individual gradients. We start with new near-optimal bounds for the classic mean estimation problem but with sparse data, improving upon existing algorithms particularly for the high-dimensional regime. Building on this, we obtain pure- and approximate-DP algorithms wit…

    Submitted 16 April, 2024; originally announced April 2024.

  28. arXiv:2404.08277  [pdf, other]

    cs.CV

    FaceFilterSense: A Filter-Resistant Face Recognition and Facial Attribute Analysis Framework

    Authors: Shubham Tiwari, Yash Sethia, Ritesh Kumar, Ashwani Tanwar, Rudresh Dwivedi

    Abstract: With the advent of social media, fun selfie filters have come into tremendous mainstream use affecting the functioning of facial biometric systems as well as image recognition systems. These filters vary from beautification filters and Augmented Reality (AR)-based filters to filters that modify facial landmarks. Hence, there is a need to assess the impact of such filters on the performance of exis…

    Submitted 18 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

  29. arXiv:2404.06352  [pdf, other]

    cs.CV cs.RO

    DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning

    Authors: Senthil Yogamani, David Unger, Venkatraman Narayanan, Varun Ravi Kumar

    Abstract: Semantic segmentation is an effective way to perform scene understanding. Recently, segmentation in 3D Bird's Eye View (BEV) space has become popular as it is directly used by drive policy. However, there is limited work on BEV segmentation for surround-view fisheye cameras, commonly used in commercial vehicles. As this task has no real-world public dataset and existing synthetic datasets do not han…

    Submitted 9 April, 2024; originally announced April 2024.

  30. arXiv:2404.01786  [pdf]

    cs.CL

    Generative AI-Based Text Generation Methods Using Pre-Trained GPT-2 Model

    Authors: Rohit Pandey, Hetvi Waghela, Sneha Rakshit, Aparna Rangari, Anjali Singh, Rahul Kumar, Ratnadeep Ghosal, Jaydip Sen

    Abstract: This work delved into the realm of automatic text generation, exploring a variety of techniques ranging from traditional deterministic approaches to more modern stochastic methods. Through analysis of greedy search, beam search, top-k sampling, top-p sampling, contrastive searching, and locally typical searching, this work has provided valuable insights into the strengths, weaknesses, and potentia…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: This report pertains to the Capstone Project done by Group 5 of the Fall batch of 2023 students at Praxis Tech School, Kolkata, India. The report consists of 57 pages and includes 17 figures and 8 tables. This is the preprint which will be submitted to IEEE CONIT 2024 for review

  31. arXiv:2403.17673  [pdf, other]

    cs.LG cs.CR cs.DS

    How Private are DP-SGD Implementations?

    Authors: Lynn Chua, Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang

    Abstract: We demonstrate a substantial gap between the privacy guarantees of the Adaptive Batch Linear Queries (ABLQ) mechanism under different types of batch sampling: (i) Shuffling, and (ii) Poisson subsampling; the typical analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) follows by interpreting it as a post-processing of ABLQ. While shuffling-based DP-SGD is more commonly used in p…

    Submitted 6 June, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Proceedings of ICML 2024

  32. arXiv:2403.17199  [pdf, other]

    cs.CL

    Extracting Social Support and Social Isolation Information from Clinical Psychiatry Notes: Comparing a Rule-based NLP System and a Large Language Model

    Authors: Braja Gopal Patra, Lauren A. Lepow, Praneet Kasi Reddy Jagadeesh Kumar, Veer Vekaria, Mohit Manoj Sharma, Prakash Adekkanattu, Brian Fennessy, Gavin Hynes, Isotta Landi, Jorge A. Sanchez-Ruiz, Euijung Ryu, Joanna M. Biernacka, Girish N. Nadkarni, Ardesheer Talati, Myrna Weissman, Mark Olfson, J. John Mann, Alexander W. Charney, Jyotishman Pathak

    Abstract: Background: Social support (SS) and social isolation (SI) are social determinants of health (SDOH) associated with psychiatric outcomes. In electronic health records (EHRs), individual-level SS/SI is typically documented as narrative clinical notes rather than structured coded data. Natural language processing (NLP) algorithms can automate the otherwise labor-intensive process of data extraction.…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 2 figures, 3 tables

  33. arXiv:2403.16338  [pdf, other]

    cs.CV cs.AI

    Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks

    Authors: Madhumitha Sakthi, Louis Kerofsky, Varun Ravi Kumar, Senthil Yogamani

    Abstract: Autonomous driving systems require extensive data collection schemes to cover the diverse scenarios needed for building a robust and safe system. The data volumes are in the order of Exabytes and have to be stored for a long period of time (i.e., more than 10 years of the vehicle's life cycle). Lossless compression doesn't provide sufficient compression ratios, hence, lossy video compression has b…

    Submitted 24 March, 2024; originally announced March 2024.

  34. arXiv:2403.15224  [pdf, other]

    cs.CR cs.DS

    Differentially Private Ad Conversion Measurement

    Authors: John Delaney, Badih Ghazi, Charlie Harrison, Christina Ilvento, Ravi Kumar, Pasin Manurangsi, Martin Pal, Karthik Prabhakar, Mariana Raykova

    Abstract: In this work, we study ad conversion measurement, a central functionality in digital advertising, where an advertiser seeks to estimate advertiser website (or mobile app) conversions attributed to ad impressions that users have interacted with on various publisher websites (or mobile apps). Using differential privacy (DP), a notion that has gained in popularity due to its strong mathematical guara…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: To appear in PoPETS 2024

  35. arXiv:2403.13793  [pdf, other]

    cs.LG

    Evaluating Frontier Models for Dangerous Capabilities

    Authors: Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, Heidi Howard, Tom Lieberum, Ramana Kumar, Maria Abi Raad, Albert Webson, Lewis Ho, Sharon Lin, Sebastian Farquhar, Marcus Hutter, Gregoire Deletang, Anian Ruoss, Seliem El-Sayed, Sasha Brown, Anca Dragan, Rohin Shah, et al. (2 additional authors not shown)

    Abstract: To understand the risks posed by a new AI system, we must understand what it can and cannot do. Building on prior work, we introduce a programme of new "dangerous capability" evaluations and pilot them on Gemini 1.0 models. Our evaluations cover four areas: (1) persuasion and deception; (2) cyber-security; (3) self-proliferation; and (4) self-reasoning. We do not find evidence of strong dangerous…

    Submitted 5 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  36. arXiv:2403.11108  [pdf, ps, other]

    cs.CL

    HarmPot: An Annotation Framework for Evaluating Offline Harm Potential of Social Media Text

    Authors: Ritesh Kumar, Ojaswee Bhalla, Madhu Vanthi, Shehlat Maknoon Wani, Siddharth Singh

    Abstract: In this paper, we discuss the development of an annotation schema to build datasets for evaluating the offline harm potential of social media texts. We define "harm potential" as the potential for an online public post to cause real-world physical harm (i.e., violence). Understanding that real-world violence is often spurred by a web of triggers, often combining several online tactics and pre-exis…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted for: LREC COLING 2024

  37. arXiv:2403.10644  [pdf, ps, other]

    cs.IT

    Multiple Spectrally Null Constrained Complete Complementary Codes of Various Lengths Over Small Alphabet

    Authors: Rajen Kumar, Palash Sarkar, Prashant Kumar Srivastava, Sudhan Majhi

    Abstract: Complete complementary codes (CCCs) are highly valuable in the fields of information security, radar and communication. The spectrally null constrained (SNC) problem arises in radar and modern communication systems due to the reservation or prohibition of specific spectrums from transmission. The literature on SNC-CCCs is somewhat limited in comparison to the literature on traditional CCCs. The ma…

    Submitted 15 March, 2024; originally announced March 2024.

  38. arXiv:2403.05738  [pdf, other]

    cs.LG cs.GT

    Provable Policy Gradient Methods for Average-Reward Markov Potential Games

    Authors: Min Cheng, Ruida Zhou, P. R. Kumar, Chao Tian

    Abstract: We study Markov potential games under the infinite horizon average reward criterion. Most previous studies have been for discounted rewards. We prove that both algorithms based on independent policy gradient and independent natural policy gradient converge globally to a Nash equilibrium for the average reward criterion. To set the stage for gradient-based methods, we first establish that the avera…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 38 pages, 7 figures, published to AISTAT-24

  39. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  40. arXiv:2403.01382  [pdf, other]

    cs.CL

    Automatic Question-Answer Generation for Long-Tail Knowledge

    Authors: Rohan Kumar, Youngmin Kim, Sunitha Ravi, Haitian Sun, Christos Faloutsos, Ruslan Salakhutdinov, Minji Yoon

    Abstract: Pretrained Large Language Models (LLMs) have gained significant attention for addressing open-domain Question Answering (QA). While they exhibit high accuracy in answering questions related to common knowledge, LLMs encounter difficulties in learning about uncommon long-tail knowledge (tail entities). Since manually constructing QA datasets demands substantial human resources, the types of existin…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: Accepted at KDD 2023 KnowledgeNLP

  41. arXiv:2402.13192  [pdf, other]

    math.PR cs.PF

    Spatial Queues with Nearest Neighbour Shifts

    Authors: B. R. Vinay Kumar, Lasse Leskelä

    Abstract: In this work we study multi-server queues on a Euclidean space. Consider $N$ servers that are distributed uniformly in $[0,1]^d$. Customers (users) arrive at the servers according to independent Poisson processes of intensity $λ$. However, they probabilistically decide whether to join the queue they arrived at, or move to one of the nearest neighbours. The strategy followed by the customers affect…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: A part of this work was accepted to the conference International Teletraffic Congress (ITC 35) held between 3--5 October 2023 in Turin, Italy

    MSC Class: 60K30; 05C80

  42. arXiv:2402.12080  [pdf, other]

    cs.CL

    Can LLMs Compute with Reasons?

    Authors: Harshit Sandilya, Peehu Raj, Jainit Sushil Bafna, Srija Mukhopadhyay, Shivansh Sharma, Ellwil Sharma, Arastu Sharma, Neeta Trivedi, Manish Shrivastava, Rajesh Kumar

    Abstract: Large language models (LLMs) often struggle with complex mathematical tasks and are prone to "hallucinating" incorrect answers due to their reliance on statistical patterns. This limitation is further amplified in average small language models (SLMs) with limited context and training data. To address this challenge, we propose an "Inductive Learning" approach utilizing a distributed network of SLMs. This network lever…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 8 pages

    MSC Class: 68T50; ACM Class: I.2.7

  43. arXiv:2402.11464  [pdf, ps, other]

    econ.TH cs.GT

    Weighted Myerson value for Network games

    Authors: Niharika Kakoty, Surajit Borkotokey, Rajnish Kumar, Abhijit Bora

    Abstract: We study the weighted Myerson value for Network games extending a similar concept for communication situations. Network games, unlike communication situations, treat direct and indirect links among players differently and distinguish their effects in both worth generation and allocation processes. The weighted Myerson value is an allocation rule for Network games that generalizes the Myerson value…

    Submitted 18 February, 2024; originally announced February 2024.

    MSC Class: 91A12

  44. arXiv:2402.10797  [pdf, other]

    cs.MS cs.LG stat.CO stat.ML

    BlackJAX: Composable Bayesian inference in JAX

    Authors: Alberto Cabezas, Adrien Corenflos, Junpeng Lao, Rémi Louf, Antoine Carnec, Kaustubh Chaudhari, Reuben Cohn-Gordon, Jeremie Coullon, Wei Deng, Sam Duffield, Gerardo Durán-Martín, Marcin Elantkowski, Dan Foreman-Mackey, Michele Gregori, Carlos Iguaran, Ravin Kumar, Martin Lysy, Kevin Murphy, Juan Camilo Orduz, Karm Patel, Xi Wang, Rob Zinkov

    Abstract: BlackJAX is a library implementing sampling and variational inference algorithms commonly used in Bayesian computation. It is designed for ease of use, speed, and modularity by taking a functional approach to the algorithms' implementation. BlackJAX is written in Python, using JAX to compile and run NumPy-like samplers and variational methods on CPUs, GPUs, and TPUs. The library integrates well w…

    Submitted 22 February, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: Companion paper for the library https://github.com/blackjax-devs/blackjax. Update: minor changes and updated the list of authors to include technical contributors

  45. arXiv:2402.07363  [pdf, other]

    cs.GT cs.LG

    Strategically-Robust Learning Algorithms for Bidding in First-Price Auctions

    Authors: Rachitesh Kumar, Jon Schneider, Balasubramanian Sivan

    Abstract: Learning to bid in repeated first-price auctions is a fundamental problem at the interface of game theory and machine learning, which has seen a recent surge in interest due to the transition of display advertising to first-price auctions. In this work, we propose a novel concave formulation for pure-strategy bidding in first-price auctions, and use it to analyze natural Gradient-Ascent-based algo…

    Submitted 7 July, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  46. arXiv:2402.06826  [pdf, other]

    cs.CV cs.RO

    Neural Rendering based Urban Scene Reconstruction for Autonomous Driving

    Authors: Shihao Shen, Louis Kerofsky, Varun Ravi Kumar, Senthil Yogamani

    Abstract: Dense 3D reconstruction has many applications in automated driving including automated annotation validation, multimodal data augmentation, providing ground truth annotations for systems lacking LiDAR, as well as enhancing auto-labeling accuracy. LiDAR provides highly accurate but sparse depth, whereas camera images enable estimation of dense but noisy depth, particularly at long ranges. In this pa…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted for publication in Electronic Imaging, Autonomous Vehicles and Machines 2024. Qualitative results are shared in https://youtu.be/EK47fYJiY3M

  47. arXiv:2402.06176  [pdf, other]

    eess.SY cs.MA cs.RO math.DS math.OC

    Cooperative Nonlinear Guidance Strategies for Guaranteed Pursuit-Evasion

    Authors: Saurabh Kumar, Shashi Ranjan Kumar, Abhinav Sinha

    Abstract: This paper addresses the pursuit-evasion problem involving three agents -- a pursuer, an evader, and a defender. We develop cooperative guidance laws for the evader-defender team that guarantee that the defender intercepts the pursuer before it reaches the vicinity of the evader. Unlike heuristic methods, optimal control, differential game formulation, and recently proposed time-constrained guidanc…

    Submitted 8 February, 2024; originally announced February 2024.

  48. arXiv:2402.05918  [pdf, other]

    eess.SY cs.MA math.DS math.OC nlin.AO

    Consensus-driven Deviated Pursuit for Guaranteed Simultaneous Interception of Moving Targets

    Authors: Abhinav Sinha, Dwaipayan Mukherjee, Shashi Ranjan Kumar

    Abstract: This work proposes a cooperative strategy that employs deviated pursuit guidance to simultaneously intercept a moving (but not manoeuvring) target. As opposed to many existing cooperative guidance strategies which use estimates of time-to-go, based on proportional-navigation guidance, the proposed strategy uses an exact expression for time-to-go to ensure simultaneous interception. The guidance de…

    Submitted 8 February, 2024; originally announced February 2024.

  49. arXiv:2402.02154  [pdf, other]

    cs.CV cs.LG

    Evaluating the Robustness of Off-Road Autonomous Driving Segmentation against Adversarial Attacks: A Dataset-Centric analysis

    Authors: Pankaj Deoli, Rohit Kumar, Axel Vierling, Karsten Berns

    Abstract: This study investigates the vulnerability of semantic segmentation models to adversarial input perturbations, in the domain of off-road autonomous driving. Despite good performance in generic conditions, the state-of-the-art classifiers are often susceptible to (even) small perturbations, ultimately resulting in inaccurate predictions with high confidence. Prior research has directed their focus o…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 8 pages

  50. arXiv:2401.15246  [pdf, other]

    cs.LG cs.CR cs.IR

    Training Differentially Private Ad Prediction Models with Semi-Sensitive Features

    Authors: Lynn Chua, Qiliang Cui, Badih Ghazi, Charlie Harrison, Pritish Kamath, Walid Krichene, Ravi Kumar, Pasin Manurangsi, Krishna Giri Narra, Amer Sinha, Avinash Varadarajan, Chiyuan Zhang

    Abstract: Motivated by problems arising in digital advertising, we introduce the task of training differentially private (DP) machine learning models with semi-sensitive features. In this setting, a subset of the features is known to the attacker (and thus need not be protected) while the remaining features as well as the label are unknown to the attacker and should be protected by the DP guarantee. This ta…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: 7 pages, 4 figures