Search | arXiv e-print repository

Showing 1–50 of 57 results for author: Tian, R

Searching in archive cs.
  1. arXiv:2407.18908  [pdf, other]

    cs.LG cs.CL cs.CV

    Wolf: Captioning Everything with a World Summarization Framework

    Authors: Boyi Li, Ligeng Zhu, Ran Tian, Shuhan Tan, Yuxiao Chen, Yao Lu, Yin Cui, Sushant Veer, Max Ehrlich, Jonah Philion, Xinshuo Weng, Fuzhao Xue, Andrew Tao, Ming-Yu Liu, Sanja Fidler, Boris Ivanovic, Trevor Darrell, Jitendra Malik, Song Han, Marco Pavone

    Abstract: We propose Wolf, a WOrLd summarization Framework for accurate video captioning. Wolf is an automated captioning framework that adopts a mixture-of-experts approach, leveraging complementary strengths of Vision Language Models (VLMs). By utilizing both image and video models, our framework captures different levels of information and summarizes them efficiently. Our approach can be applied to enhan…

    Submitted 26 July, 2024; originally announced July 2024.

  2. arXiv:2407.01531  [pdf, other]

    cs.RO cs.LG

    Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning

    Authors: Yixiao Wang, Yifei Zhang, Mingxiao Huo, Ran Tian, Xiang Zhang, Yichen Xie, Chenfeng Xu, Pengliang Ji, Wei Zhan, Mingyu Ding, Masayoshi Tomizuka

    Abstract: The increasing complexity of tasks in robotics demands efficient strategies for multitask and continual learning. Traditional models typically rely on a universal policy for all tasks, facing challenges such as high computational costs and catastrophic forgetting when learning new tasks. To address these issues, we introduce a sparse, reusable, and flexible policy, Sparse Diffusion Policy (SDP). B…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2407.00959  [pdf, other]

    cs.AI cs.RO

    Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving

    Authors: Ran Tian, Boyi Li, Xinshuo Weng, Yuxiao Chen, Edward Schmerling, Yue Wang, Boris Ivanovic, Marco Pavone

    Abstract: The autonomous driving industry is increasingly adopting end-to-end learning from sensory inputs to minimize human biases in system design. Traditional end-to-end driving models, however, suffer from long-tail events due to rare or unseen inputs within their training distributions. To address this, we propose TOKEN, a novel Multi-Modal Large Language Model (MM-LLM) that tokenizes the world into ob…

    Submitted 1 July, 2024; originally announced July 2024.

  4. arXiv:2406.16258  [pdf, other]

    cs.RO cs.AI cs.LG

    MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention

    Authors: Yuxin Chen, Chen Tang, Chenran Li, Ran Tian, Peter Stone, Masayoshi Tomizuka, Wei Zhan

    Abstract: Aligning robot behavior with human preferences is crucial for deploying embodied AI agents in human-centered environments. A promising solution is interactive imitation learning from human intervention, where a human expert observes the policy's execution and provides interventions as feedback. However, existing methods often fail to utilize the prior policy efficiently to facilitate learning, thu…

    Submitted 23 June, 2024; originally announced June 2024.

    ACM Class: I.2.6; I.2.9

  5. arXiv:2406.04334  [pdf, other]

    cs.CV

    DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs

    Authors: Lingchen Meng, Jianwei Yang, Rui Tian, Xiyang Dai, Zuxuan Wu, Jianfeng Gao, Yu-Gang Jiang

    Abstract: Most large multimodal models (LMMs) are implemented by feeding visual tokens as a sequence into the first layer of a large language model (LLM). The resulting architecture is simple but significantly increases computation and memory costs, as it has to handle a large number of additional tokens in its input layer. This paper presents a new architecture DeepStack for LMMs. Considering $N$ layers in…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project Page: https://deepstack-vl.github.io/

  6. arXiv:2404.18284  [pdf, other]

    cs.CV

    S3-SLAM: Sparse Tri-plane Encoding for Neural Implicit SLAM

    Authors: Zhiyao Zhang, Yunzhou Zhang, Yanmin Wu, Bin Zhao, Xingshuo Wang, Rui Tian

    Abstract: With the emergence of Neural Radiance Fields (NeRF), neural implicit representations have gained widespread applications across various domains, including simultaneous localization and mapping. However, current neural implicit SLAM faces a challenging trade-off problem between performance and the number of parameters. To address this problem, we propose sparse tri-plane encoding, which efficiently…

    Submitted 28 April, 2024; originally announced April 2024.

  7. arXiv:2404.13153  [pdf, other]

    eess.IV cs.CV

    Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring

    Authors: Chengxu Liu, Xuan Wang, Xiangyu Xu, Ruhao Tian, Shuai Li, Xueming Qian, Ming-Hsuan Yang

    Abstract: Eliminating image blur produced by various kinds of motion has been a challenging problem. Dominant approaches rely heavily on model capacity to remove blurring by reconstructing residual from blurry observation in feature space. These practices not only prevent the capture of spatially variable motion in the real world but also ignore the tailored handling of various motions in image space. In th…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  8. arXiv:2403.04745  [pdf, other]

    cs.RO

    A General Calibrated Regret Metric for Detecting and Mitigating Human-Robot Interaction Failures

    Authors: Kensuke Nakamura, Ran Tian, Andrea Bajcsy

    Abstract: Robot decision-making increasingly relies on expressive data-driven human prediction models when operating around people. While these models are known to suffer from prediction errors in out-of-distribution interactions, not all prediction errors equally impact downstream robot performance. We identify that the mathematical notion of regret precisely characterizes the degree to which incorrect pre…

    Submitted 20 April, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: 6 figures, 4 tables

  9. arXiv:2401.04621  [pdf, other]

    cs.SE cs.AI cs.CL

    DebugBench: Evaluating Debugging Capability of Large Language Models

    Authors: Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Yinxu Pan, Yesai Wu, Haotian Hui, Weichuan Liu, Zhiyuan Liu, Maosong Sun

    Abstract: Large Language Models (LLMs) have demonstrated exceptional coding capability. However, as another critical component of programming proficiency, the debugging capability of LLMs remains relatively unexplored. Previous evaluations of LLMs' debugging ability are significantly limited by the risk of data leakage, the scale of the dataset, and the variety of tested bugs. To overcome these deficiencies…

    Submitted 6 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted as Findings of ACL 2024

  10. arXiv:2311.16526  [pdf, other]

    cs.LG

    On robust overfitting: adversarial training induced distribution matters

    Authors: Runzhi Tian, Yongyi Mao

    Abstract: Adversarial training may be regarded as standard training with a modified loss function. But its generalization error appears much larger than standard training under standard loss. This phenomenon, known as robust overfitting, has attracted significant research attention and remains largely as a mystery. In this paper, we first show empirically that robust overfitting correlates with the increasi…

    Submitted 10 February, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  11. arXiv:2311.04410  [pdf, other]

    cs.RO eess.SY

    An Efficient Probabilistic Solution to Mapping Errors in LiDAR-Camera Fusion for Autonomous Vehicles

    Authors: Dan Shen, Zhengming Zhang, Renran Tian, Yaobin Chen, Rini Sherony

    Abstract: LiDAR-camera fusion is one of the core processes for the perception system of current automated driving systems. The typical sensor fusion process includes a list of coordinate transformation operations following system calibration. Although a significant amount of research has been done to improve the fusion accuracy, there are still inherent data mapping errors in practice related to system sync…

    Submitted 7 November, 2023; originally announced November 2023.

  12. arXiv:2311.04231  [pdf, other]

    eess.SP cs.AI cs.CV

    A Practical Large-Scale Roadside Multi-View Multi-Sensor Spatial Synchronization Framework for Intelligent Transportation Systems

    Authors: Yong Li, Zhiguo Zhao, Yunli Chen, Rui Tian

    Abstract: Spatial synchronization in roadside scenarios is essential for integrating data from multiple sensors at different locations. Current methods using cascading spatial transformation (CST) often lead to cumulative errors in large-scale deployments. Manual camera calibration is insufficient and requires extensive manual work, and existing methods are limited to controlled or single-view scenarios. To…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: 14 pages, 15 figures, 6 tables

  13. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  14. arXiv:2310.07932  [pdf, other]

    cs.RO cs.AI cs.CV

    What Matters to You? Towards Visual Representation Alignment for Robot Learning

    Authors: Ran Tian, Chenfeng Xu, Masayoshi Tomizuka, Jitendra Malik, Andrea Bajcsy

    Abstract: When operating in service of people, robots need to optimize rewards aligned with end-user preferences. Since robots will rely on raw perceptual inputs like RGB images, their rewards will inevitably use visual representations. Recently there has been excitement in using representations from pre-trained visual models, but key to making these work in robotics is fine-tuning, which is typically done…

    Submitted 15 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  15. arXiv:2310.07218  [pdf, other]

    cs.MA cs.AI

    Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization

    Authors: Yuxin Chen, Chen Tang, Ran Tian, Chenran Li, Jinning Li, Masayoshi Tomizuka, Wei Zhan

    Abstract: Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL). The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario. A quantitative examination of this relationship sheds light on effectively training agents for diverse scenarios. In this study, we present the Level of Influence (LoI), a metric quantifyi…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 12 pages, 6 figures

    ACM Class: I.2.6

  16. arXiv:2309.17036  [pdf, other]

    cs.RO cs.CV

    UniQuadric: A SLAM Backend for Unknown Rigid Object 3D Tracking and Light-Weight Modeling

    Authors: Linghao Yang, Yanmin Wu, Yu Deng, Rui Tian, Xinggang Hu, Tiefeng Ma

    Abstract: Tracking and modeling unknown rigid objects in the environment play a crucial role in autonomous unmanned systems and virtual-real interactive applications. However, many existing Simultaneous Localization, Mapping and Moving Object Tracking (SLAMMOT) methods focus solely on estimating specific object poses and lack estimation of object scales and are unable to effectively track unknown objects. I…

    Submitted 2 October, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

  17. arXiv:2307.16789  [pdf, other]

    cs.AI cs.CL cs.LG

    ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

    Authors: Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, Sihan Zhao, Lauren Hong, Runchu Tian, Ruobing Xie, Jie Zhou, Mark Gerstein, Dahai Li, Zhiyuan Liu, Maosong Sun

    Abstract: Despite the advancements of open-source large language models (LLMs), e.g., LLaMA, they remain significantly limited in tool-use capabilities, i.e., using external tools (APIs) to fulfill human instructions. The reason is that current instruction tuning largely focuses on basic language tasks but ignores the tool-use domain. This is in contrast to the excellent tool-use capabilities of state-of-th…

    Submitted 3 October, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

  18. arXiv:2307.15504  [pdf, other]

    cs.CL cs.AI

    Exploring Format Consistency for Instruction Tuning

    Authors: Shihao Liang, Runchu Tian, Kunlun Zhu, Yujia Qin, Huadong Wang, Xin Cong, Zhiyuan Liu, Xiaojiang Liu, Maosong Sun

    Abstract: Instruction tuning has emerged as a promising approach to enhancing large language models in following human instructions. It is shown that increasing the diversity and number of instructions in the training data can consistently enhance generalization performance, which facilitates a recent endeavor to collect various instructions and integrate existing instruction tuning datasets into larger col…

    Submitted 8 January, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

  19. arXiv:2304.08354  [pdf, other]

    cs.CL cs.AI cs.LG

    Tool Learning with Foundation Models

    Authors: Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, et al. (16 additional authors not shown)

    Abstract: Humans possess an extraordinary ability to create and utilize tools, allowing them to overcome physical limitations and explore new frontiers. With the advent of foundation models, AI systems have the potential to be equally adept in tool use as humans. This paradigm, i.e., tool learning with foundation models, combines the strengths of specialized tools and foundation models to achieve enhanced a…

    Submitted 6 August, 2024; v1 submitted 17 April, 2023; originally announced April 2023.

  20. arXiv:2301.00901  [pdf, other]

    cs.RO cs.AI

    Towards Modeling and Influencing the Dynamics of Human Learning

    Authors: Ran Tian, Masayoshi Tomizuka, Anca Dragan, Andrea Bajcsy

    Abstract: Humans have internal models of robots (like their physical capabilities), the world (like what will happen next), and their tasks (like a preferred goal). However, human internal models are not always perfect: for example, it is easy to underestimate a robot's inertia. Nevertheless, these models change and improve over time as humans gather more experience. Interestingly, robot actions influence w…

    Submitted 2 January, 2023; originally announced January 2023.

    Comments: 18th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2023

  21. arXiv:2212.12660  [pdf]

    eess.SY cs.CV

    Risk assessment and mitigation of e-scooter crashes with naturalistic driving data

    Authors: Avinash Prabu, Zhengming Zhang, Renran Tian, Stanley Chien, Lingxi Li, Yaobin Chen, Rini Sherony

    Abstract: Recently, e-scooter-involved crashes have increased significantly but little information is available about the behaviors of on-road e-scooter riders. Most existing e-scooter crash research was based on retrospectively descriptive media reports, emergency room patient records, and crash reports. This paper presents a naturalistic driving study with a focus on e-scooter and vehicle encounters. The…

    Submitted 15 January, 2023; v1 submitted 24 December, 2022; originally announced December 2022.

  22. SceNDD: A Scenario-based Naturalistic Driving Dataset

    Authors: Avinash Prabu, Nitya Ranjan, Lingxi Li, Renran Tian, Stanley Chien, Yaobin Chen, Rini Sherony

    Abstract: In this paper, we propose SceNDD: a scenario-based naturalistic driving dataset that is built upon data collected from an instrumented vehicle in downtown Indianapolis. The data collection was completed in 68 driving sessions with different drivers, where each session lasted about 20--40 minutes. The main goal of creating this dataset is to provide the research community with real driving scenario…

    Submitted 22 December, 2022; originally announced December 2022.

    Comments: Conference: 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC). Link: https://ieeexplore.ieee.org/document/9921953

  23. arXiv:2212.11979  [pdf]

    eess.SY cs.CV

    A Wearable Data Collection System for Studying Micro-Level E-Scooter Behavior in Naturalistic Road Environment

    Authors: Avinash Prabu, Dan Shen, Renran Tian, Stanley Chien, Lingxi Li, Yaobin Chen, Rini Sherony

    Abstract: As one of the most popular micro-mobility options, e-scooters are spreading in hundreds of big cities and college towns in the US and worldwide. In the meantime, e-scooters are also posing new challenges to traffic safety. In general, e-scooters are suggested to be ridden in bike lanes/sidewalks or share the road with cars at the maximum speed of about 15-20 mph, which is more flexible and much fa…

    Submitted 22 December, 2022; originally announced December 2022.

    Comments: Conference: Fast-zero'21, Kanazawa, Japan Date of publication: Sep 2021 Publisher: JSAE

    Journal ref: https://tech.jsae.or.jp/paperinfo/en/content/conf2021-02.11/

  24. arXiv:2212.06385  [pdf, other]

    cs.CL

    TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities

    Authors: Zhe Zhao, Yudong Li, Cheng Hou, Jing Zhao, Rong Tian, Weijie Liu, Yiren Chen, Ningyuan Sun, Haoyan Liu, Weiquan Mao, Han Guo, Weigang Guo, Taiqiang Wu, Tao Zhu, Wenhang Shi, Chen Chen, Shan Huang, Sihong Chen, Liqun Liu, Feifei Li, Xiaoshuai Chen, Xingwu Sun, Zhanhui Kang, Xiaoyong Du, Linlin Shen, et al. (1 additional author not shown)

    Abstract: Recently, the success of pre-training in text domain has been fully extended to vision, audio, and cross-modal scenarios. The proposed pre-training models of different modalities are showing a rising trend of homogeneity in their model structures, which brings the opportunity to implement different pre-training models within a uniform framework. In this paper, we present TencentPretrain, a toolkit…

    Submitted 11 July, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

  25. arXiv:2212.00776  [pdf, other]

    cs.CV

    ResFormer: Scaling ViTs with Multi-Resolution Training

    Authors: Rui Tian, Zuxuan Wu, Qi Dai, Han Hu, Yu Qiao, Yu-Gang Jiang

    Abstract: Vision Transformers (ViTs) have achieved overwhelming success, yet they suffer from vulnerable resolution scalability, i.e., the performance drops drastically when presented with input resolutions that are unseen during training. We introduce, ResFormer, a framework that is built upon the seminal idea of multi-resolution training for improved performance on a wide spectrum of, mostly unseen, testi…

    Submitted 3 April, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: CVPR 2023

  26. arXiv:2210.11693  [pdf, other]

    cs.LG

    Amos: An Adam-style Optimizer with Adaptive Weight Decay towards Model-Oriented Scale

    Authors: Ran Tian, Ankur P. Parikh

    Abstract: We present Amos, a stochastic gradient-based optimizer designed for training deep neural networks. It can be viewed as an Adam optimizer with theoretically supported, adaptive learning-rate decay and weight decay. A key insight behind Amos is that it leverages model-specific information to determine the initial learning-rate and decaying schedules. When used for pre-training BERT variants and T5,…

    Submitted 21 November, 2022; v1 submitted 20 October, 2022; originally announced October 2022.

  27. arXiv:2210.09934  [pdf, other]

    cs.CL

    A Simple and Effective Method to Improve Zero-Shot Cross-Lingual Transfer Learning

    Authors: Kunbo Ding, Weijie Liu, Yuejian Fang, Weiquan Mao, Zhe Zhao, Tao Zhu, Haoyan Liu, Rong Tian, Yiren Chen

    Abstract: Existing zero-shot cross-lingual transfer methods rely on parallel corpora or bilingual dictionaries, which are expensive and impractical for low-resource languages. To disengage from these dependencies, researchers have explored training multilingual models on English-only resources and transferring them to low-resource languages. However, its effect is limited by the gap between embedding cluste…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: Published at COLING2022

  28. arXiv:2209.09130  [pdf, other]

    cs.LG cs.CL

    SAMP: A Model Inference Toolkit of Post-Training Quantization for Text Processing via Self-Adaptive Mixed-Precision

    Authors: Rong Tian, Zijing Zhao, Weijie Liu, Haoyan Liu, Weiquan Mao, Zhe Zhao, Kan Zhou

    Abstract: The latest industrial inference engines, such as FasterTransformer and TurboTransformers, have verified that half-precision floating point (FP16) and 8-bit integer (INT8) quantization can greatly improve model inference speed. However, the existing INT8 quantization methods are too complicated, and improper usage will lead to model performance damage greatly. In this paper, we develop a toolkit fo…

    Submitted 17 December, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: This paper was accepted by EMNLP2023

  29. arXiv:2208.13441  [pdf, other]

    cs.CV cs.AI

    Rethinking Skip Connections in Encoder-decoder Networks for Monocular Depth Estimation

    Authors: Zhitong Lai, Haichao Sun, Rui Tian, Nannan Ding, Zhiguo Wu, Yanjie Wang

    Abstract: Skip connections are fundamental units in encoder-decoder networks, which are able to improve the feature propagation of the neural networks. However, most methods with skip connections just connected features with the same resolution in the encoder and the decoder, which ignored the information loss in the encoder with the layers going deeper. To leverage the information loss of the features in sh…

    Submitted 29 August, 2022; originally announced August 2022.

  30. arXiv:2205.11588  [pdf, other]

    cs.CL cs.AI

    Simple Recurrence Improves Masked Language Models

    Authors: Tao Lei, Ran Tian, Jasmijn Bastings, Ankur P. Parikh

    Abstract: In this work, we explore whether modeling recurrence into the Transformer architecture can both be beneficial and efficient, by building an extremely simple recurrent module into the Transformer. We compare our model to baselines following the training and evaluation recipe of BERT. Our results confirm that recurrence can indeed improve Transformer models by a consistent margin, without requiring…

    Submitted 23 May, 2022; originally announced May 2022.

  31. arXiv:2204.12143  [pdf, other]

    cs.CV

    Deeper Insights into the Robustness of ViTs towards Common Corruptions

    Authors: Rui Tian, Zuxuan Wu, Qi Dai, Han Hu, Yu-Gang Jiang

    Abstract: With Vision Transformers (ViTs) making great advances in a variety of computer vision tasks, recent literature have proposed various variants of vanilla ViTs to achieve better efficiency and efficacy. However, it remains unclear how their unique architecture impact robustness towards common corruptions. In this paper, we make the first attempt to probe into the robustness gap among ViT variants an…

    Submitted 19 August, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

  32. arXiv:2112.02604  [pdf, other]

    cs.CV cs.AI

    PSI: A Pedestrian Behavior Dataset for Socially Intelligent Autonomous Car

    Authors: Tina Chen, Taotao Jing, Renran Tian, Yaobin Chen, Joshua Domeyer, Heishiro Toyoda, Rini Sherony, Zhengming Ding

    Abstract: Prediction of pedestrian behavior is critical for fully autonomous vehicles to drive in busy city streets safely and efficiently. The future autonomous cars need to fit into mixed conditions with not only technical but also social capabilities. As more algorithms and datasets have been developed to predict pedestrian behaviors, these efforts lack the benchmark labels and the capability to estimate…

    Submitted 11 June, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

  33. arXiv:2111.14060  [pdf, other]

    cs.CV

    Detection of E-scooter Riders in Naturalistic Scenes

    Authors: Kumar Apurv, Renran Tian, Rini Sherony

    Abstract: E-scooters have become ubiquitous vehicles in major cities around the world. The numbers of e-scooters keep escalating, increasing their interactions with other cars on the road. Normal behavior of an e-scooter rider varies enormously to other vulnerable road users. This situation creates new challenges for vehicle active safety systems and automated driving functionalities, which require the detec…

    Submitted 28 November, 2021; originally announced November 2021.

  34. arXiv:2110.08977  [pdf, other]

    cs.RO cs.CV

    Accurate and Robust Object-oriented SLAM with 3D Quadric Landmark Construction in Outdoor Environment

    Authors: Rui Tian, Yunzhou Zhang, Yonghui Feng, Linghao Yang, Zhenzhong Cao, Sonya Coleman, Dermot Kerr

    Abstract: Object-oriented SLAM is a popular technology in autonomous driving and robotics. In this paper, we propose a stereo visual SLAM with a robust quadric landmark representation method. The system consists of four components, including deep learning detection, object-oriented data association, dual quadric landmark initialization and object-based pose optimization. State-of-the-art quadric-based SLAM…

    Submitted 17 October, 2021; originally announced October 2021.

    Comments: Submitting to RA-L

  35. arXiv:2109.14700  [pdf, other]

    cs.RO

    Safety Assurances for Human-Robot Interaction via Confidence-aware Game-theoretic Human Models

    Authors: Ran Tian, Liting Sun, Andrea Bajcsy, Masayoshi Tomizuka, Anca D. Dragan

    Abstract: An outstanding challenge with safety methods for human-robot interaction is reducing their conservatism while maintaining robustness to variations in human behavior. In this work, we propose that robots use confidence-aware game-theoretic models of human behavior when assessing the safety of a human-robot interaction. By treating the influence between the human and robot as well as the human's rat…

    Submitted 30 October, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

  36. arXiv:2109.12490  [pdf, other]

    cs.RO

    Anytime Game-Theoretic Planning with Active Reasoning About Humans' Latent States for Human-Centered Robots

    Authors: Ran Tian, Liting Sun, Masayoshi Tomizuka, David Isele

    Abstract: A human-centered robot needs to reason about the cognitive limitation and potential irrationality of its human partner to achieve seamless interactions. This paper proposes an anytime game-theoretic planner that integrates iterative reasoning models, a partially observable Markov decision process, and chance-constrained Monte-Carlo belief tree search for robot behavioral planning. Our planner enab…

    Submitted 26 September, 2021; originally announced September 2021.

    Comments: Presented at ICRA 2021

  37. arXiv:2108.13032  [pdf, other]

    cs.CL cs.LG

    Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning

    Authors: Ran Tian, Joshua Maynez, Ankur P. Parikh

    Abstract: The highly popular Transformer architecture, based on self-attention, is the foundation of large pretrained models such as BERT, that have become an enduring paradigm in NLP. While powerful, the computational resources and time required to pretrain such models can be prohibitive. In this work, we present an alternative self-attention architecture, Shatter, that more efficiently encodes sequence in…

    Submitted 30 August, 2021; originally announced August 2021.

  38. arXiv:2106.02737  [pdf, other]

    cs.RO

    Negotiation-Aware Reachability-Based Safety Verification for Autonomous Driving in Interactive Scenarios

    Authors: Ran Tian, Anjian Li, Masayoshi Tomizuka, Liting Sun

    Abstract: Safety assurance is a critical yet challenging aspect when developing self-driving technologies. Hamilton-Jacobi backward-reachability analysis is a formal verification tool for verifying the safety of dynamic systems in the presence of disturbances. However, the standard approach is too conservative to be applied to self-driving applications due to its worst-case assumption on humans' behaviors (…

    Submitted 4 June, 2021; originally announced June 2021.

    Comments: This work is presented at the ICRA 2021 Workshop on Safe Robot Control with Learned Motion and Environment Models

  39. arXiv:2103.04289  [pdf, other]

    cs.AI cs.RO

    Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data

    Authors: Ran Tian, Masayoshi Tomizuka, Liting Sun

    Abstract: Reward function, as an incentive representation that recognizes humans' agency and rationalizes humans' actions, is particularly appealing for modeling human behavior in human-robot interaction. Inverse Reinforcement Learning is an effective way to retrieve reward functions from demonstrations. However, it has always been challenging when applying it to multi-agent settings since the mutual influe…

    Submitted 7 March, 2021; originally announced March 2021.

  40. arXiv:2102.06827  [pdf, other]

    cs.MS physics.chem-ph

    COMET: A Domain-Specific Compilation of High-Performance Computational Chemistry

    Authors: Erdal Mutlu, Ruiqin Tian, Bin Ren, Sriram Krishnamoorthy, Roberto Gioiosa, Jacques Pienaar, Gokcen Kestor

    Abstract: The computational power increases over the past decades have greatly enhanced the ability to simulate chemical reactions and understand ever more complex transformations. Tensor contractions are the fundamental computational building block of these simulations. These simulations have often been tied to one platform and restricted in generality by the interface provided to the user. The expanding pre…

    Submitted 12 February, 2021; originally announced February 2021.

    Comments: Proceeding of the 33rd the Workshop on Languages and Compilers for Parallel Computing (LCPC), October 2020

  41. arXiv:2102.05187  [pdf, other]

    cs.DC cs.PL

    A High-Performance Sparse Tensor Algebra Compiler in Multi-Level IR

    Authors: Ruiqin Tian, Luanzheng Guo, Jiajia Li, Bin Ren, Gokcen Kestor

    Abstract: Tensor algebra is widely used in many applications, such as scientific computing, machine learning, and data analytics. The tensors represented real-world data are usually large and sparse. There are tens of storage formats designed for sparse matrices and/or tensors and the performance of sparse tensor operations depends on a particular architecture and/or selected sparse format, which makes it c…

    Submitted 9 February, 2021; originally announced February 2021.

  42. arXiv:2101.05995  [pdf, other

    cs.CV

    Accurate and Robust Scale Recovery for Monocular Visual Odometry Based on Plane Geometry

    Authors: Rui Tian, Yunzhou Zhang, Delong Zhu, Shiwen Liang, Sonya Coleman, Dermot Kerr

    Abstract: Scale ambiguity is a fundamental problem in monocular visual odometry. Typical solutions include loop closure detection and environment information mining. For applications like self-driving cars, loop closure is not always available, hence mining prior knowledge from the environment becomes a more promising approach. In this paper, with the assumption of a constant height of the camera above the…

    Submitted 16 May, 2021; v1 submitted 15 January, 2021; originally announced January 2021.

    Comments: Submitting to IEEE International Conference on Robotics and Automation 2021

  43. arXiv:2010.01677  [pdf, other

    cs.CL

    Local Additivity Based Data Augmentation for Semi-supervised NER

    Authors: Jiaao Chen, Zhenghui Wang, Ran Tian, Zichao Yang, Diyi Yang

    Abstract: Named Entity Recognition (NER) is one of the first stages in deep language understanding yet current NER models heavily rely on human-annotated data. In this work, to alleviate the dependence on labeled data, we propose a Local Additivity based Data Augmentation (LADA) method for semi-supervised NER, in which we create virtual samples by interpolating sequences close to each other. Our approach ha…

    Submitted 4 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  44. arXiv:2009.01495  [pdf, other

    cs.LG stat.ML

    Bounded Risk-Sensitive Markov Games: Forward Policy Design and Inverse Reward Learning with Iterative Reasoning and Cumulative Prospect Theory

    Authors: Ran Tian, Liting Sun, Masayoshi Tomizuka

    Abstract: Classical game-theoretic approaches for multi-agent systems in both the forward policy design problem and the inverse reward learning problem often make strong rationality assumptions: agents perfectly maximize expected utilities under uncertainties. Such assumptions, however, substantially mismatch with observed humans' behaviors such as satisficing with sub-optimal, risk-seeking, and loss-aversi…

    Submitted 20 March, 2021; v1 submitted 3 September, 2020; originally announced September 2020.

    Comments: Accepted by 2021 AAAI Conference on Artificial Intelligence

    Journal ref: 2021 AAAI Conference on Artificial Intelligence

  45. arXiv:2006.14420  [pdf, other

    cs.RO eess.SY

    Three-Dimensional Dynamic Modeling and Motion Analysis for an Active-Tail-Actuated Robotic Fish with Barycentre Regulating Mechanism

    Authors: Xingwen Zheng, Minglei Xiong, Junzheng Zheng, Manyi Wang, Runyu Tian, Guangming Xie

    Abstract: Dynamic modeling has been capturing attention for its fundamentality in precise locomotion analyses and control of underwater robots. However, the existing researches have mainly focused on investigating two-dimensional motion of underwater robots, and little attention has been paid to three-dimensional dynamic modeling, which is just what we focus on. In this article, a three-dimensional dynamic…

    Submitted 22 May, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

  46. arXiv:1910.08684  [pdf, other

    cs.CL

    Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation

    Authors: Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh

    Abstract: We address the issue of hallucination in data-to-text generation, i.e., reducing the generation of text that is unsupported by the source. We conjecture that hallucination can be caused by an encoder-decoder model generating content phrases without attending to the source; so we propose a confidence score to ensure that the model attends to the source whenever necessary, as well as a variational B…

    Submitted 2 November, 2020; v1 submitted 18 October, 2019; originally announced October 2019.

  47. arXiv:1910.07141  [pdf, other

    cs.RO eess.SY

    Game-theoretic Modeling of Traffic in Unsignalized Intersection Network for Autonomous Vehicle Control Verification and Validation

    Authors: Ran Tian, Nan Li, Ilya Kolmanovsky, Yildiray Yildiz, Anouck Girard

    Abstract: For a foreseeable future, autonomous vehicles (AVs) will operate in traffic together with human-driven vehicles. Their planning and control systems need extensive testing, including early-stage testing in simulations where the interactions among autonomous/human-driven vehicles are represented. Motivated by the need for such simulation tools, we propose a game-theoretic approach to modeling vehicl…

    Submitted 17 July, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: IEEE Intelligent Transportation Systems Transactions

  48. arXiv:1909.12935  [pdf

    cs.CV cs.CY

    Responsible Facial Recognition and Beyond

    Authors: Yi Zeng, Enmeng Lu, Yinqian Sun, Ruochen Tian

    Abstract: Facial recognition is changing the way we live in and interact with our society. Here we discuss the two sides of facial recognition, summarizing potential risks and current concerns. We introduce current policies and regulations in different countries. Very importantly, we point out that the risks and concerns are not only from facial recognition, but also realistically very similar to other biom…

    Submitted 19 September, 2019; originally announced September 2019.

  49. arXiv:1909.12701  [pdf, other

    cs.AI cs.GT cs.LG

    Beating humans in a penny-matching game by leveraging cognitive hierarchy theory and Bayesian learning

    Authors: Ran Tian, Nan Li, Ilya Kolmanovsky, Anouck Girard

    Abstract: It is a long-standing goal of artificial intelligence (AI) to be superior to human beings in decision making. Games are suitable for testing AI capabilities of making good decisions in non-numerical tasks. In this paper, we develop a new AI algorithm to play the penny-matching game considered in Shannon's "mind-reading machine" (1953) against human players. In particular, we exploit cognitive hier…

    Submitted 13 February, 2021; v1 submitted 27 September, 2019; originally announced September 2019.

    Comments: IEEE 2020 American Control Conference

  50. arXiv:1810.00829  [pdf, other

    cs.GT cs.LG cs.RO

    Adaptive Game-Theoretic Decision Making for Autonomous Vehicle Control at Roundabouts

    Authors: Ran Tian, Sisi Li, Nan Li, Ilya Kolmanovsky, Anouck Girard, Yildiray Yildiz

    Abstract: In this paper, we propose a decision making algorithm for autonomous vehicle control at a roundabout intersection. The algorithm is based on a game-theoretic model representing the interactions between the ego vehicle and an opponent vehicle, and adapts to an online estimated driver type of the opponent vehicle. Simulation results are reported.

    Submitted 1 October, 2018; originally announced October 2018.

    Comments: 2018 IEEE Conference on Decision and Control (CDC)