Search | arXiv e-print repository

Showing 1–50 of 133 results for author: Hoi, S

Searching in archive cs.

  1. arXiv:2404.02415  [pdf, other]

    cs.CV

    What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases

    Authors: Anthony Meng Huat Tiong, Junqi Zhao, Boyang Li, Junnan Li, Steven C. H. Hoi, Caiming Xiong

    Abstract: Vision-language (VL) models, pretrained on colossal image-text datasets, have attained broad VL competence that is difficult to evaluate. A common belief is that a small number of VL skills underlie the variety of VL tests. In this paper, we perform a large-scale transfer learning experiment aimed at discovering latent VL skills from data. We reveal interesting characteristics that have important…

    Submitted 2 April, 2024; originally announced April 2024.

  2. arXiv:2402.02526  [pdf, other]

    cs.LG

    CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition

    Authors: Quang Pham, Giang Do, Huy Nguyen, TrungTin Nguyen, Chenghao Liu, Mina Sartipi, Binh T. Nguyen, Savitha Ramasamy, Xiaoli Li, Steven Hoi, Nhat Ho

    Abstract: Sparse mixture of experts (SMoE) offers an appealing solution to scale up the model complexity beyond the means of increasing the network's depth or width. However, effective training of SMoE has proven to be challenging due to the representation collapse issue, which causes parameter redundancy and limited representation potentials. In this work, we propose a competition mechanism to address this…

    Submitted 4 February, 2024; originally announced February 2024.

  3. arXiv:2402.01440  [pdf, other]

    cs.LG cs.AI cs.SI

    Few-Shot Learning on Graphs: from Meta-learning to Pre-training and Prompting

    Authors: Xingtong Yu, Yuan Fang, Zemin Liu, Yuxia Wu, Zhihao Wen, Jianyuan Bo, Xinming Zhang, Steven C. H. Hoi

    Abstract: Graph representation learning, a critical step in graph-centric tasks, has seen significant advancements. Earlier techniques often operate in an end-to-end setting, where performance heavily relies on the availability of ample labeled data. This constraint has spurred the emergence of few-shot learning on graphs, where only a few task-specific labels are available for each task. Given the extensiv…

    Submitted 2 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  4. arXiv:2312.07035  [pdf, other]

    cs.LG cs.AI

    HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts

    Authors: Giang Do, Khiem Le, Quang Pham, TrungTin Nguyen, Thanh-Nam Doan, Binh T. Nguyen, Chenghao Liu, Savitha Ramasamy, Xiaoli Li, Steven Hoi

    Abstract: By routing input tokens to only a few split experts, Sparse Mixture-of-Experts has enabled efficient training of large language models. Recent findings suggest that fixing the routers can achieve competitive performance by alleviating the collapsing problem, where all experts eventually learn similar representations. However, this strategy has two key limitations: (i) the policy derived from rando…

    Submitted 12 December, 2023; originally announced December 2023.

  5. arXiv:2310.18628  [pdf, other]

    cs.CL cs.LG

    Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation

    Authors: Hailin Chen, Amrita Saha, Steven Hoi, Shafiq Joty

    Abstract: With the rise of powerful closed-sourced LLMs (ChatGPT, GPT-4), there is increasing interest in distilling the capabilities of closed-sourced LLMs to smaller open-sourced LLMs. Previous distillation methods usually prompt ChatGPT to generate a set of instructions and answers, for the student model to learn. However, such standard distillation approach neglects the merits and conditions of the stude…

    Submitted 26 January, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023; Codes at: https://github.com/SalesforceAIResearch/PersDistill

  6. arXiv:2309.06057  [pdf, other]

    cs.SE cs.CL

    RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair

    Authors: Weishi Wang, Yue Wang, Shafiq Joty, Steven C. H. Hoi

    Abstract: Automatic program repair (APR) is crucial to reduce manual debugging efforts for developers and improve software reliability. While conventional search-based techniques typically rely on heuristic rules or a redundancy assumption to mine fix patterns, recent years have witnessed the surge of deep learning (DL) based approaches to automate the program repair process in a data-driven manner. However…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: FSE 2023, Long paper

  7. arXiv:2306.11417  [pdf, other]

    cs.AI cs.LG cs.SE

    PyRCA: A Library for Metric-based Root Cause Analysis

    Authors: Chenghao Liu, Wenzhuo Yang, Himanshu Mittal, Manpreet Singh, Doyen Sahoo, Steven C. H. Hoi

    Abstract: We introduce PyRCA, an open-source Python machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps). It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents. It offers a unified interface for multiple commonly used RCA models, encompassing both graph construction and s…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: Github repo: https://github.com/salesforce/PyRCA

  8. arXiv:2306.00620  [pdf, other]

    cs.LG

    OTW: Optimal Transport Warping for Time Series

    Authors: Fabian Latorre, Chenghao Liu, Doyen Sahoo, Steven C. H. Hoi

    Abstract: Dynamic Time Warping (DTW) has become the pragmatic choice for measuring distance between time series. However, it suffers from unavoidable quadratic time complexity when the optimal alignment matrix needs to be computed exactly. This hinders its use in deep learning architectures, where layers involving DTW computations cause severe bottlenecks. To alleviate these issues, we introduce a new metri…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: This is an extended version of an ICASSP 2023 accepted paper https://ieeexplore.ieee.org/document/10095915

  9. arXiv:2306.00029  [pdf, other]

    cs.SE cs.AI

    CodeTF: One-stop Transformer Library for State-of-the-art Code LLM

    Authors: Nghi D. Q. Bui, Hung Le, Yue Wang, Junnan Li, Akhilesh Deepak Gotmare, Steven C. H. Hoi

    Abstract: Code intelligence plays a key role in transforming modern software engineering. Recently, deep learning-based models, especially Transformer-based large language models (LLMs), have demonstrated remarkable potential in tackling these tasks by leveraging massive open-source code data and programming language features. However, the development and deployment of such models often require expertise in…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: Ongoing work - Draft Preview

  10. arXiv:2305.14720  [pdf, other]

    cs.CV cs.AI

    BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

    Authors: Dongxu Li, Junnan Li, Steven C. H. Hoi

    Abstract: Subject-driven text-to-image generation models create novel renditions of an input subject based on text prompts. Existing models suffer from lengthy fine-tuning and difficulties preserving the subject fidelity. To overcome these limitations, we introduce BLIP-Diffusion, a new subject-driven image generation model that supports multimodal control which consumes inputs of subject images and text pr…

    Submitted 21 June, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  11. arXiv:2305.07922  [pdf, other]

    cs.CL cs.LG cs.PL

    CodeT5+: Open Code Large Language Models for Code Understanding and Generation

    Authors: Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, Steven C. H. Hoi

    Abstract: Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence. However, existing code LLMs have two main limitations in terms of architecture and pretraining tasks. First, they often adopt a specific architecture (encoder-only or decoder-only) or rely on a unified encoder-decoder network for different downstream tasks. The former paradigm is limi…

    Submitted 20 May, 2023; v1 submitted 13 May, 2023; originally announced May 2023.

    Comments: 26 pages, preprint

  12. arXiv:2305.06500  [pdf, other]

    cs.CV cs.LG

    InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning

    Authors: Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, Steven Hoi

    Abstract: Large-scale pre-training and instruction tuning have been successful at creating general-purpose language models with broad competence. However, building general-purpose vision-language models is challenging due to the rich input distributions and task diversity resulting from the additional visual input. Although vision-language pretraining has been widely studied, vision-language instruction tun…

    Submitted 15 June, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: preprint

  13. arXiv:2304.04661  [pdf, other]

    cs.LG cs.DC cs.SE

    AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges

    Authors: Qian Cheng, Doyen Sahoo, Amrita Saha, Wenzhuo Yang, Chenghao Liu, Gerald Woo, Manpreet Singh, Silvio Savarese, Steven C. H. Hoi

    Abstract: Artificial Intelligence for IT operations (AIOps) aims to combine the power of AI with the big data generated by IT Operations processes, particularly in cloud infrastructures, to provide actionable insights with the primary goal of maximizing availability. There are a wide variety of problems to address, and multiple use-cases, where AI capabilities can be leveraged to enhance operational efficie…

    Submitted 10 April, 2023; originally announced April 2023.

  14. arXiv:2301.13415  [pdf, other]

    cs.AI cs.LG cs.SE

    LogAI: A Library for Log Analytics and Intelligence

    Authors: Qian Cheng, Amrita Saha, Wenzhuo Yang, Chenghao Liu, Doyen Sahoo, Steven Hoi

    Abstract: Software and System logs record runtime information about processes executing within a system. These logs have become the most critical and ubiquitous forms of observability data that help developers understand system behavior, monitor system health and resolve issues. However, the volume of logs generated can be humongous (of the order of petabytes per day) especially for complex distributed syst…

    Submitted 31 January, 2023; originally announced January 2023.

    Comments: 17 pages, 7 figures, technical report for open source code, paper release with code

  15. arXiv:2301.12597  [pdf, other]

    cs.CV

    BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

    Authors: Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi

    Abstract: The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models. This paper proposes BLIP-2, a generic and efficient pre-training strategy that bootstraps vision-language pre-training from off-the-shelf frozen pre-trained image encoders and frozen large language models. BLIP-2 bridges the modality gap with a lightweight Querying Tra…

    Submitted 15 June, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

  16. arXiv:2301.10859  [pdf, other]

    cs.LG cs.AI

    Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data

    Authors: Devansh Arpit, Matthew Fernandez, Itai Feigenbaum, Weiran Yao, Chenghao Liu, Wenzhuo Yang, Paul Josel, Shelby Heinecke, Eric Hu, Huan Wang, Stephen Hoi, Caiming Xiong, Kun Zhang, Juan Carlos Niebles

    Abstract: We introduce the Salesforce CausalAI Library, an open-source library for causal analysis using observational data. It supports causal discovery and causal inference for tabular and time series data, of discrete, continuous and heterogeneous types. This library includes algorithms that handle linear and non-linear causal relationships between variables, and uses multi-processing for speed-up. We al…

    Submitted 22 September, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

  17. arXiv:2212.10846  [pdf, other]

    cs.CV cs.MM

    From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models

    Authors: Jiaxian Guo, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Boyang Li, Dacheng Tao, Steven C. H. Hoi

    Abstract: Large language models (LLMs) have demonstrated excellent zero-shot generalization to new language tasks. However, effective utilization of LLMs for zero-shot visual question-answering (VQA) remains challenging, primarily due to the modality disconnection and task disconnection between LLM and VQA task. End-to-end training on vision and language data may bridge the disconnections, but is inflexible…

    Submitted 8 May, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: CVPR 2023 Camera Ready Version

  18. arXiv:2211.17142  [pdf, other]

    cs.LG cs.AI

    Learning Label Modular Prompts for Text Classification in the Wild

    Authors: Hailin Chen, Amrita Saha, Shafiq Joty, Steven C. H. Hoi

    Abstract: Machine learning models usually assume i.i.d data during training and testing, but data and tasks in real world often change over time. To emulate the transient nature of real world, we propose a challenging but practical task: text classification in-the-wild, which introduces different non-stationary training/testing stages. Decomposing a complex task into modular components can enable robust gen…

    Submitted 5 December, 2022; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: accepted to EMNLP 2022

  19. arXiv:2211.15916  [pdf, other]

    cs.CL

    BotSIM: An End-to-End Bot Simulation Toolkit for Commercial Task-Oriented Dialog Systems

    Authors: Guangsen Wang, Shafiq Joty, Junnan Li, Steven Hoi

    Abstract: We introduce BotSIM, a modular, open-source Bot SIMulation environment with dialog generation, user simulation and conversation analytics capabilities. BotSIM aims to serve as a one-stop solution for large-scale data-efficient end-to-end evaluation, diagnosis and remediation of commercial task-oriented dialog (TOD) systems to significantly accelerate commercial bot development and evaluation, redu…

    Submitted 30 November, 2022; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accompanying code documentation at https://opensource.salesforce.com/botsim/latest/index.html. arXiv admin note: text overlap with arXiv:2211.11982

  20. arXiv:2211.14875  [pdf, other]

    cs.SE cs.CL

    Detect-Localize-Repair: A Unified Framework for Learning to Debug with CodeT5

    Authors: Nghi D. Q. Bui, Yue Wang, Steven Hoi

    Abstract: Automated software debugging is a crucial task for improving the productivity of software developers. Many neural-based techniques have been proven effective for debugging-related tasks such as bug localization and program repair (or bug fixing). However, these techniques often focus only on either one of them or approach them in a stage-wise manner, ignoring the mutual benefits between them. In t…

    Submitted 22 December, 2022; v1 submitted 27 November, 2022; originally announced November 2022.

    Comments: Accepted to EMNLP 2022 Findings Track

  21. arXiv:2211.11982  [pdf, other]

    cs.CL

    BotSIM: An End-to-End Bot Simulation Framework for Commercial Task-Oriented Dialog Systems

    Authors: Guangsen Wang, Samson Tan, Shafiq Joty, Gang Wu, Jimmy Au, Steven Hoi

    Abstract: We present BotSIM, a data-efficient end-to-end Bot SIMulation toolkit for commercial text-based task-oriented dialog (TOD) systems. BotSIM consists of three major components: 1) a Generator that can infer semantic-level dialog acts and entities from bot definitions and generate user queries via model-based paraphrasing; 2) an agenda-based dialog user Simulator (ABUS) to simulate conversations with…

    Submitted 30 November, 2022; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Paper accepted by the EMNLP 2022 System Demo Track; We have open-sourced the toolkit at https://github.com/salesforce/botsim

  22. arXiv:2210.08773  [pdf, other]

    cs.CV

    Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training

    Authors: Anthony Meng Huat Tiong, Junnan Li, Boyang Li, Silvio Savarese, Steven C. H. Hoi

    Abstract: Visual question answering (VQA) is a hallmark of vision and language reasoning and a challenging task under the zero-shot setting. We propose Plug-and-Play VQA (PNP-VQA), a modular framework for zero-shot VQA. In contrast to most existing works, which require substantial adaptation of pretrained language models (PLMs) for the vision modality, PNP-VQA requires no additional training of the PLMs. In…

    Submitted 19 March, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022 (Findings); correct typos in Equation 2 on page 4

  23. arXiv:2209.09019  [pdf, other]

    cs.CV cs.CL cs.LG

    LAVIS: A Library for Language-Vision Intelligence

    Authors: Dongxu Li, Junnan Li, Hung Le, Guangsen Wang, Silvio Savarese, Steven C. H. Hoi

    Abstract: We introduce LAVIS, an open-source deep learning library for LAnguage-VISion research and applications. LAVIS aims to serve as a one-stop comprehensive library that brings recent advancements in the language-vision field accessible for researchers and practitioners, as well as fertilizing future research and development. It features a unified interface to easily access state-of-the-art image-langu…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: Preprint of LAVIS technical report

  24. arXiv:2209.02370  [pdf, other]

    cs.AI cs.CV cs.LG

    Continual Learning, Fast and Slow

    Authors: Quang Pham, Chenghao Liu, Steven C. H. Hoi

    Abstract: According to the Complementary Learning Systems (CLS) theory (McClelland et al., 1995) in neuroscience, humans do effective continual learning through two complementary systems: a fast learning system centered on the hippocampus for rapid learning of the specifics, individual experiences; and a slow learning system located in the neocortex for the gradual acquisition of structured knowledg…

    Submitted 9 July, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2110.00175

  25. arXiv:2207.14428  [pdf, other]

    cs.CV

    Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval

    Authors: Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

    Abstract: This paper investigates an open research problem of generating text-image pairs to improve the training of fine-grained image-to-text cross-modal retrieval task, and proposes a novel framework for paired data augmentation by uncovering the hidden semantic information of StyleGAN2 model. Specifically, we first train a StyleGAN2 model on the given dataset. We then project the real images back to the…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: Accepted at ACM MM 2022

  26. arXiv:2207.14425  [pdf, other]

    cs.CV

    3D Cartoon Face Generation with Controllable Expressions from a Single GAN Image

    Authors: Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

    Abstract: In this paper, we investigate an open research task of generating 3D cartoon face shapes from single 2D GAN generated human faces and without 3D supervision, where we can also manipulate the facial expressions of the 3D shapes. To this end, we discover the semantic meanings of StyleGAN latent space, such that we are able to produce face images of various expressions, poses, and lighting by control…

    Submitted 28 July, 2022; originally announced July 2022.

  27. arXiv:2207.06046  [pdf, other]

    cs.LG cs.AI

    Learning Deep Time-index Models for Time Series Forecasting

    Authors: Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi

    Abstract: Deep learning has been actively applied to time series forecasting, leading to a deluge of new methods, belonging to the class of historical-value models. Yet, despite the attractive properties of time-index models, such as being able to model the continuous nature of underlying time series dynamics, little attention has been given to them. Indeed, while naive deep time-index models are far more e…

    Submitted 20 May, 2023; v1 submitted 13 July, 2022; originally announced July 2022.

  28. arXiv:2207.01780  [pdf, other]

    cs.LG cs.CL cs.PL

    CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning

    Authors: Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi

    Abstract: Program synthesis or code generation aims to generate a program that satisfies a problem specification. Recent approaches using large-scale pretrained language models (LMs) have shown promising results, yet they have some critical limitations. In particular, they often follow a standard supervised fine-tuning procedure to train a code generation model only from the pairs of natural-language proble…

    Submitted 3 November, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: An earlier version of the work was accepted to NeurIPS 2022

  29. arXiv:2206.15033  [pdf, other]

    cs.LG cs.AI

    A Causal Approach to Detecting Multivariate Time-series Anomalies and Root Causes

    Authors: Wenzhuo Yang, Kun Zhang, Steven C. H. Hoi

    Abstract: Detecting anomalies and the corresponding root causes in multivariate time series plays an important role in monitoring the behaviors of various real-world systems, e.g., IT system operations or manufacturing industry. Previous anomaly detection approaches model the joint distribution without considering the underlying mechanism of multivariate time series, making them computationally hungry and h…

    Submitted 28 September, 2022; v1 submitted 30 June, 2022; originally announced June 2022.

    Comments: 19 pages, 9 figures

    MSC Class: 68U35; 68T20 ACM Class: I.2.m

  30. arXiv:2206.07898  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Multimodal Dialogue State Tracking

    Authors: Hung Le, Nancy F. Chen, Steven C. H. Hoi

    Abstract: Designed for tracking user goals in dialogues, a dialogue state tracker is an essential component in a dialogue system. However, the research of dialogue state tracking has largely been limited to unimodality, in which slots and slot values are limited by knowledge domains (e.g. restaurant domain with slots of restaurant name and price range) and are defined by specific database schema. In this pa…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: Accepted at NAACL 2022 (Oral)

  31. arXiv:2206.02967  [pdf, other]

    cs.CV cs.AI

    Masked Unsupervised Self-training for Label-free Image Classification

    Authors: Junnan Li, Silvio Savarese, Steven C. H. Hoi

    Abstract: State-of-the-art computer vision models are mostly trained with supervised learning using human-labeled images, which limits their scalability due to the expensive annotation cost. While self-supervised representation learning has achieved impressive progress, it still requires a second stage of finetuning on labeled data. On the other hand, models pre-trained with large-scale text-image supervisi…

    Submitted 9 March, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

  32. arXiv:2206.01612  [pdf, other]

    cs.LG cs.AI cs.CV

    OmniXAI: A Library for Explainable AI

    Authors: Wenzhuo Yang, Hung Le, Tanmay Laud, Silvio Savarese, Steven C. H. Hoi

    Abstract: We introduce OmniXAI (short for Omni eXplainable AI), an open-source Python library of eXplainable AI (XAI), which offers omni-way explainable AI capabilities and various interpretable machine learning techniques to address the pain points of understanding and interpreting the decisions made by machine learning (ML) in practice. OmniXAI aims to be a one-stop comprehensive library that makes explai…

    Submitted 12 December, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: Github repo: https://github.com/salesforce/OmniXAI

    MSC Class: 68T09; 68T20; 68T01 ACM Class: I.2.6; I.2.5

  33. arXiv:2205.15540  [pdf, other]

    cs.AI cs.LG

    MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation

    Authors: Wenzhuo Yang, Jia Li, Caiming Xiong, Steven C. H. Hoi

    Abstract: Counterfactual explanation is an important Explainable AI technique to explain machine learning predictions. Despite being studied actively, existing optimization-based methods often assume that the underlying machine-learning model is differentiable and treat categorical attributes as continuous ones, which restricts their real-world applications when categorical attributes have many different va…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: 9 pages, 2 figures

    MSC Class: 68T09 ACM Class: I.2.m

  34. arXiv:2205.11024  [pdf, other]

    cs.CL

    Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding

    Authors: Rishabh Bhardwaj, Amrita Saha, Steven C. H. Hoi, Soujanya Poria

    Abstract: Prompt Tuning has been largely successful as a parameter-efficient method of conditioning large-scale pre-trained language models to perform downstream tasks. Thus far, soft prompt tuning learns a fixed set of task-specific continuous vectors, i.e., soft tokens that remain static across the task samples. A fixed prompt, however, may not generalize well to the diverse kinds of inputs the task compr…

    Submitted 22 October, 2022; v1 submitted 22 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022

  35. Mining Root Cause Knowledge from Cloud Service Incident Investigations for AIOps

    Authors: Amrita Saha, Steven C. H. Hoi

    Abstract: Root Cause Analysis (RCA) of any service-disrupting incident is one of the most critical as well as complex tasks in IT processes, especially for cloud industry leaders like Salesforce. Typically RCA investigation leverages data-sources like application error logs or service call traces. However a rich goldmine of root cause information is also hidden in the natural language documentation of the p…

    Submitted 20 April, 2022; originally announced April 2022.

    Journal ref: ICSE-SEIP 2022

  36. arXiv:2203.16102  [pdf, other]

    cs.LG

    Continual Normalization: Rethinking Batch Normalization for Online Continual Learning

    Authors: Quang Pham, Chenghao Liu, Steven Hoi

    Abstract: Existing continual learning methods use Batch Normalization (BN) to facilitate training and improve generalization across tasks. However, the non-i.i.d and non-stationary nature of continual learning data, especially in the online setting, amplify the discrepancy between training and testing in BN and hinder the performance of older tasks. In this work, we study the cross-task normalization effect…

    Submitted 30 March, 2022; originally announced March 2022.

    Journal ref: International Conference on Learning Representations, 2022

  37. arXiv:2202.11672  [pdf, other]

    cs.LG stat.ML

    Learning Fast and Slow for Online Time Series Forecasting

    Authors: Quang Pham, Chenghao Liu, Doyen Sahoo, Steven C. H. Hoi

    Abstract: The fast adaptation capability of deep neural networks in non-stationary environments is critical for online time series forecasting. Successful solutions require handling changes to new and recurring patterns. However, training deep neural forecaster on the fly is notoriously challenging because of their limited ability to adapt to non-stationary environments and the catastrophic forgetting of ol…

    Submitted 17 October, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

  38. arXiv:2202.01575  [pdf, other]

    cs.LG

    CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting

    Authors: Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi

    Abstract: Deep learning has been actively studied for time series forecasting, and the mainstream paradigm is based on the end-to-end training of neural network architectures, ranging from classical LSTM/RNNs to more recent TCNs and Transformers. Motivated by the recent success of representation learning in computer vision and natural language processing, we argue that a more promising paradigm for time ser…

    Submitted 5 May, 2022; v1 submitted 3 February, 2022; originally announced February 2022.

  39. arXiv:2202.01381  [pdf, other]

    cs.LG

    ETSformer: Exponential Smoothing Transformers for Time-series Forecasting

    Authors: Gerald Woo, Chenghao Liu, Doyen Sahoo, Akshat Kumar, Steven Hoi

    Abstract: Transformers have been actively studied for time-series forecasting in recent years. While often showing promising results in various scenarios, traditional Transformers are not designed to fully exploit the characteristics of time-series data and thus suffer some fundamental limitations, e.g., they generally lack decomposition capability and interpretability, and are neither effective nor effi…

    Submitted 20 June, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

  40. arXiv:2201.12086  [pdf, other]

    cs.CV

    BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

    Authors: Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi

    Abstract: Vision-Language Pre-training (VLP) has advanced the performance for many vision-language tasks. However, most existing pre-trained models only excel in either understanding-based tasks or generation-based tasks. Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision. In…

    Submitted 15 February, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

  41. arXiv:2112.09583  [pdf, other]

    cs.CV

    Align and Prompt: Video-and-Language Pre-training with Entity Prompts

    Authors: Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi

    Abstract: Video-and-language pre-training has shown promising improvements on various downstream tasks. Most previous methods capture cross-modal interactions with a transformer-based multimodal encoder, not fully addressing the misalignment between unimodal video and text features. Besides, learning fine-grained visual-language alignment usually requires off-the-shelf object detectors to provide object inf…

    Submitted 23 December, 2021; v1 submitted 17 December, 2021; originally announced December 2021.

  42. Node-wise Localization of Graph Neural Networks

    Authors: Zemin Liu, Yuan Fang, Chenghao Liu, Steven C. H. Hoi

    Abstract: Graph neural networks (GNNs) emerge as a powerful family of representation learning models on graphs. To derive node representations, they utilize a global model that recursively aggregates information from the neighboring nodes. However, different nodes reside at different parts of the graph in different local contexts, making their distributions vary across the graph. Ideally, how a node receive…

    Submitted 27 October, 2021; originally announced October 2021.

    Journal ref: Published in the proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021)

  43. arXiv:2110.10048  [pdf, other]

    cs.CV

    Improving Tail-Class Representation with Centroid Contrastive Learning

    Authors: Anthony Meng Huat Tiong, Junnan Li, Guosheng Lin, Boyang Li, Caiming Xiong, Steven C. H. Hoi

    Abstract: In vision domain, large-scale natural datasets typically exhibit long-tailed distribution which has large class imbalance between head and tail classes. This distribution poses difficulty in learning good representations for tail classes. Recent developments have shown good long-tailed model can be learnt by decoupling the training into representation learning and classifier balancing. However, th…

    Submitted 4 May, 2023; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: Add in acknowledgment

  44. arXiv:2110.07811  [pdf, other]

    cs.CL cs.PL

    Cascaded Fast and Slow Models for Efficient Semantic Code Search

    Authors: Akhilesh Deepak Gotmare, Junnan Li, Shafiq Joty, Steven C. H. Hoi

    Abstract: The goal of natural language semantic code search is to retrieve a semantically relevant code snippet from a fixed set of candidates using a natural language query. Existing approaches are neither effective nor efficient enough towards a practical semantic code search system. In this paper, we propose an efficient and accurate semantic code search framework with cascaded fast and slow models, in w…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: 12 pages

  45. arXiv:2110.01209  [pdf, other]

    cs.CV

    Learning Structural Representations for Recipe Generation and Food Retrieval

    Authors: Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

    Abstract: Food is significant to human daily life. In this paper, we are interested in learning structural representations for lengthy recipes, that can benefit the recipe generation and food cross-modal retrieval tasks. Different from the common vision-language data, here the food images contain mixed ingredients and target recipes are lengthy paragraphs, where we do not have annotations on structure infor…

    Submitted 1 August, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

    Comments: Accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence. arXiv admin note: substantial text overlap with arXiv:2009.00944

  46. arXiv:2110.00175  [pdf, other]

    cs.LG cs.AI

    DualNet: Continual Learning, Fast and Slow

    Authors: Quang Pham, Chenghao Liu, Steven Hoi

    Abstract: According to Complementary Learning Systems (CLS) theory (McClelland et al., 1995) in neuroscience, humans do effective continual learning through two complementary systems: a fast learning system centered on the hippocampus for rapid learning of the specifics and individual experiences, and a slow learning system located in the neocortex for the gradual acquisition of structured knowledg…

    Submitted 30 September, 2021; originally announced October 2021.

  47. arXiv:2109.09265  [pdf, other]

    cs.LG cs.MS stat.ML

    Merlion: A Machine Learning Library for Time Series

    Authors: Aadyot Bhatnagar, Paul Kassianik, Chenghao Liu, Tian Lan, Wenzhuo Yang, Rowan Cassius, Doyen Sahoo, Devansh Arpit, Sri Subramanian, Gerald Woo, Amrita Saha, Arun Kumar Jagota, Gokulakrishnan Gopalakrishnan, Manpreet Singh, K C Krithika, Sukumar Maddineni, Daeki Cho, Bo Zong, Yingbo Zhou, Caiming Xiong, Silvio Savarese, Steven Hoi, Huan Wang

    Abstract: We introduce Merlion, an open-source machine learning library for time series. It features a unified interface for many commonly used models and datasets for anomaly detection and forecasting on both univariate and multivariate time series, along with standard pre/post-processing layers. It has several modules to improve ease-of-use, including visualization, anomaly score calibration to improve in…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: 22 pages, 1 figure, 14 tables

  48. arXiv:2109.00859  [pdf, other]

    cs.CL cs.PL

    CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation

    Authors: Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi

    Abstract: Pre-trained models for Natural Languages (NL) like BERT and GPT have been recently shown to transfer well to Programming Languages (PL) and largely benefit a broad set of code-related tasks. Despite their success, most current methods either rely on an encoder-only (or decoder-only) pre-training that is suboptimal for generation (resp. understanding) tasks or process the code snippet in the same w…

    Submitted 2 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP 2021. 13 pages

  49. Cross-Modal Graph with Meta Concepts for Video Captioning

    Authors: Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

    Abstract: Video captioning targets interpreting the complex visual contents as text descriptions, which requires the model to fully understand video scenes including objects and their interactions. Prevailing methods adopt off-the-shelf object detection networks to give object proposals and use the attention mechanism to model the relations between objects. They often miss some undefined semantic concepts o…

    Submitted 1 August, 2022; v1 submitted 14 August, 2021; originally announced August 2021.

    Comments: Accepted at IEEE Transactions on Image Processing

  50. arXiv:2108.01361  [pdf, other]

    cs.CV

    Cycle-Consistent Inverse GAN for Text-to-Image Synthesis

    Authors: Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

    Abstract: This paper investigates an open research task of text-to-image synthesis for automatically generating or manipulating images from text descriptions. Prevailing methods mainly use the text as conditions for GAN generation, and train different models for the text-guided image generation and manipulation tasks. In this paper, we propose a novel unified framework of Cycle-consistent Inverse GAN (CI-GA…

    Submitted 12 September, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: Accepted at ACM MM 2021