Search | arXiv e-print repository

Showing 1–24 of 24 results for author: Yen, H

Searching in archive cs.
  1. arXiv:2406.02488  [pdf, other]

    eess.AS cs.CL cs.SD

    Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition

    Authors: Hao Yen, Pin-Jui Ku, Sabato Marco Siniscalchi, Chin-Hui Lee

    Abstract: We propose a novel language-universal approach to end-to-end automatic spoken keyword recognition (SKR) leveraging upon (i) a self-supervised pre-trained model, and (ii) a set of universal speech attributes (manner and place of articulation). Specifically, Wav2Vec2.0 is used to generate robust speech representations, followed by a linear output layer to produce attribute sequences. A non-trainable…

    Submitted 4 June, 2024; originally announced June 2024.

  2. arXiv:2404.05809  [pdf, other]

    cs.LG cs.AI stat.ME

    Self-Labeling in Multivariate Causality and Quantification for Adaptive Machine Learning

    Authors: Yutian Ren, Aaron Haohua Yen, G. P. Li

    Abstract: Adaptive machine learning (ML) aims to allow ML models to adapt to ever-changing environments with potential concept drift after model deployment. Traditionally, adaptive ML requires a new dataset to be manually labeled to tailor deployed models to altered data distributions. Recently, an interactive causality based self-labeling method was proposed to autonomously associate causally related data…

    Submitted 8 April, 2024; originally announced April 2024.

  3. arXiv:2402.16617  [pdf, other]

    cs.CL

    Long-Context Language Modeling with Parallel Context Encoding

    Authors: Howard Yen, Tianyu Gao, Danqi Chen

    Abstract: Extending large language models (LLMs) to process longer inputs is crucial for a wide range of applications. However, the substantial computational cost of transformers and limited generalization of positional encoding restrict the size of their context window. We introduce Context Expansion with Parallel Encoding (CEPE), a framework that can be applied to any existing decoder-only LLMs to extend…

    Submitted 11 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: ACL 2024. Code, models, and data are available at https://github.com/princeton-nlp/CEPE. arXiv admin note: text overlap with arXiv:1912.01214 by other authors

  4. arXiv:2311.00687  [pdf, other]

    cs.AI cs.CL cs.HC cs.LG

    Improving Interpersonal Communication by Simulating Audiences with Language Models

    Authors: Ryan Liu, Howard Yen, Raja Marjieh, Thomas L. Griffiths, Ranjay Krishna

    Abstract: How do we communicate with others to achieve our goals? We use our prior experience or advice from others, or construct a candidate utterance by predicting how it will be received. However, our experiences are limited and biased, and reasoning about potential outcomes can be difficult and cognitively challenging. In this paper, we explore how we can leverage Large Language Model (LLM) simulations…

    Submitted 3 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 16 pages (main paper), 7 tables and figures (main)

  5. arXiv:2309.08828  [pdf, other]

    eess.AS cs.SD

    Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints

    Authors: Hao Yen, Sabato Marco Siniscalchi, Chin-Hui Lee

    Abstract: We propose a first step toward multilingual end-to-end automatic speech recognition (ASR) by integrating knowledge about speech articulators. The key idea is to leverage a rich set of fundamental units that can be defined "universally" across all spoken languages, referred to as speech attributes, namely manner and place of articulation. Specifically, several deterministic attribute-to-phoneme map…

    Submitted 15 September, 2023; originally announced September 2023.

  6. arXiv:2309.03900  [pdf, other]

    eess.IV cs.CV

    Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction

    Authors: Su-Kai Chen, Hung-Lin Yen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Wen-Hsiao Peng, Yen-Yu Lin

    Abstract: Deep learning is commonly used to reconstruct HDR images from LDR images. LDR stack-based methods are used for single-image HDR reconstruction, generating an HDR image from a deep learning-generated LDR stack. However, current methods generate the stack with predetermined exposure values (EVs), which may limit the quality of HDR reconstruction. To address this, we propose the continuous exposure v…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: ICCV 2023. Project page: https://skchen1993.github.io/CEVR_web/

  7. arXiv:2305.14627  [pdf, other]

    cs.CL cs.IR cs.LG

    Enabling Large Language Models to Generate Text with Citations

    Authors: Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen

    Abstract: Large language models (LLMs) have emerged as a widely-used tool for information seeking, but their generated outputs are prone to hallucination. In this work, our aim is to allow LLMs to generate text with citations, improving their factual correctness and verifiability. Existing work mainly relies on commercial search engines and human evaluation, making it challenging to reproduce and compare di…

    Submitted 31 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted by EMNLP 2023. Code and data are available at https://github.com/princeton-nlp/ALCE

  8. Wizundry: A Cooperative Wizard of Oz Platform for Simulating Future Speech-based Interfaces with Multiple Wizards

    Authors: Siying Hu, Hen Chen Yen, Ziwei Yu, Mingjian Zhao, Katie Seaborn, Can Liu

    Abstract: Wizard of Oz (WoZ) as a prototyping method has been used to simulate intelligent user interfaces, particularly for speech-based systems. However, as our societies' expectations on artificial intelligence (AI) grows, the question remains whether a single Wizard is sufficient for it to simulate smarter systems and more complex interactions. Optimistic visions of 'what artificial intelligence (AI) ca…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: 34 pages

    Report number: Article 115

    Journal ref: Proc. ACM Hum.-Comput. Interact. 7, CSCW1, Article 115 (April 2023), 34 pages

  9. arXiv:2211.02527  [pdf, other]

    eess.AS cs.SD

    Cold Diffusion for Speech Enhancement

    Authors: Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux

    Abstract: Diffusion models have recently shown promising results for difficult enhancement tasks such as the conditional and unconditional restoration of natural images and audio signals. In this work, we explore the possibility of leveraging a recently proposed advanced iterative diffusion model, namely cold diffusion, to recover clean speech signals from noisy signals. The unique mathematical properties o…

    Submitted 23 May, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

    Comments: 5 pages, 1 figure, 1 table, 3 algorithms. To appear in ICASSP 2023. With corrected references

  10. arXiv:2210.16726  [pdf, ps, other]

    eess.AS cs.SD

    Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings

    Authors: Hao Yen, Woojay Jeon

    Abstract: In embedding-matching acoustic-to-word (A2W) ASR, every word in the vocabulary is represented by a fixed-dimension embedding vector that can be added or removed independently of the rest of the system. The approach is potentially an elegant solution for the dynamic out-of-vocabulary (OOV) words problem, where speaker- and context-dependent named entities like contact names must be incorporated int…

    Submitted 19 February, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

    Comments: Accepted to ICASSP 2023

  11. A Summary of the ALQAC 2021 Competition

    Authors: Nguyen Ha Thanh, Bui Minh Quan, Chau Nguyen, Tung Le, Nguyen Minh Phuong, Dang Tran Binh, Vuong Thi Hai Yen, Teeradaj Racharak, Nguyen Le Minh, Tran Duc Vu, Phan Viet Anh, Nguyen Truong Son, Huy Tien Nguyen, Bhumindr Butr-indr, Peerapon Vateekul, Prachya Boonkwan

    Abstract: We summarize the evaluation of the first Automated Legal Question Answering Competition (ALQAC 2021). The competition this year contains three tasks, which aims at processing the statute law document, which are Legal Text Information Retrieval (Task 1), Legal Text Entailment Prediction (Task 2), and Legal Text Question Answering (Task 3). The final goal of these tasks is to build a system that can…

    Submitted 24 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

  12. arXiv:2202.13588   

    eess.IV cs.CV

    Using Multi-scale SwinTransformer-HTC with Data augmentation in CoNIC Challenge

    Authors: Chia-Yen Lee, Hsiang-Chin Chien, Ching-Ping Wang, Hong Yen, Kai-Wen Zhen, Hong-Kun Lin

    Abstract: Colorectal cancer is one of the most common cancers worldwide, so early pathological examination is very important. However, it is time-consuming and labor-intensive to identify the number and type of cells on H&E images in clinical. Therefore, automatic segmentation and classification task and counting the cellular composition of H&E images from pathological sections is proposed by CoNIC Challeng…

    Submitted 16 April, 2024; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: Errors have been identified in the analysis

  13. arXiv:2110.08190  [pdf, other]

    cs.CL

    Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

    Authors: Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-en Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding

    Abstract: Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit. However, under the trending pretrain-and-finetune paradigm, we postulate a counter-traditional hypothesis, that is: pruning increases the risk of overfitting when performed at the fine-tuning phase. In this paper, we aim to addre…

    Submitted 16 January, 2023; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: 11 pages; 16 figures; Published in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing

  14. arXiv:2110.03894  [pdf, other]

    eess.AS cs.AI cs.LG cs.NE cs.SD

    Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition

    Authors: Hao Yen, Pin-Jui Ku, Chao-Han Huck Yang, Hu Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, Yu Tsao

    Abstract: In this study, we propose a novel adversarial reprogramming (AR) approach for low-resource spoken command recognition (SCR), and build an AR-SCR system. The AR procedure aims to modify the acoustic signals (from the target domain) to repurpose a pretrained SCR model (from the source domain). To solve the label mismatches between source and target domains, and further improve the stability of AR, w…

    Submitted 30 October, 2023; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: Accepted to Interspeech 2023. Code is available at: https://github.com/dodohow1011/SpeechAdvReprogram. Selected as Best Student Paper Candidate

  15. arXiv:2107.01461  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification

    Authors: Hao Yen, Chao-Han Huck Yang, Hu Hu, Sabato Marco Siniscalchi, Qing Wang, Yuyang Wang, Xianjun Xia, Yuanjun Zhao, Yuzhong Wu, Yannan Wang, Jun Du, Chin-Hui Lee

    Abstract: We propose a novel neural model compression strategy combining data augmentation, knowledge transfer, pruning, and quantization for device-robust acoustic scene classification (ASC). Specifically, we tackle the ASC task in a low-resource environment leveraging a recently proposed advanced neural network pruning mechanism, namely Lottery Ticket Hypothesis (LTH), to find a sub-network neural model a…

    Submitted 1 May, 2022; v1 submitted 3 July, 2021; originally announced July 2021.

    Comments: 5 figures. DCASE 2021. The project started in November 2020. Revised version

  16. arXiv:2104.08682  [pdf, other]

    cs.CL cs.AI

    Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm

    Authors: Dongkuan Xu, Ian E. H. Yen, Jinxi Zhao, Zhibin Xiao

    Abstract: Transformer-based pre-trained language models have significantly improved the performance of various natural language processing (NLP) tasks in the recent years. While effective and prevalent, these models are usually prohibitively large for resource-limited deployment scenarios. A thread of research has thus been working on applying network pruning techniques under the pretrain-then-finetune para…

    Submitted 16 January, 2022; v1 submitted 17 April, 2021; originally announced April 2021.

    Comments: 7 pages, 6 figures, 1 table

  17. arXiv:2004.05665  [pdf, other]

    cs.LG stat.ML

    Minimizing FLOPs to Learn Efficient Sparse Representations

    Authors: Biswajit Paria, Chih-Kuan Yeh, Ian E. H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos

    Abstract: Deep representation learning has become one of the most widely adopted approaches for visual search, recommendation, and identification. Retrieval of such representations from a large database is however computationally challenging. Approximate methods based on learning compact representations, have been widely explored for this problem, such as locality sensitive hashing, product quantization, an…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: Published at ICLR 2020

  18. arXiv:1811.09720  [pdf, other]

    cs.LG stat.ML

    Representer Point Selection for Explaining Deep Neural Networks

    Authors: Chih-Kuan Yeh, Joon Sik Kim, Ian E. H. Yen, Pradeep Ravikumar

    Abstract: We propose to explain the predictions of a deep neural network, by pointing to the set of what we call representer points in the training set, for a given test point prediction. Specifically, we show that we can decompose the pre-activation prediction of a neural network into a linear combination of activations of training points, with the weights corresponding to what we call representer values,…

    Submitted 23 November, 2018; originally announced November 2018.

    Comments: NIPS 2018

  19. arXiv:1811.01713  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Word Mover's Embedding: From Word2Vec to Document Embedding

    Authors: Lingfei Wu, Ian E. H. Yen, Kun Xu, Fangli Xu, Avinash Balakrishnan, Pin-Yu Chen, Pradeep Ravikumar, Michael J. Witbrock

    Abstract: While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively less success in extending to generate unsupervised sentences or documents embeddings. Recent work has demonstrated that a distance measure between documents called \emph{Word Mover's Distance} (WMD) that aligns semantically similar words, yields unprecedented KNN classif…

    Submitted 30 October, 2018; originally announced November 2018.

    Comments: EMNLP'18 Camera-Ready Version

  20. arXiv:1810.04754  [pdf, other]

    cs.LG stat.ML

    Efficient Tensor Decomposition with Boolean Factors

    Authors: Sung-En Chang, Xun Zheng, Ian E. H. Yen, Pradeep Ravikumar, Rose Yu

    Abstract: Tensor decomposition has been extensively used as a tool for exploratory analysis. Motivated by neuroscience applications, we study tensor decomposition with Boolean factors. The resulting optimization problem is challenging due to the non-convex objective and the combinatorial constraints. We propose Binary Matching Pursuit (BMP), a novel generalization of the matching pursuit strategy to decompo…

    Submitted 11 November, 2020; v1 submitted 10 October, 2018; originally announced October 2018.

    Comments: 14 pages, 3 figures

  21. arXiv:1809.05247  [pdf, other]

    cs.LG cs.AI stat.ML

    Revisiting Random Binning Features: Fast Convergence and Strong Parallelizability

    Authors: Lingfei Wu, Ian E. H. Yen, Jie Chen, Rui Yan

    Abstract: Kernel method has been developed as one of the standard approaches for nonlinear learning, which however, does not scale to large data set due to its quadratic complexity in the number of samples. A number of kernel approximation methods have thus been proposed in the recent years, among which the random features method gains much popularity due to its simplicity and direct reduction of nonlinear…

    Submitted 18 September, 2018; v1 submitted 14 September, 2018; originally announced September 2018.

    Comments: KDD16, Oral Paper, Add Code Link for generating Random Binning Features

  22. arXiv:1004.2338  [pdf, ps, other]

    cs.CG cs.CC cs.DS

    Complexity Analysis of Balloon Drawing for Rooted Trees

    Authors: Chun-Cheng Lin, Hsu-Chun Yen, Sheung-Hung Poon, Jia-Hao Fan

    Abstract: In a balloon drawing of a tree, all the children under the same parent are placed on the circumference of the circle centered at their parent, and the radius of the circle centered at each node along any path from the root reflects the number of descendants associated with the node. Among various styles of tree drawings reported in the literature, the balloon drawing enjoys a desirable feature of…

    Submitted 14 April, 2010; originally announced April 2010.

  23. arXiv:cs/0305006  [pdf, ps, other]

    cs.DM

    On the Ramsey Numbers for Bipartite Multigraphs

    Authors: Ming-Yang Chen, Hsueh-I. Lu, Hsu-Chun Yen

    Abstract: A coloring of a complete bipartite graph is shuffle-preserved if it is the case that assigning a color $c$ to edges $(u, v)$ and $(u', v')$ enforces the same color assignment for edges $(u, v')$ and $(u',v)$. (In words, the induced subgraph with respect to color $c$ is complete.) In this paper, we investigate a variant of the Ramsey problem for the class of complete bipartite multigraphs. (By a…

    Submitted 12 May, 2003; originally announced May 2003.

    Comments: 10 pages, 3 figures

    ACM Class: G.2.2

  24. Compact Floor-Planning via Orderly Spanning Trees

    Authors: Chien-Chih Liao, Hsueh-I Lu, Hsu-Chun Yen

    Abstract: Floor-planning is a fundamental step in VLSI chip design. Based upon the concept of orderly spanning trees, we present a simple O(n)-time algorithm to construct a floor-plan for any n-node plane triangulation. In comparison with previous floor-planning algorithms in the literature, our solution is not only simpler in the algorithm itself, but also produces floor-plans which require fewer module…

    Submitted 4 May, 2003; v1 submitted 17 October, 2002; originally announced October 2002.

    Comments: 13 pages, 5 figures, An early version of this work was presented at 9th International Symposium on Graph Drawing (GD 2001), Vienna, Austria, September 2001. Accepted to Journal of Algorithms, 2003

    ACM Class: F.2.2; E.1; G.2.2; B.7.2

    Journal ref: Journal of Algorithms, 48(2):441-451, 2003