-
PartCraft: Crafting Creative Objects by Parts
Authors:
Kam Woh Ng,
Xiatian Zhu,
Yi-Zhe Song,
Tao Xiang
Abstract:
This paper propels creative control in generative visual AI by allowing users to "select". Departing from traditional text or sketch-based methods, we for the first time allow users to choose visual concepts by parts for their creative endeavors. The outcome is fine-grained generation that precisely captures selected visual concepts, ensuring a holistically faithful and plausible result. To achieve this, we first parse objects into parts through unsupervised feature clustering. Then, we encode parts into text tokens and introduce an entropy-based normalized attention loss that operates on them. This loss design enables our model to learn generic prior topology knowledge about object's part composition, and further generalize to novel part compositions to ensure the generation looks holistically faithful. Lastly, we employ a bottleneck encoder to project the part tokens. This not only enhances fidelity but also accelerates learning, by leveraging shared knowledge and facilitating information exchange among instances. Visual results in the paper and supplementary material showcase the compelling power of PartCraft in crafting highly customized, innovative creations, exemplified by the "charming" and creative birds. Code is released at https://github.com/kamwoh/partcraft.
Submitted 8 July, 2024; v1 submitted 5 July, 2024;
originally announced July 2024.
-
ConceptHash: Interpretable Fine-Grained Hashing via Concept Discovery
Authors:
Kam Woh Ng,
Xiatian Zhu,
Yi-Zhe Song,
Tao Xiang
Abstract:
Existing fine-grained hashing methods typically lack code interpretability as they compute hash code bits holistically using both global and local features. To address this limitation, we propose ConceptHash, a novel method that achieves sub-code level interpretability. In ConceptHash, each sub-code corresponds to a human-understandable concept, such as an object part, and these concepts are automatically discovered without human annotations. Specifically, we leverage a Vision Transformer architecture and introduce concept tokens as visual prompts, along with image patch tokens as model inputs. Each concept is then mapped to a specific sub-code at the model output, providing natural sub-code interpretability. To capture subtle visual differences among highly similar sub-categories (e.g., bird species), we incorporate language guidance to ensure that the learned hash codes are distinguishable within fine-grained object classes while maintaining semantic alignment. This approach allows us to develop hash codes that exhibit similarity within families of species while remaining distinct from species in other families. Extensive experiments on four fine-grained image retrieval benchmarks demonstrate that ConceptHash outperforms previous methods by a significant margin, offering unique sub-code interpretability as an additional benefit. Code at: https://github.com/kamwoh/concepthash.
Submitted 12 June, 2024;
originally announced June 2024.
-
Contrastive Learning in Distilled Models
Authors:
Valerie Lim,
Kai Wen Ng,
Kenneth Lim
Abstract:
Natural Language Processing models like BERT can provide state-of-the-art word embeddings for downstream NLP tasks. However, these models have yet to perform well on Semantic Textual Similarity, and may be too large to be deployed as lightweight edge applications. We seek to apply a suitable contrastive learning method based on the SimCSE paper to a model architecture adapted from a knowledge-distillation-based model, DistilBERT, to address these two issues. Our final lightweight model DistilFace achieves an average of 72.1 in Spearman's correlation on STS tasks, a 34.2 percent improvement over BERT base.
Submitted 22 January, 2024;
originally announced January 2024.
-
IPR-NeRF: Ownership Verification meets Neural Radiance Field
Authors:
Win Kent Ong,
Kam Woh Ng,
Chee Seng Chan,
Yi Zhe Song,
Tao Xiang
Abstract:
Neural Radiance Field (NeRF) models have recently gained significant attention in the computer vision community for their state-of-the-art visual quality and impressive demonstrations. Since then, technopreneurs have sought to turn NeRF models into a profitable business. This makes NeRF models an attractive target for plagiarizers who illegally copy, re-distribute, or misuse them. This paper proposes a comprehensive intellectual property (IP) protection framework for the NeRF model in both black-box and white-box settings, namely IPR-NeRF. In the black-box setting, a diffusion-based solution is introduced to embed and extract the watermark via a two-stage optimization process. In the white-box setting, a designated digital signature is embedded into the weights of the NeRF model by adopting the sign loss objective. Our extensive experiments demonstrate that not only does our approach maintain the fidelity (i.e., the rendering quality) of IPR-NeRF models, but it is also robust against both ambiguity and removal attacks compared to prior art.
Submitted 22 January, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
DreamCreature: Crafting Photorealistic Virtual Creatures from Imagination
Authors:
Kam Woh Ng,
Xiatian Zhu,
Yi-Zhe Song,
Tao Xiang
Abstract:
Recent text-to-image (T2I) generative models allow for high-quality synthesis following either text instructions or visual examples. Despite their capabilities, these models face limitations in creating new, detailed creatures within specific categories (e.g., virtual dog or bird species), which are valuable in digital asset creation and biodiversity analysis. To bridge this gap, we introduce a novel task, Virtual Creatures Generation: Given a set of unlabeled images of the target concepts (e.g., 200 bird species), we aim to train a T2I model capable of creating new, hybrid concepts within diverse backgrounds and contexts. We propose a new method called DreamCreature, which identifies and extracts the underlying sub-concepts (e.g., body parts of a specific species) in an unsupervised manner. The T2I thus adapts to generate novel concepts (e.g., new bird species) with faithful structures and photorealistic appearance by seamlessly and flexibly composing learned sub-concepts. To enhance sub-concept fidelity and disentanglement, we extend the textual inversion technique by incorporating an additional projector and tailored attention loss regularization. Extensive experiments on two fine-grained image benchmarks demonstrate the superiority of DreamCreature over prior methods in both qualitative and quantitative evaluation. Ultimately, the learned sub-concepts facilitate diverse creative applications, including innovative consumer product designs and nuanced property modifications.
Submitted 26 November, 2023;
originally announced November 2023.
-
Coordinated Information Campaigns on Social Media: A Multifaceted Framework for Detection and Analysis
Authors:
Kin Wai Ng,
Adriana Iamnitchi
Abstract:
The prevalence of coordinated information campaigns in social media platforms has significant negative consequences across various domains, including social, political, and economic processes. This paper proposes a multifaceted framework for detecting and analysing coordinated message promotion on social media. By simultaneously considering features related to content, time, and network dimensions, our framework can capture the diverse nature of coordinated activity and identify anomalous user accounts who likely engaged in suspicious behaviour. Unlike existing solutions that rely on specific constraints, our approach is more flexible as it employs specialised components to extract the significant structures within a network and to detect the most unusual interactions. We demonstrate the effectiveness of our framework using two Twitter datasets, the Russian Internet Research Agency (IRA), and long-term discussions on Data Science topics. The results demonstrate our framework's ability to isolate unusual activity from expected normal behaviour and provide valuable insights for further qualitative investigation.
Submitted 22 September, 2023;
originally announced September 2023.
-
HoloPOCUS: Portable Mixed-Reality 3D Ultrasound Tracking, Reconstruction and Overlay
Authors:
Kian Wei Ng,
Yujia Gao,
Shaheryar Mohammed Furqan,
Zachery Yeo,
Joel Lau,
Kee Yuan Ngiam,
Eng Tat Khoo
Abstract:
Ultrasound (US) imaging provides a safe and accessible solution to procedural guidance and diagnostic imaging. The effective usage of conventional 2D US for interventional guidance requires extensive experience to project the image plane onto the patient, and the interpretation of images in diagnostics suffers from high intra- and inter-user variability. 3D US reconstruction allows for more consistent diagnosis and interpretation, but existing solutions are limited in terms of equipment and applicability in real-time navigation. To address these issues, we propose HoloPOCUS - a mixed reality US system (MR-US) that overlays rich US information onto the user's vision in a point-of-care setting. HoloPOCUS extends existing MR-US methods beyond placing a US plane in the user's vision to include a 3D reconstruction and projection that can aid in procedural guidance using conventional probes. We validated a tracking pipeline that demonstrates higher accuracy compared to existing MR-US works. Furthermore, user studies conducted via a phantom task showed significant improvements in navigation duration when using our proposed methods.
Submitted 26 August, 2023;
originally announced August 2023.
-
Unsupervised Hashing with Similarity Distribution Calibration
Authors:
Kam Woh Ng,
Xiatian Zhu,
Jiun Tian Hoe,
Chee Seng Chan,
Tianyu Zhang,
Yi-Zhe Song,
Tao Xiang
Abstract:
Unsupervised hashing methods typically aim to preserve the similarity between data points in a feature space by mapping them to binary hash codes. However, these methods often overlook the fact that the similarity between data points in the continuous feature space may not be preserved in the discrete hash code space, due to the limited similarity range of hash codes. The similarity range is bounded by the code length and can lead to a problem known as similarity collapse. That is, the positive and negative pairs of data points become less distinguishable from each other in the hash space. To alleviate this problem, in this paper a novel Similarity Distribution Calibration (SDC) method is introduced. SDC aligns the hash code similarity distribution towards a calibration distribution (e.g., beta distribution) with sufficient spread across the entire similarity range, thus alleviating the similarity collapse problem. Extensive experiments show that our SDC significantly outperforms the state-of-the-art alternatives on coarse category-level and instance-level image retrieval. Code is available at https://github.com/kamwoh/sdc.
Submitted 31 August, 2023; v1 submitted 15 February, 2023;
originally announced February 2023.
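The bounded similarity range mentioned in the abstract above can be illustrated with a small sketch. It assumes hash codes with entries in {-1, +1} and cosine similarity as the measure; both choices, and all function names, are assumptions made for illustration and are not taken from the paper or its repository.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def code_similarity(a, b):
    """Cosine similarity of two {-1, +1} codes of length k.

    For such codes the dot product is k - 2 * hamming_distance and both
    norms are sqrt(k), so the similarity reduces to 1 - 2 * d / k.
    """
    k = len(a)
    return 1 - 2 * hamming_distance(a, b) / k


def attainable_similarities(k):
    """All similarity values a k-bit code pair can take (only k + 1 levels)."""
    return [1 - 2 * d / k for d in range(k + 1)]


if __name__ == "__main__":
    # A 4-bit code can express only 5 similarity levels, so pairs that are
    # well separated in a continuous feature space may land on the same level.
    print(attainable_similarities(4))
    print(code_similarity([1, -1, 1, 1], [1, 1, 1, 1]))
```

With short codes the handful of attainable levels makes positive and negative pairs harder to separate, which is one way to read the "similarity collapse" the abstract describes.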
-
Experimental Evaluation of Baselines for Forecasting Social Media Timeseries
Authors:
Kin Wai Ng,
Frederick Mubang,
Lawrence O. Hall,
John Skvoretz,
Adriana Iamnitchi
Abstract:
Forecasting social media activity can be of practical use in many scenarios, from understanding trends, such as which topics are likely to engage more users in the coming week, to identifying unusual behavior, such as coordinated information operations or PumpNDump efforts. To evaluate a new approach to forecasting, it is important to have baselines against which to assess performance gains. We experimentally evaluate the performance of four baselines for forecasting activity in several social media datasets that record discussions related to three different geo-political contexts synchronously taking place on two different platforms, Twitter and YouTube. Experiments are done over hourly time periods. Our evaluation identifies the baselines which are most accurate for particular metrics and thus provides guidance for future work in social media modeling.
Submitted 11 October, 2022;
originally announced October 2022.
-
Large-Scale Product Retrieval with Weakly Supervised Representation Learning
Authors:
Xiao Han,
Kam Woh Ng,
Sauradip Nag,
Zhiyu Qu
Abstract:
Large-scale weakly supervised product retrieval is a practically useful yet computationally challenging problem. This paper introduces a novel solution for the eBay Visual Search Challenge (eProduct) held at the Ninth Workshop on Fine-Grained Visual Categorisation (FGVC9) of CVPR 2022. This competition presents two challenges: (a) E-commerce is a drastically fine-grained domain including many products with subtle visual differences; (b) A lack of target instance-level labels for model training, with only coarse category labels and product titles available. To overcome these obstacles, we formulate a strong solution by a set of dedicated designs: (a) Instead of using text training data directly, we mine thousands of pseudo-attributes from product titles and use them as the ground truths for multi-label classification. (b) We incorporate several strong backbones with advanced training recipes for more discriminative representation learning. (c) We further introduce a number of post-processing techniques including whitening, re-ranking and model ensemble for retrieval enhancement. By achieving 71.53% MAR, our solution "Involution King" achieves the second position on the leaderboard.
Submitted 1 August, 2022;
originally announced August 2022.
-
Solar eclipse observations with small radio telescope in Hong Kong in 21cm radio frequency band
Authors:
Chun Sing Leung,
Thomas K. T. Fok,
Kenneith H. K. Hui,
K. W. Ng,
C. M. Lee,
S. H. Chan
Abstract:
A small radio telescope operating at 21 cm was used to study the partial solar eclipse, with magnitude 0.89, in Hong Kong on 21st June, 2020. The radio telescope SPIDER 300A was designed and constructed by the Radio2Space Company, Italy. Radio flux density time curves (light curves) and a two-dimensional mapping of the eclipse are presented in this paper. Standard radio data reduction methods were used to obtain the intensity time curve. We also adopted the semi-pipeline method for the reduction of data to obtain the same results as with the built-in software of the radio telescope SPIDER 300A. The total solar radio flux of the eclipse was found to reduce by a maximum of 55 +/- 5 percent, while the maximum eclipsed area of the same eclipse is 86.08%. Other radio observations of solar eclipses in Hong Kong are also discussed in this paper, including a SPIDER 300A observation of the partial solar eclipse on 26th December 2019 (APPENDIX A), and a small radio telescope (SRT, developed by the Haystack Observatory, MIT, USA) observation of the 2020 eclipse (APPENDIX B).
Submitted 2 July, 2022;
originally announced July 2022.
-
One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective
Authors:
Jiun Tian Hoe,
Kam Woh Ng,
Tianyu Zhang,
Chee Seng Chan,
Yi-Zhe Song,
Tao Xiang
Abstract:
A deep hashing model typically has two main learning objectives: to make the learned binary hash codes discriminative and to minimize a quantization error. With further constraints such as bit balance and code orthogonality, it is not uncommon for existing models to employ a large number (>4) of losses. This leads to difficulties in model training and subsequently impedes their effectiveness. In this work, we propose a novel deep hashing model with only a single learning objective. Specifically, we show that maximizing the cosine similarity between the continuous codes and their corresponding binary orthogonal codes can ensure both hash code discriminativeness and quantization error minimization. Further, with this learning objective, code balancing can be achieved by simply using a Batch Normalization (BN) layer and multi-label classification is also straightforward with label smoothing. The result is a one-loss deep hashing model that removes all the hassles of tuning the weights of various losses. Importantly, extensive experiments show that our model is highly effective, outperforming the state-of-the-art multi-loss hashing models on three large-scale instance retrieval benchmarks, often by significant margins. Code is available at https://github.com/kamwoh/orthohash
Submitted 29 September, 2021;
originally announced September 2021.
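A minimal sketch of the idea in the abstract above: pushing a continuous code towards a binary orthogonal target by maximizing cosine similarity. Using rows of a Sylvester-type Hadamard matrix as the orthogonal targets is an assumption made here for illustration, as are all function names; the released repository may generate targets and compute the loss differently.

```python
import math


def hadamard(n):
    """Sylvester construction (n must be a power of two).

    Rows are mutually orthogonal codes with entries in {-1, +1}.
    """
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def cosine_loss(z, target):
    """Single objective: 1 - cos(z, target). Zero only if z is a positive multiple of target."""
    return 1.0 - cosine(z, target)


def sign(z):
    """Binarize a continuous code into {-1, +1}."""
    return [1 if x >= 0 else -1 for x in z]


if __name__ == "__main__":
    targets = hadamard(8)          # one 8-bit target code per class (assumed setup)
    target = targets[3]
    aligned = [0.3 * t for t in target]      # same direction as the target
    noisy = [0.3 * t for t in target]
    noisy[0] = -noisy[0]                     # one coordinate has the wrong sign
    print(cosine_loss(aligned, target), sign(aligned) == target)
    print(cosine_loss(noisy, target), sign(noisy) == target)
```

When the loss reaches zero the continuous code points in exactly the direction of its binary target, so taking the sign loses nothing; this is the sense in which one cosine objective can cover both discriminativeness and quantization error.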
-
Social-Media Activity Forecasting with Exogenous Information Signals
Authors:
Kin Wai Ng,
Sameera Horawalavithana,
Adriana Iamnitchi
Abstract:
Due to their widespread adoption, social media platforms present an ideal environment for studying and understanding social behavior, especially on information spread. Modeling social media activity has numerous practical implications such as supporting efforts to analyze strategic information operations, designing intervention techniques to mitigate disinformation, or delivering critical information during disaster relief operations. In this paper we propose a modeling technique that forecasts topic-specific daily volume of social media activities by using both exogenous signals, such as news or armed conflicts records, and endogenous data from the social media platform we model. Empirical evaluations with real datasets from two different platforms and two different contexts each composed of multiple interrelated topics demonstrate the effectiveness of our solution.
Submitted 22 September, 2021;
originally announced September 2021.
-
Can a Tesla Turbine be Utilised as a Non-Magnetic Actuator for MRI-Guided Robotic Interventions?
Authors:
David Navarro-Alarcon,
Luiza Labazanova,
Man Kiu Chow,
Kwun Wang Ng,
Derek Kwok
Abstract:
This paper introduces a new type of nonmagnetic actuator for MRI interventions. Ultrasonic and piezoelectric motors are among the most commonly used actuators in MRI applications. However, most of these actuators are only MRI-safe, which means they cannot be operated while imaging as they cause significant visual artifacts. To cope with this issue, we developed a new pneumatic rotary servo-motor (based on the Tesla turbine) that can be effectively used during continuous MR imaging. We thoroughly tested the performance and magnetic properties of our MRI-compatible actuator with several experiments, both inside and outside an MRI scanner. The reported results confirm the feasibility of using this motor for MRI-guided robotic interventions.
Submitted 19 August, 2021;
originally announced August 2021.
-
Ternary Hashing
Authors:
Chang Liu,
Lixin Fan,
Kam Woh Ng,
Yilun Jin,
Ce Ju,
Tianyu Zhang,
Chee Seng Chan,
Qiang Yang
Abstract:
This paper proposes a novel ternary hash encoding for learning-to-hash methods, which provides a principled, more efficient coding scheme that performs better than state-of-the-art binary hashing counterparts. Two kinds of axiomatic ternary logic, Kleene logic and Łukasiewicz logic, are adopted to calculate the Ternary Hamming Distance (THD) for both the learning/encoding and testing/querying phases. Our work demonstrates that, with an efficient implementation of ternary logic on standard binary machines, the proposed ternary hashing compares favorably to the binary hashing methods, with consistent improvements in retrieval mean average precision (mAP) ranging from 1% to 5.9% on the CIFAR10, NUS-WIDE and ImageNet100 datasets.
Submitted 19 March, 2021; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Protecting Intellectual Property of Generative Adversarial Networks from Ambiguity Attack
Authors:
Ding Sheng Ong,
Chee Seng Chan,
Kam Woh Ng,
Lixin Fan,
Qiang Yang
Abstract:
Ever since Machine Learning as a Service (MLaaS) emerged as a viable business that utilizes deep learning models to generate lucrative revenue, Intellectual Property Right (IPR) has become a major concern because these deep learning models can easily be replicated, shared, and re-distributed by any unauthorized third parties. To the best of our knowledge, one of the prominent deep learning models, Generative Adversarial Networks (GANs), which have been widely used to create photorealistic images, are totally unprotected despite the existence of pioneering IPR protection methodology for Convolutional Neural Networks (CNNs). This paper therefore presents a complete protection framework in both black-box and white-box settings to enforce IPR protection on GANs. Empirically, we show that the proposed method does not compromise the original GANs performance (i.e. image generation, image super-resolution, style transfer), and at the same time, it is able to withstand both removal and ambiguity attacks against embedded watermarks.
Submitted 28 February, 2021; v1 submitted 8 February, 2021;
originally announced February 2021.
-
Rethinking Uncertainty in Deep Learning: Whether and How it Improves Robustness
Authors:
Yilun Jin,
Lixin Fan,
Kam Woh Ng,
Ce Ju,
Qiang Yang
Abstract:
Deep neural networks (DNNs) are known to be prone to adversarial attacks, for which many remedies have been proposed. While adversarial training (AT) is regarded as the most robust defense, it suffers from poor performance both on clean examples and under other types of attacks, e.g. attacks with larger perturbations. Meanwhile, regularizers that encourage uncertain outputs, such as entropy maximization (EntM) and label smoothing (LS), can maintain accuracy on clean examples and improve performance under weak attacks, yet their ability to defend against strong attacks is still in doubt. In this paper, we revisit uncertainty promotion regularizers, including EntM and LS, in the field of adversarial learning. We show that EntM and LS alone provide robustness only under small perturbations. In contrast, we show that uncertainty promotion regularizers complement AT in a principled manner, consistently improving performance on both clean examples and under various attacks, especially attacks with large perturbations. We further analyze how uncertainty promotion regularizers enhance the performance of AT from the perspective of Jacobian matrices $\nabla_X f(X;\theta)$, and find that EntM effectively shrinks the norm of Jacobian matrices and hence promotes robustness.
Submitted 26 November, 2020;
originally announced November 2020.
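The two uncertainty-promoting regularizers named in the abstract above can be sketched with their common textbook definitions. This is an illustrative assumption: the paper's exact formulation may differ (for example, some label smoothing variants spread the mass over the other K - 1 classes rather than all K), and the function names are hypothetical.

```python
import math


def smooth_labels(true_class, num_classes, eps=0.1):
    """Label smoothing (LS): mix the one-hot target with a uniform distribution."""
    off = eps / num_classes
    return [(1.0 - eps) + off if i == true_class else off for i in range(num_classes)]


def entropy(p):
    """Shannon entropy of a probability vector (natural log).

    Entropy maximization (EntM) adds a term rewarding larger values of this
    quantity on the model's predicted distribution.
    """
    return -sum(q * math.log(q) for q in p if q > 0)


def cross_entropy(target, p):
    """Cross-entropy between a (possibly smoothed) target and a prediction p."""
    return -sum(t * math.log(q) for t, q in zip(target, p) if t > 0)


if __name__ == "__main__":
    target = smooth_labels(true_class=1, num_classes=4, eps=0.1)
    confident = [0.001, 0.997, 0.001, 0.001]
    hedged = [0.05, 0.85, 0.05, 0.05]
    # Both regularizers favor the less over-confident prediction.
    print(cross_entropy(target, confident), cross_entropy(target, hedged))
    print(entropy(confident), entropy(hedged))
```

Both terms discourage near-one-hot outputs, which is the shared "uncertainty promotion" property the abstract studies in combination with adversarial training.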
-
Protect, Show, Attend and Tell: Empowering Image Captioning Models with Ownership Protection
Authors:
Jian Han Lim,
Chee Seng Chan,
Kam Woh Ng,
Lixin Fan,
Qiang Yang
Abstract:
By and large, existing Intellectual Property (IP) protection methods for deep neural networks typically i) focus on the image classification task only, and ii) follow a standard digital watermarking framework that was conventionally used to protect the ownership of multimedia and video content. This paper demonstrates that the current digital watermarking framework is insufficient to protect image captioning tasks, which are often regarded as one of the frontier AI problems. As a remedy, this paper studies and proposes two different embedding schemes in the hidden memory state of a recurrent neural network to protect the image captioning model. We show empirically that a forged key will yield an unusable image captioning model, defeating the purpose of infringement. To the best of our knowledge, this work is the first to propose ownership protection for the image captioning task. Also, extensive experiments show that the proposed method does not compromise the original image captioning performance on all common captioning metrics on the Flickr30k and MS-COCO datasets, and at the same time it is able to withstand both removal and ambiguity attacks. Code is available at https://github.com/jianhanlim/ipr-imagecaptioning
Submitted 31 August, 2021; v1 submitted 25 August, 2020;
originally announced August 2020.
-
Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks
Authors:
Lixin Fan,
Kam Woh Ng,
Ce Ju,
Tianyu Zhang,
Chang Liu,
Chee Seng Chan,
Qiang Yang
Abstract:
This paper investigates capabilities of Privacy-Preserving Deep Learning (PPDL) mechanisms against various forms of privacy attacks. First, we propose to quantitatively measure the trade-off between model accuracy and privacy losses incurred by reconstruction, tracing and membership attacks. Second, we formulate reconstruction attacks as solving a noisy system of linear equations, and prove that a…
▽ More
This paper investigates capabilities of Privacy-Preserving Deep Learning (PPDL) mechanisms against various forms of privacy attacks. First, we propose to quantitatively measure the trade-off between model accuracy and privacy losses incurred by reconstruction, tracing and membership attacks. Second, we formulate reconstruction attacks as solving a noisy system of linear equations, and prove that attacks are guaranteed to be defeated if condition (2) is unfulfilled. Third, based on theoretical analysis, a novel Secret Polarization Network (SPN) is proposed to thwart privacy attacks, which pose serious challenges to existing PPDL methods. Extensive experiments showed that model accuracies are improved on average by 5-20% compared with baseline mechanisms, in regimes where data privacy are satisfactorily protected.
Submitted 23 June, 2020; v1 submitted 20 June, 2020;
originally announced June 2020.
-
Hyper-Sphere Quantization: Communication-Efficient SGD for Federated Learning
Authors:
Xinyan Dai,
Xiao Yan,
Kaiwen Zhou,
Han Yang,
Kelvin K. W. Ng,
James Cheng,
Yu Fan
Abstract:
The high cost of communicating gradients is a major bottleneck for federated learning, as the bandwidth of the participating user devices is limited. Existing gradient compression algorithms are mainly designed for data centers with high-speed network and achieve $O(\sqrt{d} \log d)$ per-iteration communication cost at best, where $d$ is the size of the model. We propose hyper-sphere quantization (HSQ), a general framework that can be configured to achieve a continuum of trade-offs between communication efficiency and gradient accuracy. In particular, at the high compression ratio end, HSQ provides a low per-iteration communication cost of $O(\log d)$, which is favorable for federated learning. We prove the convergence of HSQ theoretically and show by experiments that HSQ significantly reduces the communication cost of model training without hurting convergence accuracy.
Submitted 25 November, 2019; v1 submitted 11 November, 2019;
originally announced November 2019.
-
Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search
Authors:
Xinyan Dai,
Xiao Yan,
Kelvin K. W. Ng,
Jie Liu,
James Cheng
Abstract:
Vector quantization (VQ) techniques are widely used in similarity search for data compression, fast metric computation, etc. Originally designed for Euclidean distance, existing VQ techniques (e.g., PQ, AQ) explicitly or implicitly minimize the quantization error. In this paper, we present a new angle to analyze the quantization error, which decomposes the quantization error into norm error and direction error. We show that quantization errors in norm have much higher influence on inner products than quantization errors in direction, and small quantization error does not necessarily lead to good performance in maximum inner product search (MIPS). Based on this observation, we propose norm-explicit quantization (NEQ), a general paradigm that improves existing VQ techniques for MIPS. NEQ quantizes the norms of items in a dataset explicitly to reduce errors in norm, which is crucial for MIPS. For the direction vectors, NEQ can simply reuse an existing VQ technique to quantize them without modification. We conducted extensive experiments on a variety of datasets and parameter configurations. The experimental results show that NEQ improves the performance of various VQ techniques for MIPS, including PQ, OPQ, RQ and AQ.
Submitted 20 November, 2019; v1 submitted 11 November, 2019;
originally announced November 2019.
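The norm/direction split described in the NEQ abstract above can be illustrated with a minimal sketch. It assumes a scalar codebook for norms and a tiny codebook of unit vectors as a stand-in for "an existing VQ technique". The function names (`neq_encode`, `neq_inner_product`) and both codebooks are hypothetical and are not taken from the paper or its code.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def neq_encode(item, norm_codebook, direction_codebook):
    """Quantize the norm and the direction of `item` separately.

    Assumption: the direction quantizer below (pick the unit codeword with
    the largest cosine similarity) is only a placeholder for PQ, OPQ, etc.
    """
    n = norm(item)
    direction = [x / n for x in item]
    # Explicit norm quantization: nearest scalar codeword.
    q_norm = min(norm_codebook, key=lambda c: abs(c - n))
    # Direction quantization: nearest unit codeword by cosine similarity.
    q_dir = max(direction_codebook, key=lambda c: dot(c, direction))
    return q_norm, q_dir

def neq_inner_product(query, code):
    """Approximate <query, item> from the (norm, direction) code."""
    q_norm, q_dir = code
    return q_norm * dot(query, q_dir)

# Hypothetical toy data.
code = neq_encode([3.0, 4.0], [1.0, 5.0, 10.0],
                  [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]])
approx = neq_inner_product([1.0, 2.0], code)
```

In this toy case both codebooks happen to contain the exact norm and direction, so the approximate inner product matches the true value of 11 up to floating-point rounding. With coarser codebooks, the error in `q_norm` scales the whole inner product, which is the effect the abstract highlights.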
-
[Extended version] Rethinking Deep Neural Network Ownership Verification: Embedding Passports to Defeat Ambiguity Attacks
Authors:
Lixin Fan,
Kam Woh Ng,
Chee Seng Chan
Abstract:
With substantial amounts of time, resources and human (team) effort invested to explore and develop successful deep neural networks (DNN), there emerges an urgent need to protect these inventions from being illegally copied, redistributed, or abused without respecting the intellectual properties of legitimate owners. Following recent progress along this line, we investigate a number of watermark-based DNN ownership verification methods in the face of ambiguity attacks, which aim to cast doubts on the ownership verification by forging counterfeit watermarks. It is shown that ambiguity attacks pose serious threats to existing DNN watermarking methods. As remedies to the above-mentioned loophole, this paper proposes novel passport-based DNN ownership verification schemes which are both robust to network modifications and resilient to ambiguity attacks. The gist of embedding digital passports is to design and train DNN models such that the DNN inference performance on the original task deteriorates significantly when forged passports are used. In other words, genuine passports are not only verified by looking for the predefined signatures, but also reasserted by the unyielding DNN model inference performance. Extensive experimental results justify the effectiveness of the proposed passport-based DNN ownership verification schemes. Code and models are available at https://github.com/kamwoh/DeepIPR
Submitted 2 November, 2019; v1 submitted 16 September, 2019;
originally announced September 2019.
-
Pyramid: A General Framework for Distributed Similarity Search
Authors:
Shiyuan Deng,
Xiao Yan,
Kelvin K. W. Ng,
Chenyu Jiang,
James Cheng
Abstract:
Similarity search is a core component in various applications such as image matching, product recommendation and low-shot classification. However, single-machine solutions are usually insufficient due to the large cardinality of modern datasets and the stringent latency requirements of on-line query processing. We present Pyramid, a general and efficient framework for distributed similarity search. Pyramid supports search with popular similarity functions including Euclidean distance, angular distance and inner product. Different from existing distributed solutions that are based on KD-tree or locality sensitive hashing (LSH), Pyramid is based on Hierarchical Navigable Small World graph (HNSW), which is the state-of-the-art similarity search algorithm on a single machine. To achieve high query processing throughput, Pyramid partitions a dataset into sub-datasets containing similar items for index building and assigns a query to only some of the sub-datasets for query processing. To provide the robustness required by production deployment, Pyramid also supports failure recovery and straggler mitigation. Pyramid offers a concise set of APIs such that users can easily use Pyramid without knowing the details of distributed execution. Experiments on large-scale datasets show that Pyramid produces quality results for similarity search, achieves high query processing throughput and is robust under node failures and stragglers.
Submitted 25 June, 2019;
originally announced June 2019.
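The partition-then-route idea in the Pyramid abstract above can be sketched as follows. This is only an illustration under stated assumptions: partitions are formed by assigning each item to its nearest centroid, the centroids are given, and a brute-force scan stands in for the per-partition HNSW index. The names `build_partitions`, `search`, and the `probes` parameter are hypothetical and do not reflect Pyramid's actual API.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_partitions(items, centroids):
    """Assign every item to the sub-dataset of its nearest centroid."""
    partitions = [[] for _ in centroids]
    for item in items:
        idx = min(range(len(centroids)),
                  key=lambda i: euclidean(item, centroids[i]))
        partitions[idx].append(item)
    return partitions

def search(query, centroids, partitions, probes=1, top_k=1):
    """Route the query to the `probes` closest sub-datasets only.

    Assumption: the brute-force scan below replaces the HNSW index that
    each worker would hold in a real deployment.
    """
    ranked = sorted(range(len(centroids)),
                    key=lambda i: euclidean(query, centroids[i]))
    candidates = [item for i in ranked[:probes] for item in partitions[i]]
    return sorted(candidates, key=lambda item: euclidean(query, item))[:top_k]

# Hypothetical toy data: two well-separated clusters.
items = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]]
centroids = [[0.0, 0.0], [10.0, 10.0]]
partitions = build_partitions(items, centroids)
result = search([0.9, 0.1], centroids, partitions, probes=1, top_k=1)
```

Raising `probes` trades throughput for recall: more sub-datasets are searched per query, so fewer true neighbors are missed when a query falls near a partition boundary.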
-
Homogeneous hierarchical NiMoO4@NiMoO4 nanostructure as a high-performance anode material for electrochemical energy storage
Authors:
Jia Yi Dong,
Jin Cheng Xu,
Kwun Nam Hui,
Ye Yang,
Xi Tian Zhang,
Kar Wei Ng,
Shuang Peng Wang,
Zi Kang Tang
Abstract:
Here we report the extraordinary electrochemical energy storage capability of NiMoO4@NiMoO4 homogeneous hierarchical nanosheet-on-nanowire-arrays (SOWAs) synthesized on nickel substrate by a two-stage hydrothermal process. The SOWAs electrode displays better electrochemical performance than the bare NiMoO4 nanowire arrays. Such improvements can be ascribed to the characteristic homogeneous hierarchical structure, which not only effectively increases the active surface areas for fast charge transfer, but also reduces the electrode resistance significantly by eliminating the potential barrier at the nanowire/nanosheet junction, which is usually an issue in other reported heterogeneous architectures. We further evaluate the performance of the SOWAs by constructing an asymmetric hybrid supercapacitor (ASC) with the SOWAs and activated carbon (AC). The optimized ASC shows excellent electrochemical performance, with an energy density of 47.2 Wh/kg at 1.38 kW/kg at 0-1.2 V. Moreover, the specific capacity retention can be as high as 91.4% after 4000 cycles, illustrating the remarkable cycling stability of the NiMoO4@NiMoO4//AC ASC device. Our results show that these unique NiMoO4@NiMoO4 SOWAs show great promise for future energy storage applications.
Submitted 27 March, 2019;
originally announced March 2019.
-
Guaranteed Sufficient Decrease for Stochastic Variance Reduced Gradient Optimization
Authors:
Fanhua Shang,
Yuanyuan Liu,
Kaiwen Zhou,
James Cheng,
Kelvin K. W. Ng,
Yuichi Yoshida
Abstract:
In this paper, we propose a novel sufficient decrease technique for stochastic variance reduced gradient descent methods such as SVRG and SAGA. In order to make sufficient decrease for stochastic optimization, we design a new sufficient decrease criterion, which yields sufficient decrease versions of stochastic variance reduction algorithms such as SVRG-SD and SAGA-SD as a byproduct. We introduce a coefficient to scale current iterate and to satisfy the sufficient decrease property, which takes the decisions to shrink, expand or even move in the opposite direction, and then give two specific update rules of the coefficient for Lasso and ridge regression. Moreover, we analyze the convergence properties of our algorithms for strongly convex problems, which show that our algorithms attain linear convergence rates. We also provide the convergence guarantees of our algorithms for non-strongly convex problems. Our experimental results further verify that our algorithms achieve significantly better performance than their counterparts.
Submitted 25 February, 2018;
originally announced February 2018.
-
Guaranteed Sufficient Decrease for Variance Reduced Stochastic Gradient Descent
Authors:
Fanhua Shang,
Yuanyuan Liu,
James Cheng,
Kelvin Kai Wing Ng,
Yuichi Yoshida
Abstract:
In this paper, we propose a novel sufficient decrease technique for variance reduced stochastic gradient descent methods such as SAG, SVRG and SAGA. In order to make sufficient decrease for stochastic optimization, we design a new sufficient decrease criterion, which yields sufficient decrease versions of variance reduction algorithms such as SVRG-SD and SAGA-SD as a byproduct. We introduce a coefficient to scale current iterate and satisfy the sufficient decrease property, which takes the decisions to shrink, expand or move in the opposite direction, and then give two specific update rules of the coefficient for Lasso and ridge regression. Moreover, we analyze the convergence properties of our algorithms for strongly convex problems, which show that both of our algorithms attain linear convergence rates. We also provide the convergence guarantees of our algorithms for non-strongly convex problems. Our experimental results further verify that our algorithms achieve significantly better performance than their counterparts.
Submitted 4 June, 2017; v1 submitted 20 March, 2017;
originally announced March 2017.
-
Laser Optomechanics
Authors:
Weijian Yang,
S. Adair Gerke,
Kar Wei Ng,
Yi Rao,
Christopher Chase,
Connie J. Chang-Hasnain
Abstract:
Cavity optomechanics explores the coupling between the optical field and the mechanical oscillation to induce cooling and regenerative oscillation in a mechanical oscillator. So far, optomechanics relies on the detuning between the cavity and an external pump laser, where the laser acts only as a power supply. Here, we report a new scheme with mutual coupling between a mechanical oscillator that supports a mirror of a vertical-cavity surface-emitting laser (VCSEL) and the optical field, greatly enhancing the light-matter energy transfer. In this work, we used an ultra-light-weight (130 pg) high-contrast-grating (HCG) mirror in a VCSEL, whose reflectivity spectrum is designed to facilitate strong optomechanical coupling, to demonstrate optomechanically-induced regenerative oscillation of the laser optomechanical cavity with > 550 nm self-oscillation amplitude of the micro-mechanical oscillator, two to three orders of magnitude larger than typical. This new scheme not only offers an efficient approach for high-speed wavelength-swept sources, but also has far-reaching significance in the realization of quantum entanglement of macroscopic objects and ultrasensitive measurement of displacements and forces.
Submitted 26 February, 2015;
originally announced February 2015.
-
Single Crystalline InGaAs Nanopillar Grown on Polysilicon with Dimensions beyond Substrate Grain Size Limit
Authors:
Kar Wei Ng,
Thai-Truong D. Tran,
Wai Son Ko,
Roger Chen,
Fanglu Lu,
Connie J. Chang-Hasnain
Abstract:
Monolithic integration of III-V optoelectronic devices with materials for various functionalities inexpensively is always desirable. Polysilicon (poly-Si) is an ideal platform because it is dopable and semi-conducting and can be deposited and patterned easily on a wide range of low-cost substrates. However, the lack of crystalline coherency in poly-Si poses an immense challenge for high-quality epitaxial growth. In this work, we demonstrate, for the first time, direct growth of micron-sized InGaAs/GaAs nanopillars on polysilicon. Transmission electron microscopy shows that the micron-sized pillars are single-crystalline and single Wurtzite-phase, far exceeding the substrate crystal grain size of ~100 nm. The high-quality growth is enabled by the unique tapering geometry at the base of the nanostructure, which reduces the effective InGaAs/Si contact area to < 40 nm in diameter. The small footprint not only reduces stress due to lattice mismatch but also prevents the nanopillar from nucleating on multiple Si crystal grains. This relaxes the grain size requirement for poly-Si, potentially reducing the cost for poly-Si deposition. Lasing is achieved in the as-grown pillars under optical pumping, attesting to their excellent crystalline and optical quality. These promising results open up a pathway for low-cost synergy of optoelectronics with other technologies such as CMOS integrated circuits, sensing, nanofluidics, thin film transistor display, photovoltaics, etc.
Submitted 28 October, 2013;
originally announced October 2013.
-
Nanolasers grown on silicon
Authors:
Roger Chen,
Thai-Truong D. Tran,
Kar Wei Ng,
Wai Son Ko,
Linus C. Chuang,
Forrest G. Sedgwick,
Connie Chang-Hasnain
Abstract:
Integration of optical interconnects with silicon-based electronics can address the growing limitations facing chip-scale data transport as microprocessors become progressively faster. However, material lattice mismatch and incompatible growth temperatures have fundamentally limited monolithic integration of lasers onto silicon substrates until now. Here, we use a novel growth scheme to overcome this roadblock and directly grow on-chip InGaAs nanopillar lasers, demonstrating the potency of bottom-up nano-optoelectronic integration. Unique helically-propagating cavity modes are employed to strongly confine light within subwavelength nanopillars despite low refractive index contrast between InGaAs and silicon. These modes thereby provide an avenue for engineering on-chip nanophotonic devices such as lasers. Nanopillar lasers are as-grown on silicon, offer tiny footprints and scalability, and are thereby particularly suited to high-density optoelectronics. They may ultimately form the basis of the missing monolithic light sources needed to bridge the existing gap between photonic and electronic circuits.
Submitted 17 January, 2011;
originally announced January 2011.
-
Discovery of microscopic electronic inhomogeneity in the high-Tc superconductor Bi2Sr2CaCu2O8+x
Authors:
S. H. Pan,
J. P. ONeal,
R. L. Badzey,
C. Chamon,
H. Ding,
J. R. Engelbrecht,
Z. Wang,
H. Eisaki,
S. Uchida,
A. K. Gupta,
K. W. Ng,
E. W. Hudson,
K. M. Lang,
J. C. Davis
Abstract:
The parent compounds of the copper oxide high-Tc superconductors are unusual insulators. Superconductivity arises when they are properly doped away from stoichiometry. In Bi2Sr2CaCu2O8+x, superconductivity results from doping with excess oxygen atoms, which introduce positive charge carriers (holes) into the CuO2 planes, where superconductivity is believed to originate. The role of these oxygen dopants is not well understood, other than the fact that they provide charge carriers. However, it is not even clear how these charges distribute in the CuO2 planes. Accordingly, many models of high-Tc superconductors simply assume that the charge carriers introduced by doping distribute uniformly, leading to an electronically homogeneous system, as in ordinary metals. Here we report the observation of an electronic inhomogeneity in the high-Tc superconductor Bi2Sr2CaCu2O8+x using scanning tunnelling microscopy/spectroscopy. This inhomogeneity is manifested as spatial variations in both the local density of states spectrum and the superconducting energy gap. These variations are correlated spatially and vary on a surprisingly short length scale of ~14 Å. Analysis suggests that the inhomogeneity observed is a consequence of proximity to a Mott insulator resulting in poor screening of the charge potentials associated with the oxygen ions left behind in the BiO plane after doping. Hence this experiment is a direct probe of the local nature of the superconducting state, which is not easily accessible by macroscopic measurements.
Submitted 16 July, 2001;
originally announced July 2001.
-
The SPOrt Project: Cosmological and Astrophysical Goals
Authors:
R. Fabbri,
S. Cortiglioni,
S. Cecchini,
M. Orsini,
E. Carretti,
G. Boella,
G. Sironi,
J. Monari,
A. Orfei,
R. Tascone,
U. Pisani,
K. W. Ng,
L. Nicastro,
L. Popa,
I. A. Strukov,
M. V. Sazhin
Abstract:
We present the cosmological and astrophysical objectives of the SPOrt mission, which is scheduled to fly on the International Space Station (ISS) in the year 2002 with the purpose of measuring the diffuse sky polarized radiation in the microwave region. We discuss the problem of disentangling the cosmic background polarized signal from the Galactic foregrounds.
Submitted 26 January, 1999;
originally announced January 1999.
-
The SPOrt Project: an Experimental Overview
Authors:
S. Cortiglioni,
S. Cecchini,
E. Carretti,
M. Orsini,
R. Fabbri,
G. Boella,
G. Sironi,
J. Monari,
A. Orfei,
R. Tascone,
U. Pisani,
K. W. Ng,
L. Nicastro,
L. Popa,
I. A. Strukov,
M. V. Sazhin
Abstract:
The Sky Polarization Observatory (SPOrt) is presented as a project aimed at measuring the diffuse sky polarized emission, from the International Space Station, in the frequency range 20-90 GHz with 7 degrees of HPBW. The SPOrt experimental configuration is described with emphasis on the aspects that make SPOrt the first European scientific payload operating at microwave wavelengths.
Submitted 26 January, 1999;
originally announced January 1999.