Search | arXiv e-print repository

Introducing Diminutive Causal Structure into Graph Representation Learning

Authors: Hang Gao, Peng Qiao, Yifan Jin, Fengge Wu, Jiangmeng Li, Changwen Zheng

Abstract: When engaging in end-to-end graph representation learning with Graph Neural Networks (GNNs), the intricate causal relationships and rules inherent in graph data pose a formidable challenge for the model in accurately capturing authentic data relationships. A proposed mitigating strategy involves the direct integration of rules or relationships corresponding to the graph data into the model. However, within the domain of graph representation learning, the inherent complexity of graph data obstructs the derivation of a comprehensive causal structure that encapsulates universal rules or relationships governing the entire dataset. Instead, only specialized diminutive causal structures, delineating specific causal relationships within constrained subsets of graph data, emerge as discernible. Motivated by empirical insights, it is observed that GNN models exhibit a tendency to converge towards such specialized causal structures during the training process. Consequently, we posit that the introduction of these specific causal structures is advantageous for the training of GNN models. Building upon this proposition, we introduce a novel method that enables GNN models to glean insights from these specialized diminutive causal structures, thereby enhancing overall performance. Our method specifically extracts causal knowledge from the model representation of these diminutive causal structures and incorporates interchange intervention to optimize the learning process. Theoretical analysis serves to corroborate the efficacy of our proposed method. Furthermore, empirical experiments consistently demonstrate significant performance improvements across diverse datasets.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.07170 [pdf, other]

VoxNeuS: Enhancing Voxel-Based Neural Surface Reconstruction via Gradient Interpolation

Authors: Sidun Liu, Peng Qiao, Zongxin Ye, Wenyu Li, Yong Dou

Abstract: Neural Surface Reconstruction learns a Signed Distance Field (SDF) to reconstruct the 3D model from multi-view images. Previous works adopt voxel-based explicit representation to improve efficiency. However, they ignored the gradient instability of interpolation in the voxel grid, leading to degradation on convergence and smoothness. Besides, previous works entangled the optimization of geometry and radiance, which leads to the deformation of geometry to explain radiance, causing artifacts when reconstructing textured planes. In this work, we reveal that the instability of gradient comes from its discontinuity during trilinear interpolation, and propose to use the interpolated gradient instead of the original analytical gradient to eliminate the discontinuity. Based on gradient interpolation, we propose VoxNeuS, a lightweight surface reconstruction method for computational and memory efficient neural surface reconstruction. Thanks to the explicit representation, the gradient of regularization terms, i.e. Eikonal and curvature loss, are directly solved, avoiding computation and memory-access overhead. Further, VoxNeuS adopts a geometry-radiance disentangled architecture to handle the geometry deformation from radiance optimization. The experimental results show that VoxNeuS achieves better reconstruction quality than previous works. The entire training process takes 15 minutes and less than 3 GB of memory on a single 2080ti GPU.

Submitted 11 June, 2024; originally announced June 2024.
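
The gradient discontinuity that the VoxNeuS abstract describes can be illustrated in one dimension. The following is a minimal sketch and not the authors' implementation: linear interpolation stands in for trilinear interpolation, the grid values and all function names are hypothetical, and the per-node gradients are assumed to come from simple finite differences.

```python
import math

def interp(values, x):
    """Linearly interpolate grid `values` (unit spacing) at position x."""
    i = min(int(math.floor(x)), len(values) - 2)
    t = x - i
    return (1 - t) * values[i] + t * values[i + 1]

def analytic_grad(values, x):
    """Derivative of the linear interpolant: constant inside each cell,
    so it jumps when x crosses a grid node."""
    i = min(int(math.floor(x)), len(values) - 2)
    return values[i + 1] - values[i]

def node_grads(values):
    """Per-node gradients by finite differences (one-sided at the ends).
    This choice is an assumption made for the illustration."""
    n = len(values)
    grads = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        grads.append((values[hi] - values[lo]) / (hi - lo))
    return grads

def interpolated_grad(values, x):
    """Interpolate the per-node gradients instead of differentiating
    the interpolant; the result is continuous across grid nodes."""
    return interp(node_grads(values), x)

# Hypothetical 1D SDF samples; compare both gradients just left and right of node 1.
sdf = [1.0, 0.2, -0.1, 0.5]
eps = 1e-6
jump_analytic = abs(analytic_grad(sdf, 1 + eps) - analytic_grad(sdf, 1 - eps))
jump_interp = abs(interpolated_grad(sdf, 1 + eps) - interpolated_grad(sdf, 1 - eps))
print(jump_analytic, jump_interp)
```

In this toy setting the analytic gradient changes by a finite amount across the node, while the interpolated gradient changes only by an amount proportional to `eps`.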

arXiv:2405.04416 [pdf, other]

DistGrid: Scalable Scene Reconstruction with Distributed Multi-resolution Hash Grid

Authors: Sidun Liu, Peng Qiao, Zongxin Ye, Wenyu Li, Yong Dou

Abstract: Neural Radiance Field (NeRF) achieves extremely high quality in object-scaled and indoor scene reconstruction. However, there exist some challenges when reconstructing large-scale scenes. MLP-based NeRFs suffer from limited network capacity, while volume-based NeRFs are heavily memory-consuming when the scene resolution increases. Recent approaches propose to geographically partition the scene and learn each sub-region using an individual NeRF. Such partitioning strategies help volume-based NeRFs exceed the single GPU memory limit and scale to larger scenes. However, this approach requires multiple background NeRFs to handle out-of-partition rays, which leads to redundancy of learning. Inspired by the fact that the background of the current partition is the foreground of the adjacent partition, we propose a scalable scene reconstruction method based on joint Multi-resolution Hash Grids, named DistGrid. In this method, the scene is divided into multiple closely-paved yet non-overlapped Axis-Aligned Bounding Boxes, and a novel segmented volume rendering method is proposed to handle cross-boundary rays, thereby eliminating the need for background NeRFs. The experiments demonstrate that our method outperforms existing methods on all evaluated large-scale scenes, and provides visually plausible scene reconstruction. The scalability of our method on reconstruction quality is further evaluated qualitatively and quantitatively.

Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: Originally submitted to Siggraph Asia 2023

arXiv:2405.00587 [pdf, other]

GraCo: Granularity-Controllable Interactive Segmentation

Authors: Yian Zhao, Kehan Li, Zesen Cheng, Pengchong Qiao, Xiawu Zheng, Rongrong Ji, Chang Liu, Li Yuan, Jie Chen

Abstract: Interactive Segmentation (IS) segments specific objects or parts in the image according to user input. Current IS pipelines fall into two categories: single-granularity output and multi-granularity output. The latter aims to alleviate the spatial ambiguity present in the former. However, the multi-granularity output pipeline suffers from limited interaction flexibility and produces redundant results. In this work, we introduce Granularity-Controllable Interactive Segmentation (GraCo), a novel approach that allows precise control of prediction granularity by introducing additional parameters to input. This enhances the customization of the interactive system and eliminates redundancy while resolving ambiguity. Nevertheless, the exorbitant cost of annotating multi-granularity masks and the lack of available datasets with granularity annotations make it difficult for models to acquire the necessary guidance to control output granularity. To address this problem, we design an any-granularity mask generator that exploits the semantic property of the pre-trained IS model to automatically generate abundant mask-granularity pairs without requiring additional manual annotation. Based on these pairs, we propose a granularity-controllable learning strategy that efficiently imparts the granularity controllability to the IS model. Extensive experiments on intricate scenarios at object and part levels demonstrate that our GraCo has significant advantages over previous methods. This highlights the potential of GraCo to be a flexible annotation tool, capable of adapting to diverse segmentation scenarios. The project page: https://zhao-yian.github.io/GraCo.

Submitted 16 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

Comments: CVPR2024 Highlight, Project: https://zhao-yian.github.io/GraCo

arXiv:2404.10484 [pdf, other]

AbsGS: Recovering Fine Details for 3D Gaussian Splatting

Authors: Zongxin Ye, Wenyu Li, Sidun Liu, Peng Qiao, Yong Dou

Abstract: 3D Gaussian Splatting (3D-GS) technique couples 3D Gaussian primitives with differentiable rasterization to achieve high-quality novel view synthesis results while providing advanced real-time rendering performance. However, due to the flaw of its adaptive density control strategy in 3D-GS, it frequently suffers from over-reconstruction issue in intricate scenes containing high-frequency details, leading to blurry rendered images. The underlying reason for the flaw has still been under-explored. In this work, we present a comprehensive analysis of the cause of aforementioned artifacts, namely gradient collision, which prevents large Gaussians in over-reconstructed regions from splitting. To address this issue, we propose the novel homodirectional view-space positional gradient as the criterion for densification. Our strategy efficiently identifies large Gaussians in over-reconstructed regions, and recovers fine details by splitting. We evaluate our proposed method on various challenging datasets. The experimental results indicate that our approach achieves the best rendering quality with reduced or similar memory consumption. Our method is easy to implement and can be incorporated into a wide variety of most recent Gaussian Splatting-based methods. We will open source our codes upon formal publication. Our project page is available at: https://ty424.github.io/AbsGS.github.io/

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2403.11449 [pdf, other]

Graph Partial Label Learning with Potential Cause Discovering

Authors: Hang Gao, Jiaguo Yuan, Jiangmeng Li, Peng Qiao, Fengge Wu, Changwen Zheng, Huaping Liu

Abstract: Graph Neural Networks (GNNs) have garnered widespread attention for their potential to address the challenges posed by graph representation learning, which face complex graph-structured data across various domains. However, due to the inherent complexity and interconnectedness of graphs, accurately annotating graph data for training GNNs is extremely challenging. To address this issue, we have introduced Partial Label Learning (PLL) into graph representation learning. PLL is a critical weakly supervised learning problem where each training instance is associated with a set of candidate labels, including the ground-truth label and the additional interfering labels. PLL allows annotators to make errors, which reduces the difficulty of data labeling. Subsequently, we propose a novel graph representation learning method that enables GNN models to effectively learn discriminative information within the context of PLL. Our approach utilizes potential cause extraction to obtain graph data that holds causal relationships with the labels. By conducting auxiliary training based on the extracted graph data, our model can effectively eliminate the interfering information in the PLL scenario. We support the rationale behind our method with a series of theoretical analyses. Moreover, we conduct extensive evaluations and ablation studies on multiple datasets, demonstrating the superiority of our proposed method.

Submitted 21 May, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.06775 [pdf, other]

FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation

Authors: Pengchong Qiao, Lei Shang, Chang Liu, Baigui Sun, Xiangyang Ji, Jie Chen

Abstract: Subject-driven generation has garnered significant interest recently due to its ability to personalize text-to-image generation. Typical works focus on learning the new subject's private attributes. However, an important fact has not been taken seriously that a subject is not an isolated new concept but should be a specialization of a certain category in the pre-trained model. This results in the subject failing to comprehensively inherit the attributes in its category, causing poor attribute-related generations. In this paper, motivated by object-oriented programming, we model the subject as a derived class whose base class is its semantic category. This modeling enables the subject to inherit public attributes from its category while learning its private attributes from the user-provided example. Specifically, we propose a plug-and-play method, Subject-Derived regularization (SuDe). It constructs the base-derived class modeling by constraining the subject-driven generated images to semantically belong to the subject's category. Extensive experiments under three baselines and two backbones on various subjects show that our SuDe enables imaginative attribute-related generations while maintaining subject fidelity. Codes will be open sourced soon at FaceChain (https://github.com/modelscope/facechain).

Submitted 11 March, 2024; originally announced March 2024.

Comments: accepted by CVPR2024

arXiv:2401.15949 [pdf, ps, other]

TFDMNet: A Novel Network Structure Combines the Time Domain and Frequency Domain Features

Authors: Hengyue Pan, Yixin Chen, Zhiliang Tian, Peng Qiao, Linbo Qiao, Dongsheng Li

Abstract: Convolutional neural networks (CNNs) have achieved impressive success in computer vision during the past few decades. The image convolution operation helps CNNs to get good performance on image-related tasks. However, it also has high computational complexity and is hard to parallelize. This paper proposes a novel Element-wise Multiplication Layer (EML) to replace convolution layers, which can be trained in the frequency domain. Theoretical analyses show that EMLs lower the computational complexity and are easier to parallelize. Moreover, we introduce a Weight Fixation mechanism to alleviate the problem of over-fitting, and analyze the working behavior of Batch Normalization and Dropout in the frequency domain. To get the balance between the computational complexity and memory usage, we propose a new network structure, namely Time-Frequency Domain Mixture Network (TFDMNet), which combines the advantages of both convolution layers and EMLs. Experimental results imply that TFDMNet achieves good performance on MNIST, CIFAR-10 and ImageNet databases with fewer operations than the corresponding CNNs.

Submitted 29 January, 2024; originally announced January 2024.

Comments: This paper is the updated edition of our paper Learning Convolutional Neural Networks in the Frequency Domain (arXiv:2204.06718). Compared with the previous edition, we design a mixture model to get the balance between the computational complexity and memory usage
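
The idea behind replacing convolution with element-wise multiplication in the frequency domain rests on the convolution theorem. The following is a minimal sketch, not the TFDMNet implementation: it uses a 1D signal, circular convolution, and a naive DFT, and the signal, kernel, and function names are hypothetical. CNN layers typically compute a non-circular cross-correlation, so a real layer would need padding and kernel flipping that this sketch omits.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def circular_conv(x, h):
    """Direct circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[m] * h[(k - m) % n] for m in range(n)) for k in range(n)]

signal = [1.0, 2.0, 3.0, 4.0]
kernel = [0.5, -1.0, 0.0, 0.25]

# Path 1: convolve directly in the original domain.
spatial = circular_conv(signal, kernel)

# Path 2: transform, multiply element-wise, transform back.
X, H = dft(signal), dft(kernel)
freq = [v.real for v in dft([a * b for a, b in zip(X, H)], inverse=True)]

print(spatial)
print(freq)
```

Both paths give the same values up to floating-point error; the element-wise product costs one multiplication per frequency bin, which is where the reduced operation count in the frequency domain comes from.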

arXiv:2311.11825 [pdf, other]

Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning

Authors: Zixuan Xie, Rengan Xie, Rong Li, Kai Huang, Pengju Qiao, Jingsen Zhu, Xu Yin, Qi Ye, Wei Hua, Yuchi Huo, Hujun Bao

Abstract: In this work, we use multi-view aerial images to reconstruct the geometry, lighting, and material of facades using neural signed distance fields (SDFs). Without the requirement of complex equipment, our method only takes simple RGB images captured by a drone as inputs to enable physically based and photorealistic novel-view rendering, relighting, and editing. However, a real-world facade usually has complex appearances ranging from diffuse rocks with subtle details to large-area glass windows with specular reflections, making it hard to attend to everything. As a result, previous methods can preserve the geometry details but fail to reconstruct smooth glass windows or vice versa. In order to address this challenge, we introduce three spatial- and semantic-adaptive optimization strategies, including a semantic regularization approach based on zero-shot segmentation techniques to improve material consistency, a frequency-aware geometry regularization to balance surface smoothness and details in different surfaces, and a visibility probe-based scheme to enable efficient modeling of the local lighting in large-scale outdoor environments. In addition, we capture a real-world facade aerial 3D scanning image set and corresponding point clouds for training and benchmarking. The experiment demonstrates the superior quality of our method on facade holistic inverse rendering, novel view synthesis, and scene editing compared to state-of-the-art baselines.

Submitted 8 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

arXiv:2310.14616 [pdf, other]

Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz

Authors: Tao Sun, Congliang Chen, Peng Qiao, Li Shen, Xinwang Liu, Dongsheng Li

Abstract: Sign-based stochastic methods have gained attention due to their ability to achieve robust performance despite using only the sign information for parameter updates. However, the current convergence analysis of sign-based methods relies on the strong assumptions of first-order gradient Lipschitz and second-order gradient Lipschitz, which may not hold in practical tasks like deep neural network training that involve high non-smoothness. In this paper, we revisit sign-based methods and analyze their convergence under more realistic assumptions of first- and second-order smoothness. We first establish the convergence of the sign-based method under weak first-order Lipschitz. Motivated by the weak first-order Lipschitz, we propose a relaxed second-order condition that still allows for nonconvex acceleration in sign-based methods. Based on our theoretical results, we gain insights into the computational advantages of the recently developed LION algorithm. In distributed settings, we prove that this nonconvex acceleration persists with linear speedup in the number of nodes, when utilizing fast communication compression gossip protocols. The novelty of our theoretical results lies in that they are derived under much weaker assumptions, thereby expanding the provable applicability of sign-based algorithms to a wider range of problems.

Submitted 23 October, 2023; originally announced October 2023.
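
For readers unfamiliar with sign-based updates, the basic rule moves each parameter by a fixed step in the direction opposite to the sign of its gradient. The following is a minimal sketch of plain deterministic sign descent, not the accelerated method or the LION algorithm analyzed in the paper; the objective, step size, and function names are hypothetical choices for illustration.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def sign_descent(grad_fn, x0, lr, steps):
    """Update each coordinate by -lr * sign(gradient); the gradient
    magnitude is ignored entirely."""
    x = list(x0)
    for _ in range(steps):
        g = grad_fn(x)
        x = [xi - lr * sign(gi) for xi, gi in zip(x, g)]
    return x

def grad_quadratic(x):
    """Gradient of the badly scaled toy objective
    f(x) = 0.5 * (x[0]**2 + 100 * x[1]**2)."""
    return [x[0], 100.0 * x[1]]

x_final = sign_descent(grad_quadratic, [1.0, -1.0], lr=0.01, steps=200)
print(x_final)
```

Because only the sign is used, both coordinates approach the minimizer at the same rate despite the 100x difference in curvature, and with a constant step size the iterates end up within about one step size of zero rather than converging exactly.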

arXiv:2305.18158 [pdf, other]

Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning

Authors: Yu Wang, Pengchong Qiao, Chang Liu, Guoli Song, Xiawu Zheng, Jie Chen

Abstract: Recent advances in robust semi-supervised learning (SSL) typically filter out-of-distribution (OOD) information at the sample level. We argue that an overlooked problem of robust SSL is its corrupted information on semantic level, practically limiting the development of the field. In this paper, we take an initial step to explore and propose a unified framework termed OOD Semantic Pruning (OSP), which aims at pruning OOD semantics out from in-distribution (ID) features. Specifically, (i) we propose an aliasing OOD matching module to pair each ID sample with an OOD sample with semantic overlap. (ii) We design a soft orthogonality regularization, which first transforms each ID feature by suppressing its semantic component that is collinear with paired OOD sample. It then forces the predictions before and after soft orthogonality decomposition to be consistent. Being practically simple, our method shows a strong performance in OOD detection and ID classification on challenging benchmarks. In particular, OSP surpasses the previous state-of-the-art by 13.7% on accuracy for ID classification and 5.9% on AUROC for OOD detection on TinyImageNet dataset. The source codes are publicly available at https://github.com/rain305f/OSP.

Submitted 29 May, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: Accepted by CVPR 2023
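
The step of suppressing the component of an ID feature that is collinear with its paired OOD feature can be pictured as a vector projection. The following is a minimal sketch under that assumption, not the OSP code: the `strength` parameter, the function names, and the toy vectors are hypothetical, and the consistency loss between predictions is not shown.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def suppress_collinear(id_feat, ood_feat, strength=1.0):
    """Subtract `strength` times the projection of id_feat onto ood_feat.
    strength=1.0 removes the collinear component completely; smaller
    values only attenuate it (a hypothetical 'soft' variant)."""
    denom = dot(ood_feat, ood_feat)
    if denom == 0.0:
        # A zero OOD vector defines no direction; leave the feature unchanged.
        return list(id_feat)
    coef = strength * dot(id_feat, ood_feat) / denom
    return [f - coef * o for f, o in zip(id_feat, ood_feat)]

id_feature = [3.0, 4.0]
ood_feature = [1.0, 0.0]
pruned = suppress_collinear(id_feature, ood_feature)
print(pruned)
```

With `strength=1.0` the result is orthogonal to the OOD feature, so whatever the two features shared along that direction no longer contributes to the ID representation.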

arXiv:2304.13639 [pdf, other]

PVP: Pre-trained Visual Parameter-Efficient Tuning

Authors: Zhao Song, Ke Yang, Naiyang Guan, Junjie Zhu, Peng Qiao, Qingyong Hu

Abstract: Large-scale pre-trained transformers have demonstrated remarkable success in various computer vision tasks. However, it is still highly challenging to fully fine-tune these models for downstream tasks due to their high computational and storage costs. Recently, Parameter-Efficient Tuning (PETuning) techniques, e.g., Visual Prompt Tuning (VPT) and Low-Rank Adaptation (LoRA), have significantly reduced the computation and storage cost by inserting lightweight prompt modules into the pre-trained models and tuning these prompt modules with a small number of trainable parameters, while keeping the transformer backbone frozen. Although only a few parameters need to be adjusted, most PETuning methods still require a significant amount of downstream task training data to achieve good results. The performance is inadequate on low-data regimes, especially when there are only one or two examples per class. To this end, we first empirically identify the poor performance is mainly due to the inappropriate way of initializing prompt modules, which has also been verified in the pre-trained language models. Next, we propose a Pre-trained Visual Parameter-efficient (PVP) Tuning framework, which pre-trains the parameter-efficient tuning modules first and then leverages the pre-trained modules along with the pre-trained transformer backbone to perform parameter-efficient tuning on downstream tasks. Experiment results on five Fine-Grained Visual Classification (FGVC) and VTAB-1k datasets demonstrate that our proposed method significantly outperforms state-of-the-art PETuning methods.

Submitted 26 April, 2023; originally announced April 2023.

arXiv:2301.12332 [pdf, other]

Towards Vision Transformer Unrolling Fixed-Point Algorithm: a Case Study on Image Restoration

Authors: Peng Qiao, Sidun Liu, Tao Sun, Ke Yang, Yong Dou

Abstract: The great success of Deep Neural Networks (DNNs) has inspired the algorithmic development of DNN-based Fixed-Point (DNN-FP) for computer vision tasks. DNN-FP methods, trained by Back-Propagation Through Time or computing the inaccurate inversion of the Jacobian, suffer from inferior representation ability. Motivated by the representation power of the Transformer, we propose a framework to unroll the FP and approximate each unrolled process via Transformer blocks, called FPformer. To reduce the high consumption of memory and computation, we come up with FPRformer by sharing parameters between the successive blocks. We further design a module to adapt Anderson acceleration to FPRformer to enlarge the unrolled iterations and improve the performance, called FPAformer. In order to fully exploit the capability of the Transformer, we apply the proposed model to image restoration, using self-supervised pre-training and supervised fine-tuning. 161 tasks from 4 categories of image restoration problems are used in the pre-training phase. Hereafter, the pre-trained FPformer, FPRformer, and FPAformer are further fine-tuned for the comparison scenarios. Using self-supervised pre-training and supervised fine-tuning, the proposed FPformer, FPRformer, and FPAformer achieve competitive performance with state-of-the-art image restoration methods and better training efficiency. FPAformer employs only 29.82% parameters used in SwinIR models, and provides superior performance after fine-tuning. To train these comparison models, it takes only 26.9% time used for training SwinIR models. It provides a promising way to introduce the Transformer in low-level vision tasks.

Submitted 28 January, 2023; originally announced January 2023.

arXiv:2301.01896 [pdf]

Nanoparticles Passive Targeting Allows Optical Imaging of Bone Diseases

Authors: Chao Mi, Xun Zhang, Chengyu Yang, Jianqun Wu, Xinxin Chen, Chenguang Ma, Sitong Wu, Zhichao Yang, Pengzhen Qiao, Yang Liu, Weijie Wu, Zhiyong Guo, Jiayan Liao, Jiajia Zhou, Ming Guan, Chao Liang, Chao Liu, Dayong Jin

Abstract: Bone health related skeletal disorders are commonly diagnosed by X-ray imaging, but the radiation limits its use. Light excitation and optical imaging through the near-infrared-II window (NIR-II, 1000-1700 nm) can penetrate deep tissues without radiation risk, but the targeting of contrast agent is non-specific. Here, we report that lanthanide-doped nanocrystals can be passively transported by endothelial cells and macrophages from the blood vessels into bone marrow microenvironment. We found that this passive targeting scheme can be effective for longer than two months. We therefore developed an intravital 3D and high-resolution planar imaging instrumentation for bone disease diagnosis. We demonstrated the regular monitoring of 1 mm bone defects for over 10 days, with resolution similar to X-ray imaging result, but more flexible use in prognosis. Moreover, the passive targeting can be used to reveal the early onset inflammation at the joints as the synovitis in the early stage of rheumatoid arthritis. Furthermore, the proposed method is comparable to μCT in recognizing symptoms of osteoarthritis, including the mild hyperostosis in femur which is ~100 μm thicker than normal, and the growth of millimeter-scale osteophyte in the knee joint, which further proves the power and universality of our approach in diagnosis of bone diseases.

Submitted 4 January, 2023; originally announced January 2023.

arXiv:2211.12268 [pdf, other]

Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation

Authors: Zesen Cheng, Pengchong Qiao, Kehan Li, Siheng Li, Pengxu Wei, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen

Abstract: Weakly supervised semantic segmentation is typically inspired by class activation maps, which serve as pseudo masks with class-discriminative regions highlighted. Although tremendous efforts have been made to recall precise and complete locations for each class, existing methods still commonly suffer from the unsolicited Out-of-Candidate (OC) error predictions that do not belong to the label candidates, which could be avoidable since the contradiction with image-level class tags is easy to detect. In this paper, we develop a group ranking-based Out-of-Candidate Rectification (OCR) mechanism in a plug-and-play fashion. Firstly, we adaptively split the semantic categories into In-Candidate (IC) and OC groups for each OC pixel according to their prior annotation correlation and posterior prediction correlation. Then, we derive a differentiable rectification loss to force OC pixels to shift to the IC group. Incorporating our OCR with seminal baselines (e.g., AffinityNet, SEAM, MCTformer), we can achieve remarkable performance gains on both Pascal VOC (+3.2%, +3.3%, +0.8% mIoU) and MS COCO (+1.0%, +1.3%, +0.5% mIoU) datasets with negligible extra training overhead, which justifies the effectiveness and generality of our OCR.

Submitted 14 March, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: Accepted to CVPR2023

arXiv:2210.08519 [pdf, other]

Fuzzy Positive Learning for Semi-supervised Semantic Segmentation

Authors: Pengchong Qiao, Zhidan Wei, Yu Wang, Zhennan Wang, Guoli Song, Fan Xu, Xiangyang Ji, Chang Liu, Jie Chen

Abstract: Semi-supervised learning (SSL) essentially pursues class boundary exploration with less dependence on human annotations. Although typical attempts focus on ameliorating the inevitable error-prone pseudo-labeling, we think differently and resort to exhausting informative semantics from multiple probably correct candidate labels. In this paper, we introduce Fuzzy Positive Learning (FPL) for accurate SSL semantic segmentation in a plug-and-play fashion, targeting adaptively encouraging fuzzy positive predictions and suppressing highly-probable negatives. Being conceptually simple yet practically effective, FPL can remarkably alleviate interference from wrong pseudo labels and progressively achieve clear pixel-level semantic discrimination. Concretely, our FPL approach consists of two main components, including fuzzy positive assignment (FPA) to provide an adaptive number of labels for each pixel and fuzzy positive regularization (FPR) to restrict the predictions of fuzzy positive categories to be larger than the rest under different perturbations. Theoretical analysis and extensive experiments on Cityscapes and VOC 2012 with consistent performance gain justify the superiority of our approach.

Submitted 19 November, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

arXiv:2208.13029 [pdf, other]

Multi-Outputs Is All You Need For Deblur

Authors: Sidun Liu, Peng Qiao, Yong Dou

Abstract: Image deblurring is an ill-posed task, with infinitely many feasible solutions for a blurry image. Modern deep learning approaches usually discard the learning of blur kernels and directly employ end-to-end supervised learning. Popular deblurring datasets define the label as one of the feasible solutions. However, we argue that it's not reasonable to specify a label directly, especially when the label is sampled from a random distribution. Therefore, we propose to make the network learn the distribution of feasible solutions, and design based on this consideration a novel multi-head output architecture and corresponding loss function for distribution learning. Our approach enables the model to output multiple feasible solutions to approximate the target distribution. We further propose a novel parameter multiplexing method that reduces the number of parameters and computational effort while improving performance. We evaluated our approach on multiple image-deblur models, including the current state-of-the-art NAFNet. The best overall PSNR (picking the highest score among multiple heads for each validation image) improves on the compared baselines by 0.11~0.18 dB. The best single-head PSNR (picking the best-performing head among multiple heads on the validation set) improves on the compared baselines by 0.04~0.08 dB. The codes are available at https://github.com/Liu-SD/multi-output-deblur.

Submitted 27 August, 2022; originally announced August 2022.

Comments: Under review

arXiv:2207.02625 [pdf, other]

$L_2$BN: Enhancing Batch Normalization by Equalizing the $L_2$ Norms of Features

Authors: Zhennan Wang, Kehan Li, Runyi Yu, Yian Zhao, Pengchong Qiao, Chang Liu, Fan Xu, Xiangyang Ji, Guoli Song, Jie Chen

Abstract: In this paper, we analyze batch normalization from the perspective of discriminability and find the disadvantages ignored by previous studies: the difference in $l_2$ norms of sample features can hinder batch normalization from obtaining more distinguished inter-class features and more compact intra-class features. To address this issue, we propose a simple yet effective method to equalize the $l_2$ norms of sample features. Concretely, we $l_2$-normalize each sample feature before feeding them into batch normalization, and therefore the features are of the same magnitude. Since the proposed method combines the $l_2$ normalization and batch normalization, we name our method $L_2$BN. The $L_2$BN can strengthen the compactness of intra-class features and enlarge the discrepancy of inter-class features. The $L_2$BN is easy to implement and can exert its effect without any additional parameters or hyper-parameters. We evaluate the effectiveness of $L_2$BN through extensive experiments with various models on image classification and acoustic scene classification tasks. The results demonstrate that the $L_2$BN can boost the generalization ability of various neural network models and achieve considerable performance improvements.

Submitted 21 March, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

Comments: 12 pages, 8 figures
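
The core step described in the $L_2$BN abstract above ($l_2$-normalize each sample feature, then apply batch normalization) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: features are treated as flat vectors held in plain Python lists, the batch normalization has no learnable scale or shift and no running statistics, and the function names `l2_normalize`, `batch_norm` and `l2bn` are hypothetical.

```python
import math


def l2_normalize(x, eps=1e-12):
    """Scale one sample's feature vector to unit l2 norm."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / max(norm, eps) for v in x]


def batch_norm(batch, eps=1e-5):
    """Plain batch normalization over a list of feature vectors.

    Assumption: no learnable scale/shift and no running statistics.
    """
    n = len(batch)
    dim = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dim)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dim)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dim)]
        for row in batch
    ]


def l2bn(batch):
    """Equalize per-sample l2 norms, then batch-normalize."""
    return batch_norm([l2_normalize(row) for row in batch])


batch = [[1.0, 2.0], [3.0, 1.0], [0.5, 4.0]]
# Rescaling one sample by a positive factor leaves the output unchanged,
# because every sample is brought to unit norm before batch normalization.
rescaled = [[10.0, 20.0], [3.0, 1.0], [0.5, 4.0]]
out_a = l2bn(batch)
out_b = l2bn(rescaled)
```

In this sketch the magnitude of an individual sample no longer influences the batch statistics, which is the equalizing effect the abstract attributes to the method.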

arXiv:2008.00901 [pdf, other]

Automated Segmentation of Brain Gray Matter Nuclei on Quantitative Susceptibility Mapping Using Deep Convolutional Neural Network

Authors: Chao Chai, Pengchong Qiao, Bin Zhao, Huiying Wang, Guohua Liu, Hong Wu, E Mark Haacke, Wen Shen, Chen Cao, Xinchen Ye, Zhiyang Liu, Shuang Xia

Abstract: Abnormal iron accumulation in the brain subcortical nuclei has been reported to be correlated to various neurodegenerative diseases, which can be measured through the magnetic susceptibility from the quantitative susceptibility mapping (QSM). To quantitatively measure the magnetic susceptibility, the nuclei should be accurately segmented, which is a tedious task for clinicians. In this paper, we proposed a double-branch residual-structured U-Net (DB-ResUNet) based on 3D convolutional neural network (CNN) to automatically segment such brain gray matter nuclei. To better trade off segmentation accuracy against memory efficiency, the proposed DB-ResUNet fed image patches with high resolution and the patches with low resolution but larger field of view into the local and global branches, respectively. Experimental results revealed that by jointly using QSM and T$_\text{1}$ weighted imaging (T$_\text{1}$WI) as inputs, the proposed method was able to achieve better segmentation accuracy over its single-branch counterpart, as well as the conventional atlas-based method and the classical 3D-UNet structure. The susceptibility values and the volumes were also measured, which indicated that the measurements from the proposed DB-ResUNet are able to present high correlation with values from the manually annotated regions of interest.

Submitted 3 August, 2020; originally announced August 2020.

Comments: submitted to IEEE Transactions on Medical Imaging

arXiv:2001.11723 [pdf, other]

On a problem of Erdős about graphs whose size is the Turán number plus one

Authors: Pu Qiao, Xingzhi Zhan

Abstract: We consider finite simple graphs. Given a graph $H$ and a positive integer $n,$ the Turán number of $H$ for the order $n,$ denoted ${\rm ex}(n,H),$ is the maximum size of a graph of order $n$ not containing $H$ as a subgraph. Erdős posed the following problem in 1990: "For which graphs $H$ is it true that every graph on $n$ vertices and ${\rm ex}(n,H)+1$ edges contains at least two $H$s? Perhaps this is always true." We solve the second part of this problem in the negative by proving that for every integer $k\ge 4,$ there exists a graph $H$ of order $k$ and at least two orders $n$ such that there exists a graph of order $n$ and size ${\rm ex}(n,H)+1$ which contains exactly one copy of $H.$ Denote by $C_4$ the $4$-cycle. We also prove that for every integer $n$ with $6\le n\le 11,$ there exists a graph of order $n$ and size ${\rm ex}(n,C_4)+1$ which contains exactly one copy of $C_4,$ but for $n=12$ or $n=13,$ the minimum number of copies of $C_4$ in a graph of order $n$ and size ${\rm ex}(n,C_4)+1$ is $2.$

Submitted 31 January, 2020; originally announced January 2020.

Comments: 16 pages, 6 figures

MSC Class: 05C35; 05C30; 05C75

arXiv:1910.07281 [pdf, other]

doi 10.1017/S0004972720001471

The diameter and radius of radially maximal graphs

Authors: Pu Qiao, Xingzhi Zhan

Abstract: A graph is called radially maximal if it is not complete and the addition of any new edge decreases its radius. In 1976 Harary and Thomassen proved that the radius $r$ and diameter $d$ of any radially maximal graph satisfy $r\le d\le 2r-2.$ Dutton, Medidi and Brigham rediscovered this result with a different proof in 1995 and they posed the conjecture that the converse is true, that is, if $r$ and $d$ are positive integers satisfying $r\le d\le 2r-2,$ then there exists a radially maximal graph with radius $r$ and diameter $d.$ We prove this conjecture and a little more.

Submitted 16 October, 2019; originally announced October 2019.

Comments: 8 pages, 3 figures

Journal ref: Bull. Aust. Math. Soc. 104 (2021) 196-202

arXiv:1906.01536 [pdf, other]

doi 10.1109/ICPR.2018.8546126

Visual Tree Convolutional Neural Network in Image Classification

Authors: Yuntao Liu, Yong Dou, Ruochun Jin, Peng Qiao

Abstract: In image classification, Convolutional Neural Network (CNN) models have achieved high performance with the rapid development in deep learning. However, some categories in the image datasets are more difficult to distinguish than others. Improving the classification accuracy on these confused categories benefits the overall performance. In this paper, we build a Confusion Visual Tree (CVT) based on the confused semantic level information to identify the confused categories. With the information provided by the CVT, we can lead the CNN training procedure to pay more attention to these confused categories. Therefore, we propose Visual Tree Convolutional Neural Networks (VT-CNN) based on the original deep CNN embedded with our CVT. We evaluate our VT-CNN model on the benchmark datasets CIFAR-10 and CIFAR-100. In our experiments, we build 3 different VT-CNN models and they obtain improvements over their base CNN models of 1.36%, 0.89% and 0.64%, respectively.

Submitted 4 June, 2019; originally announced June 2019.

Comments: 7 pages, 2 figures, conference

Journal ref: 2018 24th International Conference on Pattern Recognition (ICPR)

arXiv:1904.12150 [pdf, other]

Relation between the number of leaves of a tree and its diameter

Authors: Pu Qiao, Xingzhi Zhan

Abstract: Let $L(n,d)$ denote the minimum possible number of leaves in a tree of order $n$ and diameter $d.$ In 1975 Lesniak gave the lower bound $B(n,d)=\lceil 2(n-1)/d\rceil$ for $L(n,d).$ When $d$ is even, $B(n,d)=L(n,d).$ But when $d$ is odd, $B(n,d)$ is smaller than $L(n,d)$ in general. For example, $B(21,3)=14$ while $L(21,3)=19.$ We prove that for $d\ge 2,$ $L(n,d)=\left\lceil \frac{2(n-1)}{d}\right\rceil$ if $d$ is even and $L(n,d)=\left\lceil \frac{2(n-2)}{d-1}\right\rceil$ if $d$ is odd. The converse problem is also considered. Let $D(n,f)$ be the minimum possible diameter of a tree of order $n$ with exactly $f$ leaves. We prove that $D(n,f)=2$ if $n=f+1,$ $D(n,f)=2k+1$ if $n=kf+2,$ and $D(n,f)=2k+2$ if $kf+3\le n\le (k+1)f+1.$

Submitted 27 April, 2019; originally announced April 2019.
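
The closed-form leaf counts stated in the abstract above can be checked numerically with a few lines of Python. This is a small sketch that only evaluates the two formulas; the function names `lesniak_bound` and `min_leaves` are hypothetical, and the code does not construct or enumerate any trees.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)


def lesniak_bound(n, d):
    """Lesniak's lower bound B(n, d) = ceil(2(n - 1) / d)."""
    return ceil_div(2 * (n - 1), d)


def min_leaves(n, d):
    """L(n, d) as stated in the abstract, for diameter d >= 2."""
    if d < 2:
        raise ValueError("the formula is stated for d >= 2")
    if d % 2 == 0:
        return ceil_div(2 * (n - 1), d)
    return ceil_div(2 * (n - 2), d - 1)


# The example quoted in the abstract: B(21, 3) = 14 while L(21, 3) = 19.
print(lesniak_bound(21, 3), min_leaves(21, 3))
```

For even $d$ the two functions agree by definition, which mirrors the statement that Lesniak's bound is tight in that case.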

arXiv:1902.09928 [pdf, other]

IF-TTN: Information Fused Temporal Transformation Network for Video Action Recognition

Authors: Ke Yang, Peng Qiao, Dongsheng Li, Yong Dou

Abstract: Effective spatiotemporal feature representation is crucial to the video-based action recognition task. Focusing on discriminative spatiotemporal feature learning, we propose Information Fused Temporal Transformation Network (IF-TTN) for action recognition on top of the popular Temporal Segment Network (TSN) framework. In the network, Information Fusion Module (IFM) is designed to fuse the appearance and motion features at multiple ConvNet levels for each video snippet, forming a short-term video descriptor. With fused features as inputs, Temporal Transformation Networks (TTN) are employed to model middle-term temporal transformation between the neighboring snippets following a sequential order. As TSN itself depicts long-term temporal structure by segmental consensus, the proposed network comprehensively considers multiple granularity temporal features. Our IF-TTN achieves state-of-the-art results on the two most popular action recognition datasets: UCF101 and HMDB51. Empirical investigation reveals that our architecture is robust to the input motion map quality. Replacing optical flow with the motion vectors from compressed video stream, the performance is still comparable to the flow-based methods while the testing speed is 10x faster.

Submitted 11 April, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

arXiv:1902.05488 [pdf, other]

Exploring Frame Segmentation Networks for Temporal Action Localization

Authors: Ke Yang, Xiaolong Shen, Peng Qiao, Shijie Li, Dongsheng Li, Yong Dou

Abstract: Temporal action localization is an important task of computer vision. Though many methods have been proposed, it still remains an open question how to predict the temporal location of action segments precisely. Most state-of-the-art works train action classifiers on video segments pre-determined by action proposal. However, recent work found that a desirable model should move beyond segment-level and make dense predictions at a fine granularity in time to determine precise temporal boundaries. In this paper, we propose a Frame Segmentation Network (FSN) that places a temporal CNN on top of the 2D spatial CNNs. Spatial CNNs are responsible for abstracting semantics in spatial dimension while temporal CNN is responsible for introducing temporal context information and performing dense predictions. The proposed FSN can make dense predictions at frame-level for a video clip using both spatial and temporal context information. FSN is trained in an end-to-end manner, so the model can be optimized in spatial and temporal domain jointly. We also adapt FSN to use it in weakly supervised scenario (WFSN), where only video level labels are provided when training. Experiment results on public dataset show that FSN achieves superior performance in both frame-level action localization and temporal action localization.

Submitted 14 February, 2019; originally announced February 2019.

Comments: Accepted by Journal of Visual Communication and Image Representation

arXiv:1901.09547 [pdf, other]

Pairs of a tree and a nontree graph with the same status sequence

Authors: Pu Qiao, Xingzhi Zhan

Abstract: The status of a vertex $x$ in a graph is the sum of the distances between $x$ and all other vertices. Let $G$ be a connected graph. The status sequence of $G$ is the list of the statuses of all vertices arranged in nondecreasing order. $G$ is called status injective if all the statuses of its vertices are distinct. Let $G$ be a member of a family of graphs $\mathscr{F}$ and let the status sequence of $G$ be $s.$ $G$ is said to be status unique in $\mathscr{F}$ if $G$ is the unique graph in $\mathscr{F}$ whose status sequence is $s.$ In 2011, J.L. Shang and C. Lin posed the following two conjectures. Conjecture 1: A tree and a nontree graph cannot have the same status sequence. Conjecture 2: Any status injective tree is status unique in all connected graphs. We settle these two conjectures negatively. For every integer $n\ge 10,$ we construct a tree $T_n$ and a unicyclic graph $U_n,$ both of order $n,$ with the following two properties: (1) $T_n$ and $U_n$ have the same status sequence; (2) for $n\ge 15,$ if $n$ is congruent to $3$ modulo $4$ then $T_n$ is status injective and among any four consecutive even orders, there is at least one order $n$ such that $T_n$ is status injective.

Submitted 28 January, 2019; originally announced January 2019.

Comments: 14 pages, 11 figures
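
The definitions in the abstract above (status of a vertex, status sequence, status injective) translate directly into a breadth-first search. The sketch below assumes an unweighted connected graph given as an adjacency dictionary; the function names and the small path-graph example are illustrative and are not taken from the paper, and the trees $T_n$ and graphs $U_n$ constructed there are not reproduced here.

```python
from collections import deque


def status_sequence(adj):
    """Return the statuses of all vertices in nondecreasing order.

    adj maps each vertex to an iterable of its neighbours (undirected graph).
    """
    statuses = []
    for source in adj:
        # BFS gives shortest-path distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("status is only defined here for connected graphs")
        statuses.append(sum(dist.values()))
    return sorted(statuses)


def is_status_injective(adj):
    """True if all vertex statuses are distinct."""
    seq = status_sequence(adj)
    return len(set(seq)) == len(seq)


# Path on four vertices 0 - 1 - 2 - 3: the end vertices have status 6,
# the inner vertices have status 4, so the path is not status injective.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(status_sequence(path4), is_status_injective(path4))
```

Comparing `status_sequence` for a tree and a non-tree graph of the same order is the kind of check that the two conjectures mentioned in the abstract concern.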

arXiv:1811.05314 [pdf, ps, other]

The largest graphs with given order and diameter: A simple proof

Authors: Pu Qiao, Xingzhi Zhan

Abstract: A consequence of Ore's classic theorem characterizing the maximal graphs with given order and diameter is a determination of the largest such graphs. We give a very short and simple proof of this smaller result, based on a well-known elementary observation.

Submitted 9 November, 2018; originally announced November 2018.

arXiv:1807.06216 [pdf, other]

Learning Generic Diffusion Processes for Image Restoration

Authors: Peng Qiao, Yong Dou, Yunjin Chen, Wensen Feng

Abstract: Image restoration problems are typical ill-posed problems where the regularization term plays an important role. The regularization term learned via generative approaches is easy to transfer to various image restoration problems, but offers inferior restoration quality compared with that learned via discriminative approaches. In contrast, the regularization term learned via discriminative approaches is usually trained for a specific image restoration problem, and fails on problems for which it is not trained. To address this issue, we propose a generic diffusion process (genericDP) to handle multiple Gaussian denoising problems based on the Trainable Non-linear Reaction Diffusion (TNRD) models. Instead of one model, which consists of a diffusion and a reaction term, for one Gaussian denoising problem in TNRD, we enforce multiple TNRD models to share one diffusion term. The trained genericDP model can provide both promising denoising performance and high training efficiency compared with the original TNRD models. We also transfer the trained diffusion term to non-blind deconvolution which is unseen in the training phase. Experiment results show that the trained diffusion term for multiple Gaussian denoising can be transferred to image non-blind deconvolution as an image prior and provide competitive performance.

Submitted 17 July, 2018; originally announced July 2018.

Comments: 12 pages, 3 figures, 3 tables

Journal ref: British Machine Vision Conference 2018

arXiv:1806.06564 [pdf, other]

Detour-saturated graphs of small girths

Authors: Pu Qiao, Xingzhi Zhan

Abstract: A detour of a graph G is a longest path in G. The detour order of G is the number of vertices in a detour of G. A graph is said to be detour-saturated if the addition of any edge increases strictly the detour order. L.W. Beineke, J.E. Dunbar and M. Frick asked the following three questions in 2005. (1) What is the smallest order of a detour-saturated graph of girth 4? (2) Let Pr be the graph obtained from the Petersen graph by splitting one of its vertices into three leaves. Is Pr the smallest triangle-free detour-saturated graph? (3) Does there exist a detour-saturated graph with finite girth bigger than 5? We answer these questions.

Submitted 18 June, 2018; originally announced June 2018.

Comments: 6 pages, 4 figures

arXiv:1802.09250 [pdf, other]

The minimum number of Hamilton cycles in a hamiltonian threshold graph of a prescribed order

Authors: Pu Qiao, Xingzhi Zhan

Abstract: We prove that the minimum number of Hamilton cycles in a hamiltonian threshold graph of order $n$ is $2^{\lfloor (n-3)/2\rfloor}$ and this minimum number is attained uniquely by the graph with degree sequence $n-1,n-1,n-2,\ldots,\lceil n/2\rceil,\lceil n/2\rceil,\ldots,3,2$ of $n-2$ distinct degrees. This graph is also the unique graph of minimum size among all hamiltonian threshold graphs of order $n.$

Submitted 26 February, 2018; originally announced February 2018.

Comments: 8 pages

MSC Class: 05C45; 05C35

arXiv:1708.03280 [pdf, other]

Exploring Temporal Preservation Networks for Precise Temporal Action Localization

Authors: Ke Yang, Peng Qiao, Dongsheng Li, Shaohe Lv, Yong Dou

Abstract: Temporal action localization is an important task of computer vision. Though a variety of methods have been proposed, it still remains an open question how to predict the temporal boundaries of action segments precisely. Most works use segment-level classifiers to select video segments pre-determined by action proposal or dense sliding windows. However, in order to achieve more precise action boundaries, a temporal localization system should make dense predictions at a fine granularity. A newly proposed work exploits Convolutional-Deconvolutional-Convolutional (CDC) filters to upsample the predictions of 3D ConvNets, making it possible to perform per-frame action predictions and achieving promising performance in terms of temporal action localization. However, CDC network loses temporal information partially due to the temporal downsampling operation. In this paper, we propose an elegant and powerful Temporal Preservation Convolutional (TPC) Network that equips 3D ConvNets with TPC filters. TPC network can fully preserve temporal resolution and downsample the spatial resolution simultaneously, enabling frame-level granularity action localization. TPC network can be trained in an end-to-end manner. Experiment results on public datasets show that TPC network achieves significant improvement on per-frame action prediction and competing results on segment-level temporal action localization.

Submitted 11 September, 2017; v1 submitted 10 August, 2017; originally announced August 2017.

arXiv:1707.07753 [pdf]

doi 10.1364/AOP.10.000180

Recent advances in high-contrast metastructures, metasurfaces and photonic crystals

Authors: Pengfei Qiao, Weijian Yang, Connie J. Chang-Hasnain

Abstract: In the recent decade, the research field using arrays of high-index-contrast near-wavelength dielectric structures on flat surfaces, known as high-contrast metastructures (HCMs) or metasurfaces, has emerged and expanded rapidly. Although the HCMs and metasurfaces share great similarities in physical structures with photonic crystals (PhCs), i.e. periodic nanostructures, many differences exist in their design, analysis, operation conditions, and applications. In this paper, we provide a generalized theoretical understanding of the two subjects and show their intrinsic connections. We further discuss the simulation and design approaches, categorized by their functionalities and applications. The similarities and differences between HCMs, metasurfaces and PhCs are also discussed. New findings are presented regarding the physical connection between the PhC band structures and the 1D and 2D HCM scattering spectra under transverse and longitudinal tilt incidence. Novel designs using HCMs as holograms, spatial light modulators, and surface plasmonic couplers are discussed. Recent advances on HCMs, metasurfaces and PhCs are reviewed and compared for applications such as broadband mirrors, waveguides, couplers, resonators, and reconfigurable optics.

Submitted 24 July, 2017; originally announced July 2017.

Comments: 58 pages, 44 figures, review article

arXiv:1705.09540 [pdf, other]

On vertex types of graphs

Authors: Pu Qiao, Xingzhi Zhan

Abstract: The vertices of a graph are classified into seven types by J.T. Hedetniemi, S.M. Hedetniemi, S.T. Hedetniemi and T.M. Lewis and they ask the following questions: 1) What is the smallest order $n$ of a graph having $n-2$ very typical vertices or $n-2$ typical vertices? 2) What is the smallest order of a pantypical graph? We answer these two questions in this paper.

Submitted 26 May, 2017; originally announced May 2017.

arXiv:1702.07472 [pdf, other]

Learning Non-local Image Diffusion for Image Denoising

Authors: Peng Qiao, Yong Dou, Wensen Feng, Yunjin Chen

Abstract: Image diffusion plays a fundamental role for the task of image denoising. Recently proposed trainable nonlinear reaction diffusion (TNRD) model defines a simple but very effective framework for image denoising. However, as the TNRD model is a local model, the diffusion behavior of which is purely controlled by information of local patches, it is prone to create artifacts in the homogeneous regions and over-smooth highly textured regions, especially in the case of strong noise levels. Meanwhile, it is widely known that the non-local self-similarity (NSS) prior stands as an effective image prior for image denoising, which has been widely exploited in many non-local methods. In this work, we are highly motivated to embed the NSS prior into the TNRD model to tackle its weaknesses. In order to preserve the expected property that end-to-end training is available, we exploit the NSS prior by a set of non-local filters, and derive our proposed trainable non-local reaction diffusion (TNLRD) model for image denoising. Together with the local filters and influence functions, the non-local filters are learned by employing loss-specific training. The experimental results show that the trained TNLRD model produces visually plausible recovered images with more textures and fewer artifacts, compared to its local versions. Moreover, the trained TNLRD model can achieve strongly competitive performance to recent state-of-the-art image denoising methods in terms of peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM).

Submitted 24 February, 2017; originally announced February 2017.

Comments: under review in a journal

arXiv:1610.03608 [pdf, ps, other]

A spatio-temporal model and inference tools for longitudinal count data on multicolor cell growth

Authors: Puxue Qiao, Christina Mølck, Davide Ferrari, Frédéric Hollande

Abstract: Multicolor cell spatio-temporal image data have become important to investigate organ development and regeneration, malignant growth or immune responses by tracking different cell types both in vivo and in vitro. Statistical modeling of image data from common longitudinal cell experiments poses significant challenges due to the presence of complex spatio-temporal interactions between different cell types and difficulties related to measurement of single cell trajectories. Current analysis methods focus mainly on univariate cases, often not considering the spatio-temporal effects affecting cell growth between different cell populations. In this paper, we propose a conditional spatial autoregressive model to describe multivariate count cell data on the lattice, and develop inference tools. The proposed methodology is computationally tractable and enables researchers to estimate a complete statistical model of multicolor cell growth. Our methodology is applied on real experimental data where we investigate how interactions between cells affect their growth. We include two case studies; the first evaluates interactions between cancer cells and fibroblasts, which are normally present in the tumor microenvironment, whilst the second evaluates interactions between cloned cancer cells when grown as different combinations.

Submitted 9 April, 2018; v1 submitted 12 October, 2016; originally announced October 2016.

arXiv:1609.06585 [pdf, other]

Image Denoising via Multi-scale Nonlinear Diffusion Models

Authors: Wensen Feng, Peng Qiao, Xuanyang Xi, Yunjin Chen

Abstract: Image denoising is a fundamental operation in image processing and holds considerable practical importance for various real-world applications. Arguably several thousands of papers are dedicated to image denoising. In the past decade, state-of-the-art denoising algorithms have been clearly dominated by non-local patch-based methods, which explicitly exploit patch self-similarity within an image. However, in the last two years, discriminatively trained local approaches have started to outperform previous non-local models and have been attracting increasing attention due to the additional advantage of computational efficiency. Successful approaches include cascade of shrinkage fields (CSF) and trainable nonlinear reaction diffusion (TNRD). These two methods are built on filter response of linear filters of small size using feed forward architectures. Due to the locality inherent in local approaches, the CSF and TNRD models become less effective when the noise level is high and consequently introduce some noise artifacts. In order to overcome this problem, in this paper we introduce a multi-scale strategy. To be specific, we build on our newly-developed TNRD model, adopting the multi-scale pyramid image representation to devise a multi-scale nonlinear diffusion process. As expected, all the parameters in the proposed multi-scale diffusion model, including the filters and the influence functions across scales, are learned from training data through a loss based approach.
Numerical results on Gaussian and Poisson denoising substantiate that the exploited multi-scale strategy can successfully boost the performance of the original TNRD model with single scale. As a consequence, the resulting multi-scale diffusion models can significantly suppress the typical incorrect features for those noisy images with heavy noise.

Submitted 21 September, 2016; originally announced September 2016.

arXiv:1608.06170 [pdf, ps, other]

Turán problems for digraphs avoiding distinct walks of a given length with the same endpoints

Authors: Zejun Huang, Zhenhua Lyu, Pu Qiao

Abstract: Let $n \ge 5$ and $k\ge 4$ be positive integers. We determine the maximum size of digraphs of order n that avoid distinct walks of length k with the same endpoints. We also characterize the extremal digraphs attaining this maximum number when $k \ge 5$.

Submitted 22 August, 2016; originally announced August 2016.

MSC Class: 05C35; 05C20

arXiv:cond-mat/0506268 [pdf]

doi 10.1016/j.ssc.2005.05.047

Effect of Ti doping on the electrical transport and magnetic properties of layered compound Na0.8CoO2

Authors: W. Y. Zhang, Y. G. Zhao, Z. P. Guo, P. T. Qiao, L. Cui, L. B. Luo, X. P. Zhang, H. C. Yu, Y. G. Shi, S. Y. Zhang, T. Y. Zhao, J. Q. Li

Abstract: Effect of Ti doping on the electrical transport and magnetic properties of layered Na0.8Co1-xTixO2 compounds has been investigated. The lattice parameters a and c increase with x. A minor amount of Ti doping results in a metal-insulator transition at low temperatures. For samples with x > 0.03, the variable-range hopping process dominates the transport behavior above a certain temperature. The temperature dependence of magnetization of all the samples is found to obey the Curie-Weiss law. The mechanism of the doping effect is discussed.

Submitted 12 June, 2005; originally announced June 2005.

Comments: 19 pages, 6 figures, Solid State Commun. (in press)

arXiv:cond-mat/0105053 [pdf]

doi 10.1016/S0921-4534(01)01100-5

Influence of the starting composition on the structural and superconducting properties of MgB2 phase

Authors: Y. G. Zhao, X. P. Zhang, P. T. Qiao, H. T. Zhang, S. L. Jia, B. S. Cao, M. H. Zhu, Z. H. Han, X. L. Wang, B. L. Gu

Abstract: We report the preparation of Mg$_{1-x}$B$_{2}$ (0$\le$x$\le$0.5) compounds with the nominal compositions. Single phase MgB$_{2}$ was obtained for x=0 sample. For 0$<$x$\le$0.5, MgB$_{4}$ coexists with "MgB$_{2}$" and the amount of MgB$_{4}$ increases with x. With the increase of x, the lattice parameter ${\it c}$ of "MgB$_{2}$" increases and the lattice parameter ${\it a}$ decreases, correspondingly T$_{c}$ of Mg$_{1-x}$B$_{2}$ decreases. The results were discussed in terms of the presence of Mg vacancies or B interstitials in the MgB$_{2}$ structure. This work is helpful to the understanding of the MgB$_{2}$ films with different T$_{c}$, as well as the Mg site doping effect for MgB$_{2}$.

Submitted 3 May, 2001; originally announced May 2001.

Comments: 11 pages, 4 figures

arXiv:cond-mat/0103077 [pdf]

Effect of Li doping on structure and superconducting transition temperature of Mg1-xLixB2

Authors: Y. G. Zhao, X. P. Zhang, P. T. Qiao, H. T. Zhang, S. L. Jia, B. S. Cao, M. H. Zhu, Z. H. Han, X. L. Wang, B. L. Gu

Abstract: We report the preparation of Mg1-xLixB2 compounds. Nearly single phased samples were obtained for x<0.3. The in-plane lattice parameter a decreases with Li doping, while the lattice parameter c does not show obvious change. The superconducting transition temperature of Mg1-xLixB2 decreases with Li doping and loss of superconductivity occurs for x=0.5 sample. The results of our work are consistent with the prediction of the hole superconductivity mechanism.

Submitted 2 March, 2001; originally announced March 2001.

Comments: 5 pages, 4 figures

Showing 1–40 of 40 results for author: Qiao, P