The Categorical Data Map: A Multidimensional Scaling-Based Approach

Authors: Frederik L. Dennig, Lucas Joos, Patrick Paetzold, Daniela Blumberg, Oliver Deussen, Daniel A. Keim, Maximilian T. Fischer

Abstract: Categorical data does not have an intrinsic definition of distance or order, and therefore, established visualization techniques for categorical data only allow for a set-based or frequency-based analysis, e.g., through Euler diagrams or Parallel Sets, and do not support a similarity-based analysis. We present a novel dimensionality reduction-based visualization for categorical data, which is based on defining the distance of two data items as the number of varying attributes. Our technique enables users to pre-attentively detect groups of similar data items and observe the properties of the projection, such as attributes strongly influencing the embedding. Our prototype visually encodes data properties in an enhanced scatterplot-like visualization, encoding attributes in the background to show the distribution of categories. In addition, we propose two graph-based measures to quantify the plot's visual quality, which rank attributes according to their contribution to cluster cohesion. To demonstrate the capabilities of our similarity-based approach, we compare it to Euler diagrams and Parallel Sets regarding visual scalability and show its benefits through an expert study with five data scientists analyzing the Titanic and Mushroom datasets with up to 23 attributes and 8124 category combinations. Our results indicate that the Categorical Data Map offers an effective analysis method, especially for large datasets with a high number of category combinations.

Submitted 26 August, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: Fully replaced; 10 pages, 9 figures, LaTeX; to appear at Visual Data Science (VDS) Symposium at IEEE VIS 2024
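
The distance described in the abstract above (the number of attributes on which two data items differ) can be illustrated with a minimal sketch. The function names and the toy records are hypothetical and not taken from the paper; the sketch only builds the pairwise distance matrix that a multidimensional scaling step could take as input, and it does not reproduce the authors' prototype.

```python
from itertools import combinations


def attribute_distance(a, b):
    """Count the attributes on which two equally long records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(records):
    """Symmetric matrix of pairwise attribute distances."""
    n = len(records)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = attribute_distance(records[i], records[j])
    return d


# Hypothetical toy records: (class, sex, age group, survived)
records = [
    ("1st", "female", "adult", "yes"),
    ("1st", "male", "adult", "no"),
    ("3rd", "male", "child", "no"),
]
matrix = distance_matrix(records)
for row in matrix:
    print(row)
```

A matrix like this could be passed to any MDS implementation that accepts precomputed dissimilarities; which implementation the authors use is not stated in the abstract.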

arXiv:2403.19456 [pdf, other]

Break-for-Make: Modular Low-Rank Adaptations for Composable Content-Style Customization

Authors: Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Oliver Deussen, Weiming Dong, Jintao Li, Tong-Yee Lee

Abstract: Personalized generation paradigms empower designers to customize visual intellectual properties with the help of textual descriptions by tuning or adapting pre-trained text-to-image models on a few images. Recent works explore approaches for concurrently customizing both content and detailed visual style appearance. However, these existing approaches often generate images where the content and style are entangled. In this study, we reconsider the customization of content and style concepts from the perspective of parameter space construction. Unlike existing methods that utilize a shared parameter space for content and style, we propose a learning framework that separates the parameter space to facilitate individual learning of content and style, thereby enabling disentangled content and style. To achieve this goal, we introduce "partly learnable projection" (PLP) matrices to separate the original adapters into divided sub-parameter spaces. We propose a "break-for-make" customization learning pipeline based on PLP, which is simple yet effective. We break the original adapters into "up projection" and "down projection", train content and style PLPs individually with the guidance of corresponding textual prompts in the separate adapters, and maintain generalization by employing a multi-correspondence projection learning strategy. Based on the adapters broken apart for separate training of content and style, we then make the entity parameter space by reconstructing the content and style PLP matrices, followed by fine-tuning the combined adapter to generate the target object with the desired appearance. Experiments on various styles, including textures, materials, and artistic style, show that our method outperforms state-of-the-art single/multiple concept learning pipelines in terms of content-style-prompt alignment.

Submitted 31 March, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.07627 [pdf, other]

doi 10.1145/3652028

generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation

Authors: Thilo Spinner, Rebecca Kehlbeck, Rita Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deussen, Mennatallah El-Assady

Abstract: Large language models (LLMs) are widely deployed in various downstream tasks, e.g., auto-completion, aided writing, or chat-based text generation. However, the considered output candidates of the underlying search algorithm are under-explored and under-explained. We tackle this shortcoming by proposing a tree-in-the-loop approach, where a visual representation of the beam search tree is the central component for analyzing, explaining, and adapting the generated outputs. To support these tasks, we present generAItor, a visual analytics technique, augmenting the central beam search tree with various task-specific widgets, providing targeted visualizations and interaction possibilities. Our approach allows interactions on multiple levels and offers an iterative pipeline that encompasses generating, exploring, and comparing output candidates, as well as fine-tuning the model based on adapted data. Our case study shows that our tool generates new insights in gender bias analysis beyond state-of-the-art template-based methods. Additionally, we demonstrate the applicability of our approach in a qualitative user study. Finally, we quantitatively evaluate the adaptability of the model to few samples, as occurring in text-generation use cases.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 24 pages paper, 4 pages references, 3 pages appendix, 8 figures

ACM Class: I.2.7; H.5.2

arXiv:2402.08324 [pdf, other]

Uncertainty Quantification via Stable Distribution Propagation

Authors: Felix Petersen, Aashwin Mishra, Hilde Kuehne, Christian Borgelt, Oliver Deussen, Mikhail Yurochkin

Abstract: We propose a new approach for propagating stable probability distributions through neural networks. Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity. This allows propagating Gaussian and Cauchy input uncertainties through neural networks to quantify their output uncertainties. To demonstrate the utility of propagating distributions, we apply the proposed method to predicting calibrated confidence intervals and selective prediction on out-of-distribution data. The results demonstrate a broad applicability of propagating distributions and show the advantages of our method over other approaches such as moment matching.

Submitted 13 February, 2024; originally announced February 2024.

Comments: Published at ICLR 2024, Code @ https://github.com/Felix-Petersen/distprop
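
The idea of propagating a distribution by local linearization, as summarized in the abstract above, can be sketched for the one-dimensional case. This is an illustrative reading, not the authors' code (which lives in the repository linked above): the function names are hypothetical, and a distribution is represented only by a location and a scale, e.g. mean and standard deviation for a Gaussian or location and scale for a Cauchy. Stable distributions keep their family under affine maps, and the ReLU is replaced by its tangent at the current location.

```python
def affine(loc, scale, w, b):
    """Exact propagation through y = w * x + b for a 1-D stable distribution."""
    return w * loc + b, abs(w) * scale


def relu_linearized(loc, scale):
    """Approximate propagation through ReLU, linearized at the location.

    The slope of ReLU at `loc` is 1 if loc > 0 and 0 otherwise
    (taking 0 at loc == 0 is a convention chosen for this sketch).
    """
    slope = 1.0 if loc > 0 else 0.0
    return max(loc, 0.0), slope * scale


# Hypothetical two-layer scalar "network": affine -> ReLU -> affine
hidden = affine(1.0, 0.5, w=2.0, b=-1.0)
activated = relu_linearized(*hidden)
out = affine(*activated, w=-3.0, b=0.5)
print(out)
```

The multivariate case would replace `abs(w) * scale` with the corresponding matrix expressions; those details are not given in the abstract and are omitted here.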

arXiv:2401.17800 [pdf, other]

Dance-to-Music Generation with Encoder-based Textual Inversion of Diffusion Models

Authors: Sifei Li, Weiming Dong, Yuxin Zhang, Fan Tang, Chongyang Ma, Oliver Deussen, Tong-Yee Lee, Changsheng Xu

Abstract: The harmonious integration of music with dance movements is pivotal in vividly conveying the artistic essence of dance. This alignment also significantly elevates the immersive quality of gaming experiences and animation productions. While there has been remarkable advancement in creating high-fidelity music from textual descriptions, current methodologies mainly concentrate on modulating overarching characteristics such as genre and emotional tone. They often overlook the nuanced management of temporal rhythm, which is indispensable in crafting music for dance, since it intricately aligns the musical beats with the dancers' movements. Recognizing this gap, we propose an encoder-based textual inversion technique for augmenting text-to-music models with visual control, facilitating personalized music generation. Specifically, we develop dual-path rhythm-genre inversion to effectively integrate the rhythm and genre of a dance motion sequence into the textual space of a text-to-music model. Contrary to the classical textual inversion method, which directly updates text embeddings to reconstruct a single target object, our approach utilizes separate rhythm and genre encoders to obtain text embeddings for two pseudo-words, adapting to the varying rhythms and genres. To achieve a more accurate evaluation, we propose improved evaluation metrics for rhythm alignment. We demonstrate that our approach outperforms state-of-the-art methods across multiple evaluation metrics. Furthermore, our method seamlessly adapts to in-the-wild data and effectively integrates with the inherent text-guided generation capability of the pre-trained model. Samples are available at https://youtu.be/D7XDwtH1YwE.

Submitted 31 January, 2024; originally announced January 2024.

Comments: 9 pages, 3 figures

arXiv:2310.11252 [pdf, other]

Revealing the Unwritten: Visual Investigation of Beam Search Trees to Address Language Model Prompting Challenges

Authors: Thilo Spinner, Rebecca Kehlbeck, Rita Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deussen, Andreas Spitz, Mennatallah El-Assady

Abstract: The growing popularity of generative language models has amplified interest in interactive methods to guide model outputs. Prompt refinement is considered one of the most effective means to influence output among these methods. We identify several challenges associated with prompting large language models, categorized into data- and model-specific, linguistic, and socio-linguistic challenges. A comprehensive examination of model outputs, including runner-up candidates and their corresponding probabilities, is needed to address these issues. The beam search tree, the prevalent algorithm to sample model outputs, can inherently supply this information. Consequently, we introduce an interactive visual method for investigating the beam search tree, facilitating analysis of the decisions made by the model during generation. We quantitatively show the value of exposing the beam search tree and present five detailed analysis scenarios addressing the identified challenges. Our methodology validates existing results and offers additional insights.

Submitted 17 October, 2023; originally announced October 2023.

Comments: 9 pages paper, 2 pages references, 7 figures

ACM Class: H.5.2; I.2.7

arXiv:2308.15316 [pdf, other]

3D-MuPPET: 3D Multi-Pigeon Pose Estimation and Tracking

Authors: Urs Waldmann, Alex Hoi Hang Chan, Hemal Naik, Máté Nagy, Iain D. Couzin, Oliver Deussen, Bastian Goldluecke, Fumihiro Kano

Abstract: Markerless methods for animal posture tracking have been rapidly developing recently, but frameworks and benchmarks for tracking large animal groups in 3D are still lacking. To overcome this gap in the literature, we present 3D-MuPPET, a framework to estimate and track 3D poses of up to 10 pigeons at interactive speed using multiple camera views. We train a pose estimator to infer 2D keypoints and bounding boxes of multiple pigeons, then triangulate the keypoints to 3D. For identity matching of individuals in all views, we first dynamically match 2D detections to global identities in the first frame, then use a 2D tracker to maintain IDs across views in subsequent frames. We achieve comparable accuracy to a state-of-the-art 3D pose estimator in terms of median error and Percentage of Correct Keypoints. Additionally, we benchmark the inference speed of 3D-MuPPET, with up to 9.45 fps in 2D and 1.89 fps in 3D, and perform quantitative tracking evaluation, which yields encouraging results. Finally, we showcase two novel applications for 3D-MuPPET. First, we train a model with data of single pigeons and achieve comparable results in 2D and 3D posture estimation for up to 5 pigeons. Second, we show that 3D-MuPPET also works outdoors without additional annotations from natural environments. Both use cases simplify the domain shift to new species and environments, largely reducing annotation effort needed for 3D posture tracking. To the best of our knowledge we are the first to present a framework for 2D/3D animal posture and trajectory tracking that works in both indoor and outdoor environments for up to 10 individuals. We hope that the framework can open up new opportunities in studying animal collective behaviour and encourages further developments in 3D multi-animal posture tracking.

Submitted 15 December, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2307.10447 [pdf, other]

doi 10.1109/TVCG.2023.3327149

Reducing Ambiguities in Line-based Density Plots by Image-space Colorization

Authors: Yumeng Xue, Patrick Paetzold, Rebecca Kehlbeck, Bin Chen, Kin Chung Kwan, Yunhai Wang, Oliver Deussen

Abstract: Line-based density plots are used to reduce visual clutter in line charts with a multitude of individual lines. However, these traditional density plots are often perceived ambiguously, which obstructs the user's identification of underlying trends in complex datasets. Thus, we propose a novel image space coloring method for line-based density plots that enhances their interpretability. Our method employs color not only to visually communicate data density but also to highlight similar regions in the plot, allowing users to identify and distinguish trends easily. We achieve this by performing hierarchical clustering based on the lines passing through each region and mapping the identified clusters to the hue circle using circular MDS. Additionally, we propose a heuristic approach to assign each line to the most probable cluster, enabling users to analyze density and individual lines. We motivate our method by conducting a small-scale user study, demonstrating the effectiveness of our method using synthetic and real-world datasets, and providing an interactive online tool for generating colored line-based density plots.

Submitted 22 November, 2023; v1 submitted 16 July, 2023; originally announced July 2023.

Comments: Published in IEEE Transactions on Visualization and Computer Graphics (Supplementary Material: https://osf.io/jm5yz/)

arXiv:2305.16225 [pdf, other]

ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

Authors: Yuxin Zhang, Weiming Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu

Abstract: Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to a lack of disentanglement and editability. To address this problem, we propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low to high frequency information, providing a new perspective on representing, generating, and editing images. We develop the Prompt Spectrum Space P*, an expanded textual conditioning space, and a new image representation method called ProSpect. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (i.e., a group of consecutive steps) of the diffusion model. Experimental results demonstrate that P* and ProSpect offer better disentanglement and controllability compared to existing methods. We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models. Our source code is available at https://github.com/zyxElsa/ProSpect.

Submitted 7 December, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

arXiv:2305.00604 [pdf, other]

ISAAC Newton: Input-based Approximate Curvature for Newton's Method

Authors: Felix Petersen, Tobias Sutter, Christian Borgelt, Dongsung Huh, Hilde Kuehne, Yuekai Sun, Oliver Deussen

Abstract: We present ISAAC (Input-baSed ApproximAte Curvature), a novel method that conditions the gradient using selected second-order information and has an asymptotically vanishing computational overhead, assuming a batch size smaller than the number of neurons. We show that it is possible to compute a good conditioner based on only the input to a respective layer without a substantial computational overhead. The proposed method allows effective training even in small-batch stochastic regimes, which makes it competitive to first-order as well as second-order methods.

Submitted 30 April, 2023; originally announced May 2023.

Comments: Published at ICLR 2023, Code @ https://github.com/Felix-Petersen/isaac, Video @ https://youtu.be/7RKRX-MdwqM

arXiv:2303.06257 [pdf, other]

A Problem Space for Designing Visualizations

Authors: Michael Gleicher, Maria Riveiro, Tatiana von Landesberger, Oliver Deussen, Remco Chang, Christina Gillman

Abstract: Visualization researchers and visualization professionals seek appropriate abstractions of visualization requirements that permit considering visualization solutions independently from specific problems. Abstractions can help us design, analyze, organize, and evaluate the things we create. The literature has many task structures (taxonomies, typologies, etc.), design spaces, and related "frameworks" that provide abstractions of the problems a visualization is meant to address. In this viewpoint, we introduce a different one, a problem space that complements existing frameworks by focusing on the needs that a visualization is meant to solve. We believe it provides a valuable conceptual tool for designing and discussing visualizations.

Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

Comments: Author's submitted version. An article with the same content was approved for publication by the Visualization Viewpoints Department of IEEE Computer Graphics and Applications magazine

arXiv:2303.03964 [pdf, other]

doi 10.1109/TVCG.2023.3238821

Force-Directed Graph Layouts Revisited: A New Force Based on the T-Distribution

Authors: Fahai Zhong, Mingliang Xue, Jian Zhang, Fan Zhang, Rui Ban, Oliver Deussen, Yunhai Wang

Abstract: In this paper, we propose the t-FDP model, a force-directed placement method based on a novel bounded short-range force (t-force) defined by Student's t-distribution. Our formulation is flexible, exerts limited repulsive forces for nearby nodes and can be adapted separately in its short- and long-range effects. Using such forces in force-directed graph layouts yields better neighborhood preservation than current methods, while maintaining low stress errors. Our efficient implementation using a Fast Fourier Transform is one order of magnitude faster than state-of-the-art methods and two orders faster on the GPU, enabling us to perform parameter tuning by globally and locally adjusting the t-force in real-time for complex graphs. We demonstrate the quality of our approach by numerical evaluation against state-of-the-art approaches and extensions for interactive exploration.

Submitted 4 March, 2023; originally announced March 2023.

Comments: To appear in IEEE Transactions on Visualization and Computer Graphics

arXiv:2302.05368 [pdf, other]

doi 10.1145/3544548.3580734

Interactive Context-Preserving Color Highlighting for Multiclass Scatterplots

Authors: Kecheng Lu, Khairi Reda, Oliver Deussen, Yunhai Wang

Abstract: Color is one of the main visual channels used for highlighting elements of interest in visualization. However, in multi-class scatterplots, color highlighting often comes at the expense of degraded color discriminability. In this paper, we argue for context-preserving highlighting during the interactive exploration of multi-class scatterplots to achieve desired pop-out effects, while maintaining good perceptual separability among all classes and consistent color mapping schemes under varying points of interest. We do this by first generating two contrastive color mapping schemes with large and small contrasts to the background. Both schemes maintain good perceptual separability among all classes and ensure that when colors from the two palettes are assigned to the same class, they have a high color consistency in color names. We then interactively combine these two schemes to create a dynamic color mapping for highlighting different points of interest. We demonstrate the effectiveness through crowd-sourced experiments and case studies.

Submitted 10 February, 2023; originally announced February 2023.

Comments: To appear in CHI'23: ACM Conference on Human Factors in Computing Systems

arXiv:2210.08277 [pdf, other]

Deep Differentiable Logic Gate Networks

Authors: Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen

Abstract: Recently, research has increasingly focused on developing efficient neural network architectures. In this work, we explore logic gate networks for machine learning tasks by learning combinations of logic gates. These networks comprise logic gates such as "AND" and "XOR", which allow for very fast execution. The difficulty in learning logic gate networks is that they are conventionally non-differentiable and therefore do not allow training with gradient descent. Thus, to allow for effective training, we propose differentiable logic gate networks, an architecture that combines real-valued logics and a continuously parameterized relaxation of the network. The resulting discretized logic gate networks achieve fast inference speeds, e.g., beyond a million images of MNIST per second on a single CPU core.

Submitted 15 October, 2022; originally announced October 2022.

Comments: Published at NeurIPS 2022
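
The two ingredients named in the abstract above, real-valued logics and a continuously parameterized relaxation, can be sketched for a single gate. The specific choices here are assumptions made for illustration, not details confirmed by the abstract: the gates use the probabilistic relaxations (e.g. AND as a product), only three gate types are included, and the choice among gates is relaxed with a softmax over hypothetical learnable logits.

```python
import math

# Real-valued relaxations that agree with the Boolean gates on {0, 1}
GATES = {
    "AND": lambda a, b: a * b,
    "OR": lambda a, b: a + b - a * b,
    "XOR": lambda a, b: a + b - 2 * a * b,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate(a, b, logits):
    """Training-time output: softmax-weighted mixture of all gate outputs."""
    weights = softmax(logits)
    return sum(w * gate(a, b) for w, gate in zip(weights, GATES.values()))


def hard_gate(logits):
    """Inference-time discretization: keep only the most likely gate."""
    names = list(GATES)
    return names[max(range(len(logits)), key=logits.__getitem__)]


print(soft_gate(1.0, 0.0, [0.0, 0.0, 0.0]))
print(hard_gate([0.1, 2.0, -1.0]))
```

Because `soft_gate` is a smooth function of both the inputs and the logits, it admits gradients, while `hard_gate` corresponds to the discretized network that the abstract credits with fast inference.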

arXiv:2206.07290 [pdf, other]

Differentiable Top-k Classification Learning

Authors: Felix Petersen, Hilde Kuehne, Christian Borgelt, Oliver Deussen

Abstract: The top-k classification accuracy is one of the core metrics in machine learning. Here, k is conventionally a positive integer, such as 1 or 5, leading to top-1 or top-5 training objectives. In this work, we relax this assumption and optimize the model for multiple k simultaneously instead of using a single k. Leveraging recent advances in differentiable sorting and ranking, we propose a differentiable top-k cross-entropy classification loss. This allows training the network while not only considering the top-1 prediction, but also, e.g., the top-2 and top-5 predictions. We evaluate the proposed loss function for fine-tuning on state-of-the-art architectures, as well as for training from scratch. We find that relaxing k does not only produce better top-5 accuracies, but also leads to top-1 accuracy improvements. When fine-tuning publicly available ImageNet models, we achieve a new state-of-the-art for these models.

Submitted 15 June, 2022; originally announced June 2022.

Comments: Published at ICML 2022, Code @ https://github.com/Felix-Petersen/difftopk

arXiv:2204.13845 [pdf, other]

GenDR: A Generalized Differentiable Renderer

Authors: Felix Petersen, Bastian Goldluecke, Christian Borgelt, Oliver Deussen

Abstract: In this work, we present and study a generalized family of differentiable renderers. We discuss from scratch which components are necessary for differentiable rendering and formalize the requirements for each component. We instantiate our general differentiable renderer, which generalizes existing differentiable renderers like SoftRas and DIB-R, with an array of different smoothing distributions to cover a large spectrum of reasonable settings. We evaluate an array of differentiable renderer instantiations on the popular ShapeNet 3D reconstruction benchmark and analyze the implications of our results. Surprisingly, the simple uniform distribution yields the best overall results when averaged over 13 classes; in general, however, the optimal choice of distribution heavily depends on the task.

Submitted 28 April, 2022; originally announced April 2022.

Comments: Published at CVPR 2022, Code @ https://github.com/Felix-Petersen/gendr

arXiv:2203.09630 [pdf, other]

Monotonic Differentiable Sorting Networks

Authors: Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen

Abstract: Differentiable sorting algorithms allow training with sorting and ranking supervision, where only the ordering or ranking of samples is known. Various methods have been proposed to address this challenge, ranging from optimal transport-based differentiable Sinkhorn sorting algorithms to making classic sorting networks differentiable. One problem of current differentiable sorting methods is that they are non-monotonic. To address this issue, we propose a novel relaxation of conditional swap operations that guarantees monotonicity in differentiable sorting networks. We introduce a family of sigmoid functions and prove that they produce differentiable sorting networks that are monotonic. Monotonicity ensures that the gradients always have the correct sign, which is an advantage in gradient-based optimization. We demonstrate that monotonic differentiable sorting networks improve upon previous differentiable sorting methods.

Submitted 17 March, 2022; originally announced March 2022.

Comments: Published at ICLR 2022, Code @ https://github.com/Felix-Petersen/diffsort, Video @ https://www.youtube.com/watch?v=Rl-sFaE1z4M

arXiv:2110.10784 [pdf, other]

Style Agnostic 3D Reconstruction via Adversarial Style Transfer

Authors: Felix Petersen, Bastian Goldluecke, Oliver Deussen, Hilde Kuehne

Abstract: Reconstructing the 3D geometry of an object from an image is a major challenge in computer vision. Recently introduced differentiable renderers can be leveraged to learn the 3D geometry of objects from 2D images, but those approaches require additional supervision to enable the renderer to produce an output that can be compared to the input image. This can be scene information or constraints such as object silhouettes, uniform backgrounds, material, texture, and lighting. In this paper, we propose an approach that enables a differentiable rendering-based learning of 3D objects from images with backgrounds without the need for silhouette supervision. Instead of trying to render an image close to the input, we propose an adversarial style-transfer and domain adaptation pipeline that allows to translate the input image domain to the rendered image domain. This allows us to directly compare between a translated image and the differentiable rendering of a 3D object reconstruction in order to train the 3D object reconstruction network. We show that the approach learns 3D geometry from images with backgrounds and provides a better performance than constrained methods for single-view 3D object reconstruction on this task.

Submitted 20 October, 2021; originally announced October 2021.

Comments: To be published at WACV 2022, Code @ https://github.com/Felix-Petersen/style-agnostic-3d-reconstruction

arXiv:2110.05651 [pdf, other]

Learning with Algorithmic Supervision via Continuous Relaxations

Authors: Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen

Abstract: The integration of algorithmic components into neural architectures has gained increased attention recently, as it allows training neural networks with new forms of supervision such as ordering constraints or silhouettes instead of using ground truth labels. Many approaches in the field focus on the continuous relaxation of a specific task and show promising results in this context. But the focus on single tasks also limits the applicability of the proposed concepts to a narrow range of applications. In this work, we build on those ideas to propose an approach that allows to integrate algorithms into end-to-end trainable neural network architectures based on a general approximation of discrete conditions. To this end, we relax these conditions in control structures such as conditional statements, loops, and indexing, so that resulting algorithms are smoothly differentiable. To obtain meaningful gradients, each relevant variable is perturbed via logistic distributions and the expectation value under this perturbation is approximated. We evaluate the proposed continuous relaxation model on four challenging tasks and show that it can keep up with relaxations specifically designed for each individual task.

Submitted 25 October, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

Comments: Published at NeurIPS 2021, Code @ https://github.com/Felix-Petersen/algovision, Video @ https://www.youtube.com/watch?v=01ENzpkjOCE

arXiv:2108.03529 [pdf, other]

SpEuler: Semantics-preserving Euler Diagrams

Authors: Rebecca Kehlbeck, Jochen Görtler, Yunhai Wang, Oliver Deussen

Abstract: Creating comprehensible visualizations of highly overlapping set-typed data is a challenging task due to its complexity. To facilitate insights into set connectivity and to leverage semantic relations between intersections, we propose a fast two-step layout technique for Euler diagrams that are both well-matched and well-formed. Our method conforms to established form guidelines for Euler diagrams regarding semantics, aesthetics, and readability. First, we establish an initial ordering of the data, which we then use to incrementally create a planar, connected, and monotone dual graph representation. In the next step, the graph is transformed into a circular layout that maintains the semantics and yields simple Euler diagrams with smooth curves. When the data cannot be represented by simple diagrams, our algorithm always falls back to a solution that is not well-formed but still well-matched, whereas previous methods often fail to produce expected results. We show the usefulness of our method for visualizing set-typed data using examples from text analysis and infographics. Furthermore, we discuss the characteristics of our approach and evaluate our method against state-of-the-art methods.

Submitted 7 August, 2021; originally announced August 2021.

arXiv:2105.04019 [pdf, other]

Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision

Authors: Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen

Abstract: Sorting and ranking supervision is a method for training neural networks end-to-end based on ordering constraints. That is, the ground truth order of sets of samples is known, while their absolute values remain unsupervised. For that, we propose differentiable sorting networks by relaxing their pairwise conditional swap operations. To address the problems of vanishing gradients and extensive blurring that arise with larger numbers of layers, we propose mapping activations to regions with moderate gradients. We consider odd-even as well as bitonic sorting networks, which outperform existing relaxations of the sorting operation. We show that bitonic sorting networks can achieve stable training on large input sets of up to 1024 elements.

Submitted 14 July, 2021; v1 submitted 9 May, 2021; originally announced May 2021.

Comments: Published at ICML 2021, Code @ https://github.com/Felix-Petersen/diffsort, Video @ https://www.youtube.com/watch?v=38dvqdYEs1o

Journal ref: PMLR 139:8546-8555, 2021

arXiv:2103.02380 [pdf, other]

doi 10.1109/TVCG.2021.3052167

Shape-driven Coordinate Ordering for Star Glyph Sets via Reinforcement Learning

Authors: Ruizhen Hu, Bin Chen, Juzhan Xu, Oliver van Kaick, Oliver Deussen, Hui Huang

Abstract: We present a neural optimization model trained with reinforcement learning to solve the coordinate ordering problem for sets of star glyphs. Given a set of star glyphs associated to multiple class labels, we propose to use shape context descriptors to measure the perceptual distance between pairs of glyphs, and use the derived silhouette coefficient to measure the perception of class separability within the entire set. To find the optimal coordinate order for the given set, we train a neural network using reinforcement learning to reward orderings with high silhouette coefficients. The network consists of an encoder and a decoder with an attention mechanism. The encoder employs a recurrent neural network (RNN) to encode input shape and class information, while the decoder together with the attention mechanism employs another RNN to output a sequence with the new coordinate order. In addition, we introduce a neural network to efficiently estimate the similarity between shape context descriptors, which allows to speed up the computation of silhouette coefficients and thus the training of the axis ordering network. Two user studies demonstrate that the orders provided by our method are preferred by users for perceiving class separation. We tested our model on different settings to show its robustness and generalization abilities and demonstrate that it allows to order input sets with unseen data size, data dimension, or number of classes. We also demonstrate that our model can be adapted to coordinate ordering of other types of plots such as RadViz by replacing the proposed shape-aware silhouette coefficient with the corresponding quality metric to guide network training.

Submitted 3 March, 2021; originally announced March 2021.

Journal ref: IEEE Transactions on Visualization and Computer Graphics 2021

arXiv:2009.02969 [pdf, other]

Palettailor: Discriminable Colorization for Categorical Data

Authors: Kecheng Lu, Mi Feng, Xin Chen, Michael Sedlmair, Oliver Deussen, Dani Lischinski, Zhanglin Cheng, Yunhai Wang

Abstract: We present an integrated approach for creating and assigning color palettes to different visualizations such as multi-class scatterplots, line, and bar charts. While other methods separate the creation of colors from their assignment, our approach takes data characteristics into account to produce color palettes, which are then assigned in a way that fosters better visual discrimination of classes. To do so, we use a customized optimization based on simulated annealing to maximize the combination of three carefully designed color scoring functions: point distinctness, name difference, and color discrimination. We compare our approach to state-of-the-art palettes with a controlled user study for scatterplots and line charts, furthermore we performed a case study. Our results show that Palettailor, as a fully-automated approach, generates color palettes with a higher discrimination quality than existing approaches. The efficiency of our optimization allows us also to incorporate user modifications into the color selection process.

Submitted 7 September, 2020; originally announced September 2020.

Comments: 10 pages

arXiv:2008.05567 [pdf, other]

Procedural Urban Forestry

Authors: Till Niese, Sören Pirk, Matthias Albrecht, Bedrich Benes, Oliver Deussen

Abstract: The placement of vegetation plays a central role in the realism of virtual scenes. We introduce procedural placement models (PPMs) for vegetation in urban layouts. PPMs are environmentally sensitive to city geometry and allow identifying plausible plant positions based on structural and functional zones in an urban layout. PPMs can either be directly used by defining their parameters or can be learned from satellite images and land register data. Together with approaches for generating buildings and trees, this allows us to populate urban landscapes with complex 3D vegetation. The effectiveness of our framework is shown through examples of large-scale city scenes and close-ups of individually grown tree models; we also validate it by a perceptual user study.

Submitted 13 August, 2020; v1 submitted 12 August, 2020; originally announced August 2020.

Comments: 14 pages

arXiv:1908.00475 [pdf, other]

Semantic Concept Spaces: Guided Topic Model Refinement using Word-Embedding Projections

Authors: Mennatallah El-Assady, Rebecca Kehlbeck, Christopher Collins, Daniel Keim, Oliver Deussen

Abstract: We present a framework that allows users to incorporate the semantics of their domain knowledge for topic model refinement while remaining model-agnostic. Our approach enables users to (1) understand the semantic space of the model, (2) identify regions of potential conflicts and problems, and (3) readjust the semantic relation of concepts based on their understanding, directly influencing the topic modeling. These tasks are supported by an interactive visual analytics workspace that uses word-embedding projections to define concept regions which can then be refined. The user-refined concepts are independent of a particular document collection and can be transferred to related corpora. All user interactions within the concept space directly affect the semantic relations of the underlying vector space model, which, in turn, change the topic modeling. In addition to direct manipulation, our system guides the users' decision-making process through recommended interactions that point out potential improvements. This targeted refinement aims at minimizing the feedback required for an efficient human-in-the-loop process. We confirm the improvements achieved through our approach in two user studies that show topic model quality improvements through our visual knowledge externalization and learning process.

Submitted 1 August, 2019; originally announced August 2019.

Journal ref: IEEE Transactions on Visualization and Computer Graphics, 2019

arXiv:1905.06886 [pdf, other]

AlgoNet: $C^\infty$ Smooth Algorithmic Neural Networks

Authors: Felix Petersen, Christian Borgelt, Oliver Deussen

Abstract: Artificial neural networks revolutionized many areas of computer science in recent years since they provide solutions to a number of previously unsolved problems. On the other hand, for many problems, classic algorithms exist, which typically exceed the accuracy and stability of neural networks. To combine these two concepts, we present a new kind of neural networks$-$algorithmic neural networks (AlgoNets). These networks integrate smooth versions of classic algorithms into the topology of neural networks. A forward AlgoNet includes algorithmic layers into existing architectures while a backward AlgoNet can solve inverse problems without or with only weak supervision. In addition, we present the $\texttt{algonet}$ package, a PyTorch based library that includes, inter alia, a smoothly evaluated programming language, a smooth 3D mesh renderer, and smooth sorting algorithms.

Submitted 23 May, 2019; v1 submitted 16 May, 2019; originally announced May 2019.

Comments: preprint, 9 pages

arXiv:1905.01127 [pdf, other]

doi 10.1109/TVCG.2019.2934812

Uncertainty-Aware Principal Component Analysis

Authors: Jochen Görtler, Thilo Spinner, Dirk Streeb, Daniel Weiskopf, Oliver Deussen

Abstract: We present a technique to perform dimensionality reduction on data that is subject to uncertainty. Our method is a generalization of traditional principal component analysis (PCA) to multivariate probability distributions. In comparison to non-linear methods, linear dimensionality reduction techniques have the advantage that the characteristics of such probability distributions remain intact after projection. We derive a representation of the PCA sample covariance matrix that respects potential uncertainty in each of the inputs, building the mathematical foundation of our new method: uncertainty-aware PCA. In addition to the accuracy and performance gained by our approach over sampling-based strategies, our formulation allows us to perform sensitivity analysis with regard to the uncertainty in the data. For this, we propose factor traces as a novel visualization that enables to better understand the influence of uncertainty on the chosen principal components. We provide multiple examples of our technique using real-world datasets. As a special case, we show how to propagate multivariate normal distributions through PCA in closed form. Furthermore, we discuss extensions and limitations of our approach.

Submitted 1 August, 2019; v1 submitted 3 May, 2019; originally announced May 2019.

Journal ref: IEEE Transactions on Visualization and Computer Graphics, 2020

arXiv:1903.11149 [pdf, other]

Pix2Vex: Image-to-Geometry Reconstruction using a Smooth Differentiable Renderer

Authors: Felix Petersen, Amit H. Bermano, Oliver Deussen, Daniel Cohen-Or

Abstract: The long-coveted task of reconstructing 3D geometry from images is still a standing problem. In this paper, we build on the power of neural networks and introduce Pix2Vex, a network trained to convert camera-captured images into 3D geometry. We present a novel differentiable renderer ($DR$) as a forward validation means during training. Our key insight is that $DR$s produce images of a particular appearance, different from typical input images. Hence, we propose adding an image-to-image translation component, converting between these rendering styles. This translation closes the training loop, while allowing to use minimal supervision only, without needing any 3D model as ground truth. Unlike state-of-the-art methods, our $DR$ is $C^\infty$ smooth and thus does not display any discontinuities at occlusions or dis-occlusions. Through our novel training scheme, our network can train on different types of images, where previous work can typically only train on images of a similar appearance to those rendered by a $DR$.

Submitted 26 May, 2019; v1 submitted 26 March, 2019; originally announced March 2019.

Showing 1–28 of 28 results for author: Deussen, O