Search | arXiv e-print repository

doi 10.1063/5.0132706

High Gradient Testing of off-Axis Coupled C-band Cu and CuAg Accelerating Structures

Authors: Mitchell Schneider, Muhammed Zuboraj, Valery Dolgashev, John J Lewellen, Sami G Tantawi, Ryan Fleming, Dmitry Gorelov, Mark Middendorf, Emilio A Nanni, Evgenya I Simakov

Abstract: We report the results of high gradient testing of two single-cell, off-axis coupled standing wave accelerating structures. Two brazed standing wave side-coupled structures with the same geometry were tested: one made of pure copper (Cu) and one made of a copper-silver (CuAg) alloy with a silver concentration of 0.08%. A peak surface electric field of 450 MV/m was achieved in the CuAg structure for a klystron input power of 14.5 MW and a 1 μs pulse length, which was 25% higher than the peak surface electric field achieved in the Cu structure. The superb high gradient performance was achieved because of two major optimizations in the cavity geometry: (1) the shunt impedance of the cavity was maximized for a peak surface electric field to accelerating gradient ratio of 2 for a fully relativistic particle; (2) the peak magnetic field enhancement due to the input coupler was minimized to limit pulse heating. These tests allow us to conclude that C-band accelerating structures can operate at peak fields similar to those at higher frequencies while providing a larger beam iris for improved beam transport.

Submitted 30 October, 2022; originally announced October 2022.

arXiv:2004.08433 [pdf, ps, other]

A Weighted Population Update Rule for PACO Applied to the Single Machine Total Weighted Tardiness Problem

Authors: Daniel Abitz, Tom Hartmann, Martin Middendorf

Abstract: In this paper a new population update rule for population-based ant colony optimization (PACO) is proposed. PACO is a well-known alternative to the standard ant colony optimization algorithm. The new update rule allows different parts of the solutions to be weighted. PACO with the new update rule is evaluated on the single machine total weighted tardiness problem (SMTWTP). This is an $\mathcal{NP}$-hard optimization problem where the aim is to schedule jobs on a single machine such that their total weighted tardiness is minimized. PACO with the new population update rule is evaluated with several benchmark instances from the OR-Library. Moreover, the impact of the weights of the jobs on the solutions in the population and on the convergence of the algorithm is analyzed experimentally. The results show that PACO with the new update rule has on average better solution quality than PACO with the standard update rule.

Submitted 17 April, 2020; originally announced April 2020.

arXiv:2004.07118 [pdf, other]

Complete Edge-Colored Permutation Graphs

Authors: Tom Hartmann, Max Bannach, Martin Middendorf, Peter F. Stadler, Nicolas Wieseke, Marc Hellmuth

Abstract: We introduce the concept of complete edge-colored permutation graphs as complete graphs that are the edge-disjoint union of "classical" permutation graphs. We show that a graph $G=(V,E)$ is a complete edge-colored permutation graph if and only if each monochromatic subgraph of $G$ is a "classical" permutation graph and $G$ does not contain a triangle with $3$ different colors. Using the modular decomposition as a framework, we demonstrate that complete edge-colored permutation graphs are characterized in terms of their strong prime modules, which also induce complete edge-colored permutation graphs. This leads to an $\mathcal{O}(|V|^2)$-time recognition algorithm. We show, moreover, that complete edge-colored permutation graphs form a superclass of so-called symbolic ultrametrics and that the coloring of such graphs is always a Gallai coloring.

Submitted 15 April, 2020; originally announced April 2020.

arXiv:1712.06442 [pdf, ps, other]

doi 10.1073/pnas.1412770112

Phylogenomics with Paralogs

Authors: Marc Hellmuth, Nicolas Wieseke, Marcus Lechner, Hans-Peter Lenhof, Martin Middendorf, Peter F. Stadler

Abstract: Phylogenomics heavily relies on well-curated sequence data sets that consist, for each gene, exclusively of 1:1 orthologs. Paralogs are treated as a dangerous nuisance that has to be detected and removed. We show here that this severe restriction of the data sets is not necessary. Building upon recent advances in mathematical phylogenetics, we demonstrate that gene duplications convey meaningful phylogenetic information and allow the inference of plausible phylogenetic trees, provided orthologs and paralogs can be distinguished with a degree of certainty. Starting from tree-free estimates of orthology, cograph editing can sufficiently reduce the noise in order to find correct event-annotated gene trees. The information of gene trees can then directly be translated into constraints on the species trees. While the resolution is very poor for individual gene families, we show that genome-wide data sets are sufficient to generate fully resolved phylogenetic trees, even in the presence of horizontal gene transfer. We demonstrate that the distribution of paralogs in large gene families contains in itself sufficient phylogenetic signal to infer fully resolved species phylogenies. This source of phylogenetic information is independent of information contained in orthologous sequences and is resilient against horizontal gene transfer. An important consequence is that phylogenomics data sets need not be restricted to 1:1 orthologs.

Submitted 18 December, 2017; originally announced December 2017.

Journal ref: PNAS 2015 112 (7) 2058-2063

arXiv:1307.7831 [pdf, ps, other]

Unifying Parsimonious Tree Reconciliation

Authors: Nicolas Wieseke, Matthias Bernt, Martin Middendorf

Abstract: Evolution is a process that is influenced by various environmental factors, e.g., the interactions between different species, genes, and biogeographical properties. Hence, it is interesting to study the combined evolutionary history of multiple species, their genes, and the environment they live in. A common approach to address this research problem is to describe each individual evolution as a phylogenetic tree and construct a tree reconciliation which is parsimonious with respect to a given event model. Unfortunately, most of the previous approaches are designed only for host-parasite systems, for gene tree/species tree reconciliation, or for biogeography. Hence, a method is desirable which addresses the general problem of mapping phylogenetic trees and covers all varieties of coevolving systems, including, e.g., predator-prey and symbiotic relationships. To close this gap, we introduce a generalized cophylogenetic event model considering the combinatorially complete set of local coevolutionary events. We give a dynamic programming based heuristic for solving the maximum parsimony reconciliation problem in time O(n^2), for two phylogenies each with at most n leaves. Furthermore, we present an exact branch-and-bound algorithm which uses the results from the dynamic programming heuristic for discarding partial reconciliations. The approach has been implemented as a Java application which is freely available from http://pacosy.informatik.uni-leipzig.de/coresym.

Submitted 30 July, 2013; originally announced July 2013.

Comments: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)

arXiv:q-bio/0701021 [pdf, ps, other]

doi 10.1007/11415770_41

Motif Discovery through Predictive Modeling of Gene Regulation

Authors: Manuel Middendorf, Anshul Kundaje, Mihir Shah, Yoav Freund, Chris H. Wiggins, Christina Leslie

Abstract: We present MEDUSA, an integrative method for learning motif models of transcription factor binding sites by incorporating promoter sequence and gene expression data. We use a modern large-margin machine learning approach, based on boosting, to enable feature selection from the high-dimensional search space of candidate binding sequences while avoiding overfitting. At each iteration of the algorithm, MEDUSA builds a motif model whose presence in the promoter region of a gene, coupled with activity of a regulator in an experiment, is predictive of differential expression. In this way, we learn motifs that are functional and predictive of regulatory response rather than motifs that are simply overrepresented in promoter sequences. Moreover, MEDUSA produces a model of the transcriptional control logic that can predict the expression of any gene in the organism, given the sequence of the promoter region of the target gene and the expression state of a set of known or putative transcription factors and signaling molecules. Each motif model is either a $k$-length sequence, a dimer, or a PSSM that is built by agglomerative probabilistic clustering of sequences with similar boosting loss. By applying MEDUSA to a set of environmental stress response expression data in yeast, we learn motifs whose ability to predict differential expression of target genes outperforms motifs from the TRANSFAC dataset and from a previously published candidate set of PSSMs. We also show that MEDUSA retrieves many experimentally confirmed binding sites associated with environmental stress response from the literature.

Submitted 14 January, 2007; originally announced January 2007.

Comments: RECOMB 2005

Journal ref: Research in Computational Molecular Biology 2005

arXiv:q-bio/0411033 [pdf, ps, other]

doi 10.1103/PhysRevE.71.046117

An Information-Theoretic Approach to Network Modularity

Authors: Etay Ziv, Manuel Middendorf, Chris Wiggins

Abstract: Exploiting recent developments in information theory, we propose, illustrate, and validate a principled information-theoretic algorithm for module discovery and a resulting measure of network modularity. This measure is an order parameter (a dimensionless number between 0 and 1). Comparison is made to other approaches to module discovery and to quantifying network modularity using Monte Carlo generated Erdos-like modular networks. Finally, the Network Information Bottleneck (NIB) algorithm is applied to a number of real world networks, including the "social" network of coauthors at the APS March Meeting 2004.

Submitted 16 November, 2004; originally announced November 2004.

Comments: 13 pages, 8 figures

arXiv:q-bio/0411028 [pdf, ps, other]

Predicting Genetic Regulatory Response Using Classification

Authors: Manuel Middendorf, Anshul Kundaje, Chris Wiggins, Yoav Freund, Christina Leslie

Abstract: We present a novel classification-based method for learning to predict gene regulatory response. Our approach is motivated by the hypothesis that in simple organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular experiment based on (1) the presence of binding site subsequences ("motifs") in the gene's regulatory region and (2) the expression levels of regulators such as transcription factors in the experiment ("parents"). Thus our learning task integrates two qualitatively different data sources: genome-wide cDNA microarray data across multiple perturbation and mutant experiments along with motif profile data from regulatory sequences. We convert the regression task of predicting real-valued gene expression measurement to a classification task of predicting +1 and -1 labels, corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. The learning algorithm employed is boosting with a margin-based generalization of decision trees, alternating decision trees. This large-margin classifier is sufficiently flexible to allow complex logical functions, yet sufficiently simple to give insight into the combinatorial mechanisms of gene regulation. We observe encouraging prediction accuracy on experiments based on the Gasch S. cerevisiae dataset, and we show that we can accurately predict up- and down-regulation on held-out experiments. Our method thus provides predictive hypotheses, suggests biological experiments, and provides interpretable insight into the structure of genetic regulatory networks.

Submitted 12 November, 2004; originally announced November 2004.

Comments: 8 pages, 4 figures, presented at Twelfth International Conference on Intelligent Systems for Molecular Biology (ISMB 2004), supplemental website: http://www.cs.columbia.edu/compbio/geneclass

Journal ref: Proceedings of the Twelfth International Conference on Intelligent Systems for Molecular Biology (ISMB 2004), Bioinformatics 20 Suppl 1, I232-I240, 2004

arXiv:q-bio/0408010 [pdf, ps, other]

doi 10.1073/pnas.0409515102

Inferring Network Mechanisms: The Drosophila melanogaster Protein Interaction Network

Authors: Manuel Middendorf, Etay Ziv, Chris Wiggins

Abstract: Naturally occurring networks exhibit quantitative features revealing underlying growth mechanisms. Numerous network mechanisms have recently been proposed to reproduce specific properties such as degree distributions or clustering coefficients. We present a method for inferring the mechanism most accurately capturing a given network topology, exploiting discriminative tools from machine learning. The Drosophila melanogaster protein network is confidently and robustly (to noise and training data subsampling) classified as a duplication-mutation-complementation network over preferential attachment, small-world, and other duplication-mutation mechanisms. Systematic classification, rather than statistical study of specific properties, provides a discriminative approach to understand the design of complex networks.

Submitted 15 August, 2004; originally announced August 2004.

Comments: 19 pages, 5 figures

Journal ref: PNAS, Vol. 102, No. 9, pp. 3192-3197 (March 1, 2005)

arXiv:q-bio/0406016 [pdf, ps, other]

Predicting Genetic Regulatory Response using Classification: Yeast Stress Response

Authors: Manuel Middendorf, Anshul Kundaje, Chris Wiggins, Yoav Freund, Christina Leslie

Abstract: We present a novel classification-based algorithm called GeneClass for learning to predict gene regulatory response. Our approach is motivated by the hypothesis that in simple organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular experiment based on (1) the presence of binding site subsequences ("motifs") in the gene's regulatory region and (2) the expression levels of regulators such as transcription factors in the experiment ("parents"). Thus our learning task integrates two qualitatively different data sources: genome-wide cDNA microarray data across multiple perturbation and mutant experiments along with motif profile data from regulatory sequences. Rather than focusing on the regression task of predicting real-valued gene expression measurements, GeneClass performs the classification task of predicting +1 and -1 labels, corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. GeneClass uses the Adaboost learning algorithm with a margin-based generalization of decision trees called alternating decision trees. In computational experiments based on the Gasch S. cerevisiae dataset, we show that the GeneClass method predicts up- and down-regulation on held-out experiments with high accuracy. We explore a range of experimental setups related to environmental stress response, and we retrieve important regulators, binding site motifs, and relationships between regulators and binding sites that are known to be associated with specific stress response pathways. Our method thus provides predictive hypotheses, suggests biological experiments, and provides interpretable insight into the structure of genetic regulatory networks.

Submitted 8 June, 2004; v1 submitted 7 June, 2004; originally announced June 2004.

Comments: Supplementary website: http://www.cs.columbia.edu/compbio/geneclass

Journal ref: Proceedings of the First Annual RECOMB Regulation Workshop 2004

arXiv:q-bio/0402017 [pdf, ps, other]

Discriminative Topological Features Reveal Biological Network Mechanisms

Authors: Manuel Middendorf, Etay Ziv, Carter Adams, Jen Hom, Robin Koytcheff, Chaya Levovitz, Gregory Woods, Linda Chen, Chris Wiggins

Abstract: Recent genomic and bioinformatic advances have motivated the development of numerous random network models purporting to describe graphs of biological, technological, and sociological origin. The success of a model has been evaluated by how well it reproduces a few key features of the real-world data, such as degree distributions, mean geodesic lengths, and clustering coefficients. Often pairs of models can reproduce these features with indistinguishable fidelity despite being generated by vastly different mechanisms. In such cases, these few target features are insufficient to distinguish which of the different models best describes real world networks of interest; moreover, it is not clear a priori that any of the presently existing algorithms for network generation offers a predictive description of the networks inspiring them. To derive discriminative classifiers, we construct a mapping from the set of all graphs to a high-dimensional (in principle infinite-dimensional) "word space." This map defines an input space for classification schemes which allow us for the first time to state unambiguously which models are most descriptive of the networks they purport to describe. Our training sets include networks generated from 17 models either drawn from the literature or introduced in this work, source code for which is freely available. We anticipate that this new approach to network analysis will be of broad impact to a number of communities.

Submitted 9 February, 2004; originally announced February 2004.

Comments: supplemental website: http://www.columbia.edu/itc/applied/wiggins/netclass/

Journal ref: BMC Bioinformatics 2004, 5:181 (22 November 2004)

arXiv:cond-mat/0306610 [pdf, ps, other]

doi 10.1103/PhysRevE.71.016110

Systematic identification of statistically significant network measures

Authors: Etay Ziv, Robin Koytcheff, Manuel Middendorf, Chris Wiggins

Abstract: We present a novel graph embedding space (i.e., a set of measures on graphs) for performing statistical analyses of networks. Key improvements over existing approaches include discovery of "motif-hubs" (multiple overlapping significant subgraphs), computational efficiency relative to subgraph census, and flexibility (the method is easily generalizable to weighted and signed graphs). The embedding space is based on scalars, functionals of the adjacency matrix representing the network. Scalars are global, involving all nodes; although they can be related to subgraph enumeration, there is not a one-to-one mapping between scalars and subgraphs. Improvements in network randomization and significance testing (we learn the distribution rather than assuming Gaussianity) are also presented. The resulting algorithm establishes a systematic approach to the identification of the most significant scalars and suggests machine-learning techniques for network classification.

Submitted 27 January, 2005; v1 submitted 24 June, 2003; originally announced June 2003.

Comments: 19 pages, 12 figures

Journal ref: Phys. Rev. E 71, 016110 (2005)

Showing 1–12 of 12 results for author: Middendorf, M