(Translated by https://www.hiragana.jp/)
Equivalence and Similarity Refutation for Probabilistic Programs | Proceedings of the ACM on Programming Languages

research-article

Open access

Equivalence and Similarity Refutation for Probabilistic Programs

Authors:

Krishnendu Chatterjee,

Ehsan Kafshdar Goharshady,

Đorđe ŽikelićAuthors Info & Claims

Proceedings of the ACM on Programming Languages, Volume 8, Issue PLDI

Article No.: 232, Pages 2098 - 2122

https://doi.org/10.1145/3656462

Published: 20 June 2024 Publication History

Abstract

We consider the problems of statically refuting equivalence and similarity of output distributions defined by a pair of probabilistic programs. Equivalence and similarity are two fundamental relational properties of probabilistic programs that are essential for their correctness both in implementation and in compilation. In this work, we present a new method for static equivalence and similarity refutation. Our method refutes equivalence and similarity by computing a function over program outputs whose expected value with respect to the output distributions of two programs is different. The function is computed simultaneously with an upper expectation supermartingale and a lower expectation submartingale for the two programs, which we show to together provide a formal certificate for refuting equivalence and similarity. To the best of our knowledge, our method is the first approach to relational program analysis to offer the combination of the following desirable features: (1) it is fully automated, (2) it is applicable to infinite-state probabilistic programs, and (3) it provides formal guarantees on the correctness of its results. We implement a prototype of our method and our experiments demonstrate the effectiveness of our method to refute equivalence and similarity for a number of examples collected from the literature.

References

[1]

Sheshansh Agrawal, Krishnendu Chatterjee, and Petr Novotný. 2018. Lexicographic ranking supermartingales: an efficient approach to termination of probabilistic programs. Proc. ACM Program. Lang., 2, POPL (2018).

Digital Library

[2]

Alejandro Aguirre, Gilles Barthe, Justin Hsu, Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Christoph Matheja. 2021. A pre-expectation calculus for probabilistic sensitivity. Proc. ACM Program. Lang., 5, POPL (2021).

Digital Library

[3]

Alejandro Aguirre, Gilles Barthe, Justin Hsu, and Alexandra Silva. 2018. Almost Sure Productivity. In ICALP.

[4]

Aws Albarghouthi and Justin Hsu. 2018. Synthesizing coupling proofs of differential privacy. Proc. ACM Program. Lang., 2, POPL (2018).

Digital Library

[5]

Ali Asadi, Krishnendu Chatterjee, Hongfei Fu, Amir Kafshdar Goharshady, and Mohammad Mahdavi. 2021. Polynomial reachability witnesses via Stellensätze. In PLDI.

[6]

Martin Avanzini, Georg Moser, and Michael Schaper. 2020. A modular cost analysis for probabilistic programs. Proc. ACM Program. Lang., 4, OOPSLA (2020).

Digital Library

[7]

Jialu Bao, Nitesh Trivedi, Drashti Pathak, Justin Hsu, and Subhajit Roy. 2022. Data-Driven Invariant Learning for Probabilistic Programs. In CAV.

[8]

Gilles Barthe, Thomas Espitau, Benjamin Grégoire, Justin Hsu, and Pierre-Yves Strub. 2018. Proving expected sensitivity of probabilistic programs. Proc. ACM Program. Lang., 2, POPL (2018).

Digital Library

[9]

Gilles Barthe, Marco Gaboardi, Benjamin Grégoire, Justin Hsu, and Pierre-Yves Strub. 2016. Proving Differential Privacy via Probabilistic Couplings. In LICS.

[10]

Gilles Barthe, Marco Gaboardi, Justin Hsu, and Benjamin C. Pierce. 2016. Programming language techniques for differential privacy. ACM SIGLOG News, 3, 1 (2016).

Digital Library

[11]

Gilles Barthe, Benjamin Grégoire, and Santiago Zanella Béguelin. 2009. Formal certification of code-based cryptographic proofs. In POPL.

[12]

Gilles Barthe, Charlie Jacomme, and Steve Kremer. 2022. Universal Equivalence and Majority of Probabilistic Programs over Finite Fields. ACM Trans. Comput. Log., 23, 1 (2022).

Digital Library

[13]

Gilles Barthe, Joost-Pieter Katoen, and Alexandra Silva. 2020. Foundations of probabilistic programming. Cambridge University Press.

[14]

Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. 2013. Testing Closeness of Discrete Distributions. J. ACM, 60, 1 (2013).

Digital Library

[15]

Kevin Batz, Mingshuai Chen, Sebastian Junges, Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Christoph Matheja. 2023. Probabilistic Program Verification via Inductive Synthesis of Inductive Invariants. In TACAS.

[16]

Raven Beutner, C.-H. Luke Ong, and Fabian Zaiser. 2022. Guaranteed bounds for posterior inference in universal probabilistic programming. In PLDI.

[17]

Eli Bingham, Jonathan P. Chen, Martin Jankowiak, Fritz Obermeyer, Neeraj Pradhan, Theofanis Karaletsos, Rohit Singh, Paul A. Szerlip, Paul Horsfall, and Noah D. Goodman. 2019. Pyro: Deep Universal Probabilistic Programming. J. Mach. Learn. Res., 20 (2019).

[18]

Clément L Canonne. 2020. A survey on distribution testing: Your data is big. But is it blue? Theory of Computing.

[19]

Aleksandar Chakarov and Sriram Sankaranarayanan. 2013. Probabilistic Program Analysis with Martingales. In CAV.

[20]

Sourav Chakraborty and Kuldeep S. Meel. 2019. On Testing of Uniform Samplers. In AAAI.

[21]

Siu-on Chan, Ilias Diakonikolas, Paul Valiant, and Gregory Valiant. 2014. Optimal Algorithms for Testing Closeness of Discrete Distributions. In SODA.

[22]

Krishnendu Chatterjee, Hongfei Fu, and Amir Kafshdar Goharshady. 2016. Termination Analysis of Probabilistic Programs Through Positivstellensatz’s. In CAV.

[23]

Krishnendu Chatterjee, Hongfei Fu, Amir Kafshdar Goharshady, and Ehsan Kafshdar Goharshady. 2020. Polynomial invariant generation for non-deterministic recursive programs. In PLDI.

[24]

Krishnendu Chatterjee, Hongfei Fu, Petr Novotný, and Rouzbeh Hasheminezhad. 2018. Algorithmic Analysis of Qualitative and Quantitative Termination Problems for Affine Probabilistic Programs. ACM Trans. Program. Lang. Syst., 40, 2 (2018).

Digital Library

[25]

Krishnendu Chatterjee, Amir Kafshdar Goharshady, Tobias Meggendorfer, and Dorde Zikelic. 2022. Sound and Complete Certificates for Quantitative Termination Analysis of Probabilistic Programs. In CAV.

[26]

Krishnendu Chatterjee, Ehsan Kafshdar Goharshady, Petr Novotný, Jiri Zárevúcky, and Dorde Zikelic. 2021. On Lexicographic Proof Rules for Probabilistic Termination. In FM.

[27]

Krishnendu Chatterjee, Ehsan Kafshdar Goharshady, Petr Novotný, and Đorđe Žikelić. 2024. Equivalence and Similarity Refutation for Probabilistic Programs. arxiv:2404.03430.

[28]

Krishnendu Chatterjee, Petr Novotný, and Dorde Zikelic. 2017. Stochastic invariants for probabilistic termination. In POPL.

[29]

Swarat Chaudhuri, Sumit Gulwani, and Roberto Lublinerman. 2010. Continuity analysis of programs. In POPL.

[30]

Mingshuai Chen, Joost-Pieter Katoen, Lutz Klinkenberg, and Tobias Winkler. 2022. Does a Program Yield the Right Distribution? - Verifying Probabilistic Programs via Generating Functions. In CAV.

[31]

Taolue Chen and Stefan Kiefer. 2014. On the Total Variation Distance of Labelled Markov Chains. In CSL-LICS.

[32]

Ezgi Çiçek, Gilles Barthe, Marco Gaboardi, Deepak Garg, and Jan Hoffmann. 2017. Relational cost analysis. In POPL.

[33]

Ryan Culpepper and Andrew Cobb. 2017. Contextual equivalence for probabilistic programs with continuous random variables and scoring. In ESOP.

[34]

Marco F. Cusumano-Towner and Vikash K. Mansinghka. 2017. AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms. In NIPS.

[35]

Yuxin Deng and Wenjie Du. 2009. The Kantorovich metric in computer science: A brief survey. Electronic Notes in Theoretical Computer Science, 253, 3 (2009).

[36]

Justin Domke. 2021. An Easy to Interpret Diagnostic for Approximate Inference: Symmetric Divergence Over Simulations. CoRR, abs/2103.01030 (2021).

[37]

Saikat Dutta, Owolabi Legunsen, Zixin Huang, and Sasa Misailovic. 2018. Testing probabilistic programming systems. In FSE.

[38]

Saikat Dutta, Wenxian Zhang, Zixin Huang, and Sasa Misailovic. 2019. Storm: program reduction for testing and debugging probabilistic programming systems. In FSE.

[39]

Paul Feautrier and Laure Gonnord. 2010. Accelerated Invariant Generation for C Programs with Aspic and C2fsm. In TAPAS@SAS.

[40]

Dennis Felsing, Sarah Grebing, Vladimir Klebanov, Philipp Rümmer, and Mattias Ulbrich. 2014. Automating regression verification. In ASE.

[41]

Nate Foster, Dexter Kozen, Konstantinos Mamouras, Mark Reitblatt, and Alexandra Silva. 2016. Probabilistic NetKAT. In ESOP.

[42]

Timon Gehr, Sasa Misailovic, and Martin T. Vechev. 2016. PSI: Exact Symbolic Inference for Probabilistic Programs. In CAV.

[43]

Andrew Gelman, Daniel Lee, and Jiqiang Guo. 2015. Stan: A probabilistic programming language for Bayesian inference and optimization. Journal of Educational and Behavioral Statistics, 40, 5 (2015).

[44]

Zoubin Ghahramani. 2015. Probabilistic machine learning and artificial intelligence. Nat., 521, 7553 (2015), 452–459.

[45]

Benny Godlin and Ofer Strichman. 2013. Regression verification: proving the equivalence of similar programs. Softw. Test. Verification Reliab., 23, 3 (2013).

[46]

Noah D. Goodman, Vikash K. Mansinghka, Daniel M. Roy, Kallista A. Bonawitz, and Joshua B. Tenenbaum. 2008. Church: a language for generative models. In UAI.

Digital Library

[47]

Andrew D. Gordon, Thomas A. Henzinger, Aditya V. Nori, and Sriram K. Rajamani. 2014. Probabilistic programming. In FOSE.

[48]

Roger B. Grosse, Siddharth Ancha, and Daniel M. Roy. 2016. Measuring the reliability of MCMC inference with bidirectional Monte Carlo. In NIPS.

[49]

Roger B. Grosse, Zoubin Ghahramani, and Ryan P. Adams. 2015. Sandwiching the marginal likelihood using bidirectional Monte Carlo. CoRR, abs/1511.02543 (2015).

[50]

Gurobi Optimization, LLC. 2023. Gurobi Optimizer Reference Manual. https://www.gurobi.com

[51]

David Handelman. 1988. Representing polynomials by positive linear functions on compact convex polyhedra. Pacific J. Math., 132, 1 (1988).

[52]

Marcel Hark, Benjamin Lucien Kaminski, Jürgen Giesl, and Joost-Pieter Katoen. 2020. Aiming low is harder: induction for lower bounds in probabilistic program verification. Proc. ACM Program. Lang., 4, POPL (2020).

Digital Library

[53]

Leen Helmink, M. P. A. Sellink, and Frits W. Vaandrager. 1993. Proof-Checking a Data Link Protocol. In TYPES.

[54]

Zixin Huang, Zhenbang Wang, and Sasa Misailovic. 2018. PSense: Automatic Sensitivity Analysis for Probabilistic Programs. In ATVA.

[55]

Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Christoph Matheja. 2019. On the hardness of analyzing probabilistic programs. Acta Informatica, 56, 3 (2019).

Digital Library

[56]

Benjamin Lucien Kaminski, Joost-Pieter Katoen, Christoph Matheja, and Federico Olmedo. 2018. Weakest Precondition Reasoning for Expected Runtimes of Randomized Algorithms. J. ACM, 65, 5 (2018).

Digital Library

[57]

Stefan Kiefer. 2018. On Computing the Total Variation Distance of Hidden Markov Models. In ICALP.

[58]

Stefan Kiefer, Andrzej S. Murawski, Joël Ouaknine, Björn Wachter, and James Worrell. 2011. Language Equivalence for Probabilistic Automata. In CAV.

[59]

Stefan Kiefer and Qiyi Tang. 2020. Comparing Labelled Markov Decision Processes. In FSTTCS.

[60]

Satoshi Kura, Natsuki Urabe, and Ichiro Hasuo. 2019. Tail Probabilities for Randomized Program Runtimes via Martingales for Higher Moments. In TACAS.

[61]

Kim G. Larsen and Arne Skou. 1991. Bisimulation through probabilistic testing. Information and Computation, 94, 1 (1991).

[62]

Axel Legay, Andrzej S. Murawski, Joël Ouaknine, and James Worrell. 2008. On Automated Verification of Probabilistic Programs. In TACAS.

[63]

Annabelle McIver, Carroll Morgan, Benjamin Lucien Kaminski, and Joost-Pieter Katoen. 2018. A new proof rule for almost-sure termination. Proc. ACM Program. Lang., 2, POPL (2018).

Digital Library

[64]

Sean P Meyn and Richard L Tweedie. 2012. Markov chains and stochastic stability. Springer Science & Business Media.

[65]

Andrzej S. Murawski and Joël Ouaknine. 2005. On Probabilistic Program Equivalence and Refinement. In CONCUR.

[66]

Chandrakana Nandi, Dan Grossman, Adrian Sampson, Todd Mytkowicz, and Kathryn S. McKinley. 2017. Debugging probabilistic programs. In MAPL@PLDI.

[67]

Van Chan Ngo, Quentin Carbonneaux, and Jan Hoffmann. 2018. Bounded expectations: resource analysis for probabilistic programs. In PLDI.

[68]

Federico Olmedo, Friedrich Gretz, Nils Jansen, Benjamin Lucien Kaminski, Joost-Pieter Katoen, and Annabelle McIver. 2018. Conditioning in Probabilistic Programming. ACM Trans. Program. Lang. Syst., 40, 1 (2018).

Digital Library

[69]

David Park. 1969. Fixpoint induction and proofs of program properties. Machine intelligence, 5 (1969).

[70]

Nimrod Partush and Eran Yahav. 2013. Abstract Semantic Differencing for Numerical Programs. In SAS.

[71]

Nimrod Partush and Eran Yahav. 2014. Abstract semantic differencing via speculative correlation. In OOPSLA.

[72]

Weihao Qu, Marco Gaboardi, and Deepak Garg. 2021. Relational cost analysis in a functional-imperative setting.

[73]

Sriram Sankaranarayanan, Aleksandar Chakarov, and Sumit Gulwani. 2013. Static analysis for probabilistic programs: inferring whole program properties from finitely many paths. In PLDI.

[74]

Sriram Sankaranarayanan, Henny B. Sipma, and Zohar Manna. 2004. Constraint-Based Linear-Relations Analysis. In SAS.

[75]

Toru Takisaka, Yuichiro Oyabu, Natsuki Urabe, and Ichiro Hasuo. 2021. Ranking and Repulsing Supermartingales for Reachability in Randomized Programs. ACM Trans. Program. Lang. Syst., 43, 2 (2021).

Digital Library

[76]

Sebastian Thrun. 2000. Probabilistic Algorithms in Robotics. AI Mag., 21, 4 (2000).

[77]

Mathieu Tracol, Josée Desharnais, and Abir Zhioua. 2011. Computing Distances between Probabilistic Automata. In QAPL.

[78]

Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, and David M. Blei. 2017. Deep probabilistic programming. In ICLR.

[79]

Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. 2018. An Introduction to Probabilistic Programming. CoRR, abs/1809.10756 (2018), arxiv:1809.10756

[80]

Cédric Villani. 2021. Topics in optimal transportation. 58, American Mathematical Soc.

[81]

Di Wang, Jan Hoffmann, and Thomas W. Reps. 2021. Central moment analysis for cost accumulators in probabilistic programs. In PLDI.

[82]

Peixin Wang, Hongfei Fu, Krishnendu Chatterjee, Yuxin Deng, and Ming Xu. 2020. Proving expected sensitivity of probabilistic programs with randomized variable-dependent termination time. Proc. ACM Program. Lang., 4, POPL (2020).

Digital Library

[83]

Peixin Wang, Hongfei Fu, Amir Kafshdar Goharshady, Krishnendu Chatterjee, Xudong Qin, and Wenjun Shi. 2019. Cost analysis of nondeterministic probabilistic programs. In PLDI.

[84]

David Williams. 1991. Probability with Martingales. Cambridge University Press. isbn:978-0-521-40605-5

[85]

Wolfram Research, Inc. 2022. Mathematica 13.2. https://www.wolfram.com

[86]

Dorde Zikelic, Bor-Yuh Evan Chang, Pauline Bolignano, and Franco Raimondi. 2022. Differential cost analysis with simultaneous potentials and anti-potentials. In PLDI.

Index Terms

Equivalence and Similarity Refutation for Probabilistic Programs

Recommendations

Slicing of probabilistic programs based on specifications
Abstract
This paper presents the first slicing approach for probabilistic programs based on specifications. We show that when probabilistic programs are accompanied by their specifications in the form of pre- and post-condition, we can exploit ...
Highlights
- Probabilistic programs support slicing based on specification.
- The presence of ...
A Theory of Slicing for Imperative Probabilistic Programs

Dedicated to the memory of Sebastian Danicic.

We present a theory for slicing imperative probabilistic programs containing random assignments and “observe” statements for conditioning. We represent such programs as probabilistic control-flow graphs (...
Equivalence of Measures of Complexity Classes

The resource-bounded measures of complexity classes are shown to be robust with respect to certain changes in the underlying probability measure. Specifically, for any real number $\delta > 0$, any uniformly polynomial-time computable sequence $\mv{\...

Comments

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages

Proceedings of the ACM on Programming Languages Volume 8, Issue PLDI

June 2024

2198 pages

EISSN:2475-1421

DOI:10.1145/3554317

Editor:
Michael Hicks
Amazon, USA

Issue’s Table of Contents

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 June 2024

Published in PACMPL Volume 8, Issue PLDI

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Badges

Author Tags

Qualifiers

Research-article

Funding Sources

European Research Council
Czech Science Foundation

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
94
Total Downloads

Downloads (Last 12 months)94
Downloads (Last 6 weeks)45

Reflects downloads up to 15 Aug 2024

Other Metrics

View Author Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents