Probabilistic Hoare logic (PHL) is an extension of Hoare logic that is particularly useful for verifying randomized programs. It allows researchers to reason formally about the behavior of programs with stochastic elements and to ensure that the desired probabilistic properties hold. The relative completeness of satisfaction-based PHL has been an open problem ever since the birth of the first PHL in 1979. More specifically, no satisfaction-based PHL with While-loop has yet been proven to be relatively complete. This paper solves this problem by establishing a new PHL with While-loop and proving its relative completeness. The programming language of our PHL is expressively equivalent to those of existing PHL systems, but it makes the completeness proof considerably more convenient. The weakest preterm for the While-loop command reveals how the loop changes the probabilistic properties of computer states, considering both execution branches that halt and infinite runs. We prove the relative completeness of our PHL in two steps. We first establish a semantics and proof system of Hoare triples with probabilistic programs and deterministic assertions. Then, by utilizing the weakest precondition of deterministic assertions, we construct the weakest preterm calculus of probabilistic expressions. The relative completeness of our PHL is then obtained as a consequence of the weakest preterm calculus.
1. Introduction
Hoare Logic. Hoare logic provides a formal system of logical rules for reasoning about the correctness of programs. It was originally designed by C. A. R. Hoare in 1969 in his seminal paper (Hoare, 1969), which he later extended in (Hoare, 1971a). The underlying idea is to capture the precondition and postcondition of executing a program. The precondition describes the property that the command relies on at the start. The postcondition describes the property that must hold after each correct execution of the command. Hoare logic has become one of the most influential tools in the formal verification of programs over the past decades. It has been successfully applied to the analysis of deterministic (Hoare, 1969, 1971a; Winskel, 1993), nondeterministic (Dijkstra, 1975, 1976; Apt, 1984), recursive (Hoare, 1971b; Foley and Hoare, 1971; Apt et al., 2009b), probabilistic (Ramshaw, 1979; Den Hartog and de Vink, 2002; Chadha et al., 2007; Rand and Zdancewic, 2015) and quantum programs (Ying, 2011; Liu et al., 2019; Unruh, 2019; Zhou et al., 2019; Deng and Feng, 2022). For a comprehensive review of Hoare logic, we refer the reader to Apt, de Boer, and Olderog (Apt et al., 2009a; Apt and Olderog, 2019).
Probabilistic Hoare Logic. Probabilistic Hoare logic (PHL) (Ramshaw, 1979; Den Hartog and de Vink, 2002; Chadha et al., 2007; Rand and Zdancewic, 2015) is an extension of Hoare logic. It introduces probabilistic commands to handle programs with randomized behavior, providing tools to derive probabilistic assertions that guarantee a program fulfills its intended behavior with certain probabilities. Nowadays PHL plays an important role in the formal verification of cryptographic algorithms (Corin and den Hartog, 2005; den Hartog, 2008; Barthe et al., 2009, 2012, 2013), machine learning algorithms (Sutskever et al., 2013; Srivastava et al., 2014) and other systems involving uncertainty.
Ramshaw (Ramshaw, 1979) developed the first Probabilistic Hoare Logic (PHL) using a truth-functional assertion language, where logic formulas are interpreted as either true or false. This type of PHL is called satisfaction-based PHL within the Hoare logic community. There are two types of formulas in this logic: deterministic formulas and probabilistic formulas. The truth value of deterministic formulas is interpreted on program states, which are functions that map program variables to their values. On the other hand, the truth value of probabilistic formulas is interpreted on the probability distribution of program states. However, Ramshaw’s PHL is incomplete and may not be able to prove some simple and valid assertions.
To address this problem, expectation-based PHL was introduced in a series of works (Kozen, 1985; Jones, 1990; Morgan et al., 1996; Morgan and McIver, 1999). This approach employs arithmetical assertions instead of truth-functional assertions. In this context, a Hoare triple states that the expected value of the function after the execution of the program should be at least as high as the expected value of the function before the execution.
Different Probabilistic Commands. Satisfaction-based PHL was developed further by den Hartog, de Vink, and Corin (Den Hartog and de Vink, 2002; Corin and den Hartog, 2005; den Hartog, 2008). Their PHL captures randomized behavior by probabilistic choice, where the command is chosen with probability and the command is chosen with probability , represented as . They also provide a corresponding denotational semantics and establish the completeness of the proof system without the while-loop. Chadha et al. (Chadha et al., 2007), on the other hand, constructed their PHL by introducing randomness through the toss of a biased coin. They showed that their PHL without the while-loop is complete and decidable. Rand and Zdancewic (Rand and Zdancewic, 2015) also introduced randomness into their PHL through a biased coin, and formally verified their logic in the Coq proof assistant.
Our Contribution. While recent work (Batz et al., 2021) has proved that expectation-based PHL with the While loop is relatively complete, no work to date has proven the relative completeness of any satisfaction-based PHL with the While loop. Establishing this result is the main contribution of this paper. To elaborate:
(1)
We propose a new satisfaction-based PHL in which randomness is introduced by the probabilistic assignment command, i.e., . This construction makes our logic concise in expressing random assignments with respect to discrete distributions, which are commonly seen in areas such as cryptography, computer vision, coding theory and biology (Gordon et al., 2014). For example, in cryptographic algorithms, almost all nonces are chosen from some prepared discrete distribution on integers, rationals or reals. Similarly, in the parameter-setting phase, a machine learning algorithm would choose parameters from a distribution over floating-point numbers with some fixed accuracy (which is discrete as well). The probabilistic assignment also makes the completeness proof much more convenient, since it can be treated as a probabilistic extension of the normal assignment. It is also expressively equivalent to the existing randomized commands, such as probabilistic choices and biased coins.
(2)
We identify the appropriate weakest preterm for probabilistic expressions w.r.t. the While-loop. It shows how the While-loop changes the probabilistic properties of computer states, considering both execution branches that halt and infinite runs. As a preview, we prove the relative completeness of our PHL in two steps. We first establish a proof system of Hoare triples with deterministic assertions. Then, by utilizing the weakest precondition of deterministic assertions, we construct the weakest preterm calculus of probabilistic expressions. The relative completeness of our PHL is then obtained as an application of the weakest preterm calculus.
The outline of this paper is as follows. We first introduce our PHL with deterministic assertions in Section 2. We define the denotational semantics of deterministic assertions, construct a proof system and show that it is sound and relatively complete. Then Section 3 introduces the proof system for probabilistic assertions based on weakest preconditions and proves that it is relatively complete as well. We conclude this paper with future work in Section 4.
2. Probabilistic Hoare Logic with Deterministic Assertion
Hoare logic is a formal system that reasons about "Hoare triples" of the form $\{\phi\}C\{\psi\}$. A Hoare triple characterizes the effect of a command $C$ on the states that satisfy the precondition $\phi$: if a program state satisfies $\phi$, then the state reached after a correct execution of $C$ must satisfy the postcondition $\psi$. These assertions, also known as formulas, are built from deterministic and probabilistic expressions and will be defined in this section and the next. The commands are based on classical program statements such as assignment, conditional choice, while loop, and so on. This section focuses on the deterministic formulas.
2.1. Deterministic Expressions and Formulas
Let be a set of program variables denoted by capital letters. Let be a set of logical variables. We assume and are disjoint. Program variables are those variables that may occur in programs. They constitute deterministic expressions.
Deterministic expressions are classified into arithmetic expressions and Boolean expressions . Arithmetic expressions are built from integer constants and variables from , combined by arithmetic operators. The arithmetic operator set is defined as . In contrast, logical variables are used only in assertions.
Definition 2.1 (Arithmetic expressions).
Given a set of program variables , we define the arithmetic expression as follows:
.
This syntax allows an arithmetic expression () to be either an integer constant (), a program variable (), or a composition of two arithmetic expressions () built by an arithmetic operation (). They intuitively represent integers in programs.
The Boolean constant set is . We define relational operators () to be performed on arithmetic expressions, including . Logical operators, e.g., , can be applied to any Boolean expression.
Definition 2.2 (Boolean expressions).
The Boolean expression is defined as follows:
A Boolean expression represents some truth value, true or false. The expression represents that the truth value is determined by the binary relation between two integers.
The semantics of deterministic expressions is defined on deterministic states which are denoted as mappings . Let be the set of all deterministic states. Each state is a description of the value of every program variable. Accordingly, the semantics of arithmetic expressions is which maps each deterministic state to an integer. Analogously, the semantics of Boolean expressions is which maps each state to a Boolean value.
Definition 2.3 (Semantics of deterministic expressions).
The semantics of deterministic expressions is defined inductively as follows:
=
=
=
=
=
=
=
=
As mentioned above, the interpretation of an arithmetic expression is an integer. A program variable on a deterministic state is interpreted as its value on the state. A constant is always itself over any state. An arithmetic expression is mapped to the integer calculated by the operator applied on the interpretation of and the interpretation of on the state. The Boolean expressions can be understood similarly. For example, let be a state such that and . Then and .
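The inductive reading described above can be sketched as a small evaluator. This is only an illustration under stated assumptions: expressions are encoded as nested Python tuples, a deterministic state is a dictionary from variable names to integers, and the operator tables `ARITH_OPS` and `REL_OPS` are hypothetical stand-ins, since the exact operator sets of the logic are not reproduced here.

```python
import operator

# Assumed operator tables (hypothetical; not taken from the paper).
ARITH_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
REL_OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq}

def eval_arith(expr, state):
    """Evaluate an arithmetic expression on a deterministic state (dict: name -> int)."""
    if isinstance(expr, int):      # an integer constant denotes itself
        return expr
    if isinstance(expr, str):      # a program variable denotes its value in the state
        return state[expr]
    op, left, right = expr         # composite expression: (aop, E1, E2)
    return ARITH_OPS[op](eval_arith(left, state), eval_arith(right, state))

def eval_bool(expr, state):
    """Evaluate a Boolean expression on a deterministic state."""
    if isinstance(expr, bool):     # Boolean constant
        return expr
    tag = expr[0]
    if tag == "not":
        return not eval_bool(expr[1], state)
    if tag == "and":
        return eval_bool(expr[1], state) and eval_bool(expr[2], state)
    # otherwise a relation between two arithmetic expressions: (rop, E1, E2)
    return REL_OPS[tag](eval_arith(expr[1], state), eval_arith(expr[2], state))

state = {"X": 1, "Y": 2}
value = eval_arith(("+", "X", ("*", 2, "Y")), state)   # X + 2 * Y
truth = eval_bool(("<", "X", "Y"), state)              # X < Y
```

Each clause of the evaluator mirrors one case of the inductive definition, so the value of a composite expression depends only on the values of its parts in the same state.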
Next we define deterministic formulas based on deterministic expressions.
Definition 2.4 (Syntax of deterministic formulas).
The deterministic formulas are defined by the following BNF:
where represents an arithmetic expression built on :
.
We restrict the logical operators to the classical ones: and . and can be expressed in the standard way. The formula applies the universal quantifier to the logical variable in the formula .
An interpretation is a function which maps logical variables to integers. Given an interpretation and a deterministic state , the semantics of is defined as follows.
=
=
=
=
The semantics of a deterministic formula is denoted by which represents the set of all states satisfying .
Definition 2.5 (Semantics of deterministic formulas).
The semantics of deterministic formulas is defined inductively as follows:
=
=
=
=
=
for all integer and ,
The formula is interpreted as the set of all deterministic states , while is interpreted as the empty set. The symbol denotes complement, and represents the set of remaining states in after removing all states satisfying . The logical operations and between formulas can be interpreted as the intersection and union, respectively, of the sets of states that satisfy the corresponding formulas. The formula is satisfied on a deterministic state with interpretation if and only if is true with respect to all interpretations that assign the same values to every variable as except .
For example, let be a state such that and let . The deterministic formula is satisfied on , i.e. . It is also valid (satisfied on arbitrary state and interpretation).
2.2. Commands
Commands are actions that we perform on program states. They change a deterministic state to a probability distribution over deterministic states. We introduce the probabilistic assignment command to capture the randomized executions of probabilistic programs.
Definition 2.6 (Syntax of command expressions).
The commands are defined inductively as follows:
where in which is a set of integers and are real numbers such that and . We omit those s when they are all equal to . is a deterministic formula.
The command skip represents a null command doing nothing. is the deterministic assignment. is the sequential composition of and as usual. The last two expressions are the conditional choice and loop, respectively. can be read as a value is chosen with probability and is assigned to . The probabilistic assignment is the way to introduce randomness in this paper. It is worth noting that the language we use for command expressions is just as expressive as the languages that are constructed by using biased coins or probabilistic choices. This can be easily understood through an example: a probabilistic choice is equivalent to the following program in our language (assuming that is a new program variable):
The semantics of commands is defined on probabilistic states. It shows how different commands update probabilistic states. A probabilistic state, denoted by , is a probability sub-distribution on deterministic states, i.e., . Thus, each requires that . We use sub-distributions to take into account the situations where some programs may never terminate in certain states. For a deterministic state , is a special probabilistic state that assigns the value of 1 to and the value of 0 to any other state. We call it the probabilistic form of a deterministic state. A deterministic state is considered to be a support of if . The set of all supports of is denoted by .
Definition 2.7 (Semantics of command expressions).
The semantics of commands is a function . It is defined inductively as follows:
•
•
•
•
•
•
The command skip changes nothing. We write to denote the state which assigns variables the same values as except that the variable is assigned the value . Here denotes the distribution restricted to those states where is true. Formally, with if and otherwise. We can write to denote if the initial state is deterministic.
In general, if , and , then executing command from state terminates in state with probability .
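To make the way commands transform sub-distributions more concrete, here is a minimal interpreter sketch. It rests on several assumptions that are not taken from the definition above: commands are encoded as tagged tuples, guards and right-hand sides are Python functions on states, and the semantics used is the standard one for this kind of language, with the loop approximated by a bounded number of unrollings (the hypothetical `fuel` parameter). Mass that has not left the loop within `fuel` iterations is dropped, so the result under-approximates the loop's sub-distribution.

```python
from fractions import Fraction

def freeze(state):
    """Turn a dict state into a hashable key."""
    return tuple(sorted(state.items()))

def restrict(dist, pred):
    """Keep only the states on which pred holds."""
    return {s: p for s, p in dist.items() if pred(dict(s))}

def merge(a, b):
    """Pointwise sum of two sub-distributions."""
    out = dict(a)
    for s, p in b.items():
        out[s] = out.get(s, 0) + p
    return out

def run(cmd, dist, fuel=50):
    """Apply a command to a sub-distribution (dict: frozen state -> probability)."""
    tag = cmd[0]
    if tag == "skip":
        return dict(dist)
    if tag in ("assign", "passign"):
        out = {}
        for s, p in dist.items():
            st = dict(s)
            # assign: one outcome of weight 1; passign: the listed (weight, value) pairs
            outcomes = [(1, cmd[2](st))] if tag == "assign" else cmd[2]
            for w, v in outcomes:
                key = freeze({**st, cmd[1]: v})
                out[key] = out.get(key, 0) + p * w
        return out
    if tag == "seq":
        return run(cmd[2], run(cmd[1], dist, fuel), fuel)
    if tag == "if":
        _, guard, c1, c2 = cmd
        return merge(run(c1, restrict(dist, guard), fuel),
                     run(c2, restrict(dist, lambda st: not guard(st)), fuel))
    if tag == "while":
        _, guard, body = cmd
        done, current = {}, dict(dist)
        for _ in range(fuel):
            # mass that fails the guard leaves the loop; the rest runs the body once more
            done = merge(done, restrict(current, lambda st: not guard(st)))
            current = run(body, restrict(current, guard), fuel)
            if not current:
                break
        return done
    raise ValueError(f"unknown command: {tag}")

half = Fraction(1, 2)
coin_loop = ("while", lambda st: st["X"] == 1, ("passign", "X", [(half, 0), (half, 1)]))
start = {freeze({"X": 1}): Fraction(1)}
after_three = run(coin_loop, start, fuel=3)
diverging = run(("while", lambda st: True, ("skip",)), start)
```

With `fuel=3` the coin loop has released mass 1/2 + 1/4 into the state where `X` is 0, and larger values of `fuel` approach total mass 1. The loop whose guard is always true never releases any mass, so the sketch returns the empty dictionary, which plays the role of the zero sub-distribution.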
Example 2.8.
Let and let be a deterministic state such that . If we run the command on , then the distribution is obtained.
Example 2.9.
Let 0 be the probabilistic state that maps every deterministic state to 0.
For any probabilistic state ,
This is because
It is easy to see that and for all . Therefore, . This implies that certain programs that never terminate result in the probabilistic state 0.
Example 2.10.
Assume that there are two variables and infinitely many states where , for all .
Consider the command . If we let , then and for all .
Definition 2.5 gives the semantics of deterministic formulas over deterministic states. A deterministic formula describes some property of deterministic states. But how should a deterministic formula be evaluated on probabilistic states? The semantics is given as follows:
iff for each support of , .
We call it possibility semantics because the definition intuitively means that a formula is true on a probabilistic state if and only if is true on all possible deterministic states indicated by the probabilistic state. In other words, all supports of the distribution share a common property, and hence we can claim that the distribution satisfies the formula. The possibility semantics makes our PHL with deterministic formulas (PHLd) essentially equivalent to Dijkstra's non-deterministic Hoare logic (Dijkstra, 1975). However, the former serves as a better intermediate step towards PHL with probabilistic formulas than the latter. Therefore, we will still present PHLd in detail, especially the weakest precondition calculus of PHLd, which is not concretely introduced in non-deterministic Hoare logic.
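The possibility semantics can be sketched in a few lines. The encoding is an assumption made for illustration: a probabilistic state is a dictionary from hashable states (tuples of variable and value pairs) to probabilities, and a deterministic formula is a Python predicate on states; the names `support` and `satisfies` are hypothetical.

```python
from fractions import Fraction

def support(dist):
    """States that receive strictly positive probability."""
    return [s for s, p in dist.items() if p > 0]

def satisfies(dist, formula):
    """Possibility semantics: the formula must hold on every support state."""
    return all(formula(dict(s)) for s in support(dist))

half = Fraction(1, 2)
mu = {(("X", 0),): half, (("X", 1),): half}   # X is 0 or 1, each with probability 1/2
zero = {}                                      # the sub-distribution with no support

nonneg = satisfies(mu, lambda st: st["X"] >= 0)
is_zero = satisfies(mu, lambda st: st["X"] == 0)
vacuous = satisfies(zero, lambda st: False)
```

The probabilities themselves play no role in the check, only whether they are positive. One consequence of reading the definition this way is that a sub-distribution with empty support satisfies every deterministic formula vacuously.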
2.3. Proof System with deterministic assertions
A proof system for PHL derives Hoare triples. A Hoare triple, written as , is considered valid if, for every deterministic state that satisfies , executing command C results in a probabilistic state that satisfies . Formally,
if for all interpretations and deterministic states , if , then .
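As a rough illustration of this notion of validity, the following sketch checks a triple on a finite sample of states. It is deliberately simplified and rests on assumptions: commands are modeled directly as Python functions from a state to a list of weighted successor states, logical variables and their interpretations are ignored, and the helper names `passign` and `triple_holds_on` are hypothetical. A finite sample can refute a triple but cannot establish validity over all states.

```python
from fractions import Fraction

def passign(var, outcomes):
    """Model a probabilistic assignment as a function: state -> [(prob, new_state), ...]."""
    def command(state):
        return [(p, {**state, var: v}) for p, v in outcomes]
    return command

def triple_holds_on(pre, command, post, states):
    """Check the triple on the given states only (possibility semantics on the result)."""
    for s in states:
        if pre(s):
            for p, t in command(s):
                # every outcome with positive probability must satisfy the postcondition
                if p > 0 and not post(t):
                    return False
    return True

half = Fraction(1, 2)
flip = passign("X", [(half, 0), (half, 1)])
sample = [{"X": x} for x in range(-3, 4)]

bounded = triple_holds_on(lambda s: True, flip, lambda s: 0 <= s["X"] <= 1, sample)
always_one = triple_holds_on(lambda s: True, flip, lambda s: s["X"] == 1, sample)
```

Here the postcondition that bounds `X` between 0 and 1 holds on every outcome, whereas the postcondition that `X` equals 1 fails because the outcome 0 also has positive probability.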
We now build a proof system for PHLd for the derivation of Hoare triples with probabilistic commands and deterministic assertions. Most rules in our proof system are standard, and they are inherited from Hoare logic or natural deduction (Apt and Olderog, 2019). Only one new rule for probabilistic assignment is added, along with some structural rules. The symbol represents the formula which replaces every occurrence of in with .
Definition 2.11 (Proof system of PHLd).
The proof system of PHLd consists of the following inference rules:
The majority of the above inference rules are easy to comprehend. is special since it involves semantically valid implications in the premise part. It characterizes the monotonicity of Hoare triples, which means that a stronger precondition must also lead to the same postcondition or some weaker one. In rule , formula is called a loop invariant, which is preserved by command . In the remaining part of this section, we prove the soundness and completeness of PHLd. Most of the proofs are similar to their analogues in classical Hoare logic; confident readers may skip them.
Theorem 2.12 (Soundness).
For all deterministic formulas and and every command , implies .
Proof.
We prove by structural induction on . Let be an arbitrary interpretation.
•
(SKIP) It is trivial to see that .
•
(AS) Assume . This means that is true if the variable is assigned the value and all other variables are assigned values according to . Let . Then assigns to the value and all other variables the same values as . Therefore, .
•
(PAS) Assume . This means that is true if the variable is assigned any of and all other variables are assigned values according to . Let . Then is a distribution with support , where for . Since assigns to the value and all other variables the same values as , we know that . This means that is true on all supports of . Therefore, .
•
(SEQ) If rule (SEQ) is used to derive from and , then by induction hypothesis we have and . Assume . Let be an arbitrary state which belongs to the support of . From we know that there is a state such that . Now by we know that and . By we know that and .
•
(IF) Assume , and . By induction hypothesis we know that and . Let be an arbitrary state which belongs to .
Since is a deterministic state, we know that either or .
–
If , then . Hence . From and we know that . Now by we deduce that . Therefore, .
–
If , then . Hence . From and we know that . Now by we deduce that . Therefore, .
•
(CONS) Assume , and . By induction hypothesis we obtain that . Let be a state such that . Then by we know that . By we know that . Hence for all that belong to . Now by we know that .
•
(AND) Assume and . By induction hypothesis we know that and . Let be a state such that . Let be a state in .
Then and by we have . Hence