Delving Simplicity Part Ⅱ: Combinator Completeness of Simplicity #308
roconnor-blockstream
This post is cross-posted at https://delvingbitcoin.org/t/delving-simplicity-part-combinator-completeness-of-simplicity/1935.
In Part Ⅰ of this series, we described the three methods of composition in programming: sequential, parallel, and conditional. In this part, we will see how these methods of composition are directly realized within Simplicity.
Simplicity Types
Parallel compositions produce an output that is a product of two types. We write this product type as A × B or A * B. Conditional compositions consume an input that is a sum (also known as a tagged union) of two types, A + B. In order to directly realize these forms of composition, Simplicity's type system includes these two type formers. Simplicity has only one other type: the unit type.

Unit Type
Simplicity's unit type, written 𝟙 or ONE, is a type with only one value. This single value is typically denoted by the empty tuple, written ⟨⟩ or (). Since this type consists of only a single value, values of this type contain no information; it is essentially a zero-bit data type.

We use the : to denote a value's type, so ⟨⟩ : 𝟙 means the empty tuple has unit type.

Sum Type
The sum of two types, A + B, is the tagged union of those two types. The tag is either "left" or "right," indicating whether a given value corresponds to the left-hand type of the sum or the right-hand type. If a : A, then we write σᴸ(a) or inl(a) for the left-tagged value of type A + B (i.e., σᴸ(a) : A + B). Similarly, if b : B, then we write σᴿ(b) or inr(b) for the right-tagged value (i.e., σᴿ(b) : A + B).

Values of a sum type always contain a tag, even when taking the sum of two identical types. In particular, if a : A, then σᴸ(a) and σᴿ(a) are two distinct values of type A + A because they have different tags.

Boolean Type
We will discuss building data structures in more detail in the next part of this series, but as a preview, let's look at our first non-trivial type: 𝟙 + 𝟙. We typically denote this type using the shorthand 𝟚 or TWO. Keep in mind that this is purely notational convenience. This type is the tagged union of two unit types and therefore has two values: σᴸ⟨⟩ : 𝟚 and σᴿ⟨⟩ : 𝟚. This is a one-bit data type. By convention, we consider σᴸ⟨⟩ to be a zero or false value, which we may write as 0 or false. Similarly, we consider σᴿ⟨⟩ to be a one or true value, which we may write as 1 or true. Again, this is just notational convenience.

Product Type
The product of two types, A × B, contains pairs of values. If a : A and b : B are values of types A and B, then the pair, written as ⟨a, b⟩ or (a, b), has type A × B (i.e., ⟨a, b⟩ : A × B).

🛈 Simplicity only has finite types. You can read off the number of values a type has by interpreting the type using arithmetic. For example, the type 𝟚 × 𝟚 has four values. The type 𝟚 + 𝟚 also has four values; however, they are different values from those of type 𝟚 × 𝟚.
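As a concrete aside, Simplicity's three type formers line up with familiar Haskell types. Below is a minimal sketch; the names One, Two, zeroBit, and oneBit are our own, introduced only for illustration, not any official API:

```haskell
-- A sketch of Simplicity's type formers as Haskell types:
--   unit type 𝟙        ~ ()          with sole value ()
--   sum type A + B     ~ Either a b  with σᴸ(a) = Left a, σᴿ(b) = Right b
--   product type A × B ~ (a, b)     with ⟨a, b⟩ = (a, b)

type One = ()
type Two = Either One One  -- 𝟚 = 𝟙 + 𝟙, the one-bit type

zeroBit, oneBit :: Two
zeroBit = Left ()   -- σᴸ⟨⟩, by convention 0 or false
oneBit  = Right ()  -- σᴿ⟨⟩, by convention 1 or true

-- 𝟚 × 𝟚 has 2 × 2 = 4 values, and 𝟚 + 𝟚 has 2 + 2 = 4 (different) values:
type TwoByTwo   = (Two, Two)
type TwoPlusTwo = Either Two Two
```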
Core Simplicity Expressions

In Part Ⅰ, we presumed we had some set of basic operations, and each operation had an input type and an output type. We will write f : A ⊢ B or f : A |- B to mean that f has input type A and output type B.

⚠ While Simplicity expressions have input and output types, Simplicity's types do not include function types. Simplicity is a "first-order" programming language.
Two Basic Operations
Simplicity provides several basic operations, most of which we will see later in this series when we talk about Simplicity jets. The core language provides expressions for only two basic operations: iden : A ⊢ A and unit : A ⊢ 𝟙. Technically, these are two families of operations, with one of each operation per Simplicity type A.

The iden operation passes its input along to its output. The unit operation discards its input and returns the empty tuple, ⟨⟩.

To be more precise, we need to distinguish between programming language syntax and semantics. iden and unit are Simplicity expressions, which constitute the language's syntax. Simplicity expressions denote operations, which are functions from an input type to an output type. We use fancy square brackets to indicate what a Simplicity expression denotes:

⟦iden⟧(a) = a (alternatively written |[iden]|(a) = a)
⟦unit⟧(a) = ⟨⟩
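One way to make these denotations concrete is a "shallow embedding" in Haskell, where each Simplicity expression is modeled directly by the function it denotes. This is only an illustrative sketch, not how Simplicity is actually implemented:

```haskell
-- Denotations of the two basic operations, written directly as
-- Haskell functions (so `iden` below plays the role of ⟦iden⟧).

iden :: a -> a   -- iden : A ⊢ A
iden a = a       -- ⟦iden⟧(a) = a

unit :: a -> ()  -- unit : A ⊢ 𝟙
unit _ = ()      -- ⟦unit⟧(a) = ⟨⟩
```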
Three Composition Combinators

The sequential and parallel composition methods are directly implemented by Simplicity combinators. Recall that Simplicity expressions denote functions. A combinator is a function that takes functions and returns a function. We follow the usual programming language practice of defining these Simplicity expressions using typing rules:

f : A ⊢ B    g : B ⊢ C
──────────────────────
comp f g : A ⊢ C

f : A ⊢ B    g : A ⊢ C
──────────────────────
pair f g : A ⊢ B × C
The first rule states that if f is a Simplicity expression with input type A and output type B, and g is an expression with input type B and output type C, then comp f g is a Simplicity expression with input type A and output type C. The second rule defines the pair f g expression similarly.

We may also write comp f g as f ⨾ g or f >>> g, and we may write pair f g as f ▵ g or f &&& g.

These combinators form expressions of exactly the right type to implement sequential and parallel composition. Accordingly, these expressions have semantics for these forms of composition.
⟦f ⨾ g⟧(a) = ⟦g⟧(⟦f⟧(a))
⟦f ▵ g⟧(a) = ⟨⟦f⟧(a), ⟦g⟧(a)⟩

⚠ The version of sequential composition used in mathematics is backwards with respect to our syntax: the mathematical composition g ∘ f corresponds to f ⨾ g.
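In the same illustrative shallow-embedding sketch, comp and pair become higher-order Haskell functions. Note that comp applies f first, the reverse of Haskell's (.) composition, matching the warning above:

```haskell
-- comp f g : A ⊢ C  given  f : A ⊢ B  and  g : B ⊢ C
comp :: (a -> b) -> (b -> c) -> (a -> c)
comp f g a = g (f a)      -- ⟦f ⨾ g⟧(a) = ⟦g⟧(⟦f⟧(a))

-- pair f g : A ⊢ B × C  given  f : A ⊢ B  and  g : A ⊢ C
pair :: (a -> b) -> (a -> c) -> (a -> (b, c))
pair f g a = (f a, g a)   -- ⟦f ▵ g⟧(a) = ⟨⟦f⟧(a), ⟦g⟧(a)⟩
```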
The other method of composition is conditional composition. The natural approach would be, given f : A ⊢ C and g : B ⊢ C, to define copair f g : A + B ⊢ C. However, copair doesn't allow branches to access shared data. In particular, the distribution function, dist : (A + B) × C ⊢ A × C + B × C, cannot be defined in terms of copair alone.

Instead of copair, we use the following case combinator:

f : A × C ⊢ D    g : B × C ⊢ D
──────────────────────────────
case f g : (A + B) × C ⊢ D
This case combinator does conditional composition and distribution all in one. It's helpful to think of the type C as the type of a shared environment that both branches of the case expression get access to. Accordingly, the case combinator has the following semantics.

⟦case f g⟧⟨σᴸ(a), c⟩ = ⟦f⟧⟨a, c⟩
⟦case f g⟧⟨σᴿ(b), c⟩ = ⟦g⟧⟨b, c⟩
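Continuing the sketch, case dispatches on the tag of the first component and passes the shared environment c to whichever branch runs. We name it caseS only because case is a reserved word in Haskell:

```haskell
-- case f g : (A + B) × C ⊢ D  given  f : A × C ⊢ D  and  g : B × C ⊢ D
caseS :: ((a, c) -> d) -> ((b, c) -> d) -> ((Either a b, c) -> d)
caseS f _ (Left a,  c) = f (a, c)  -- ⟦case f g⟧⟨σᴸ(a), c⟩ = ⟦f⟧⟨a, c⟩
caseS _ g (Right b, c) = g (b, c)  -- ⟦case f g⟧⟨σᴿ(b), c⟩ = ⟦g⟧⟨b, c⟩
```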
Four more Combinators

We are still missing a few fundamental operations on our types. For instance, we can use pair to produce product types, but we have no way yet to consume product types. Similarly, the case combinator consumes sum types, but we have no way yet to produce sum types. Four additional combinators enable the consumption of product types and the production of sum types. The first two combinators extract one of the values of a pair and execute a function on it:

f : A ⊢ C
──────────────────
take f : A × B ⊢ C

f : B ⊢ C
──────────────────
drop f : A × B ⊢ C

The last two combinators execute a function and wrap the result with either a left tag or a right tag:

f : A ⊢ B
──────────────────
injl f : A ⊢ B + C

f : A ⊢ C
──────────────────
injr f : A ⊢ B + C
These combinators have the required semantics:

⟦take f⟧⟨a, b⟩ = ⟦f⟧(a)
⟦drop f⟧⟨a, b⟩ = ⟦f⟧(b)
⟦injl f⟧(a) = σᴸ(⟦f⟧(a))
⟦injr f⟧(a) = σᴿ(⟦f⟧(a))
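The remaining four combinators in the same illustrative sketch (take' and drop' carry primes only to avoid shadowing the Prelude functions of the same names):

```haskell
take' :: (a -> c) -> ((a, b) -> c)     -- take f : A × B ⊢ C
take' f (a, _) = f a                   -- ⟦take f⟧⟨a, b⟩ = ⟦f⟧(a)

drop' :: (b -> c) -> ((a, b) -> c)     -- drop f : A × B ⊢ C
drop' f (_, b) = f b                   -- ⟦drop f⟧⟨a, b⟩ = ⟦f⟧(b)

injl :: (a -> b) -> (a -> Either b c)  -- injl f : A ⊢ B + C
injl f a = Left (f a)                  -- ⟦injl f⟧(a) = σᴸ(⟦f⟧(a))

injr :: (a -> c) -> (a -> Either b c)  -- injr f : A ⊢ B + C
injr f a = Right (f a)                 -- ⟦injr f⟧(a) = σᴿ(⟦f⟧(a))
```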
Simplicity and the Sequent Calculus

Readers familiar with formal logic will find a resemblance between our set of nine rules and the conjunctive-disjunctive fragment of Gentzen's sequent calculus. The sequent calculus inspired Simplicity's language design. One can think of Simplicity as a slightly tweaked variant of the "functional interpretation" of Gentzen's sequent calculus, which is analogous to the Curry-Howard correspondence between natural deduction and the lambda calculus.
Notice that the premises in the combinator rules of the core Simplicity language are typically "smaller" than the types in the conclusion. Later in this series we will describe "the Bit Machine", an abstract stack machine for interpreting Simplicity expressions. The Bit Machine takes advantage of this phenomenon of "smaller types in the premises" to minimize the amount of data copied during execution. Only the iden and comp core combinators involve moving data around, and the rest of the core combinators are implemented with just some bookkeeping.

Values are not Expressions
Above we gave notation for the values of the various types that Simplicity supports. It is important to note that Simplicity expressions only denote operations, and operations are functions, not values. We can construct values by starting with the unit function and composing it with the injl, injr, and pair combinators. To avoid writing out such tedious constructions, we introduce the following notation.

Given a value of some Simplicity type, b : B, we write scribe b : A ⊢ B for the unique Simplicity expression that always returns the value b. For example, scribe ⟨σᴸ⟨⟩, σᴿ⟨⟩⟩ is shorthand for injl unit ▵ injr unit : A ⊢ 𝟚 × 𝟚. Keep in mind that scribe isn't a Simplicity combinator; it is a macro-like notational convenience.

Bitcoin Script has a similar property; Bitcoin Script only contains operations. For example, OP_1 is the operation that pushes the value 1 onto the stack, but there is no Bitcoin Script expression for the value 1 itself.
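Returning to the scribe example above, here is how the expansion plays out in the illustrative shallow-embedding sketch (the needed combinators are repeated so the block stands alone; scribeExample is our own name):

```haskell
unit :: a -> ()
unit _ = ()

pair :: (a -> b) -> (a -> c) -> (a -> (b, c))
pair f g a = (f a, g a)

injl :: (a -> b) -> a -> Either b c
injl f a = Left (f a)

injr :: (a -> c) -> a -> Either b c
injr f a = Right (f a)

-- scribe ⟨σᴸ⟨⟩, σᴿ⟨⟩⟩ expands to injl unit ▵ injr unit : A ⊢ 𝟚 × 𝟚:
scribeExample :: a -> (Either () (), Either () ())
scribeExample = pair (injl unit) (injr unit)
-- scribeExample x = (Left (), Right ()) for every input x;
-- i.e., the constant function returning ⟨0, 1⟩.
```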
Simplicity's Completeness Theorem

Earlier we saw that the naïve realization of conditional composition would have left us unable to define the "distribution" function dist : (A + B) × C ⊢ A × C + B × C. How do we know that we aren't missing something else?

The answer is that the "Simplicity Completeness theorem" proves that for any function between two Simplicity types there exists some Simplicity expression that denotes it.

We already saw scribe, which is a special case of the Simplicity Completeness theorem for constant functions. For other functions, we can build a nested set of case expressions to fully decompose any input of any type and compose that with a scribe expression for every output value. This procedure constructs what is effectively a giant lookup table to implement any function.

But you don't have to trust us. The core Simplicity language is formally specified in the Rocq proof assistant (formerly named the Coq proof assistant), and the Simplicity Completeness theorem is one of the theorems we have formally verified.
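To see the lookup-table construction in miniature, here is a sketch of boolean "not" assembled from the core combinators alone: pair the input bit with an empty environment, dispatch on its tag with case, and return the opposite constant via the scribe expansion in each branch. All names are ours and the shallow embedding is again only illustrative:

```haskell
iden :: a -> a
iden a = a

unit :: a -> ()
unit _ = ()

comp :: (a -> b) -> (b -> c) -> (a -> c)
comp f g a = g (f a)

pair :: (a -> b) -> (a -> c) -> (a -> (b, c))
pair f g a = (f a, g a)

injl :: (a -> b) -> a -> Either b c
injl f a = Left (f a)

injr :: (a -> c) -> a -> Either b c
injr f a = Right (f a)

caseS :: ((a, c) -> d) -> ((b, c) -> d) -> ((Either a b, c) -> d)
caseS f _ (Left a,  c) = f (a, c)
caseS _ g (Right b, c) = g (b, c)

type Two = Either () ()

-- not = (iden ▵ unit) ⨾ case (scribe 1) (scribe 0),
-- where scribe 1 = injr unit and scribe 0 = injl unit:
notTwo :: Two -> Two
notTwo = comp (pair iden unit) (caseS (injr unit) (injl unit))
-- notTwo (Left ())  = Right ()   -- not 0 = 1
-- notTwo (Right ()) = Left ()    -- not 1 = 0
```

Dispatching on the decomposed input and scribing each output value is exactly the pattern the completeness theorem scales up to arbitrary (finite) types.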
Conclusion
We described Simplicity’s type system, combinators, and basic expressions that make up Simplicity’s core computational language. Later in this series, we will describe how Simplicity interacts with transactions and introduce a few more Simplicity combinators.
But before delving into that, in Part Ⅲ, we will look at building data structures and computation from this seemingly meager language. While we do have the Simplicity Completeness theorem, the expressions that the theorem constructs are astronomical in size, making it only useful as a theoretical tool. The practice of generating succinct Simplicity expressions could be viewed as an exercise in compression by exploiting structure within computations.