Binary decision diagrams https://en.wikipedia.org/wiki/Binary_decision_diagram are a data structure for storing boolean functions, spiritually something like [Bool] -> Bool. You can tabulate such a thing, but there is a lot of shared substructure. I like to think of binary decision diagrams as a normal form of if-then-else expressions that have been hash consed into a dag. You can then check for equality of the functions as pointer equality.

You pick an ordering to your variables and push around the if-then-elses to order the variable split on if(a,if(b, c1, c2), if(b, c3, c4)) = if(b, if(a, c1, c3), if(a, c2, c4)) and compress using the identity if(a, c, c) = c. This makes BDDs a canonical example of a “hash cons modulo theories” https://www.philipzucker.com/pldi_2025/ https://www.philipzucker.com/hashing-modulo/ with this equational theory baked in. One does not usually literally use rewriting mechanisms to do them. BDDs are so simple they are usually quite

A BDD can also be seen as a compressed trie representation https://en.wikipedia.org/wiki/Radix_tree . Again, the unusual thing compared to most other tries is that you hash cons / intern the subtries.

Finding satisfying inputs to the function is easily doable given a small BDD. You can also count the total number of solutions. I’ve been turned back on to BDDs by being intrigued by Roulette https://dl.acm.org/doi/10.1145/3729334 , which does exact inference for probablistic programs. The model counting abilities of BDDs and relatives is important here.

BDDs (A,B,C,...) -> Out can be extended easily to any argument types A,B,C,... that are finitely enumerable (or if you only need finitely many exceptional cases besides a default case) and there is only the restriction of being able to hash cons the resulting output value type Out. Any names for the variables which are totally ordered work.

Just like functions, BDDs are an applicative functor https://en.wikipedia.org/wiki/Applicative_functor (to use some Haskellese). That means it makes sense to map them, but also to map binary operations on their output type. map2

Without Hash Consing

You can consider BDDs without the hash consing and they still work. They just don’t share right so you’re missing out on possibly exponential compression.

It is nice to show though because it is simpler and avoids the extra junk of the hash consing.

The basic interesting thing is liftA2 / map2. It checks on the variable ordering and recurses. If they are equal it kind of zips. If they aren’t equal it recurses apart the one with the bigger variable. This is the sort of thing you see when you merge together tries.

from typing import NamedTuple
import operator
type BoolExpr = "Const" | "ITE" 
import numpy as np

class Const(NamedTuple):
    val : object

    def liftA2(self, op, other):
        return other.map(lambda x: op(self.val,x))
    def map(self, f):
        return Const(f(self.val))

class ITE(NamedTuple):
    var : object
    left : BoolExpr
    right : BoolExpr

    def map(self, f):
        left = self.left.map(f)
        right = self.right.map(f)
        if left == right:
            return left
        else:
            return ITE(self.var, left, right)

    def __invert__(self):
        return self.map(lambda x: not x)

    def liftA2(self, op, other): # map2
        """
        BDD is basically [Bool] -> Bool. This is an applicative functor
        """
        if isinstance(other, Const):
            return self.map(lambda x: op(x , other.val))
        elif isinstance(other, ITE):
            if other.var == self.var:
                var = self.var
                left = self.left.liftA2(op, other.left)
                right = self.right.liftA2(op, other.right)
            elif other.var > self.var:
                var = other.var
                left = self.liftA2(op, other.left)
                right = self.liftA2(op, other.right)
            elif other.var < self.var:
                var = self.var
                left = self.left.liftA2(op, other)
                right = self.right.liftA2(op, other)
            else:
                raise ValueError("incomparable variables")
            if left == right:
                return left
            else:
                return ITE(var, left, right)
        else:
            raise ValueError("unexepcted", other)
    def __and__(self, other):
        return self.liftA2(operator.and_, other)
    def __or__(self, other):
        return self.liftA2(operator.or_, other)
    def __xor__(self, other):
        return self.liftA2(operator.xor, other)
    

TRUE = Const(True)
FALSE = Const(False)
def Var(x):
    return ITE(x, TRUE, FALSE)

def Vars(names):
    return [Var(x) for x in names.split()]
x0, x1 = Var(0), Var(1)
assert (x0 & x1) & x1 == (x1 & x0)

def BitVec(name, N):
    # could throw it in a numpy array. Maybe not that useful. + won't work right.
    x = np.empty(N, dtype=object)
    x[:] = [Var(name + str(i)) for i in range(N)]
    return x

x0 | x1 ^ x0
assert ~(~x0 | x1) == x0 & ~x1



def half_add(a,b):
    return a ^ b, a & b
def full_add(a,b,c):
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


class BitVec(NamedTuple):
    xs : BoolExpr
    def __add__(self, other):
        s,c = half_add(self.xs[0], other.xs[0])
        res = [s]
        for a,b in zip(self.xs[1:], other.xs[1:]):
            s,c = full_add(a,b,c)
            res.append(s)
        return s, c

Hash Consing

Taking inspiration from Type-Safe Modular Hash-Consing (a very good paper) https://usr.lmf.cnrs.fr/~jcf/publis/hash-consing2.pdf we’ll make one that works in a local hash cons. In this paper they also make a BDD as one of their examples.

If you ignore types, modules are basically just records. Python classes are basically just records too. A python class can contain another class although probably the static type checkers will not like this. Using this we can get a bit of safety from confusing trees from one hashcons from another hash cons. Not sure if the complexity is worth it.

One can also just have a global hash consing table that you clear occasionally.

A tricky bit is avoiding accidental traversal of the entire subtrees in some operations. The default __eq__ and __hash__ method for tuples will do this, so we overload them to consider the id. This plays funny with using a key to lookup in the hashcons table. Turning the key into a regular tuple first I think fixes this. functools.cache implements memoization which is very important because there are lots of shared subtrees. It is unfortunate that map cannot be memoized (?) because you can’t hash the op parameter.

It is possible the following is correct.

from typing import NamedTuple
import operator

import numpy as np
from dataclasses import dataclass
import dataclasses
import functools

@dataclass
class BDD():
    table : dict
    def __init__(self0):
        self0.table = {}
        # make new uple classes that belong to this BDD instance
        class Const(NamedTuple):
            val : object
            def liftA2(self, op, other):
                return other.map(lambda x: op(self.val,x))
            def map(self, f):
                return self0.const(f(self.val))
            # caching achieves a dynamic programming like effect
            @functools.cache
            def __and__(self, x):
                return x.map(lambda y: self.val & y)
            @functools.cache
            def __or__(self, x):
                return x.map(lambda y: self.val | y)
            @functools.cache
            def __invert__(self):
                return self0.const(False if self.val else True)
            def __len__(self):
                return 1 if self.val else 0
            def __eq__(self, other):
                return self is other
            def __hash__(self):
                return hash(id(self))
        type BoolExpr = "Const" | "ITE" 
        self0._Const = Const

        class ITE(NamedTuple):
            var : object
            left : BoolExpr
            right : BoolExpr

            def map(self, f):
                left = self.left.map(f)
                right = self.right.map(f)
                if left == right:
                    return left
                else:
                    return self0.ite(self.var, left, right)

            def __invert__(self):
                return self.map(lambda x: not x)

            def liftA2(self, op, other): # map2
                """
                BDD is basically [Bool] -> Bool. This is an applicative functor
                """
                if isinstance(other, Const):
                    return self.map(lambda x: op(x , other.val))
                elif isinstance(other, ITE):
                    if other.var == self.var:
                        var = self.var
                        left = self.left.liftA2(op, other.left)
                        right = self.right.liftA2(op, other.right)
                    elif other.var > self.var:
                        var = other.var
                        left = self.liftA2(op, other.left)
                        right = self.liftA2(op, other.right)
                    elif other.var < self.var:
                        var = self.var
                        left = self.left.liftA2(op, other)
                        right = self.right.liftA2(op, other)
                    else:
                        raise ValueError("incomparable variables")
                    if left == right:
                        return left
                    else:
                        return self0.ite(var, left, right)
                else:
                    raise ValueError("unexepcted", other)
            @functools.cache
            def __and__(self, other):
                return self.liftA2(operator.and_, other)
            @functools.cache
            def __or__(self, other):
                return self.liftA2(operator.or_, other)
            @functools.cache
            def __xor__(self, other):
                return self.liftA2(operator.xor, other)
            @functools.cache
            def __len__(self):
                return len(self.left) + len(self.right)
            def __eq__(self, other):
                return self is other
            def __hash__(self):
                return hash(id(self))
        self0._ITE = ITE
        self0.TRUE = self0.const(True)
        self0.FALSE = self0.const(False)
    def _hashcons(self, x : tuple) -> tuple:
        assert isinstance(x, tuple)
        # check that these are hash consed in _this_ BDD
        assert all(not isinstance(t, tuple) or isinstance(t, self._ITE) or isinstance(t, self._Const) for t in x)
        # I get away with this because Const turns into a 1-tuple and ITE turns into a 3-tuple
        key = tuple(x) #tuple(id(t) if isinstance(t, self.Node) else t for t in x)
        res = self.table.get(key)
        if res is None:
            self.table[key] = x
            return x
        else:
            return res
    def const(self, v):
        return self._hashcons(self._Const(v))
    def ite(self, c, t, e):
        return self._hashcons(self._ITE(c,t,e))
    def Var(self, x : object):
        return self.ite(x, self.TRUE, self.FALSE)
    def Vars(self, names : str):
        return [self.Var(name) for name in names.split()]

H = BDD()
H.const(True) == H.TRUE

H.TRUE & H.FALSE
H.TRUE & H.TRUE
x0, x1, x2 = H.Vars("x0 x1 x2")
x1 & H.TRUE
x1 & x2

ITE(var='x2', left=ITE(var='x1', left=Const(val=True), right=Const(val=False)), right=Const(val=False))

x1 & x2
print("done")
x1 & x2
len(x1 & x2 | x0)

done

3

Bits and Bobbles

Maybe next time I’ll actually show using BDDs.

This was brought on my this post, where I has kind of a cute hashcons I thought https://www.philipzucker.com/slotted_hash_cons/

Compared to SAT solvers, a small BDD really let’s you ask some complex questions like QBF or model counting

https://www.youtube.com/watch?v=csRVdViWzi4 Logic and Proof, Lecture 10: Binary Decision Diagrams - Paulson

Harrison has a section also in Handbook of Practical Logic

I first heard of BDDs https://www.coursera.org/learn/vlsi-cad-logic driving to Lincoln probably

BDDs for symbolic model checking https://en.wikipedia.org/wiki/Model_checking . You can track a set of reachable states in a bdd and hope they saturate. You can existentialized out the variables in the transition relation over and over.

There are also python packages that pack lower level faster bdd libraries. There might be something to having a flexible pure python one though. https://github.com/tulip-control/dd

Knuth The Art of Computer Programming, Volume 4, Fascicle 1 has BDD section https://dl.acm.org/doi/10.5555/1593023

https://crypto.stanford.edu/pbc/notes/zdd/

BDDs are a trie. BDD conjunction is a relative of / is worst case optimal join https://en.wikipedia.org/wiki/Worst-case_optimal_join_algorithm . BDDs are good for (boolean) quantifier elimination which is usually tough https://egraphs.zulipchat.com/#narrow/channel/328977-topic.2Ftheory/topic/Egraphs.20modulo.20T/near/520129390 . In some sense Bddbdddb was a WCOJ https://egraphs.zulipchat.com/#narrow/channel/328972-general/topic/Is.20WCOJ.20the.20same.20as.20Grobner.20Bases.3F/near/468226525 I feel like Remy said this on the egraph zulip, but I can’t find it.

There are also SDDs https://en.wikipedia.org/wiki/Sentential_decision_diagram which are an intriguing generalization of BDDs in another direction https://github.com/ML-KULeuven/PySDD https://github.com/neuppl/rsdd https://neuppl.github.io/rsdd-docs/

https://www.cs.cmu.edu/~bryant/pubdir/fmcad22.pdf tbuddy a proof generating sat solver buddy cudd

zdd https://en.wikipedia.org/wiki/Zero-suppressed_decision_diagram I’ve never had as crisp a grasp on what zdds are about

pgbdd https://github.com/rebryant/pgbdd pgpbs https://github.com/rebryant/pgpbs-artifact

“Knowledge Compilation” https://arxiv.org/abs/1106.1819 Darwiche

Use model counting to get entropy of an ising model in microcanonical ensemble (fixed energy)

Roulette uses a BDD for the

https://people.csail.mit.edu/mcarbin/papers/aplas05.pdf Using Datalog with Binary Decision Diagrams for Program Analysis

https://drive.google.com/file/d/17ezRWes7OkszARizFBPOqwUUqrGj8h6Q/view automata and computability book ganesh Gopalakrishnan

Brozkowski minimization - reverse dfa to nfa, determinize, then that is minimal? But do it twice back to original language? NFA can have multiple start states.

https://users.cs.utah.edu/~ganesh/Jove.html

https://github.com/ganeshutah/Jove

https://github.com/mvcisback/bdd2dfa

Another refactoring

It’s not clear to me exactly what the best factoring of ideas is. Here nodes keep reference to the bdd they belong to. This does seem less weird looking. A downside is that functools.cache is using the saem cache for all bdds. Could make a new combinator and have the bdd own the func cache.

from typing import NamedTuple
import operator

import numpy as np
from dataclasses import dataclass
import dataclasses
import functools

type InLabel = object
type Out = object

class Const(NamedTuple):
    bdd : "BDD"
    val : Out
    def liftA2(self, op, other):
        return other.map(lambda x: op(self.val,x))
    def map(self, f):
        return self.bdd.const(f(self.val))
    # caching achieves a dynamic programming like effect
    @functools.cache
    def __and__(self, x):
        return x.map(lambda y: self.val & y)
    @functools.cache
    def __or__(self, x):
        return x.map(lambda y: self.val | y)
    @functools.cache
    def __invert__(self):
        return self.bdd.const(False if self.val else True)
    def __len__(self):
        return 1 if self.val else 0
    def __eq__(self, other):
        return self is other
    def __hash__(self):
        return hash(id(self))
type BoolExpr = "Const" | "ITE" 

class ITE(NamedTuple):
    bdd : "BDD"
    var : InLabel
    left : BoolExpr
    right : BoolExpr

    def map(self, f):
        left = self.left.map(f)
        right = self.right.map(f)
        if left == right:
            return left
        else:
            return self.bdd.ite(self.var, left, right)

    def __invert__(self):
        return self.map(lambda x: not x)

    def liftA2(self, op, other): # map2
        """
        BDD is basically [Bool] -> Bool. This is an applicative functor
        """
        if isinstance(other, Const):
            return self.map(lambda x: op(x , other.val))
        elif isinstance(other, ITE):
            if other.var == self.var:
                var = self.var
                left = self.left.liftA2(op, other.left)
                right = self.right.liftA2(op, other.right)
            elif other.var > self.var:
                var = other.var
                left = self.liftA2(op, other.left)
                right = self.liftA2(op, other.right)
            elif other.var < self.var:
                var = self.var
                left = self.left.liftA2(op, other)
                right = self.right.liftA2(op, other)
            else:
                raise ValueError("incomparable variables")
            if left == right:
                return left
            else:
                return self.bdd.ite(var, left, right)
        else:
            raise ValueError("unexepcted", other)
    # caches aren't bdd specific.
    @functools.cache
    def __and__(self, other):
        return self.liftA2(operator.and_, other)
    @functools.cache
    def __or__(self, other):
        return self.liftA2(operator.or_, other)
    @functools.cache
    def __xor__(self, other):
        return self.liftA2(operator.xor, other)
    @functools.cache
    def __len__(self):
        return len(self.left) + len(self.right)
    def __eq__(self, other):
        return self is other
    def __hash__(self):
        return hash(id(self))


@dataclass
class BDD():
    #table : dict
    def __init__(self):
        self.table = {}
        self.TRUE = self.const(True)
        self.FALSE = self.const(False)
    def _hashcons(self, x : tuple) -> tuple:
        assert isinstance(x, tuple)
        assert all(t.bdd is self for t in x if isinstance(t, ITE) or isinstance(t, Const))
        key = x[1:] # remove bdd field?  #tuple(id(t) if isinstance(t, self.Node) else t for t in x)
        res = self.table.get(key)
        if res is None:
            self.table[key] = x
            return x
        else:
            return res
    def const(self, v):
        return self._hashcons(Const(self, v))
    def ite(self, c, t, e):
        return self._hashcons(ITE(self, c,t,e))
    def Var(self, x : object):
        return self.ite(x, self.TRUE, self.FALSE)
    def Vars(self, names : str):
        return [self.Var(name) for name in names.split()]

bdd = BDD()
x,y,z = bdd.Vars("x y z")
assert x & y is (y & x | bdd.FALSE)

# simple global table version over tuples
table = {}
def hashcons(x : tuple[...]) -> tuple[...]:
    """
    hashcons returns the actual memory of a canonical tuple assuming that all subterms have already been hashconsed.
    Now one can use `is` to compare canonical terms for equality, which is pointer equality and fast.
    """
    assert isinstance(x, tuple)
    key = tuple(id(t) if isinstance(t, tuple) else t for t in x) # avoids recursion into inner tuples during hashing and equality check internal to table.get
    res = table.get(key)
    if res is None:
        table[key] = x
        return x
    return res

from typing import Protocol
class HashSig(Protocol):
    Node : type
    table : dict


class HashCons():
    def __init__(self):
        self.table = {}
        class Node(tuple):
            def __hash__(self): 
                print("hash", self)
                return hash(id(self))
            def __eq__(self, other): return self is other  #
        self.Node = Node
    def hashcons(self, x : tuple) -> tuple: # "self.Node"
        if isinstance(x, self.Node):
            return x
        assert isinstance(x, tuple)
        assert all(not isinstance(t, tuple) or isinstance(t, self.Node) for t in x)
        key = x #tuple(id(t) if isinstance(t, self.Node) else t for t in x)
        res = self.table.get(key)
        if res is None:
            x1 = self.Node(x)
            self.table[key] = x1
            return x1
        else:
            return res
h = HashCons()
a = h.hashcons(('a', 1, 2))
a1 = h.hashcons(('a', 1, 2))
a is ("a", 1, 2)
a[0]

h2 = HashCons()
b = h2.hashcons(('a', 1, 2))
print("here")
assert a is not b

h2.hashcons((b,))
print("there")
# h.hashcons((b,)) # error

And more spelled out. Using the generic map2 loses ought on short circuit evaluation.

    def __and__(self, other):
        if other is True:
            return self
        elif other is False:
            return False
        elif isinstance(other, ITE):
            if other.var == self.var:
                var = self.var
                left = self.left & other.left # still not right
                right = self.right & other.right
            elif other.var > self.var:
                var = other.var
                left = self & other.left
                right = self & other.right
            elif other.var < self.var:
                var = self.var
                left = other & self.left
                right = other & self.right
            else:
                raise ValueError("incomparable variables")
            if left == right:
                return left
            else:
                return ITE(var, left, right)
        else:
            raise ValueError("unexepcted", other)

Generalizing. We can hold anything in the leaves. We can hold any finite casing idea. piecewise branching, red,green,blue etc.

Tries.

class Cases(NamedTuple):
    var : object
    cases : list[tuple[object, ]]

A sat solver like context interface recording cnf instead of bdds.

import functools
from dataclasses import dataclass
Lit = int

@dataclass
class Ctx():
    cnf : list[list[Lit]]
    counter : int
    def __init__(self):
        self.cnf = []
        self.counter = 0
    def var(self, name=None):
        self.counter += 1
        return Var(self.counter, name, self)
    def apply(self, f, *args):
        res = self.var()
        lits = [l.idx for l in args]
        lits.append(res.idx)
        self.cnf.extend(clausify(lits, lambda *args: args[-1] == f(*args[:-1])))
        return res
    def add(self, v):
        # force v to be true. A way to add constraints
        self.cnf.append([v.idx])
    def check(self):
        cnf = pysat.CNF(from_clauses=self.cnf)
    def to_dimacs(self):
        out = [f"p cnf {len(self.cnf)} {self.counter}]"]
        for cls in self.cnf:
            out.append(" ".join(str(l) for l in cls) + " 0")


@dataclass
class Var():
    name : str
    idx : int # assume positive. We could inline negation a little bit.
    def __init__(self, idx=None, name=None, ctx=None):
        self.idx = idx
        self.name = name
        self.ctx = ctx    
    # This might memoize. Overloading _eq__ makes things a touch fishy.
    #@functools.cache
    def __or__(self, other): return self.ctx.apply(lambda x,y: x or y, self, other)
    def __invert__(self): return self.ctx.apply(lambda x: not x, self)
    def __and__(self, other): return self.ctx.apply(lambda x,y: x and y, self, other)
    def __eq__(self, other): return self.ctx.apply(lambda x,y: x == y, self, other)
    def __hash__(self): return hash(self.idx)


ctx = Ctx()
a = ctx.var("a")
b = ctx.var("b")

(a | b) & b

c = a | b
~a 
d = a | b
print(d)
print(c)
# some memoization might be nice
ctx.add(d == c)
ctx

Embedding Bdds in egglog as if then else expressions.

; Binary Decision Diagrams are if-then-else trees/ compressed tries that hash cons their leaves
; This is easily expressible in the facilities provided. Everything in egg-smol is automatcally shared
; and Compression is easily expressible as a rule.

; They are a notion of first class set useful for certain classes of uniformly describable sets.
; https://en.wikipedia.org/wiki/Binary_decision_diagram
; https://www.lri.fr/~filliatr/ftp/publis/hash-consing2.pdf Type-Safe Modular Hash-Consing - Section 3.3

(datatype BDD
    (ITE i64 BDD BDD) ; variables labelled by number
    (True)
    (False)
)

; compress unneeded nodes
(rewrite (ITE n a a) a)

(function and (BDD BDD) BDD)
(rewrite (and (False) n) (False))
(rewrite (and n (False)) (False))
(rewrite (and (True) x) x)
(rewrite (and x (True)) x)
; We use an order where low variables are higher in tree
; Could go the other way.
(rewrite (and (ITE n a1 a2) (ITE m b1 b2))
    (ITE n (and a1 (ITE m b1 b2)) (and a2 (ITE m b1 b2)))
    :when ((< n m))
)
(rewrite (and (ITE n a1 a2) (ITE m b1 b2))
    (ITE m (and (ITE n a1 a2) b1) (and (ITE n a1 a2) b2))
    :when ((> n m))
)
(rewrite (and (ITE n a1 a2) (ITE n b1 b2))
    (ITE n (and a1 b1) (and a2 b2))
)

(define b0 (ITE 0 (True) (False)))
(define b1 (ITE 1 (True) (False)))
(define b2 (ITE 2 (True) (False)))

(define b123 (and b2 (and b0 b1)))
(define b11 (and b1 b1))
(define b12 (and b1 b2))
(run 5)
(extract b11)
(extract b12)
(extract b123)
(check (= (and (ITE 1 (True) (False)) (ITE 2 (True) (False)))
       (ITE 1 (ITE 2 (True) (False)) (False)))
)
;(check (= b123 (ITE 3 ()))

(function or (BDD BDD) BDD)
(rewrite (or (True) n) (True))
(rewrite (or n (True)) (True))
(rewrite (or (False) x) x)
(rewrite (or x (False)) x)
(rewrite (or (ITE n a1 a2) (ITE m b1 b2))
    (ITE n (or a1 (ITE m b1 b2)) (or a2 (ITE m b1 b2)))
    :when ((< n m))
)
(rewrite (or (ITE n a1 a2) (ITE m b1 b2))
    (ITE m (or (ITE n a1 a2) b1) (or (ITE n a1 a2) b2))
    :when ((> n m))
)
(rewrite (or (ITE n a1 a2) (ITE n b1 b2))
    (ITE n (or a1 b1) (or a2 b2))
)

(define or121 (or b1 (or b2 b1)))
(run 5)
(extract or121)

(function not (BDD) BDD)
(rewrite (not (True)) (False))
(rewrite (not (False)) (True))
(rewrite (not (ITE n a1 a2)) (not (ITE n (not a1) (not a2))))

(function xor (BDD BDD) BDD)
(rewrite (xor (True) n) (not n))
(rewrite (xor n (True)) (not n))
(rewrite (xor (False) x) x)
(rewrite (xor x (False)) x)
(rewrite (xor (ITE n a1 a2) (ITE m b1 b2))
    (ITE n (xor a1 (ITE m b1 b2)) (or a2 (ITE m b1 b2)))
    :when ((< n m))
)
(rewrite (xor (ITE n a1 a2) (ITE m b1 b2))
    (ITE m (xor (ITE n a1 a2) b1) (or (ITE n a1 a2) b2))
    :when ((> n m))
)
(rewrite (xor (ITE n a1 a2) (ITE n b1 b2))
    (ITE n (xor a1 b1) (xor a2 b2))
)