String Knuth Bendix
I’ve become entranced by all the varieties of the knuth bendix completion algorithm as of late.
Knuth Bendix completion is a kind of equational reasoning algorithm. Given some generators of an equivalence relation like X + 0 = X
it converts them into a “good” rewriting system like X + 0 > X
.
Knuth Bendix does not have to be over terms. It can be over other things that have a notion of overlapping and ordering, graphs, strings, multisets, polynomials, etc. There are ways perhaps of modelling these things as terms, which is a unifying, but there can be computational and mental overhead in doing so.
Strings are in particular an interesting example and actually the original thing Knuth and Bendix were considering (I think).
When we say strings, we mean the same thing as sequences. There isn’t anything intrinsically textual about what we’re doing.
Having said that, it is neat that you can use the commonly available string manipulation libraries to achieve string rewriting needs. String rewriting is the analog of repeatedly applying find/replace rules.
Try it out on colab: https://colab.research.google.com/github/philzook58/philzook58.github.io/blob/master/pynb/20240909string_knuth.ipynb
def rewrite(s : str, R : list[tuple[str,str]]) > str:
"""Fully reduce s by rewrite rules R"""
s0 = ""
while s0 != s:
s0 = s
for lhs,rhs in R:
s = s.replace(lhs,rhs)
return s
assert rewrite("aaabbbaa", [("aa", "a")]) == "abbba" # remove duplicate a
assert rewrite("aaabbbaa", [("aa", "a"), ("bb", "b")]) == "aba" # remove duplicates
assert rewrite("aaabbbaa", [("aa", "a"), ("bb", "b"), ("ba", "ab")]) == "ab" # bubble sort
But actually for my purposes, I don’t want to be restricted to just sequences of characters, so I need to rebuild some functionality we had with python strings using python tuples.
Rewriting Tuples
In order to do matching, I need to know when one tuple is a substring (I originally called this subsequence, but have been informed that has another technical meaning).
def subseq(s,t):
"""Return index when s is a subsequence of t, None otherwise"""
for i in range(len(t)  len(s) + 1):
if s == t[i:i+len(s)]:
return i
return None
assert subseq((2,2), (1,1,1,2,2)) == 3
assert subseq((2,2), (1,1,1,2,2,2)) == 3
assert subseq((), (1,2,3)) == 0
assert subseq((3,4), (4,5,3)) is None
assert subseq((3,4), (4,5,3,4)) == 2
Next, we need to reimplement the replace
functionality of a single replacement rule.
def replace(s, lhs, rhs):
""""""
i = subseq(lhs, s)
if i is not None:
return s[:i] + rhs + s[i+len(lhs):]
else:
return s
assert replace((1,2,3,4), (2,3), (5,6)) == (1,5,6,4)
assert replace((1,2,3,4), (2,3), (5,6,7)) == (1,5,6,7,4)
assert replace((1,2,3,4), (2,3), (5,6,7,8)) == (1,5,6,7,8,4)
assert replace((1,1), (4,4), (2,2)) == (1,1)
Finally we can fully reduce tuples now
def rewrite(s, R, exclude=1):
# exclude is useful for simplifying a rule
while True:
s0 = s
for i,(lhs,rhs) in enumerate(R):
if i != exclude:
s = replace(s,lhs,rhs)
if s == s0:
return s
assert rewrite((1,2,3,4), [((2,3), (5,6)), ((5,6), (7,8))]) == (1,7,8,4)
assert rewrite((1,1,1,1,1,1), [((1,1), ())]) == ()
assert rewrite((1,1,1,1,2,1), [((1,1), ())]) == (2,1)
Knuth Bendix
So we’ve implemented running our string rewrite system. Nice.
The next piece I need is to implement our term ordering. The concept of a term ordering I think is the key point of term rewriting that is both nearly trivial in some sense but also not at all obvious.
A key property that is very useful in a rewriting system (or any algorithm) is termination. When we want to talk mathematically about termination, we talk about well founded relations. https://en.wikipedia.org/wiki/Wellfounded_relation Well founded transition relations are the ones that are guaranteed to terminate.
Orderings are also binary relations and we can cook up all sorts of ways of ordering things based on different intuitions and needs. If we choose an ordering that is well founded, then when the right hand sides of our rewrites are “smaller” than our left hand sides, our rewrite system is guaranteed to terminate.
Actually implementing the typical orderings (LPO, RPO, KBO) for terms is quite confusing. The basics string orderings are more straightforward.
We want short strings, so we can order by length. To tie break, we can use the build in lex ordering. This is called the shortlex ordering.
def shortlex(s,t):
""" order by length, then tie break by contents lex"""
if len(s) < len(t):
return t,s
elif len(s) > len(t):
return s,t
elif s < t:
return t,s
elif s > t:
return s,t
else:
assert False
Here’s things that are a bit trickier. In Knuth Bendix we need to find all non trivial overlaps of left hand sides in rules. These are the possible sources of nonconfluence aka the rule order mattering.
def overlaps(s,t):
"""critical pairs https://en.wikipedia.org/wiki/Critical_pair_(term_rewriting)"""
# make len(t) >= len(s)
if len(t) < len(s):
s,t = t,s
if subseq(s,t) is not None:
yield t
# iterate over possible overlap sizes 1 to the len(s) at edges
for osize in range(1,len(s)):
if t[osize:] == s[:osize]:
yield t + s[osize:]
if s[osize:] == t[:osize]:
yield s + t[osize:]
assert set(overlaps((1,2), (2,3))) == {(1,2,3)}
assert set(overlaps((1,2), (3,2))) == set()
assert set(overlaps((1,2), (2,1))) == {(1,2,1), (2,1,2)}
assert set(overlaps((1,2), (1,2))) == {(1,2)}
assert set(overlaps((2,2), (2,2,3))) == {(2,2,3), (2,2,2,3)}
assert set(overlaps((), (1,2))) == {(1,2)} # Hmm. Kind of a weird edge case
Given all these pieces, knuth bendix is a pretty simple almost brute force way of inferring equations and rules turning a set of equations into a good set of rules.
In a big loop we
 For each pair of rules, we try to find a critical pair and add this to E
 Reduce all equations in E with respect to R to remove redundancy
 Orient equations in E to make them rewrite rules in R
def deduce(R):
"""deduce all possible critical pairs from R"""
for i, (lhs,rhs) in enumerate(R):
for j in range(i):
lhs1,rhs1 = R[j]
for o in overlaps(lhs1,lhs):
x,y = rewrite(o, [(lhs1, rhs1)]), rewrite(o, [(lhs, rhs)])
if x != y:
yield x,y
def KB(E):
E = E.copy()
R = []
done = False
while not done:
done = True
E.extend(deduce(R))
while E:
lhs, rhs = E.pop()
lhs, rhs = rewrite(lhs,R), rewrite(rhs,R)
if lhs != rhs:
done = False
lhs, rhs = shortlex(lhs,rhs)
R.append((lhs, rhs))
return R
Here’s a nice example from https://haskellformaths.blogspot.com/2010/05/stringrewritingandknuthbendix.html .
a
and b
generate rotations and flips of a square. These equations aren’t normalizing though as is. We can run them through Knuth Bendix to get a normalizing set of rules.
A nice trick is to encode the inverse as a negative. These are opaque identitifiers and I have done nothin special in KB for accounting for inverses, but it is nice from a python syntax perspective.
e = 0
a = 1 # a is rottate square
b = 2 # b is flip square horizontally.
E = [
((a, a), ()), # inverse b * b = 1
((b,b), ()), # inverse a * a = 1
((a,a,a,a), ()), # a^4 = 1
((b,b), ()), # b^2 = 1
((a,a,a,b), (b,a)) #a^3 b = ba
]
print(E)
print(KB(E))
[((1, 1), ()), ((2, 2), ()), ((1, 1, 1, 1), ()), ((2, 2), ()), ((1, 1, 1, 2), (2, 1))]
[((1, 1, 1, 2), (2, 1)), ((2, 2), ()), ((1, 1, 1, 1), ()), ((2, 2), ()), ((1, 1), ()), ((1, 1, 1), (1,)), ((1, 1, 2), (1, 2, 1)), ((2,), (2,)), ((1, 2, 1), (2,)), ((1, 2, 1, 1), (1, 2)), ((1, 2, 1), (2, 1, 1)), ((2, 1, 2), (1,)), ((2, 1, 1, 2), (1, 1)), ((1, 1), (2, 2)), ((2, 2), ()), ((1, 2), (2, 1)), ((1, 2), (2, 1)), ((1, 1), (1, 1)), ((1, 1, 1), (1,))]
It’s a big pile. Here is a post processing to simplify the redundancies in the rules. A more efficient knuth bendix, like Huet’s algorithm, would do this as it goes. The simplification of the left hand side of rules is a bit tricky. I’m not particularly sure I did it right here.
def simplify(R):
Rnew = []
E = []
for i, (lhs,rhs) in enumerate(R):
# lhs = reduce(Rnew)
lhs1 = rewrite(lhs, R, exclude=i) # Lsimplify. nebulous correctness. I might be playing it both ways here. I keep around the old R even though I should have moved it to E?
rhs1 = rewrite(rhs, R) # Rsimplify
if lhs1 == lhs:
Rnew.append((lhs,rhs1))
elif lhs1 != rhs1:
E.append((lhs1, rhs1))
return E,Rnew
R = KB(E)
E,R = simplify(R)
R
[((1, 1), ()),
((2,), (2,)),
((1, 1), ()),
((2, 2), ()),
((1, 2), (2, 1)),
((1, 2), (2, 1)),
((1, 1), (1, 1)),
((1, 1, 1), (1,))]
We’ve deduced that $b^{1} = b$ for the flips.
Just to confirm I haven’t lost completeness, I can derive all of the original axioms using my new simplified system R.
E = [
((a, a), ()), # inverse b * b = 1
((b,b), ()), # inverse a * a = 1
((a,a,a,a), ()), # a^4 = 1
((b,b), ()), # b^2 = 1
((a,a,a,b), (b,a)) #a^3 b = ba
]
for x,y in E:
print(rewrite(x, R), rewrite(y, R))
() ()
() ()
() ()
() ()
(2, 1) (2, 1)
Bits and Bobbles
This post was aided by many discussions with Cody Roux, my friend / personal term rewriting serf.
Note that GAP already ships a string knuth bendix. This is accessible from python via sage https://www.sagemath.org/
#https://doc.sagemath.org/html/en/reference/groups/sage/groups/finitely_presented.html
from sage.all import *
F = FreeGroup(3)
x0,x1,x2 = F.gens()
G = F / [x0**2, x1 * x0]
G
F([1])
F.gap()
G.relations()
k = G.rewriting_system()
k.make_confluent()
k
G.simplification_isomorphism()
Where I’m going with my next post is to tie in string rewriting to an egraph giving a “seqeunce egraph”. The “characters” of the string are ground terms. Because this egraph embeds string rewriting, it isn’t guaranteed to terminate.
I want this because programs often have sequence of instructions. Instrinsic associativity in the style of https://www.philipzucker.com/linear_grobner_egraph/ seems useful.
https://github.com/sympy/sympy/blob/master/sympy/combinatorics/rewritingsystem.py hmm also here
https://github.com/gappackages/kbmag
 https://www.sciencedirect.com/science/article/pii/S1567832610001025 efficiency issues of kbmag  swan https://www.cs.stir.ac.uk/~kjt/techreps/pdf/TR197.pdf Augmenting Metaheuristics with Rewriting Systems
 https://maffsa.sourceforge.net/old_manpages/maf.html MAF
 rkbp
Useful books:
 Epstein book Word Processing in Groups
 Handbook of computational group theory
 Charles Sims book  Computation with Finitely Presented Groups
 Term Rewriting and All That
https://community.wolfram.com/groups//m/t/3217387 Monoids, stringrewriting, confluence, and the KnuthBendix Algorithm  Mathematica
https://haskellformaths.blogspot.com/2010/05/stringrewritingandknuthbendix.html
The automata anlge is really intriguing, becayse I’d hope I could use off the shelf high perf stuff to achieve the rewriting. Maybe BurntSushi?
Word processing in groups automatic groups https://en.wikipedia.org/wiki/Automatic_group word acceptor. Is this for partiality? the valid stings = {w  accept(w)} ? Could have automata that identifies strings that correspond to particular elements wa=v is the same asa recognizer a=i(w)v
What’s the deal with the automata stuff
Twee sorting. If we specialize commutiativyt, a very bad rule, to the individual guys, it becopmes sorting rules
This rewrite system will sort.
Computational group theory. Orbits. Caleb’s graph hashing. Cayley Graph. Nauty
Free monoid I can totalize the groupoid of paths into path fragment sequences? Use string KB a la grobner egraphs. String
shortlex
https://math.stackexchange.com/questions/684066/knuthbendixcompletionalgorithmwordproblem
https://docs.gapsystem.org/pkg/kbmag/doc/manual.pdf
Strings are nice mathematically because they bake in an associative operation of concatenation, which we use to model other associative operations like group multiplication. Most typical algebraic thingies have associativity, and cutting out associativty computationally by brute force is awful. We often bake in associativty into our notation by just forgetting to bother with parenthesis, which makes proofs incvolving associativity so trivial as to be beneath notice.
Strings can also represent paths.

Strings can be modelled as terms. The string
xyz
in a left string contextA
and a right string contextB
,AxyzB
, can be modelled in a couple different way:x(y(z(B)))
orcons(x,cons(y,cons(z,cons(B)))
or reversedlyz(y(x(A)))
. It’s interesting that we tend to make the string contexts A & B implicit, whereas the upper term context is usually implicit but the lower child context is not implicit and instead represented by variables. Food for thought.  It is to my knowledge difficult to shallowly embed term rewriting into string rewriting. It is possible to embed because string rewriting is turing complete, so the brute force solutions is build a turing machine into string rewriting, then write a program to do term rewriting on that turing machine. Yuck.
 Ground term rewriting is trivially embeddable into string rewriting. Just flatten / pretty print the term. The difference here is that term rewriting supports (multiple) variables, which is tough in strings. When we look at critical pairs, in ground terms the ONLY way to have non trivial overlap is for on term to be a subterm of the other. There are no interesting partial overlaps.
String knuth bendix is much simpler than term knuth bendix because the notion of overlap and reduction is so much simpler. You don’t need to go into some side discussion of unification and narrowing, which are topics into their own right.
A Interesting trick to if I need bulk rewriting, perhaps concat a special symbol in between my different strings I want reduced "$".join(string_db).replace()
Regular lexicographic order is not well founded b > ab > aab > aaab > …
https://www.philipzucker.com/ground_kbo/
Charles Sim book has heuristics, suggestions about indexing Automata for indexing?
Hyperscan. Burntsushi. Geoff langdale Can their tricks be applied or are those tricks about regex?
A way of generating all the normal forms is mentioned for this finite group. Enumerate all strings formed by appending the generators to something else that was a normal form.
def nfs(gens, R):
nf = [()]
seen = 0
while seen < len(nf):
s = nf[seen]
seen += 1
for g in gens:
t = (g,) + s
t1 = rewrite(t, R)
if t == t1:
nf.append(t)
return nf
nfs([1,2], KB(E))
[(), (1,)]
def subseq(s,t):
for i in range(len(t)  len(s)):
if s == t[i:i+len(s)]:
return i
return None
def overlaps(s,t):
if len(t) < len(s):
s,t = t,s
# t is bigger than s
#overlap_size == 0
#for i in range(len(t)  len(s)):
# if s == t[i:i+len(s)]:
if subseq(s,t) is not None:
yield t
for osize in range(1,min(len(s), len(t)) + 1):
if t[osize:] == s[:osize]:
yield t + s[osize:]
if s[osize:] == t[:osize]:
yield s + t[osize:]
list(overlaps( (2,2,2,1,2), (1,2,2) ))
[(1, 2, 2, 2, 2, 1, 2), (2, 2, 2, 1, 2, 2), (1, 2, 2, 2, 1, 2)]
proofs. don’t delete anything. Keep the active simplified set though. R = [] # write only explain = [] Rmin = {} # simmplified ids or tuples without exaplain
R = [((lhs,rhs),explain)] Rmin = {(lhs,rhs)} … {lhs:rhs} (?)
A trie / prefix trie seems useful.
preallocate numpy (?) for numba jit? np.zeros((N, M))
def huet(E,R):
R = set()
unmarked = []
while len(E) > 0 or len(unmarked) > 0:
while E:
(lhs,rhs) = E.pop()
lhs,rhs = rewrite(lhs, R), rewrite(rhs, R)
if lhs == rhs:
continue
lhs,rhs = shortlex(lhs,rhs)
Rnew = [(lhs,rhs)]
unmarked.append((lhs,rhs))
for (g,d) in R:
g1 = replace(g, (lhs,rhs))
if g1 == g:
d = rewrite(d, R)
Rnew.append((g,d))
else:
E.append((g1,d))
R = Rnew
if unmarked:
(lhs,rhs) = unmarked.pop()
for (g,d) in R:
for o in overlaps(g,lhs):
x = rewrite(o, R)
y = rewrite(o, [(lhs,rhs)])
if x != y:
E.append((x,y))
Knuth bendix can be made more goal driven. We don’t persay need to fully ocmplete the system if we just want to prove some particular equality.
Similarly, we can early stop knuth bendix is all we want is a simplified term. It isn’t obvious if we can show that a term is globally maximally simplified until KB finishes though.
def KBgoal(E, goal):
"""
goal driven KB.
"""
gl, gr = goal
while len(E) > 0:
if gl == gr:
# We can do early stopping given a particular goal.
return True
E,R = KB1(E, R)
gl, gr = reduce(gl, R), reduce(gr, R)
return False
# could also do binary combinations as goals. and / or / not of = !=
to be even more goal driven, we should generate critical pairs coming from the goal terms first. This is a bit (exactly?) like set of support. mark new rules as “tainted” by a goal term. This is interesting in the simplifcation driven context also. If we insist on a particular hyperesolution / UR resolution total grounding, we get egraph saturation.
The only thing we need to make egraph saturation complwete is throw in a KB step on the rules every once in a while.
def overlaps(s,t):
if len(s) > len(t):
s,t = t,s
for i in range(1, min(len(s), len(t)) + 1):
if s[i:] == t[:i]:
yield i
for i in range(len(t)  len(s)):
if s == t[i:i+len(s)]:
return i
"""
for i, (lhs,rhs) in enumerate(R):
for j in range(i):
lhs1,rhs1 = R[j]
for o in overlaps(lhs1,lhs):
E.append((reduce(o, [(lhs1, rhs1)]), reduce(o, [(lhs, rhs)])))
"""
e = 0
a = 1 # a is rottate square
b = 2 # b is flip square horizontally.
E = [
((a, a), ()), # inverse b * b = 1
((b,b), ()), # inverse a * a = 1
#((e,a), (a,)),
#((e,b), (b,)),
#((e,a), (a,)),
#((e,b), (b,)),
((a,a,a,a), ()),
((b,b), ()),
((a,a,a,b), (b,a))
]
print(E)
print(KB(E))
Old notes String rewriting systems
SemiThue systems Word problem Monoid presentation
converting to term rewriting system fff —> f(f(f(X)))
Term Rewriting using linux command line tools
The string search and manipulation tools are very powerful and efficient. They compile queries like grep into simple machines and such I think.
There’s a big difference between ground and non ground terms. They appear subtly different when latexed, but the are way different beats Ground terms equation proving can be done through the e graph efficiently.
Ground term rewriting seems to be identical to string rewriting. Just layout serially a traversal of the term.
The implicit prefix and suffix are the context of the term
rules = [
("aa", "b"),
("b", "c")
]
def run_rules(s,rules):
old_s = None
while s != old_s:
old_s = s
for lhs,rhs in rules:
s = s.replace(lhs,rhs)
return s
print(run_rules("ababaaaaccaaa", rules))
def naive_completion(rules):
old_rules = None
while rules != old_rules:
old_rules = rules.copy()
for (lhs,rhs) in old_rules:
a = run_rules(lhs, rules)
b = run_rules(rhs, rules)
if a == b:
break
if a < b:
rules.add((b,a))
if a > b:
rules.add((a,b))
return rules
# an incomplete reduction routine?
# Triangular rewrite rules in some sense.
# Is this right at all? This is like a chunk of Huet's. I think just moving R > E might be ok even if not one of listed rules. No, if I could do that ths move + simplfy would give me a more powerful Rsimplify.
# I might be weakening my rules. That's not so bad imo.
def reduce_rules(E):
# worklist style
R = set()
# Sort E so smallest last probably. Most reduction power.
E = sorted(E, key=lambda k: len(k[0]), reverse=True)
while len(E) > 0:
(a,b) = E.pop()
a = run_rules(a, R) # simplify
b = run_rules(b, R)
print(a,b)
if a == b: #delete
continue
if (len(b), b) > (len(a), a): # len then lex ordering
R.add((b,a))
if (len(a), a) > (len(b), b):
R.add((a,b))
return R
rules = {
("aaa", "a"),
("aaa", "c")
}
print(naive_completion(rules))
rules = [
("ffa", "a"),
("fa", "a")
]
print(run_rules("ffffffffffffa", rules))
print(reduce_rules(rules))
rules = [
("12+", "21+"), # an application of comm
("23+1+", "2+31+") # an application of assoc
]
# I am really going to want a notion of indirection or compression here.
# Intern strings
class RPN():
def __init__(self, s):
self.s = str(s)
def __add__(self,b):
return RPN(self.s + b.s + "+")
def __repr__(self):
return self.s
b0 = RPN(0)
b1 = RPN(1)
b2 = RPN(2)
b3 = RPN(3)
E = [
(b1 + b0, b0 + b1),
(b0 + b1, b1),
(b1 + b2, b3),
(b2, b1 + b1)
]
E = [
("10+", "01+"),
("01+", "1"),
("12+", "3"),
("2", "11+"),
("00+", "0"),
("00+1+1+1+", "3"),
]
print(reduce_rules(E))
E = reduce_rules(E)
print(run_rules("00+1+0+1+0+2+",E))
E = [( str(i) + str(j) + "+" , str(i + j)) for i in range(4) for j in range(10) if i + j <= 9]
print(E)
print(reduce_rules(E))
print(run_rules("00+1+0+1+0+2+",E))
Ropes
We can of course compile a rule set to do better than this. In some sense every string represents a possibly infinite class of strings posible by running rules in reverse
String rewriting systems are a bit easier to think about and find good stock library functionality for.
string rewriting is unary term rewriting. A variable string pattern “aaaXbbbb” is a curious object from that perspective. While simple, it is a higher order pattern. a(a(a(X(b(b(b(Y)))))))
. You can also finitize a bit. foo(a)
can be made an atomic character. Or you can partially normalize a term rewriting system to string rewriting form
String orderings Lexicographic comparison Length shortlex  first length, then lex symbol counts
def critical_pairs(a,b):
assert len(a) > 0 and len(b) > 0
if len(b) <= len(a):
a,b = b,a # a is shorter
cps = []
if b.find(a) != 1: # b contains a
cps.append(b)
for n in range(1,len(a)): # check if overlapping concat is possible
if b[n:] == a[:n]:
cps.append(b + a[n:])
if b[:n] == a[n:]:
cps.append(a + b[n:])
return cps
print(critical_pairs("aba", "ac"))
print(critical_pairs("aba", "ca"))
print(critical_pairs("abab", "ba"))
'''
def reduce_rules(rules): # a very simple reduction, reducing rhs, and removing duplicate rules
rules1 = set()
for (lhs,rhs) in rules:
rhs = run_rules(rhs,rules)
rules1.add((lhs,rhs))
return rules1
'''
#run_rules
#reduce_rules
Building a suffix tree might be a good way to find critical pairs.
Lempel Ziv / Lzw is the analog of hash consing? Some kind of string compression is. That’s fun.
http://haskellformaths.blogspot.com/2010/05/stringrewritingandknuthbendix.html
It seems like named pattern string rewriting and variable term rewriting might be close enough
You could imagine
((x + 0) + 0 + 0) laid out as + + + x 0 0 0. and the found ground rewite rule + x 0 > x being applied iteratively.
Labelling shifts: f(g(a), b) > f +0 g +1 a 1 b 0
the pattern
f(?x) > ?x
becoming
f +0
And if we number from the bottom?
f +2
and