This is a follow-up to this question and Deedlit's answer.
I'm looking for a precise definition of the "hem?" (tree A homeomorphically embeddable in tree B?) relation, preferably in terms of a runnable program or function in some programming language, that accepts two trees (see below) as arguments and returns either true or false.
I've found many definitions on the web but no actual code so far. Deedlit gives the following definition:
Given two trees, S and T, we use the following comparison algorithm. First check, inductively, if S is less than or equal to any of the immediate subtrees of T; if so, then S < T. Similarly, check if T is less than or equal to any of the immediate subtrees of S; if so, then T < S. If neither of those checks apply, then compare the number of children of the root of S to the number of children of the root of T; the tree with the larger number is greater. Finally, if the roots of S and T have the same number of children, compare the immediate subtrees of S and T one by one, starting from the smallest pair, then going to the second smallest pair, etc. The first time you find two different immediate subtrees, the greater of the two will belong to the greater original tree.
Not sure if I understand that correctly. An actual implementation would be nice.
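One literal transcription of the quoted comparison, as a minimal sketch and not necessarily what Deedlit intended. The assumptions here are not from the answer: trees are unlabeled and written as nested vectors of children (so [] is a single node), the names compare-trees and sorted-children are made up, and the children are sorted with the same comparison before being compared pair by pair. Note that this gives a three-way comparison (less, equal, greater), not a true/false embedding test.

```clojure
;; Assumption: an unlabeled tree is a vector of its immediate subtrees,
;; e.g. [] is a single node and [[] [[]]] is a root with two children.
(declare compare-trees)

(defn- sorted-children
  "Immediate subtrees of t, smallest first (hypothetical helper)."
  [t]
  (sort compare-trees t))

(defn compare-trees
  "Returns a negative number, zero, or a positive number when s is
  less than, equal to, or greater than t under the quoted rules."
  [s t]
  (cond
    ;; s <= some immediate subtree of t  =>  s < t
    (some #(<= (compare-trees s %) 0) t) -1
    ;; t <= some immediate subtree of s  =>  t < s
    (some #(<= (compare-trees t %) 0) s) 1
    ;; otherwise the root with more children wins
    (not= (count s) (count t)) (compare (count s) (count t))
    ;; same number of children: compare them pairwise, smallest pair first;
    ;; the first non-zero result decides, and all zeros means equal
    :else (or (first (remove zero?
                             (map compare-trees
                                  (sorted-children s)
                                  (sorted-children t))))
              0)))
```

For example, (compare-trees [] [[]]) is negative because a single node is less than or equal to the only immediate subtree of [[]], and (compare-trees [[] []] [[[]]]) is positive because neither subtree check applies and the first root has more children. The recursion re-compares subtrees many times, so this is only meant for small trees.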
For comparison, one might define the well-known "substring?" relation (strictly speaking the subsequence relation: x can be obtained from y by deleting characters), which is a well-quasi-order on strings, as follows (using Clojure, but I don't really care about the choice of programming language; they are mostly equivalent):
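    (defn substring? [x y]
      (cond
        (empty? x) true    ; everything in x has been matched
        (empty? y) false   ; y ran out before x did
        :else (recur
                ;; drop the head of x only when it matches the head of y
                (if (= (first x) (first y))
                  (rest x) x)
                (rest y))))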
This definition works if x and y are strings or sequences of characters.
Let's say we limit ourselves to trees with 2 labels, a and b, represented by parens () and brackets [], just like in Deedlit's answer. First, we would probably parse the strings into trees of nested vectors. Using instaparse, the following grammar seems to work:
    (require '[instaparse.core :as insta])

    (def two-label-tree-grammar
      "S = (a | b)
       a = <'('> (a | b)* <')'>
       b = <'['> (a | b)* <']'>")

    (def two-label-tree-parser
      (insta/parser two-label-tree-grammar))
Then (two-label-tree-parser "([()([][])])") evaluates to [:S [:a [:b [:a] [:a [:b] [:b]]]]] (the generic root node :S can be ignored).
How would one programmatically check the "hem?" relation on such nested vectors? There's a wikia page that mentions a "reducibility" relation which might be easier to handle. There seems to be a claim that (hem? A B) and (reducible? B A) are equivalent, but I could not find a proof of the equivalence, or a programmatic definition of the reducible? relation.
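To make the question concrete, here is a minimal sketch of one candidate for hem?, based on the usual recursive characterization of homeomorphic embedding: either A embeds into an immediate subtree of B, or the two roots carry the same label and the immediate subtrees of A embed into pairwise distinct immediate subtrees of B. The assumptions are not settled by anything above: trees are the nested vectors produced by the parser with the [:S ...] wrapper removed, the order of siblings is ignored (unordered trees), and match-children? is a hypothetical helper that tries all assignments by backtracking, so it is exponential in the worst case. It has not been checked against the reducibility relation.

```clojure
;; Assumption: a tree is [label child1 child2 ...], e.g. [:a [:b] [:a]].
(declare hem?)

(defn- match-children?
  "True if every tree in xs embeds into its own, distinct tree in ys.
  Hypothetical helper; plain backtracking over all choices."
  [xs ys]
  (if (empty? xs)
    true
    (let [x (first xs)]
      (boolean
        (some (fn [i]
                (and (hem? x (nth ys i))
                     ;; ys with the i-th tree removed, so it is used only once
                     (match-children? (rest xs)
                                      (concat (take i ys)
                                              (drop (inc i) ys)))))
              (range (count ys)))))))

(defn hem?
  "True if tree a is homeomorphically embeddable in tree b (sketch)."
  [a b]
  (let [[label-a & kids-a] a
        [label-b & kids-b] b]
    (boolean
      (or
        ;; root of a goes to root of b: labels agree and the children
        ;; of a embed into distinct children of b
        (and (= label-a label-b)
             (<= (count kids-a) (count kids-b))
             (match-children? kids-a kids-b))
        ;; or a embeds entirely inside one immediate subtree of b
        (some #(hem? a %) kids-b)))))
```

With the parsed example above, (hem? [:a [:b] [:b]] [:a [:b [:a] [:a [:b] [:b]]]]) returns true, because the smaller tree occurs as a subtree, while (hem? [:a [:b] [:b]] [:a [:b [:b]]]) returns false, because a chain has no node with two children. If sibling order is supposed to matter, match-children? would have to be restricted to order-preserving assignments.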