Imagine you have two systems of delimiting. One uses paired delimiters, [ and ]:
[abc]
The other uses a single interstitial delimiter, /:
a/b/c
It's easy to see how to encode structure in the first case, as they nest cleanly:
[ab[cd[e]f]]
But let's say you want to encode arbitrary nested structures in the second scheme, using only runs of the interstitial delimiter. Whatever the encoding winds up being, a/////b//c////d///e///f would be an example of "following the rules", while //a//b/c///d///e///f/// would not, because delimiters may only appear between elements.
So you're basically able to put a unary-encoded integer from 0..∞ (let's say 0 is /, 1 is //, and so on) between your elements.
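To make the "unary integer between elements" view concrete, here is a minimal Python sketch. The names join_unary, split_unary and follows_rules are made up for illustration, and it assumes tokens are non-empty and never contain / themselves. A value n is written as n + 1 slashes, so 0 is / and 1 is //.

```python
import re

def join_unary(tokens, values):
    """Join tokens, writing each separator value n as n + 1 slashes."""
    if len(values) != len(tokens) - 1:
        raise ValueError("need exactly one value between each pair of tokens")
    out = [tokens[0]]
    for value, token in zip(values, tokens[1:]):
        out.append("/" * (value + 1) + token)
    return "".join(out)

def split_unary(text):
    """Recover (tokens, values) from a slash-delimited string."""
    # The capturing group keeps the slash runs in the result list,
    # so tokens sit at even indices and separators at odd ones.
    parts = re.split(r"(/+)", text)
    tokens = parts[0::2]
    values = [len(sep) - 1 for sep in parts[1::2]]
    return tokens, values

def follows_rules(text):
    """True if slashes appear only between non-empty tokens."""
    tokens, _ = split_unary(text)
    return all(tokens)

print(split_unary("a/////b//c////d///e///f"))
print(follows_rules("//a//b/c///d///e///f///"))
```

The first call reads the separators as the integers 4, 1, 3, 2, 2; the second reports False because of the leading and trailing slashes.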
It's obviously possible to encode, though the results won't necessarily be pretty or visually intuitive. One way is to consider that you have two factors to record at each step:

1. Whether the ensuing token enters a new nesting level or stays at the same one. So for a?b, we want to know whether it's ...a[b... or ...ab...
2. How many nesting levels close after the ensuing token. So for d?e, we want to know whether it's e.g. ...d[e]... or ...d[e]]... or ...d[e]]]...

The first is just a yes or no, and the second is a number which can range from 0 to however many nesting levels are currently open. So multiply the number of levels to drop by 2, then add 1 if you're going a level deeper and leave it alone otherwise.
[ab[cd[e]f]] => a 0 b 1 c 0 d 3 e 4 f => a/b//c/d////e/////f
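Here is a rough Python sketch of that rule, to check the arithmetic. The names separator_values and encode are hypothetical, and the sketch assumes single-character tokens, one outer bracket pair, no empty groups, and at most one level opened per token. With 0 written as /, a value n comes out as n + 1 slashes.

```python
def separator_values(nested):
    """Return (tokens, values) for a bracket string like "[ab[cd[e]f]]"."""
    items = []   # one [token, entered, dropped] triple per token
    opens = 0    # "[" seen since the previous token
    depth = 0
    for ch in nested:
        if ch == "[":
            opens += 1
            depth += 1
        elif ch == "]":
            depth -= 1
            if opens or not items or depth < 0:
                raise ValueError("unbalanced or empty group")
            items[-1][2] += 1          # one more level closes after this token
        else:
            if opens > 1:
                raise ValueError("a token may open at most one level here")
            items.append([ch, opens, 0])
            opens = 0
    if depth != 0 or opens or not items:
        raise ValueError("unbalanced or empty input")
    if items[0][1] != 1 or (len(items) > 1 and items[0][2] != 0):
        raise ValueError("first token must sit directly inside the outer pair")
    tokens = [tok for tok, _, _ in items]
    # Value before each later token: 2 * (levels dropped after it) + (1 if it enters a level).
    values = [2 * dropped + entered for _, entered, dropped in items[1:]]
    return tokens, values

def encode(nested):
    """Write each value n as n + 1 slashes between the tokens."""
    tokens, values = separator_values(nested)
    out = tokens[0]
    for value, token in zip(values, tokens[1:]):
        out += "/" * (value + 1) + token
    return out

print(separator_values("[ab[cd[e]f]]")[1])  # [0, 1, 0, 3, 4]
print(encode("[ab[cd[e]f]]"))               # a/b//c/d////e/////f
```

The first token gets no separator of its own, which is why the sketch rejects inputs where it opens more than the outer level or closes anything before the second token.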
That's pretty mechanical, but if I've got it right, I think it verifies it can be done.
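One way to check that nothing is lost is to decode the string again. This is a minimal sketch under the same assumptions (tokens contain no /, the outer pair is implicit, a value n is written as n + 1 slashes); decode is a hypothetical name, and closing any still-open levels at the end is a choice of this sketch rather than part of the scheme.

```python
import re

def decode(encoded):
    """Rebuild the bracket string from a slash-encoded one."""
    parts = re.split(r"(/+)", encoded)
    tokens, seps = parts[0::2], parts[1::2]
    if not all(tokens):
        raise ValueError("separators may only appear between tokens")
    out = ["[", tokens[0]]     # the outer pair is implicit in the encoding
    depth = 1
    for sep, token in zip(seps, tokens[1:]):
        value = len(sep) - 1
        entered = value % 2    # low bit: does this token open a level?
        dropped = value // 2   # the rest: how many levels close after it
        depth += entered - dropped
        if depth < 0:
            raise ValueError("closes more levels than are open")
        out.append("[" * entered + token + "]" * dropped)
    # Assumption: close whatever is still open (needed for a lone token).
    out.append("]" * depth)
    return "".join(out)

print(decode("a/b//c/d////e/////f"))  # [ab[cd[e]f]]
```

Splitting each value back into its low bit and the remaining half recovers both factors, so the structure survives the round trip for inputs of this shape.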
But here's my question: Is there a known encoding for this kind of problem that would more intuitively convey the structure to a human reader, perhaps at the cost of making longer strings? Let's say a system that would decay such that simple cases like [ab[cd]e] could look more like a/b//c///d//e or similar, while still being able to encode everything distinctly?
I realize the quality I'm asking for is a bit "nebulous", but perhaps you see what I mean. One thing I don't like about the encoding I chose above is it imposes a left-to-right "leaning" property, when there isn't anything particularly left-like or right-like about the nesting properties being encoded. I wonder how that might be excised by making different choices.
[ and ]! – HostileFork says dont trust SE Jul 26 '17 at 02:41
a/b/c by {a b c}. – Raphael Jul 28 '17 at 04:47