I am trying to write a regular expression that describes the following language over the alphabet {a,b,c}: all words that do not contain the substring "bbc".

I am having a really hard time understanding how to approach this question. I have done several other questions where a certain substring must be excluded, but this one really messes with my logic.

Thanks in advance

1 Answer

Some things I learnt while learning to answer this type of question:

  1. Negation isn't easy to express. There are no shortcuts. The resulting expressions may be far more complex than the expression you would get if negation were a basic operator in the expression language, in which case you could write $\neg(.^*bbc.^*)$, where $.$ is a shortcut for $(a\cup\neg a)$.
  2. Lacking $\neg$, the only way to say $\neg a$ for some symbol $a$ is to enumerate all other symbols: $\neg a = (b\cup c \cup d \cup \ldots)$. In Unix-like regular expressions, you can of course use [^a].
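  3. In compound expressions, you can use left factorization: a word differs from $bbc$ if it is empty, or starts with a symbol other than $b$, or starts with $b$ and continues with something other than $bc$. E.g. $\neg(bbc) = \epsilon \cup \neg b\,.^* \cup b\,\neg(bc) = \epsilon \cup \neg b\,.^* \cup b\,(\epsilon \cup \neg b\,.^* \cup b\,\neg(c)) = \ldots$. The key insight is that this can be made to work with Kleene star expressions, too; I suggest you invent or look up the details yourself.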
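To check a candidate answer, it helps to test it by brute force. The sketch below uses one expression that I believe works, $(b^*a \cup c \cup bc)^*b^*$, which comes from tracking how many trailing $b$'s have been read (none, one, or two or more) and never allowing a $c$ after two or more. This expression and the helper names `NO_BBC`, `avoids_bbc` and `check` are my own assumptions for illustration, not something the steps above dictate; other equivalent expressions exist.

```python
import re
from itertools import product

# Candidate expression (an assumption, not the only possible answer):
# each block is "some b's then a", a lone "c", or "bc"; the word may
# end in any number of b's. A "c" can therefore never follow "bb".
NO_BBC = re.compile(r"(?:b*a|c|bc)*b*")


def avoids_bbc(word: str) -> bool:
    """Return True if the whole word matches the candidate expression."""
    return NO_BBC.fullmatch(word) is not None


def check(max_len: int = 8) -> bool:
    """Compare the expression with a direct substring test on every
    word over {a, b, c} of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product("abc", repeat=n):
            word = "".join(letters)
            if avoids_bbc(word) != ("bbc" not in word):
                return False
    return True


if __name__ == "__main__":
    print(check())
```

A passing check on short words is evidence, not a proof; the proof is the automaton argument (or the left factorization) behind the expression.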
reinierpost