5

I'm doing a lot of code analysis using a lexer and a finite state machine (FSM). For the time being I'm using a table to describe the FSM:

| token | current state | target state |
+-------+---------------+--------------+
| .     | start         | dot          |
| trace | dot           | method       |
| (     | method        | detected     |

Using this table and an implicit start state, the FSM is created:

[Image: state diagram of the FSM described by the table]

The lexer generates a stream of tokens, and each token is used as the trigger for a state transition. If no transition is possible from the current state, the FSM is reset to the start state.
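For illustration, the behaviour described above could be sketched roughly like this in Python. This is only a minimal sketch, not my actual code: the names `TRANSITIONS` and `run_fsm`, the `accept` parameter, and the choice to resume from the start state after a detection are all assumptions made for the example.

```python
# Transition table from the question: (current state, token) -> target state.
TRANSITIONS = {
    ("start", "."): "dot",
    ("dot", "trace"): "method",
    ("method", "("): "detected",
}


def run_fsm(tokens, transitions=TRANSITIONS, start="start", accept="detected"):
    """Return the indices of the tokens that complete a match."""
    state = start
    hits = []
    for index, token in enumerate(tokens):
        # No transition for this (state, token) pair: fall back to start.
        state = transitions.get((state, token), start)
        if state == accept:
            hits.append(index)
            state = start  # assumption: resume scanning after a detection
    return hits


tokens = ["console", ".", "trace", "(", "x", ")", ".", "log", "("]
print(run_fsm(tokens))  # prints [3]
```

Note that a token with no matching transition is consumed by the reset, so it is not retried from the start state. That is a literal reading of the rule above.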

Using a table to describe an FSM is all right for a fairly small number of states, but it gets complicated quickly. A Google search turned up very few interesting results.

So the question is: is there a standard or de facto language for describing finite state machines?

aisbaa
  • 3
    Using state-transition tables is the regular way of doing it. However, if I remember correctly, Type-3 grammars are recognized by finite state machines, so I think you could use those as a description language (https://en.wikipedia.org/wiki/Regular_grammar) – Lovis Aug 26 '16 at 10:17
  • Why don't you use a parser generator? – whatsisname Aug 26 '16 at 16:33
  • That's a good question, I'll look into parser generators as well. – aisbaa Aug 30 '16 at 08:57
  • So you are building this FSM in order to parse code and analyze it, correct? So you are building the in-memory representation of the FSM, right? If I've got that right, are you handling the parsing of the table? – JimmyJames Aug 30 '16 at 16:18
  • 1
    One of the simplest languages which can encode the information set of a state-transition table, along with parseable predicates for labelled (or not) transitions, is probably those good old S-expressions: https://en.wikipedia.org/wiki/S-expression – YSharp Aug 31 '16 at 05:41
  • @JimmyJames you're quite a good nit picker, I like that. I'm using the FSM to detect code issues; it is very similar to grepping code, except that in this case I'm matching on tokens. For example, I'm trying to find code that uses string interpolation instead of parametrized SQL queries. – aisbaa Aug 31 '16 at 09:03
  • @YSharp thank you, I'll take a look at S-expressions. – aisbaa Aug 31 '16 at 09:04
  • @aisbaa I'm not trying to nit-pick but... thanks for the compliment? What I am trying to clarify is whether you can choose any input format you want or whether you are constrained by some library or tool that you are using. I think it's the former, but some of the comments on my answer imply it might be the latter. – JimmyJames Aug 31 '16 at 13:16
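
As a rough illustration of the S-expression suggestion in the comments above, the same table could be written as nested lists and read back with a very small parser. The `(fsm (on TOKEN CURRENT TARGET) ...)` layout is invented for this example and is not a standard notation; `parse` and `to_table` are hypothetical helper names, and the parser does no error handling for unbalanced input.

```python
import re

# Atoms are parentheses, double-quoted strings, or runs of other characters.
TOKEN_RE = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')

# Hypothetical layout: (on TOKEN CURRENT TARGET); tokens like "(" are quoted.
SPEC = '''
(fsm
  (on "."   start  dot)
  (on trace dot    method)
  (on "("   method detected))
'''


def parse(text):
    """Parse one S-expression into nested lists of strings."""
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        if tok.startswith('"'):
            return tok[1:-1]  # quoted atom: strip the surrounding quotes
        return tok

    return read()


def to_table(spec):
    """Turn the (on TOKEN CURRENT TARGET) rows into a transition dict."""
    tree = parse(spec)
    return {(cur, tok): tgt for _, tok, cur, tgt in tree[1:]}


print(to_table(SPEC))
```

The resulting dict maps `(current state, token)` pairs to target states, so it can be fed straight into a table-driven matcher.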

2 Answers

4

Most of what has been standardised goes beyond simple finite state machines.

If you like XML, there's State Chart XML and UML's XMI for UML state charts, both of which describe supersets of finite state machines.

There's also MATLAB's Stateflow, but I'm not sure whether there's a text-based language behind it or a proprietary format.

Trawling my links from a few years back, there's Microsoft's Abstract State Machine Language, an implementation called XASM (I've not used it), and a paper, 'SML-a high level language for the design and verification of finite state machines'. If you want something more like a domain language within a programming language, there's this question: Is there a programming language with built-in state machine construct?

Pete Kirkham
4

Since finite state machines are basically a subset of (labelled) directed graphs, what about something like DOT? It was just the first thing that popped up on a Google search for "directed graph language". I've never used it.

JimmyJames
  • DOT and Graphviz are for making diagrams; they're not useful for code analysis. – whatsisname Aug 26 '16 at 16:38
  • 1
    @whatsisname My understanding is that the question is about how to describe the finite state machine. DOT does that. Are you saying that the DOT representation can only be used for making diagrams? If so, that seems to be obviously not true. – JimmyJames Aug 26 '16 at 16:46
  • I interpreted his question as he needs to do some analysis on code, but not necessarily report his findings to someone else. Graphviz might be useful for reporting the findings, but not the actual analysis. We often use "describe" as a synonym for "characterize" in this context. – whatsisname Aug 26 '16 at 19:15
  • Additionally, if he is going to use DOT for whatever he's doing, now he has two problems. – whatsisname Aug 26 '16 at 19:15
  • @whatsisname If you look around the middle of the page from my link, there are 10 different tools listed that can supposedly use this format. All I see in the question is a request for a different way to format the FSM as text. The question is unclear, so maybe you are right and it's not helpful. I'd need clarification on the question to be sure. – JimmyJames Aug 29 '16 at 13:26
  • I didn't mention DOT in the original question. Yes, it is definitely one way of describing an FSM. As @whatsisname mentioned, DOT is more targeted at visual representation of graphs. The downside for me is that there is no explicit way of expressing what triggers a state change, or I haven't found one. – aisbaa Aug 30 '16 at 09:11
  • @aisbaa What is it about DOT that makes it "more targeted at visual representation"? It's just a text representation of a graph. What you do with it after it's parsed is up to you. I don't see anything in your table example that shows what triggers state change. Can you update the question with more detail on what you are trying to do? – JimmyJames Aug 30 '16 at 14:45
  • @JimmyJames thank you for pointing that out, I've updated the question. I guess the trigger, or token in my case, could be passed in the attribute list. – aisbaa Aug 30 '16 at 16:08
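
To make the attribute-list idea from the last comment concrete, here is a minimal sketch that writes the question's table out as DOT text, carrying each trigger token in the edge's `label` attribute. The helper name `to_dot` and the `ROWS` layout are assumptions made for the example; only double quotes are escaped, which is enough for these tokens but may not cover every possible one.

```python
# Rows from the question's table: (token, current state, target state).
ROWS = [
    (".", "start", "dot"),
    ("trace", "dot", "method"),
    ("(", "method", "detected"),
]


def to_dot(rows, name="fsm"):
    """Render the rows as a DOT digraph with the token as the edge label."""
    lines = [f"digraph {name} {{"]
    for token, current, target in rows:
        # Escape double quotes so the token stays a valid DOT quoted string.
        label = token.replace('"', '\\"')
        lines.append(f'    "{current}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


print(to_dot(ROWS))
```

A tool that reads the DOT text back can use the `label` value as the trigger, and the same file can still be rendered as a diagram.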