Documentation – How to keep the analyzer code and grammar definition in sync?

I am working with a small custom DSL that specifies how several scripts are run. The DSL takes the form of configuration files that are very simple and easy for humans to read. They define which scripts should be executed, in what order, and how. I have a program (the analyzer) that parses such a configuration file and executes the steps.

Configuration files are sometimes written by other developers who are not familiar with the analyzer. They only need to know what the grammar rules are. The rules are simple, which keeps the learning curve for writing these configuration files low.

The de facto definition of the grammar is, of course, my analyzer code. However, that is obviously not accessible to new developers. The next logical step is therefore to write a succinct description of the grammar, for example as EBNF. However, as the grammar grows with new features, the EBNF document will go stale unless I remember to update it manually after every relevant code change. Worse, sooner or later I will forget to update it, and developers will be surprised when the actual analyzer behaves differently from what the documentation says.

What strategy can I use to keep the grammar documentation for my analyzer up to date? My priorities are DRY and minimal investment of developer time (both mine and others').

  • Is it a good idea to treat the EBNF as the source of truth for the grammar and somehow generate the analyzer logic automatically from it? If so, how can I keep the analyzer from becoming much more complex and difficult to maintain?
  • Can I automatically emit the EBNF from the code logic? Is this practical in Python?
  • Am I overthinking this? Perhaps the best option is simply to be disciplined about updating the document manually after all.
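
To make the first two options concrete, here is a minimal sketch of one way they could be combined: the grammar lives in a single Python data structure, and both the EBNF text and the matching logic are derived from it. Everything in it is an assumption for illustration only. The node classes (`Lit`, `Tok`, `Ref`, `Seq`, `Alt`, `Many`), the `GRAMMAR` table, and the toy `run <path> serial|parallel ;` syntax are hypothetical and are not the real DSL. The sketch only checks whether an input is accepted; it does not build a tree or execute anything.

```python
import re
from dataclasses import dataclass

# Hypothetical grammar nodes. Each node can print itself as EBNF and can
# match a token list, returning the next position or None on failure.

@dataclass(frozen=True)
class Lit:                      # a fixed keyword or symbol
    text: str
    def ebnf(self):
        return f'"{self.text}"'
    def match(self, toks, i, rules):
        return i + 1 if i < len(toks) and toks[i] == self.text else None

@dataclass(frozen=True)
class Tok:                      # a token class, shown in the EBNF by name
    name: str
    pattern: str
    def ebnf(self):
        return self.name
    def match(self, toks, i, rules):
        ok = i < len(toks) and re.fullmatch(self.pattern, toks[i])
        return i + 1 if ok else None

@dataclass(frozen=True)
class Ref:                      # reference to another named rule
    name: str
    def ebnf(self):
        return self.name
    def match(self, toks, i, rules):
        return rules[self.name].match(toks, i, rules)

@dataclass(frozen=True)
class Seq:                      # all parts, in order
    parts: tuple
    def ebnf(self):
        return ", ".join(p.ebnf() for p in self.parts)
    def match(self, toks, i, rules):
        for part in self.parts:
            i = part.match(toks, i, rules)
            if i is None:
                return None
        return i

@dataclass(frozen=True)
class Alt:                      # first option that matches wins
    options: tuple
    def ebnf(self):
        return "( " + " | ".join(o.ebnf() for o in self.options) + " )"
    def match(self, toks, i, rules):
        for option in self.options:
            j = option.match(toks, i, rules)
            if j is not None:
                return j
        return None

@dataclass(frozen=True)
class Many:                     # zero or more repetitions
    item: object
    def ebnf(self):
        return "{ " + self.item.ebnf() + " }"
    def match(self, toks, i, rules):
        while True:
            j = self.item.match(toks, i, rules)
            if j is None or j == i:   # stop on failure or no progress
                return i
            i = j

# The single source of truth (toy syntax, assumed for this example).
GRAMMAR = {
    "config": Many(Ref("step")),
    "step": Seq((
        Lit("run"),
        Tok("PATH", r"[\w./-]+"),
        Alt((Lit("serial"), Lit("parallel"))),
        Lit(";"),
    )),
}

def to_ebnf(rules=GRAMMAR):
    """Render every rule as one EBNF line, in definition order."""
    return "\n".join(f"{name} = {node.ebnf()} ;" for name, node in rules.items())

def accepts(text, rules=GRAMMAR, start="config"):
    """Return True if the whitespace-separated tokens match the start rule."""
    toks = text.split()
    return rules[start].match(toks, 0, rules) == len(toks)

if __name__ == "__main__":
    print(to_ebnf())
    print(accepts("run build.sh serial ; run test.sh parallel ;"))
```

With this layout, adding a feature means editing `GRAMMAR` once; `to_ebnf()` can then regenerate the document, so the two cannot drift apart. The open question is whether this indirection stays simpler than a hand-written analyzer once the steps also have to be executed.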