Note: starting from version 1.15 DRAKON Editor uses a new code generation algorithm that is not based on search.
DRAKON Editor generates source code from diagrams for a number of programming languages. As of version 1.11, the following languages are supported: C, C++, Java, C#, Python, Tcl, Javascript and Erlang. Since these languages have a lot in common, most of the generation is done by the same algorithm. The language-specific routines mostly take care of the differences in syntax. Code generation has four steps:
A diagram in DRAKON Editor is composed of geometrical shapes of two types: icons and line segments. The shapes are not connected to each other and the first task of the code generator is to glue them into a network:
Once the graph of geometric shapes is constructed, an abstract graph of the algorithm is extracted from it. The graph of the algorithm is directed while the graph of geometrical shapes is not.
At this point, the graph has only two kinds of nodes: "If" nodes and "Action" nodes. Each node can have many incoming edges. Each "Action" node has one outgoing edge, each "If" node has two outgoing edges.
What happens next depends on the programming language which we are emitting code for. C and C++ have the famous goto statement which makes code generation pretty straightforward. The languages that don't have goto require more effort.
Code generation for a language that has the goto construct is easy: each node of the algorithm graph receives a label and each edge turns into a goto. Some optimizations are made to reduce the number of gotos:
A special action is taken to ensure that the first icon in the diagram becomes the first line of the generated code. This is necessary for placing the variable declarations at the beginning of the function body. Although C++ does not strictly require that all the variables are declared in the beginning of the function, putting some of the variables there can sometimes be necessary.
DRAKON Editor employs search to generate source code for the languages that do not have goto. Generation is done in two steps. First, the graph is prepared for the search procedure. Then the search procedure attempts to find the code tree that implements the algorithm specified by the diagram. If the search fails, a fallback routine builds the code tree by emulating goto.
The graph is scanned for cycles. If an edge points to a node that has already been visited on the way to this edge, this edge is considered "backward". The edges that "go back" receive a special mark. After this stage the graph does not have cycles if the edges with the "backward" mark are not counted.
Then technical nodes are added to the graph:
Afterwards, branching in the algorithm is normalized by bringing the "If" icons and their outcomes to the standard "If-Then-Else" form.
If necessary, nodes and edges are copied. At this stage, the algorithm could be directly mapped to a nested tree of "If-Then-Else" constructs if the "backward" edges are ignored.
If there were no loops we would not need to do search. But they are there!
The overall goal of the search is transforming loops into this form:
while true: do stuff if must exit: break
Each loop should turn into an infinite "while" construct that has one or more "break" statements as the exits. The "while" construct can also have "continue" statements.
The search transforms the algorithm graph into the code tree. The code tree is built of two kinds of constructs: "if" and "while". It is easy to determine where to insert "if", "while true" and "continue" statements:
Therefore, the specific goal of the search is to find the places where to put "break" statements.
The search has the following properties:
The fundamental question that is asked for each node during the search is: "is it possible to put a break here?".
The branching of the search (finding the next states that are reachable from the current one) is done by:
Depth-first search proved to fit this task well while A* has performed poorly.
This algorithm fails to find a solution for diagrams that contain certain structures, for example:
The search fails once in a while and that is why we must have an emergency exit. There is a pretty straightforward way to emulate goto in a programming language that does not have this construct:
Here is what the Fibonacci algorithm looks like with goto emulation:
def fibonacci(n): _next_item_ = 153 while True: if _next_item_ == 153: _sw_153 = n if _sw_153 == 0: result = [0] _next_item_ = 168 elif _sw_153 == 1: result = [0, 1] _next_item_ = 168 else: result = [0, 1] i = 2 _next_item_ = 1630002 if _next_item_ == 1630002: if i <= n: f2 = result[i - 2] f1 = result[i - 1] fib = f1 + f2 result.append(fib) i += 1 _next_item_ = 1630002 continue else: _next_item_ = 168 if _next_item_ == 168: return result
The goto emulation has a performance cost: the code becomes approximately two times slower.
This goto emulation algorithm was borrowed from somewhere on the internet but unfortunately the exact URL is lost.
Posted: May 31, 2012
Copyright 2012 Stepan Mitkin
drakon.editor@gmail.com