ERIL: a Visual Language for Data Modelling

By Stepan Mitkin

Contents

  • Introduction
    • A note on terminology
  • Data model
    • Why do we need a data model?
    • Data model is not limited to databases
    • Logical and physical data model
  • Drawing tips from the ERIL language
    • Cut the data model into many small diagrams
    • Define each class only once
    • Arrows, paws and lines
    • Right angles and straight lines
    • Avoid line intersections
    • Master above, slave below
    • Do not put inheritance together with relationships
  • Summary

Introduction

What is ERIL? ERIL is a graphical language for representing data models. It is based on entity-relationship and class diagrams, hence the name: Entity-Relationship and Inheritance Language.

ERIL is a counterpart to DRAKON. The DRAKON language describes algorithms and behaviour. It is an improved version of flowcharts. DRAKON introduces some rules that make flowcharts easier on the human eye.

The ERIL language describes data structure. In a similar way, ERIL can be defined as a set of guidelines and best practices aimed at better readability of structure diagrams.

A note on terminology

Throughout the article, the following terms will have the following meaning:

  • Class, table, entity. These words will mean the same, "a type of an object".
  • Field will mean the same as property.

Data model

Why do we need a data model?

Computer programs work with data. Data is received, transformed, stored and sent. If we see the structure of the data, we understand a great deal about the program.

And in the software business, understanding is more valuable than gold. The main obstacle to understanding is complexity. A data model is a powerful weapon against complexity.

Data model is not limited to databases

Since data is present in all parts of a software system, data modelling is not limited to databases. Any component of a program can be explained in terms of the data it works with.

The good news is, the nature of data is the same no matter where the data is stored. That's why any data can be described using the same language. Data in a database, in memory, on the network or even in an HTTP header is organised by the same principles. This makes it possible to represent data of any kind with a single visual language.

Processes, customers and artists

We have three examples on this diagram:

  1. Processes and threads. There can be many processes running in an operating system. Inside each process there can be one or more threads.
  2. Customers, orders and order lines. There can be many customers. Over time, each customer may send many orders. Each order can contain many order lines.
  3. Artists, albums and tracks. There can be music made by many artists in the library. Each artist can release several albums throughout his or her career. Albums consist of tracks.

All these examples come from different domains. Despite that, they follow the same abstract idea. That idea can be represented graphically on an entity-relationship diagram. It does not matter whether we store this information in a relational database, in an XML file or in memory objects.
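As a rough illustration of that last point, the sketch below keeps the artist example as in-memory objects and then flattens the same data into relational-style rows. The class names follow the example above. The field names, the sample data and the `to_rows` helper are assumptions made for this sketch only.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    title: str


@dataclass
class Album:
    title: str
    tracks: list[Track] = field(default_factory=list)


@dataclass
class Artist:
    name: str
    albums: list[Album] = field(default_factory=list)


def to_rows(artists):
    """Flatten the object tree into three row lists linked by ids.

    The ids are just positions in the lists (an assumption of this sketch).
    """
    artist_rows, album_rows, track_rows = [], [], []
    for artist in artists:
        artist_id = len(artist_rows) + 1
        artist_rows.append((artist_id, artist.name))
        for album in artist.albums:
            album_id = len(album_rows) + 1
            # The second column plays the role of a foreign key to the artist.
            album_rows.append((album_id, artist_id, album.title))
            for track in album.tracks:
                track_id = len(track_rows) + 1
                track_rows.append((track_id, album_id, track.title))
    return artist_rows, album_rows, track_rows


# Hypothetical sample data.
library = [
    Artist("Artist A", [Album("First", [Track("Intro"), Track("Outro")])]),
    Artist("Artist B", [Album("Debut", [Track("Single")])]),
]
artist_rows, album_rows, track_rows = to_rows(library)
```

The nested objects and the flat rows carry the same one-to-many structure. Only the storage differs.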

Logical and physical data model

Isn't a data model something that must be well hidden? It is dangerous to expose the innermost secrets of the system, isn't it? Yes, it is. And therefore we need to make a distinction between the logical and the physical data models.

The physical data model is confidential information that belongs to the team. It contains all the necessary details that help the team have a precise view of the project.

The logical data model is public. It is a part of the contract that the system adheres to. The logical data structure defines the mental picture of our system for our customers.

Let us take an online banking application as an example. A real banking application consists of several servers and databases. Each of those databases may have hundreds of tables. The goal of the physical model is to give precise information about each field and each relation.

But the bank's customer does not deal with that sheer complexity. What the customer has in mind is a relatively simple picture:

  • A list of his own accounts.
  • A list of his own cards.
  • A list of transactions.
  • Zero or more cards can be linked to one account.
  • Zero or more transactions can be recorded for each account.
  • Some transactions can be related to a card.

Online banking application

This is the logical data model. It is meant for documentation and integration. The goal of the logical data model is to explain the system to the outside world.
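The customer's picture above is small enough to write down as a few classes. The following is only a sketch of that logical model. The field names (`number`, `amount`) and the helpers `link_card` and `record` are assumptions, not part of any real banking system.

```python
from dataclasses import dataclass, field
from typing import Optional


# eq=False keeps identity comparison, which suits entities that refer to each other.
@dataclass(eq=False)
class Account:
    number: str
    cards: list["Card"] = field(default_factory=list)                # zero or more cards
    transactions: list["Transaction"] = field(default_factory=list)  # zero or more transactions


@dataclass(eq=False)
class Card:
    number: str
    account: Account                 # each card is linked to one account


@dataclass(eq=False)
class Transaction:
    amount: int
    account: Account                 # each transaction is recorded for one account
    card: Optional[Card] = None      # only some transactions are related to a card


def link_card(account, card_number):
    """Create a card and attach it to the account."""
    card = Card(card_number, account)
    account.cards.append(card)
    return card


def record(account, amount, card=None):
    """Record a transaction for the account, optionally related to a card."""
    transaction = Transaction(amount, account, card)
    account.transactions.append(transaction)
    return transaction


current = Account("ACC-1")
visa = link_card(current, "CARD-1")
record(current, -25, card=visa)   # a card payment
record(current, 1000)             # for example a transfer, no card involved
```

Nothing here says how many servers or tables stand behind these classes. That detail belongs to the physical model.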

Drawing tips from the ERIL language

In a nutshell, the ERIL language augments entity-relationship diagrams with inheritance and a few best practices. Here they are.

Cut the data model into many small diagrams

Do not ever try to fit the whole data model on one diagram.

Unfortunately, here is what often happens to a new employee starting on a project. "Our database schema is in that file. Take a look if you dare." - that's what a newcomer is usually told. He follows the link, opens the file and stares helplessly at the immense cobweb of rectangles and arrows.

Wrong approach

This practice builds a (wrong) opinion that diagrams are only good for small, "hello world" applications. In the meantime, the solution is simple and straightforward.

Cut the model into several manageable pieces. Divide and conquer. Instead of having one gigantic diagram, split the system into many small ones. Each of the small diagrams should contain just a few entities. The entities on such a small diagram should ideally be related to a single concept.

Right approach

It is not necessary to draw all the links between the selected entities. Show just those that are relevant to this specific small diagram.

Define each class only once

When you break down the data model into several sub-diagrams, it is okay to mention the same class (or table) several times. What is more, it is perfectly fine to draw the same class several times on the same diagram, for example in order to represent a tree-like data structure.

The definition of a class, however, must exist only in one place. The class definition should describe the content of the class: fields, indexes, etc. Other occurrences of the class are just references to the class definition. Those references can be represented as plain boxes with class names inside.

A similar guideline applies to links. There must be only one "definition" of a link. References to the same link can be shown in many places.

The difference between a link "definition" and link "reference" is that the definition has ellipses with field names on it. These fields are references and collections that implement the link. For example, the link Order-OrderLine may be implemented by the reference OrderLine.Order and the collection of references Order.OrderLines (since it is a one-to-many link).

Definitions and references

There are two diagrams on one visual scene in the example above. The definitions of the classes Employee and Division are placed on the upper diagram. The lower diagram has only references to these two classes.

A Division is connected to Employee with two links:

  1. There are several Employees working directly for a Division.
  2. A Division has one Employee as a Manager.

Divisions form a tree structure: a Division may consist of Subdivisions.
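The same example can be sketched in code. Each class is defined once, and the fields play the role of the references and collections that implement the links. The field and method names below (`employees`, `manager`, `parent`, `subdivisions`, `hire`, `add_subdivision`, `headcount`) are assumptions chosen for this illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Employee:
    name: str
    division: Optional["Division"] = None       # link 1, reference side


@dataclass(eq=False)
class Division:
    name: str
    employees: list[Employee] = field(default_factory=list)       # link 1, collection side
    manager: Optional[Employee] = None                            # link 2, a single reference
    parent: Optional["Division"] = None                           # tree link, reference side
    subdivisions: list["Division"] = field(default_factory=list)  # tree link, collection side

    def hire(self, name):
        """Create an employee who works directly for this division."""
        employee = Employee(name, division=self)
        self.employees.append(employee)
        return employee

    def add_subdivision(self, name):
        """Create a subdivision below this division."""
        child = Division(name, parent=self)
        self.subdivisions.append(child)
        return child

    def headcount(self):
        """Count employees of this division and of all its subdivisions."""
        return len(self.employees) + sum(d.headcount() for d in self.subdivisions)


company = Division("Company")
company.manager = company.hire("Alice")
research = company.add_subdivision("Research")
research.hire("Bob")
research.hire("Carol")
```

Division refers to itself through `parent` and `subdivisions`, which is why the same class may appear several times on one diagram.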

Arrows, paws and lines

Different types of lines denote different types of links.

A "paw" stands for the one-to-many relationship which is traversable both ways.

For example, the Artist-Albums relationship. Several Albums are linked to one Artist.

Bi-directional one-to-many link

We can get the list of Albums that belong to an Artist by examining the Artist's Albums property. The Albums property is a collection (real or virtual) of references to Album entities. It is also possible to navigate from an Album to its Artist if we follow the Album's Artist property. The Artist property is a reference to an Artist.
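A minimal sketch of such a two-way link, assuming a real (not virtual) collection on the Artist side. The constructor keeps both directions in sync. The sample names are made up.

```python
class Artist:
    def __init__(self, name):
        self.name = name
        self.albums = []              # the "many" side: a collection of references


class Album:
    def __init__(self, title, artist):
        self.title = title
        self.artist = artist          # the "one" side: a single reference
        artist.albums.append(self)    # register the album so both directions agree


band = Artist("Some Band")
first = Album("First Album", band)
second = Album("Second Album", band)

# Navigate from the Artist to its Albums...
titles = [album.title for album in band.albums]
# ...and from an Album back to its Artist.
owner = second.artist
```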

An arrow also denotes a one-to-many link. Its difference from the "paw" is that with an arrow, we are interested in following the link only in one direction.

One-directional one-to-many link

For example, an Order-OrderType relationship. It may be useful to inspect the order type for a specific order. But we may not be interested in queries like "find all orders of that type in the world". In this case, a one-way reference from Order to OrderType is enough.
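In code, a one-way link is simply a reference on one side with no collection on the other. A small sketch, with assumed field names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderType:
    name: str                 # note: no collection of orders here


@dataclass
class Order:
    number: int
    order_type: OrderType     # the one-way reference from Order to OrderType


RETAIL = OrderType("retail")
orders = [Order(1, RETAIL), Order(2, RETAIL)]

# Following the arrow is easy.
type_of_first = orders[0].order_type.name
```

Many orders share one OrderType, but the OrderType knows nothing about them. Going the other way would need a search over all orders.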

A many-to-many relationship can be represented in two ways:

  1. If we are only interested in which objects are connected to which, we use a horizontal line that has "paws" on both ends.
  2. If some information must be stored alongside the connection, one or more classes (tables) are created in between. The many-to-many link then becomes two one-to-many links.

Many-to-many relationship
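Both ways can be sketched in a few lines. The classes Student, Course and Enrolment are hypothetical examples that do not come from the diagrams. The `grade` field stands for "some information stored alongside the connection".

```python
from dataclasses import dataclass, field


# Way 1: we only care which objects are connected to which.
@dataclass(eq=False)
class Student:
    name: str
    courses: set = field(default_factory=set)


@dataclass(eq=False)
class Course:
    title: str
    students: set = field(default_factory=set)


def enrol(student, course):
    """Connect a student and a course in both directions."""
    student.courses.add(course)
    course.students.add(student)


# Way 2: a class in between carries the extra information.
# Student-Enrolment and Course-Enrolment are two one-to-many links.
@dataclass(eq=False)
class Enrolment:
    student: Student
    course: Course
    grade: int


def grades_of(student, enrolments):
    """Collect the grades of one student, keyed by course title."""
    return {e.course.title: e.grade for e in enrolments if e.student is student}


ann = Student("Ann")
maths = Course("Maths")
physics = Course("Physics")
enrol(ann, maths)
enrol(ann, physics)

enrolments = [Enrolment(ann, maths, 5), Enrolment(ann, physics, 4)]
ann_grades = grades_of(ann, enrolments)
```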

One-to-one relationships can be represented as a plain line. In such a relationship, two objects form a pair. If we have a one-to-one relationship A-B, then we have the following rules:

  • For each A object there is at most one B object.
  • For each B object there is at most one A object.

One-to-one link

Here, a person has at most one spouse.
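The two rules can be enforced at the moment the pair is formed. The following sketch assumes a `spouse` field and a `marry` helper, both invented for this illustration.

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.spouse = None    # at most one partner


def marry(a, b):
    """Form a pair, refusing to break the "at most one" rule on either side."""
    if a is b:
        raise ValueError("a pair needs two different people")
    if a.spouse is not None or b.spouse is not None:
        raise ValueError("each person can be in at most one pair")
    a.spouse = b
    b.spouse = a


ann = Person("Ann")
bob = Person("Bob")
carl = Person("Carl")
marry(ann, bob)

try:
    marry(ann, carl)          # Ann is already in a pair
    second_pair_allowed = True
except ValueError:
    second_pair_allowed = False
```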

Right angles and straight lines

Take a look at these two maps. Which neighbourhood is easier to navigate?

Two city layouts

Of course, the one on the right-hand side. Why?

Because the right city layout is better organised:

  1. The streets are straight.
  2. The streets intersect at right-angles.

These restrictions result in a simpler mental model. That "Manhattan" mental model makes both driving and diagram reading much easier.

So here are the Manhattan rules:

  1. Lines must be straight. Curved and wavy lines are forbidden.
  2. Lines must be either vertical or horizontal. Slanting lines are not allowed.

Manhattan vs unorganised graphs

It takes some effort and eye tension to follow a curved line. Straight lines need not be traced. That relaxes the eyes and the brain.

The requirement to have either strictly vertical or horizontal lines leads to a simpler graph structure. Every node in the graph can have no more than four neighbours.

Avoid line intersections

This one is the most important guideline.

Avoid line intersections at all costs!

Line intersections are enemy number one in diagram drawing. They bring in ambiguity and destroy readability.

Our brain automatically thinks that if some objects touch on a diagram, there is a connection between them. In order to convince the brain otherwise, we must waste precious time and mental energy. This effort can and must be avoided.

The prohibition on line intersections is one of the main features of the DRAKON language.

Sometimes, the model can be interconnected so tightly that it is impossible to draw it without intersections. In this case the diagram is broken up into several parts. Each such part is then drawn without line intersections.

T-joints are of course allowed.
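For those who generate or check diagrams with a program, the Manhattan rules and the ban on intersections are easy to test. The sketch below is an assumption about how such a check could look, not a part of ERIL or of any editor. A line is a pair of points, and the function names are made up.

```python
def is_manhattan(segment):
    """A segment ((x1, y1), (x2, y2)) is allowed if it is vertical or horizontal."""
    (x1, y1), (x2, y2) = segment
    # Exactly one coordinate stays fixed; a zero-length "line" is rejected too.
    return (x1 == x2) != (y1 == y2)


def crosses(a, b):
    """True if two Manhattan segments cross in their interiors.

    A T-joint, where the end of one line touches the other line, is not a crossing.
    Parallel lines are never reported; overlaps are outside this sketch.
    """
    a_vertical = a[0][0] == a[1][0]
    b_vertical = b[0][0] == b[1][0]
    if a_vertical == b_vertical:
        return False
    vertical, horizontal = (a, b) if a_vertical else (b, a)
    (vx, vy1), (_, vy2) = vertical
    (hx1, hy), (hx2, _) = horizontal
    return (min(hx1, hx2) < vx < max(hx1, hx2)
            and min(vy1, vy2) < hy < max(vy1, vy2))


street = ((0, 5), (10, 5))            # horizontal
avenue = ((5, 0), (5, 10))            # vertical, cuts through the street
side_road = ((5, 5), (5, 10))         # vertical, ends on the street: a T-joint
shortcut = ((0, 0), (3, 4))           # slanting
```

With these sample lines, `crosses(street, avenue)` is true and `crosses(street, side_road)` is false, while `is_manhattan(shortcut)` is false.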

Master above, slave below

Very often an object is a part of another object. If the part cannot exist without the whole, such a part-whole relationship is called ownership, or a master-slave relationship.

Usually the "slave" object gets destroyed when its "master" object goes away.

ERIL suggests a clear way to distinguish relationships of that type:

  • Vertical lines mean ownership.
  • Horizontal lines mean peer relationships.

For an ownership relationship, it is easy to see who is the master and who is the slave. The entity above is the master, the entity below is the slave. An object may have only one master, or owner.

Ownership relation
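In a relational database, one possible way to express ownership is a foreign key with ON DELETE CASCADE. The sketch below uses Python's standard `sqlite3` module and the Order-OrderLine pair. The table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on

# The master.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)")

# The slave: each order line belongs to exactly one order and dies with it.
conn.execute(
    """
    CREATE TABLE order_lines (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
        product TEXT NOT NULL
    )
    """
)

conn.execute("INSERT INTO orders (id, customer) VALUES (1, 'Ann')")
conn.executemany(
    "INSERT INTO order_lines (order_id, product) VALUES (?, ?)",
    [(1, "tea"), (1, "cup")],
)

lines_before = conn.execute("SELECT COUNT(*) FROM order_lines").fetchone()[0]
conn.execute("DELETE FROM orders WHERE id = 1")   # the master goes away
lines_after = conn.execute("SELECT COUNT(*) FROM order_lines").fetchone()[0]
```

After the order is deleted, its lines are gone as well. A peer link would normally be declared without the cascade.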

In a peer relation, everybody has equal rights. So it is not important whether an entity appears on the left or on the right side.

Peer relation

A note on consistency. If a line is connected to one class vertically, it must also connect to the other one vertically. It is an error to start with a vertical line and end with a horizontal one, or vice-versa.

Relationship consistency

Ownership and peer relations are also known as composition and aggregation. The problem with these terms is that they are not ergonomic. In other words, they are hard to remember and easy to confuse.

Do not put inheritance together with relationships

It might be tempting to lump together all information pertaining to an entity in one place. While it is generally a good idea, there are cases when it's not.

For example, data relationships and inheritance relationships are better represented on different diagrams.

Inheritance is mixed with data links

Inheritance and links can be shown on the same visual scene, but on different parts of it. The reasoning behind this recommendation is that inheritance is conceptually different from a data reference.

Inheritance is separate from data links

Different ideas should be represented separately.
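The difference is visible in code as well. In the sketch below, inheritance is a statement about classes, while a data link is a field that connects one object to another. The classes Person, Vehicle, Car and Truck are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str


@dataclass
class Vehicle:
    plate: str
    owner: Optional[Person] = None    # data link: a reference stored in each object


@dataclass
class Car(Vehicle):                   # inheritance: every Car is a Vehicle
    seats: int = 5


@dataclass
class Truck(Vehicle):                 # inheritance: every Truck is a Vehicle
    payload_kg: int = 0


ann = Person("Ann")
car = Car("ABC-123", owner=ann)
truck = Truck("XYZ-789", payload_kg=8000)
```

The Vehicle-Car-Truck hierarchy would go on one diagram, and the Vehicle-Person link on another.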

Summary

A data model is a powerful but underestimated tool. It helps us understand the structure of the program.

There are two main kinds of data model:

  • Internal, or "physical". Contains private detailed information about the project or component.
  • Logical. Describes the software as a part of the contract that the software implements. Crucial for integrations.

Entity-relationship and class diagrams are the traditional means of describing a data model. ERIL improves them by introducing certain rules and conventions. Most of the ideas behind ERIL have been borrowed from the DRAKON language, where they have shown their high practical value.

Here are the guidelines introduced by ERIL:

  • Divide and conquer. Do not try to fit the world on one sheet of paper. Draw many simple diagrams instead.
  • Define things in one place, reference in many. Each class (table) and relationship can be referenced many times, but defined only in one place.
  • Use standard and well-recognised symbols to indicate the type of the relationship.
    • One-to-one: a simple line.
    • One-to-many, two-way: a line with a "paw".
    • One-to-many, one-way: an arrow.
    • Many-to-many: a line with two "paws".
  • Lines must be straight. No bending, curved or slanting lines are allowed.
  • Either vertical or horizontal. Lines must be either strictly vertical or horizontal.
  • Vertical lines mean ownership.
  • Horizontal lines mean peer relations.
  • Important! Line intersections are not allowed.
  • Different ideas are drawn separately. Do not lump together inheritance and data relationships.

Well-drawn diagrams are easier to read than plain text. Some argue, however, that diagrams are harder to create. The purpose of the above rules is to help non-artists produce readable structure diagrams quickly.

A software project is a knowledge base. The project team's job is to contribute to it. Together with DRAKON, ERIL covers most aspects of that knowledge base.


Updated on 25 January 2014.

Contact: drakon.editor@gmail.com