Erlang programming techniques

From Citizendium
Revision as of 17:49, 8 May 2009 by imported>Howard C. Berkowitz (Erlang Programming Techniques moved to Erlang programming techniques: CZ standard capitalization)

Erlang programming techniques exploit the design goal of the Erlang programming language: "Making reliable distributed systems in the presence of software errors", as the title of Joe Armstrong's dissertation states. The creators of Erlang tried to solve that problem using a particular philosophy, which makes certain programming techniques more powerful than others. This article presents that philosophy and the most powerful of those techniques.

Why this article?

This article is to a large extent a mash-up of Chapter 4 of Joe Armstrong's dissertation, plus definitions of some of the key features of Erlang, so that the content can be understood by people who do not know anything about Erlang.

The intention is to give non-Erlangers an idea of how Erlang affects the creation of software and what benefits that might bring.

Fundamental Features of Erlang

Processes, Processes, Processes

Erlang implements its own lightweight processes in its run-time system instead of relying on the processes or threads of the underlying OS. This is done so efficiently that several million processes can run on the same machine. It is not just an academic show-off: the intention is to create as many processes as are required to capture the truly concurrent activities of the system being created.
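As a rough illustration of how cheap processes are, the following sketch spawns N short-lived processes and counts their replies. The module name many_procs and the function names are hypothetical, chosen only for this example.

```erlang
-module(many_procs).
-export([run/1]).

%% Spawn N lightweight processes, have each report back to the caller,
%% and count the replies.
run(N) ->
    Parent = self(),
    %% Each process sends a single message to the parent and terminates.
    Pids = [spawn(fun() -> Parent ! {done, self()} end)
            || _ <- lists:seq(1, N)],
    collect(Pids, 0).

%% Wait for exactly one reply from every spawned process.
collect([], Count) ->
    Count;
collect([Pid | Rest], Count) ->
    receive
        {done, Pid} -> collect(Rest, Count + 1)
    end.
```

Calling many_procs:run(100000) from the Erlang shell should return 100000. The number of simultaneously living processes is capped by a system limit, which can be raised with the +P flag of erl.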


Pure message passing

The only way for processes to communicate is to send messages to each other. There is no shared state between processes, which removes many of the headaches normally associated with concurrent programming (todo: find reference).
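A minimal sketch of message passing, assuming a hypothetical module echo: one process sends a request with the ! operator and waits for the reply with receive. The message shapes used here are illustrative, not a fixed convention.

```erlang
-module(echo).
-export([start/0, call/2]).

%% Start an echo process and return its process identifier.
start() ->
    spawn(fun loop/0).

%% Send a request and wait for the matching reply.
%% The unique reference ties the reply to this particular request.
call(Pid, Msg) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, Msg},
    receive
        {Ref, Reply} -> Reply
    after 1000 ->
        timeout
    end.

%% The echo process keeps no shared state; it only reacts to messages.
loop() ->
    receive
        {From, Ref, Msg} ->
            From ! {Ref, {echo, Msg}},
            loop();
        stop ->
            ok
    end.
```

For example, after Pid = echo:start(), the call echo:call(Pid, hello) returns {echo, hello}.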

Strong isolation between concurrent processes

The failure of one process will not cause other processes to fail.

Since failures are unavoidable in software, Erlang has mechanisms that allow a process to detect that another process has failed and why it failed. The ability to detect the failure of another process does not make the detecting process itself fail.

A failure in one process can cause other processes to fail, if and only if that behaviour is explicitly programmed.
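The two cases can be sketched with monitors and links. In this hypothetical module isolation, monitor_demo/0 observes a failure without being affected by it, while link_demo/0 explicitly ties process B to process A so that B fails when A does.

```erlang
-module(isolation).
-export([monitor_demo/0, link_demo/0]).

%% A monitor lets the caller observe a failure, and its reason,
%% without failing itself.
monitor_demo() ->
    {Pid, Ref} = spawn_monitor(fun() -> exit(boom) end),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> {observed, Reason}
    after 1000 ->
        timeout
    end.

%% A link is the explicit request to fail together:
%% B terminates because the process it linked to (A) terminated abnormally.
link_demo() ->
    Parent = self(),
    A = spawn(fun() -> receive crash -> exit(boom) end end),
    B = spawn(fun() ->
                  link(A),
                  Parent ! linked,
                  receive after infinity -> ok end
              end),
    receive linked -> ok end,
    RefB = monitor(process, B),
    A ! crash,
    receive
        {'DOWN', RefB, process, B, Reason} -> {b_died_because, Reason}
    after 1000 ->
        timeout
    end.
```

monitor_demo() returns {observed, boom}, and link_demo() returns {b_died_because, boom}. In both cases the calling process keeps running.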


Abstracting Out Concurrency

Although Erlang makes it easy to create concurrent programs, it also has the means to separate the concurrent parts of the code from the sequential parts. This makes a program easier to debug and understand, since the sequential parts are often easier to deal with than the concurrent parts, which have to deal with many processes, the order of message delivery, live-locks and dead-locks.

Erlang, like many other programming languages, deals with the problem by separating a program into a generic component that can be parameterised with a number of plug-ins. The difference is that Erlang provides a very powerful environment for the plug-ins to execute in: error handling and dynamic code replacement in particular make it easy to keep the separation clean and effective.

There are two ways of providing the separation.

  • Write both the generic component and the plug-ins yourself.
  • Use the Erlang/OTP library's behaviours as the generic components and just write the plug-ins yourself.
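The first option can be sketched as follows. This is a simplified illustration, not the OTP implementation: the hypothetical module mini_server holds all the concurrent code (process, receive loop, message protocol), and the plug-in is a purely sequential fun that maps a request and a state to a reply and a new state.

```erlang
-module(mini_server).
-export([start/2, call/2, counter/0]).

%% Generic component: owns the process and the receive loop.
%% Handler is the plug-in: fun(Request, State) -> {Reply, NewState}.
start(Handler, State) ->
    spawn(fun() -> loop(Handler, State) end).

%% Generic client side of the protocol.
call(Server, Request) ->
    Ref = make_ref(),
    Server ! {call, self(), Ref, Request},
    receive
        {Ref, Reply} -> Reply
    after 1000 ->
        exit(timeout)
    end.

loop(Handler, State) ->
    receive
        {call, From, Ref, Request} ->
            {Reply, NewState} = Handler(Request, State),
            From ! {Ref, Reply},
            loop(Handler, NewState)
    end.

%% Plug-in: a counter written as sequential code only.
counter() ->
    start(fun(increment, N) -> {N + 1, N + 1};
             (read, N)      -> {N, N}
          end, 0).
```

After C = mini_server:counter(), two calls to mini_server:call(C, increment) return 1 and then 2. The plug-in contains no spawn, send or receive, so it can be tested like any sequential function.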

The Erlang/OTP library provides five behaviours that can cover most needs:

  • gen_server: servers for client-server set-ups.
  • gen_fsm: finite state machines.
  • gen_event: event handlers, e.g., error loggers.
  • supervisor: used to monitor worker processes for failures.
  • application: used to bundle components together as an application.

It is also possible to create your own behaviours, but the test of time has shown that the five Erlang/OTP behaviours can deal with most practical problems without too many artificial code tricks.
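As an illustration of the second option, here is a small counter written as a gen_server plug-in. The module name counter_server and its client functions are hypothetical; the callbacks init/1, handle_call/3 and handle_cast/2 are the ones the gen_server behaviour expects.

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Client API
-export([start_link/0, increment/1, read/1]).
%% gen_server callbacks (the plug-in)
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, 0, []).

increment(Pid) ->
    gen_server:call(Pid, increment).

read(Pid) ->
    gen_server:call(Pid, read).

%% The callbacks are sequential code; gen_server supplies the process,
%% the message loop and the call protocol.
init(Start) ->
    {ok, Start}.

handle_call(increment, _From, N) ->
    {reply, N + 1, N + 1};
handle_call(read, _From, N) ->
    {reply, N, N}.

handle_cast(_Msg, N) ->
    {noreply, N}.
```

After {ok, Pid} = counter_server:start_link(), the call counter_server:increment(Pid) returns 1 and counter_server:read(Pid) then returns 1.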

Everything is a process

Erlang's view of the world is that everything is a process and that the only way for the processes to interact is by sending messages to each other.

The creators of Erlang never intended that all software in a system should be written in Erlang (see Bjarne's technical doctor dissertation for details - todo: find reference). When interfacing with external software, one can maintain Erlang's view of the world by writing an interface program which maintains the illusion that everything is a process.

Erlang comes with a rich library that provides interface programs for many things, such as TCP, UDP and files (ref: kernel). There are libraries for writing drivers to programs in C and Java, plus guidelines for interfacing to programs in other languages (ref: linked driver).
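A small sketch of this illusion using the kernel library's gen_tcp module: with a socket in active mode, incoming TCP data arrives in the owning process's mailbox as an ordinary Erlang message, just as if another process had sent it. The module name socket_msgs is hypothetical, and the example assumes the loopback interface 127.0.0.1 is available.

```erlang
-module(socket_msgs).
-export([run/0]).

run() ->
    %% Listen on an OS-assigned port on the loopback interface.
    {ok, Listen} = gen_tcp:listen(0, [binary, {ip, {127,0,0,1}},
                                      {active, false}]),
    {ok, Port} = inet:port(Listen),
    %% The client socket is in active mode: received data is delivered
    %% to this process as {tcp, Socket, Data} messages.
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {active, true}]),
    {ok, Server} = gen_tcp:accept(Listen, 1000),
    ok = gen_tcp:send(Server, <<"hello">>),
    Result =
        receive
            {tcp, Client, Data} -> {received, Data}
        after 1000 ->
            timeout
        end,
    gen_tcp:close(Client),
    gen_tcp:close(Server),
    gen_tcp:close(Listen),
    Result.
```

The receive clause that handles the TCP data looks exactly like one that handles a message from another Erlang process.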

Error Handling Philosophy

This is where Erlang takes a radically different approach from most other technologies, and it is this approach that greatly simplifies the work of the programmer, as we shall see.

Let some other process do the error handling

No matter how hard one tries to handle all possible errors in a piece of code, chances are that some errors remain, and when one of them occurs the code will fail.

Erlang is designed for remote handling of errors, which gives the following benefits compared to more traditional programming languages:

  1. The failing code and the error-handling code run in different threads of control.
  2. The code which deals with the error is not woven into the code that solves the real problem.

The different threads of control mean that a failure only affects one part of the program. Separating the error handling from the value-adding code is an example of separation of cross-cutting concerns, the idea that underlies Aspect Oriented Programming (ref: Wikipedia article).
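A minimal sketch of remote error handling, assuming a hypothetical module remote_handling: work/1 contains only the "real" logic, while run/1 runs it in a separate linked process and deals with the failure from the outside by trapping exit signals.

```erlang
-module(remote_handling).
-export([run/1]).

%% Value-adding code: no error handling at all.
%% Dividing by zero simply crashes the process that runs it.
work(X) ->
    100 div X.

%% Error-handling code: runs in a different process from work/1.
run(X) ->
    %% Turn exit signals from linked processes into messages,
    %% remembering the previous setting so it can be restored.
    Old = process_flag(trap_exit, true),
    Parent = self(),
    Pid = spawn_link(fun() -> Parent ! {result, self(), work(X)} end),
    Outcome =
        receive
            {result, Pid, Value} ->
                %% Consume the normal-exit message that follows the result.
                receive {'EXIT', Pid, normal} -> ok end,
                {ok, Value};
            {'EXIT', Pid, Reason} ->
                {worker_failed, Reason}
        end,
    process_flag(trap_exit, Old),
    Outcome.
```

remote_handling:run(4) returns {ok, 25}, while remote_handling:run(0) returns {worker_failed, Reason}, where Reason is a tuple whose first element is badarith. The run-time system may also print a crash report for the failed process.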


Workers and Supervisors

Just letting some other process do the error handling is not enough to create good architectures, so Erlang uses the terms workers and supervisors for the two rôles.

Worker processes are the ones providing the value-add for the customer. Supervisor processes observe the workers and deal with errors occurring in them.
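The two rôles can be sketched with the OTP supervisor behaviour. The module name sup_demo, the trivial worker and the restart limits are assumptions made for this example; a real worker would typically be a gen_server or similar.

```erlang
-module(sup_demo).
-behaviour(supervisor).

-export([start_link/0, start_worker/0, worker_pid/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Supervisor callback: describes the worker and how to react to failures.
init([]) ->
    %% Restart a failed child on its own; give up after more than
    %% 5 restarts within 10 seconds.
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Worker: a trivial process linked to the supervisor.
start_worker() ->
    {ok, spawn_link(fun worker_loop/0)}.

worker_loop() ->
    receive
        _Any -> worker_loop()
    end.

%% Helper: ask the supervisor for the current worker process.
worker_pid() ->
    [{worker, Pid, _Type, _Modules}] = supervisor:which_children(?MODULE),
    Pid.
```

After sup_demo:start_link(), killing the worker with exit(sup_demo:worker_pid(), kill) causes the supervisor to start a replacement; a moment later sup_demo:worker_pid() returns a different process identifier.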

Fail-fast

The Fail-fast approach to failures (ref: wikipedia article http://en.wikipedia.org/wiki/Fail_Fast) is to "immediately report at its interface any failure or condition that is likely to lead to failure".

Erlang uses the fail-fast approach, but it is often referred to by Joe Armstrong and other Erlangers by the slogan "Let it crash".

Since another process will deal with an error, the implication for programmers is that they should let their programs crash in the event of an error.

There is a distinction between exceptions and errors:

exceptions
occur when the run-time system does not know what to do.
errors
occur when the programmer does not know what to do.

When the programmer has foreseen an exception and provided code to handle it, it is not an error.

When you as a programmer do not know what to do, the traditional approach is to add some defensive code which, in most cases, simply terminates the program, since any other solution implies that you knew what to do! This principle is built into the Erlang system: if a particular situation has not been handled by the programmer, the compiler generates code that makes the program fail, and the diagnostic is often just as good as what the programmer would have supplied.

This approach means that the code is not cluttered with defensive code, and it is thus a lot easier to understand and maintain.
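A small sketch of the idea, using a hypothetical module fail_fast: the function below lists only the cases the programmer knows how to handle and has no catch-all clause.

```erlang
-module(fail_fast).
-export([area/1]).

%% Only the known shapes are handled. There is deliberately no
%% "otherwise" clause: an unknown shape makes the call fail at once.
area({square, Side}) ->
    Side * Side;
area({circle, Radius}) ->
    math:pi() * Radius * Radius.
```

fail_fast:area({square, 3}) returns 9. A call such as fail_fast:area({triangle, 3, 4}) fails with a function_clause error that names the function and the offending argument, which is about as much as hand-written defensive code would have reported.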

Intentional Programming

The term was introduced by Joe Armstrong in his dissertation (ref:...)

intentional programming
a style of programming where the reader of a program can easily see what the programmer intended by their code.

This is more a style than a feature of Erlang, but the rich libraries that come with Erlang try to use this style, and it fits well with Erlang's features and semantics.

This is better explained with the example from section 4.5 of Joe Armstrong's dissertation (ref:...): Erlang's dict module (link: http://erlang.org/doc/man/dict.html) provides a Key-Value dictionary. After extensive use of the library, three patterns of use were identified:

data retrieval
the programmer knows a specific key should be in the dictionary and it is an error if it is not.
search
it is unknown if the key is there or not and both cases must be dealt with.
test
knowing if a key is present is enough.

The dict module provides these three functions to cater for the three different intentions:

dict:fetch(Key, Dict) = Val | EXIT
dict:find(Key, Dict) = {ok, Val} | error
dict:is_key(Key, Dict) = Boolean

Note that the functionality of all three can be implemented with the fetch or find function alone, but then the intention of the programmer using the dict module would be obfuscated by extra code.
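The three intentions can be seen side by side in the following sketch. The module name dict_intent and the keys colour and size are made up for the example; the dict functions are the ones listed above.

```erlang
-module(dict_intent).
-export([demo/0]).

demo() ->
    D = dict:store(colour, blue, dict:new()),

    %% Data retrieval: the key must be there. If it were missing,
    %% dict:fetch/2 would fail and so would this process.
    blue = dict:fetch(colour, D),

    %% Search: the key may or may not be there; both cases are handled.
    Size = case dict:find(size, D) of
               {ok, S} -> S;
               error   -> unknown
           end,

    %% Test: only the presence of the key matters.
    HasColour = dict:is_key(colour, D),

    {Size, HasColour}.
```

dict_intent:demo() returns {unknown, true}. A reader can tell from the choice of function alone which of the three situations the programmer expected.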

Implications on Software Engineering

Open question: should this be a separate article?

The Erlang programming techniques presented here have implications not only for how Erlang programs look, but also for the software engineering disciplines used to create a system.

  • more value-adding code (Jan Henry Nyström's work with Motorola)
  • aggressive coding style which gives higher productivity (ref: 4x article + ? )
  • hot code upgrade (enables a more agile approach to software releases)
  • documentation: functionality and error handling separated
  • ...