copyright : http://www.cis.upenn.edu/~matuszek/General/ConciseGuides/concise-erlang.html
A Concise Guide to Erlang
Copyright ©2010, David Matuszek
About Erlang
Erlang is an expression-oriented, single-assignment, garbage-collected, purely functional language. There are no loops, so recursion is heavily used. Erlang is quite a small language. It is of interest primarily because of its approach to concurrency, using Actors. Actors have subsequently been incorporated into other languages, most importantly Clojure and Scala. Erlang is most suitable for building extremely reliable, fault-tolerant systems that do not need to be shut down in order to be upgraded. Its extremely convenient bit-manipulation makes it an excellent language for low-level communications.
Running Erlang
As with many languages, Erlang can be run in a REPL (Read-Eval-Print-Loop) "shell." Short pieces of code can be tested directly in the shell. To start the shell, enter erl at the command line. Within the shell,
- Use c(module). or c("module.erl"). to compile a module named module from the file module.erl. The parameter to c should be an atom (the module name) or a string (the file name).
- Use f(). to clear (forget) previous associations.
- Use the up and down arrows to choose a previous entry.
- Use control-C (then a, for abort) or q(). to exit the shell.
- Directives beginning with a minus sign cannot be used in the REPL. In particular, you cannot import any files.
- Functions cannot be defined in the REPL.
- Except for a very few built-in functions, function calls must be prefixed by the name of the module in which they are defined; for example, lists:map(args), my_module:my_function(args).
Directives
Every Erlang program should begin with a module directive, of the form
-module(filename).
and be saved in a file with the name filename.erl.
To provide functions defined in this file to other programs, use
-export([function1/arity1, ..., functionN/arityN]).
where the "arity" is the number of parameters expected by the function.
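As a rough illustration of the module and export directives, here is a minimal sketch of a complete source file. The module name directives_demo and its functions are hypothetical, chosen only for this example; the file would be saved as directives_demo.erl.

```erlang
%% File: directives_demo.erl (hypothetical name, for illustration only)
-module(directives_demo).
-export([double/1, triple/1]).   % only these two functions are visible outside

%% Exported with arity 1.
double(X) -> scale(X, 2).

%% Exported with arity 1.
triple(X) -> scale(X, 3).

%% Not listed in -export, so callable only from inside this module.
scale(X, Factor) -> X * Factor.
```

After compiling, directives_demo:double(21) can be called from the shell, whereas a call to directives_demo:scale(21, 2) fails because scale/2 is not exported.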
To use functions defined in another file, use
-import(filename, [function1/arity1, ..., functionN/arityN]).
where the "arity" is as above. Imported functions may be called without a filename: prefix.
To define a record:
-record(Name, {Key1 = Default1, ..., KeyN = DefaultN}).
where the Keys are atoms; the default values are optional. Records may be defined in Erlang source files or in files with the extension .hrl, but may not be defined in the REPL.
To specify compiler options:
-compile(Options).
The export_all option is useful for debugging, but should be avoided in production code.
Documentation
Comments begin with a % character and continue to the end of the line.
Erlang uses EDoc (inspired by Javadoc). EDoc comments go before a module or a function. Some of the tags that can be used for a module are @author, @copyright, @deprecated, @doc (followed by XHTML), and @version. Some of the tags that can be used for a function are @deprecated, @doc (followed by XHTML), @private, and @spec.
Variables
Erlang is a single-assignment language. That is, once a variable has been given a value, it cannot be given a different value. In this sense it is like algebra rather than like most conventional programming languages.
Variables must begin with a capital letter or an underscore, and are composed of letters, digits, and underscores.
The special variable _ is a "don't care" variable--it does not retain its value. It is as if every occurrence of _ is a new, different variable.
Erlang issues a warning if a variable occurs only once in a function. To eliminate this warning, use an underscore as the first character of the variable name.
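The following sketch shows what single assignment looks like in practice. The module and function names are hypothetical. Once X is bound, a later X = N is a match against the existing value, not a reassignment, and it fails with a badmatch error if the values differ.

```erlang
-module(single_assignment_demo).
-export([rebind/1, first/1]).

%% X is bound to 1; the later "X = N" only succeeds when N is also 1.
rebind(N) ->
    X = 1,
    try
        X = N,
        same
    catch
        error:{badmatch, Value} -> {badmatch, Value}
    end.

%% _Rest occurs only once; the leading underscore suppresses the warning.
first([Head | _Rest]) -> Head.
```

Here rebind(1) returns same, while rebind(2) returns {badmatch, 2}.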
Data types
Erlang has:
- Integers, of unlimited size: 1112223344455666777888999000.
  - Integers may be written in any base from 2 to 36, with the syntax base#number, for example, 16#3FF is 1023.
  - The ASCII value of characters can be written as $c, for example, $A is 65, and $\n is 10.
- Floats: 1234.5678, 6.0221415e23.
- Strings, enclosed in double quotes: "This is a string."
  - A string is implemented as a list of ASCII (integer) values; how it is printed depends on whether it contains non-ASCII values.
  - Erlang has no Unicode support, but Unicode strings can be represented as a list of integers.
  - Standard escape sequences, such as \n and \t, may be used in strings.
- Atoms. An atom stands for itself. It begins with a lowercase letter and is composed of letters, digits, and underscores, or it is any string enclosed in single quotes: atom1, 'Atom 2'.
  - Erlang has no separate "boolean" type, but uses the atoms true and false to represent boolean values.
- Lists, which are a comma-separated sequence of values enclosed in brackets: [abc, 123, "pigs in a tree"].
- Tuples, which are a comma-separated sequence of values enclosed in braces: {abc, 123, "pigs in a tree"}.
  - Because the values in tuples are "anonymous," a common technique is to use name-value pairs, with the name being an atom: {{name, "Pat"}, {age, 27}, {gender, female}}.
- Records, which are not a separate data type, but are just tuples with keys associated with each value. They are declared in a file and defined (given specific values) in the program.
- Binaries, enclosed in double angle brackets: <<0, 255, 128, 128>>, <<"hello">>, <<X:3, Y:7, Z:6>>. Binaries are sequences of bits; the number of bits in a binary must be a multiple of 8.
  - Erlang has extremely good support for binaries, most of which is beyond the scope of this paper.
- References are globally unique values, created by calling make_ref().
- Process identifiers (Pids) are the "names" of processes.
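A few of the literal forms listed above can be tried out with a small module like the one below. This is only a sketch; the module name data_types_demo and its functions are made up for the example.

```erlang
-module(data_types_demo).
-export([literals/0, pack/0]).

%% Returns a list of values built from different literal syntaxes.
literals() ->
    [16#3FF,                    % base 16
     2#1010,                    % base 2
     $A,                        % character code of A
     $\n,                       % character code of newline
     "abc" =:= [97, 98, 99],    % a string is a list of integers
     is_atom('Atom 2'),         % quoted atoms are still atoms
     is_atom(true)].            % booleans are just atoms

%% Packs three fields of 3, 7 and 6 bits into a 16-bit binary.
pack() -> <<5:3, 47:7, 13:6>>.
```

The three fields in pack/0 add up to 16 bits, so the result is a two-byte binary.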
Type tests and conversions
To test for or convert "strings," recall that strings are actually lists of integers.
Type tests
is_atom(X) | is_function(X) | is_number(X) | is_tuple(X) |
is_binary(X) | is_function(X, N) | is_pid(X) | is_record(X) |
is_constant(X) | is_integer(X) | is_port(X) | is_record(X, Tag) |
is_float(X) | is_list(X) | is_reference(X) | is_record(X, Tag, N) |
Type conversions
atom_to_list(Atom) | float_to_list(Float) | list_to_binary(List) | round(Float) |
binary_to_list(Binary) | integer_to_list(Integer) | list_to_integer(List) | trunc(Float) |
float(Integer) | list_to_atom(List) | list_to_tuple(List) | |
list_to_float(List) | list_to_existing_atom(List) | tuple_to_list(Tuple) | |
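To make the tables above more concrete, here is a small sketch that uses several of the type tests as guards and calls a handful of the conversion functions. The module name conversion_demo and the function names are hypothetical.

```erlang
-module(conversion_demo).
-export([classify/1, conversions/0]).

%% Type tests are most often used as guards.
classify(X) when is_integer(X) -> integer;
classify(X) when is_float(X)   -> float;
classify(X) when is_atom(X)    -> atom;
classify(X) when is_list(X)    -> list;      % strings land here too
classify(X) when is_tuple(X)   -> tuple;
classify(X) when is_binary(X)  -> binary;
classify(_)                    -> other.

%% A sample of the conversion BIFs.
conversions() ->
    [atom_to_list(hello),
     list_to_atom("hello"),
     integer_to_list(42),
     list_to_integer("42"),
     tuple_to_list({a, b}),
     list_to_tuple([a, b]),
     binary_to_list(<<"hi">>),
     list_to_binary("hi"),
     float(3),
     round(2.6),
     trunc(2.6)].
```

Note that classify("abc") returns list, since a string is a list of integers.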
Operations
Values of different types can be compared with one another; the types are ordered as follows:
number < atom < reference < fun < port < pid < tuple < list < binary.
Pattern matching
The pattern matching expression
Pattern matching is the fundamental operation in Erlang. A simple pattern matching expression looks like an assignment statement in other languages: pattern = expression.
This says to evaluate the expression, and try to match the result to the pattern. In this context, it is an error if the pattern match does not succeed. Note that every statement in Erlang ends with a period.
In general, pattern matching succeeds in the following cases:
- The pattern is an unbound variable. When the pattern match succeeds, the variable is bound to the value of the expression.
- The pattern is bound to a value, and the expression evaluates to the same value.
- The pattern is a structure (list or tuple) which may contain unbound variables, and the expression results in the same structure; when the pattern match succeeds, the unbound variables become bound to the corresponding parts of the evaluated expression.
Examples
Variable = expression.
- The expression is evaluated.
- If the Variable has no previous value, it is given the value of the expression; this makes the two sides equal, so the pattern match succeeds.
- If the Variable has a previous value, and it is equal to the value of the expression, then the pattern match succeeds, otherwise it fails.
[H|T] = expression.
- If the value of the expression is a nonempty list, H is matched against the head of the list (the first element) and T is matched against the tail of the list (the remaining elements). If either fails to match, or if the expression does not evaluate to a nonempty list, the pattern match fails. Note that H and T may be variables, literals, or expressions.
[H1, H2, ..., HN|T] = expression.
- H1, H2, ..., HN are matched against the first N elements of the list, and T is matched against the remaining elements. If any part fails to match, the pattern match fails.
{A, B, C} = {X, Y, Z}.
- The expressions on the right are evaluated and compared, in order, against the patterns on the left (that is, A=X, B=Y, C=Z). In order for the pattern match to succeed, the tuples must be the same length, and corresponding parts must match.
#Name{Key = Variable, ..., Key = Variable} = Record.
- The Variables are matched against the values of the named Keys in the Record.
<<Pattern:Size, ..., Pattern:Size>> = Binary.
- The values in the Binary are unpacked into their component parts and matched against the Patterns.
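The pattern forms in the examples above can be exercised with a short module such as the following sketch. The module name match_demo and the function names are assumptions made for this example.

```erlang
-module(match_demo).
-export([run/0, try_match/1]).

run() ->
    [H | T] = [1, 2, 3],                  % head and tail of a list
    [C1, C2 | Rest] = "hello",            % first two elements, then the rest
    {point, X, Y} = {point, 3, 4},        % tuple with a literal atom in front
    <<Hi:4, Lo:4>> = <<16#AB>>,           % one byte split into two 4-bit fields
    {H, T, C1, C2, Rest, X, Y, Hi, Lo}.

%% A failed match raises a badmatch error, which can be caught.
try_match(Value) ->
    try
        {ok, Result} = Value,
        {matched, Result}
    catch
        error:{badmatch, V} -> {no_match, V}
    end.
```

Because strings are lists of integers, C1 and C2 are bound to the character codes of h and e.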
Case expressions
The case expression uses pattern matching, and has the following syntax:
case Expression of
    Pattern1 [when Guard1] -> Expression_sequence1;
    Pattern2 [when Guard2] -> Expression_sequence2;
    ...
    PatternN [when GuardN] -> Expression_sequenceN
end
The brackets indicate that the when part (which is just a condition) is optional. The expression is evaluated, and the patterns are tried, in order. When a matching pattern is found (and its associated guard, if present, is true), the corresponding expression sequence is evaluated. The value of an expression sequence is the value of the last expression, and that becomes the value of the case.
If expressions
The if expression is like a case expression without the pattern matching.
if
    Guard1 -> Expression_sequence1;
    Guard2 -> Expression_sequence2;
    ...
    GuardN -> Expression_sequenceN
end
The value of the if expression is the value of the expression sequence that is chosen. The value of an expression sequence is the value of the last expression executed. It is an error if no guard succeeds; hence, it is common to use true as the last guard.
Guards
Guards may not have side effects. To ensure this, user-defined functions are not allowed in guards. Things that may be used are: type tests, boolean operators, bitwise operators, arithmetic operators, relational operators, and the following BIFs (Built In Functions):
abs(Number) | hd(List) | node(X) | size(TupleOrBinary) |
element(Integer, Tuple) | length(List) | round(Number) | trunc(Number) |
float(Number) | node() | self() | tl(List) |
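The case and if expressions, together with guards, might be used along the following lines. This is a sketch only; the module name branching_demo and the function names are hypothetical.

```erlang
-module(branching_demo).
-export([describe/1, sign/1]).

%% case: patterns are tried in order; guards refine a pattern.
describe(Value) ->
    case Value of
        [] -> empty_list;
        [_ | _] when length(Value) > 3 -> long_list;
        [_ | _] -> short_list;
        {X, Y} when is_number(X), is_number(Y) -> {point, X, Y};
        _ -> unknown
    end.

%% if: guards only, with true as the catch-all last guard.
sign(N) ->
    if
        N > 0 -> positive;
        N < 0 -> negative;
        true  -> zero
    end.
```

The guard length(Value) > 3 uses only a BIF from the table above, which is why it is allowed.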
Defining functions
A function is a value, or first-class object. That means it can be assigned to a variable, or given as an argument to a function, or returned as the value of a function.
Named functions
The syntax for a named function is a series of one or more clauses:
name(Patterns1) -> Body1;
name(Patterns2) -> Body2;
...
name(PatternsN) -> BodyN.
where
- The name and the arity (number of patterns given as parameters) are the same for each clause.
- Clauses are tried in order until one of the parameter lists (sequence of patterns) matches, then the corresponding Body is evaluated.
- Each Body consists of a sequence of expressions, separated by commas; the value of the sequence, and therefore the value of the function, is the value of the last expression evaluated.
- It is an error if no parameter list matches.
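The rules above can be seen in a small multi-clause function. The module shapes_demo and the shape tuples are invented for this sketch.

```erlang
-module(shapes_demo).
-export([area/1]).

%% Three clauses with the same name and arity, tried in order.
area({square, Side}) -> Side * Side;
area({rectangle, Width, Height}) -> Width * Height;
area({circle, Radius}) ->
    Pi = math:pi(),          % a body may be a sequence of expressions
    Pi * Radius * Radius.    % the last expression is the function's value
```

Calling area/1 with a tuple that matches none of the clauses, such as {triangle, 1, 2, 3}, raises a function_clause error.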
Recursion
Recursion is when a function calls itself, either directly (f calls f) or indirectly (f calls g, which calls h, ..., which calls f). Any program which uses a loop can be rewritten to use recursion, and vice versa. Erlang has no loops, therefore recursion is used heavily.
Here is one way to write the equivalent of a loop in Erlang:
myFunction(Args1) ->
    Args2 = someExpression(Args1),
    myFunction(Args2).
Tail recursion is when the recursive call is the very last thing done in the function. As an example, the usual definition of the factorial function,
factorial(0) -> 1;
factorial(N) -> N * factorial(N - 1).
is not tail recursive, because a multiplication is performed after the recursive call.
In general, each recursive call adds information to an internal stack; very deep recursions can cause Erlang to run out of memory. Tail recursion is desirable because the compiler can easily change a tail recursion into a loop, which does not add information to the stack, and therefore does not cause memory problems.
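For comparison, one possible tail-recursive version of factorial is sketched below. It carries the running product in an extra accumulator argument, so that the recursive call is the last thing done. The module name factorial_demo is hypothetical.

```erlang
-module(factorial_demo).
-export([factorial/1]).

%% Public entry point: starts the helper with an accumulator of 1.
factorial(N) when is_integer(N), N >= 0 -> factorial(N, 1).

%% Helper: the multiplication happens before the recursive call,
%% so nothing remains to be done after the call returns.
factorial(0, Acc) -> Acc;
factorial(N, Acc) -> factorial(N - 1, N * Acc).
```

The two functions share a name but differ in arity, so factorial/1 and factorial/2 are distinct functions; only factorial/1 is exported.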
Functions that are not tail recursive (such as factorial) can usually be rewritten as tail recursive functions, with the aid of a helper function. As with many optimizations, this is not recommended until proven necessary, because the resultant code is harder to read and understand.
Anonymous functions
The syntax for an anonymous function is
fun(Patterns1) -> Body1;
   (Patterns2) -> Body2;
   ...
   (PatternsN) -> BodyN
end
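A short sketch of anonymous functions in use follows. The module name fun_demo and its functions are hypothetical. It binds a multi-clause fun to a variable, passes a fun as an argument, and refers to the named function square/1 with the fun Name/Arity syntax.

```erlang
-module(fun_demo).
-export([run/0, twice/2, square/1]).

square(X) -> X * X.

%% Takes a function as an argument and applies it twice.
twice(F, X) -> F(F(X)).

run() ->
    %% An anonymous function with several clauses and a guard.
    Sign = fun(N) when N < 0 -> negative;
              (0) -> zero;
              (_) -> positive
           end,
    AddOne = fun(X) -> X + 1 end,
    {Sign(-5), Sign(0), twice(AddOne, 5), twice(fun square/1, 3)}.
```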
Functions as first-class objects
Functions are values. That is, they may be assigned to variables, passed as arguments to functions, and returned as the result of functions.
An anonymous function may be used as a literal value. A named function may be referred to by using the syntax
fun FunctionName/Arity.
Lists
A list literal can be written as a bracketed, comma-separated list of values. The values may be of different types. Example: [5, "abc", [3.2, {a, <<255>>}]].
A list comprehension has the syntax
[Expression || Generator, GuardOrGenerator, ..., GuardOrGenerator]
where
- The Expression typically makes use of variables defined by a Generator,
- A Generator provides a sequence of values; it has the form
Pattern <- List
, - A Guard is a test that determines whether the value will be used in the Expression.
- At least one Generator is required; Guards and additional Generators are optional.
N = [1, 2, 3, 4, 5].
L = [10 * X + Y || X <- N, Y <- N, X < Y]. % Result is [12,13,14,15,23,24,25,34,35,45]
hd(L) returns the first element in the list L; tl(L) returns the list of remaining elements.
Selected operations on lists
The following operations are predefined.
- hd(List) -> Element -- Returns the first element of the list.
- tl(List) -> List -- Returns the list minus its first element.
- length(List) -> Integer -- Returns the length of the list.
The following operations are in the lists module. To call them, either first import them, or prepend lists: to the function call.
The definitions are copied from http://www.erlang.org/doc/man/lists.html. Of these, the operations map, filter, foldl, and seq are the most commonly used.
- all(Pred, List) -> bool() -- Returns true if Pred(Elem) returns true for all elements Elem in List, otherwise false.
- any(Pred, List) -> bool() -- Returns true if Pred(Elem) returns true for at least one element Elem in List.
- append(List1, List2) -> List3 -- Returns a new list List3 which is made from the elements of List1 followed by the elements of List2. lists:append(A, B) is equivalent to A ++ B.
- dropwhile(Pred, List1) -> List2 -- Drops elements Elem from List1 while Pred(Elem) returns true and returns the remaining list.
- filter(Pred, List1) -> List2 -- List2 is a list of all elements Elem in List1 for which Pred(Elem) returns true.
  - Example: lists:filter(fun(X) -> X =< 3 end, [3, 1, 4, 1, 6]). % Result is [3,1,1]
- flatmap(Fun, List1) -> List2 -- Maps Fun to List1 and flattens the result.
- flatten(DeepList) -> List -- Returns a flattened version of DeepList.
- foldl(Fun, Acc0, List) -> Acc1 -- Calls Fun(Elem, AccIn) on successive elements Elem of List, starting with AccIn == Acc0. Fun/2 must return a new accumulator which is passed to the next call. The function returns the final value of the accumulator. Acc0 is returned if the list is empty.
  - Example: lists:foldl(fun(X, Y) -> X + 10 * Y end, 0, [1, 2, 3, 4, 5]). % Result is 12345
- foreach(Fun, List) -> void() -- Calls Fun(Elem) for each element Elem in List. This function is used for its side effects and the evaluation order is defined to be the same as the order of the elements in the list.
- map(Fun, List1) -> List2 -- Takes a function from As to Bs, and a list of As and produces a list of Bs by applying the function to every element in the list. This function is used to obtain the return values. The evaluation order is implementation dependent.
  - Example: lists:map(fun(X) -> 2 * X end, [1, 2, 3]). % Result is [2,4,6]
- member(Elem, List) -> bool() -- Returns true if Elem matches some element of List, otherwise false.
- partition(Pred, List) -> {Satisfying, NonSatisfying} -- Partitions List into two lists, where the first list contains all elements for which Pred(Elem) returns true, and the second list contains all elements for which Pred(Elem) returns false.
- reverse(List1) -> List2 -- Returns a list with the top level elements in List1 in reverse order.
- seq(From, To) -> Seq -- Returns a sequence of integers from From to To, inclusive.
- seq(From, To, Incr) -> Seq -- Returns a sequence of integers which starts with From and contains the successive results of adding Incr to the previous element, until To has been reached or passed (in the latter case, To is not an element of the sequence).
- sort(List1) -> List2 -- Returns a list containing the sorted elements of List1.
- takewhile(Pred, List1) -> List2 -- Takes elements Elem from List1 while Pred(Elem) returns true, that is, the function returns the longest prefix of the list for which all elements satisfy the predicate.
- unzip(List1) -> {List2, List3} -- "Unzips" a list of two-tuples into two lists, where the first list contains the first element of each tuple, and the second list contains the second element of each tuple.
- zip(List1, List2) -> List3 -- "Zips" two lists of equal length into one list of two-tuples, where the first element of each tuple is taken from the first list and the second element is taken from corresponding element in the second list.
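Several of the lists functions described above are typically chained together. The following sketch (module name lists_demo is hypothetical) builds a sequence, filters it, maps over it, folds it to a sum, and then partitions and zips the results.

```erlang
-module(lists_demo).
-export([run/0]).

run() ->
    Numbers = lists:seq(1, 10),
    Evens = lists:filter(fun(X) -> X rem 2 =:= 0 end, Numbers),
    Squares = lists:map(fun(X) -> X * X end, Evens),
    Sum = lists:foldl(fun(X, Acc) -> X + Acc end, 0, Squares),
    {Small, Large} = lists:partition(fun(X) -> X < 50 end, Squares),
    Pairs = lists:zip(Evens, Squares),
    {Evens, Squares, Sum, Small, Large, Pairs}.
```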
Selected operations on strings
Strings are lists of ASCII values, so all the list operations apply. The following are in the string module, so either import them or prepend each function call with string:.
The definitions are copied from http://www.erlang.org/doc/man/string.html.
- len(String) -> Length -- Returns the number of characters in the string.
- equal(String1, String2) -> bool() -- Tests whether two strings are equal.
- chr(String, Character) -> Index -- Returns the (1-based) index of the first occurrence of Character in String. 0 is returned if Character does not occur.
- rchr(String, Character) -> Index -- Returns the (1-based) index of the last occurrence of Character in String. 0 is returned if Character does not occur.
- str(String, SubString) -> Index -- Returns the (1-based) position where the first occurrence of SubString begins in String. 0 is returned if SubString does not exist in String.
- rstr(String, SubString) -> Index -- Returns the (1-based) position where the last occurrence of SubString begins in String. 0 is returned if SubString does not exist in String.
- substr(String, Start) -> Substring -- Returns a substring of String, starting at the position Start, and ending at the end of the string.
- substr(String, Start, Length) -> Substring -- Returns a substring of String, starting at the position Start, and containing Length characters.
- strip(String) -> Stripped -- Returns a string where the leading and trailing blanks have been removed.
- to_float(String) -> {Float,Rest} | {error,Reason} -- Argument String is expected to start with a valid text represented float (the digits being ASCII values). Remaining characters in the string after the float are returned in Rest.
- to_integer(String) -> {Int,Rest} | {error,Reason} -- Argument String is expected to start with a valid text represented integer (the digits being ASCII values). Remaining characters in the string after the integer are returned in Rest.
- to_lower(String) -> Result -- Returns a string in which uppercase characters have been converted to lowercase.
- to_upper(String) -> Result -- Returns a string in which lowercase characters have been converted to uppercase.
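As a small sketch of how a few of these functions behave, the module below (hypothetical name string_demo) parses numbers from the front of strings and also applies ordinary list operations to strings. It deliberately sticks to to_integer, to_float, and equal.

```erlang
-module(string_demo).
-export([run/0]).

run() ->
    {Int, Rest1} = string:to_integer("42 apples"),
    {Float, Rest2} = string:to_float("3.5kg"),
    {Int, Rest1, Float, Rest2,
     string:to_integer("apples"),          % does not start with an integer
     string:equal("abc", [97, 98, 99]),    % a string is a list of integers
     lists:reverse("abc"),                 % list operations apply to strings
     length("hello")}.
```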
Records
Records are declared in a file with the syntax
-record(Name, {Key1 = Default1, ..., KeyN = DefaultN}).
To read the record declarations from a file into the shell, use the function rr("records.hrl").
To define a record, use the syntax
Variable1 = #Name{Key = Value, ..., Key = Value}.
The default value is used for any omitted Key=Value pairs. A new, modified record may be created with the syntax
Variable2 = Variable1#Name{Key = Value, ..., Key = Value}.
Values may be extracted from a record by using pattern matching:
#Name{Key = Variable, ..., Key = Variable} = Record.
This assigns to the Variables the corresponding Values in the Record.
Pattern matching may be used in function definitions:
FunctionName(#Name{Key = Variable, ..., Key = Variable} = Variable) -> FunctionBody.
This makes the selected values, and the entire record (the last Variable) available in the function body.
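Putting the record syntax together, a minimal sketch might look like this. The person record, its fields, and the module name record_demo are all invented for illustration.

```erlang
-module(record_demo).
-export([new/1, birthday/1, describe/1]).

%% name has no default (so it defaults to undefined); the others do.
-record(person, {name, age = 0, gender = unknown}).

%% Create a record; omitted keys take their default values.
new(Name) -> #person{name = Name}.

%% Match one field and the whole record, then build a modified copy.
birthday(#person{age = Age} = Person) -> Person#person{age = Age + 1}.

%% Extract selected fields by pattern matching in the function head.
describe(#person{name = Name, age = Age}) -> {Name, Age}.
```

Outside the module, a value returned by new/1 appears as the plain tuple {person, Name, 0, unknown}.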
A record is actually a tuple; the keys are just syntactic sugar available to the compiler. The function rf(Record) tells Erlang to drop the keys and treat the variable Record as the tuple {Name, Variable1, ..., VariableN}. This changes the appearance of the variable in the program, not its actual value.
The process dictionary
The process dictionary is a mutable hash table that is private to the current process. Keys are usually atoms (although any term may be used); the value associated with a key may be changed. The use of a process dictionary negates many of the advantages of a single-assignment functional language, hence its use is strongly discouraged. Supplied operations are:
put(Key, Value) -> OldValue
- Associates the Value with the Key, returning the previous value associated with the Key, or the atom undefined.
get(Key) -> Value
- Returns the Value currently associated with the Key, or the atom undefined.
get() -> [{Key, Value}, ..., {Key, Value}]
- Returns a list of all Key/Value tuples.
get_keys(Value) -> [Key, ..., Key]
- Returns a list of Keys having the given Value.
erase(Key) -> Value
- Returns the Value currently associated with the Key, or the atom undefined, and removes the Key/Value pair from the process dictionary.
erase() -> [{Key, Value}, ..., {Key, Value}]
- Returns a list of all Key/Value tuples, and erases the contents of the process dictionary.
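The return values of these operations are easiest to see in a short sequence of calls. The sketch below assumes the key demo_counter is not already in use in the calling process; the key, the values, and the module name procdict_demo are hypothetical.

```erlang
-module(procdict_demo).
-export([run/0]).

run() ->
    First = put(demo_counter, {demo, 1}),    % no previous value
    Second = put(demo_counter, {demo, 2}),   % returns the old value
    Current = get(demo_counter),
    Keys = get_keys({demo, 2}),              % keys holding this value
    Erased = erase(demo_counter),            % returns and removes the value
    After = get(demo_counter),
    {First, Second, Current, Keys, Erased, After}.
```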
Concurrency
Concurrent programming is very simple in Erlang. There are three primitives:
Primitive | Description |
---|---|
Pid = spawn(Fun) | Creates and starts new process ("Actor") and tells it to evaluate Fun. The new process is a very lightweight Thread, managed by Erlang, not an operating system process. A previously defined function may be passed in with the syntax fun FunctionName/arity. |
Pid ! Message | Sends the Message to the Pid process. This is an asynchronous operation, that is, execution continues without waiting for a reply. The value of the expression is the Message itself. |
receive | The semantics are similar to that of the case expression. The syntax is a bit complex so that a process can handle messages of many different types. The after clause is optional. If it is used and there are no patterns between receive and after, the statement "sleeps" for the given number of milliseconds. |
When a process executes a receive statement, it takes the first message in its mailbox that can be matched by some Pattern, and executes the corresponding Expression_sequence. Unmatched messages are left in the mailbox.
To send a message to a process, you must know its process id (Pid). If you want the process to send you back a response, you must tell it your Pid, usually as part of the message; for example, Pid ! {MyPid, MessageData}.
It is also possible to register a Pid, thus making it globally available. Here are the BIFs (built-in functions) for doing that:
- register(AnAtom, Pid) -- gives the Pid a globally accessible "name," AnAtom.
- unregister(AnAtom) -- removes the registration. If a registered process dies, it is automatically unregistered.
- whereis(AnAtom) -> Pid | undefined -- gets the Pid of a registered process, or undefined if no such process.
- registered() -> [AnAtom :: atom()] -- returns a list of all registered processes.
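The three primitives can be combined into a small request/reply process, sketched below. The module name echo_demo, the message shapes, and the one-second timeout are assumptions made for this example. The client includes its own Pid in each message so that the server knows where to send the reply.

```erlang
-module(echo_demo).
-export([start/0, ask/2]).

%% Spawn a new process that runs the receive loop.
start() ->
    spawn(fun loop/0).

%% Server side: handle one message, then recurse to wait for the next.
loop() ->
    receive
        {From, {double, N}} ->
            From ! {self(), 2 * N},
            loop();
        {From, stop} ->
            From ! {self(), stopped}    % no recursive call: the process ends
    end.

%% Client side: send a request and wait up to one second for the reply.
ask(Pid, Request) ->
    Pid ! {self(), Request},
    receive
        {Pid, Reply} -> Reply
    after 1000 ->
        timeout
    end.
```

Once the server has stopped, sending to its Pid still succeeds (sends are asynchronous), but no reply arrives, so ask/2 returns timeout.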
Exceptions
In addition to exceptions resulting from program errors, there are three kinds of exceptions that the programmer can deliberately generate:
- exit(Reason) -- exits the current process, and broadcasts {'EXIT', Pid, Reason} to all linked processes.
- throw(Reason) -- throws an exception that the caller might want to catch.
- erlang:error(Reason) -- indicates a fatal error.
Exceptions are handled with the try..catch statement, with this syntax:
try FunctionOrExpressionSequence of
    Pattern1 [when Guard1] -> Expressions1;
    ...
catch
    ExType1:ExPattern1 [when ExGuard1] -> ExExpressions1;
    ...
after
    AfterExpressions
end
and this semantics:
- The FunctionOrExpressionSequence is evaluated,
  - If it completes successfully,
    - Its value is compared against the Patterns,
    - the Expression associated with the first matching Pattern is evaluated, and this is the value of the try..catch.
      - The when guards are optional.
  - If an exception occurs,
    - The first matching catch clause is evaluated
      - The ExTypes must be one of throw (default if omitted), exit, or error.
        - Special case: The syntax _:_ will catch every possible exception.
    - The value of the try..catch is the value of the corresponding ExExpression.
- In any event, the (optional) AfterExpressions are executed, but the resulting value is discarded.
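The semantics above can be sketched with a function that runs a fun and reports how it finished. The module name exception_demo and the result tuples are hypothetical.

```erlang
-module(exception_demo).
-export([classify/1]).

%% Runs Fun and reports whether it returned, threw, exited, or failed.
classify(Fun) ->
    try Fun() of
        Value -> {ok, Value}
    catch
        throw:Reason -> {thrown, Reason};
        exit:Reason  -> {exited, Reason};
        error:Reason -> {failed, Reason}
    after
        ok   % cleanup would go here; this value is discarded
    end.
```

For example, classify(fun() -> throw(oops) end) returns {thrown, oops}.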
Input/Output
As with most languages, there are a lot of I/O routines. Files can be read as binary, as a sequence of lines, or as Erlang terms. This paper describes only line-oriented I/O.
On output, data is interpolated (inserted) into the FormatString at the following locations (excluding ~n):
- ~s Print as a string.
- ~w Print any value in "standard syntax". Strings are printed as lists of integers.
- ~p Pretty print any value (breaking lines, indenting, etc.)
- ~n Print a newline.
Input from the console
Line = io:get_line(Prompt). % Prompt is a string or an atom
Output to the console
io:format(FormatString, ListOfData).
Input from a file
{ok, Stream} = file:open(FileName, read).
Line = io:get_line(Stream, ''). % May return eof
file:close(Stream).
Output to a file
{ok, Stream} = file:open(FileName, write).
io:format(Stream, FormatString, ListOfData).
file:close(Stream).
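The file operations above can be wrapped into a pair of helper functions, as in the following sketch. The module name file_io_demo and the function names are hypothetical, and error handling is kept to a minimum. Reading uses recursion to collect lines until io:get_line returns eof.

```erlang
-module(file_io_demo).
-export([write_lines/2, read_lines/1]).

%% Writes each string in Lines to FileName, one per line.
write_lines(FileName, Lines) ->
    {ok, Stream} = file:open(FileName, write),
    lists:foreach(fun(Line) -> io:format(Stream, "~s~n", [Line]) end, Lines),
    file:close(Stream).

%% Reads FileName and returns its lines (each still ending in a newline).
read_lines(FileName) ->
    {ok, Stream} = file:open(FileName, read),
    Lines = read_all(Stream, []),
    ok = file:close(Stream),
    Lines.

%% Collects lines in reverse, then restores the original order at eof.
read_all(Stream, Acc) ->
    case io:get_line(Stream, '') of
        eof             -> lists:reverse(Acc);
        {error, Reason} -> {error, Reason};
        Line            -> read_all(Stream, [Line | Acc])
    end.
```

Note that the lines returned by read_lines/1 keep their trailing newline characters.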