Introduction

Arena is a typed, functional, natively compiled programming language. It was created to compare garbage collection algorithms. To be a fair battleground for the competing algorithms, it does not perform any optimizations regarding memory management. Although the language is kept as minimal as possible to simplify the development of the compiler, it implements all core features of a typed, functional programming language: immutable variables, user defined types, pattern matching, functions, type checking and an import system.

Currently Arena features three garbage collection algorithms that can be selected at compile time:

  • Spill, as the name suggests, allocates all objects on the heap with a call to the malloc function provided by libc and never frees this storage until the program terminates.
  • ARC stands for automatic reference counting and is the default garbage collection algorithm. Every object allocated on the heap maintains a counter of the existing pointers to it. When this counter reaches 0, the object is no longer accessible and can therefore be freed.
  • TGC is the tracing garbage collection algorithm, currently a basic mark and copy collector. The available heap space is divided into two parts: an active heap and a copy heap. When the active heap is full, a garbage collection pass is triggered that traverses the stack and copies all reachable objects to the copy heap. Afterwards the roles are switched and the copy heap becomes the active one until the next collection.
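
The counting scheme described for ARC can be modelled in a few lines. The following is a minimal Python sketch of the idea, not Arena's actual runtime: the names Counted, retain and release are hypothetical, and it assumes that freeing an object also releases the objects it points to.

```python
class Counted:
    """A heap object with a reference count (hypothetical model, not Arena's layout)."""

    def __init__(self, name, children=()):
        self.name = name
        self.count = 0
        self.children = list(children)
        for child in self.children:
            retain(child)  # a field holding a pointer counts as a reference


freed = []  # names of the objects freed so far, in order


def retain(obj):
    """A new pointer to obj was created."""
    obj.count += 1


def release(obj):
    """A pointer to obj went away; free obj once nothing points to it."""
    obj.count -= 1
    if obj.count == 0:
        freed.append(obj.name)
        for child in obj.children:
            release(child)  # assumption: a freed object gives up its own pointers


tail = Counted("tail")
head = Counted("head", [tail])  # head points to tail, so tail.count is 1
retain(head)                    # a variable on the stack points to head
release(head)                   # the variable goes out of scope
```

After the last line both counters have dropped to 0 and the list freed contains first "head" and then "tail": releasing the only pointer to head cascades to the object it referenced.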

Hello World

When learning a new programming language it is customary to start with a Hello World program. For Arena it looks as follows:

Filename: hello_world.arena

fn main() = print("Hello World!\n")

To compile the program you need to call the Arena compiler and pass it the correct filename. Afterwards you can execute the binary. The default binary name is out.

$ arena hello_world.arena
$ ./out

If you want to specify another name you can do so with the -o flag.

$ arena hello_world.arena -o hello
$ ./hello

If you want to learn about all the other existing compiler flags you can list them with:

$ arena --help

Module

Each file with the extension .arena is interpreted by the Arena compiler as a module.

Each module consists of imports at the top of the file, followed by type definitions and finally arbitrarily many function definitions. The compiler needs a main function in the provided module as the entry point of the compiled binary. This is not necessary for imported modules.

Functions

A function in Arena consists of a defined set of typed input parameters, a return type and a body. For example, a function between that checks if a number a lies between two numbers b and c would be defined like this:

fn between(a: i32, b: i32, c: i32) -> bool = a > b && a < c

The instructions after the = are the body of the function. Given a specific instantiation of the function parameters the body outputs a value of the return type of the function. If the return type is void the function does not return anything.

Function calls work as in other mainstream languages: the arguments are passed between round brackets. To check if 1 is smaller than 3 and larger than -1 you can use:

between(1, -1, 3)

A function with return type void will drop the last computed value and not return anything. void is the default return type of a function and can be omitted in the function definition:

fn println(s: str) -> void = print(s); print("\n")

// Or:

fn println(s: str) = print(s); print("\n")

Data Types

Arena currently has 4 primitive data types:

  • void is the empty type
  • i32 is a signed 32 bit integer
  • bool is a boolean and can therefore be either true or false
  • str is a string, written as a literal between two double quotes

Users can also define their own types. These work similarly to user defined types in Haskell, where the keyword data is used. Each type can have several cases, which can contain fields to store data. Instances of these types are always allocated on the heap. The definition of a linked list of integers could look as follows:

type IntList {
    Nil,
    Cons(i32, IntList),
}

To construct a new instance you need to specify a case of the type and initialize all its fields:

let x = IntList.Cons(1, IntList.Cons(2, IntList.Cons(3, IntList.Nil)));
...

To access the fields of a user defined type you have to deconstruct it with a match expression. See the chapter on Control Flow to learn more.

As mentioned before, all user defined types are always allocated on the heap. The garbage collection strategy used to maintain the heap can affect the memory layout and the execution time but will never alter the execution result. You can select one of the three currently available garbage collection strategies with a compile time flag:

$ arena example.arena --spill  # will not free any memory until the program terminates
$ arena example.arena --arc    # automatic reference counting
$ arena example.arena          # arc is the default
$ arena example.arena --tgc    # tracing garbage collection (mark and copy collector)
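
That a collector may change the memory layout without changing the result can be illustrated with a toy version of the mark and copy idea. This is a minimal Python sketch under simplifying assumptions, not the TGC implementation: the class Obj and the function collect are hypothetical, the heaps are plain Python lists, and the roots stand in for the pointers found on the stack.

```python
class Obj:
    """A heap object: a value plus pointers to other heap objects."""

    def __init__(self, value, fields=()):
        self.value = value
        self.fields = list(fields)


def collect(roots):
    """Copy everything reachable from roots into a fresh copy heap.

    Returns (copy_heap, new_roots). Objects that are not reachable are never
    visited and therefore stay behind in the old heap.
    """
    copy_heap = []
    forwarded = {}  # id(old object) -> its copy, so shared objects are copied once

    def copy(obj):
        if id(obj) in forwarded:
            return forwarded[id(obj)]
        new_obj = Obj(obj.value)
        forwarded[id(obj)] = new_obj
        copy_heap.append(new_obj)
        new_obj.fields = [copy(field) for field in obj.fields]
        return new_obj

    new_roots = [copy(root) for root in roots]
    return copy_heap, new_roots


shared = Obj(3)
garbage = Obj(99)                       # nothing points to this object
left = Obj(1, [shared])
right = Obj(2, [shared])
heap, roots = collect([left, right])    # garbage is not a root
```

After the call, heap holds three objects: copies of left, right and shared. The copies live in a different place than the originals, but the values reachable from the roots are the same as before, and the object garbage was never copied.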

Operators

Arena has the following primitive operators:

  • + Adds two integers: 2 + 3
  • - Subtracts two integers or negates one: 2 - 3 or -2
  • * Multiplies two integers: 2 * 3
  • / Divides two integers and returns the integer result: 2 / 3 returns 0
  • % Divides two integers and returns the remainder: 2 % 3 returns 2
  • ! Negates a boolean: !true returns false and !false returns true
  • || Returns the logical or of two booleans: false || true *
  • && Returns the logical and of two booleans: true && false *
  • == Checks the two operands for equality: 2 == 3
  • != Checks the two operands for inequality: 2 != 3
  • < Checks if the first integer is smaller than the second one: 2 < 3
  • <= Checks if the first integer is smaller than or equal to the second one: 2 <= 3
  • > Checks if the first integer is larger than the second one: 2 > 3
  • >= Checks if the first integer is larger than or equal to the second one: 2 >= 3

* Both logical operators are lazy and only compute the second operand if necessary.
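
What lazy evaluation of the logical operators means can be modelled as follows. This is a small Python sketch, not Arena code: lazy_and, lazy_or and operand are hypothetical helpers, and each operand is passed as a function so that the sketch can record whether it was evaluated at all.

```python
def lazy_and(left, right):
    """Model of left && right; both arguments are zero-argument functions."""
    return right() if left() else False


def lazy_or(left, right):
    """Model of left || right; both arguments are zero-argument functions."""
    return True if left() else right()


evaluated = []  # names of the operands that were actually computed


def operand(name, value):
    """Build an operand that records its name when it is evaluated."""
    def thunk():
        evaluated.append(name)
        return value
    return thunk


and_result = lazy_and(operand("a", False), operand("b", True))
or_result = lazy_or(operand("c", True), operand("d", False))
```

Here and_result is false and or_result is true, and evaluated only contains "a" and "c": in both cases the first operand already decides the result, so the second one is skipped. This matters when the second operand has side effects such as printing.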

Variables

Like most programming languages, Arena features variables. As in all functional languages, though, the name is not really appropriate, because they are immutable.

They work as one would expect: first the variable's value is defined and afterwards this value can be accessed through the variable's name:

fn main() =
    let x = 10;
    if x == 10 then
        print("x has the value 10\n")
    else
        print("something went very wrong\n")

This is also the first time that we see the semicolon ;. It is compulsory after a variable definition, as the definition itself does not return a value and the expression is therefore not yet complete. It can also be used to chain two expressions, which only makes sense if the first expression has side effects. Pure functional programming languages in the mathematical sense should not have any side effects, but all languages end up having some, mainly through interactions with IO:

fn main() =
    print("Hello ");
    print("World!");
    print("\n")

It is important to keep in mind that parentheses in Arena also open a new namespace. Therefore variables defined inside a parenthesized block cannot be accessed after the block is closed:

fn main() =
    // accessing y outside of the then block would result in a compilation error
    let x = if true then (let y = 1; 2) else 3;
    x

The main function would exit with the value 2.

Control Flow

Arena currently has two control flow constructs: the if expression and the match expression.

if expression

As is usual in functional programming languages, control flow constructs in Arena are expressions rather than statements and therefore always return a value. The knowledge acquired in the previous chapters combined with an if expression suffices to calculate the 40th Fibonacci number:

fn main() = fib(40)

fn fib(n: i32) -> i32 =
    if n < 2 then
        n
    else
        fib(n-1) + fib(n-2)

If either the then or the else block includes an expression with a semicolon, you have to surround that block with curly brackets to resolve the ambiguity about where the block ends.

Alternatively you can use the fact that parentheses in Arena can contain any expression. This allows you to surround the block that includes a semicolon with parentheses instead of curly brackets.

Round brackets in general, and curly brackets specifically for if expressions, can also always be added as a precaution.

fn print_n_times(str: String, n: i32) =
    if n <= 1 then
        print(str)
    else {
        print(str);
        print_n_times(str, n-1)
    }

// Or:

fn print_n_times(str: String, n: i32) =
    if n <= 1 then
        print(str)
    else (
        print(str);
        print_n_times(str, n-1)
    )

// Or:

fn print_n_times(str: String, n: i32) =
    if n <= 1 then {
        print(str)
    } else {
        print(str);
        print_n_times(str, n-1)
    }

match expression

The match keyword also starts an expression and is mainly used to deconstruct user defined types. Each match expression takes an object and tries to match it against a set of patterns. The body of the first pattern that succeeds will be executed populating the variables defined in the pattern with the corresponding values. This example shows how a spaced print could be implemented for a linked list of words.

type List {
    Nil,
    Cons(String, List),
}

fn printList(l: List) =
    match l {
        List.Nil => print("\n"),
        List.Cons(w, List.Nil) => print(w); print("\n"),
        List.Cons(w, tail) => print(w); print(" "); printList(tail),
    }
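
Because the first pattern that succeeds wins, the order of the patterns matters in the example above. The following Python sketch mirrors printList with plain tuples to make this visible. It is only a model: NIL, cons and render are hypothetical names, and the function returns the text instead of printing it.

```python
NIL = ("Nil",)


def cons(word, tail):
    """Model of List.Cons(word, tail) as a tuple."""
    return ("Cons", word, tail)


def render(lst):
    """Try the three patterns of printList from top to bottom."""
    if lst == NIL:                   # List.Nil
        return "\n"
    _, word, tail = lst
    if tail == NIL:                  # List.Cons(w, List.Nil)
        return word + "\n"
    return word + " " + render(tail)  # List.Cons(w, tail)


text = render(cons("hello", cons("world", NIL)))
```

The result is "hello world" followed by a newline. If the general pattern List.Cons(w, tail) were listed before List.Cons(w, List.Nil), it would also match the last element, and a space would be printed after the final word.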

Imports

Every module can have arbitrarily many import statements at the top of the file.

There are three options to import another module. If the file is located in the same directory as the current one and the file name is a valid identifier in Arena (it contains only letters, numbers and underscores and does not start with a number), you can import the module as follows:

import other_module

// or if you want to use the module with the name 'lib':

import other_module as lib

If you need to specify a relative path or the file name is not a valid identifier, you need to put the import path in quotes and give the imported module a name to be used in the current module:

import "subfolder/other_module" as lib

All functions and types defined in the other module can now be used in the current module by prefixing the function or type name with the identifier specified in the import, separated by two colons. For example, calling a function foo defined in other_module, which was given the identifier lib in its import, works like this:

import other_module as lib

fn main() = lib::foo()

If the imported file is not found relative to the current path, the compiler searches for it in two other locations: the user wide library folder ~/.arena/lib/ and the system wide library collection in the lib subfolder of the installation directory of the Arena compiler.
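
The search order can be summarized with a short Python sketch. It is an illustration of the description above, not the compiler's code: candidate_paths and resolve are hypothetical names, and it is an assumption that the .arena extension is appended to the import path and that the user wide folder is tried before the system wide one.

```python
import os


def candidate_paths(import_path, current_dir, install_dir, home_dir):
    """List the files that might be tried, in the order described above."""
    file_name = import_path + ".arena"  # assumption: the extension is appended
    return [
        os.path.join(current_dir, file_name),                # relative to the current path
        os.path.join(home_dir, ".arena", "lib", file_name),  # user wide library folder
        os.path.join(install_dir, "lib", file_name),         # system wide library collection
    ]


def resolve(import_path, current_dir, install_dir, home_dir, exists=os.path.isfile):
    """Return the first candidate that exists, or None if there is none."""
    for path in candidate_paths(import_path, current_dir, install_dir, home_dir):
        if exists(path):
            return path
    return None
```

For example, resolve("subfolder/other_module", ".", install_dir, os.path.expanduser("~")) would first look for ./subfolder/other_module.arena and only fall back to the two library folders if that file does not exist, where install_dir stands for wherever the compiler is installed.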