Arrp Language Guide

This document describes the structure and meaning of an Arrp program.

See also the Formal syntax specification. This document provides links into the formal specification whenever a syntactical element is introduced.

Table of contents

Modules

A module is the smallest unit of compilation, residing in a file or provided to the compiler through standard input.

A module consists of different kinds of declarations (all optional):

See Grammar: module.

Here is a simple complete module that takes a stream of integers as input, multiplies each integer by 2 and outputs a stream of resulting values:

module example;

input x : [~]int;
output y = x * 2;

Default module name

The module name declaration is optional. If it is missing, the module gets the default name "m".

Imported modules

A module import declaration causes the imported module to be compiled, and the global names in that module become available in the current module.

When importing the module using import lib, a value named x becomes available as lib.x (see Grammar: qualified-id).

We can also import a module under and alias using import lib as alias by which a value x in the module becomes available as alias.x.

The imported module must reside in a file named "module-name.arrp". The file must either be adjacent to the location of the importing module, or in one of the import directories given as an option to the compiler.

Input and output

An input declaration declares an input, assigning it a name and a type. For example, the following declares an input named x as a stream of 64 bit real values:

input x : [~]real64

The output is declared using an output declaration which can simply mark a name as an output or also specify its value or type. For example, any of the following lines declares the value named y to be the output:

output y;
output y = x * 2;
output y : [~]int;

If the value of the output is not provided in the output declaration, it must be defined separately, for example:

output y;
y = x * 2;

The way that input and output data is exchanged with the environment depends on the target form of compilation. See also the page on the C++ target.

Values and functions

Value declarations define values and functions and bind them to names:

pi = 3.14159265359;

area_of_circle(r) = pi * r * r;

area_of_right_triangle(a, b) = a * b / 2;

a = area_of_right_triangle(10, 5);

See Grammar: value-decl, func-decl

The scope of globally declared names is the entire module. They are mutually recursive.

Local declarations

Names with scope local to an expression can be declared using the "let" and "where" expressions:

f(x) = let y = x * x in sin(y) + cos(y);

g(x) = sin(y) + cos(y) where y = x * x;

See Grammar: let-expr, Grammar: where-expr

Function application and polymorphism

A function is applied when followed by arguments in parenthesis (see Grammar: func-application).

Partial application is supported: a function applied to less arguments than it has parameters results in another function with the remaining parameters.

All functions are polymorphic: they can be applied to arguments of any type consistent with the expression of the function.

Lambda functions

An unnamed function (a "lambda abstraction") can be defined as part of another expression:

f(g) = g(1,2) + g(3,4);

a = f((x,y) -> x*y);

See Grammar: func-lambda.

Type annotations

Occasionally, you may want to provide an explicit type declaration for a name:

y : real64;
y = 1 + f(x);

See Grammar: type-decl.

The type of a name can not be changed using a type declaration - the declared type must match the type of the name's expression.

Recursion

Recursion is allowed for the purpose of recursive arrays. However, recursive use of functions is not supported. Some common uses of functional recursion can be achieved with recursive arrays. See Recursive arrays.

External functions

An external function is declared with a name and a type, and it allows the name to be used as any other function, except that it is not polymorphic:

external transform : [100]real64 -> [50]real64;

x[i] = sin(i/100), if i < 100;
y = transform(x);

See Grammar: external-decl.

Precisely how an external function is executed depends on the target form of compilation. See also the page on the C++ target.

Arrays and Streams

Arrays are a special kind of functions that map integers to values of some primitive type. Each formal parameter of such a function is also called a dimension, and an actual parameter (argument) is also called an index. A multi-parameter function is therefore a multi-dimensional array.

An array dimension ranges from 0 up to (and excluding) some number called array size. However, the size of one dimension can be infinite. Such an array is called a stream.

In this document, we represent a sequence of indices or array sizes for multiple dimensions with a tuple: <x,y,...>.

Similarly to general functions, arrays and streams can be defined using equations (named) or as part of an expression (unnamed).

Array equations

See Grammar: elem-value-decl.

An array equation defines values of array elements at particular indices. In the following example, an array named a is defined using two equations:

a[0] = 1;
a[n] = 1/n, if i < 10;

On the left-hand side of an equation, each index can be given a specific value (e.g. 0) or a name (e.g. i). Named indices can be used in the value on the right-hand side of the equation and their range can be constrained using equalities or inequalities following the keyword if. Each index has an implicit lower bound of 0.

The right hand side of an equation can have multiple parts with different index constraints:

a[n] = {
    1, if n % 3 == 0;
    0, otherwise
}

Multi-dimensional arrays are defined by declaring multiple indices on the left-hand side of the equation:

a[x,y] = x+y, if x < 10 and y < 20;

The size of an array is the largest rectangle (or hyperrectangle) containing all the indices defined by the equations. It is an error if any indices in this rectangular area are undefined. For example, the following equation defines a set of indices that do not form a rectangle:

a[x,y] = x+y, if x < 10 and y < x;
    ... Error: Array is not rectangular.

An index constraint (following if) must be a quasi-affine expression in index variables and constant integers. This means it can only contain:

Array expressions (lambda arrays)

See Grammar: array-lambda.

An unnamed array can also be defined using an expression which can be part of a larger expression. In the following example, an array (in bold) is passed as an argument to a function f:

r = f([i:5] -> i*2)

Each index variable between the brackets on the left side of -> can optionally be followed by its upper bound. This removes the need to specify the bound using index constraints.

Streams

A stream is defined using array equations that leave the upper bound of an index undefined (the lower bound of 0 is still implied):

s[i] = i * 2;

A stream is also defined using an array expression without an index bounds, or using ~ (infinity) as the index bound:

[i]   -> i * 2
... or
[i:~] -> i * 2

Nested arrays

Arrays can be defined as elements of other arrays, resulting in an array with combined dimensions:

a = [t] -> [i:5] -> sin(t*i);

The array "a" is a 2-dimensional array with size <~, 5>.

Array application

An array is applied when followed by indices in square brackets (see Grammar: array-application):

a = [x:5,y:10] -> x+y;
b = a[3,4];

An array can be partially applied, just like an ordinary function, which results in an array with the remaining dimensions:

a = [x:5,y:10] -> x+y;
b = a[3];

Recursive arrays

A recursive array is defined by using the array name in its own definition:

a[0] = 0.0;
a[n] = a[n-1] * 0.8;

Multiple arrays can be mutually recursive:

a[0] = 0.0;
a[n] = b[n-1] * 0.8;
b[n] = a[n] + 1;

Array enumeration

An array can be defined by listing values of each of its elements (see Grammar: array-enum):

x = 100;
a = (4, x*3, (-1, -2), x+2);

Array concatenation

Two arrays can be concatenated (see Grammar: array-concat). The first array must be finite in its first dimension while the second can be a stream:

a = ([i:5] -> i) ++ ([j] -> j*2);

Primitive values

Arrp supports the following primitive types:

Literal values can be created for the following types: boolean, int, real64, complex64 (see Grammar: literal).

An expression can be converted to other primitive types (see Type Conversion).

Built-in operations and functions

Built-in operations and functions operate on primitive types. They can also be applied to arrays, pointwise.

The following tables list relations between argument types and result types, and additional conditions on argument types. In addition to type names, the following forms have special meanings:

If the types of arguments do not satisfy the conditions, argument promotion is attempted. Only the least number of promotions to satisfy the conditions are done. If there are multiple satisfiable sequences of promotions with equal length, the expression is ambiguous and the program is ill-defined.

The following type promotions are attempted:

Unary and binary operators

expression result conditions comments
!bool bool
bool || bool bool
bool && bool bool
bool && bool bool
a == a bool
a != a bool
a < a bool a : simple-numeric
a <= a bool a : simple-numeric
a > a bool a : simple-numeric
a >= a bool a : simple-numeric
a + a bool a : numeric
a - a bool a : numeric
a * a bool a : numeric
a / a bool a : real or complex Floating point or complex division.
a // a bool a : simple-numeric Integer division.
int % int int Remainder.
a ^ a a a : simple-numeric Exponentiation.

Math functions

expression result conditions comments
exp(a) a a : real or complex
exp2(a) a a : int or real
log(a) a a : real or complex
log2(a) a a : real
log10(a) a a : real or complex
sqrt(a) a a : real or complex
sin(a) a a : real or complex
cos(a) a a : real or complex
tan(a) a a : real or complex
asin(a) a a : real or complex
acos(a) a a : real or complex
atan(a) a a : real or complex
ceil(a) a a : int or real
floor(a) a a : int or real
abs(a) a a : int or real
min(a,a) a a : int or real
max(a,a) a a : int or real

Complex number functions

expression result conditions comments
real(complex32) real32 Real part of complex number.
real(complex64) real64 Real part of complex number.
imag(complex32) real32 Imaginary part of complex number.
imag(complex64) real64 Imaginary part of complex number.

Type conversion functions

expression result conditions comments
int(a) int a : simple-numeric
real32(a) real32 a : simple-numeric
real64(a) real64 a : simple-numeric
complex32(a) complex32 a : numeric
complex64(a) complex64 a : numeric

Pointwise operations and broadcasting

Pointwise application of primitive operations

When a primitive operation is applied to arrays, it is performed pointwise on array elements, given that the array sizes are compatible:

a = ([i:5] -> i) + ([i:5] -> 10*i);

Broadcasting

Two array sizes are compatible if they are equal in all common dimensions. The result of the operation will have the size of the array with more dimensions.

If an argument has less dimensions than the result, the result at some index will use the value of the argument at an index equal in the common dimensions.

In addition, a primitive value is compatible with an array of any size, and will be used for all the result values.

For example, the array "a" can be multiplied with a constant value (b), with an array of equal size (c), or an array with more dimensions (d):

a[t] = sin(t*0.2);
b = a * 0.1;
c = a * [t] -> sin(t*0.01);
d = a * [t] -> [i:10] -> sin(t*0.1/i);

Broadcasting in indexing

An array or primitive value can be indexed with more indices than it has dimensions, resulting in the array value at an index with equal common dimensions.

For example, the function f can be applied to the primitive value a, in which case k[t] is a constant expression. It can also be applied to the stream b so that k[t] is variable:

f(k) = [t] -> sin(t * k[t]);
a = f(0.1);
b = f([t] -> sin(t * 0.01));

Broadcasting in array definition

Broadcasting also applies to a set of array equations. If the values on the right hand sides are arrays of different sizes, the result reuses values of arrays with smaller number of dimensions just like in a primitive operation:

a[0] = 1;
a[t] = [i:10] -> t + i;