Language Specification

 Keywords

Here are the keywords which are currently in use:

extend fn int join long minus
project real rename return select string
summary time type union var void

The time keyword currently has no meaning and is reserved for a future release.

Program Structure

A program is defined in a single source file. The file is evaluated from top to bottom in one pass (similar to the C language). The top-level elements of a program are relational type declarations, global relational variable declarations, and functions.

The convention for Bandicoot source file extension is .b.

Primitive Types

Primitive types are scalar types and are used for attributes within relations, as well as input parameters for functions. There are four types available:

Type    Size          Description
int     32-bit        signed integer
long    64-bit        signed integer
real    64-bit        IEEE 754 double precision
string  0-1024 bytes  UTF-8 encoded string

The primitive types are referenced within this specification as PrimitiveType.
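
To make the limits in the table above concrete, here is a minimal Python sketch that checks whether a value fits a given primitive type. The function name fits and the use of Python values to stand in for Bandicoot values are assumptions made for this illustration; they are not part of Bandicoot.

```python
# Value ranges taken from the primitive type table.
INT_MIN, INT_MAX = -2**31, 2**31 - 1
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1
STRING_MAX_BYTES = 1024


def fits(type_name, value):
    """Return True if a Python value fits the named primitive type (hypothetical helper)."""
    if type_name == "int":
        return isinstance(value, int) and INT_MIN <= value <= INT_MAX
    if type_name == "long":
        return isinstance(value, int) and LONG_MIN <= value <= LONG_MAX
    if type_name == "real":
        # Python floats are IEEE 754 double precision on common platforms.
        return isinstance(value, float)
    if type_name == "string":
        # The limit applies to the UTF-8 encoded size, not the character count.
        return isinstance(value, str) and len(value.encode("utf-8")) <= STRING_MAX_BYTES
    raise ValueError("unknown primitive type: " + type_name)
```

Note that a string of 513 two-byte characters already exceeds the limit, because the size is counted in bytes.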

Bandicoot is a strongly-typed language and converting a primitive expression of a given type into another type must be explicit. The current version of Bandicoot supports only conversion from one numeric type to another. There is no support for conversion between strings and numbers. The following syntax forms are supported:

    (int  PrimitiveExpr)
    (real PrimitiveExpr)
    (long PrimitiveExpr)

Identifiers

An identifier is defined by the regular expression [_a-zA-Z0-9]+. The maximum identifier length is 32 characters. Identifiers appear in this specification under the names TypeName, AttrName, VarName, FuncName and ArgName.
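
The identifier rule can be checked mechanically. The following Python sketch applies the regular expression and the length limit stated above; the helper name is_valid_identifier is hypothetical, and the sketch does not check whether a name collides with a keyword, since the text does not say how such collisions are handled.

```python
import re

# Pattern and length limit as stated in the specification text.
IDENTIFIER_RE = re.compile(r"[_a-zA-Z0-9]+")
MAX_IDENTIFIER_LENGTH = 32


def is_valid_identifier(name):
    """Return True if name matches the pattern in full and is at most 32 characters long."""
    return (len(name) <= MAX_IDENTIFIER_LENGTH
            and IDENTIFIER_RE.fullmatch(name) is not None)
```

As written, the pattern also accepts names that start with a digit.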

 Relational Types

There are two ways to declare a relational type: named and inline. Named declarations give an identifier to some particular type so that it can be referenced in the code later. Inline (or anonymous) declarations are useful when the type is used only once (e.g. as an input or output function parameter).

A named type is declared in the following way:

type TypeName
{
    AttrName PrimitiveType [,]
    [more attributes]
}

and inline type:

{
    AttrName PrimitiveType [,]
    [more attributes]
}

The relational types (both inline and named) are referenced within this specification as RelType.

 Relational Variables

Relational variables are used for keeping the program state. The system provides two types of variables: global and local (temporary).

Here is how you can declare a global variable named VarName.

var VarName RelType ;

The relational variables are referenced within this specification as RelVar.

Functions

Functions are identified by names which must be unique across the whole program source file. A function can make complex state transformations on top of the global variables (see Transactions section).

fn FuncName ( FuncArgs ) FuncReturn
{
    FuncBody
}

FuncArgs can contain only one relational argument and several arguments of a primitive type, all separated by commas. Each argument has the following structure:

ArgName RelType | PrimitiveType

The FuncReturn defines the result type of a function. It can either be a relational type or no result at all, identified by keyword void:

RelType | void

The function body (_FuncBody_) is a list of statements evaluated from top to bottom. The statements are separated by semicolons (";"). Statements can be of three types:

A function cannot call another function. Also, only one assignment per global relational variable is possible within a function body. After the assignment, the global variable can no longer be accessed within the same function. This is a temporary limitation, and you can work around it with the help of temporary variables.

 Relational Operators and Expressions

Bandicoot implements 8 relational operators which provide rich data manipulation features. Some of the operators are binary (take 2 relations as input) and some are unary (take 1 relation as input). Apart from the relational inputs, these operators usually take an additional argument specific to the operator. Every operator returns a new relation and does not modify the inputs. The language provides these operators as functions with arguments:

OperatorName (arg1) (arg2) ... (argN)

The brackets around the arguments are mandatory only if the argument is an operator with at least one argument.

Every relational variable (global or local) is an operator as well and returns the value of the variable. The operators are the main building blocks in the language. Complex relational expressions (_RelExpr_) can be created by nesting the relational operators to compute the desired results.

Rename

rename ToAttrName = FromAttrName [,] [more attributes] RelExpr 

This operator creates a new relation with the specified attributes renamed; the relational body (tuples) does not change.
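
The effect of rename can be sketched in Python. This is an illustration only, assuming a relation is modelled as a list of dicts (one dict per tuple); the function and the sample data are hypothetical and say nothing about how Bandicoot implements the operator.

```python
def rename(mapping, relation):
    """Return a new relation with attributes renamed.

    mapping is {to_attr_name: from_attr_name}, mirroring the
    "ToAttrName = FromAttrName" order of the operator syntax.
    """
    from_to = {src: dst for dst, src in mapping.items()}
    return [{from_to.get(attr, attr): value for attr, value in row.items()}
            for row in relation]


books = [{"title": "A", "price": 10}]
renamed = rename({"cost": "price"}, books)
```

The input list is left untouched, in line with the rule that operators never modify their inputs.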

 Project

project AttrName [,] [more attributes] RelExpr

The result contains only the attributes defined as the first argument. It can have a reduced number of tuples due to the removal of duplicates.
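
The duplicate removal is the part of project that is easy to overlook. A minimal Python sketch, assuming a relation is modelled as a list of dicts (the function and data are hypothetical, for illustration only):

```python
def project(attrs, relation):
    """Keep only the listed attributes and drop tuples that become duplicates."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:  # a relation holds no duplicate tuples
            result.append(reduced)
    return result


books = [{"author": "X", "title": "A"}, {"author": "X", "title": "B"}]
authors = project(["author"], books)
```

Two input tuples collapse into one here because they differ only in the attribute that was projected away.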

 Extend

extend AttrName = PrimitiveExpr [,] [more attributes] RelExpr

The operator adds the attributes defined as the first argument to each tuple of the input relation. The values are computed by primitive expressions.
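
As a rough Python sketch of extend, assuming a relation is a list of dicts and each primitive expression is passed as a function of the tuple (both are modelling assumptions for this example, not Bandicoot syntax):

```python
def extend(new_attrs, relation):
    """Add computed attributes to every tuple.

    new_attrs is {attr_name: function(row) -> value}.
    """
    return [{**row, **{name: expr(row) for name, expr in new_attrs.items()}}
            for row in relation]


items = [{"price": 10.0, "qty": 3}]
with_total = extend({"total": lambda r: r["price"] * r["qty"]}, items)
```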

Select

select BooleanExpr RelExpr 

The result contains only those tuples of the input relation which match the boolean expression defined as the first argument.
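
In Python terms, select behaves like a filter over the tuples. The sketch below assumes a relation is a list of dicts and the boolean expression is a predicate function; both are assumptions for illustration.

```python
def select(predicate, relation):
    """Return the tuples for which the boolean expression holds."""
    return [row for row in relation if predicate(row)]


books = [{"title": "A", "pages": 120}, {"title": "B", "pages": 900}]
long_books = select(lambda r: r["pages"] > 500, books)
```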

Union

union RelExpr RelExpr

or

RelExpr + RelExpr

The union operator creates a new relation consisting of the tuples of both input relations, with duplicate tuples removed. Both inputs need to have the same attributes.
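
A minimal Python sketch of union, assuming a relation is a list of dicts. The attribute check only inspects the tuples that are present, which is a simplification of this model (a real relation also has a head when it is empty).

```python
def union(left, right):
    """Combine two relations with the same attributes, dropping duplicates."""
    attr_sets = {frozenset(row) for row in left + right}
    if len(attr_sets) > 1:
        raise ValueError("both inputs need to have the same attributes")
    result = []
    for row in left + right:
        if row not in result:
            result.append(row)
    return result


a = [{"id": 1}, {"id": 2}]
b = [{"id": 2}, {"id": 3}]
combined = union(a, b)
```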

Minus (Semidifference)

minus RelExpr RelExpr

or

RelExpr - RelExpr

Removes tuples from the first input which match tuples in the second input. The matching logic is an equality on the common attributes.
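
The matching rule means the two inputs do not need identical attributes. A Python sketch under the assumption that a relation is a list of dicts (function and data are hypothetical):

```python
def minus(left, right):
    """Drop tuples of left that match some tuple of right on the common attributes."""
    def matches(l, r):
        common = set(l) & set(r)
        return all(l[a] == r[a] for a in common)

    return [l for l in left if not any(matches(l, r) for r in right)]


orders = [{"id": 1, "item": "pen"}, {"id": 2, "item": "ink"}]
cancelled = [{"id": 2}]
active = minus(orders, cancelled)
```

Here the second input carries only the id attribute, yet it still removes the full matching tuple from the first input.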

 Natural Join

join RelExpr RelExpr

or

RelExpr * RelExpr

This operator creates a result where the tuples are combinations of matching tuples from both input relations. The matching logic is an equality on the common attributes. If there are no common attributes the result is a cartesian join (i.e. every tuple from the first input matches every tuple in the second input). All the attributes from the input relations are present in the result.
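
The join rule, including the cartesian case, can be sketched in Python as a nested loop. The list-of-dicts model and the sample data are assumptions for illustration; a real implementation would likely use a more efficient strategy.

```python
def join(left, right):
    """Natural join: combine tuples that agree on all common attributes."""
    result = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            # With no common attributes the condition is trivially true,
            # so every pair is combined (cartesian join).
            if all(l[a] == r[a] for a in common):
                result.append({**l, **r})
    return result


books = [{"title": "A", "author_id": 1}, {"title": "B", "author_id": 2}]
authors = [{"author_id": 1, "name": "X"}]
colors = [{"color": "red"}, {"color": "blue"}]
```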

Summary

Unary version

summary AttrName = SumFunc [,] [more attributes] RelExpr

 Binary version

summary AttrName = SumFunc [,] [more attributes] RelExpr RelExpr

Both the unary and the binary version of the summary operator produce tuples containing summary data grouped according to the specified attributes. In the unary version, the grouping is done by a virtual relation with zero attributes, and therefore the result contains up to one tuple. The binary version creates the groups according to all the attributes of the second relation.

The result type is expressed as an extension of the empty relation (unary summary) or of the rightmost relation (binary summary). Each added attribute is computed by a specified summary function (_SumFunc_). Here is a list of currently supported functions:

Where DefVal is a constant expression. The type of the expression should match the type of the result and attribute. The exception is the avg function, where the default value and result are always real numbers. DefVal is used in those cases when the RelExpr body is empty. In the case of the binary summary, this can happen when there is no matching tuple in the left RelExpr for a tuple in the right RelExpr.
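
The grouping and default-value behaviour of the binary summary can be sketched in Python for the avg function mentioned above. This is an illustration under several assumptions: a relation is a list of dicts, every attribute of the second relation also exists in the first, and the helper name summary_avg and its parameter order are hypothetical rather than Bandicoot syntax.

```python
def summary_avg(new_attr, attr, default, left, right):
    """Binary summary with an average: one result tuple per tuple of right.

    Each tuple of right defines a group; the tuples of left that agree with
    it on all of its attributes are averaged over attr. If a group has no
    matching tuples, the default value is used.
    """
    result = []
    for group in right:
        values = [row[attr] for row in left
                  if all(row[a] == group[a] for a in group)]
        avg = sum(values) / len(values) if values else float(default)
        # The result extends the group tuple; avg always yields a real number.
        result.append({**group, new_attr: float(avg)})
    return result


sales = [{"shop": "a", "amount": 10}, {"shop": "a", "amount": 20}]
shops = [{"shop": "a"}, {"shop": "b"}]
means = summary_avg("mean", "amount", 0, sales, shops)
```

Shop "b" has no sales, so its tuple receives the default value instead of being dropped.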

 Transactions

Each invocation of a function implicitly creates a transaction. All the statements within a function are part of the same transaction. There are no explicit keywords to commit or rollback a transaction. If there is an error the rollback is performed automatically and an error code is returned to the client.

Modification of a global variable is not allowed by two transactions at the same time. Therefore two functions which modify the same variable are serialized and executed one after the other. Read-only functions are executed in parallel with other read/write functions.

The level of isolation is always serializable. This means that if the same variable is read several times within a function, every read returns the same data, even if the variable is modified by a different function at the same time.

API

The Bandicoot API is based on the HTTP/1.1 protocol. The interface exposes all the functions defined in a program source file through http://server:port/FuncName URLs. The HTTP POST method must be used to invoke a function with an input parameter. Otherwise, HTTP GET is required.
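
The choice between GET and POST can be sketched with the Python standard library. The helper name build_request, the host, the port and the function names below are placeholders chosen for this example; the sketch only builds the request and does not set any headers, since the text does not specify them.

```python
import urllib.request


def build_request(server, port, func_name, body=None):
    """Build a request for http://server:port/FuncName.

    POST is used when an input parameter (body) is supplied, GET otherwise.
    """
    url = "http://%s:%d/%s" % (server, port, func_name)
    data = body.encode("utf-8") if body is not None else None
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, method=method)


# Hypothetical function names on a hypothetical local server.
read_req = build_request("localhost", 12345, "ListBooks")
write_req = build_request("localhost", 12345, "AddBooks", "title string\nA\n")
```

Passing either request to urllib.request.urlopen would send it to a running server.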

Both input and output parameters are exchanged in "comma separated values" format. The tuples are delimited with the \n end-of-line character. The first line is a relational head definition in the following format:

    AttrName PrimitiveType [,][more attributes]

The comma or the end-of-line character can be escaped by using the \ character. Bandicoot will then not interpret those characters as delimiters, and they will be treated as part of your data.
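
To show how the head line and the escaping fit together, here is a small Python encoder sketch. It assumes a relation is a list of dicts, that every line (including the last) ends with \n, and that the \ character itself needs no escaping; none of these details are confirmed by the text, and the helper names are hypothetical.

```python
def escape(value):
    """Prefix commas and end-of-line characters with a backslash."""
    return str(value).replace(",", "\\,").replace("\n", "\\\n")


def encode_relation(head, rows):
    """Encode a relation in the comma separated format.

    head is a list of (attr_name, primitive_type) pairs and fixes the
    column order; rows is a list of dicts keyed by attribute name.
    """
    lines = [",".join("%s %s" % (name, ptype) for name, ptype in head)]
    for row in rows:
        lines.append(",".join(escape(row[name]) for name, _ in head))
    return "\n".join(lines) + "\n"


payload = encode_relation(
    [("title", "string"), ("pages", "int")],
    [{"title": "War, Peace", "pages": 1200}],
)
```

The comma inside the title is escaped, so the tuple still has exactly two fields.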