Juno Frontend
Syntax
Throughout the description of syntax, we use the convention that any text contained within a pair of angle brackets with no space (such as <this>) denotes a placeholder. In particular, we use the following placeholders:
- name: an identifier, which must begin with a letter; the remaining characters may be letters, digits, or underscores
- qualified name: a sequence of identifiers separated by ::
- top levels: a sequence of top-level constructs, namely those defined below as Modules, Constants, Types, and Functions
- type: a type name
- expr: an expression
Additionally, text contained within a pair of square brackets with no space (such as [this]) denotes an optional part of the syntax.
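The rules for the name and qualified name placeholders can be sketched as a small checker. This is only an illustration written in Rust, not part of the Juno frontend: the function names are hypothetical, and it assumes that "letter" means an ASCII letter, which the text above does not state.

```rust
/// Returns true if `s` fits the rule given for a name: a leading letter
/// followed by letters, digits, or underscores.
/// Assumption: "letter" is taken to mean an ASCII letter.
fn is_valid_name(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) if first.is_ascii_alphabetic() => {
            chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
        }
        _ => false,
    }
}

/// Returns true if `s` is a sequence of valid names separated by `::`.
/// The trailing `::*` form allowed in imports is deliberately not handled.
fn is_valid_qualified_name(s: &str) -> bool {
    s.split("::").all(is_valid_name)
}
```

Under these assumptions, x1_y and a::b::c are accepted, while _x, 1abc, and a:: are rejected.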
Modules
Modules can be imported using
use <qualified-name>;
In this context, the qualified name may also end with ::* to indicate that all names in some package should be imported into the current scope.
To define a new module within an existing file, use
[pub] mod <name> { <top-levels> }
where the optional pub indicates whether the module is exported when this file is used by another program.
Constants
Constant values can be declared as
[pub] const <name> [: <type>] = <expr>;
Constant values must be completely evaluable at compile time, so their expressions may not contain function calls. All other expression forms are supported, with the exception of those involving arrays.
Types
Type declarations have the form
[pub] type <name> [< <type-variables> >] = <type-def>
where <type-variables> is a comma-separated list of names and kinds expressed as <name> : <kind>, and the kinds are type, usize, number, and integer.
The kind type indicates that the associated variable is some type; these are the standard type variables present in many languages.
The kind usize indicates that the variable is a dynamic constant and can therefore be used as an array dimension as well as in expressions.
The kinds number and integer indicate that the associated variable is a type that must be a numeric type or an integer type, respectively. This lets us express computations such as matrix multiplication, which can be applied to any numeric type, while still ensuring that the parametric function is well-typed for every type it can be applied to (in contrast to C++, where type errors can arise when a function is instantiated).
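As an analogy only, Rust's trait bounds give a similar guarantee to the one described for the number kind: the generic body is checked once against the stated bounds rather than once per instantiation. The sketch below is Rust, not Juno, and the function dot is a hypothetical example. It also assumes that T::default() is the additive zero, which holds for Rust's built-in numeric types but is not guaranteed for arbitrary types.

```rust
use std::ops::{Add, Mul};

/// Dot product over any element type that supports `+` and `*`.
/// The `where` clause plays the role that a `number` kind would play:
/// the body is type-checked against these bounds, not per instantiation.
/// Assumption: `T::default()` is the additive identity (zero).
fn dot<T>(a: &[T], b: &[T]) -> T
where
    T: Copy + Default + Add<Output = T> + Mul<Output = T>,
{
    a.iter()
        .zip(b.iter())
        .fold(T::default(), |acc, (&x, &y)| acc + x * y)
}
```

The same function can be called on integer and floating-point slices, and a call with a non-numeric element type is rejected at the call site rather than inside the body.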
In addition, the kind can be omitted after a variable to instead create a list of variables followed by a kind, such as x, y, z : usize, in which case all preceding variables after the last kind are given the specified kind (here usize).
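The grouping rule for omitted kinds can be sketched as a small expansion step. This is a Rust illustration, not the frontend's parser: the function name is hypothetical, the input is a plain string rather than a token stream, and treating trailing names that never receive a kind as an error is an assumption, since the text does not say what happens in that case.

```rust
/// Expands a type-variable list such as "x, y, z : usize, T : type" into
/// (name, kind) pairs. Names written without a kind are held back until the
/// next kind appears, and then all of them receive that kind.
/// Assumption: trailing names that never receive a kind are rejected (None).
fn expand_type_variables(list: &str) -> Option<Vec<(String, String)>> {
    let mut result = Vec::new();
    let mut pending: Vec<String> = Vec::new();
    for item in list.split(',') {
        match item.split_once(':') {
            Some((name, kind)) => {
                pending.push(name.trim().to_string());
                let kind = kind.trim();
                // Every name collected since the last kind gets this kind.
                for n in pending.drain(..) {
                    result.push((n, kind.to_string()));
                }
            }
            None => pending.push(item.trim().to_string()),
        }
    }
    if pending.is_empty() {
        Some(result)
    } else {
        None
    }
}
```

For the list x, y, z : usize, T : type this yields x, y, and z with kind usize and T with kind type.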
Type definitions have three forms. First, they can simply be another type, which defines a type alias. They can also be a struct or union definition, which have the forms
[pub] struct { <fields> }
[pub] union { <fields> }
where, in this context, pub indicates for a struct whether all fields are public, and for a union whether the constructors and their arguments are public. The <fields> in these definitions are lists of fields of the following form
[pub] <name> [: <type>] ;
If a field does not have a specified type, it is implicitly taken to be unit. In a struct, pub on an individual field controls exactly which fields are exposed publicly; on a union field it is a semantic error.
Finally, types have the following forms
i8
i16
i32
i64
u8
u16
u32
u64
usize
f32
f64
bool
void
( <types> )
<qualified-name> [::< <type-expressions> >]
<type>[ <type-expressions> ]
Note that in this type system the void type and the unit type are considered to be the same type, and can be denoted as () as well.
The parenthesized form, in which the types are comma separated, is used to define product types (assuming the number of types inside is not one).
Lastly, type expressions are used in contexts such as type arguments and array sizes. They can be either types or dynamic constant expressions (though only the latter are semantically valid in array dimensions) and have the following forms
i8
i16
i32
i64
u8
u16
u32
u64
usize
f32
f64
bool
void
( <type-expressions> )
<qualified-name> [::< <type-expressions> >]
<type-expression>[ <type-expressions> ]
<integer-literal>
- <type-expression>
<type-expression> + <type-expression>
<type-expression> - <type-expression>
<type-expression> * <type-expression>
<type-expression> / <type-expression>
Functions
Function declarations have the form
[pub] fn <name> [< <type-variables> >] ( <arguments> ) [ -> <type> ] { <body> }
where the body is a statement and the arguments are a comma separated list of "argument binders" which have the form
[inout] <pattern> [: <type>]
where patterns are defined in the statements section.
Note that an argument marked as inout must be given a variable as its parameter when the function is called. After the call, that variable holds the value the argument had within the function when it returned.
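The observable behaviour of an inout argument can be modelled as copy-in, copy-out. The following is a Rust sketch of that model only; it makes no claim about how Juno implements inout (by copying or by reference), and both function names are hypothetical.

```rust
/// Models a callee with one `inout` argument: it receives the current value
/// and hands back the value the argument holds when the function returns.
fn add_one(mut x: i32) -> i32 {
    x += 1;
    x
}

/// Models the caller side: copy the variable's value in, run the callee,
/// then write the argument's final value back into the same variable.
fn call_with_inout(var: &mut i32, callee: fn(i32) -> i32) {
    let copied_in = *var;
    *var = callee(copied_in);
}
```

Starting from a variable holding 41, calling call_with_inout with add_one leaves the variable holding 42, which is the update the paragraph above describes.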
Statements
let <pattern> [: <type>] [= <expr>];
const <pattern> [: <type>] [= <expr>];
<lexpr> <assign-op> <expr>;
if <non-struct-expr> { <body> } [else <if-stmt>]
match <non-struct-expr> { <cases> }
for <pattern> [: <type>] = <non-struct-expr> to <non-struct-expr> [by <signed-int-literal>] { <body> }
while <non-struct-expr> { <body> }
return [<expr>];
break;
continue;
{ <body> }
<qualified-name>[::< <type-expressions> >]( <parameters> );
The assignment operators <assign-op> are =, +=, -=, *=, /=, %=, &=, |=, ^=, &&=, ||=, <<=, and >>=, and have the standard meanings (for instance as in C).
An <if-stmt> means either another if-then-else construct or a body enclosed in curly brackets.
Note that, as in Rust, the condition of an if-then-else, match, for, or while is a non-struct expression, meaning that it cannot contain a struct declaration unless that declaration is enclosed in parentheses.
A signed integer literal is an integer literal that may optionally be prefixed by a single + or - sign.
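The shape of a signed integer literal can be sketched as a small recognizer. This is a Rust illustration with a hypothetical function name, and it assumes that an integer literal is a non-empty run of decimal digits; the text does not say whether other bases or digit separators are accepted.

```rust
/// Checks the shape described for a signed integer literal: at most one
/// leading '+' or '-' followed by an integer literal.
/// Assumption: an integer literal is a non-empty run of decimal digits.
fn is_signed_int_literal(s: &str) -> bool {
    let digits = s
        .strip_prefix('+')
        .or_else(|| s.strip_prefix('-'))
        .unwrap_or(s);
    !digits.is_empty() && digits.chars().all(|c| c.is_ascii_digit())
}
```

Under this assumption 7, +2, and -10 are accepted, while +-1 and a bare + are rejected.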
Left-hand side expressions (<lexpr>) have the following forms
<name>
<lexpr> . <name>
<lexpr> <dot-number>
<lexpr> [ <expressions> ]
where <dot-number> denotes a dot (.) followed immediately (with no space) by an integer, for instance .0; these are used to index into product types.
The cases of a match statement have the following form
<patterns> => <body>
where <patterns> is a bar (|) separated list of patterns. A bar is also permitted at the start, so a collection of patterns can have either the form a | b | c or | a | b | c.
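Rust's match arms accept the same optional leading bar, so the two spellings can be shown side by side in Rust as an analogy. The function below is a hypothetical example and is not Juno code; Juno's match is also listed as incomplete at the end of this document.

```rust
/// Both of the first two arms group several patterns with `|`. The first arm
/// also uses the optional leading bar, mirroring the `| a | b | c` form.
fn classify(n: u32) -> &'static str {
    match n {
        | 0 | 1 | 2 => "small",
        3 | 4 => "medium",
        _ => "large",
    }
}
```

The leading bar changes nothing about which values an arm matches; it only allows every alternative to be written in the same way, one per line if desired.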
Finally, a pattern has the following forms
_
<int-literal>
<qualified-name>
( <patterns-comma> )
<qualified-name> { <named-patterns> }
<qualified-name> ( <patterns-comma> )
where _ is the wildcard pattern and <patterns-comma> denotes a comma-separated list of patterns.
Named patterns are also comma separated but have the form <name> : <pattern> and are used for pattern matching on the fields of a struct.
Note that the final form is used to pattern match on unions.
Expressions
Finally, expressions have the following forms
<int-literal>
<float-literal>
true
false
<unary-op> <expression>
<expression> <binary-op> <expression>
<expression> as <type>
<lexpr>
( <expressions> )
<qualified-name> [::< <type-expressions> >] { <id-expressions> }
if <expression> then <expression> else <expression>
<qualified-name> [::< <type-expressions> >] ( <parameters> )
The unary operators are negation (-), bitwise not (~), and boolean not (!).
The binary operators are +, -, *, /, %, &, |, ^, &&, ||, <<, >>, <, <=, >, >=, ==, and !=, which have their standard interpretations, precedences, and associativities.
An <id-expression> has the form <name> = <expression>, and lists of them are comma separated.
Parameters are comma separated and have the form [&] <expression>, where the ampersand (&) is necessary for inout arguments.
Unimplemented Features
The following features are incomplete
- Implementation of match and non-variable patterns in all other bindings
- Type inference
- Dynamic constant expressions (i.e. expressions of dynamic constants which are themselves dynamic constants)
- Module definitions, imports, and qualified names
- Partial indexing into an array
- Struct values where some arguments are left as their default values
- Supporting arbitrary l-expressions for inout arguments; currently only variables are supported