
vurce assembly description

This page describes the assembly language which is converted by vrcasm into bytecode for the vurce virtual machine.

overview

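A program is a sequence of expressions of the following types: keywords, integer literals, character literals, string literals, label definitions, label references, and directives.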

All whitespace is insignificant, except when needed to separate certain types of tokens. Comments are enclosed with the characters ( and ).

Keywords generally correspond to bytecode instructions, and are described in detail later.

Integer literals are translated directly into the bytecode instruction that pushes the given value onto the stack (i.e. the push opcode followed by the value). Character literals behave the same way, except that their value is the ASCII code of the given character.

String literals represent sequences of characters, and are only used as arguments to directives.

Label definitions mark specific locations within the program. Labels can be defined with either global scope or local scope. Global labels are accessible throughout the file, while local labels are only accessible within the local scope they were defined in. Local scopes are delimited by global label definitions.
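
The scoping rule above can be modelled in a few lines of Python. This is only an illustrative sketch, not how vrcasm is implemented: the function names, the (kind, name, address) tuples, the addresses used with it, and the choice to look up local labels before global ones are all assumptions.

```python
def build_label_tables(definitions):
    """Group label definitions by scope.

    `definitions` is a list of (kind, name, address) tuples in source order,
    where kind is "global" or "local" (a hypothetical representation).
    """
    global_labels = {}
    local_labels = {None: {}}  # None = code before the first global label
    scope = None
    for kind, name, address in definitions:
        if kind == "global":
            global_labels[name] = address
            scope = name  # a global label definition opens a new local scope
            local_labels.setdefault(scope, {})
        else:
            local_labels[scope][name] = address
    return global_labels, local_labels


def resolve(name, scope, global_labels, local_labels):
    """Look up `name` as seen from inside the local scope `scope`."""
    # Assumption: a local label takes precedence over a global of the same name.
    if name in local_labels.get(scope, {}):
        return local_labels[scope][name]
    return global_labels[name]  # raises KeyError if the label is not visible
```

With two scopes that each define a local label called string (as the example program on this page does), resolve returns a different address depending on the scope it is called from, and a local label of one scope is not visible from the other.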

Label names can contain only alphanumeric characters plus underscores (0-9A-Za-z_), and cannot start with a digit.
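
As a quick way to check a name against this rule, the character set can be written as a regular expression. The Python sketch below is an illustration only; the helper name is hypothetical and is not part of vrcasm.

```python
import re

# First character: a letter or underscore; the rest: 0-9, A-Z, a-z or underscore.
LABEL_NAME = re.compile(r"[A-Za-z_][0-9A-Za-z_]*")


def is_valid_label_name(name: str) -> bool:
    """Return True if `name` satisfies the label naming rule."""
    return LABEL_NAME.fullmatch(name) is not None
```

Under this check, printString and _tmp1 are accepted, while 9lives (leading digit) and my-label (hyphen) are rejected.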

Label references behave the same as integer literals, except their value is the address of the given label.

Directives are special commands to the assembler; each takes a fixed number of arguments, which follow it. A list of available directives is provided later.

Below is an example program which prints the string "Hello world!" followed by a newline. (This is not the most efficient way to print a string, but it demonstrates more of the language's features.)

:start
    string printString call
    ret

    .string /s "Hello world!\n" /b 0

{ string -- }
:printString
    string set
    0 i set

    loop
        string get i get add get char setb
        char getb 0 neq while
        char getb 0x00 outb
        i get 1 add i set
    end

    ret

    .string /w 0
    .i /w 0
    .char /b 0

keywords

Keywords are reserved words which generally insert bytecode instructions into the program. Most of these map directly to single instructions; these are described on the machine description page. In addition to these, vurce assembly also supports various keywords which map to structured control flow mechanisms. Below is a list of these:

Keyword  Synopsis  Description
if       cond --   Executes the following code block if cond is true (0xffff).
                   This is equivalent to not [end] jc (although, if the previous
                   instruction is not, the assembler may optimize away the
                   duplicate nots).
loop     --        Repeatedly execute the following code block indefinitely.
end      --        Ends a code block.
                   When ending a loop block, this is equivalent to [loop] jmp.
while    cond --   Exit the current loop if cond is false (0x0000).
                   This is equivalent to not [end] jc (although, if the previous
                   instruction is not, the assembler may optimize away the
                   duplicate nots).
break    --        Exit the current loop.
                   This is equivalent to [end] jmp.
cont     --        Immediately skip to the next iteration of the current loop.
                   This is equivalent to [loop] jmp.
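
To make the equivalences in the table concrete, here is a small Python sketch that rewrites a token list containing the structured keywords into the jump-based forms listed there. It is an illustration under assumptions, not vrcasm's actual algorithm: the numbered names (loop0, end0), the "name:" marker for a position in the output, and the handling of nested blocks are all hypothetical.

```python
def lower(tokens):
    """Rewrite if/loop/while/break/cont/end into not/jc/jmp sequences."""
    out = []
    blocks = []  # stack of (kind, id) for the currently open blocks
    next_id = 0

    def innermost_loop():
        for kind, block_id in reversed(blocks):
            if kind == "loop":
                return block_id
        raise ValueError("keyword used outside of a loop")

    def jump_if_false(target):
        # "not [target] jc"; a directly preceding "not" cancels out.
        if out and out[-1] == "not":
            out.pop()
            out.extend([target, "jc"])
        else:
            out.extend(["not", target, "jc"])

    for tok in tokens:
        if tok == "if":
            blocks.append(("if", next_id))
            jump_if_false(f"[end{next_id}]")
            next_id += 1
        elif tok == "loop":
            blocks.append(("loop", next_id))
            out.append(f"loop{next_id}:")  # marks the start of the loop
            next_id += 1
        elif tok == "while":
            jump_if_false(f"[end{innermost_loop()}]")
        elif tok == "break":
            out.extend([f"[end{innermost_loop()}]", "jmp"])
        elif tok == "cont":
            out.extend([f"[loop{innermost_loop()}]", "jmp"])
        elif tok == "end":
            if not blocks:
                raise ValueError("end without an open block")
            kind, block_id = blocks.pop()
            if kind == "loop":
                out.extend([f"[loop{block_id}]", "jmp"])
            out.append(f"end{block_id}:")  # marks the end of the block
        else:
            out.append(tok)
    if blocks:
        raise ValueError("unterminated block")
    return out
```

For example, lower(["loop", "x", "while", "y", "end"]) yields a marker for the loop start, then x not [end0] jc, then y, then [loop0] jmp, and finally a marker for the end of the loop.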

directives

The assembler supports the following directives:

Directive                Description
/b [integer]             Insert a byte into the program.
                         [integer] must be an integer, character literal, or
                         label reference.
/w [integer]             Insert a 16-bit word into the program.
                         [integer] must be an integer, character literal, or
                         label reference.
/s [string]              Insert a string of ASCII characters into the program.
                         Note that this only inserts the raw characters of the
                         string into the program; the assembler does not
                         automatically null-terminate or length-prefix the
                         string.
/times [n] [directive]   Applies the given directive a given number of times.
                         For example, /times 128 /b 0 reserves a buffer of 128
                         bytes initialized to zero.
/origin [addr]           Sets the origin of memory addressing to the given
                         value. The default origin value is 0.
/at [addr]               Pads the output file with zeroes until the given
                         address (relative to the origin) is reached.
/incbin [filepath]       Inserts the raw binary contents of the given file into
                         the program.
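
The data directives can be pictured as functions that return the bytes they contribute to the output. The Python sketch below covers /b, /w, /s and /times only, and is not taken from vrcasm. The function names are hypothetical, and the byte order of /w is an assumption: this page does not state it, so it is left as a parameter.

```python
def emit(directive, arg, byteorder="little"):
    """Return the bytes a single data directive would insert.

    `byteorder` is an assumption; the byte order of /w is not specified here.
    """
    if directive == "/b":
        return bytes([arg])  # raises ValueError outside 0..255
    if directive == "/w":
        return arg.to_bytes(2, byteorder)  # raises OverflowError outside 0..65535
    if directive == "/s":
        return arg.encode("ascii")  # raw characters only, no terminator
    raise ValueError(f"unsupported directive: {directive}")


def times(n, directive, arg):
    """Model /times: apply the same directive n times."""
    return emit(directive, arg) * n
```

In this model, times(128, "/b", 0) produces 128 zero bytes, and a null-terminated string has to be built explicitly as emit("/s", ...) followed by emit("/b", 0), mirroring the /s ... /b 0 line in the example program.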

escape sequences

String and character literals can contain the following escape sequences:

Sequence  Description
\\        Backslash (\)
\"        Double quote (")
\'        Single quote (')
\n        Newline
\xNN      Character with the given two-digit hexadecimal code
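
The table above translates fairly directly into a decoder. The following Python sketch shows one way the body of a literal could be converted to its characters. The function name is hypothetical, and rejecting unknown or malformed sequences is an assumption, since this page does not say how vrcasm treats them.

```python
import string

SIMPLE_ESCAPES = {"\\": "\\", '"': '"', "'": "'", "n": "\n"}


def decode_escapes(text):
    """Expand the escape sequences from the table in `text`."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(text):
            raise ValueError("dangling backslash")
        code = text[i + 1]
        if code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
            i += 2
        elif code == "x":
            digits = text[i + 2:i + 4]
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError("\\x must be followed by two hex digits")
            out.append(chr(int(digits, 16)))
            i += 4
        else:
            # Assumption: sequences not listed in the table are errors.
            raise ValueError(f"unknown escape sequence: \\{code}")
    return "".join(out)
```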

preprocessor

Before a file is assembled, a preprocessor is run on the contents of the file. This preprocessor can be used for purposes such as including other source code files, or defining simple macros. Preprocessor directives start with % and must be on a separate line. A preprocessor directive may be followed by whitespace-separated arguments. Arguments containing spaces must be enclosed with quotation marks (").

Below is a list of supported preprocessor directives:

Directive                     Description
%include [filepath]           Imports the contents of the given source code
                              file.
%define [macro] [expansion]   Defines a macro which expands to the given string.
                              Macro names follow the same rules as label names.
                              Note that all arguments from the second onwards
                              are automatically coalesced into a single
                              space-separated string, so you do not need to
                              enclose the expansion with quotation marks.
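
The argument rules described in this section (whitespace-separated arguments, quotation marks around arguments that contain spaces, and coalescing for %define) can be sketched in Python as follows. This is a simplified model with hypothetical function names, not vrcasm's parser. It does not handle escape sequences inside quoted arguments, and it says nothing about how macros are expanded afterwards.

```python
import re

# Either a quoted argument (quotes stripped) or a run of non-space characters.
_ARGUMENT = re.compile(r'"([^"]*)"|(\S+)')


def parse_directive(line):
    """Split a preprocessor line into (name, arguments), or return None."""
    stripped = line.strip()
    if not stripped.startswith("%"):
        return None  # not a preprocessor directive
    parts = [
        m.group(1) if m.group(1) is not None else m.group(2)
        for m in _ARGUMENT.finditer(stripped[1:])
    ]
    if not parts:
        raise ValueError("missing directive name")
    return parts[0], parts[1:]


def parse_define(line):
    """Return (macro, expansion) for a %define line."""
    parsed = parse_directive(line)
    if parsed is None or parsed[0] != "define" or not parsed[1]:
        raise ValueError("not a valid %define line")
    args = parsed[1]
    # Everything from the second argument onwards is joined with single spaces.
    return args[0], " ".join(args[1:])
```

For instance, parsing a line such as %define NEWLINE 10 0x00 outb (a made-up macro) gives the macro name NEWLINE and the expansion 10 0x00 outb, without any quotation marks being needed.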