vurce assembly description
This page describes the assembly language which is converted by vrcasm into bytecode for the vurce virtual machine.
overview
A program is a sequence of the following types of expressions:
- Keywords: `set`, `mul`, etc.
- Integer literal (decimal, hex, or binary): `123`, `0xCAFE`, `0b10101010`
- Character literal: `'a'`
- String literal: `"Hello world!"`
- Global label definition: `:ourLabel`
- Local label definition: `.myLabel`
- Label reference: `myLabel`
- Directive: `/w`, `/incbin`, etc.
All whitespace is insignificant, except when needed to separate certain types of tokens. Comments are enclosed with the characters `{` and `}`.
Keywords generally correspond to bytecode instructions, and are described in detail later.
Integer literals are translated directly into the bytecode instruction that pushes the given literal value to the stack (i.e. the `push` opcode followed by the value). Character literals are the same, except their value is the ASCII code of the given character.
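As a rough illustration of how the three integer notations and a plain character literal map to a value, here is a minimal Python sketch. The function name `literal_value` is hypothetical, it does not reflect how vrcasm is implemented, and it ignores escape sequences inside character literals.

```python
def literal_value(token: str) -> int:
    """Return the value of a decimal, hex, binary, or character literal.

    Illustrative only: escape sequences in character literals are not handled.
    """
    if len(token) == 3 and token[0] == "'" and token[2] == "'":
        return ord(token[1])           # character literal: ASCII code
    if token.startswith("0x"):
        return int(token[2:], 16)      # hexadecimal literal
    if token.startswith("0b"):
        return int(token[2:], 2)       # binary literal
    return int(token, 10)              # decimal literal


print(literal_value("0xCAFE"))
print(literal_value("'a'"))
```

The assembler would then emit the `push` opcode followed by this value; the exact encoding is defined by the machine description, not by this sketch.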
String literals represent sequences of characters, and are only used as arguments to directives.
Label definitions mark specific locations within the program. Labels can be defined with either global scope or local scope. Global labels are accessible throughout the file, while local labels are only accessible within the local scope they were defined in. Local scopes are delimited by global label definitions.
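The scoping rule can be pictured with a small Python sketch that groups local label definitions under the most recent global label. The function name `label_scopes` and the pre-tokenized input are assumptions made for illustration; how vrcasm treats a local label that appears before any global label is not stated here, so the sketch simply files those under `None`.

```python
def label_scopes(tokens):
    """Map each global label to the local labels defined in its scope."""
    scopes = {None: []}    # None: locals seen before any global label (assumption)
    current = None
    for tok in tokens:
        if tok.startswith(":"):        # global label definition opens a new scope
            current = tok[1:]
            scopes[current] = []
        elif tok.startswith("."):      # local label definition joins the current scope
            scopes[current].append(tok[1:])
    return scopes


tokens = [":start", "string", "printString", "call", "ret", ".string",
          ":printString", ".string", ".i", ".char"]
print(label_scopes(tokens))
```

Because each global label starts a fresh scope, the same local name (here `string`) can be defined once under `start` and again under `printString` without conflict.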
Label names can contain only alphanumeric characters plus underscores (`0-9A-Za-z_`), and cannot start with a digit.
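The naming rule is equivalent to a short regular expression. The Python check below is only a sketch of that rule; `is_valid_label_name` is a hypothetical helper, not part of vrcasm.

```python
import re

# First character: letter or underscore; remaining characters: 0-9A-Za-z_
LABEL_NAME = re.compile(r"[A-Za-z_][0-9A-Za-z_]*")


def is_valid_label_name(name: str) -> bool:
    """Return True if name satisfies the label naming rule."""
    return LABEL_NAME.fullmatch(name) is not None


print(is_valid_label_name("myLabel"))
print(is_valid_label_name("1abc"))
```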
Label references behave the same as integer literals, except their value is the address of the given label.
Directives are special commands to the assembler, which may take a fixed number of arguments after them. A list of available directives is provided later.
Below is an example program which prints the string "Hello world!" followed by a newline. (This is not the most efficient way to print a string, but it demonstrates more of the language's features.)

    :start
        string printString call
        ret
    .string
        /s "Hello world!\n"
        /b 0

    { string -- }
    :printString
        string set
        0 i set
        loop
            string get i get add get char setb
            char getb 0 neq while
            char getb 0x00 outb
            i get 1 add i set
        end
        ret
    .string /w 0
    .i /w 0
    .char /b 0

keywords
Keywords are reserved words which generally insert bytecode instructions into the program. Most of these map directly to single instructions; these are described on the machine description page. In addition to these, vurce assembly also supports various keywords which map to structured control flow mechanisms. Below is a list of these:
Keyword | Synopsis | Description |
---|---|---|
`if` | `cond --` | Executes the following code block if `cond` is a non-zero value. This is equivalent to `not [end] jc` (although, if the previous instruction is `not`, the assembler may optimize away the duplicate `not`s). |
`loop` | `--` | Repeatedly executes the following code block indefinitely. |
`end` | `--` | Ends a code block. When ending a `loop` block, this is equivalent to `[loop] jmp`. |
`while` | `cond --` | Exits the current loop if `cond` is zero. This is equivalent to `not [end] jc` (although, if the previous instruction is `not`, the assembler may optimize away the duplicate `not`s). |
`break` | `--` | Exits the current loop. This is equivalent to `[end] jmp`. |
`cont` | `--` | Immediately skips to the next iteration of the current loop. This is equivalent to `[loop] jmp`. |
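To make the equivalences in the table concrete, the following Python sketch rewrites the structured keywords into the jump sequences listed above. It is not how vrcasm works internally: the function name `lower`, the generated label names (`loop0`, `end0`, ...), and the use of local label definitions as jump targets are all assumptions for illustration, and the `not` de-duplication mentioned in the table is not modelled.

```python
def lower(tokens):
    """Rewrite if/loop/end/while/break/cont into explicit jumps (sketch)."""
    out, stack, counter = [], [], 0
    for tok in tokens:
        if tok in ("if", "loop"):
            stack.append((tok, counter))
            if tok == "if":
                out += ["not", f"end{counter}", "jc"]    # skip the block when cond is zero
            else:
                out.append(f".loop{counter}")            # hypothetical label at loop start
            counter += 1
        elif tok == "end":
            kind, n = stack.pop()
            if kind == "loop":
                out += [f"loop{n}", "jmp"]               # [loop] jmp
            out.append(f".end{n}")                       # hypothetical label after block
        elif tok in ("while", "break", "cont"):
            # These refer to the nearest enclosing loop, skipping any if blocks.
            n = next(n for kind, n in reversed(stack) if kind == "loop")
            if tok == "while":
                out += ["not", f"end{n}", "jc"]          # not [end] jc
            elif tok == "break":
                out += [f"end{n}", "jmp"]                # [end] jmp
            else:
                out += [f"loop{n}", "jmp"]               # [loop] jmp
        else:
            out.append(tok)
    return out


print(lower(["loop", "x", "while", "y", "end"]))
```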
directives
The assembler supports the following directives:
Directive | Description |
---|---|
`/b [integer]` | Inserts a byte into the program. `[integer]` must be an integer or character literal. |
`/w [integer]` | Inserts a 16-bit word into the program. `[integer]` must be an integer or character literal. |
`/s [string]` | Inserts a string of ASCII characters into the program. Note that this only inserts the raw characters of the string into the program; the assembler does not automatically null-terminate or length-prefix the string. |
`/times [n] [directive]` | Applies the given directive a given number of times. For example, `/times 128 /b 0` reserves a buffer of 128 bytes initialized to zero. |
`/origin [addr]` | Sets the origin of memory addressing to the given value. The default origin value is 0. |
`/at [addr]` | Pads the output file with zeroes until the given address (relative to the origin) is reached. |
`/incbin [filepath]` | Inserts the raw binary contents of the given file into the program. |
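The output of the data directives can be modelled in a few lines of Python. This is a sketch under stated assumptions: the helper names `emit_b`, `emit_s`, and `emit_times` are hypothetical, out-of-range byte values simply raise an error here (what vrcasm does in that case is not documented on this page), and `/w` is left out because its byte order is not stated.

```python
def emit_b(value: int) -> bytes:
    """/b: a single byte (raises ValueError outside 0..255 in this sketch)."""
    return bytes([value])


def emit_s(text: str) -> bytes:
    """/s: the raw ASCII characters, with no terminator and no length prefix."""
    return text.encode("ascii")


def emit_times(n: int, chunk: bytes) -> bytes:
    """/times: the output of one directive repeated n times."""
    return chunk * n


buffer = emit_times(128, emit_b(0))                 # /times 128 /b 0
message = emit_s("Hello world!\n") + emit_b(0)      # /s "Hello world!\n" /b 0
print(len(buffer), message)
```

The second line mirrors the example program, where the terminating zero byte has to be added explicitly with `/b 0` because `/s` does not add one.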
escape sequences
String and character literals can contain the following escape sequences:
Sequence | Description |
---|---|
`\\` | Backslash (`\`) |
`\"` | Double quote (`"`) |
`\'` | Single quote (`'`) |
`\n` | Newline |
`\xNN` | Character with the given two-digit hexadecimal code |
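A decoder for exactly the sequences in this table might look like the Python sketch below. The function name `decode_escapes` is hypothetical, and treating any other backslash sequence as an error is an assumption; the page does not say how vrcasm handles unknown escapes.

```python
import string

SIMPLE_ESCAPES = {"\\": "\\", '"': '"', "'": "'", "n": "\n"}


def decode_escapes(body: str) -> str:
    """Decode the escape sequences in the body of a string or character literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)                 # ordinary character
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        kind = body[i + 1]
        if kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        elif kind == "x":
            digits = body[i + 2:i + 4]     # exactly two hex digits are required
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError("\\x needs two hexadecimal digits")
            out.append(chr(int(digits, 16)))
            i += 4
        else:
            raise ValueError("unknown escape: " + kind)   # assumption, see lead-in
    return "".join(out)


print(repr(decode_escapes(r"Hello world!\n")))
```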
preprocessor
Before a file is assembled, a preprocessor is run on the contents of the file. This preprocessor can be used for purposes such as including other source code files, or defining simple macros. Preprocessor directives start with `%` and must be on a separate line. A preprocessor directive may be followed by whitespace-separated arguments. Arguments containing spaces must be enclosed with quotation marks (`"`).
Below is a list of supported preprocessor directives:
Directive | Description |
---|---|
`%include [filepath]` | Imports the contents of the given source code file. |
`%define [macro] [expansion]` | Defines a macro which expands to the given string. Macro names follow the same rules as label names. Note that all arguments from the second onwards are automatically coalesced into a single space-separated string, so you do not need to enclose the expansion with quotation marks. |
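The argument coalescing described for `%define` can be sketched in Python as follows. The names `split_args` and `parse_define` and the macro name `RESET_I` are made up for the example. The sketch assumes that the enclosing quotation marks are dropped from a quoted argument and does not handle escape sequences inside quotes; neither detail is specified on this page.

```python
import re

# An argument is either a double-quoted run (may contain spaces) or a bare word.
ARG = re.compile(r'"([^"]*)"|(\S+)')


def split_args(line: str):
    """Split a preprocessor line into whitespace-separated arguments."""
    return [quoted if bare == "" else bare for quoted, bare in ARG.findall(line)]


def parse_define(line: str):
    """Return (macro name, expansion) for a %define line."""
    args = split_args(line)
    if len(args) < 2 or args[0] != "%define":
        raise ValueError("not a %define directive")
    # Everything after the macro name is joined into one space-separated string.
    return args[1], " ".join(args[2:])


print(parse_define('%define RESET_I 0 i set'))
print(parse_define('%define RESET_I "0 i set"'))
```

Both calls produce the same expansion, which is why the quotation marks are optional for `%define`.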