creole/README.md

36 lines
1.3 KiB
Markdown

Creole is a bytecode designed for simple implementations.
## Bytecode Format
Each creole line consists of pseudo-UTF-8 characters. The first byte
is an unsigned number between 0 and 127 (the high bit is clear). Each
suceeding pseudo-UTF-8 character is encoded as follows:
* `110HHHHx 10xxxxxx`
* `1110HHHH 10xxxxxx 10xxxxxx`
* `11110HHH 10Hxxxxx 10xxxxxx 10xxxxxx`
* `111110HH 10HHxxxx 10xxxxxx 10xxxxxx 10xxxxxx`
* `1111110H 10HHHxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx`
* `11111110 10HHHHxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx`
The first four bytes determine the type. The LSB high bit determines
if the encoded value is a register (`0001`) or immediate (`0010`).
The second bit from LSB determines if the value should be treated
as a signed 32 bit two's compliment number (`001X`) or should be
treated as an unsigned 32 bit number (`000X`).
All other values are reserved. Overlong values are allowed, and for some
argument values they are necessary. All lines are terminated by a byte
of all zeros.
## Assembler
The macro assembler is Python (see the asm directory).
## Design Philsophy
Creole is small but not minimal. It is easy to add instructions, and
the amount of labels, registers, and stack space can changed at runtime.
System calls can be added dynamically, but static instructions that take
at most 3 arguments should be hard-coded.