barebones bytecode interpreter in C
Peter McGoron 6469970fd0 0.2.1 2023-03-02 17:32:11 +00:00

README.md

Version 0.2.1

Creole is a bytecode designed for microcontrollers. Its C source file is fewer than 1000 lines long and does not depend on the C standard library.

Bytecode Format

The syntax of a Creole instruction is

[1 byte opcode][2 or more byte instruction]*[1 byte all zero]
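A minimal sketch of how a reader of this format might find the end of one instruction. It relies on the encoding described in this section: the opcode byte has its high bit clear, every argument byte has its high bit set, and a single zero byte terminates the instruction. The function name `instruction_length` is hypothetical and is not taken from creole.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of creole.h): returns the length in bytes
 * of the instruction starting at buf[0], counting the opcode and the
 * terminating zero byte. Returns 0 if the data is truncated or malformed. */
static size_t instruction_length(const uint8_t *buf, size_t n)
{
    size_t i;

    /* The opcode is a single byte with the high bit clear. */
    if (n < 2 || (buf[0] & 0x80u))
        return 0;

    for (i = 1; i < n; i++) {
        if (buf[i] == 0)
            return i + 1;      /* found the all-zero terminator */
        if (!(buf[i] & 0x80u))
            return 0;          /* argument bytes always have the high bit set */
    }
    return 0;                  /* ran out of input before the terminator */
}
```

For example, the four bytes `01 C2 85 00` would be reported as one instruction of length 4 under these assumptions.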

Each Creole instruction consists of pseudo-UTF-8 characters. The first byte (the opcode) is an unsigned number between 0 and 127 (the high bit is clear). Each succeeding pseudo-UTF-8 character is encoded as follows:

  • 110HHHHx 10xxxxxx
  • 1110HHHH 10xxxxxx 10xxxxxx
  • 11110HHH 10Hxxxxx 10xxxxxx 10xxxxxx
  • 111110HH 10HHxxxx 10xxxxxx 10xxxxxx 10xxxxxx
  • 1111110H 10HHHxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
  • 11111110 10HHHHxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
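The layouts above can be decoded with one generic routine, since every form carries four H bits followed by the x bits. The following is a rough sketch, not the code in creole.c: the names `pseudo_char` and `decode_pseudo_char` are hypothetical, and only the freestanding headers `<stdint.h>` and `<stddef.h>` are used. It does not reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; not part of creole.h. */
struct pseudo_char {
    uint8_t  high;   /* the four H bits */
    uint32_t value;  /* the x bits, up to 32 of them */
    size_t   len;    /* bytes consumed, or 0 on error */
};

static struct pseudo_char decode_pseudo_char(const uint8_t *buf, size_t n)
{
    struct pseudo_char out = {0, 0, 0};
    size_t total = 0, i;
    unsigned lead_bits, data_bits;
    uint64_t bits;

    if (n < 2)
        return out;

    /* The number of leading one bits in the first byte is the total length. */
    while (total < 8 && (buf[0] & (0x80u >> total)))
        total++;
    if (total < 2 || total > 7 || n < total)
        return out;

    /* Every following byte must look like 10xxxxxx. */
    for (i = 1; i < total; i++)
        if ((buf[i] & 0xC0u) != 0x80u)
            return out;

    /* Gather all payload bits: the tail of the lead byte, then 6 per byte. */
    lead_bits = 7u - (unsigned)total;
    bits = buf[0] & ((1u << lead_bits) - 1u);
    for (i = 1; i < total; i++)
        bits = (bits << 6) | (buf[i] & 0x3Fu);

    /* The top four payload bits are H; the rest are the value. */
    data_bits = lead_bits + 6u * (unsigned)(total - 1) - 4u;
    out.high  = (uint8_t)(bits >> data_bits);
    out.value = (uint32_t)(bits & ((UINT64_C(1) << data_bits) - 1u));
    out.len   = total;
    return out;
}
```

The payload is accumulated in a 64-bit integer because the seven-byte form carries 36 payload bits (4 H bits plus 32 x bits).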

The four bits marked H determine the type. The least significant H bit determines whether the encoded value is a register (0001) or an immediate (00X0). The second H bit from the LSB determines whether the value is treated as a signed 32-bit two's complement number (001X) or as an unsigned 32-bit number (000X). All other values for the H bits are reserved.
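The type bits described above could be tested with a few small helpers, sketched here in C. The macro and function names are hypothetical and do not come from creole.h. The text lists only 0001 for registers, so whether 0011 is meaningful is not specified; the sketch simply reports each bit independently and treats the upper two bits as reserved.

```c
#include <stdint.h>

/* Hypothetical masks for the four type bits; names are not from creole.h. */
#define TYPE_REGISTER 0x1u  /* LSB set: register; clear: immediate */
#define TYPE_SIGNED   0x2u  /* second bit set: signed; clear: unsigned */
#define TYPE_RESERVED 0xCu  /* the upper two type bits must be zero */

static int type_is_valid(uint8_t high)
{
    return high <= 0xFu && (high & TYPE_RESERVED) == 0;
}

static int type_is_register(uint8_t high)
{
    return (high & TYPE_REGISTER) != 0;
}

static int type_is_signed(uint8_t high)
{
    return (high & TYPE_SIGNED) != 0;
}

/* Reinterpret a decoded 32-bit value as two's complement without relying
 * on implementation-defined unsigned-to-signed conversion. */
static int32_t as_signed(uint32_t value)
{
    if (value <= (uint32_t)INT32_MAX)
        return (int32_t)value;
    return -(int32_t)(UINT32_MAX - value) - 1;
}
```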

The remaining bits (marked x) encode a number that is up to 32 bits long. Overlong encodings are accepted and sometimes used.
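To make the idea of an overlong encoding concrete, here is a sketch of an encoder that writes a value using a caller-chosen number of bytes, so the same value can be emitted in its shortest form or in a longer one. The function name `encode_pseudo_char` and its signature are assumptions made for illustration; the real assembler lives in the asm directory and is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of creole.h): encode `value` with type
 * bits `high` using exactly `total` bytes (2 to 7). Returns `total`, or 0
 * if the arguments are out of range or the value does not fit. */
static size_t encode_pseudo_char(uint8_t *buf, uint8_t high,
                                 uint32_t value, size_t total)
{
    unsigned lead_bits, data_bits;
    uint64_t bits;
    size_t i;

    if (total < 2 || total > 7 || high > 0xFu)
        return 0;

    lead_bits = 7u - (unsigned)total;                       /* payload bits in byte 0 */
    data_bits = lead_bits + 6u * (unsigned)(total - 1) - 4u; /* x bits available */
    if (data_bits < 32 && (value >> data_bits) != 0)
        return 0;                                           /* value too large */

    /* Payload is the four type bits followed by the value. */
    bits = ((uint64_t)high << data_bits) | value;

    /* Fill continuation bytes from the end, six bits at a time. */
    for (i = total - 1; i >= 1; i--) {
        buf[i] = (uint8_t)(0x80u | (bits & 0x3Fu));
        bits >>= 6;
    }

    /* Lead byte: `total` one bits, a zero bit, then the remaining payload. */
    buf[0] = (uint8_t)((0xFFu << (8 - total)) | bits);
    return total;
}
```

Under these assumptions, register 5 encodes as `C2 85` in two bytes and as the overlong `E1 80 85` in three bytes; both carry the same type bits and value.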

Assembler

The macro assembler is written in Python (see the asm directory). It supports virtual instructions and jumps with named labels.

Design Philosophy

Creole is small but not minimal. It is easy to add instructions, and the number of labels and registers and the amount of stack space can be changed at runtime. System calls can be added dynamically, but static instructions that take at most 3 arguments should be hard-coded.