Romasm Assembler

How assembly code is converted to executable instructions

Overview

The Romasm Assembler (romasm-assembler.js) translates human-readable Romasm assembly code into executable instruction objects.

Two Output Paths:

  • VM Instructions: For execution in the browser VM
  • x86 Assembly: For compilation to real hardware (via x86 Generator and RomanOS)

Assembly Process

Step 1: Parse Source Code

The assembler reads the source code line by line, stripping comments and surrounding whitespace.

; Input:
LOAD R0, 10
ADD R0, R1

; After parsing:
["LOAD R0, 10", "ADD R0, R1"]
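
The cleanup pass can be sketched in a few lines of JavaScript. This is a minimal illustration, not the code in romasm-assembler.js: the function name cleanSource is hypothetical, and it assumes that ';' starts a comment and that empty lines are dropped.

```javascript
// Hypothetical helper; the name and behavior are assumptions for illustration.
function cleanSource(source) {
    return source
        .split(/\r?\n/)
        .map(line => {
            // Assumption: everything from the first ';' onward is a comment.
            const commentStart = line.indexOf(';');
            const code = commentStart === -1 ? line : line.slice(0, commentStart);
            return code.trim();
        })
        .filter(line => line.length > 0); // drop lines that are now empty
}

const cleanedLines = cleanSource('LOAD R0, 10   ; set counter\n\nADD R0, R1\n');
// cleanedLines → ["LOAD R0, 10", "ADD R0, R1"]
```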

Step 2: Extract Labels

Labels are identified and their addresses are recorded. In the example below, a label's address is the index of the instruction that follows it.

; Input:
loop:
    LOAD R0, 10
    INC R0
    CMP R0, R1
    JLE loop

; Labels map:
{ "loop": 0 }
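
One way to build such a map is a single pass that counts emitted instructions. The sketch below is an assumption about how this could work, not the assembler's actual code: extractLabels is a hypothetical name, and it assumes each label sits on its own line and ends with ':'.

```javascript
// Hypothetical helper; assumes labels occupy their own line, e.g. "loop:".
function extractLabels(lines) {
    const labels = {};
    const instructions = [];
    for (const line of lines) {
        const match = /^([A-Za-z_][A-Za-z0-9_]*):$/.exec(line);
        if (match) {
            // The label points at the next instruction to be emitted.
            labels[match[1]] = instructions.length;
        } else {
            instructions.push(line);
        }
    }
    return { labels, instructions };
}

const extracted = extractLabels(['loop:', 'LOAD R0, 10', 'INC R0', 'CMP R0, R1', 'JLE loop']);
// extracted.labels → { loop: 0 }
// extracted.instructions has 4 entries; the label line itself is not an instruction
```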

Step 3: Parse Instructions

Each line is parsed into an instruction object with opcode and operands.

; Input: "LOAD R0, 10"
; Output:
{
    opcode: 'L',
    operands: [
        { type: 'register', value: 'I' },
        { type: 'immediate', value: 10 }
    ]
}
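
Before the opcode and operands can be built, the line has to be split into a mnemonic and raw operand tokens. A minimal sketch of that split follows; the function name splitInstruction and the rule that operands are comma-separated are assumptions drawn from the examples on this page.

```javascript
// Hypothetical tokenizer; assumes "MNEMONIC op1, op2" with comma-separated operands.
function splitInstruction(line) {
    const trimmed = line.trim();
    const firstSpace = trimmed.search(/\s/);
    if (firstSpace === -1) {
        // No whitespace: the whole line is the mnemonic, with no operands.
        return { mnemonic: trimmed, operandTokens: [] };
    }
    const mnemonic = trimmed.slice(0, firstSpace);
    const operandTokens = trimmed
        .slice(firstSpace)
        .split(',')
        .map(token => token.trim());
    return { mnemonic, operandTokens };
}

const parts = splitInstruction('LOAD R0, 10');
// parts → { mnemonic: 'LOAD', operandTokens: ['R0', '10'] }
```

The mnemonic would then go through the opcode table and each token through operand parsing to produce the object shown above.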

Step 4: Resolve Labels

Label references are replaced with their actual addresses.

; Input: "JMP loop"
; Output:
{
    opcode: 'V',
    operands: [
        { type: 'label', value: 0 }  // Address of 'loop'
    ]
}
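
Resolution can be sketched as a pass over every operand. The code below is illustrative only: resolveLabels is a hypothetical name, and it assumes unresolved label operands carry a labelName field (as in the Labels example under Operand Parsing). Names missing from the map are collected rather than rejected, so the caller can decide whether they are errors or are left for the linker.

```javascript
// Hypothetical resolver; assumes label operands look like
// { type: 'label', value: 0, labelName: 'loop' } before resolution.
function resolveLabels(instructions, labels) {
    const unresolved = [];
    const resolved = instructions.map(instr => ({
        ...instr,
        operands: instr.operands.map(op => {
            if (op.type !== 'label' || typeof op.labelName !== 'string') return op;
            if (!Object.prototype.hasOwnProperty.call(labels, op.labelName)) {
                unresolved.push(op.labelName); // leave the operand untouched
                return op;
            }
            return { type: 'label', value: labels[op.labelName] };
        })
    }));
    return { resolved, unresolved };
}

const jump = { opcode: 'V', operands: [{ type: 'label', value: 0, labelName: 'loop' }] };
const outcome = resolveLabels([jump], { loop: 0 });
// outcome.resolved[0].operands[0] → { type: 'label', value: 0 }
// outcome.unresolved → []
```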

Opcode Mapping

Each instruction mnemonic maps to a short opcode string, usually one or two characters; a few, such as DRW, use three:

Instruction   Opcode
ADD           A
SUB           S
MUL           M
DIV           DI
LOAD          L
CALL          CA
MOVE          MOV
DRAW          DRW
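
A lookup table is a natural way to express this mapping. The sketch below includes only a few entries from the table above; the names OPCODES and lookupOpcode are hypothetical, and the error message format is an assumption.

```javascript
// A few entries from the table above; the real assembler's table is larger.
const OPCODES = new Map([
    ['ADD', 'A'],
    ['SUB', 'S'],
    ['MUL', 'M'],
    ['DIV', 'DI'],
    ['LOAD', 'L'],
    ['CALL', 'CA'],
    ['DRAW', 'DRW']
]);

// Hypothetical helper: returns the opcode or throws for an unknown mnemonic.
function lookupOpcode(mnemonic) {
    const opcode = OPCODES.get(mnemonic);
    if (opcode === undefined) {
        throw new Error(`Unknown instruction: ${mnemonic}`);
    }
    return opcode;
}

const loadOpcode = lookupOpcode('LOAD');
// loadOpcode → 'L'
```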

Operand Parsing

The assembler recognizes four types of operands:

Registers

R0 → { type: 'register', value: 'I' }
R1 → { type: 'register', value: 'II' }
R8 → { type: 'register', value: 'IX' }
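
The examples suggest that register Rn is stored as the Roman numeral for n + 1. A sketch of that conversion follows; toRoman and parseRegister are hypothetical names, the numeral table only covers values 1 through 39, and the number of registers the real assembler accepts is not stated here.

```javascript
// Converts a small positive integer (1..39) to a Roman numeral.
function toRoman(n) {
    const table = [[10, 'X'], [9, 'IX'], [5, 'V'], [4, 'IV'], [1, 'I']];
    let numeral = '';
    for (const [value, symbol] of table) {
        while (n >= value) {
            numeral += symbol;
            n -= value;
        }
    }
    return numeral;
}

// Hypothetical helper: returns a register operand, or null if the token is not "R<digits>".
function parseRegister(token) {
    const match = /^R(\d+)$/.exec(token);
    if (!match) return null;
    // Assumption inferred from the examples: R0 is 'I', so Rn is the numeral for n + 1.
    return { type: 'register', value: toRoman(Number(match[1]) + 1) };
}

const r8 = parseRegister('R8');
// r8 → { type: 'register', value: 'IX' }
```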

Immediate Values

42 → { type: 'immediate', value: 42 }
-10 → { type: 'immediate', value: -10 }

Memory Addresses

[100] → { type: 'immediate', value: 100, isMemory: true }

Labels

loop → { type: 'label', value: <address> }  // Resolved to the label's address
sin → { type: 'label', value: 0, labelName: 'sin' }  // Unresolved, for linker
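
The operand kinds above can be told apart with a few pattern checks. The following sketch is an assumption about one workable ordering, not the assembler's real parser: parseOperand is a hypothetical name, register numerals are listed by hand for R0 through R8 (the range the examples show), and every label starts out unresolved with value 0.

```javascript
// Numerals for R0..R8, listed by hand; the real register count is an assumption.
const REGISTER_NUMERALS = ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX'];

// Hypothetical classifier for a single operand token.
function parseOperand(token) {
    const text = token.trim();
    const reg = /^R(\d+)$/.exec(text);
    if (reg) {
        const numeral = REGISTER_NUMERALS[Number(reg[1])];
        if (numeral === undefined) throw new Error(`Invalid operand: ${text}`);
        return { type: 'register', value: numeral };
    }
    if (/^-?\d+$/.test(text)) {
        return { type: 'immediate', value: Number(text) };
    }
    const mem = /^\[(\d+)\]$/.exec(text);
    if (mem) {
        // Memory addresses reuse the immediate type with an isMemory flag.
        return { type: 'immediate', value: Number(mem[1]), isMemory: true };
    }
    if (/^[A-Za-z_][A-Za-z0-9_]*$/.test(text)) {
        // Unresolved label: the address is filled in later.
        return { type: 'label', value: 0, labelName: text };
    }
    throw new Error(`Invalid operand: ${text}`);
}

const memoryOperand = parseOperand('[100]');
// memoryOperand → { type: 'immediate', value: 100, isMemory: true }
```

Checking registers before labels matters in this sketch, since a token such as R0 would otherwise also match the label pattern.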

Error Handling

The assembler reports errors for:

  • Unknown instructions
  • Invalid operands
  • Missing operands
  • Undefined labels (unless they will be resolved by the linker)

Output Format

The assembler returns an object with:

{
    success: true/false,
    instructions: [...],  // Array of instruction objects
    errors: [...]         // Array of error messages (if any)
}
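
A caller would typically branch on success before using the instructions. The helper below is a hypothetical sketch that works only from the object shape shown above; it does not call the real assembler, whose entry point is not described here.

```javascript
// Hypothetical consumer of the documented result shape.
function summarizeResult(result) {
    if (!result.success) {
        // Report every error message, one per line.
        return result.errors.map(message => `error: ${message}`).join('\n');
    }
    return `assembled ${result.instructions.length} instruction(s)`;
}

const okSummary = summarizeResult({
    success: true,
    instructions: [{ opcode: 'L', operands: [] }],
    errors: []
});
// okSummary → "assembled 1 instruction(s)"
```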

From Assembly to Execution

Path 1: Browser VM (Default)

Romasm Source
  ↓
Assembler (romasm-assembler.js)
  ↓
VM Instructions
  ↓
Virtual Machine (romasm-vm.js)
  ↓
Execution in Browser

Path 2: Real Hardware (RomanOS)

Romasm Source
  ↓
Assembler (romasm-assembler.js)
  ↓
VM Instructions
  ↓
x86 Generator (romasm-x86-generator.js)
  ↓
x86 Assembly
  ↓
NASM → Machine Code
  ↓
Bootable Image
  ↓
Real Hardware / QEMU

Related Documentation