How the ARM3 JIT is implemented

Post by JonAbbott

When running code via *GOARM3JIT <address>, the following happens:
  1. The WimpSlot size is expanded to 27.5MB
  2. Memory from &1200000 up to the 27.5MB limit is filled with Hypercalls that trigger the JIT (a co-processor 8 instruction that generates an Undefined Instruction abort)
  3. Undefined Instruction, Data Abort, Prefetch Abort and Branch Through Zero handlers are added to the relevant vectors. The SWI vector is already claimed by ADFFS, which is notified to pass any SWIs for OS 8 to the Hypervisor
  4. Switch CPU to User mode
  5. Jump to <address> + &1200000
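The address arithmetic behind these setup steps can be sketched in a few lines of Python. This is only an illustration, not ADFFS code: the function names are made up, the 1200000 offset is treated as hexadecimal, and "27.5mb" is assumed to mean 27.5 x 1024 x 1024 bytes.

```python
# Illustrative model only; names are hypothetical, not taken from ADFFS.
JIT_OFFSET = 0x1200000                     # offset quoted in the post (assumed hex)
WIMPSLOT_LIMIT = int(27.5 * 1024 * 1024)   # 27.5MB, assuming binary megabytes


def to_jit(address):
    """Map an original code address to the address its JIT copy runs at."""
    return address + JIT_OFFSET


def to_original(jit_address):
    """Map a JIT address (e.g. an abort address) back to the original code."""
    return jit_address - JIT_OFFSET


def hypercall_words():
    """Number of 32-bit words between the offset and the WimpSlot limit."""
    return (WIMPSLOT_LIMIT - JIT_OFFSET) // 4
```

Under those assumptions, to_jit(0x8000) gives 0x1208000, the limit works out at 0x1B80000, and the region pre-filled with Hypercalls is 9.5MB (2490368 words).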
This immediately triggers an Undefined Instruction and at this point the JIT kicks in:
  1. The actual instruction is loaded from <Abort address - &1200000>
  2. The 1KB subpage (StrongARM) or 4KB page (non-StrongARM) at <Abort address - &1200000> is changed to read-only
  3. If the instruction doesn't reference the PC or alter the PSR, and isn't a BL, SWI or non-PC-relative LDR with immediate, it's copied to <Abort address>
  4. Instructions that reference the PC, alter the PSR or are a BL are changed to a branch to a codelet, which in summary does the following:
    • BL - R14 is changed to <R14 - &1200000> and the PSR bits are added; it then branches to the BL destination address
    • Instructions that set PC - the PSR bits are removed and, if relevant, set in the CPSR. PC is then set to <address + &1200000>
    • Instructions that use PC for relative addressing - these are recoded to use <PC - &1200000>
  5. SWI - If it's passing a code address (e.g. OS_Claim, Sound_Configure etc), references PC (e.g. OS_WriteS) or requires hypervising, &800000 is added to the SWI to trigger our SWI hypervisor. All other SWIs are copied
  6. LDR - Code is executed up to any LDR that could potentially read from page zero (LDR Rd, [Rn, ... where Rn<>PC / LDR Rd, [PC{]}, Rm {,...} ). Provided its condition is true, the instruction is then emulated once and the resulting address checked. If it's above &4000 the instruction is copied; any that read below &4000 remain as JIT entry instructions and are emulated
In effect, code that would normally run at &8000 is recoded to run at &1208000, but uses &8000 for all LDR, STR, LDM and STM instructions. What we end up with is all code at &1208000+ and all data at &8000+
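As a rough illustration of steps 5 and 6, here is a small Python model of the SWI recoding and the page zero check. It is a sketch under assumptions rather than the real implementation: the function names are invented, the set of hypervised SWI numbers is passed in by the caller, the X (error-returning) form of a SWI is assumed to be treated the same as the plain form, and an address of exactly &4000 is assumed to count as "above".

```python
# Illustrative model only; names are hypothetical, not taken from ADFFS.
SWI_HYPERVISOR_FLAG = 0x800000   # added to the SWI number, per step 5
X_BIT = 0x20000                  # RISC OS error-returning ("X") SWI form
PAGE_ZERO_LIMIT = 0x4000         # boundary quoted in step 6


def is_swi(instr):
    """True if the 32-bit ARM instruction word is a SWI (bits 27-24 all set)."""
    return (instr & 0x0F000000) == 0x0F000000


def recode_swi(instr, hypervised):
    """Return the word the JIT would emit for a SWI instruction.

    `hypervised` is a caller-supplied set of SWI numbers (without the X bit)
    that need the hypervisor; all other SWIs are copied unchanged.
    """
    number = instr & 0x00FFFFFF
    if (number & ~X_BIT) in hypervised:
        return instr + SWI_HYPERVISOR_FLAG
    return instr


def ldr_action(effective_address):
    """Decide what happens to an LDR once its address has been emulated."""
    return "copy" if effective_address >= PAGE_ZERO_LIMIT else "emulate"
```

For example, with OS_WriteS (SWI &01) in the hypervised set, recode_swi(0xEF000001, {0x01}) gives 0xEF800001, while SWI OS_WriteC (0xEF000000) comes back unchanged.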



Read-ahead

Once entered, the JIT will continue processing instructions up to a limit of 128 instructions or until one of the following is encountered. It will then exit and retry the initial instruction that triggered the JIT.
  • <ALU> PC, ...
  • B <address> that's conditional
  • BL <address> that's conditional
  • LDR PC, ...
  • LDM<mode> Rn, {..., PC}
  • SWI OS_Exit
  • SWI OS_ExitAndDie
  • SWI OS_GenerateError
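The stop conditions above can be approximated with a simple decoder. The Python below is a sketch, not the JIT's actual logic: the names are invented, the SWI numbers are the usual RISC OS values (OS_Exit &11, OS_GenerateError &2B, OS_ExitAndDie &50), the X forms are assumed to count too, any data-processing instruction with Rd = PC is treated as "<ALU> PC" (which also catches the TEQP-style encodings), and the terminating instruction is assumed to be processed before the JIT exits.

```python
# Illustrative model only; names are hypothetical, not taken from ADFFS.
READ_AHEAD_LIMIT = 128
COND_AL = 0xE                    # "always" condition code
PC = 15
X_BIT = 0x20000                  # RISC OS error-returning ("X") SWI form
EXIT_SWIS = {0x11, 0x2B, 0x50}   # OS_Exit, OS_GenerateError, OS_ExitAndDie


def ends_read_ahead(instr):
    """True if this ARM instruction word matches one of the listed stop cases."""
    cond = (instr >> 28) & 0xF
    group = (instr >> 25) & 0x7          # bits 27-25 select the instruction class
    rd = (instr >> 12) & 0xF
    load = bool(instr & (1 << 20))       # L bit for LDR / LDM
    if group in (0b000, 0b001):          # data processing (or MUL / SWP)
        mul_or_swp = group == 0b000 and (instr & 0x90) == 0x90
        return rd == PC and not mul_or_swp
    if group in (0b010, 0b011):          # LDR / STR
        return load and rd == PC
    if group == 0b100:                   # LDM / STM
        return load and bool(instr & (1 << PC))
    if group == 0b101:                   # B / BL
        return cond != COND_AL
    if (instr >> 24) & 0xF == 0xF:       # SWI
        return (instr & 0x00FFFFFF & ~X_BIT) in EXIT_SWIS
    return False


def read_ahead(words):
    """Count how many instructions one JIT entry would process."""
    count = 0
    for instr in words[:READ_AHEAD_LIMIT]:
        count += 1
        if ends_read_ahead(instr):
            break
    return count
```

A run of MOV R0, R1 followed by MOV PC, R14 stops at the MOV PC, and a long run with no stop case is cut off at 128.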


Self-modifying code

Writes to pages already seen by the JIT trigger a Data Abort:
  1. Data Abort handler is entered
  2. The abort handler checks the abort was an access abort and within Application Space; if not, it's passed to the existing Abort handler (i.e. RISC OS)
  3. Word(s) at <Abort address + &1200000> are checked. If they're instructions previously seen by the JIT, they're changed to Hypercalls and the relevant cache flushes performed
  4. The instruction is emulated and the actual memory altered accordingly
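To make the invalidation step concrete, here is a toy Python model of steps 3 and 4. Everything in it is hypothetical: memory is modelled as dictionaries, the HYPERCALL value is just a placeholder co-processor 8 encoding (the real Hypercall word isn't given here), cache flushes are only recorded in a list, and the check in step 2 is left out.

```python
# Illustrative model only; names and the HYPERCALL value are placeholders.
JIT_OFFSET = 0x1200000     # offset quoted in the post (assumed hex)
HYPERCALL = 0xEE000800     # placeholder CDP p8 encoding, not the real Hypercall


class JitMemory:
    def __init__(self):
        self.data = {}      # original address -> word seen by loads and stores
        self.code = {}      # JIT address -> recoded instruction word
        self.flushed = []   # JIT addresses that would need a cache flush

    def translate(self, address, recoded_word):
        """Record that the JIT has recoded the instruction at `address`."""
        self.code[address + JIT_OFFSET] = recoded_word

    def write_word(self, address, value):
        """Model a store to a read-only page that raised a Data Abort."""
        jit_address = address + JIT_OFFSET
        # Step 3: if the JIT has already recoded this word, re-arm the Hypercall.
        if self.code.get(jit_address, HYPERCALL) != HYPERCALL:
            self.code[jit_address] = HYPERCALL
            self.flushed.append(jit_address)
        # Step 4: emulate the store so the actual memory is updated.
        self.data[address] = value
```

Writing over an address the JIT has recoded puts the Hypercall back at address + JIT_OFFSET, so the new instruction gets picked up the next time it's executed. Writes to words the JIT hasn't seen only update the data.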


RMA support

1.5MB of RAM from &800000 to &980000 is set up as a Heap to mirror RMA functionality