26bit CPU support

Discuss development specific to the Pi version of ADFFS
JonAbbott
Posts: 3084
Joined: Thu Apr 11, 2013 12:13 pm
Location: Essex
Contact:

Re: 26bit CPU support

Post by JonAbbott »

It's just occurred to me that we can't use a DA (Dynamic Area) to store the converted code. DAs are created above 64MB, so their addresses would potentially overlap the PSR flags in registers (in 26-bit mode the PSR flags share R15 with the PC, occupying bits 26-31). Creating a DA below 64MB isn't possible due to the restricted memory map, which is mostly taken up by Application space. The RMA is out as well, as it's above the 64MB boundary on the Pi. This leaves Application space.
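To illustrate why addresses at or above 64MB are a problem, here is a minimal Python sketch of the 26-bit R15 layout. The constant and function names are made up for illustration and are not taken from the ADFFS source.

```python
# 26-bit ARM: R15 holds both the PC and the PSR in one 32-bit register.
PC_MASK = 0x03FFFFFC    # bits 2-25: word-aligned program counter
PSR_MASK = 0xFC000003   # bits 26-31 (N Z C V I F) and bits 0-1 (mode)


def fits_in_26bit_pc(address):
    """True if a word-aligned code address fits in R15 without touching PSR bits."""
    return (address & ~PC_MASK & 0xFFFFFFFF) == 0


def split_r15(r15):
    """Split a 26-bit mode R15 value into (pc, psr)."""
    return r15 & PC_MASK, r15 & PSR_MASK


if __name__ == "__main__":
    for address in (0x8000, 0x1388000, 0x4000000):
        print(hex(address), fits_in_26bit_pc(address))
```

Anything in Application space below 64MB passes the check, while an address of 4000000 (64MB) or higher needs bit 26, which is the F flag in 26-bit mode.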

If we increase Application space to the max (28MB on RO3.6-RO4.x), we can use the top 16MB for the JIT translated code / codelets.

This creates some further issues:

1. There are currently 12 known games that require more than 8MB of RAM. However, it's not as bad as it sounds, as on RO3.6-RO4.x we only need to support StrongARM-incompatible games. The max RAM requirement there is currently 5MB (Emotions - Search for Humanity).

On a 32-bit OS, we can increase Application space by a larger amount, say 64MB, and use the top 32MB for the JIT translated code / codelets. The max RAM requirement for a StrongARM-compatible game is currently 18MB (Descent 2).

2. Some compilers back in the day altered the stack (R13) to go from the top of Application space down. Provided we work upwards in the JIT space, we should hopefully avoid this issue.

The proposed memory map on RO3/4 will look as follows:

----------------------- 1B80000
....
JIT translated code
----------------------- 1380000
Codelets
....
----------------------- 1000000
....
Original code
----------------------- 8000
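
As a rough aid to reading the map, the sketch below classifies an address by region and works out how many Codelets fit in a given amount of memory. The boundaries are taken from the map above, and the 4MB and 16 to 48 byte figures are the ones quoted in NOTE 2 below. The names are hypothetical, and the boundaries may well change as the design settles.

```python
# Region boundaries from the proposed RO3/4 memory map (hex addresses).
ORIGINAL_BASE = 0x0008000   # start of the original code
CODELET_BASE = 0x1000000    # bottom of the Codelet region
JIT_BASE = 0x1380000        # Codelets grow down from here, JIT code sits above
APP_TOP = 0x1B80000         # top of the map as drawn


def region_of(address):
    """Name the region of the proposed map that an address falls in."""
    if ORIGINAL_BASE <= address < CODELET_BASE:
        return "original"
    if CODELET_BASE <= address < JIT_BASE:
        return "codelets"
    if JIT_BASE <= address < APP_TOP:
        return "jit"
    return "outside"


def codelet_capacity(region_bytes, codelet_bytes):
    """Number of whole Codelets of a given size that fit in a region."""
    return region_bytes // codelet_bytes


if __name__ == "__main__":
    four_mb = 4 * 1024 * 1024
    for size in (16, 32, 48):
        print(size, "byte Codelets:", codelet_capacity(four_mb, size))
```

With 4MB, the capacity ranges from 87,381 Codelets (all 48 bytes) to 262,144 (all 16 bytes), with 131,072 at the 32-byte midpoint.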


NOTE 1: The Codelets grow downwards from the top of their memory space

NOTE 2: It's not clear at this point whether allocating 4MB to the Codelets is enough. Each Codelet is between 16 and 48 bytes, depending on the instruction being translated. With 4MB we have space for about 131,000 translated instructions (4MB / 32 bytes, the midpoint of that range).

NOTE 3: To handle self-modifying code, the JIT flags pages as read-only as it touches them. Should a write subsequently cause an abort, the matching word in the JIT translated code is checked to see whether the write is going over an instruction we've previously processed: if that word is anything other than the instruction we use to enter the JIT, the original word must have been processed as an instruction. If it has, the write is allowed through to the original code space and the equivalent instruction in JIT memory space is replaced with a jump back into the JIT, so that it gets re-interpreted.
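
The self-modifying code handling in NOTE 3 can be modelled roughly as below. This is only a Python simulation of the idea, not the actual abort handler: the 4KB page size, the `ENTER_JIT` marker, the class and its method names are all assumptions for illustration, and it assumes writes to words that were never translated (i.e. data) are simply let through.

```python
PAGE_SIZE = 4096            # assumed page size
ENTER_JIT = "ENTER_JIT"     # stand-in for the instruction that enters the JIT


class JitModel:
    """Toy model of the read-only page scheme for self-modifying code."""

    def __init__(self):
        self.original = {}            # original code space: address -> word
        self.translated = {}          # JIT space, keyed by original address
        self.read_only_pages = set()  # pages the JIT has touched
        self.aborts = 0

    def translate(self, address):
        """The JIT processes one instruction and write-protects its page."""
        self.translated[address] = ("translated", self.original.get(address, 0))
        self.read_only_pages.add(address // PAGE_SIZE)

    def write(self, address, value):
        """A write by the game; protected pages go via the abort path."""
        if address // PAGE_SIZE in self.read_only_pages:
            self._write_abort(address, value)
        else:
            self.original[address] = value

    def _write_abort(self, address, value):
        self.aborts += 1
        # Anything other than the JIT entry marker means this word was
        # previously processed as an instruction, so force a re-interpret.
        if self.translated.get(address, ENTER_JIT) != ENTER_JIT:
            self.translated[address] = ENTER_JIT
        # The write itself is always allowed through to the original code.
        self.original[address] = value


if __name__ == "__main__":
    model = JitModel()
    model.original[0x8000] = 0xE1A00000
    model.translate(0x8000)
    model.write(0x8000, 0xE1A01001)   # overwrite a translated instruction
    print(model.translated[0x8000], model.aborts)
```

Keying the translated words by original address keeps the "matching word" lookup trivial in the model; in the real layout it would be the word at the original address plus the JIT offset.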

Re: 26bit CPU support

Post by JonAbbott »

I've now coded this up and have started on the JIT.

See Phase 1 for instruction details

Re: 26bit CPU support

Post by JonAbbott »

Have made good progress with phase 1 of the JIT.

Phase 1: Re-interpret all PC relative instructions into codelets (see previous post)
Phase 2: Add additional requirements for StrongARM (self-modifying code)
Phase 3: Add additional requirements for 32-bit (TEQP etc)
Phase 4: Add corrections for SWIs that have changed behaviour since RO3.1

The screenshots below show games running under the ARM3 JIT on a RiscPC ARM610. The JIT is recoding the instructions from the previous post into 32-bit compliant versions; on entry into a BL, R14 contains the PSR flags and the original BL address+4 (i.e. an address in the original code at 8000+). The interpreted code is all running from 1388000 (i.e. original address + 1380000).
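
The address mapping and the R14 value described above can be sketched as follows. This is Python for illustration only; the function names are hypothetical, and the R14 layout simply follows the 26-bit convention of PSR flags in bits 26-31 and 0-1 with the address in bits 2-25.

```python
JIT_OFFSET = 0x1380000   # interpreted code runs at original address + this
PC_MASK = 0x03FFFFFC     # bits 2-25 of a 26-bit mode R14/R15
PSR_MASK = 0xFC000003    # flag bits 26-31 and mode bits 0-1


def to_jit_address(original_address):
    """Where the re-interpreted copy of an original instruction lives."""
    return original_address + JIT_OFFSET


def to_original_address(jit_address):
    """Inverse mapping, from JIT space back to the original code."""
    return jit_address - JIT_OFFSET


def bl_return_r14(original_bl_address, psr):
    """R14 as the called code expects it: PSR flags plus original BL address + 4."""
    return ((original_bl_address + 4) & PC_MASK) | (psr & PSR_MASK)


if __name__ == "__main__":
    print(hex(to_jit_address(0x8000)))
    print(hex(bl_return_r14(0x8010, 0x60000003)))
```

The point of building R14 from the original address rather than the JIT address is that the called code sees the same return value it would have seen on a 26-bit machine.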

Speed-wise, Zarch is running at 44% of its original ARM610 speed and is still running faster than it should :D
James Pond is running at its original rate. As it's speed-restricted in-game, it's hard to establish just how much extra work the CPU is doing, although from the screenshot below, the JIT has added an extra 17000+ instructions.

The two numbers in the top left are <last re-interpreted instruction address> <pointer to next codelet address>. The latter goes down from 1380000, so it indicates just how much memory is required for the re-interpreted instructions so far.
[Screenshot: James Pond on ARM610 under JIT (james_pond_arm3jit1.png)]
[Screenshot: Zarch on ARM610 under JIT (zarch_arm3jit1.png)]

I've uploaded ADFFS400 and ADFFS500 modules and source to /development/32bit/CPU

If you'd like to try something under it on a non-SA machine (I've yet to add the full SA support), do the following:

1. Load !ADFFS
2. RMKill ADFFS
3. RMLoad ADFFS400
4. *ADFRemapVideoMemory 13 160
5. *LOAD <code>
6. *GOARM3JIT <address>

Re: 26bit CPU support

Post by JonAbbott »

I've added some additional debugging. The numbers are now:

<last instruction> <last codelet> <total instructions> <re-interpreted instructions>

I've analyzed Zarch's code: a total of 613 (&265) instructions need re-interpreting. Comparing that against the values in the picture below, the JIT has caught nearly all of them.
[Screenshot: Zarch on ARM610 under JIT (zarch_arm3jit2.png)]
And James Pond after playing to level 2:
[Screenshot: James Pond on ARM610 under JIT (james_pond_arm3jit2.png)]