JIT Phase 3

Discuss development specific to the Pi version of ADFFS
JonAbbott
Posts: 1736
Joined: Thu Apr 11, 2013 12:13 pm
Location: Essex

JIT Phase 3

Post by JonAbbott » Sat Dec 14, 2013 11:11 am

Additional requirements for 32-bit:

Store DomainId when the JIT is started and check it hasn't changed before performing any actions (see OS_ReadSysInfo 6). Also see this for Page Zero variable locations.

xxxNV (coded)

TEQ PC, ... (coded)
TST PC, ... (coded)
CMN PC, ... (coded)
CMP PC, ... (coded)

TEQ ..., PC (coded)
TST ..., PC (coded)
CMN ..., PC (coded)
CMP ..., PC (coded)

TEQP Rn, #immediate (coded)
TSTP Rn, #immediate (coded)
CMNP Rn, #immediate (coded)
CMPP Rn, #immediate (coded)

LDF<P> Rd, [PC, ...] (coded) (used by Quest for Gold)
STF<P> Rd, [PC, ...] (coded) (used by Quest for Gold)

TEQP ..., PC (coded)
TSTP ..., PC (coded)
CMNP ..., PC (coded)
CMPP ..., PC (coded)

CP15 C0 CPUID (coded)
CP15 C2 cache status (coded)


OS related
OS_WriteS (PRM1-517) (coded)
OS_File 12 / 14 / 16 / 255 (PRM2-43) (coded)
OS_GBPB 3 / 4 (PRM2-69) (coded)
OS_GetEnv (PRM1-301) (return new appspace size) (coded)
Wimp_SlotSize (PRM3-203) (prevent changes) (coded)
*GO (coded)
Lock page zero and manage writes to handle vector claims (coded)
*RUN / OS_FSControl 4 (PRM2-87) (coded)
ADFS_DiscOp 1 and 3 (coded)
BASIC USR and CALL commands (see CALLARMROUT) (coded)
OS_SetEnv (PRM1-305) (prevent changes to appspace) (coded)
OS_ClaimScreenMemory (PRM1-388) (coded to return &2000000-what's requested)
OS_Memory 9, &200 (coded)
OS_Memory 9, &300 (coded)
OS_SynchroniseCodeAreas (coded)
Squash_Compress (PRM4-104) (covered by OS_SynchroniseCodeAreas on RO3.71+)
Squash_Decompress (PRM4-106) (covered by OS_SynchroniseCodeAreas on RO3.71+)
OS_ReadSysInfo 1 (PRM1-749) (coded) (used by James Pond II+ Robocod)
OS_ReadSysInfo 2 (PRM1-750) (coded) (used by James Pond II+ Robocod)
OS_UpdateMEMC (PRM1-373) (coded)
OS_CallAVector (PRM1-70)
OS_InstallKeyHandler (PRM1-945)
OS_Word &16 - write screen base address (PRM1-724) (not required)
OS_EnterOS (implement CPU mode paravirtualization)
OS_DelinkApplication (PRM1-74)
OS_RelinkApplication (PRM1-76)

The following need their entry and exit managed within the 32MB limit:
OS_Claim (PRM1-66) (coded)
OS_Release (PRM1-68) (coded)
OS_CallAfter (PRM1-441) (coded)
OS_CallEvery (PRM1-443) (coded)
OS_RemoveTickerEvent (PRM1-445, PRM5a-669) (coded)
OS_AddCallBack (PRM1-324) (coded)
OS_RemoveCallBack (PRM1-327) (coded)
OS_AddToVector (PRM1-72) (coded)
OS_ClaimDeviceVector (PRM1-123) devices 3 and 6 (coded)
OS_ReleaseDeviceVector (PRM1-125) (coded)
OS_ClaimDeviceVector (PRM1-123) device 9 (coded) (used by Rockfall)
OS_ClaimProcessorVector (PRM5a-30) (partially coded - ignores Vector 1 and reports all others. Reports an error below RO3.5)
OS_ChangeEnvironment (partially coded - needs to return a managed entry for the existing handler)
OS_CallBack (PRM1-307) (coded) (used by Chequered Flag)
OS_Control (PRM1-299) (coded) (used by James Pond II Robocod)
OS_BreakCtrl (PRM1-310) (coded) (used by James Pond II Robocod)
OS_BreakPt (PRM1-309) (used by James Pond II Robocod)
OS_UnusedSWI (PRM1-312)
OS_PlatformFeatures 0 (Wiki reference) (used by Overload crash handler)
Wimp interaction (used by Elite)


Audio related
Sound_Configure (PRM4-10 / PRM4-18) (coded)
Sound_InstallVoice (PRM4-13 / PRM4-29) (coded)
Sound_LinearHandler 1 (PRM5a-608)
Staging audio buffer for audio created by Voice Generators, to resolve buffer size mismatches (coded)


CLib related
SharedCLibrary_LibInitAPCS_R (PRM4-255) (coded)
SharedCLibrary_LibInitAPCS_A (PRM4-254) (partially coded) - reports the instruction
SharedCLibrary_LibInitModule (PRM4-259) (partially coded) - reports the instruction

CLib functions
Multiple language description blocks (partially coded) - reports the instruction
_kernel_init (0) (coded)
_kernel_swi (9) (coded)
_kernel_osfile (18) (coded) (used by Empire Soccer 94)
_kernel_register_allocs (25) (partially coded) - reports the instruction
_kernel_moduleinit (38) (partially coded) - reports the instruction
_kernel_entermodule (42) (partially coded) - reports the instruction
_kernel_swi_c (45) (coded)
_kernel_register_slotextend (46) (partially coded) - reports the instruction
InitProc (coded)
FinaliseProc (coded)
_run (returned in R0 from InitProc) (coded)
_main (18) (coded)
atexit (71) (coded) (used by Inferno)
qsort (76) (coded) (used by Alone in the Dark)
signal (128) (coded) (used by Elite)
_swi (184) (partially coded) - reports the instruction
_swix (185) (partially coded) - reports the instruction

The following could be hypervised to change the memory allocated back to read/write:
_kernel_osgbpb (15) (coded) - currently reports the instruction
fgets
gets
fread
calloc
free
malloc
realloc
memcpy
memmove
strcpy
strncpy
strcat
strncat
memset


OS_CallASWI related
Hypervised SWIs (coded)
Vector claims (coded) (used by Populous)


Module related
OS_Module 6 (PRM1-237) (coded)
OS_Module 7 (PRM1-238) (coded)
RMRun / OS_Module 0 (PRM1-231) (coded)
RMLoad / OS_Module 1 (PRM1-232) (coded)
OS_Module 10 (PRM1-241) (coded)
OS_Module 11 (PRM1-242) (coded)
Service_ModulePreInit (SA Support Notes)


Chipset related
MEMC Vinit, Vstart, Vend (coded)
MEMC Cinit (partially coded, not coded for Pi)
MEMC Sstart, SendN, Sptr (partially coded, doesn't work if set to RO default buffer addresses on SA) (required for Diggers and Rockfall)


Changes to kernel SWIs at RO3.5:
OS_Byte 135 - may return a mode specifier (PRM5a-115)
OS_Word 7 - undocumented in PRM (see here)
OS_Word 9 - deprecated (PRM5a-115)



SWI flag preservation
6 OS_Byte 129 - alters C
6 OS_Byte 138 - alters C (coded)
6 OS_Byte 145 - alters C (coded)
6 OS_Byte 152 - alters C (coded)
6 OS_Byte 153 - alters C (coded)
4 OS_ReadC - alters C (coded)
7 OS_Word 0 - alters C (coded)
7 OS_Word 15,15 - alters C (coded)
7 OS_Word 15,24 - alters C (coded)
A OS_BGet - alters C (coded)
E OS_ReadLine - alters C (coded)
C OS_GBPB 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 - alters C (coded)
26 OS_GSRead - alters C (coded)
27 OS_GSTrans - alters C (coded)
2C OS_ReadEscapeState - alters C (coded)
35 OS_ReadModeVariable - alters C (coded)
3A OS_ValidateAddress - alters C (coded)
3F OS_CheckModeValid - alters C (coded)
41 OS_ClaimScreenMemory - alters C (coded)
57 OS_SerialOp 3 - alters C (coded)
59 OS_Confirm - alters C, Z (coded)
42746 DeviceFS_ReceivedCharacter - alters C
42747 DeviceFS_TransmitCharacter - alters C
43044 Territory_Exists - alters Z
4305D Territory_Collate - alters N, Z, C
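
The list above can be read as a per-SWI mask of the flags a call is allowed to change, with every other flag handed back to the caller untouched. A minimal sketch of that idea in Python follows. It is not the ADFFS code: `FLAG_MASKS` and `merge_flags` are hypothetical names, only a handful of entries from the list are shown, and always taking V from the SWI (since RISC OS SWIs report errors in V) is an assumption of this sketch.

```python
# Standard ARM PSR flag bits: N, Z, C, V are bits 31..28.
N, Z, C, V = 1 << 31, 1 << 30, 1 << 29, 1 << 28
ALL_FLAGS = N | Z | C | V

# A few entries from the list above: SWI number -> flags the SWI may alter.
FLAG_MASKS = {
    0x04: C,             # OS_ReadC
    0x0A: C,             # OS_BGet
    0x59: C | Z,         # OS_Confirm
    0x43044: Z,          # Territory_Exists
    0x4305D: N | Z | C,  # Territory_Collate
}

def merge_flags(swi, caller_psr, swi_psr):
    """Return the flags to hand back to the caller after a SWI.

    Flags listed for the SWI (plus V, an assumption of this sketch) are
    taken from the SWI's exit state; all others keep the caller's value.
    """
    mask = FLAG_MASKS.get(swi, 0) | V
    keep = caller_psr & ALL_FLAGS & ~mask
    take = swi_psr & mask
    return keep | take
```

For example, `merge_flags(0x04, N | Z, C)` keeps the caller's N and Z and picks up C from OS_ReadC.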


Re: JIT Phase 3

Post by JonAbbott » Thu Dec 26, 2013 9:57 pm

Have some more games running reliably on the Pi, screenshots below. The music on Pacmania is now correct as well.

Numbers across the top are:

<fps> <previous instruction block> <last instruction> <last codelet> I:<total instructions> R:<re-interpreted instructions>
A:<aborts> C:<cache flushes>
[Screenshot: Terramex on the Pi under ARM3 JIT (terramex_arm3jitPI1.png)]
[Screenshot: Pacmania on the Pi under ARM3 JIT (pacmania_arm3jitPI2.png)]


Re: JIT Phase 3

Post by JonAbbott » Fri Dec 27, 2013 11:34 am

James Pond is now running on StrongARM. It's hanging on the Pi though - I've coded enough of MEMC / IOC / VIDC1 to get it working, so some debugging is required.

Fire & Ice is crashing on StrongARM with "No writeable memory at this address" when it gets to the title page. I need to add code to hypervise OS_ClaimDeviceVector before it will run on RO5. I'll also track the claimants on RO3.x/4.x so it can unclaim them for breakpoints / aborts that need reporting, which will make debugging easier.


Re: JIT Phase 3

Post by JonAbbott » Tue Dec 31, 2013 5:03 pm

Chuck Rock on SA is an odd one: it gets to the prompt for disc 2 and then crashes, and the address it crashes at is in the middle of the ADFFS command table! If I quickly swap discs just before the prompt appears, the game runs normally. I can't for the life of me figure out what's going on. I'm almost at the point where I halt development and extend RPCEmu into a step debugger, as JTAG isn't an option - not that I have any idea about JTAG or the kit anyway.

I've also just tested this on RO5/SA. Zarch just about runs but suffers major screen corruption before crashing. Any 4-bit mode game just goes bonkers; I think some unwanted Pi code has crept into the IOMD build. However, this should also affect RO3.71/SA as it's the same module, so I'm somewhat confused at the moment.


Re: JIT Phase 3

Post by JonAbbott » Mon Feb 03, 2014 7:20 am

OS_FSControl 4 is now completely coded. This allows files to be *RUN and intercepted by ADFFS. It's working perfectly on StrongARM, but I can't get it to work on the Pi. This is somewhat ironic, as I coded it to mirror the RO5 source.

As Transients (RO term, we call them Utilities) are loaded into the RMA, that's forced me to start Module support. I've allocated 1.5MB for RMA at 8MB+, which should be enough for any game. This is handled via OS_Heap, so it mirrors RMA behaviour.
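
As a quick sanity check on those numbers, the region can be written out in hex. This is only an illustration: it assumes the JIT RMA starts exactly at 8MB (the post says "8mb+"), and the constant and function names are made up for the example.

```python
MB = 1024 * 1024

APPSPACE_LIMIT = 32 * MB    # the 32MB limit discussed in this thread
JIT_RMA_BASE = 8 * MB       # assumption: region starts exactly at 8MB
JIT_RMA_SIZE = 3 * MB // 2  # 1.5MB

JIT_RMA_END = JIT_RMA_BASE + JIT_RMA_SIZE  # first address past the region

def in_jit_rma(addr):
    """True if addr falls inside the (assumed) JIT RMA region."""
    return JIT_RMA_BASE <= addr < JIT_RMA_END

print(hex(JIT_RMA_BASE), hex(JIT_RMA_END))
```

Under that assumption the region runs from &800000 up to, but not including, &980000, comfortably inside the 32MB limit.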

However, intercepting modules is a different matter. Back in RO3.7, Service_ModulePreInit was added specifically for patching up modules for SA compatibility. In RO5, however, the 32bit check is done before it issues this service call, so there's no opportunity to patch the module unless it's already 32bit compliant! The alternative is to hijack OS_Module, allocate RMA, load the module into it and then insert it into the chain as a RAM based module once it's been patched.

The next problem is that 26bit modules will now be in appspace to keep them within the 32MB limit. All interrupt driven entry/exits will need managing via the actual RMA so the memory can be mapped in if required. I've yet to look at the specifics of either aspect.
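
For anyone unfamiliar with the 32bit check mentioned here, a rough Python sketch of it follows. It assumes the documented RISC OS module header layout, where the word at offset &30 is the offset to the module flags word and bit 0 of that word marks the module as 32-bit compatible. The function name is hypothetical and the kernel's real validation is more involved than this.

```python
import struct

MODULE_FLAGS_OFFSET = 0x30  # header field: offset to the module flags word
FLAG_32BIT = 1 << 0         # bit 0 of the flags word: 32-bit compatible

def is_32bit_module(image: bytes) -> bool:
    """Simplified 32-bit compatibility test on a module image."""
    # A header too short to hold the flags-offset field has no flags word.
    if len(image) < MODULE_FLAGS_OFFSET + 4:
        return False
    (flags_ptr,) = struct.unpack_from("<I", image, MODULE_FLAGS_OFFSET)
    # Zero or out-of-range offset: treat as an old 26-bit module.
    if flags_ptr == 0 or flags_ptr + 4 > len(image):
        return False
    (flags,) = struct.unpack_from("<I", image, flags_ptr)
    return bool(flags & FLAG_32BIT)
```

A 26bit game module fails a test of this kind, and because the service call is only issued afterwards, nothing gets the chance to patch the module first.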

steve3000
Posts: 198
Joined: Thu May 02, 2013 9:25 pm

Re: JIT Phase 3

Post by steve3000 » Tue Feb 04, 2014 10:39 pm

Great work Jon!
JonAbbott wrote: However intercepting modules is a different matter, back in RO3.7 Service_ModulePreInit was added specifically for patching up modules for SA compatibility, however in RO5 the 32bit check is done before it issues this service call, so there's no opportunity for it to patch the module unless it's already 32bit compliant! The alternative is to hijack OS_Module, allocate RMA, load the module into it and then insert it into the chain as a RAM based module once it's been patched.
Damn. I had been thinking all along that there was a Service call (which I couldn't remember the name of, just that it was RO3.7 specific and wasn't in my PRM v5a) to allow patching of modules, so had lodged this in the back of my mind as the probable way to sort out 26bit modules. This was the call.

I can't understand why it's called after the 32bit check. It makes no sense at all. Is this something to take up with ROOL?


Re: JIT Phase 3

Post by JonAbbott » Tue Feb 04, 2014 10:49 pm

steve3000 wrote: Is this something to take up with ROOL?
I've already raised it and detailed what I believe needs changing to resolve it.


Re: JIT Phase 3

Post by JonAbbott » Tue Feb 04, 2014 10:56 pm

JonAbbott wrote: OS_FSControl 4 is now completely coded. This allows files to be *RUN and intercepted by ADFFS, it's working perfectly on StrongARM - I can't get it to work on the Pi though.
This is now working on the Pi, and all Obey scripts that were changed to use the JIT have been modified in the next beta (2.35 at the moment).


Re: JIT Phase 3

Post by JonAbbott » Fri Feb 21, 2014 11:05 pm

I've now implemented sub-pages* in the JIT, which has reduced the number of Aborts quite substantially in some games. In Zarch, for example, the number has dropped by 75%.

Before coding this I noticed Zarch was generating substantially more Aborts than it was in the early betas: up from 10,000 / sec to 65,000 / sec, which had reduced the frame rate on StrongARM from 96 fps down to 12.5 fps. Once I've fixed StrongARM support (it's broken in 2.37) I'll see what difference this change makes. Why the number of Aborts has gone up in the first place I don't know. It's noticeably quicker on the Pi, so I'm hoping it will make a similar difference on StrongARM, which is more sensitive to the number of Aborts due to the cache flush hit.

I've also started coding OS_Module. Claim and Free are now done in the JIT RMA, which fixes some games that copy code into the RMA. Load / Run / Insert are a bit more complicated, as a stub Module needs to go into the actual RMA. I've yet to work out the specifics, although it needs to contain a new header with the 32bit flag, entry points, command table/help and hypervised entry/exit to the actual module in the JIT RMA.

* sub-pages: the access permissions of a 4KB memory page can be set separately for each of its four 1KB sub-pages, allowing more granular control
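
To make the footnote concrete, here is a small Python sketch of the legacy ARM small-page descriptor, which carries one 2-bit AP (access permission) field per 1KB sub-page in bits 4 to 11. The bit layout is the standard ARMv4/v5 one; the function names are hypothetical and this is not how ADFFS itself builds its page tables.

```python
def small_page_descriptor(phys_base, ap, cacheable=False, bufferable=False):
    """Build a legacy small-page (4KB) descriptor with per-sub-page AP bits.

    ap is a list of four 2-bit values, one per 1KB sub-page (AP0..AP3).
    """
    if phys_base & 0xFFF:
        raise ValueError("page base must be 4KB aligned")
    if len(ap) != 4 or any(not 0 <= a <= 3 for a in ap):
        raise ValueError("need four 2-bit AP values")
    desc = phys_base | 0b10               # bits [1:0] = 10: small page
    if bufferable:
        desc |= 1 << 2                    # B bit
    if cacheable:
        desc |= 1 << 3                    # C bit
    for i, a in enumerate(ap):
        desc |= a << (4 + 2 * i)          # AP0 at [5:4] .. AP3 at [11:10]
    return desc

def subpage_index(addr):
    """Which of the four 1KB sub-pages an address falls in."""
    return (addr >> 10) & 3

def subpage_ap(desc, addr):
    """AP field that applies to addr within the page described by desc."""
    return (desc >> (4 + 2 * subpage_index(addr))) & 3
```

With `ap=[3, 2, 3, 3]`, only the second 1KB of the page (offsets &400 to &7FF) gets the more restrictive `0b10` setting, so accesses to the other three quarters no longer fault.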


Re: JIT Phase 3

Post by JonAbbott » Sat Feb 22, 2014 11:35 pm

JonAbbott wrote: I've now implemented sub-pages* in the JIT which has reduced the number of Aborts quite substantially in some games.
On further investigation, RO5 on the Pi uses the new page descriptor format, which doesn't support sub-pages. This is why Jet Fighter exhibits strange behaviour.

Thankfully, we don't really need to reduce the number of Aborts on the Pi as it's more than quick enough to cope with over 100,000 / second without impacting game speed. I'll restrict sub-page support to StrongARM only.
