
ADFFS 2.73 StrongARM JIT (ARMv4) beta

Posted: Tue Dec 31, 2019 9:48 pm
by JonAbbott
Following this request from Jan Rinze, asking what happened to the StrongARM JIT, I thought I'd get it into a working state and released before the end of 2019!

Attached are the updated ADFFS Modules, which should be extracted to the !ADFFS folder, overwriting the originals. To run the JIT in ARMv4 mode, use *GOARM4JIT instead of *GOARM3JIT to start/stop the JIT. The JIT cannot run in both modes simultaneously, so you can't currently mix ARMv3 and ARMv4 processes - that will take a lot more coding and is only really applicable for apps running under the Wimp.

I've not done extensive testing, having only tested RISCOSmark and Stunt Racer 2000 [SA version] in ARMv4 mode. STR PC and STM Rn, {...PC} have been updated to store PC+8 when in ARMv4 mode, although I have not added or checked for any additional ARMv4 instructions such as the long multiply instructions (SMULL, UMULL, UMLAL) or Co-pro. Some of the Co-pro instructions were added previously to get IOMD based games working correctly, so hopefully no additional instruction support should be required.
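As a rough illustration of the STR PC change, here is a minimal Python sketch of the value that ends up in memory. The PC+8 figure is from the post above; the PC+12 figure for ARMv3 mode is my assumption based on ARM6/ARM7-class behaviour, and `stored_pc` is a hypothetical helper, not a routine in ADFFS. It also ignores the PSR bits that share R15 in 26-bit modes.

```python
# Offsets added to the instruction address when PC is stored by STR/STM.
ARMV3_PC_STORE_OFFSET = 12  # assumption: ARM6/ARM7-class behaviour
ARMV4_PC_STORE_OFFSET = 8   # from the post: PC+8 when in ARMv4 mode


def stored_pc(instruction_address: int, armv4_mode: bool) -> int:
    """Return the 32-bit value a guest 'STR PC, [...]' would write.

    Hypothetical helper for illustration only; PSR bits are ignored.
    """
    offset = ARMV4_PC_STORE_OFFSET if armv4_mode else ARMV3_PC_STORE_OFFSET
    return (instruction_address + offset) & 0xFFFFFFFF


# The same guest instruction at &8000 stores a different value per mode.
v3_value = stored_pc(0x8000, armv4_mode=False)
v4_value = stored_pc(0x8000, armv4_mode=True)
```

Code that reads the stored value back and assumes one offset will misbehave under the other, which is why the JIT has to match the mode it's emulating.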

RISCOSmark returns the same results under the ARMv4 JIT as running natively. On my Pi3 the Processor - Looped instructions results are as follows:


Native Pi3 - 1381%
ARMv3 mode - 58%
ARMv4 mode - 1380%
This shows just how much detecting self-modifying code can slow things down. In the case of RISCOSmark, it's the mixing of data and code in the same memory pages that really affects it. If the data and code were split, ARMv3 mode would be the same speed as ARMv4.
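To show why mixing data and code in a page hurts so much, here is a toy Python model. It assumes detection works at page granularity, so any write to a page that holds translated code triggers an Abort; the 4KB page size, the addresses and the `count_aborts` function are all made up for illustration and aren't taken from the JIT itself.

```python
PAGE_SIZE = 4096  # assumption: 4KB pages


def count_aborts(code_addrs, write_addrs, page_size=PAGE_SIZE):
    """Count writes that land in a page holding translated code.

    Toy model: every such write is treated as one Abort, whether or
    not it actually modifies an instruction.
    """
    code_pages = {addr // page_size for addr in code_addrs}
    return sum(1 for addr in write_addrs if addr // page_size in code_pages)


# Mixed layout: the data sits in the same page as the code,
# so every one of the 1000 data writes faults.
mixed = count_aborts(code_addrs=[0x8000], write_addrs=[0x8100] * 1000)

# Split layout: the data sits in its own page, so none of them fault.
split = count_aborts(code_addrs=[0x8000], write_addrs=[0x9100] * 1000)
```

In this model the mixed layout takes 1000 Aborts and the split layout takes none, even though neither run modifies any code.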

At some point I'll update the packaged beta version available via !PackMan, along with all the games that support StrongARM so they can all be tested fully (DONE 02 Jun 20)

Below is the list of games that run on physical StrongARM. Note that if any make use of self-modifying code and aren't correctly issuing OS_SynchroniseCodeAreas, they're going to fail under the ARMv4 JIT. Very few of these would actually benefit from running in ARMv4 mode, as they already run too fast on a Pi1 in ARMv3 mode, so it's questionable if they're worth testing. The only two that might benefit are Bubble Impact and Wizard Apprentice as they use higher resolution modes - running under the ARMv4 JIT would reduce the number of Aborts being generated.

One game that would definitely benefit from ARMv4 mode (which isn't listed as it makes use of self-modifying code and isn't StrongARM compatible) is DarkWood. Its graphics routines generate a lot of Aborts under the ARMv3 JIT because of mixed data and code in the same memory page - millions of Aborts per second in fact. On a Pi1 that can cause slowdowns when the graphics are turned fully up.

F10003 Abuse (1998) (R-Comp Interactive)
F10006 Aggressor (1992) (Atomic Software)
F10007 Air Supremacy (1991) (Superior Software)
F10009 Aldebaran (1993) (Evolution Trading)
F10017 Apocalypse (1990) (The Fourth Dimension)
F10019 Arcade Soccer (1989) (The Fourth Dimension)
F10020 Arcendium (1988) (Alien Images)
F10033 Asylum (1993) (Digital Psychosis)
F10041 Big Bang (1996) (Psycore)
F10043 Black Angel (1992) (The Fourth Dimension)
F10047 BloodLust (1998) (The Fourth Dimension)
F10052 Bobby Blockhead vs The Dark Planet (1991) (Atomic Software)
F10055 Botkiller2 (1999) (Artex Software)
F10443 Brutal Horse Power (1997) (TBA Software)
F10058 Bubble Impact [demo version] (1997) (Moving Pixels)
F10461 Burn'Out [SA version] (1997) (Oregan Developments)
F10355 Cascade (1992) (Milo Shaffer and Richard Norman)
F10482 Cataclysm [SA version] (1998) (The Fourth Dimension)
F10072 Caverns (1991) (Minerva)
F10074 Chaos Engine, The (2000) (R-Comp Interactive)
F10606 Chocks Away Compendium [SA version] (2000) (The Fourth Dimension)
F10083 Cobalt Seed, The (1995) (TBA Software)
F10091 COPS (1989) (Alpine Software)
F10462 Deadline (1996) (Network 23)
F10109 Deeva (1990) (Calderglen Computers)
F10607 Demon's Lair [SA version] (1997) (The Fourth Dimension)
F10118 Dominate (1991) (RTFM Software)
F10601 Drop Ship [SA version] (1997) (The Fourth Dimension)
F10535 Dune II - Battle for Arrakis [CD version] (1997) (Eclipse)
F10307 Ego: Repton 4 (1992) (Superior Software)
F10457 Enter The Realm [SA version] (1992) (The Fourth Dimension)
F10163 Flying High: Joust (1997) (The Datafile)
F10163 Flying High: Euroblaster (1997) (The Datafile)
F10631 Formula Fun (1993) (Mystery Software)
F10165 Frak! (1998) (R-Comp Interactive)
F10167 Freddy's Folly (1988) (Minerva)
F10170 Galactic Dan (1992) (The Fourth Dimension)
F10556 Games Minipack Four: Word Up Word Down (1989) (GEM Electronics)
F10592 Games Minipack Five: Fireball II (1990) (Cambridge International Software)
F10180 Groundhog (1998) (The Fourth Dimension)
F10197 High Risc Racing (1995) (Modus Software)
F10199 Holed Out!! (1989) (The Fourth Dimension)
F10201 Hostages (1990) (Superior Software)
F10202 Hoverbod (1988) (Minerva)
F10565 Humanoids and Robotix (1993) (Cambridge International Software)
F10408 Inferno (1996) (Paradise Games)
F10698 Interdictor II [v1.1] (1990) (Clares Micro Supplies)
F10207 Interdictor II [v1.3] (1990) (Clares Micro Supplies)
F10208 Iron Lord (1990) (Cygnus Software)
F10221 Labyrinth (2000) (Acornsoft)
F10229 Logic Mania: Atomix (1996) (The Fourth Dimension)
F10229 Logic Mania: Blindfold (1996) (The Fourth Dimension)
F10229 Logic Mania: Gloop (1996) (The Fourth Dimension)
F10229 Logic Mania: Tilt (1996) (The Fourth Dimension)
F10239 Mah-Jong Patience (1990) (Cambridge International Software)
F10241 Man at Arms (1990) (The Fourth Dimension)
F10247 MicroDrive (1990) (Cambridge International Software)
F10458 MicroDrive Designer (1992) (Cambridge International Software)
F10248 MicroDrive World Edition (1991) (Cambridge International Software)
F10251 Minotaur (1987) (Minerva)
F10252 Mirror Image (1996) (TBA Software)
F10252 Merp (1996) (TBA Software)
F10256 Morph (1998) (The Fourth Dimension)
F10258 Mr Doo (1994) (Archimedes World)
F10613 Nevryon [SA version] (2001) (The Fourth Dimension)
F10263 No Excuses (1991) (Arcana Software)
F10266 OddBall (1995) (Digital Psychosis)
F10272 Overload (2000) (Paradise Games)
F10478 Pandora's Box [SA version] (1998) (The Fourth Dimension)
F10410 Plague Planet (1988) (Alpine Software)
F10301 Ravenskull (1997) (ProAction)
F10659 Repton 1 (1997) (ProAction)
F10309 Revolver (1995) (Psycore)
F10312 Rise in Crime (1988) (Robico Software)
F10614 Saloon Cars Deluxe [SA version] (2000) (The Fourth Dimension)
F10324 Scrabble (1994) (U.S. Gold)
F10328 Shuggy (1997) (Werewolf Software)
F10329 Silver Ball (1997) (The Fourth Dimension)
F10334 Small (1993) (Virgo Software)
F10544 Spheres of Chaos 2 (2000) (R-Comp Interactive)
F10350 Stranded! (1989) (Robico Software)
F10445 Stunt Racer 2000 [SA version] (1997) (The Fourth Dimension)
F10356 Super Snail (1998) (The Fourth Dimension)
F10363 T.A.N.K.S. (1996) (Werewolf Software)
F10365 TEK 1608 (2002) (R-Comp Interactive)
F10366 Terramex (1988) (Grandslam Entertainments)
F10370 Top Banana (1988) (Hex)
F10372 Trivial Pursuit (1989) (Domark)
F10380 WaveLength (1994) (GamesWare)
F10381 White Magic (1989) (The Fourth Dimension)
F10382 White Magic 2 (1989) (The Fourth Dimension)
F10386 Wizard Apprentice (1997) (The Datafile)
F10387 WolfPack (1992) (Software 42)
F10390 Word Up Word Down (1989) (GEM Electronics)
F10030 Zodiac - Aries: Hamsters (1994) (GamesWare)
F10030 Zodiac - Aries: Square Route (1994) (GamesWare)

Refer to this post for StrongARM game testing status

Re: ADFFS 2.73 StrongARM JIT (ARMv4) beta

Posted: Thu Jan 02, 2020 10:48 pm
by JonAbbott
I've spotted an issue with OS_SynchroniseCodeAreas, which isn't cleaning the JIT cache when a full memory sync is requested. Full syncs are ignored for ARMv3 code, as the self-modifying code detection sorts it out, but in ARMv4 mode they need to clean the whole JIT cache.

I purposely ignored it for the ARMv3 JIT because it's a massive performance hit and so many "StrongARM compatible" games were performing a full memory sync instead of cleaning just the region that had changed. A full sync requires over 12MB of RAM to be written and the JIT essentially resets itself as previously encoded instructions can no longer be trusted.

Re: ADFFS 2.73 StrongARM JIT (ARMv4) beta

Posted: Fri Jan 03, 2020 10:38 pm
by JonAbbott
JonAbbott wrote: Thu Jan 02, 2020 10:48 pm I've spotted an issue with OS_SynchroniseCodeAreas, which isn't cleaning the JIT cache when a full memory sync is requested.
Whilst resolving this issue I also noticed an issue with the ranged clean, which was cleaning a minimum of 128 bytes even if the requested size was smaller.
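The ranged clean problem can be pictured with a small Python sketch. It assumes a cache keyed by word-aligned guest address and a word-aligned start and size; `clean_range`, `clean_range_min128` and the dictionary layout are hypothetical and only mimic the behaviour described above, not the real hv_reset_memory_block code.

```python
def clean_range(cache, start, size):
    """Drop cached translations for exactly the requested byte range."""
    for addr in range(start, start + size, 4):
        cache.pop(addr, None)


def clean_range_min128(cache, start, size, minimum=128):
    """Mimic the reported behaviour: never clean fewer than 128 bytes."""
    clean_range(cache, start, max(size, minimum))


def make_cache():
    """Toy cache: one entry per word for 256 bytes starting at &8000."""
    return {addr: "translated" for addr in range(0x8000, 0x8100, 4)}


exact = make_cache()
clean_range(exact, 0x8000, 16)         # 4 words requested, 4 dropped

over = make_cache()
clean_range_min128(over, 0x8000, 16)   # 4 words requested, 32 dropped

remaining_exact = len(exact)
remaining_over = len(over)
```

Over-cleaning is harmless for correctness in this model, but every extra entry dropped is an instruction that has to be translated again.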

Further testing has highlighted an issue with the Paradise Games titles (Inferno and Overload). Although they're supposedly "StrongARM compatible", it appears there's a bug in the Paradise Module, which doesn't issue OS_SynchroniseCodeAreas after unpacking the main game code to memory.

It works on real hardware by luck, because the instruction cache is getting cleaned by the random replacement method during the unpack. That won't work for the JIT, however, as instructions are cached until they're explicitly cleaned, either via a write (in ARMv3 mode, via self-modifying code detection) or when OS_SynchroniseCodeAreas is issued.
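A toy Python model of the failure: a translation cache that only forgets an address when it's told to. The `ToyJIT` class and its `synchronise` method are hypothetical stand-ins for the JIT cache and an OS_SynchroniseCodeAreas ranged sync; they don't reflect the SWI's real register interface.

```python
class ToyJIT:
    """Caches 'translations' by guest address until explicitly cleaned."""

    def __init__(self, memory):
        self.memory = memory  # guest memory: address -> instruction
        self.cache = {}       # translated instructions, keyed by address

    def fetch(self, addr):
        """Return the cached translation, translating on first use."""
        if addr not in self.cache:
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]

    def synchronise(self, start, end):
        """Drop cached words in [start, end), like a ranged sync."""
        for addr in range(start, end, 4):
            self.cache.pop(addr, None)


memory = {0x8000: "unpacker code"}
jit = ToyJIT(memory)
first = jit.fetch(0x8000)          # translated and cached

memory[0x8000] = "game code"       # unpacker overwrites itself, no sync
stale = jit.fetch(0x8000)          # still the old translation

jit.synchronise(0x8000, 0x8004)    # what the game should have issued
fresh = jit.fetch(0x8000)          # now picks up the unpacked code
```

Nothing in this model evicts entries on its own, which is the difference from real hardware, where the unpack itself happens to push the old instructions out of the cache.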

There's nothing I can do about this except fix the game code or leave them running in ARMv3 mode. It does however highlight a potential issue with any code running under ARMv4 mode that doesn't correctly make use of OS_SynchroniseCodeAreas.

Re: ADFFS 2.73 StrongARM JIT (ARMv4) beta

Posted: Sat Jan 04, 2020 3:25 pm
by JonAbbott
Updated Modules attached to the OP, which fix the full memory sync issue detailed above.

Re: ADFFS 2.73 StrongARM JIT (ARMv4) beta

Posted: Tue Jun 02, 2020 4:05 pm
by JonAbbott
v2.73o added to the OP. There is a known issue when run on RiscPC where the video memory is not being cache cleaned at VSync.

2.73o changes since 2.73c:
  • JIT hv_reset_memory_block now disables IRQ/FIQ when resetting memory ranges
  • JIT hv_reset_memory_block would clear 128 bytes of JIT memory if the requested block size was below 128 bytes
  • ADF code wasn't setting ADF_Changed if the mount failed
  • MiscOp 1 now always returns "Maybe changed" for RISC OS 3.60+ due to a bug introduced after RISC OS 3.50 that causes disc changes to not be detected
  • ADFCache added to flush the JIT cache and enable self-modifying code support when in ARMv4 mode
  • OS_FSControl 4 now implements the UnSqzAIF fixup code, to correct compressed AIFs that aren't StrongARM compatible
  • Debug mode now checks the JIT cache consistency against instructions in main memory and the RMA when running in ARMv4 mode and reports any mismatches
  • LDMIA R13!,{PC} is now switched to LDR PC,[R13],#4 regardless of the condition, previously only the AL condition was handled
  • JIT ALU instructions rewritten to only require 3 tmp variables, IRQ handler modified to cache 3 tmp variables
  • JIT OS_FSControl _utility_return was missing a NOP after an LDM Rx,{}^ in the IOMD builds
  • JIT now redirects VProtect OS_Module 6 (Claim) to the JIT RMA when VProtect handles a Module load or run (fixes Magnetoids)
  • ADFFS now updates VProtect to the latest version if it's already running
  • JIT Allocate_Transient_Stack was changing the CPU mode instead of disabling IRQ/FIQ (bug introduced in 2.71b)
  • JIT call_module_entry was missing a NOP after an LDM Rx,{}^ in the IOMD builds
  • map_screen_memory was corrupting R3 (L2PT address)
  • map_screen_memory was not turning off screen caching on StrongARM if the screen memory wasn't remapped
  • Claim_DataAbort_Vector was reenabling screen caching on IOMD
  • JIT IRQ handler was setting flags in R14 before calling the existing IRQ handler, which causes RISC OS 3.8+ to return to the wrong location
  • JIT IRQ handler now uses the existing IRQ32 stack instead of a temporary one
  • Modified code that gets the IRQ stack base to use a legitimate call on ROL OS builds
  • FSControl_Exit_handler wasn't preserving R0-R2 when passing on to the next Exit handler
  • Now RMKills ZLib on RISC OS Select and installs the ROOL version
  • SetVars and ADFFS weren't checking for RISC OS 6 correctly
  • ADFRemapVideoMemory now reports an error under RISC OS 6
  • SWI handler exit code wasn't checking for a 32bit OS correctly
  • setup_screen_registers wasn't aligning L1/L2PT pointers
  • map_screen_memory wasn't resetting the screen domain when re-enabling caching on RISC OS 3.8-4.x (fixes lockup on RISC OS 3.8)
  • map_screen_memory was turning off caching on DA2 under RISC OS 5
  • OS_ChangeDynamicArea was not unmapping RO3.1 screen memory if caching was manually disabled on StrongARM
  • ADFRemapVideoMemory wasn't clearing the RO3.1 screen mapping before marking it as unmapped
  • mode_translation_table had an incorrect bpp for MODE 3
  • _pre_mode_change was not unmapping RO3.1 screen memory
  • _Early_Abort_Check no longer sets the 32bit Abort stack
  • changed ADFRemapVideoMemory to do an additional mode change on IOMD before taking over the Abort vector, to prevent hangs when the current desktop resolution is high
  • ADFRemapVideoMemory wasn't increasing the version of RISC OS being emulated when turning off RISC OS 3.11 video memory emulation
  • ADFRemapVideoMemory wasn't removing the current task from the supported task list
  • OS_ChangeDynamicArea wasn't checking if the current task was supported (fixes crash when returning to the desktop with large screen res)
  • JIT hv_OS_CallASWI/hv_OS_CallASWIR12 reinstated from 2.72c and CODELET_CALLASWI reinstated from 2.71i (fixes crash when Abuse calls OS_WimpPoll via OS_CallASWI)
  • Modified all HD installable game Boot Scripts (obey.zip) to not require modifying when replacing !Run
2.73c changes since 2.72:
  • ADFFSFiler Service table wasn't terminated correctly
  • Updated VoxLib to 1.07
  • Changed !Run to only load CLib if it's both out of date and not already running
  • DiscOp 2 no longer alters the Sequence no (fixes writing to DOS/Atari format floppies)
  • ARMv4 JIT support added via *GOARM4JIT
  • JIT STR PC, ... and STM Rn, {...PC} store PC+8 when in ARMv4 mode
  • OS_SynchroniseCodeAreas full memory sync is no longer ignored when in ARMv4 mode