1.8 times speed improvement in MMBasic?

     Page 1 of 2    
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 11209
Posted: 04:54pm 18 Apr 2026

For some programs....

I've been playing (together with my mate), looking at every avenue possible to speed up MMBasic, and the answer is there is very little sensible left to do unless...

What is the most used statement in most programs? Answer: LET

I wondered if we could cache LET statements so that there was no variable lookup except the first time in and no recursive parsing - sort of compile them on the fly.

and it works
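
To make the idea concrete, here is a minimal Python sketch of caching LET statements. It is not the MMBasic implementation (that lives in C inside the firmware). The class TinyInterp, its method names and the choice to key the cache by statement text are all assumptions made for illustration; a real interpreter would more likely key on the statement's position in program memory. The first time a statement is seen it is parsed and its variable names are resolved to slot numbers. Every later execution replays the stored operations with no parsing and no name lookup.

```python
import operator
import re

TOKEN = re.compile(r"\s*(\d+\.?\d*|[A-Za-z_]\w*|[-+*/()])")
ARITH = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


class TinyInterp:
    """Toy interpreter for 'target = expr' with +, -, *, / and parentheses."""

    def __init__(self):
        self.names = {}    # variable name -> slot index (the slow lookup)
        self.slots = []    # variable values, addressed by slot index
        self.cache = {}    # statement text -> (target slot, RPN operations)
        self.parses = 0    # how many times a statement had to be parsed

    def slot(self, name):
        """Find or create the slot for a variable name."""
        if name not in self.names:
            self.names[name] = len(self.slots)
            self.slots.append(0.0)
        return self.names[name]

    def compile_let(self, stmt):
        """Parse a statement once into a flat list of stack operations."""
        self.parses += 1
        target, expr = stmt.split("=", 1)
        tokens = TOKEN.findall(expr)
        ops = []
        pos = 0

        def peek():
            return tokens[pos] if pos < len(tokens) else None

        def take():
            nonlocal pos
            tok = tokens[pos]
            pos += 1
            return tok

        def factor():
            tok = take()
            if tok == "(":
                expression()
                take()  # consume the closing ")"
            elif tok[0].isdigit():
                ops.append(("const", float(tok)))
            else:
                ops.append(("load", self.slot(tok)))  # name resolved here, once

        def term():
            factor()
            while peek() in ("*", "/"):
                op = take()
                factor()
                ops.append((op, None))

        def expression():
            term()
            while peek() in ("+", "-"):
                op = take()
                term()
                ops.append((op, None))

        expression()
        return self.slot(target.strip()), ops

    def run_let(self, stmt):
        """Execute a statement, compiling it only on first sight."""
        compiled = self.cache.get(stmt)
        if compiled is None:
            compiled = self.cache[stmt] = self.compile_let(stmt)
        target, ops = compiled
        stack = []
        for op, arg in ops:
            if op == "const":
                stack.append(arg)
            elif op == "load":
                stack.append(self.slots[arg])
            else:
                right = stack.pop()
                left = stack.pop()
                stack.append(ARITH[op](left, right))
        self.slots[target] = stack.pop()

    def get(self, name):
        return self.slots[self.names[name]]


interp = TinyInterp()
interp.run_let("x = 3")
interp.run_let("y = 4")
interp.run_let("xtemp = x*x - y*y + 1")
for _ in range(100):
    interp.run_let("iter = iter + 1")
print(interp.get("xtemp"), interp.get("iter"), interp.parses)
```

The loop runs the same statement 100 times but it is parsed only once, which is the effect described above. The toy does not handle unary minus, arrays or function calls.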

Here is a test program:
Option BASE 0
Dim col(7)
col(0)=RGB(0,0,0)      ' Black
col(1)=RGB(255,0,0)    ' Red
col(2)=RGB(0,255,0)    ' Green
col(3)=RGB(0,0,255)    ' Blue
col(4)=RGB(255,255,0)  ' Yellow
col(5)=RGB(0,255,255)  ' Cyan
col(6)=RGB(255,0,255)  ' Magenta
col(7)=RGB(255,255,255)' White

MODE 2 ' 320x240
Timer =0
xmin = -2.0
xmax = 1.0
ymin = -1.2
ymax = 1.2
maxiter = 32
Dim xp(319),yp(319),cp(319)
For i=0 To 319:xp(i)=i:Next
For py = 0 To 239
 Math set py,yp()
 y0 = ymin + (ymax-ymin)*py/239
 For px = 0 To 319
   x0 = xmin + (xmax-xmin)*px/319
   x = 0
   y = 0
   iter = 0
   Do While x*x + y*y <= 4 And iter < maxiter
     xtemp = x*x - y*y + x0
     y = 2*x*y + y0
     x = xtemp
     iter = iter + 1
   Loop
   If iter = maxiter Then
     c = 0
   Else
     c = (iter Mod 7) + 1
   EndIf
   cp(px)=col(c)
 Next px
 Pixel xp(),yp(),cp()
Next py
Print Timer
End


On an HDMIUSB system with OPTION RESOLUTION 640,378000 set, this takes 42.4 seconds.

However, add the magic line at the top
Option tracecache on


and.....




You can try an experimental version of the code if you want


PicoMite (2).zip


Also you can try adding OPTION PROFILING ON at the top of the program
 
Volhout
Guru

Joined: 05/03/2018
Location: Netherlands
Posts: 5859
Posted: 05:10pm 18 Apr 2026

Hi Peter,

What is the drawback? If there is no penalty, why not make it the default?

Volhout
PicomiteVGA PETSCII ROBOTS
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 11209
Posted: 05:23pm 18 Apr 2026

Increased image size, and increased RAM usage both for the image and in use. In some programs it will make no difference, or even run very slightly slower. It also has more functionality that needs active user involvement to get the best out of it. The example is a best case.
Here is a Pico RP2040 version

PicoMite.zip


Additional functionality:
OPTION TRACECACHE ON n ' n is the number of cache slots - defaults to 128. Each slot uses about 400 bytes of heap

OPTION CACHE SUB subname [,subname...]
This specifies which subroutines should be optimised. It avoids filling the limited cache with subs that don't need it. Use OPTION PROFILING ON to see the "HOT" subs. Note that the profiling report is produced when the program has an explicit END statement.

OPTION CACHE DEBUG ON
Provides a diagnostic of LET statements that can't be optimised. Having lots of these is what can slow the program down

Limitations:
Maximum 4 variable names in a statement to pass optimisation
Can't optimise a LET statement with user functions

The test program on an RP2040 @ 420MHz goes from 74 seconds to 42 with an ILI9341
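
As a rough guide to the heap cost of the slot count, the short Python sketch below multiplies the number of slots by the "about 400 bytes" figure quoted above. The function name cache_heap_bytes is made up for this example, and the per-slot figure is approximate, so treat the results as estimates only.

```python
def cache_heap_bytes(slots=128, bytes_per_slot=400):
    """Approximate trace cache heap use, from the figures quoted in the post."""
    return slots * bytes_per_slot


for n in (16, 64, 128):
    print(n, "slots ->", cache_heap_bytes(n), "bytes")
```

At the default of 128 slots that is roughly 51,200 bytes, so reducing n is worthwhile on programs with only a few hot statements.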
Edited 2026-04-19 04:06 by matherp
 
Bleep
Guru

Joined: 09/01/2022
Location: United Kingdom
Posts: 786
Posted: 06:09pm 18 Apr 2026

Hi Peter,
How about a 3X speedup! Try Bubble, attached. It was 308ms per update; even without the trace cache option on it was down to 280ms, but with the trace cache it's 109ms!!!! Starfield is a bit more sedate, at 103ms down to 72ms, about 40% faster.
No idea why Bubble is so good; presumably it's almost a perfect fit for the caching?

This is a HDMIUSBI2S board Resolution 640,378000.
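
For anyone comparing their own timings, the speedup factor is simply the old time divided by the new time. A small Python sketch using the per-update figures quoted in this post (the helper name speedup is just for illustration):

```python
def speedup(before, after):
    """Return how many times faster 'after' is than 'before'."""
    return before / after


timings = {"Bubble": (308, 109), "Starfield": (103, 72)}  # ms per update
for name, (before, after) in timings.items():
    print(f"{name}: {speedup(before, after):.2f}x")
```

That works out to roughly 2.8x for Bubble and 1.4x for Starfield.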

Regards Kevin.

bubble2350vga.zip
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 11209
Posted: 06:35pm 18 Apr 2026

Here is the cache debug and profile report for Bubble running for 50 iterations:
[TC-BAD] (top): n(g)= RGB( 0,255,0)
[TC-BAD] (top): n(g)= RGB( a* 3.93,g* 2.575,128* (a+ g< 65))
[TC-BAD] (top): n(g)= RGB( 255,255,0)
[PERF] elapsed=5777109 us  statements=1711967  findvar=136196 (locals=0 [0%] globals=136196 [100%])  user_subs=0
[PERF] tracecache: enabled=1 size=64 replays=1323401 compiles_ok=8 compiles_bad=3
[PERF] tracecache: lookup_null=0 alloc_fail=0 optin_skip=0
[PERF] top commands by dispatch count:
    1330007  Let
     339966  Next
      13200  Math
       7550  If
       6030  EndIf
       3417  For
       3366  Memory
       3350  Inc
       3300  Pixel
       1520  Else
         52  FRAMEBUFFER
         51  CLS
         50  Loop
         50  Print
         50  Timer
          3  Dim
          1  Do
          1  End
          1  Const
          1  Font

As you can see, of 1330007 LET statements, 1323401 were replayed from the cache ("compiled"), so there is no parsing when those statements are executed. This is where the huge speed-up comes from.
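
If you want to pull the numbers out of a report like this, the Python sketch below reads the key=value pairs from two of the [PERF] lines and works out the share of LET dispatches served from the cache. The helper perf_fields is a made-up name, and reading "replays" as "LET executions that skipped parsing" follows the explanation above rather than any documented format.

```python
import re


def perf_fields(line):
    """Collect every key=integer pair from a [PERF] line into a dict."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}


timing = perf_fields(
    "[PERF] elapsed=5777109 us  statements=1711967  findvar=136196 "
    "(locals=0 [0%] globals=136196 [100%])  user_subs=0"
)
cache = perf_fields(
    "[PERF] tracecache: enabled=1 size=64 replays=1323401 "
    "compiles_ok=8 compiles_bad=3"
)
let_dispatches = 1330007  # the "Let" row of the dispatch count table

replay_share = cache["replays"] / let_dispatches
statements_per_second = timing["statements"] / (timing["elapsed"] / 1_000_000)

print(f"LET replayed from cache: {replay_share:.1%}")
print(f"statements per second: {statements_per_second:,.0f}")
```

For this run about 99.5% of LET dispatches were replays, with only 8 statements compiled and 3 rejected.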
 
toml_12953
Guru

Joined: 13/02/2015
Location: United States
Posts: 602
Posted: 06:47pm 18 Apr 2026

  matherp said  For some programs....

I've been playing (together with my mate), looking at every avenue possible to speed up MMBasic, and the answer is there is very little sensible left to do unless...

What is the most used statement in most programs? Answer: LET

I wondered if we could cache LET statements so that there was no variable lookup except the first time in and no recursive parsing - sort of compile them on the fly.

and it works



Some BASIC interpreters also pre-compile the address for NEXT, LOOP, DATA, GOTO and GOSUB so that lookup is only done once when the statement is first encountered. That can speed up things quite a bit, especially when the loop occurs toward the end of a very long program. Normally the interpreter has to search from the beginning of the program every time a transfer is made but precompiling allows it to jump directly to the target statement.
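
A minimal Python sketch of that technique, for illustration only: the first jump from a given statement searches the program for its target, and the result is stored so later jumps from the same statement go straight there. The class JumpCache and its fields are hypothetical and do not describe any particular interpreter.

```python
class JumpCache:
    """Resolve jump targets by linear search once, then reuse the result."""

    def __init__(self, program):
        self.program = program   # list of (label or None, statement text)
        self.targets = {}        # index of jumping statement -> target index
        self.scans = 0           # number of full searches performed

    def resolve(self, at, label):
        """Return the index of 'label' for the jump located at index 'at'."""
        if at not in self.targets:
            self.scans += 1
            for index, (candidate, _text) in enumerate(self.program):
                if candidate == label:
                    self.targets[at] = index
                    break
            else:
                raise KeyError(label)
        return self.targets[at]


program = [(None, "x = 0"), ("top", "x = x + 1"), (None, "GOTO top")]
jumps = JumpCache(program)
for _ in range(1000):
    target = jumps.resolve(2, "top")
print(target, jumps.scans)
```

The search from the start of the program happens once, however many times the jump is taken.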
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 11209
Posted: 06:50pm 18 Apr 2026

NEXT and LOOP already do this. GOTO and GOSUB are deprecated, and DATA isn't an issue for performance. If I take this any further, the next step would be the test in IF statements and the implied LET after THEN.
Edited 2026-04-19 04:51 by matherp
 
Bleep
Guru

Joined: 09/01/2022
Location: United Kingdom
Posts: 786
Posted: 06:55pm 18 Apr 2026

Hi Peter,
How do you get the extra info? When I put Option Cache Debug On at the top of my code, all I got was the first few lines starting with [TC-BAD], and none of the rest. I'm not sure what the [TC-BAD] lines are supposed to be telling me anyway, but they are not in the critical part of the loop, so they don't really matter.
Kevin.
 
Bleep
Guru

Joined: 09/01/2022
Location: United Kingdom
Posts: 786
Posted: 07:23pm 18 Apr 2026

I have no idea if this is feasible or worthwhile from a memory usage point of view, but in both the programs above only one or two lines are worth caching. So rather than a blanket statement at the top, would a 'caching start' / 'caching end' pair be feasible, and would it use less memory?
Regards Kevin.
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 11209
Posted: 07:30pm 18 Apr 2026

That is all it caches. In your program you could use OPTION TRACECACHE ON 16 which would hardly use memory.
Using OPTION CACHE SUB you can also control which subs are cached. Any not mentioned are not.
 
Bleep
Guru

Joined: 09/01/2022
Location: United Kingdom
Posts: 786
Posted: 07:40pm 18 Apr 2026

Yes, I already tried limiting the tracecache to 20, which gives 110ms, so almost the full speedup. :-)
 
PhenixRising
Guru

Joined: 07/11/2023
Location: United Kingdom
Posts: 1845
Posted: 09:09pm 18 Apr 2026

Should this work on my RP2350 DIL board? It makes the COM port disappear.
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 11209
Posted: 09:58pm 18 Apr 2026

  Quote  Should this work on my RP2350 DIL board? It makes the COM port disappear.

I haven't posted a version for the RP2350 other than HDMIUSB.
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4367
Posted: 11:30pm 18 Apr 2026

Hi Peter,

Sounds fun,

But without knowing the internals ...

Can it cope with LOCAL variables and recursion where multiple "versions" of a variable can be stored in the variable table at different "levels"?

What about people playing "silly buggers" by using ERASE and (re)DIM which potentially changes where in the variable table a given variable is stored?

Hoping you are smarter than I,

Tom
Edited 2026-04-19 09:31 by thwill
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 11209
Posted: 07:37am 19 Apr 2026

  Quote  Can it cope with LOCAL variables and recursion where multiple "versions" of a variable can be stored in the variable table at different "levels"?

Yes, the top level can get optimised but the recursive call won't be. This is an example of where enabling could give slightly lower performance. Tested with a simple recursive factorial program.
  Quote  What about people playing "silly buggers" by using ERASE and (re)DIM which potentially changes where in the variable table a given variable is stored?

Probably not at the moment - untested. It is definitely experimental at the moment. I've attached the code for you to have a look at - clever stuff. The use of ERASE and/or re-DIM could easily be handled by invalidating the whole cache, but a better solution would be preferable.
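
To show why cached variable positions go stale, and what "invalidating the whole cache" would mean, here is a small Python sketch. It is an illustration under assumptions, not the code in the attached zip: the class VarTable, the list-based table and the idea that erasing a variable shifts later entries down are all made up for the example.

```python
class VarTable:
    """Toy variable table where cached slot indexes can go stale on erase."""

    def __init__(self):
        self.names = []    # slot index -> variable name
        self.values = []   # slot index -> value
        self.cache = {}    # statement id -> slot index resolved earlier

    def find(self, name):
        """Slow path: search by name, creating the variable if needed."""
        if name not in self.names:
            self.names.append(name)
            self.values.append(0)
        return self.names.index(name)

    def cached_slot(self, stmt_id, name):
        """Fast path: reuse the slot resolved the first time through."""
        if stmt_id not in self.cache:
            self.cache[stmt_id] = self.find(name)
        return self.cache[stmt_id]

    def erase(self, name):
        """Remove a variable; later slots move down, so drop every cache entry."""
        index = self.names.index(name)
        del self.names[index]
        del self.values[index]
        self.cache.clear()


table = VarTable()
table.find("a")
before = table.cached_slot(1, "b")   # "b" lives in slot 1
table.erase("a")                     # "b" moves to slot 0, cache emptied
after = table.cached_slot(1, "b")    # resolved again, now slot 0
print(before, after)
```

Without the cache.clear() call the second lookup would still return the old slot, which no longer holds "b". Clearing everything is simple but throws away entries that were still valid, which is why a finer-grained scheme would be preferable.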

MMtrace.zip

Note: the header text in the C file is completely out of date with respect to the current capabilities. This was built in increments using Claude 4.7 - very expensive to use but largely infallible in terms of each step meeting the brief and working first try.
 
PhenixRising
Guru

Joined: 07/11/2023
Location: United Kingdom
Posts: 1845
Posted: 08:30am 19 Apr 2026

Fascinating. Never a dull moment  
 
lizby
Guru

Joined: 17/05/2016
Location: United States
Posts: 3740
Posted: 02:35pm 19 Apr 2026

  matherp said  built in increments using Claude 4.7 - very expensive to use but largely infallible in terms of each step meeting the brief and working first try.


How expensive is "very expensive"? I spent about $5USD a day for 5 days to work out some deep bugs--fairly intensive use, but far from "all day, every day". Expensive for a retired person doing it for recreation, but not for paid development.

Since that stretch I've managed lighter use on the monthly subscription--several times hitting a period limit and then doing other things for several hours until the next period rolled around.

~
Edited 2026-04-20 00:35 by lizby
PicoMite, Armmite F4, SensorKits, MMBasic Hardware, Games, etc. on FOTS
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 11209
Posted: 04:06pm 19 Apr 2026

  Quote  How expensive is "very expensive"?

USD26 in a day  
 
lizby
Guru

Joined: 17/05/2016
Location: United States
Posts: 3740
Posted: 04:17pm 19 Apr 2026

Wow. We should take up a collection for you. As if your time wasn't already of nearly inestimable value. I would be more than happy to contribute to a GoFundMe.

I understand and appreciate your insistence on not getting paid for the MMBasic development you do, but we could at least help to cover actual expenses at this scale.
PicoMite, Armmite F4, SensorKits, MMBasic Hardware, Games, etc. on FOTS
 
PhenixRising
Guru

Joined: 07/11/2023
Location: United Kingdom
Posts: 1845
Posted: 04:49pm 19 Apr 2026

  lizby said  Wow. We should take up a collection for you. As if your time wasn't already of nearly inestimable value. I would be more than happy to contribute to a GoFundMe.

I understand and appreciate your insistence on not getting paid for the MMBasic development you do, but we could at least help to cover actual expenses at this scale.


Count me in  

Probably not quite as costly as having a plane though 🤣😂
 
The Back Shed's forum code is written, and hosted, in Australia.
© JAQ Software 2026