Home
JAQForum Ver 24.01
Log In or Join  
Active Topics
Local Time 21:22 01 Jul 2025 Privacy Policy
Jump to

Notice. New forum software under development. It's going to miss a few functions and look a bit ugly for a while, but I'm working on it full time now as the old forum was too unstable. Couple days, all good. If you notice any issues, please contact me.

Forum Index : Microcontroller and PC projects : CMM2 V5.05.06exp: Huge performance improvement - please test

     Page 3 of 8    
Author Message
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4298
Posted: 05:53pm 21 Oct 2020
Copy link to clipboard 
Print this post

  matherp said  The NOHASH option seems like the best compromise so developers where speed is critical can choose the optimum for their application. It is a runtime only option, defaults to hashing when a program is run similar to the way OPTION EXPLICIT works


For what it is worth, I agree, thank you for adding OPTION NOHASH.

Tom
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10180
Posted: 06:09pm 21 Oct 2020
Copy link to clipboard 
Print this post

  Quote  BUG!


Try attached


CMM2V1.5.zip


  Quote  Does it happen whenever you ERASE a variable with a hash value conflict ?


No: ERASE doesn't use the hash to find the variable
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4298
Posted: 06:21pm 21 Oct 2020
Copy link to clipboard 
Print this post

  matherp said  No: ERASE doesn't use the hash to find the variable


That wasn't my point.

I figured (perhaps incorrectly) that if A and B both hash to N then ERASEing A (@N) means that B (@N+1) can no longer be located, because entry N of the table is empty and thus we won't look for B in N+1.

I will try the latest zip after the kids go to bed.

Best wishes,

Tom
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
JohnS
Guru

Joined: 18/11/2011
Location: United Kingdom
Posts: 4032
Posted: 06:26pm 21 Oct 2020
Copy link to clipboard 
Print this post

If the collisions are chained, as commonly, you'd basically move the Next pointer along.

Non-chained gets a bit more painful (exercise for the reader LOL).

John
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4298
Posted: 06:46pm 21 Oct 2020
Copy link to clipboard 
Print this post

  JohnS said  If the collisions are chained, as commonly, you'd basically move the Next pointer along.

Non-chained gets a bit more painful (exercise for the reader LOL).

John


My understanding from earlier is that he isn't using a chaining strategy he is using an open addressing strategy, hence my concern about ERASE.

Tom
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10180
Posted: 07:00pm 21 Oct 2020
Copy link to clipboard 
Print this post

  Quote  I figured (perhaps incorrectly) that if A and B both hash to N then ERASEing A (@N) means that B (@N+1) can no longer be located, because entry N of the table is empty and thus we won't look for B in N+1.


Standard approach is just to block the key with a dummy value. Then in the hash if the new test string is not found and a blocked key was found as you were searching then you can re-use it.

Better fixed version attached


CMM2V1.5.zip
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4298
Posted: 07:04pm 21 Oct 2020
Copy link to clipboard 
Print this post

  matherp said  Standard approach is just to block the key with a dummy value. Then in the hash if the new test string is not found and a blocked key was found as you were searching then you can re-use it.


Which is what I was going to suggest, though I was going to call it a zombie variable rather than a dummy value.

> Better fixed version attached.

I'll let you know how it turns out

Tom
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4298
Posted: 08:24pm 21 Oct 2020
Copy link to clipboard 
Print this post

  thwill said  
  matherp said  Better fixed version attached.


I'll let you know how it turns out


Fixes the bug, but I think whatever you've done has degraded performance for NOHASH back to the sub 1000 ZMIPS range again.

Assuming the day job doesn't interfere I'll start from scratch again tomorrow comparing RC6 with the experimental firmware with and without NOHASH.

Of course you have to do what is best for the majority of programs - even if it does seem to favour bad practice

Tom
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
PeteCotton

Guru

Joined: 13/08/2020
Location: Canada
Posts: 543
Posted: 10:45pm 21 Oct 2020
Copy link to clipboard 
Print this post

  thwill said  
Of course you have to do what is best for the majority of programs - even if it does seem to favour bad practice

Tom


I'm not disagreeing with you per-se Tom, but the way I see it, this is BASIC. It's never going to be as clean as a modern language.

When coding in C# or C I am very cognizant that I can afford to spend extra processing cycles to make my code clean and maintainable. Processing power ceased to be an issue a long time ago, and I will quite happily have a slightly longer path through the code (when programming in C#) if it makes the logic easier to understand. Even my embedded stuff is more likely to be waiting on real world events than spare processing cycles.

However with the CMM2 we are back to using tricks to eek the best out of the machine. With my last program I started off with nice clean code, but ended up needing to compromise in order to optimise the processing steps/cycles in each loop through the code. This isn't a complaint, quite the opposite, I was genuinely thrilled to go back to the world of trying out different methods of shaving milliseconds off functions.

I think basically what I am saying is that I suspect we can have clean, easy to maintain code, or we can have speed optimised code, but not both. The more Peter optimises the BASIC interpreter, the further we will push the hardware, and the more tricks/hacks we will employ to get more speed, and the messier our code will get.

That is not to say that the code has to be sapgetti-fied or unmaintainable, you can still write complex assembler that is well documented and maintainable, so no reason the same can't be done with BASIC. It just won't be as super organised as a modern program.

Allocating and de-allocating local variables must take up some cycles. So, to be honest, they would probably still get tossed out in favour of a global variable in a fully speed optimised program.

Like I say. I'm not disagreeing with you - I like a well structured program as much as the next person - but regardless of what "the best practice is", there will be those among us who will probably end up ignoring it and hacking the system to make it go just that one extra FPS faster....  but if speed isn't top priority, then the system is still fast enough to enable clean well structured code Just my two pennies!
 
chris
Regular Member

Joined: 24/08/2020
Location: United Kingdom
Posts: 56
Posted: 11:33pm 21 Oct 2020
Copy link to clipboard 
Print this post

The speedup sounds brilliant. It's good to see an optimisation pass.

I have some questions for Peter, and these are not criticisms at all, just items of curiosity.

1) Are you considering any more optimization tricks?
2) Why do variables need to be looked up via hash or name at all? Could they not be assigned an id in a pre-processing step, or maybe serialized alongside the .bas file on the sd card?
3) Do you ever see a day when MMBasic will turn into a BASIC compiler? Even a compiler with no optimization pass done at all of course would be an order of magnitude faster. Do you view this as a 'never' feature, or a 'maybe' feature?

Anyway, really excited to hear about the speedup. Exciting to read that Gauntlet is running at 30fps now, and I do think that game is probably at the upper edge of what a 2D games needs in terms of per-frame sprites / collision logic.

I'm really glad that the CPU seems to have a lot of headroom, and look forward to seeing here MMBasic goes in the future.

 
LeoNicolas

Guru

Joined: 07/10/2020
Location: Canada
Posts: 503
Posted: 12:22am 22 Oct 2020
Copy link to clipboard 
Print this post

  PeteCotton said  
  thwill said  
Of course you have to do what is best for the majority of programs - even if it does seem to favour bad practice

Tom


I'm not disagreeing with you per-se Tom, but the way I see it, this is BASIC. It's never going to be as clean as a modern language.

When coding in C# or C I am very cognizant that I can afford to spend extra processing cycles to make my code clean and maintainable. Processing power ceased to be an issue a long time ago, and I will quite happily have a slightly longer path through the code (when programming in C#) if it makes the logic easier to understand. Even my embedded stuff is more likely to be waiting on real world events than spare processing cycles.

However with the CMM2 we are back to using tricks to eek the best out of the machine. With my last program I started off with nice clean code, but ended up needing to compromise in order to optimise the processing steps/cycles in each loop through the code. This isn't a complaint, quite the opposite, I was genuinely thrilled to go back to the world of trying out different methods of shaving milliseconds off functions.

I think basically what I am saying is that I suspect we can have clean, easy to maintain code, or we can have speed optimised code, but not both. The more Peter optimises the BASIC interpreter, the further we will push the hardware, and the more tricks/hacks we will employ to get more speed, and the messier our code will get.

That is not to say that the code has to be sapgetti-fied or unmaintainable, you can still write complex assembler that is well documented and maintainable, so no reason the same can't be done with BASIC. It just won't be as super organised as a modern program.

Allocating and de-allocating local variables must take up some cycles. So, to be honest, they would probably still get tossed out in favour of a global variable in a fully speed optimised program.

Like I say. I'm not disagreeing with you - I like a well structured program as much as the next person - but regardless of what "the best practice is", there will be those among us who will probably end up ignoring it and hacking the system to make it go just that one extra FPS faster....  but if speed isn't top priority, then the system is still fast enough to enable clean well structured code Just my two pennies!


Yep Peter, we are back to the 80's/90's trying to squeeze some more ms from the machine lol

I like to use local variables to better organization, but you are right about the allocating and deallocating process. Maybe does use static local variables will be better?
 
panky

Guru

Joined: 02/10/2012
Location: Australia
Posts: 1114
Posted: 12:30am 22 Oct 2020
Copy link to clipboard 
Print this post

@PeteCotton,

Interesting and thoughtful comments Pete,   I think sometimes it is easy to forget how fundamentally different an interpreter is to a compiler. Perhaps this should be taken into account rather than 'modern' v 'old' and 'clean' v 'messy'.

Doug.
... almost all of the Maximites, the MicromMites, the MM Extremes, the ArmMites, the PicoMite and loving it!
 
Tinine
Guru

Joined: 30/03/2016
Location: United Kingdom
Posts: 1646
Posted: 04:27am 22 Oct 2020
Copy link to clipboard 
Print this post

@PeteCotton

  Quote  
I was genuinely thrilled to go back to the world of trying out different methods of shaving milliseconds off functions.




Precisely why my grandkids are getting MX170s and not RPi
 
vegipete

Guru

Joined: 29/01/2013
Location: Canada
Posts: 1127
Posted: 05:12am 22 Oct 2020
Copy link to clipboard 
Print this post

Don't know if this is specific to this experimental vesrion, but I'm seeing something odd with BOX and the colour white.

With the following sequence: (typed from the prompt, mode 1,8)
> box 380,130,90,90,1,rgb(white),rgb(red)
> box 400,150,50,50,4,rgb(white),rgb(white)
> box 400,150,50,50,1,rgb(white)
the white border of the second command is not being drawn. Instead, the border is transparent. This may only be the case when the fill is white.
Visit Vegipete's *Mite Library for cool programs.
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10180
Posted: 06:59am 22 Oct 2020
Copy link to clipboard 
Print this post

  Quote  Don't know if this is specific to this experimental vesrion, but I'm seeing something odd with BOX and the colour white.


Sorry - stupid bug I introduced during 5.05.06 betas - will fix
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4298
Posted: 10:13am 22 Oct 2020
Copy link to clipboard 
Print this post

For what it's worth here are the results of my benchmarking the latest experimental firmware:



I guess that as long as we have OPTION NOHASH then I will only grumble quietly in the background.

Peter are you / can you optimise for the case when a Sub/Function has declared no Local variables and thus you don't need to cleanup the variable table ? - or would that also require the Sub/Function to also have no parameters ?

BUG! More worryingly running "spflow" with the experimental firmware WITHOUT using of NOHASH reproducably reports a spurious "Local variable already declared" error that despite a concerted effort I have been unable to reproduce in a simple example.

Best wishes,

Tom
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4298
Posted: 10:29am 22 Oct 2020
Copy link to clipboard 
Print this post

  "PeteCotton" said  ...


All valid, but you'll still permit me to grumble given the 5-10% performance hit this imposes on my code because the "rules" have changed.

I'm not certain comparisons with programming in Assembler hold up since if "weird sh*t" happens in Assembler then it is in general the fault of the developer and under their control. Whereas with the MMBasic interpreter the "weird sh*t" can be really "weird" and out of our control.

From what I've learnt and the mental model I've built the "problem" lies with MMBasic not having a traditional stack based model. It looks to me like it may just be maintaining a record of the current subroutine depth and flagging any LOCAL variables with that depth, thus:

- variable lookup in a subroutine depth N is always slower because if it first finds a global variable it has to keep searching to see if there is a LOCAL variable of the same name flagged with N.

- leaving a subroutine requires all variables flagged with N to be removed from the table, previously that just required the interpreter to only check up to the number of variables that were declared, but in the experimental firmware it has to check all the slots in the hash-table.

It's possible this lack of stack is a relic of MMBasic's history on lower resource micro-controllers ... but then again the BBC Micro had a stack based model ... with its own peculiarities, e.g. creating a LOCAL variable also resulted in creation of a global variable of the same name.

Anyway this is all just intellectual speculation AND MAY BE WRONG, I don't actually expect any changes to be made.

Best wishes,

Tom
Edited 2020-10-22 20:30 by thwill
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10180
Posted: 10:42am 22 Oct 2020
Copy link to clipboard 
Print this post

How do I run the benchmark in zmim?
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4298
Posted: 10:51am 22 Oct 2020
Copy link to clipboard 
Print this post

  matherp said  How do I run the benchmark in zmim?


For the optimised version:

RUN "/zmim/zmim.bas"
Select "minizork.z3"
Answer "N" at the prompt to write a script
At the >> prompt type *replay
Select "bench.scr"


For the unoptimised version, the same except:
RUN "/zmim/src/main.bas"


Note that the timing will not always be exactly the same, I think this is due to a random element in the interaction with the Troll that happens just before the Quit.

Thanks for spending your time on this,

Tom
MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10180
Posted: 11:31am 22 Oct 2020
Copy link to clipboard 
Print this post

I assume you have a 400MHz CMM2?
 
     Page 3 of 8    
Print this page
The Back Shed's forum code is written, and hosted, in Australia.
© JAQ Software 2025