TheBackShed.com

Forum Index : Microcontroller and PC projects : CMM2: performance struggles with MMBasic

Page 1 of 2

Author

Message

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 04:59pm 02 Jul 2020

Copy link to clipboard

Print this post

Hi everyone and let me preface that I love the CMM2 project and I'm excited to develop software for it so whatever I say below please just take it as a way for me to try and be helpful and not criticize. I do software for a living so I have a full appreciation of how hard it is to make software...

With that said, I'm having real issues trying to get MMBasic to perform without introducing some serious ugliness in my code and even then it only helps so much.

As some of you may have seen I've set out to write a modern, flexible BoulderDash engine for CMM2. So far I've been stuck trying to optimize graphics rendering and while things like BLIT and SPRITE commands are phenomenally fast, the Basic interpreter is not. It's seriously holding back the performance of my rendering loop. I have two version of my rendering sub implemented - one is what I would consider a "cleanroom" implementation that calls other functions, subs etc. Its main part iterates over visible tiles (240 of them since the viewport is 20x12) and figures out which need to be BLIT due to animation, movement or viewport changes.

The BLIT commands are executed in microseconds which is great... but the IF statement that runs in a tight loop takes a whopping 70ms to execute 240 times unoptimized and about 26ms after hand optimizations.

I have no idea what is taking the interpreter so long to evaluate that code. Is there some way that most commonly executed BASIC lines in MMBasic could be parsed once and the parse tree cached somewhere? I suspect the issue is the cost of tokenizing. I tripled the performance of the nested FOR loops by removing any reference to external SUB and FUNCTION calls and replacing any variable references with hardcoded values. All in the name of making the interpreter go faster. What used to take 75ms now takes 26ms.

The problem is that the rendering function code has become write-only. It's impossible to glance at it and see what it does. It's starting to become tempting to write a transplier that takes good code and turns it into a cryptic version that makes the interpreter more happy. I'd rather we didn't have to but I'm struggling to see another way for anything that is performance sensitive.

For reference here are the two versions of my render call. First the "cleanroom" version that is readable and then the hand optimized version that isn't.

This runs very slow but is legible:

SUB RENDERSCREEN
STATIC lastAnimClock = 0
TIMER = 0
LOCAL deltaX = 0
LOCAL deltaY = 0
IF viewportX <> targetViewportX THEN
deltaX = SGN(targetViewportX - viewportX)*SCROLL_STEP
BLIT deltaX,0,0,0,VIEWPORT_W_PX-deltaX,VIEWPORT_H_PX,1
viewportX = viewportX + deltaX
ENDIF
IF viewportY <> targetViewportY THEN
deltaY = SGN(targetViewportY - viewportY)*SCROLL_STEP
BLIT 0,deltaY,0,0,VIEWPORT_W_PX,VIEWPORT_H_PX-deltaY,1
viewportY = viewportY + deltaY
ENDIF
LOCAL elapsed1 = TIMER
LOCAL n,m, evalCount
LOCAL tile, refreshHorizEdge, refreshVertEdge
LOCAL blitCount = 0
LOCAL leftEdge = FIX(viewportX / TILE_SIZE)
LOCAL rightEdge = FIX((viewportX + VIEWPORT_W_PX - 1)/TILE_SIZE)
LOCAL topEdge = FIX(viewportY / TILE_SIZE)
LOCAL bottomEdge = FIX((viewportY + VIEWPORT_H_PX - 1)/TILE_SIZE)
FOR n = topEdge TO bottomEdge
FOR m = leftEdge TO rightEdge
evalCount = evalCount + 1
tile = room(m+1, n+1)
refreshHorizEdge = (m = leftEdge AND deltaX < 0) OR (m = rightEdge AND deltaX > 0)
refreshVertEdge = (n = topEdge AND deltaY < 0) OR (n = bottomEdge AND deltaY > 0)
IF (GETSPRITEFRAMES(tile) > 1) OR GETCURRENT(tile) = 0 OR refreshHorizEdge OR refreshVertEdge THEN
DRAWTILE tile, -viewportX+(m*TILE_SIZE), -viewportY+(n*TILE_SIZE),animClock
room(m+1, n+1) = SETCURRENT(tile)
blitCount = blitCount + 1
ENDIF
NEXT m
NEXT n
BOX 0,MM.VRES-(TILE_SIZE/2),MM.HRES,TILE_SIZE/2,1,RGB(BLUE),RGB(BLUE)
LOCAL elapsed2 = TIMER
LOCAL msg AS STRING = "blits: " + STR$(blitCount) + " evals: " + STR$(evalCount) + " elapsed1: " + STR$(elapsed1) + " elapsed2: " + STR$(elapsed2) + " "
TEXT 0,192, msg, "LT", 7, 1, RGB(WHITE), RGB(BLUE)
PAGE COPY 1 TO 0
lastAnimClock = animClock
END SUB

This runs 3x as fast but is totally criptic due to inlining and variable substitutions:

SUB RENDERSCREEN
STATIC lastAnimClock = 0
TIMER = 0
LOCAL dX = 0
LOCAL dY = 0
IF viewportX <> targetViewportX THEN
dX = SGN(targetViewportX - viewportX)*SCROLL_STEP
BLIT dX,0,0,0,VIEWPORT_W_PX-dX,VIEWPORT_H_PX,1
viewportX = viewportX + dX
ENDIF
IF viewportY <> targetViewportY THEN
dY = SGN(targetViewportY - viewportY)*SCROLL_STEP
BLIT 0,dY,0,0,VIEWPORT_W_PX,VIEWPORT_H_PX-dY,1
viewportY = viewportY + dY
ENDIF
LOCAL elapsed1 = TIMER
LOCAL n,m,ec,t
LOCAL bc = 0
LOCAL le = 1+FIX(viewportX / TILE_SIZE)
LOCAL re = 1+FIX((viewportX + VIEWPORT_W_PX - 1)/TILE_SIZE)
LOCAL te = 1+FIX(viewportY / TILE_SIZE)
LOCAL be = 1+FIX((viewportY + VIEWPORT_H_PX - 1)/TILE_SIZE)
FOR n=te TO be
FOR m=le TO re
ec=ec+1
t=room(m,n)
IF(((t AND 1030792151040)>>36)>1)OR(t AND 288230376151711744)=0 OR(m=le AND dX<0)OR(m=re AND dX>0)OR(n=te AND dY<0)OR(n=be AND dY>0) THEN
BLIT 16*(animClock MOD((t AND 1030792151040)>>36)),16*(((t AND 64424509440)>>32)-1),-viewportX+((m-1)*16),-viewportY+((n-1)*16),16,16,(t AND 4026531840)>>28
room(m,n)=t OR 288230376151711744
bc=bc+1
ENDIF
NEXT
NEXT
BOX 0,MM.VRES-(TILE_SIZE/2),MM.HRES,TILE_SIZE/2,1,RGB(BLUE),RGB(BLUE)
LOCAL elapsed2 = TIMER
LOCAL msg AS STRING = "blits: " + STR$(bc) + " evals: " + STR$(ec) + " elapsed1: " + STR$(elapsed1) + " elapsed2: " + STR$(elapsed2) + " "
TEXT 0,192, msg, "LT", 7, 1, RGB(WHITE), RGB(BLUE)
PAGE COPY 1 TO 0
lastAnimClock = animClock
END SUB

I bemoan that as nice as the CMM2 platform is that it isn't open source. I'm pretty good at helping out with code optimization and if MMBasic were on github I would love to experiment with implementing a parse tree cache of some kind. My hunch is that it would make a lot of difference in tight loops like this and make my hand inlining unnecessary.

Before I wrap up, I want to preempt any challenges around the logic of the render loop. Yeah, I there are probably some tricks one might use in a BDash engine to avoid scanning the whole viewport on every render. I just don't think we should have to rely on them because an IF that runs 200 times in a loop should not be a performance hog.

I include in the zip file all the code I've written to date. You can fire it up by running game.bas or game_orig.bas for the fast and the slow version respectively.
They don't do very much yet beside animating some tiles and panning around the viewport (use the cursor keys to do that). It demonstrates though the dramatic difference that hand optimizing around the interpreter can have.

Again I beg that some cycles are devoted in the future betas that optimize the interpretation costs since we have no way to drop down to ASM or C on CMM2 to optimize the most time sensitive routines.

BDASH.zip

thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4308

Posted: 05:15pm 02 Jul 2020

Copy link to clipboard

Print this post

Hey "abraxas",

As it happens I am writing the basics of an MMBasic transpiler, I just didn't know that is what it was called. It's still very much WIP but I can share if you want.

Regards, Tom

MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures

thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4308

Posted: 05:19pm 02 Jul 2020

Copy link to clipboard

Print this post

Also note that BASIC does not do short circuit evaluation of boolean expressions so that IF statement may be doing more work than you expect.

Tom

MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 05:45pm 02 Jul 2020

Copy link to clipboard

Print this post

Thanks, thwill.

I would definitely take you up on this offer if you have something generic enough that one can just send a .bas file through and get an optimized version out. That said, I wish we did not have to create tools of that sort as they just complicate the workflow by adding an extra step. It's basically just like having a "compile" step except it doesn't have as big a payoff as a real compile...

still hoping that we can get something more substantial from the creators as even my hand optimized loop is not keeping up with the frame rate - it takes 30-35ms which means I can't come close to hitting 60/75fps and loop over my viewport tiles.

matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10273

Posted: 06:04pm 02 Jul 2020

Copy link to clipboard

Print this post

The source of MMBasic has been provided to Geoff and either is or will be available for download from MMBasic.com. You are fully entitled to download it for your own use and if you can find worthwhile optimisations that do not destabilise it or break existing programs then Geoff and I will be happy to consider them for inclusion.

In terms of simple optimisations of your code, choose variable names that are unique in the first 4 characters. The parser matches 4 characters at a time and there is no advantage at all to variable names less than 4 characters. e.g. use ydelta xdelta rather than deltax deltay

As thwill notes Basic does not do short circuit evaluation, and nor should it, so break the complex if statement up to eliminate as many options as possible as soon as possible.

As you note the graphics commands are fast so rather than using complex logic deciding what to draw, draw everything so that complex conditional Basic statements are minimised. Use the multiple video pages more and make sure your tiles don't overlap by even a single character. BLIT is more optimised when the source and destination areas are non-overlapping or on different pages.

Finally when Mauro publishes his code spend time to look at it. I have no idea how he gets the performance he does but something he does clearly works.

Quote I suspect the issue is the cost of tokenizing.

No: This happens once when you type "RUN"

CaptainBoing

Guru

Joined: 07/09/2016
Location: United Kingdom
Posts: 2170

Posted: 06:28pm 02 Jul 2020

Copy link to clipboard

Print this post

break down those big multi-condition If statements to eliminate the big things first. What I mean is instead of

If this And that And theother Then

Try doing this

If This Then
�If that Then
� �If theother Then
...

won't look great but a major step in accelerating things because you can jump out of the if having checked a single thing rather than forcing all three to be checked before a decision can be made

check this, same meat different gravy: https://www.youtube.com/watch?v=FnfAUheRnfM
Edited 2020-07-03 04:38 by CaptainBoing

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 06:32pm 02 Jul 2020

Copy link to clipboard

Print this post

matherp said The source of MMBasic has been provided to Geoff and either is or will be available for download from MMBasic.com. You are fully entitled to download it for your own use and if you can find worthwhile optimisations that do not destabilise it or break existing programs then Geoff and I will be happy to consider them for inclusion.

I've tried to download it once and filled out the form on this page (http://mmbasic.com/source.html) but never got an email with a link to a download. I'd love to tinker and see if I can help out (chances are 50/50 as compilers/interpreters are not an area I have much experience with but I've written a couple domain specific languages in the distant past).

I'll think some more about how I can increase blits while cutting down on logic in loops. That said, with tile based games it's a bit of a painful tradeoff as it's super convenient to just keep the state in each viewport tile and then ask in the render function "do you need to refresh?". Note that I'm already doing 90% of that as I'm blitting almost the entire viewport when panning around with cursor keys. That part is wicked fast (the first "elapsed" variable measures that). The bulk of time is spent in the nested FOR loops though which basically contain just one IF statement and a BLIT if IF evaluates to true. And here again it's the IF and not the BLITs that takes up almost all of the CPU time.

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 06:41pm 02 Jul 2020

Copy link to clipboard

Print this post

CaptainBoing said break down those big multi-condition If statements to eliminate the big things first. What I mean is instead of

If this And that And theother Then

If This Then
�If that Then
� �If theother Then
...

won't look great but a major step in accelerating things because you can jump out of the if having checked a single thing rather than forcing all three to be checked before a decision can be made

Unfortunately my IF expression is a series of OR statements so I can't really shunt evaluation this way. I could perhaps try this:

IF This THEN BLIT....
ELSEIF That THEN BLIT....
ELSEIF TheOther THEN BLIT...
ENDIF

Ugly because BLIT... will be the same blit statement repeated 3-4 times

I'm uncertain that it will run any faster than my current layout:

IF This OR That OR TheOther THEN BLIT...

because that looks like even more work for the interpreter but it's possible it will make a non-trivial difference. One way or the other. I'll try it when I'm off work and free to tinker with it tonight. Still not sure if it will give me the rocket boost I'm hoping for and put me under 13ms total render time (in order to keep up with the monitor's refresh rate).

vegipete

Guru

Joined: 29/01/2013
Location: Canada
Posts: 1129

Posted: 03:52am 03 Jul 2020

Copy link to clipboard

Print this post

I had a bit of a play this afternoon and created the attached.

Probably not useful for what you are doing, because I did none of the game engine you've created for you port of Boulder Dash, but maybe marginally interesting none the less.

I rearranged your sprites a bit, then drew the full world tile map on 4 pages. Four blit operations are then used to grab the required pieces for the desired view port. (Note a blit with a height or width of zero conveniently does nothing.)

Colour animation is done not by changing sprites but rather by changing the colour map.

BoulderDash.zip

Visit Vegipete's *Mite Library for cool programs.

thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4308

Posted: 10:36am 03 Jul 2020

Copy link to clipboard

Print this post

abraxas said Thanks, thwill.

I would definitely take you up on this offer if you have something generic enough that one can just send a .bas file through and get an optimized version out ...

Hmmm, yeah, I'm afraid I'm not capable of *magic*

The tool's main goal is to allow me to automate conversion of Z-MIM from CMM2 -> CMM1 and as such has the following functions:

1. Flatten out #Includes
- actually supports a hierarchy of #Include which MMBasic currently doesn't

2. Reformat / pretty-print code
- consistently apply whitespace / indenting
- or remove it completely, including comments
- I understand this is of no benefit on the CMM2 itself since it pre-tokenises when it loads the program onto the flash

3. Annotate code to comment or uncomment sections based on the setting of flags using the following annotations:
- '!set foo
- '!clear foo
- '!comment_if foo
- '!uncomment_if foo
- '!comment_if_not foo
- '!uncomment_if_not foo
- '!endif

4. Annotate code to replace tokens:
- '!replace foo bar
- useful for shrinking long identifiers and inlining constants

Note that the design is such that an unprocessed but annotated file should still be valid and executable MMBasic, hence all the annotations beginning with the comment mark '.

To achieve this it implements a pretty complete though no doubt buggy lexer for MMBasic upon which more powerful transformations could be built.

Best regards,

Tom

MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 02:38pm 03 Jul 2020

Copy link to clipboard

Print this post

vegipete said I had a bit of a play this afternoon and created the attached.

Probably not useful for what you are doing, because I did none of the game engine you've created for you port of Boulder Dash, but maybe marginally interesting none the less.

I rearranged your sprites a bit, then drew the full world tile map on 4 pages. Four blit operations are then used to grab the required pieces for the desired view port. (Note a blit with a height or width of zero conveniently does nothing.)

Colour animation is done not by changing sprites but rather by changing the colour map.

BoulderDash.zip

Thank you for taking the time to play with this idea, vegipete. I contemplated this approach early on (not implemented, just contemplated). The reason I pursued my approach was that I wanted to have truly animated tiles like butterflies, bombs, lava tiles etc. The colour shift trick will only really work for diamonds, right?

Also a short update on my meager progress. In anger I cut out all the IF logic and just went with 240 blits in a very tight loop. That actually brought down the rendering time to about 25ms (up to 30-40 when panning around). This allows me to animate all objects smoothly regardless of what or where they are at a 20-40 fps.

That's usable... except it left me with not enough time to actually compute the game states. Normally Boulder Dash computes the next game state by iterating the entire room and doing simple checks on each tile (e.g. am I a stone and is there empty void immediately below me etc).

I know that Mauro thinks it can be done differently by keeping tabs on just the objects that might change in the current game frame. I don't know how to keep track of that so I'm working to the assumption that I'm right and Mauro is wrong (yeah, slim chance, I know). However, if I do go with my approach I have to compute over 800 tiles - 20 rows of 40 tiles (just for Peter Liepa levels).

Bad news, here's why.

According to my preliminary estimates that will take between 5 and 10ms per row depending how efficient I can be about computing every tile type. Some tiles like voids, dirt and steel walls have pretty much no logic. Others though like stones, need to do a check around them to see if they need to start moving, exploding etc. So it looks like it will average to about 5-10ms per row. At 20 rows that's 100ms in addition to the 25ms that my rendering loop already consumes. if I simply do one render, one row compute, one render and so on, it means my single render will take 30-40ms and I will need 20 repeats of that to transition from the current game state to the next. If you're keeping track that's in the realm of unplayable - game states will transition once every 600-800ms which will make for a BoulderDash slower than a ZX Spectrum one that ticked at about 500ms.

I really hoped for a lot more than that.

Which brings us right back to your technique, vegiepete. I love the idea of paining the whole screen with just four BLIT calls but I still have no idea how to deal with animations given that I don't want to be restricted to just colormap shift based "animation".

Back to the drawing board for me it seems...

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 02:49pm 03 Jul 2020

Copy link to clipboard

Print this post

thwill said

To achieve this it implements a pretty complete though no doubt buggy lexer for MMBasic upon which more powerful transformations could be built.

Best regards,

Tom

Well, for my purposes the of biggest value would be function inlining (taking a body of a function and copy pasting it into every place where it's being used and also inlining of CONST variables. These two features could speed things up a lot if you're calling out to functions in tight loops.

thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4308

Posted: 03:29pm 03 Jul 2020

Copy link to clipboard

Print this post

abraxas said Well, for my purposes the of biggest value would be function inlining (taking a body of a function and copy pasting it into every place where it's being used and also inlining of CONST variables. These two features could speed things up a lot if you're calling out to functions in tight loops.

Hi Abraxas,

I can offer it "as is" or you can wait a week or so for me to iron out some of the kinks. Unfortunately I won't be implementing function inlining unless/until I need it myself as it's non-trivial.

I haven't looked specifically at your problem but my gut is telling me that optimising your existing algorithm will not be enough in any case and you will need a different approach other than brute force iteration over every tile.

Hopefully Peter won't paint himself (and us) into a corner and close the door completely on "CFunctions" / calling into native code in the future but I can understand why it isn't a priority now, and to be completely honest I quite like the fact that limitations exist.

Regards,

Tom

MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures

thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 4308

Posted: 03:36pm 03 Jul 2020

Copy link to clipboard

Print this post

abraxas said If you're keeping track that's in the realm of unplayable - game states will transition once every 600-800ms which will make for a BoulderDash slower than a ZX Spectrum one that ticked at about 500ms.

That chimes with my limited observations. In a previous thread I summised that if you exclude the high performance graphics capabilities (which in most games will more than compensate) the computation power of MMBasic on the CMM2 is 50-75% that of machine-code on an 8-bit machine, though I was comparing with a C-64.

EDIT: ... of course as well as high-performance graphics we have virtually unlimited memory which gives us the opportunity to implement faster, but memory inefficient algorithms that were not possible on 8-bit platforms.

Regards,

Tom
Edited 2020-07-04 01:38 by thwill

MMBasic for Linux, Game*Mite, CMM2 Welcome Tape, Creaky old text adventures

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 03:47pm 03 Jul 2020

Copy link to clipboard

Print this post

Yeah, CFunction or an ASM block is my dream at this point. Yeah limitations are fun up to a point. Frankly I was hoping to do more than just remake Peter Liepa's levels. I had visions of mega levels with quests etc. All that is not on the table and even a fully functional C64 style Boulder Dash may be just out of reach.

Note that for all its greatness Mauro's MaxiDash demo is only a demo with much of the functionality still missing and that functionality will eat a lot of CPU cycles.

I'm studying his code to understand how he's able to keep track of movable objects and the causality chains without scanning the entire map on every game tick. The answer may lie there. That or he may have subtle causality bugs that just aren't noticeable yet without monsters, lava, butterflies etc.

robert.rozee
Guru

Joined: 31/12/2012
Location: New Zealand
Posts: 2440

Posted: 03:52pm 03 Jul 2020

Copy link to clipboard

Print this post

hi,
just had a quick look at the first three minutes of this video on youtube:
https://www.youtube.com/watch?v=FiEVfa1OK_o

i've never heard of or played the game before, so the first 3 minutes of the video is all i know about it and how gameplay works.

it strikes me that you could simplify things by having an 'action list'. at the start of a level scan through every tile and for those that are are going to do something at a future time, add that event (including a timestamp) to the action list.

then for each pass through the main loop:
1. look at the tiles around the player (and possible all those in the column above) and add items to the action list as required. eg, if the player digs out the material under a boulder, add the dropping event for that boulder to the action list.
2. sort the action list chronologically.
3. starting at the top of the action list, process actions until you come to the first one that is not ready to happen yet. in the case of items that have ongoing activity (eg, in the case of a boulder dropping), once moved down by one tile it may be appropriate to add it back into the action list. similarly for exploding boulders, etc.
4. update the playing field as required for left/right/up/down scrolling. this may require moving some tiles off the visible screen, while others from the opposite edge are moved on.

does this help solve any of the problems? bear in mind, my entire knowledge of the game comes from just 3 minutes of youtube video!

cheers,
rob :-)

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 04:30pm 03 Jul 2020

Copy link to clipboard

Print this post

Not entirely, Rob. The problem is or at least if I understand you well that computing causal chains just above Rockford isn't enough. Further levels of the game introduce flying monsters, lava, butterflies, expanding walls etc. Those objects do and must interact with each other. Every game tick the position of every monster is changed. As a result they might touch lava and explode in the next game tick. Which may cause it to blow up the wall. Which may cause it to release an avalanche of stones, which may kill a butterfly which will release diamonds and so on. The clean way to implement this is to make a pass over the entire room once per game tick. Everything else will be a bug prone headache. At least as far as I can tell.

vegipete

Guru

Joined: 29/01/2013
Location: Canada
Posts: 1129

Posted: 05:19pm 03 Jul 2020

Copy link to clipboard

Print this post

An interesting problem for sure. In essence a sort of physics engine is involved.

I too was largely unaware of the actual game until my son and I watched a youtube video of a skilled player. It's gonna take some sneaky skill to smoothly get that working.

I suppose there are two problems to solve:
- movement/action/physics of the entire world
- animation of stuff visible on screen

For movement, it looks like only actors can instigate general movement, although as you point out, this can cascade significantly. But the list of moving tiles is still less than all tiles of the world.

For animation, perhaps an animation counter could help. Increment (and wrap) the counter each animation tick and use the value to index which tile to use when rendering the visible section.

Fractional screen scrolling adds an extra challenge. Perhaps rendering an image 1 tile larger in width and height and then blitting the desired piece to the display would help. This will shrink the image to a tile smaller than the full screen, but oh well.

--------------------
My Falfus2 game required a simple 'falling down' engine, but the total playing area is only 16x12 (ish) with no fractional positioning so the computation load is drastically less.

Visit Vegipete's *Mite Library for cool programs.

capsikin
Guru

Joined: 30/06/2020
Location: Australia
Posts: 341

Posted: 07:44am 06 Jul 2020

Copy link to clipboard

Print this post

abraxas said

It might eventually be worth putting in both types of animation, if you need to display lots of diamonds but only a few other animated tiles at once.

abraxas said
Also a short update on my meager progress. In anger I cut out all the IF logic and just went with 240 blits in a very tight loop. That actually brought down the rendering time to about 25ms (up to 30-40 when panning around). This allows me to animate all objects smoothly regardless of what or where they are at a 20-40 fps.

That's usable... except it left me with not enough time to actually compute the game states. Normally Boulder Dash computes the next game state by iterating the entire room and doing simple checks on each tile (e.g. am I a stone and is there empty void immediately below me etc).

I wonder if adding in a single simple test to avoid drawing unchanged tiles would help - I know you'd needed to remove some tests, but maybe an array lookup "IF" test could be fast enough, and reduce drawing time.

Adding a test like that to eliminate half the drawing helped speed things up when I tested a simple example on my pi-cromite, but I'm not sure if it's the same on the CMM2.

Something like:

if needsDrawing(m,n)=1 then

'Do drawing stuff here
BLIT whatever

needsDrawing(m,n) = isAnimated(m,n)
' or needsDrawing(m,n) = (t AND something) >> otherthing
' check which is faster, or at least fast enough and clearer.

endif

You'd need to set the needsDrawing array whenever something moved, to draw old and new locations, and for the tiles scrolled in when the view moves.

For example:

Quote
FOR n=te TO be
FOR m=le TO re
IF needsDrawing(m,n)=1 THEN

'maybe set t, if you need it for the blit command
t=room(m,n)

'Do drawing stuff here
BLIT whatever

needsDrawing(m,n) = isAnimated(m,n)
' or needsDrawing(m,n) = (t AND something) >> otherthing
' check which is faster, or at least fast enough and clearer.

ENDIF
NEXT
NEXT

And for the scrolled-in tiles, doing this before the loop above:

Quote
IF dY<0
FOR m=le TO re
needsDrawing(m,te)=1
NEXT
ELSEIF dY>0
FOR m=le TO re
needsDrawing(m,ce)=1
NEXT
ENDIF

IF dX<0
FOR n=te TO be
needsDrawing(le,n)=1
NEXT
ELSEIF dX>0
FOR n=te TO be
needsDrawing(re,n)=1
NEXT
ENDIF

abraxas said
I know that Mauro thinks it can be done differently by keeping tabs on just the objects that might change in the current game frame. I don't know how to keep track of that so I'm working to the assumption that I'm right and Mauro is wrong (yeah, slim chance, I know).

What I did for the display part is like a compromise between the two approaches. You still check every location, but only a single array lookup for most inactive tiles. You could do a similar thing for game logic, with an array active(m,n) for whether you need to look up the tile type and run its logic.

abraxas said
However, if I do go with my approach I have to compute over 800 tiles - 20 rows of 40 tiles (just for Peter Liepa levels).

Bad news, here's why.

According to my preliminary estimates that will take between 5 and 10ms per row depending how efficient I can be about computing every tile type. Some tiles like voids, dirt and steel walls have pretty much no logic. Others though like stones, need to do a check around them to see if they need to start moving, exploding etc. So it looks like it will average to about 5-10ms per row.

Easiest would be for only voids, dirt and steel walls to be inactive, and to leave rocks and similar things set as active, but I don't know if it would be fast enough, and it would limit how many rocks you could have before it got too slow, but it could be worth trying.

Another option would be to mark rocks and diamonds as inactive most of the time, and whenever a change happens in a tile location, mark the nearest surrounding tiles as active, so they will process their full logic when the map loop reaches their location.

For stuff like monsters I'd leave them active all the time.

-- Caspian

abraxas
Regular Member

Joined: 16/06/2020
Location: Canada
Posts: 99

Posted: 03:30pm 06 Jul 2020

Copy link to clipboard

Print this post

Caspian and others,

I'm attaching the latest version of the code. It already runs acceptably though not blazingly fast. It's visible that CMM2 struggles to keep up with rendering as you moved outside the viewport and force the viewport to "catch up". It has this delay that the ZX Spectrum version also exhibits. Now, it's possible to cure it by making the viewport "chase" Rockford one tile per move when Rockford gets close to the edge. That's how Mauro's port worked and some other 8 bits ports did (e.g. MSX). I do have an issue with it however, as it strays from the original which instead tries to center the viewport on Rockford. This leads to different tradeoffs and thus affects the gameplay. In those implementations that follow the technique that Mauro used it's easier to escape baddies that are chasing you but harder to view a larger area of the map once your viewport moves. In my version which mimics what Peter Liepa's code did it's the opposite tradeoff. Now there is a setting in my code called "SCROLL_STEP" that is currently set to 6 pixels. I can make it bigger and the viewport will update more quickly but will be more jarring.

Additionally to help with the feel of the game I bail out of the RENDER sub any time the game requires a new compute of the game loop. This change improved playability a lot but does lead to occasional visual artefacts especially when you look at diamond animations. There may be some trial and error possibly by messing with the timing of the animation clock to make it less obvious. I haven't spent much time refining that.

A lot of the ideas you and the others talked about I'd already incorporated into the code. I don't have the "active/inactive" tile marking but I do mark dirt, empty spaces and steel walls as "inert" and shunt the eval by first testing for that flag.

I'm thinking about adding a "upToDate" flag on every tile and somehow use that in rendering. The animated tiles throw a big monkey wrench in it though. Additionally I'd rather have a more consistent rendering time that's easier to account for than a wildly variable situation with huge swings in performance depending on how much bling is visible in the viewport. I will for sure experiment more with the rendering routine.

Anyway for now I have the falling behavior implemented and debugged and Rockford movement as well as viewport repositioning working together. You can �walk around the screen using the cursor keys. I'd love for anyone even remotely interested in my efforts to download the attachment and have a play. Run the "game.bas" file. I very much welcome comments and suggestions for improvements.

BDASH.zip
Edited 2020-07-07 01:36 by abraxas

Page 1 of 2

Print this page