Forum Index : Microcontroller and PC projects : CMM2 slow pixel drawing?

void
Newbie

Joined: 02/01/2021
Location: Poland
Posts: 3
Posted: 10:13am 25 Jan 2021

Hi,
I'm new to Maximite, but I've read a lot about it.
Finally (thanks to the PSLabs) I have my own!

I've started writing some experimental stuff, and I have a question: am I doing something wrong, or is pixel drawing slow on the Maximite?

The code:

mode 3, 8

start_timer = timer

for i% = 0 to 10  ' repeat 10 times
 for y% = 0 to 199
   for x% = 0 to 319
      pixel x%, y%
   next x%
 next y%
next i%

print timer - start_timer
pause 5000


This code prints a timer value of around 4600. Does that mean that drawing 64000 pixels ten times takes about 4.5 seconds?

I have a task that needs to read every pixel from the screen, do some calculations on it, and write it back to the screen. According to the code above I would get around 2 frames per second, which isn't acceptable.

So am I missing something? Am I doing something wrong?

I've tried operating on arrays, but had no luck speeding things up.

Any tips or clues?
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 3839
Posted: 10:26am 25 Jan 2021

Hi void,

I'm sure others will chime in with more detailed explanations, but the bottom line is that with your example the vast majority of CPU is being spent interpreting BASIC and very little is going into drawing pixels. This is one of those cases where you will need to resort to a CSUB and probably manipulate the screen memory directly.

If you look at the source-code for the Mandelbrot Explorer example on the Welcome Tape then you will find you can switch from the default CSUB implementation to one that just calculates and plots using BASIC and the performance difference is vast.

Best wishes,

Tom
Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 8578
Posted: 10:37am 25 Jan 2021

Your loop is outputting 704000 pixels

Quote:
mode 3, 8

start_timer = timer

for i% = 0 to 10  ' repeat 10 times
for y% = 0 to 199
  for x% = 0 to 319
     pixel x%, y%
     inc c%
  next x%
next y%
next i%

print c%,timer - start_timer
pause 5000
 
twofingers
Guru

Joined: 02/06/2014
Location: Germany
Posts: 1133
Posted: 10:41am 25 Jan 2021

Hi void,

I can't help you, but you should change your code to:
for i% = 1 to 10  ' repeat 10 times

Your assumptions are correct so far (IMHO). PIXEL is slow.

Regards
Michael
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 3839
Posted: 10:42am 25 Jan 2021

matherp said: Your loop is outputting 704000 pixels


There is that too
Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 3839
Posted: 10:44am 25 Jan 2021

twofingers said: Your assumptions are correct so far (IMHO). PIXEL is slow.


Slow compared with what? You could try unrolling parts of the loop so the interpreter doesn't spend so much time in the FOR and NEXT statements?

Maybe use the profiling in firmware 5.07 to find out how much time is being spent on the PIXEL statement compared with the others.

Tom
Edited 2021-01-25 20:45 by thwill
Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
void
Newbie

Joined: 02/01/2021
Location: Poland
Posts: 3
Posted: 10:46am 25 Jan 2021

Quote: Your loop is outputting 704000 pixels


Yes, it is. It writes 64000 pixels (a full screen) 11 times, not 10 as I said in the code comment. That takes around 4600 ms, so roughly 2.4 screens per second.

Quote: If you look at the source-code for the Mandelbrot Explorer example on the Welcome Tape then you will find you can switch from the default CSUB implementation to one that just calculates and plots using BASIC and the performance difference is vast.


Thank you for the clue. I see now how it is done in Mandelbrot Explorer.
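
To put numbers on the figures quoted above, here is a minimal C sketch of the arithmetic: 11 full-screen passes in about 4600 ms works out to roughly 418 ms per pass, a little under 2.4 screens per second. The helper names `frames_per_second` and `ms_per_frame` are made up for illustration and are not part of MMBasic or the CMM2 firmware.

```c
/* Frames per second, given the number of full-screen passes
 * and the elapsed time in milliseconds. */
double frames_per_second(int frames, double elapsed_ms) {
    return frames * 1000.0 / elapsed_ms;
}

/* Average time for one full-screen pass, in milliseconds. */
double ms_per_frame(int frames, double elapsed_ms) {
    return elapsed_ms / frames;
}
```

Calling `frames_per_second(11, 4600.0)` reproduces the estimate for the test loop; passing 10 instead of 11 shows how much the off-by-one in the loop bounds skews the result.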
 
epsilon

Senior Member

Joined: 30/07/2020
Location: Belgium
Posts: 255
Posted: 11:00am 25 Jan 2021

You're actually doing it 11 times (0 to 10 means up to and including 10... yeah, sorry), and in your test you're just writing, not reading.

If you did "pixel x%, y%, pixel(x%, y%)", i.e. a read followed by a write, you would get 7899 ms. I'm not helping, am I?

If you used POKE/PEEK BYTE instead, it's only slightly better: 6867 ms.

If your pixel operation can be 'SIMD'd' maybe you can POKE/PEEK 8 bytes at a time using PEEK/POKE INTEGER and get 1005ms.
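
As a rough illustration of what "8 bytes at a time" can look like, here is a minimal C sketch of the general packed-byte technique. It assumes the per-pixel operation is "add a constant to every 8-bit pixel", which is only one example of an operation that can be done this way; the function names `add_bytes` and `add_to_pixels` are hypothetical. The same bit tricks could in principle be applied to a 64-bit value read with PEEK INTEGER, but this sketch runs on an ordinary byte buffer, not on CMM2 video memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add delta to each of the 8 bytes packed in w, wrapping modulo 256
 * per byte, without letting a carry spill into the neighbouring byte. */
uint64_t add_bytes(uint64_t w, uint8_t delta) {
    const uint64_t HIGH = 0x8080808080808080ULL;    /* top bit of each byte */
    uint64_t d = 0x0101010101010101ULL * delta;     /* delta copied into every byte */
    /* Add the low 7 bits of each byte, then restore the top bits with XOR. */
    return ((w & ~HIGH) + (d & ~HIGH)) ^ ((w ^ d) & HIGH);
}

/* Apply the operation to a whole buffer, 8 pixels per step. */
void add_to_pixels(uint8_t *buf, size_t n, uint8_t delta) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        uint64_t w;
        memcpy(&w, buf + i, 8);     /* load 8 pixels */
        w = add_bytes(w, delta);
        memcpy(buf + i, &w, 8);     /* store 8 pixels */
    }
    for (; i < n; i++) {            /* leftover pixels, one at a time */
        buf[i] = (uint8_t)(buf[i] + delta);
    }
}
```

The point of the masking is that one 64-bit addition then stands in for eight separate byte additions, which is where the saving in interpreted statements would come from.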

It can be made very fast using CSUBs, but that's probably not the answer you're looking for.

What frame rate are you targeting?
Epsilon CMM2 projects
 
jirsoft

Guru

Joined: 18/09/2020
Location: Czech Republic
Posts: 532
Posted: 11:16am 25 Jan 2021

Exactly, as was already mentioned, unroll the loop and use POKE/PEEK with 8 bytes at once, so you can calculate at least one complete pixel row...
Jiri
Napoleon Commander and SimplEd for CMM2 (GitHub),  CMM2.fun
 
void
Newbie

Joined: 02/01/2021
Location: Poland
Posts: 3
Posted: 11:21am 25 Jan 2021

About 11 iterations vs 10 - just let it go :). I made a mistake, I know, but it changes nothing about the slow pixel drawing.

I've tried using a single loop, and I've tried POKEing instead of PIXEL, without much luck.

What is my goal? Well, I was trying to have some fun with pixel-generated demoscene effects like plasma, fire, blur, bump mapping, crossfading, etc. Basically anything related to palette rotation and/or pixel value manipulation.

So my target framerate is as high as possible.
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 8578
Posted: 12:06pm 25 Jan 2021

Here is the profile of your program

        1,    109716.0,  "Mode 3,8",,1
        1,         6.0,  "START_TIMER =Timer",,3
        1,        10.0,  "For I% =0 To 10",,5
       11,         7.0,  "For Y% =0 To 199",,6
     2200,         5.0,  "For X% =0 To 319",,7
   704000,         5.0,  "Pixel X%,Y%",,8
   704000,         1.0,  "Inc C%",,9
   704000,         1.0,  "Next X%",,10
     2200,         1.0,  "Next Y%",,11
       11,         1.0,  "Next I%",,12
        1,        91.0,  "Print C%,Timer -START_TIMER",,14
        0,         0.0,  "Pause 5000",,15
 
thwill

Guru

Joined: 16/09/2019
Location: United Kingdom
Posts: 3839
Posted: 12:08pm 25 Jan 2021

@matherp Something screwy with the statistics gathered for the PAUSE call at the end?

Tom
Game*Mite, CMM2 Welcome Tape, Creaky old text adventures
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 8578
Posted: 12:31pm 25 Jan 2021

The last line is always incorrect as there isn't a line after it to complete the timing. Easily solved by adding an END statement after it:

        1,    115391.0,  "Mode 3,8",,1
        1,         5.0,  "START_TIMER =Timer",,3
        1,        10.0,  "For I% =0 To 10",,5
       11,         6.6,  "For Y% =0 To 199",,6
     2200,         5.0,  "For X% =0 To 319",,7
   704000,         5.0,  "Pixel X%,Y%",,8
   704000,         1.0,  "Inc C%",,9
   704000,         1.0,  "Next X%",,10
     2200,         1.0,  "Next Y%",,11
       11,         1.0,  "Next I%",,12
        1,        90.0,  "Print C%,Timer -START_TIMER",,14
        1,   4999799.0,  "Pause 5000",,15
        0,         0.0,  "End ",,16
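
One way to read the profile is sketched below in C. It assumes the first column is the execution count and the second is the average time per execution in microseconds; that reading is inferred from the `Pause 5000` row (one execution, about 5 seconds) and is not confirmed by the profiler's documentation. The `ProfileLine` type and `total_usec` function are hypothetical names used only for this illustration.

```c
#include <stddef.h>

/* One profile row: how often the statement ran, and the assumed
 * average cost of one execution in microseconds. */
typedef struct {
    long   count;
    double usec_each;
} ProfileLine;

/* Rows for the loop statements only, taken from the second profile. */
const ProfileLine loop_profile[] = {
    {      1, 10.0 },  /* For I% =0 To 10  */
    {     11,  6.6 },  /* For Y% =0 To 199 */
    {   2200,  5.0 },  /* For X% =0 To 319 */
    { 704000,  5.0 },  /* Pixel X%,Y%      */
    { 704000,  1.0 },  /* Inc C%           */
    { 704000,  1.0 },  /* Next X%          */
    {   2200,  1.0 },  /* Next Y%          */
    {     11,  1.0 },  /* Next I%          */
};
const size_t loop_profile_len = sizeof loop_profile / sizeof loop_profile[0];

/* Total time in microseconds: sum of count * time-per-execution. */
double total_usec(const ProfileLine *lines, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += (double)lines[i].count * lines[i].usec_each;
    }
    return total;
}
```

Under that assumption, `total_usec(loop_profile, loop_profile_len)` comes to roughly 4.9 seconds, with the three innermost statements (PIXEL, INC and NEXT X%) accounting for nearly all of it. The per-statement times appear to be rounded, so this is an estimate rather than an exact match for the measured timer value.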
 
epsilon

Senior Member

Joined: 30/07/2020
Location: Belgium
Posts: 255
Posted: 12:31pm 25 Jan 2021

void said: What is my goal? Well I was trying to have some fun with pixel generated demoscene effects like plasma, fire, blur, bumpmapping, crossfading, etc.


If you're sticking to MMBasic, which is what you'd have to do for bragging rights, which are what demos are about, you'll have to throttle back your demo effect expectations a decade or two.
As Mauro Xavier has shown, it is possible to make cool demos on CMM2 in MMBasic, but you have to roll with the platform's strengths, which do not include fast pixel-by-pixel processing.

If you go to CSUBs, it's a different story. I ran this on page 0 10 times:


void pixelrw(long long *paddr) {
    int ii;
    char *buf = (char*)(*paddr);

    for (ii=0; ii<64000; ii++) {
        buf[ii] = buf[ii]+1;
    }
}


I measured 19ms (for 10 loops of the above). That's pretty fast, but at this point you have reduced your CMM2 to just another generic framebuffer device.
Epsilon CMM2 projects
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 8578
Posted: 12:42pm 25 Jan 2021

If you download and install V5.07.00b7 again you can use

PIXEL FAST x, y [,c]

This takes 3 uSec rather than 5

The PIXEL command is rather overloaded as it supports CMM1 mode, flood fill, and arrays as inputs

PIXEL FAST only does a single pixel which reduces parsing time
 
Plasmamac

Guru

Joined: 31/01/2019
Location: Germany
Posts: 501
Posted: 01:04pm 25 Jan 2021

Does PIXEL FAST also check the boundaries?
Plasma
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 8578
Posted: 01:08pm 25 Jan 2021

Quote: Does PIXEL FAST also check the boundaries?


Yes, but that is trivial. As always, most of the time goes into simply parsing the parameters; after that it has to do the colour conversion and decide whether it needs to duplicate the pixel, as in this case (mode 3):

sc = (((c >> 16) & 0b11100000)) | (((c >> 8) & 0b11100000)>>3) | ((c & 0b11000000)>>6);
if(PageTable[WritePage].expand==0){
    s=(uint8_t *)((y * maxW + x) + wpa);
    *s=sc;
} else {
    y*=2;
    s=(uint8_t *)((y * maxW + x) + wpa);
    s1=(uint8_t *)(((y+1) * maxW + x) + wpa);
    *s=sc;
    *s1=sc;
}
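
For anyone who wants to experiment with this logic off the device, here is a minimal standalone C sketch of the same two steps: packing a 24-bit RGB colour into an 8-bit RGB332 value, and writing it either once or to two consecutive rows. The parameters `fb`, `max_w` and `expand` are hypothetical stand-ins for the firmware's `wpa`, `maxW` and `PageTable[WritePage].expand`; this is not the firmware's actual code, and it does no bounds checking.

```c
#include <stdint.h>

/* Pack a 24-bit 0xRRGGBB colour into 8 bits: 3 bits red, 3 bits green,
 * 2 bits blue, using the same masks and shifts as the snippet above. */
uint8_t rgb888_to_rgb332(uint32_t c) {
    return (uint8_t)(((c >> 16) & 0xE0) |
                     (((c >> 8) & 0xE0) >> 3) |
                     ((c & 0xC0) >> 6));
}

/* Write one pixel into an 8-bit framebuffer that is max_w bytes wide.
 * When expand is non-zero, each logical row occupies two physical rows,
 * so the pixel is written twice. */
void put_pixel(uint8_t *fb, int max_w, int expand, int x, int y, uint32_t c) {
    uint8_t sc = rgb888_to_rgb332(c);
    if (!expand) {
        fb[y * max_w + x] = sc;
    } else {
        y *= 2;
        fb[y * max_w + x] = sc;
        fb[(y + 1) * max_w + x] = sc;
    }
}
```

The conversion and the one or two stores are only a handful of machine instructions, which fits the remark that parameter parsing, not the drawing itself, dominates the cost.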
 