CMM2: Page Copying vs. Swapping

Nelno

Regular Member

Joined: 22/01/2021
Location: United States
Posts: 59
Posted: 02:15am 24 Jan 2021

I'm new to the CMM2, so forgive me for anything I've gotten wrong here.

After a few nights of reading through CMM2 documentation and fiddling with some BASIC programs to figure out different ways of synching to refresh to avoid any tearing (and having varying results -- what I have is still not perfect across all modes), I'm wondering why there's no actual page swap capability.

The reason I'd like a swap is, of course, to avoid the copy from other pages to page 0. While the copies do seem fairly optimized, it's still costly, taking anywhere from 2.5 ms and up. (I'm copying in the mode command's vblank callback, so I'm not sure what other threads of execution this might be blocking, but it's certainly time spent somewhere.)

From reading Peter Mather's excellent post on CMM2 graphics, my understanding is that this is probably because there's only 512KB of memory on the STM32H743IIT6 to allocate to the frontbuffer.

Assuming the LTDC's read address can be updated during a vertical blank to point to another page, is the speed of the SDRAM just too slow to support refreshing the screen directly from SDRAM?

Even if that is the case, it seems that two 640x400 pages would fit perfectly in the 512KB allocated for the frontbuffer now, so this might be a reasonable approach for that mode (swapping instantaneously between two 640x400 pages, each 256KB in size). Clearly, there would be even more pages available for 320x200 modes.
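The page-size arithmetic behind that estimate can be checked with a minimal sketch. It assumes 8 bits per pixel, tightly packed pages with no padding, and that "512KB" means 524,288 bytes; the helper names are hypothetical and say nothing about how the firmware actually lays out memory.

```python
FRONT_BUFFER_BYTES = 512 * 1024  # the 512KB figure quoted in the post (assumed to mean KiB)

def page_bytes(width, height, bits_per_pixel):
    """Size of one tightly packed page in bytes (assumes no row padding)."""
    return width * height * bits_per_pixel // 8

def pages_that_fit(width, height, bits_per_pixel, budget=FRONT_BUFFER_BYTES):
    """How many whole pages of this size fit in the given memory budget."""
    return budget // page_bytes(width, height, bits_per_pixel)

for width, height in ((640, 400), (320, 200)):
    size = page_bytes(width, height, 8)
    count = pages_that_fit(width, height, 8)
    print(f"{width}x{height} at 8 bits: {size} bytes per page, {count} pages fit")
```

Under these assumptions a 640x400 page is 256,000 bytes, so two pages fit with a little room to spare, and eight 320x200 pages fit.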

Again, I'm new to CMM2 and don't have more than a few nights bouncing around the documentation and forums, so please forgive any erroneous assumptions I've made above.

-Jonathan
 
epsilon

Senior Member

Joined: 30/07/2020
Location: Belgium
Posts: 255
Posted: 11:19am 24 Jan 2021

Good question. I don't have the answer  
However, I measured the timing of PAGE COPY in the various modes, see below. There are several modes, including 640x400 8bits, where the PAGE COPY takes only about 0.5ms, i.e. the PAGE COPY isn't necessarily an expensive operation.

MODE 1,8: Clock delta (us) after PAGE COPY:3373
MODE 1,12: Clock delta (us) after PAGE COPY:26925
MODE 1,16: Clock delta (us) after PAGE COPY:15237
MODE 2,8: Clock delta (us) after PAGE COPY:527
MODE 2,12: Clock delta (us) after PAGE COPY:4400
MODE 2,16: Clock delta (us) after PAGE COPY:3594
MODE 3,8: Clock delta (us) after PAGE COPY:373
MODE 3,12: Clock delta (us) after PAGE COPY:533
MODE 3,16: Clock delta (us) after PAGE COPY:608
MODE 4,8: Clock delta (us) after PAGE COPY:438
MODE 4,12: Clock delta (us) after PAGE COPY:3608
MODE 4,16: Clock delta (us) after PAGE COPY:2921
MODE 5,8: Clock delta (us) after PAGE COPY:303
MODE 5,12: Clock delta (us) after PAGE COPY:438
MODE 5,16: Clock delta (us) after PAGE COPY:492
MODE 6,8: Clock delta (us) after PAGE COPY:420
MODE 6,12: Clock delta (us) after PAGE COPY:512
MODE 6,16: Clock delta (us) after PAGE COPY:639
MODE 7,8: Clock delta (us) after PAGE COPY:436
MODE 7,12: Clock delta (us) after PAGE COPY:2500
MODE 7,16: Clock delta (us) after PAGE COPY:723
MODE 8,8: Clock delta (us) after PAGE COPY:2179
MODE 8,12: Clock delta (us) after PAGE COPY:12738
MODE 8,16: Clock delta (us) after PAGE COPY:8978
MODE 9,8: Clock delta (us) after PAGE COPY:11505
MODE 9,16: Clock delta (us) after PAGE COPY:31979
MODE 10,8: Clock delta (us) after PAGE COPY:2864
MODE 10,12: Clock delta (us) after PAGE COPY:18521
MODE 10,16: Clock delta (us) after PAGE COPY:12456
Epsilon CMM2 projects
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 8567
Posted: 12:19pm 24 Jan 2021

The early versions of the S/W actually had this capability but it was removed.

It is very difficult to use effectively and very easy to get yourself into a mess.

Think about how you would program to use it

I update page 1 and then swap the display page to page 1
I now update page 0 with new changes I need and then swap to page 0

Ooops: I've now lost the change I made on page 1

The decision was therefore made to have a fixed display page and optimise page copying

There is another technical issue behind the decision: a page being displayed has to allow shared access to its memory for both the CPU and the LTDC controller. This adds a significant overhead to the memory access times. The CMM2 firmware uses the chip's memory management so that only page 0 has this enabled; all other pages are single access for the CPU only.
 
Nelno

Regular Member

Joined: 22/01/2021
Location: United States
Posts: 59
Posted: 10:11pm 24 Jan 2021

  matherp said  The early versions of the S/W actually had this capability but it was removed.

It is very difficult to use effectively and very easy to get yourself into a mess.

Think about how you would program to use it

I update page 1 and then swap the display page to page 1
I now update page 0 with new changes I need and then swap to page 0

Ooops: I've now lost the change I made on page 1

The decision was therefore made to have a fixed display page and optimise page copying

There is another technical issue behind the decision: a page being displayed has to allow shared access to its memory for both the CPU and the LTDC controller. This adds a significant overhead to the memory access times. The CMM2 firmware uses the chip's memory management so that only page 0 has this enabled; all other pages are single access for the CPU only.


Thanks for the background on that. I hadn't considered the memory access overhead this might entail. In the use case I'm thinking of I don't think I would ever want anything but the LTDC to read from the page that's being displayed, at the time it's being displayed, while the other page would only need write access (or read/write depending on the use of sprites).

I've spent the last 20+ years working on GPUs so I've gotten used to rendering the entire scene from scratch each frame, then swapping to the new buffer. This, of course, is fairly different and there's a lot of hardware in modern GPUs to make that efficient.

Since I don't have a specific scenario I'm working towards right now, this is all conjecture, but an example usage case:

1) set page 0 visible
2) set page 1 writable
3) hide sprites
4) render a tiled background to write page using blits
5) place sprites on write page
6) wait for refresh
7) swap pages (make write page visible, make visible page writable)
8) go to 3

In this scenario I don't care that I've lost everything on page 0 because I'm going to refresh the entire screen. I'll spend most of my memory bandwidth drawing the tiles and because I can swap, I don't need the additional copy to get it to the front buffer.

I think this might also work without updating the full screen, but it would require using 2 sprites for every object, since the background under moving sprites would be different for each page.

I don't know if this would be effective or worth the pains. It isn't always obvious that it will be beneficial overall. If memory access itself has a lot more overhead, that may make the whole exercise worse. On top of that, I've been having to remind myself that MMBASIC is still interpreted and, despite it being a whole lot faster than an 8-bit machine from the 80's, there's still a high relative cost for every line of code executed. If it takes a lot more game logic to manage swapped pages, that might quickly eat up any benefits.

Anyway, thanks for the info. I may or may not spend more time experimenting with this. While I sort of want to deep-dive, one of the things that attracted me to the CMM2 was how easy it is to use without having to compile, worry about library dependencies, hardware compatibility, write my own assembler functions/graphics library, etc. All those things that I seem to spend way too much time on these days. So thanks, Peter, for doing all of that stuff for us and by no means do I intend to cast shade on the work you've done.
 
Nelno

Regular Member

Joined: 22/01/2021
Location: United States
Posts: 59
Posted: 01:25am 25 Jan 2021

  epsilon said  Good question. I don't have the answer  
However, I measured the timing of PAGE COPY in the various modes, see below. There are several modes, including 640x400 8bits, where the PAGE COPY takes only about 0.5ms, i.e. the PAGE COPY isn't necessarily an expensive operation.


I agree that the cases where it's 0.5 ms are easily manageable. I've been doing most of my testing in mode 1, so it's the higher res modes that got me thinking about swapping vs. copying. In the best case the copy time is about 1/4 of the refresh time; in the worst case it is far longer than the refresh time.

The times I'm seeing right now are:
MODE 1,8: 3.95 ms
MODE 1,12: 41 - 43 ms!!
MODE 1,16: 19.6 ms

All of those are longer than the times you're seeing. Do you mind running the below code and letting me know what you see? I'm running MMBasic 5.06.00 on a RetroMax.


option explicit

dim float frameDur = 0.0
dim float fps = 0.0
dim float copyDur = 0.0
dim float gameDur = 0.0
dim float renderDur = 0.0
dim float waitDur = 0.0
dim integer waitCount = 0
dim integer vsyncCount = 0
dim integer gameVsyncCount = 0

mode 1,16,0,onVBlank

page write 1

do
 gameLoop
loop
end

sub onVBlank
 vsyncCount = vsyncCount + 1
end sub

sub gameLoop
 local float frameStart = timer
 local float startTime = frameStart
 
 ' do some game logic
 'pause 8.0
 gameDur = timer - startTime

 ' render the results of game logic to back buffer
 startTime = timer
 text 0, 0, "gd: " + str$(gameDur, 3, 3) + " rd: " + str$(renderDur, 3, 3) + " cd: " + str$(copyDur, 3, 3)
 text 0, 16, "fps: " + str$(fps, 2, 1) + " wd: " + str$(waitDur, 3, 3) + " wc: " + str$(waitCount, 3, 0)
 renderDur = timer - startTime

 ' wait until screen blanking
 startTime = timer
 waitCount = 0
 do while(gameVsyncCount >= vsyncCount)
   waitCount = waitCount + 1
 loop
 gameVsyncCount = vsyncCount
 waitDur = timer - startTime

 ' copy to front buffer
 startTime = timer
 page copy 1 to 0, I
 copyDur = timer - startTime

 frameDur = timer - frameStart
 fps = 1000.0 / frameDur
end sub


Note in that code I'm using a vblank callback to increment a counter and I know that vsync has occurred in the game loop by waiting until that counter has incremented. Technically the wait shouldn't be necessary for this test but it also shouldn't affect the copy time.

This isn't exactly how I'd set things up for a game loop, but I wanted to test the actual time to copy by doing an immediate copy.

In other tests I did the copy with "I" in the vblank callback, but this seems to fail when the copy takes longer than 1 frame (12 and 16 bit modes for me). I'm guessing that subroutine doesn't like being re-entered. Otherwise, doing it in the vblank has the advantage of the copy being done asynchronously, while still being able to wrap with timers.

At any rate, I see essentially the same times across all of my tests, independent of how I call page copy, so I'm wondering why mine are significantly slower than yours (especially mode 1, 12-bit!)

-Jonathan
 
TassyJim

Guru

Joined: 07/08/2011
Location: Australia
Posts: 5884
Posted: 02:27am 25 Jan 2021

While I have very little knowledge of the PAGE COPY process,
I have found that PAGE COPY 1 TO 0,B works best for me.
Let the system work out when to start the copy.

I suggest that you update to the latest beta firmware and use the OPTION PROFILING
It makes checking the timing much easier.

With one of my programs (PAGE COPY set to B)
Mode 1,8  > 11.4mS
Mode 1,12 > 33.0mS
Mode 1,16 > 21.7mS

With PAGE COPY set to I,
Mode 1,8  > 6.1mS
Mode 1,12 > 10.8mS
Mode 1,16 > 11.8mS

The difference is due to there being no wait time before starting the copy.
It looks like Profiling includes the waiting.

Jim
Edited 2021-01-25 12:52 by TassyJim
VK7JH
MMedit   MMBasic Help
 
epsilon

Senior Member

Joined: 30/07/2020
Location: Belgium
Posts: 255
Posted: 08:31am 25 Jan 2021

  Nelno said  
All of those are longer than the times you're seeing. Do you mind running the below code and letting me know what you see? I'm running MMBasic 5.06.00 on a RetroMax.


I ran your program. I got:
gd: 0.012 rd: 0.465 cd: 15.235
fps: 60.4 wd: 0.810 wc: 75

'cd' or copy duration is in line with what I measured for MODE 1,16: 15.237ms.

I'm running 5.07.00b2. I have a 480MHz CPU. Maybe yours is 400MHz?
Epsilon CMM2 projects
 
Nelno

Regular Member

Joined: 22/01/2021
Location: United States
Posts: 59
Posted: 08:57am 25 Jan 2021

  TassyJim said  While I have very little knowledge of the PAGE COPY process,
I have found that PAGE COPY 1 TO 0,B works best for me.
Let the system work out when to start the copy.

I suggest that you update to the latest beta firmware and use the OPTION PROFILING
It makes checking the timing much easier.


I saw that profiling was added, but I just haven't had a chance to update my firmware yet.

I will do that next.

  TassyJim said  With one of my programs (PAGE COPY set to B)
Mode 1,8  > 11.4mS
Mode 1,12 > 33.0mS
Mode 1,16 > 21.7mS


With B I don't think those will mean much. The amount of time it's waiting depends entirely on when you start waiting, i.e. what you do between the previous vsync and when you call PAGE COPY B.

  TassyJim said  With PAGE COPY set to I,
Mode 1,8  > 6.1mS
Mode 1,12 > 10.8mS
Mode 1,16 > 11.8mS


Now, those are interesting. I get a better time in 1,8 at 3.95 ms. Mode 1,16 takes nearly twice as long as you're seeing at 19 ms, and I get a much, much worse time, 4x your time, in 1,12 @41 ms.

While I wouldn't be surprised to see a more complex program take longer to copy due to memory bandwidth, I don't have an explanation as to why a simple program like the example above would be 4x slower.

I think I definitely need to update my firmware. I may not be comparing apples to apples here.

And... I totally borked the vsync waiting in the code above when I took the copy out of the vsync callback. To be correct it needs to set gameVsyncCount = vsyncCount at the start of the frame loop, like below:


option explicit

dim float frameDur = 0.0
dim float fps = 0.0
dim float copyDur = 0.0
dim float gameDur = 0.0
dim float renderDur = 0.0
dim float waitDur = 0.0
dim integer waitCount = 0
dim integer vsyncCount = 0
dim integer gameVsyncCount = 0

mode 1,16,0,onVBlank

page write 1

do
gameLoop
loop
end

sub onVBlank
vsyncCount = vsyncCount + 1
end sub

sub gameLoop
local float frameStart = timer
local float startTime = frameStart

' this line moved up to here -- the only change.
gameVsyncCount = vsyncCount

' do some game logic
'pause 8.0
gameDur = timer - startTime

' render the results of game logic to back buffer
startTime = timer
text 0, 0, "gd: " + str$(gameDur, 3, 3) + " rd: " + str$(renderDur, 3, 3) + " cd: " + str$(copyDur, 3, 3)
text 0, 16, "fps: " + str$(fps, 2, 1) + " wd: " + str$(waitDur, 3, 3) + " wc: " + str$(waitCount, 3, 0)
renderDur = timer - startTime

' wait until screen blanking
startTime = timer
waitCount = 0
do while(gameVsyncCount >= vsyncCount)
  waitCount = waitCount + 1
loop
waitDur = timer - startTime

' copy to front buffer
startTime = timer
page copy 1 to 0, I
copyDur = timer - startTime

frameDur = timer - frameStart
fps = 1000.0 / frameDur
end sub


While that changes the total framerate overall (the older code could run faster since it didn't actually wait on vsync), it doesn't affect the copy times. In fact, if I comment out the custom wait loop and change page copy to use B like this:


' wait until screen blanking
'startTime = timer
'waitCount = 0
'do while(gameVsyncCount >= vsyncCount)
'  waitCount = waitCount + 1
'loop
'waitDur = timer - startTime

' copy to front buffer
startTime = timer
page copy 1 to 0, B
copyDur = timer - startTime


then "page copy 1 to 0, b" time == "page copy 1 to 0, I" time + waitDur

i.e., the wait and copy immediate (I) loop above is effectively equivalent to just calling copy with the blank (B) option. The only real reason to do the wait like above was to time the copy independently.

-Jonathan
 
Nelno

Regular Member

Joined: 22/01/2021
Location: United States
Posts: 59
Posted: 09:25am 25 Jan 2021

  epsilon said  
I ran your program. I got:
gd: 0.012 rd: 0.465 cd: 15.235
fps: 60.4 wd: 0.810 wc: 75

'cd' or copy duration is in line with what I measured for MODE 1,16: 15.237ms.


Thanks! Yes, cd is what I've been measuring, too, at just over 19 ms in that mode. If you add up gd + rd + cd + wd you'll get: 0.012 + 0.465 + 15.235 + 0.810 = 16.522, which is close enough to 1/60th of a second to confirm the timings all make sense there.

Note that with the original code I posted it won't necessarily wait on the vsync in some modes if you're not making framerate, but since your copy was coming in under 16.6 ms it wasn't a problem. In my case, I miss every other vsync due to the copy taking 19 ms; the fixed code will wait for 1 more vsync before copying, effectively throttling to 30 Hz.
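The throttling effect can be sketched as simple arithmetic. This is an illustration only: it assumes an exact 60 Hz refresh and a loop that always waits for the next vblank once its work is done, and `effective_fps` is a hypothetical helper, not anything from MMBasic.

```python
import math

REFRESH_HZ = 60.0                # assumed nominal refresh rate
FRAME_MS = 1000.0 / REFRESH_HZ   # about 16.67 ms per refresh

def effective_fps(work_ms):
    """Frame rate of a vsync-locked loop whose per-frame work takes work_ms."""
    # Work that overruns a refresh period has to wait for the following vblank.
    refreshes_used = max(1, math.ceil(work_ms / FRAME_MS))
    return REFRESH_HZ / refreshes_used

# gd + rd + cd from epsilon's run fits inside one refresh period.
print(effective_fps(0.012 + 0.465 + 15.235))
# A 19.6 ms copy alone already overruns one refresh period.
print(effective_fps(19.6))
```

Any per-frame work between one and two refresh periods lands on every second vblank, which is why a 19 ms copy halves the frame rate rather than reducing it slightly.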

  epsilon said  
I'm running 5.07.00b2. I have a 480MHz CPU. Maybe yours is 400MHz?


Now that you mention it, I did see something about the CPU speed varying, but it didn't dawn on me that I might have a 400 MHz CPU. I have a RetroMax from CircuitGizmos and the page says "The CPU that powers the RetroMax is an ARM Cortex-M7 32-bit RISC processor running at up to 480MHz."

I have a STM32H743IIT6 CPU. Depending on where you look it may be 400 or 480 Mhz, though most places label it as 480 MHz. I found the data sheet for it which says "480 Mhz MCU" at the top, but later it says "up to 480 Mhz" and the sheet appears to be used for about 2-dozen variants, so... I'm not entirely sure yet.

-Jonathan
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 8567
Posted: 09:47am 25 Jan 2021

  Quote  I have a STM32H743IIT6 CPU. Depending on where you look it may be 400 or 480 Mhz, though most places label it as 480 MHz. I found the data sheet for it which says "480 Mhz MCU" at the top, but later it says "up to 480 Mhz" and the sheet appears to be used for about 2-dozen variants, so... I'm not entirely sure yet.


Type: ? MM.INFO(CPUSPEED)

And it will tell you which CPU you have

Unfortunately ST have produced two revisions of the STM32H743IIT6, one with a maximum clock speed of 400MHz and one with 480MHz. These have exactly the same part number, so it is impossible for people building a CMM2 to specify which they will receive. In fact a single order from a single supplier may include a mix.

There is a way in software to determine the revision and the CMM2 firmware uses this to set the clock speed to the maximum that a given chip will support.
 
matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 8567
Posted: 10:01am 25 Jan 2021

Re the copy times

There are multiple factors that affect this

Try in mode 1:

timer=0:page copy 0 to 1:?timer
timer=0:page copy 1 to 0:?timer

Note the difference in timing which is caused by the way the copy interleaves with the LTDC access to the memory.

The copy time is also massively affected by the video mode, not just because of the volume of data to copy, but also because of the video bandwidth that is taking cycles reading the memory

Worst case is mode 11

1280x720 @ 60Hz = 55 MBytes/second in 8-bit mode and 110 MBytes/second in 16-bit mode

With memory being read continuously at 110 MBytes/second there is very little bus bandwidth left for the copy. You can't use 12-bit colour in mode 11 because this would need 220 MBytes/second and this is outside the capability of the processor bus.
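Those bandwidth figures follow from width x height x refresh rate x bytes per pixel, as in this minimal sketch. The 4 bytes per pixel used for 12-bit colour is an assumption inferred from the 220 MBytes/second figure above, not a documented layout, and the function name is hypothetical.

```python
def scanout_bytes_per_second(width, height, refresh_hz, bytes_per_pixel):
    """Bytes the display controller must read from memory each second."""
    return width * height * refresh_hz * bytes_per_pixel

# Bytes fetched per pixel for each colour depth.
# ASSUMPTION: 12-bit colour costs 4 bytes per pixel; this is only inferred
# from the quoted 220 MBytes/second, which is double the 16-bit figure.
BYTES_PER_PIXEL = {8: 1, 16: 2, 12: 4}

for depth in (8, 16, 12):
    rate = scanout_bytes_per_second(1280, 720, 60, BYTES_PER_PIXEL[depth])
    print(f"1280x720 @ 60Hz, {depth}-bit: {rate / 1e6:.1f} MBytes/second")
```

The results round to the 55, 110 and 220 MBytes/second quoted above.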

The copies use some pretty strange tricks to get performance and are about 4x faster than using DMA. Page copies actually use the FPU registers to move 16 bytes per instruction.

For optimum copy performance you should restrict yourself to modes where the two pages are both held in the faster processor memory. The highest resolution where this is the case is MODE 2,8
 
epsilon

Senior Member

Joined: 30/07/2020
Location: Belgium
Posts: 255
Posted: 10:18am 25 Jan 2021

  matherp said  
Unfortunately ST have produced two revisions of the STM32H743IIT6, one with a maximum clock speed of 400MHz and one with 480MHz. These have exactly the same part number, so it is impossible for people building a CMM2 to specify which they will receive. In fact a single order from a single supplier may include a mix.

There is a way in software to determine the revision and the CMM2 firmware uses this to set the clock speed to the maximum that a given chip will support.


Can the PLL of a 480MHz device be reprogrammed to run at 400MHz? That would make it easier to ensure that a program runs correctly on both variants.

  matherp said  
Try in mode 1:

timer=0:page copy 0 to 1:?timer
timer=0:page copy 1 to 0:?timer

Note the difference in timing which is caused by the way the copy interleaves with the LTDC access to the memory.


Woah, yeah, big differences in some configurations. Unfortunately, the 'wrong' direction is generally faster.

MODE 1,8: Clock delta (us) after PAGE COPY 1 TO 0:3379
MODE 1,8: Clock delta (us) after PAGE COPY 0 TO 1:2070
MODE 1,12: Clock delta (us) after PAGE COPY 1 TO 0:26983
MODE 1,12: Clock delta (us) after PAGE COPY 0 TO 1:25810
MODE 1,16: Clock delta (us) after PAGE COPY 1 TO 0:15278
MODE 1,16: Clock delta (us) after PAGE COPY 0 TO 1:15429
MODE 2,8: Clock delta (us) after PAGE COPY 1 TO 0:527
MODE 2,8: Clock delta (us) after PAGE COPY 0 TO 1:531
MODE 2,12: Clock delta (us) after PAGE COPY 1 TO 0:4399
MODE 2,12: Clock delta (us) after PAGE COPY 0 TO 1:2547
MODE 2,16: Clock delta (us) after PAGE COPY 1 TO 0:3608
MODE 2,16: Clock delta (us) after PAGE COPY 0 TO 1:2210
MODE 3,8: Clock delta (us) after PAGE COPY 1 TO 0:378
MODE 3,8: Clock delta (us) after PAGE COPY 0 TO 1:210
MODE 3,12: Clock delta (us) after PAGE COPY 1 TO 0:533
MODE 3,12: Clock delta (us) after PAGE COPY 0 TO 1:536
MODE 3,16: Clock delta (us) after PAGE COPY 1 TO 0:615
MODE 3,16: Clock delta (us) after PAGE COPY 0 TO 1:327
MODE 4,8: Clock delta (us) after PAGE COPY 1 TO 0:435
MODE 4,8: Clock delta (us) after PAGE COPY 0 TO 1:443
MODE 4,12: Clock delta (us) after PAGE COPY 1 TO 0:3622
MODE 4,12: Clock delta (us) after PAGE COPY 0 TO 1:1862
MODE 4,16: Clock delta (us) after PAGE COPY 1 TO 0:2926
MODE 4,16: Clock delta (us) after PAGE COPY 0 TO 1:1793
MODE 5,8: Clock delta (us) after PAGE COPY 1 TO 0:314
MODE 5,8: Clock delta (us) after PAGE COPY 0 TO 1:182
MODE 5,12: Clock delta (us) after PAGE COPY 1 TO 0:443
MODE 5,12: Clock delta (us) after PAGE COPY 0 TO 1:445
MODE 5,16: Clock delta (us) after PAGE COPY 1 TO 0:495
MODE 5,16: Clock delta (us) after PAGE COPY 0 TO 1:279
MODE 6,8: Clock delta (us) after PAGE COPY 1 TO 0:405
MODE 6,8: Clock delta (us) after PAGE COPY 0 TO 1:225
MODE 6,12: Clock delta (us) after PAGE COPY 1 TO 0:514
MODE 6,12: Clock delta (us) after PAGE COPY 0 TO 1:510
MODE 6,16: Clock delta (us) after PAGE COPY 1 TO 0:654
MODE 6,16: Clock delta (us) after PAGE COPY 0 TO 1:343
MODE 7,8: Clock delta (us) after PAGE COPY 1 TO 0:439
MODE 7,8: Clock delta (us) after PAGE COPY 0 TO 1:246
MODE 7,12: Clock delta (us) after PAGE COPY 1 TO 0:2509
MODE 7,12: Clock delta (us) after PAGE COPY 0 TO 1:1553
MODE 7,16: Clock delta (us) after PAGE COPY 1 TO 0:728
MODE 7,16: Clock delta (us) after PAGE COPY 0 TO 1:387
MODE 8,8: Clock delta (us) after PAGE COPY 1 TO 0:2184
MODE 8,8: Clock delta (us) after PAGE COPY 0 TO 1:1329
MODE 8,12: Clock delta (us) after PAGE COPY 1 TO 0:12755
MODE 8,12: Clock delta (us) after PAGE COPY 0 TO 1:12774
MODE 8,16: Clock delta (us) after PAGE COPY 1 TO 0:8994
MODE 8,16: Clock delta (us) after PAGE COPY 0 TO 1:8994
MODE 9,8: Clock delta (us) after PAGE COPY 1 TO 0:11541
MODE 9,8: Clock delta (us) after PAGE COPY 0 TO 1:11602
MODE 9,16: Clock delta (us) after PAGE COPY 1 TO 0:32034
MODE 9,16: Clock delta (us) after PAGE COPY 0 TO 1:32841
MODE 10,8: Clock delta (us) after PAGE COPY 1 TO 0:2867
MODE 10,8: Clock delta (us) after PAGE COPY 0 TO 1:1759
MODE 10,12: Clock delta (us) after PAGE COPY 1 TO 0:18555
MODE 10,12: Clock delta (us) after PAGE COPY 0 TO 1:18571
MODE 10,16: Clock delta (us) after PAGE COPY 1 TO 0:12468
MODE 10,16: Clock delta (us) after PAGE COPY 0 TO 1:12356
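To put a number on the direction difference, here is a small sketch that computes the ratio for a handful of the rows above. The values are copied from the listing; the dictionary layout and the `slowdown` helper are just for illustration.

```python
# (PAGE COPY 1 TO 0, PAGE COPY 0 TO 1) times in microseconds, from the listing above.
TIMINGS_US = {
    "1,8": (3379, 2070),
    "1,16": (15278, 15429),
    "2,8": (527, 531),
    "2,12": (4399, 2547),
}

def slowdown(mode):
    """How many times longer the copy to the display page (1 TO 0) takes."""
    to_display, from_display = TIMINGS_US[mode]
    return to_display / from_display

for mode in TIMINGS_US:
    print(f"MODE {mode}: 1 TO 0 takes {slowdown(mode):.2f}x as long as 0 TO 1")
```

For these rows the copy to page 0 is roughly 1.6 to 1.7 times slower in MODE 1,8 and MODE 2,12, and essentially the same in MODE 1,16 and MODE 2,8.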
Epsilon CMM2 projects
 
vegipete

Guru

Joined: 29/01/2013
Location: Canada
Posts: 1082
Posted: 06:37pm 25 Jan 2021

The technique that has worked well for me is to use both page copying and page flipping.
Look for my "Rocks in Space" asteroids clone posted here. (Uses mode 1,8)

The PAGE COPY command has three timing parameters: I, B, D
I copies right now and all but guarantees image tearing,
B copies with no tearing but your program must wait,
D copies cleanly also but your program can mess up the page before it has been copied.

I started Rocks in Space using B but the frame rate dropped too low as the amount of work in the game loop increased. Too much time was spent waiting for the PAGE COPY.

When I switched to D, there was too much image tearing because the program was starting to rewrite the image before it had been completely copied. A pause cleaned it up but was not ideal. (A means to check when PAGE COPY x TO y,D was finished would have helped.)

The fix was to add page flipping. I clear and redraw a complete image on page 1. When done (and after a minimum time,) I give the page to PAGE COPY,D. Then I can immediately start all over again using page 2. This way the image on page 1 is undisturbed, giving PAGE COPY enough time to do its thing. Flip and repeat.
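The alternation described here can be modelled in a few lines. This is a toy simulation rather than MMBasic: it assumes the deferred copy of one page always finishes before that page comes round again, and `simulate` and its variable names are hypothetical.

```python
from itertools import cycle

def simulate(frames):
    """Return (page drawn on, page still being copied) for each frame."""
    draw_pages = cycle((1, 2))  # two back buffers; page 0 stays the display page
    pending_copy = None         # page most recently handed to the deferred copy
    log = []
    for _ in range(frames):
        page = next(draw_pages)
        # The page being redrawn must never be the one still being copied.
        assert page != pending_copy
        log.append((page, pending_copy))
        pending_copy = page     # finished drawing: hand this page to the copy
    return log

print(simulate(4))
```

Because drawing always moves to the other back buffer, the page handed to the deferred copy is left untouched for a whole frame of drawing.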
Visit Vegipete's *Mite Library for cool programs.
 
TassyJim

Guru

Joined: 07/08/2011
Location: Australia
Posts: 5884
Posted: 08:47pm 25 Jan 2021

  Quote  
 TassyJim said  
With PAGE COPY set to I,
Mode 1,8  > 6.1mS
Mode 1,12 > 10.8mS
Mode 1,16 > 11.8mS


Now, those are interesting. I get a better time in 1,8 at 3.95 ms. Mode 1,16 takes nearly twice as long as you're seeing at 19 ms, and I get a much, much worse time, 4x your time, in 1,12 @41 ms.


I later discovered a bug in the profiling implementation. Peter has now fixed it.

With PAGE COPY set to I,
Mode 1,8  > 3.4mS
Mode 1,12 > 26.5mS
Mode 1,16 > 15.4mS

That agrees very closely with Epsilon

Jim
VK7JH
MMedit   MMBasic Help
 
Nelno

Regular Member

Joined: 22/01/2021
Location: United States
Posts: 59
Posted: 12:29am 26 Jan 2021

  matherp said  
  Quote  I have a STM32H743IIT6 CPU. Depending on where you look it may be 400 or 480 Mhz, though most places label it as 480 MHz. I found the data sheet for it which says "480 Mhz MCU" at the top, but later it says "up to 480 Mhz" and the sheet appears to be used for about 2-dozen variants, so... I'm not entirely sure yet.


Type: ? MM.INFO(CPUSPEED)

And it will tell you which CPU you have

Unfortunately ST have produced two revisions of the STM32H743IIT6, one with a maximum clock speed of 400MHz and one with 480MHz. These have exactly the same part number, so it is impossible for people building a CMM2 to specify which they will receive. In fact a single order from a single supplier may include a mix.

There is a way in software to determine the revision and the CMM2 firmware uses this to set the clock speed to the maximum that a given chip will support.


Thanks, I was going to bed when I thought "I think I saw an MM.INFO" for that and was going to check tonight.

I get:
> ? mm.info(cpuspeed)
480000000

So it seems that's not the issue. I'm glad I got the 480 MHz one, though!
 
Nelno

Regular Member

Joined: 22/01/2021
Location: United States
Posts: 59
Posted: 12:38am 26 Jan 2021

  matherp said  Re the copy times

There are multiple factors that affect this

Try in mode 1:

timer=0:page copy 0 to 1:?timer
timer=0:page copy 1 to 0:?timer


Ok, using those commands and setting the mode before running them, the numbers I see agree with what I'm seeing when I time immediate copies with my code above (though this is a much simpler way to test!). In all cases these are for copy 1 to 0:


mode 1,8: 3.94 ms
mode 1,16: 19.623 ms
mode 1,12: 43.797 ms

  matherp said  Note the difference in timing which is caused by the way the copy interleaves with the LTDC access to the memory.

The copy time is also massively affected by the video mode, not just because of the volume of data to copy, but also because of the video bandwidth that is taking cycles reading the memory

Worst case is mode 11

1280x720 @ 60Hz = 55Mbytes/second in 8 bit mode and 110 MBytes/second in 16 bit mode

With memory being read continuously at 110Mbytes/second there is very little bus bandwidth left for the copy. You can't use 12-bit colour in mode 11 because this would need 220MBytes/second and this is outside the capability of the processor bus.

The copies use some pretty strange tricks to get performance and are about 4x faster than using DMA. Page copies actually use the FPU registers to move 16 bytes per instruction.

For optimum copy performance you should restrict yourself to modes where the two pages are both held in the faster processor memory. The highest resolution where this is the case is MODE 2,8


All of that makes sense. Just still not sure why I'm seeing significantly slower times than others at this point if it's not related to my firmware, which I'll look into updating tonight.
 
epsilon

Senior Member

Joined: 30/07/2020
Location: Belgium
Posts: 255
Posted: 08:48am 26 Jan 2021

  Nelno said  All of that makes sense. Just still not sure why I'm seeing significantly slower times than others at this point if it's not related to my firmware, which I'll look into updating tonight.


That is strange. Does your machine's timing match wall clock timing? i.e. If you "PAUSE 10000", is it really 10 seconds?
Epsilon CMM2 projects
 
Nelno

Regular Member

Joined: 22/01/2021
Location: United States
Posts: 59
Posted: 08:42am 27 Jan 2021

  epsilon said  
  Nelno said  All of that makes sense. Just still not sure why I'm seeing significantly slower times than others at this point if it's not related to my firmware, which I'll look into updating tonight.


That is strange. Does your machine's timing match wall clock timing? i.e. If you "PAUSE 10000", is it really 10 seconds?


Yes, it does. This occurred to me also, since there's an OPTION command to modify the clock rate.

I tested pause vs. my phone timer up to 60 seconds and it seems right on.

Unfortunately I am mostly dead in the water right now as I managed to break off my micro SD card connector. It has been a comedy of errors with the micro SD connector since I got this board.

Suffice to say I've been designing / printing a custom case for it, so I've had the board in and out a lot for measurements and tests. And in the process of replacing the card, I broke the connector. There's more to it than that... the first connector on the board was bad and refused to work after 2 or 3 inserts, but it's been a comedy of errors since.

Still trying to get it to work. I have traced all the connections I can all the way back to the CPU and they seem fine. I've checked for bridges and don't have any. The insert detection works and the SD card light begins pulsing about once per second after insert. The CMM2 tries to read the card: I see data and clock pulses on the oscilloscope in sync with the SD card activity LED, but it doesn't recognize the card. I've verified the card is still good. I'm out of ideas for the moment and tired of looking through SD card spec docs.
 
RetroJoe

Senior Member

Joined: 06/08/2020
Location: Canada
Posts: 290
Posted: 12:34pm 27 Jan 2021

Have you tried a "factory reset"? (See doc excerpt below.) I had similar SD card trouble once and a full reset fixed it. I would also try a different SD card.

Pin 40 can be used to completely reset the Colour Maximite 2 to its "factory default" condition. If that pin is connected to ground (ie, pin 39) on power up all options will be reset to their defaults and any program in flash memory erased.

Enjoy Every Sandwich / Joe P.
 
epsilon

Senior Member

Joined: 30/07/2020
Location: Belgium
Posts: 255
Posted: 01:39pm 27 Jan 2021

  Quote  Yes, it does. This occurred to me also, since there's an OPTION command to modify the clock rate.


I was actually thinking that your assembly might have an external oscillator with incorrect frequency, but since the clock rate seems correct, that's not it.

Re your SD trouble, you are aware that the pins on the side of the SD socket very easily bridge against the metal cover, right? So easily in fact that at first, I didn't even realize they were supposed to be separate. I actually thought I was fixing the SD socket enclosure to the PCB.

Did you check the stability of the 5V supply and the current draw?
Epsilon CMM2 projects
 