TheBackShed.com

Forum Index : Microcontroller and PC projects : CMM2 MMBasic CSUB - use ARM assembler

jirsoft

Guru

Joined: 18/09/2020
Location: Czech Republic
Posts: 533

Posted: 11:28pm 01 Oct 2020

Hi,
my last "old full screen programmable computer" was an Acorn Archimedes, where I developed apps mostly in a combination of BASIC + assembler (and before that on the C64 and Atari ST in the same combination), so it is interesting for me to continue in this style and use CSUB not for C but for assembler.

If I understood correctly, the CMM2 (Cortex M7) can't be programmed in the (nice) ARM instruction set but only in the (not so nice) Thumb2 set. Is that right? (I'm still waiting for my CMM2, hence such stupid questions.)

What will the CSUB format be for assembler? The first word will be 0 as the offset when the assembler part should start with the first instruction, but how do I end it? A simple MOV pc, lr (MOV R15, R14)? Or BX lr (I found that somewhere)? Which registers may be changed? How are parameters passed to the CSUB, on the stack or in registers?

I know most people will use MMBasic and C for speedup, but I think it is better to be comfortable (MMBasic) + fast (ASM); I don't need something in the middle... (I sometimes have to develop in C/C++ but don't like it too much).

Jiri
Napoleon Commander and SimplEd for CMM2 (GitHub), CMM2.fun

JohnS
Guru

Joined: 18/11/2011
Location: United Kingdom
Posts: 4185

Posted: 07:17am 02 Oct 2020

You don't have to stick to thumb/thumb2 (see the ARM doc).

Compile some C and look at its assembler code (see gcc flags).

TBH I'd stick to C because it will be plenty fast enough, far more readable and far more maintainable.

You can of course mix C & ASM (or even do inline ASM).

John
Edited 2020-10-02 17:18 by JohnS

jirsoft

Guru

Joined: 18/09/2020
Location: Czech Republic
Posts: 533

Posted: 08:33am 02 Oct 2020

Hi John,
thanks for the answer; C being more maintainable is a valid point (readable is, for me, the question), but for C I need to compile the CSub outside of the CMM2 every time.
My idea was to develop a small assembler in BASIC, so I can assemble a source file (SOMETHING.S) into a BASIC file for inclusion in the main app (creating SOMETHING.INC, which will contain the CSub as compiled asm + the source code as a comment). I have already written a quick & dirty one in another BASIC dialect this week, so when I get my CMM2 I can convert it to MMBasic. Code on GitHub ...
But I did it as a simple ARM (not Thumb/Thumb2) assembler (for me much easier: I know it, and all instructions are 32 bit + very orthogonal), and finally (from the ARM doc) I found that the Cortex M7 can't do ARM, just Thumb2 (+Thumb). Maybe I'm wrong, but if not, I need to write a Thumb2 assembler, which will not be as easy and fast (at least for me) because of the instruction chaos and the mix of 16/32 bit.
Of course, an inline assembler in MMBasic would be much better and more convenient, but at least this way all steps can be done directly on the CMM2. And with the small snippets usually needed (most parts in BASIC and just speedups in ASM) it could also be "readable and maintainable".
But as I said, it's just an idea...

Jiri
Napoleon Commander and SimplEd for CMM2 (GitHub), CMM2.fun

matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10772

Posted: 08:55am 02 Oct 2020

The CMM2 is compiled in Thumb2, so any assembler routine needs to be in Thumb2. The calling sequence is that the compiler puts the function parameters in R0, R1, etc. and returns the answer in R0 if applicable. In MMBasic the parameters will always be addresses rather than values. Below is a very simple example routine that I use in the CMM2 for moving a 32-byte block of data. You should push/pop any registers you use to ensure MMBasic isn't corrupted.

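varcopy: push {r5}          @ preserve r5 for the caller
    ldr r5, [r1], #4        @ word 1: load from source, r1 += 4
    str r5, [r0], #4        @ store to destination, r0 += 4
    ldr r5, [r1], #4        @ word 2
    str r5, [r0], #4
    ldr r5, [r1], #4        @ word 3
    str r5, [r0], #4
    ldr r5, [r1], #4        @ word 4
    str r5, [r0], #4
    ldr r5, [r1], #4        @ word 5
    str r5, [r0], #4
    ldr r5, [r1], #4        @ word 6
    str r5, [r0], #4
    ldr r5, [r1], #4        @ word 7
    str r5, [r0], #4
    ldr r5, [r1], #4        @ word 8 (32 bytes in total)
    str r5, [r0], #4
    pop {r5}                @ restore r5
    bx lr                   @ return to the caller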
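
For comparison, here is a rough C sketch of what the routine above does, assuming both arguments point to word-aligned buffers of at least 32 bytes. The name varcopy_c is made up for illustration; this is not the firmware source, only a way to see the r0 = destination, r1 = source roles in a more familiar form.

```c
#include <stdint.h>

/* Hypothetical C counterpart of the varcopy routine above (illustration
   only, not taken from the CMM2 firmware). Copies 32 bytes as eight
   32-bit words; dst plays the role of r0 and src the role of r1. */
void varcopy_c(uint32_t *dst, const uint32_t *src)
{
    for (int i = 0; i < 8; i++) {
        /* Same effect as one "ldr r5,[r1],#4" / "str r5,[r0],#4" pair. */
        *dst++ = *src++;
    }
}
```

Compiling something like this and reading the generated assembler is one way to cross-check the calling sequence described above.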

jirsoft

Guru

Joined: 18/09/2020
Location: Czech Republic
Posts: 533

Posted: 01:07pm 02 Oct 2020

Hi Peter,
thank you very much for the quick explanation, it helps a lot!
So, if I understood correctly, for your example I can write:

CSUB varcopy integer, integer
00000000
20B451F8 045B40F8 045B51F8 045B40F8 045B51F8
045B40F8 045B51F8 045B40F8 045B51F8 045B40F8
045B51F8 045B40F8 045B51F8 045B40F8 045B51F8
045B40F8 045B20BC
70470000
END CSUB

?

My question is: do I need to pad the last 16 bits with 2 bytes of 00 (the trailing 0000 in 70470000)? Up to 10 parameters are allowed, so r0-r9 could be filled with parameters and wouldn't need to be preserved. Why does R5 have to be preserved in your case? And in this case, would the offset be 00000000 (the first line)?
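
The padding question can be made concrete with a small C sketch that packs 16-bit Thumb halfwords into 32-bit words. It assumes, and the thread does not confirm this, that each CSUB hex word is a little-endian 32-bit value, so the earlier halfword ends up in the low 16 bits. The function name pack_thumb_words is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Packs 16-bit Thumb halfwords into 32-bit words. Assumption (not
   confirmed in this thread): a CSUB hex word is a little-endian 32-bit
   value, so the earlier halfword lands in the low 16 bits. An odd
   trailing halfword is padded with 0x0000. Returns the number of words
   written to out. */
size_t pack_thumb_words(const uint16_t *hw, size_t n, uint32_t *out)
{
    size_t words = 0;
    for (size_t i = 0; i < n; i += 2) {
        uint32_t lo = hw[i];
        uint32_t hi = (i + 1 < n) ? hw[i + 1] : 0x0000u; /* pad */
        out[words++] = lo | (hi << 16);
    }
    return words;
}
```

Under that assumption, PUSH {R5} (B420) followed by the first half of the LDR (F851) would be written as the word F851B420, whereas the listing above shows the bytes in memory order (20B451F8). Which form CSUB expects is best checked against a CSUB generated from C. Either way, the padding sits after the BX lr, so it is never executed.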

Thanks for help,
Jiri

Jiri
Napoleon Commander and SimplEd for CMM2 (GitHub), CMM2.fun

matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10772

Posted: 01:38pm 02 Oct 2020

Quote Up to 10 parameters allowed, so r0-r9 could be filled with parameter, so they don't need to be preserved. Why in your case have to be R5 preserved?

This is not a CSUB, it is an internal routine. You can't afford to make assumptions about the compiler, particularly with optimisations turned on. If an internal function is called with only two parameters then it isn't necessarily going to protect unused registers.

AFAIK no-one thus far has written a CSUB in assembler, so you probably want to start by writing a simple one in C, then look at the generated assembler to confirm the calling mechanism, and then copy that across into a test assembler version.

jirsoft

Guru

Joined: 18/09/2020
Location: Czech Republic
Posts: 533

Posted: 03:41pm 02 Oct 2020

OK, thanks for answer.

I will wait for my CMM2 and play in the meantime with the Thumb2 assembler (at least with some limited version)...

Jiri
Napoleon Commander and SimplEd for CMM2 (GitHub), CMM2.fun

jirsoft

Guru

Joined: 18/09/2020
Location: Czech Republic
Posts: 533

Posted: 11:38pm 02 Oct 2020

Hi Peter,
I have played a little bit with CSub assembler files generated by arm-none-eabi-gcc and found a few interesting things:
1. it looks like -O3, compared to -O0, compiles to assembler close to my own hand-written code, so why do you switch the optimisation off? Maybe it can bring a massive speed (and space) improvement...

void plus13(long long int *a)
{
*a = 13 + *a;
}

with -O0:

push {r4, r5, r7}
sub sp, sp, #12
add r7, sp, #0
str r0, [r7, #4]
ldr r3, [r7, #4]
ldrd r2, [r3]
adds r4, r2, #13
adc r5, r3, #0
ldr r3, [r7, #4]
strd r4, [r3]
nop
adds r7, r7, #12
mov sp, r7
@ sp needed
pop {r4, r5, r7}
bx lr

with -O3:

ldrd r3, r2, [r0]
adds r3, r3, #13
adc r2, r2, #0
strd r3, r2, [r0]
bx lr

2. as you can see, the address of the first parameter is put into r0; as long as only r0-r3 are used, nothing needs to be preserved...
3. if you have more than 4 parameters (addresses in r0-r3), the rest are put on the stack
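
A small C sketch of the points above, with made-up names (add_into, plus13_halves). The first function shows the "every parameter is an address" style; going by the findings above, its three pointers would arrive in r0-r2. The second spells out what the ADDS/ADC pair in the -O3 listing does to the two 32-bit halves of a 64-bit value. Both are illustrations under those assumptions, not code from MMBasic.

```c
#include <stdint.h>

/* Hypothetical CSUB-style routine: every argument arrives as an address,
   so inputs are read and the result is written through the pointers. */
void add_into(long long *sum, const long long *a, const long long *b)
{
    *sum = *a + *b;
}

/* The ADDS/ADC pair from the -O3 listing, written out on two 32-bit
   halves: add 13 to the low word, then add the carry to the high word. */
void plus13_halves(uint32_t *lo, uint32_t *hi)
{
    uint32_t old_lo = *lo;
    *lo = old_lo + 13u;                /* adds r3, r3, #13 */
    uint32_t carry = (*lo < old_lo);   /* 1 if the low word wrapped around */
    *hi = *hi + carry;                 /* adc r2, r2, #0 */
}
```

For example, starting from lo = 0xFFFFFFFF and hi = 0, plus13_halves leaves lo = 12 and hi = 1, which is the same result as adding 13 to the 64-bit value 0x00000000FFFFFFFF.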

That's all I have found so far.

Jiri
Napoleon Commander and SimplEd for CMM2 (GitHub), CMM2.fun

matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10772

Posted: 07:36am 03 Oct 2020

Quote why you switch the optimisation off? Maybe it can bring massive speed (and space) improvement...

Off is always safe but any CSUB developer can change the optimisation however they want. In this case it is up to them to ensure their code still works.

I'm probably overcautious about this. The concept of CSubs was initially developed for the PIC32, and the PIC32 compiler is very poor at generating position independent code. In particular, optimisations other than 0 often resulted in unusable code.

The ARM compiler is much better at position independence, but it is still sensible to code with O0 and then, once the code is fully working and debugged, tune the optimisation.

twofingers

Guru

Joined: 02/06/2014
Location: Germany
Posts: 1713

Posted: 11:01pm 03 Oct 2020

Hi Peter,

I can confirm that the -O3 option works without issues (almost).
In my tests the speed advantage was up to 50%.
bitorderreverse -o3.zip
However, I think it was appropriate to start with "-O0" at first.

Kind regards
Michael

causality ≠ correlation ≠ coincidence

jirsoft

Guru

Joined: 18/09/2020
Location: Czech Republic
Posts: 533

Posted: 09:00am 05 Oct 2020

Hi Peter,
just a stupid question: is MMBasic/FW compiled with -O3?

Jiri
Napoleon Commander and SimplEd for CMM2 (GitHub), CMM2.fun

matherp
Guru

Joined: 11/12/2012
Location: United Kingdom
Posts: 10772

Posted: 09:22am 05 Oct 2020

Quote MMBasic/FW is compiled with -O3?

No: some parts use Ofast and other parts use O2 - this is the result of much experimentation. O3 is slower in most cases.

LeoNicolas

Guru

Joined: 07/10/2020
Location: Canada
Posts: 554

Posted: 12:08am 08 Oct 2020

Hey jirsoft and matherp

I'm very interested in understanding how to write CSUB code in C.

I'm creating an API for 3D rendering and for better performance I would like to convert it to csub routines. This is a video showing the current API status:
https://www.youtube.com/watch?v=JPLf6eobqa4&ab_channel=LeonardoNicolas

Do you have a link to good documentation about CSUB?

The questions I have are:

How can I access the arguments (int, float, string, or array) received from the MMBasic call?

How can I output values from the C routine?

How do I compile the C code for the CMM2? Can I use GCC? Which arguments should I use?

Thank you
Leo
Edited 2020-10-08 10:09 by LeoNicolas

twofingers

Guru

Joined: 02/06/2014
Location: Germany
Posts: 1713

Posted: 12:20am 08 Oct 2020

Hi Leo,

I think most of the questions are answered here.

Please also read the older threads for the Micromite 2.

Best regards
Michael

causality ≠ correlation ≠ coincidence

LeoNicolas

Guru

Joined: 07/10/2020
Location: Canada
Posts: 554

Posted: 02:05am 08 Oct 2020

Quote twofingers said: Hi Leo,

I think most of the questions are answered here.

Please also read the older threads for the Micromite 2.

Best regards
Michael

Thank you Michael

Do you know if there is an MMBasic version for Linux?

JohnS
Guru

Joined: 18/11/2011
Location: United Kingdom
Posts: 4185

Posted: 09:59am 08 Oct 2020

There was, akin to the DOS/Windows one and quite old in terms of functions etc, and then also one for (older) RPi boards (which relies on pigpio and some other stuff) - look for picromite.

John
Edited 2020-10-08 20:00 by JohnS
