Notice. New forum software under development. It's going to miss a few functions and look a bit ugly for a while, but I'm working on it full time now as the old forum was too unstable. Couple days, all good. If you notice any issues, please contact me.
|
Forum Index : Microcontroller and PC projects : bit swap in byte
Page 1 of 2 | |||||
Author | Message | ||||
ville56 Senior Member Joined: 08/06/2022 Location: AustriaPosts: 143 |
I have the need to use SPI with LSB first to talk from a Pico to a VFD display. As i didn't find a way to tell the SPI to use LSB first, i need to swap the 8 bits in every byte in code. So far i figured out 2 methods: 1) bit shifting the byte to swap and the swapped byte (AND/OR to read/set bits). Slim, but takes a lot of CPU cycles. 2) table lookup, this requires 256 bytes data space (for 8 bits) but is quite fast. Currently i have method 1 implemented. Works but limits the responsiveness of my app (analog bargraph/level-meter). Don't like to use method 2 as i'm a "save memory space junkie" (old assembler coders disease i suppose). Does someone know of a method which is in between my 2 solutions, meaning more cpu and memory efficient in a combination? Many thanks for any hint ... Regards, Gerald (OE1HGA) Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 73 de OE1HGA, Gerald |
||||
thwill Guru Joined: 16/09/2019 Location: United KingdomPosts: 4092 |
Not tested, but: Use a 16 byte lookup table for 4 bits and then: swapped% = lookup%(x% And &b1111) << 4 + lookup%(x% >> 4) Best wishes, Tom Game*Mite, CMM2 Welcome Tape, Creaky old text adventures |
||||
ville56 Senior Member Joined: 08/06/2022 Location: AustriaPosts: 143 |
Thanks Tom, oh yes, this is a compromise i didn't think of. I have to compare actual execution time, but your approach should be a lot faster. Thank you again, Gerald                                  73 de OE1HGA, Gerald |
||||
matherp Guru Joined: 11/12/2012 Location: United KingdomPosts: 9392 |
Here is a fun solution sb%=(((ob% * &H80200802) AND &H0884422110) * &H0101010101 >> 32) and 255 > ob%=&B11011001 > sb% = (((ob% * &H80200802) AND &H0884422110) * &H0101010101 >> 32) and 255 > ? bin$(sb%) 10011011 > |
||||
Mixtel90 Guru Joined: 05/10/2019 Location: United KingdomPosts: 7023 |
Sheesh..... My brain hurts...... :( Mick Zilog Inside! nascom.info for Nascom & Gemini Preliminary MMBasic docs & my PCB designs |
||||
twofingers Guru Joined: 02/06/2014 Location: GermanyPosts: 1350 |
Hi! I don't know if this CSUB BitOrderReverse helps? Regards Michael causality ≠correlation ≠coincidence |
||||
matherp Guru Joined: 11/12/2012 Location: United KingdomPosts: 9392 |
This one is probably faster for b%=0 to 255:? bin$((b% * &H0202020202 and &H010884422010) mod 1023,8):next |
||||
twofingers Guru Joined: 02/06/2014 Location: GermanyPosts: 1350 |
I like to believe that. EDIT: Your solution is very sophisticated! But the CSub is one option more to solve that problem ... Edited 2022-07-26 17:27 by twofingers causality ≠correlation ≠coincidence |
||||
ville56 Senior Member Joined: 08/06/2022 Location: AustriaPosts: 143 |
well, i've timed the average runtime for 25600 runs (378MHz CPU): - fun solution: 0.0618 mSec - faster fun solution: 0.0584 mSec - my old solution: 0.3368 mSec a factor of almost 6 times faster, impressive. btw, i'm still havin a lot of "fun" figuring out how the fun solutions work. So far, i can see the results of the individual steps, but i do not understand the underlying principle/math. Can matherp please explain that a bit .... ad CSUB: have not tried csub at all until now, must read myself into that topic first. thanks so far to all contributors, have now 2 more solutions and 2 more new myracles :-) Regards, Gerald                                  73 de OE1HGA, Gerald |
||||
matherp Guru Joined: 11/12/2012 Location: United KingdomPosts: 9392 |
Not my work I'm afraid - explained here https://graphics.stanford.edu/~seander/bithacks.html#ReverseByteWith64BitsDiv Edited 2022-07-26 20:36 by matherp |
||||
ville56 Senior Member Joined: 08/06/2022 Location: AustriaPosts: 143 |
thanks for the link, it's now getting clearer to me. Anyway, a very sophisticated algorithm in deed. Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 73 de OE1HGA, Gerald |
||||
Volhout Guru Joined: 05/03/2018 Location: NetherlandsPosts: 4441 |
It is a ingenius algorythm, but only efficient for 64 bit processors. On an 8bit machine the 8 fold if a and 128 then b=b+1 If a and 64 then b=b+2 Etc.... Is MUCH faster... And a 256 entry lookup table is superior, even to the above algorythm. The charm of it is that it is a simple formula, hiding smart thinking... from rhe real math wizards Edited 2022-07-27 08:48 by Volhout PicomiteVGA PETSCII ROBOTS |
||||
phil99 Guru Joined: 11/02/2018 Location: AustraliaPosts: 2247 |
And another, not fast just different. Saved 113 bytes > LIST a%=31 :c%=0 :Print Bin$(a%,8) Timer =0:For n=0 To 7:c%=c%<<1:c%=c%+(a% And 1):a%=a%>>1:Next :Print Timer Print Bin$(c%,8) > RUN 00011111 0.279 11111000 > ? mm.info(cpuspeed) 378000000 > |
||||
Plasmamac Guru Joined: 31/01/2019 Location: GermanyPosts: 554 |
my mistake Edited 2022-07-27 19:41 by Plasmamac Plasma |
||||
twofingers Guru Joined: 02/06/2014 Location: GermanyPosts: 1350 |
Just comparison purposes (This is a recompilation of the CMM2 version): The BitOrderReverse CSUB for PicoMite takes 21us for a byte (8 bit) and 31us (23-34us) for 64 bit integers on a PicoMite at 378MHz (V5.07.05b15). Execution time is not constant and varies somewhat with the number of ones. Teraterm output: BitOrderReverse Demo Input: 10111100101111001011110010111100101111001011110010111100101111 Output for 8 Bits: 11110100 Time for 10000 loops: CSUBs Empty loops| 1x CSUB 250ms 45ms| 21us Basic-source: 'File BitOrderReverse.bas written 10-09-2020 14:03:13 ' v1.01 by twofingers@tbs (thanks to chris@TBS) ' 'a CSUB to reverse the order of bits for PicoMite 'Usage: 'CSUB BitOrderReverse(nInput, nBits, RetValue) Dim integer a, b=0, nBits=8,l=10000 ' 1234567890123456789012345678901234567890123456789012345678901234 ' 1 2 3 4 5 6 a=&b0010111100101111001011110010111100101111001011110010111100101111 Print "BitOrderReverse Demo":Print Print "Input:" Print Bin$(a,nBits) Timer =0 For i = 1 To l BitOrderReverse(a,nBits,b) Next i t0=Timer Timer =0 For i = 1 To l Next i t2=Timer Print "Output for"nBits" Bits:" Print Bin$(b,nBits) Print :Print Print "Time for"l " loops:" Print "CSUBs"," Empty loops|","1x CSUB" Print Cint(t0)"ms",,Cint(t2)"ms|",Cint((t0-t2)/l*1000)"us"; End 'File bitreverse.bas written 27-07-2022 17:49:23 v1.44 (optimized: -ofast) CSub BitOrderReverse 00000000 2300B5F0 46DE4657 4645464E 00104684 B5E02200 60436002 684A680B B085469A DD672A00 681B4663 46992400 685B4663 46982500 91022301 21209003 00274249 D43D1866 40B10019 001E468B 40A64649 400E4640 40014659 D037430E 37014652 22201BD7 18BA4252 0019D43E 91014091 40BA001A 92009803 68476806 9A019900 41571876 60476006 26019902 2700680A 20024692 2100684A 416F1936 414D1824 DC2B4295 4661D031 00346809 46614689 003D6849 21204688 00274249 D5C11866 1B0E2120 40F10019 E7BE468B 27002601 21002002 416F1936 414D1824 DC0D4295 0034D016 E7A8003D 00192220 40D11BD2 E7BD9101 D1012A00 D1932B00 BCF0B005 46B246BB 46A046A9 4554BDF0 E7F5D9CB D9E64554 46C0E7F2 End CSUB C-source: #include "PicoCFunctions.h" void borev(long long *orig, long long *nbits, long long *reversed) { int i = 1; *reversed = 0; unsigned long long one_bit = 1; for(;i<=*nbits;i++){ if ((*orig & (one_bit<<(i-1))) > 0) *reversed += one_bit<<(*nbits-i); } } For infos about CSUBs on PicoMite: https://www.thebackshed.com/forum/ViewTopic.php?TID=14126 Regards Michael Edited 2022-07-28 03:04 by twofingers causality ≠correlation ≠coincidence |
||||
ville56 Senior Member Joined: 08/06/2022 Location: AustriaPosts: 143 |
thnks for the link to csub topics, i have to read that first and figure out how to properly compile the stuff in the end. Interesting that the loop in C is about 3 times faster than the linear shift/multiply in the MMbasic interpreter. C is efficient, if you like it or not. And yes, i don't like it ... ;-) Regards, Gerald                                  73 de OE1HGA, Gerald |
||||
matherp Guru Joined: 11/12/2012 Location: United KingdomPosts: 9392 |
b% * &H0202020202 and &H010884422010) mod 1023 would be even faster implemented as a CSUB. Remember the biggest time cost in MMbasic is not the math but parsing the ascii. This is evidenced by integers only being fractionally quicker than double precision floats. The CSUB has only one parameter so very little to parse hence the speed improvement even with a slower algorithm |
||||
twofingers Guru Joined: 02/06/2014 Location: GermanyPosts: 1350 |
Hi Gerald, I just set up the compiler for the PicoMite CSubs today. On the power of C: I think nowadays you can do most things in MM-Basic and that's fantastic. But sometimes things have to be very fast - this is the time of the CSUBs. Kind regards Michael BTW: This is my Make.bat file. Maybe it helps ...? @ECHO OFF bin\arm-none-eabi-gcc -c -mcpu=cortex-m0 -mfloat-abi=soft -mthumb -Wall -Wno-main -ffunction-sections -O0 -fPIC -I. %1.c -o %1.o IF ERRORLEVEL 1 Goto Done pause bin\arm-none-eabi-ld -nostartfiles -T arm-gcc-link.ld -o %1.elf %1.o IF ERRORLEVEL 1 Goto Done pause MMBasic ArmCfGenV144.bas %1.elf %2 del %1.elf del %1.o :Done Edited 2022-07-28 03:25 by twofingers causality ≠correlation ≠coincidence |
||||
twofingers Guru Joined: 02/06/2014 Location: GermanyPosts: 1350 |
@Peter a%=(b% * &H0202020202 and &H010884422010) mod 1023 works for 8 bits. Yes, it is very fast (15us) ... for 8 bits. But what if we need 64 bits? Best regards Michael Edited 2022-07-28 03:57 by twofingers causality ≠correlation ≠coincidence |
||||
matherp Guru Joined: 11/12/2012 Location: United KingdomPosts: 9392 |
Just do it on each byte in turn and then swap the bytes end to end |
||||
Page 1 of 2 |
Print this page |