CMM2: CSUB BitOrderReverse

I wonder if you could make it faster by avoiding pointer indirection inside the loop in your C code, or whether the compiler has already figured that out for you?
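
To illustrate what I mean, here is a minimal sketch. I have not seen how your loop is actually written, so the function names `bit_reverse_indirect` and `bit_reverse_local`, the second `nbits` argument, and the use of `int64_t` pointers (on the assumption that the CSUB receives its MMBasic integer arguments by reference) are all hypothetical. The first version dereferences the pointers on every iteration; the second copies them into locals once and writes the result back at the end.

```c
#include <stdint.h>

/* Hypothetical "before": every iteration reads and writes through the pointers.
   Both pointers have the same type, so the compiler has to assume they might
   alias and may reload *nbits after each store to *value. */
void bit_reverse_indirect(int64_t *value, int64_t *nbits)
{
    uint64_t result = 0;
    for (int64_t i = 0; i < *nbits; i++) {
        result = (result << 1) | ((uint64_t)*value & 1u); /* take the lowest bit */
        *value = (int64_t)((uint64_t)*value >> 1);        /* drop it, via memory */
    }
    *value = (int64_t)result;
}

/* Hypothetical "after": load once into locals, loop on the locals, store once. */
void bit_reverse_local(int64_t *value, int64_t *nbits)
{
    uint64_t v = (uint64_t)*value; /* local working copy of the input */
    const int64_t n = *nbits;      /* loop bound read a single time */
    uint64_t result = 0;
    for (int64_t i = 0; i < n; i++) {
        result = (result << 1) | (v & 1u);
        v >>= 1;
    }
    *value = (int64_t)result;
}
```

Both give the same result as long as the two arguments are distinct variables. Whether the second one is measurably faster depends on the optimisation level, so it would be worth comparing the generated assembly or simply timing both.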