Looking at generated assembly, it turns out the optimizer is not smart
enough to get rid of the loop - use `copyMem` instead.
At least the compiler is smart enough to constant-propagate runtime
endian direction, resolving the review comment.
Also clarify why a minimum length is enfored - it could perhaps be
revisited, but that would leave a slightly odd API.
the `array` overloads are actually unnecessary with an optimizing
compiler - as long as it can prove the length, any extra checks will go
away on their own
also add `initCopyFrom`
* document optimizations
* random helpers
* arrayops as a home for small array/openArray utilities
* assign2 - a replacement for genericAssign and assignment operators in
general which in nim are very slow
* use assign2 in a few places to speed things up
* fixes
* fixes