The previous fix only worked correctly for values where
the most significant set bit was the only set bit.
This change switches the implementation back to clz and
adjusts the clz result with additional arithmetic.
There is still at least one known limitation regarding
acceptable input types, but it is documented in the code.
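
For illustration only, a minimal sketch of the general technique of
deriving a bit index from clz plus arithmetic. This is not the code
from this change: the helper name msb_index is hypothetical, and the
sketch assumes GCC or Clang, where __builtin_clz takes an unsigned int
and is undefined for 0. A value wider than unsigned int would be
truncated by such a helper, which is one way an input type limitation
can arise.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: zero-based index of the most significant set
 * bit of x. Works for any non-zero value, not only powers of two. */
static unsigned msb_index(unsigned int x)
{
    /* __builtin_clz(0) is undefined, so zero is rejected up front. */
    assert(x != 0);

    /* Number of bits in unsigned int on this platform. */
    const unsigned width = (unsigned)(sizeof(unsigned int) * CHAR_BIT);

    /* clz counts the zero bits above the highest set bit; subtracting
     * that count from the top bit index gives the bit's position. */
    return (width - 1u) - (unsigned)__builtin_clz(x);
}
```

With this sketch, msb_index(8u) and msb_index(10u) both yield 3,
because only the highest set bit affects the result.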