previously, every classification went through an intermediate
conversion to long double, so the extern __fpclassifyl function was
always invoked, which prevented virtually all optimization of these
operations.
with the new code, tests on constant float or double arguments compile
to a constant 0 or 1, and tests on non-constant expressions are
efficient. I may later add support for the __builtin versions on
compilers that provide them.