Misc. GPU improvements
- GPU backend fixes.
- Emit float infinity properly.
- Emit thread_block_tile variables at start of function.
- RT backend fixes.
- Lower intrinsics.
- Emit float infinity properly.
- Add max / min to monoid reduction cleaning (utilities for working with the smallest/largest values of a datatype).
- Don't outline scalar constants (TODO: inline these interprocedurally later).
- Optimize edge detection schedules, especially GPU.
- Emit a 2-level max reduction tree.
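
As a rough illustration of two items above (the max / min monoid utilities and the 2-level max reduction tree), here is a minimal CPU-side sketch in C++. It is not the code the compiler emits. The names max_identity, tile_maxima, and two_level_max are hypothetical, and the tile loop only stands in for whatever per-block or per-tile reduction the GPU schedule actually performs. The assumption is that each tile's accumulator starts at the max monoid's identity, the smallest value of the datatype: negative infinity for floats, the lowest representable value for integers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Identity element of the max monoid: max(identity, x) == x for every x.
// Floating-point types use -infinity; integer types use their lowest value.
// (Hypothetical helper, named here for illustration only.)
template <typename T>
constexpr T max_identity() {
    return std::numeric_limits<T>::has_infinity
               ? -std::numeric_limits<T>::infinity()
               : std::numeric_limits<T>::lowest();
}

// Level 1: reduce each tile of up to `tile_size` elements to one partial max.
// On a GPU, each tile would plausibly map to a thread block or a block tile.
template <typename T>
std::vector<T> tile_maxima(const std::vector<T>& data, std::size_t tile_size) {
    assert(tile_size > 0);
    std::vector<T> partials;
    for (std::size_t start = 0; start < data.size(); start += tile_size) {
        const std::size_t end = std::min(start + tile_size, data.size());
        T acc = max_identity<T>();  // start from the monoid identity
        for (std::size_t i = start; i < end; ++i) {
            acc = std::max(acc, data[i]);
        }
        partials.push_back(acc);
    }
    return partials;
}

// Level 2: reduce the per-tile partial maxima to a single result.
// An empty input yields the identity, which is what makes max a monoid.
template <typename T>
T two_level_max(const std::vector<T>& data, std::size_t tile_size) {
    T acc = max_identity<T>();
    for (const T& partial : tile_maxima(data, tile_size)) {
        acc = std::max(acc, partial);
    }
    return acc;
}
```

For example, calling two_level_max on the floats {1.0, 7.5, -2.0, 3.0, 4.0} with a tile size of 2 first produces the partial maxima {7.5, 3.0, 4.0} and then reduces them to 7.5. Starting every accumulator at the identity means partially filled tiles and empty inputs need no special handling.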