Added profileStart to GrandBrownTown.cu and allowed nccl_broadcast to overlap with computeForce on GPU 0