notes.org 7.45 KiB

active

make rigid body object device pointer

organization

RigidBodyController (RBC) holds device pointers, manages force evaluation and integration

Opportunities for memory bandwidth savings

each block should (ideally) contain a compact set of density grid points

cache (automatically!?) potential grid lookups and coefficients!

each block should have same transformation matrices applied to each grid point?!

each block could have same inverse transformation matrix applied to each grid point

how well this performs will depend on the number

but also simplifies reductions
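
A minimal host-side C++ sketch of the idea above, not the actual kernel: every point in a block is mapped through the same inverse transformation, and each block produces one partial sum, so the final reduction runs over blocks only. The names `Vec3`, `Transform`, `blockPartialSum`, `reduceBlocks` and the `lookup` callback (standing in for a potential grid lookup) are hypothetical, and a single transform is assumed for all blocks to keep the example short.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical affine transform: row-major 3x3 matrix plus translation.
struct Transform {
    std::array<float, 9> m;
    Vec3 t;
};

inline Vec3 apply(const Transform& T, const Vec3& p) {
    return { T.m[0] * p.x + T.m[1] * p.y + T.m[2] * p.z + T.t.x,
             T.m[3] * p.x + T.m[4] * p.y + T.m[5] * p.z + T.t.y,
             T.m[6] * p.x + T.m[7] * p.y + T.m[8] * p.z + T.t.z };
}

// One "block": all n points share the same inverse transform, so the matrix
// is read once and the block yields a single partial sum.
template <typename F>
float blockPartialSum(const Transform& inv, const Vec3* pts,
                      const float* weights, std::size_t n, F&& lookup) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        sum += weights[i] * lookup(apply(inv, pts[i]));
    return sum;
}

// Final reduction over the per-block partial sums.
template <typename F>
float reduceBlocks(const Transform& inv, const std::vector<Vec3>& pts,
                   const std::vector<float>& weights,
                   std::size_t blockSize, F&& lookup) {
    assert(pts.size() == weights.size() && blockSize > 0);
    float total = 0.0f;
    for (std::size_t start = 0; start < pts.size(); start += blockSize) {
        const std::size_t n = std::min(blockSize, pts.size() - start);
        total += blockPartialSum(inv, pts.data() + start,
                                 weights.data() + start, n, lookup);
    }
    return total;
}
```

On a device, each call to `blockPartialSum` would presumably correspond to one thread block, with the transform held in shared or constant memory; that mapping is an assumption, not something these notes settle.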

new data structure for grids?

each grid contains blocks of data of a size optimized for the device

Each thread can operate on multiple data points if needed (any advantage?)
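
One possible shape for such a grid, sketched in host-side C++ under stated assumptions: the grid is cut into cubic tiles and each tile is stored contiguously, so a block working on one tile touches a compact range of memory. `BlockedGrid`, its members, and the `tile` edge length (the device-dependent tuning knob) are hypothetical names; the tile edge must be nonzero, and the last tiles along each axis are simply padded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tiled grid: tile x tile x tile points per block, tile-major.
struct BlockedGrid {
    std::size_t nx, ny, nz;   // grid dimensions in points
    std::size_t tile;         // tile edge length (assumed tuning parameter, > 0)
    std::size_t tx, ty, tz;   // number of tiles along each axis
    std::vector<float> data;  // storage padded to whole tiles

    BlockedGrid(std::size_t nx_, std::size_t ny_, std::size_t nz_,
                std::size_t tile_)
        : nx(nx_), ny(ny_), nz(nz_), tile(tile_),
          tx((nx_ + tile_ - 1) / tile_),
          ty((ny_ + tile_ - 1) / tile_),
          tz((nz_ + tile_ - 1) / tile_),
          data(tx * ty * tz * tile_ * tile_ * tile_, 0.0f) {}

    std::size_t pointsPerTile() const { return tile * tile * tile; }

    // Index of the tile that contains grid point (i, j, k).
    std::size_t tileIndex(std::size_t i, std::size_t j, std::size_t k) const {
        return ((i / tile) * ty + (j / tile)) * tz + (k / tile);
    }

    // Flat offset of (i, j, k): tile base plus position inside the tile.
    std::size_t offset(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < nx && j < ny && k < nz);
        const std::size_t local =
            ((i % tile) * tile + (j % tile)) * tile + (k % tile);
        return tileIndex(i, j, k) * pointsPerTile() + local;
    }

    float& at(std::size_t i, std::size_t j, std::size_t k) {
        return data[offset(i, j, k)];
    }
    float at(std::size_t i, std::size_t j, std::size_t k) const {
        return data[offset(i, j, k)];
    }
};
```

With this layout a thread that handles several points could walk consecutive offsets inside one tile; whether that helps is the open question noted above.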

questions

Q: overhead of dynamic parallelism?

Where does it make sense to have a kernel call subkernels? A: At least to synchronize blocks

Q: overhead for classes?

Q: could algorithm use shared memory through persistent threads?

bring in rigid body integrator