-
-
Notifications
You must be signed in to change notification settings - Fork 8
Open
Labels
bugSomething isn't workingSomething isn't workingquestionFurther information is requestedFurther information is requested
Description
A linear CUDA grid can have 2^31-1 blocks (in the x dimension), each of size 1024 elements, for a total of a little under 2^41 threads. Currently, our types and grid_info functions assume all dimensions can fit within 32-bit integers... and that is not the case.
At the same time, it is costly to default to use 64-bit values for dimensions when a kernel author knows that the dimensions don't actually exceed 32-bits. (Limiting to 16 bits is less useful, since NVIDIA GPU cores don't operate faster on 16-bit integers).
So, we need to figure out how to support over-32-bit dimensions while not forcing them on users by default. Currently we simply do not support this.
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't workingquestionFurther information is requestedFurther information is requested