Executive Summary
When writing a CUDA kernel, we usually think about the kernel launch configuration and device pointers, but seldom about the factors that affect occupancy. This whitepaper discusses how the kernel configuration, register usage, memory usage, and the concept of a warp affect occupancy and, in turn, how occupancy affects kernel performance.
Project Highlights
Occupancy is the ratio of the number of active warps per multiprocessor to the maximum number of active warps the multiprocessor can support. A warp is the unit in which threads are scheduled and executed; it consists of 32 threads on the NVIDIA Kepler architecture.
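
To make the ratio concrete, here is a minimal host-side sketch of how theoretical occupancy could be estimated from a kernel's resource usage. The names SmLimits and theoreticalOccupancy are hypothetical and do not come from the whitepaper. The default limits are assumed to match a Kepler multiprocessor of compute capability 3.5 (64 warps, 16 blocks, 65536 registers, 48 KB of shared memory). The sketch also ignores the hardware's register and shared memory allocation granularity, so real values can be slightly lower.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-multiprocessor limits. The defaults are assumed to
// correspond to Kepler (compute capability 3.5); adjust for other devices.
struct SmLimits {
    int warpSize = 32;           // threads per warp
    int maxWarps = 64;           // maximum active warps per multiprocessor
    int maxBlocks = 16;          // maximum resident blocks per multiprocessor
    int registers = 65536;       // 32-bit registers per multiprocessor
    int sharedMemBytes = 49152;  // shared memory per multiprocessor (48 KB)
};

// Estimates occupancy = active warps / maximum active warps.
// Simplification (assumption): allocation granularity is not modeled.
double theoreticalOccupancy(int threadsPerBlock, int regsPerThread,
                            int sharedBytesPerBlock,
                            const SmLimits& sm = SmLimits{}) {
    assert(threadsPerBlock > 0 && regsPerThread >= 0 && sharedBytesPerBlock >= 0);

    // A partially filled warp still occupies a full warp slot.
    int warpsPerBlock = (threadsPerBlock + sm.warpSize - 1) / sm.warpSize;

    // Number of blocks that fit under each individual resource limit.
    int byWarps = sm.maxWarps / warpsPerBlock;
    int regsPerBlock = regsPerThread * sm.warpSize * warpsPerBlock;
    int byRegs = regsPerBlock > 0 ? sm.registers / regsPerBlock : sm.maxBlocks;
    int byShared = sharedBytesPerBlock > 0
                       ? sm.sharedMemBytes / sharedBytesPerBlock
                       : sm.maxBlocks;

    // The tightest limit decides how many blocks are resident at once.
    int blocks = std::min({sm.maxBlocks, byWarps, byRegs, byShared});
    return static_cast<double>(blocks * warpsPerBlock) / sm.maxWarps;
}
```

Under these assumed limits, theoreticalOccupancy(256, 32, 0) evaluates to 1.0, while doubling register usage to 64 per thread, theoreticalOccupancy(256, 64, 0), halves it to 0.5. A block size of 32 threads gives only 0.25, because the limit of 16 resident blocks is reached before the warp slots are filled.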