International Conference on Software Technology and Engineering (ICSTE 2012)

Thermal and energy constraints are driving a paradigm shift in computing. The GPU has become an integral co-processor in the high-performance computing domain. The GPU is designed for highly data-parallel applications, but NVIDIA's Fermi architecture now pioneers concurrent kernel execution, which facilitates task parallelism. We analyze the performance of concurrent kernels under various allocations of computational resources and propose a framework to help us allocate those resources optimally.

1. Introduction
2. Background
3. Analysis Framework
4. Experimental Results and Conclusion
