T-14: GPU-Acceleration of Signal Processing Workflows from Python: Part 1
Adam Thompson, Matthew Nicely, Zoe Ryan
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 03:38:32
In this tutorial, we will introduce developers and users alike to the cuSignal API and demonstrate its performance in both online and offline signal processing workflows. We will also demonstrate how to connect cuSignal to the PyTorch deep learning framework to begin deep learning training and inference tasks without data leaving the GPU. Further, we will devote a significant amount of time to teaching attendees how to build their own GPU kernels within Python. We will provide examples and best practices for transitioning from standard Python code to fast Numba CUDA kernels, profiling the result, and then implementing custom CuPy CUDA kernels for optimal performance. Throughout the tutorial, we will discuss cost-benefit tradeoffs, including the developer learning curve and anticipated performance. The goal of this tutorial is to demonstrate the ease and flexibility of creating and implementing GPU-based, high-performance signal processing workloads from Python.
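To make the first part of this description concrete, the sketch below (not taken from the tutorial materials) filters a signal with cuSignal on the GPU and then hands the result to PyTorch through DLPack so the data never leaves device memory. The signal parameters and the choice of resample_poly are illustrative assumptions; it assumes cusignal, cupy, and torch are installed with CUDA support.

```python
import cupy as cp
import cusignal
import torch
from torch.utils.dlpack import from_dlpack

# Generate a noisy test signal directly on the GPU (illustrative parameters)
fs = 1_000_000                      # sample rate in Hz
t = cp.arange(0, 1.0, 1.0 / fs)
sig = cp.sin(2 * cp.pi * 100e3 * t) + 0.1 * cp.random.randn(t.size)

# Polyphase resampling on the GPU with cuSignal's SciPy-like API
filtered = cusignal.resample_poly(sig, up=1, down=4, window='hamming')

# Zero-copy hand-off to PyTorch via DLPack: the data stays on the GPU
tensor = from_dlpack(filtered.toDlpack())
print(tensor.device, tensor.shape)
```

From here, the tensor can feed directly into a PyTorch training or inference pipeline without a host round trip.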
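The custom-kernel portion can be sketched in the same spirit. The example below is an assumption-laden illustration rather than the tutorial's own code: it implements one elementwise gain kernel twice, first as a Numba CUDA kernel written in Python syntax and then as a CuPy RawKernel written in CUDA C, which is the kind of transition described above.

```python
import cupy as cp
from numba import cuda

# --- Numba path: elementwise scaling kernel written in Python syntax ---
@cuda.jit
def scale_numba(x, out, gain):
    i = cuda.grid(1)
    if i < x.size:
        out[i] = x[i] * gain

x = cp.random.randn(1 << 20, dtype=cp.float32)
out = cp.empty_like(x)
threads = 256
blocks = (x.size + threads - 1) // threads
# CuPy arrays expose __cuda_array_interface__, so Numba kernels accept them directly
scale_numba[blocks, threads](x, out, 2.0)

# --- CuPy path: the same kernel written as raw CUDA C ---
scale_raw = cp.RawKernel(r'''
extern "C" __global__
void scale(const float* x, float* out, float gain, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = x[i] * gain;
}
''', 'scale')

out2 = cp.empty_like(x)
scale_raw((blocks,), (threads,), (x, out2, cp.float32(2.0), cp.int32(x.size)))
assert cp.allclose(out, out2)
```

The Numba version is quicker to write and debug, while the RawKernel version exposes the full CUDA C feature set; weighing that tradeoff against the expected speedup is part of the cost-benefit discussion in the tutorial.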