CUDA
CUDA (Compute Unified Device Architecture) is an extension of C++ (with bindings for many other languages, including Fortran, Java, and Python) designed for writing programs that offload functions to GPUs for acceleration.
CUDA was introduced in 2006 with NVIDIA’s Tesla architecture, and using it effectively requires understanding how the GPU is organized. CUDA is composed of software (language, API, and runtime), firmware (drivers and runtime), and hardware (a CUDA-enabled GPU).
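To give a feel for the C++ extension, here is a minimal sketch of a CUDA program: the `__global__` qualifier marks a function that runs on the GPU, and the `<<<blocks, threads>>>` launch syntax is the main addition to the language.

```cuda
#include <cstdio>

// Kernel: runs on the GPU; each thread computes its own global index.
__global__ void hello() {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    printf("Hello from thread %d\n", idx);
}

int main() {
    hello<<<2, 4>>>();        // launch 2 blocks of 4 threads each
    cudaDeviceSynchronize();  // wait for the GPU to finish before exiting
    return 0;
}
```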
CUDA provides two main APIs for GPU device management:
- the CUDA Driver API, a low-level interface offering fine-grained control
- the CUDA Runtime API, a higher-level interface built on top of the Driver API
The two APIs are mutually exclusive: an application should use one or the other. On top of these, CUDA includes libraries that implement popular algorithms and functions, enhancing productivity and performance.
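A minimal sketch contrasting the two APIs on the same task (allocating device memory); the two halves are shown side by side purely for comparison, a real application would pick one API. Error checking is omitted for brevity, and the Driver API calls require linking against `libcuda`.

```cuda
#include <cuda.h>          // Driver API: cu* functions, explicit contexts
#include <cuda_runtime.h>  // Runtime API: cuda* functions, implicit context

int main() {
    // Runtime API: one call; the context is managed implicitly.
    float *d_buf = nullptr;
    cudaMalloc(&d_buf, 1024 * sizeof(float));
    cudaFree(d_buf);

    // Driver API: initialization, device, and context are all explicit.
    cuInit(0);
    CUdevice dev;
    cuDeviceGet(&dev, 0);
    CUcontext ctx;
    cuCtxCreate(&ctx, 0, dev);
    CUdeviceptr d_ptr;
    cuMemAlloc(&d_ptr, 1024 * sizeof(float));
    cuMemFree(d_ptr);
    cuCtxDestroy(ctx);
    return 0;
}
```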
CUDA libraries
Examples of GPU-accelerated CUDA libraries that optimize performance and enhance software productivity include (a short Thrust sketch follows the list):
- Math Libraries: cuBLAS, cuFFT, CUDA Math Library, etc.
- Parallel Algorithm Libraries: Thrust
- Image and Video Libraries: nvJPEG, Video Codec SDK
- Communication Libraries: NVSHMEM, NCCL
- Deep Learning Libraries: cuDNN, TensorRT
- Partner Libraries: OpenCV, FFmpeg, ArrayFire
Detailed list and more info: NVIDIA GPU-accelerated libraries
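As one example from the list, a minimal Thrust sketch: the `device_vector` container owns GPU memory, and `thrust::sort` runs in parallel on the device.

```cuda
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>

int main() {
    int data[] = {5, 2, 9, 1, 7};
    thrust::host_vector<int> h(data, data + 5);  // host-side container
    thrust::device_vector<int> d = h;            // copies host -> device
    thrust::sort(d.begin(), d.end());            // parallel sort on the GPU
    h = d;                                       // copies device -> host
    return 0;
}
```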
Common Library Workflow (a cuBLAS sketch follows the list):
- Create a library-specific handle.
- Allocate device memory for inputs/outputs.
- Convert inputs to the library-specific format.
- Execute the library function.
- Retrieve outputs and convert them if necessary.
- Release CUDA resources and continue with the application.
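A minimal cuBLAS sketch following those six steps, scaling a vector by a constant with `cublasSscal` (compile with `nvcc ... -lcublas`; error checking omitted for brevity):

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <vector>

int main() {
    const int n = 4;
    std::vector<float> h_x = {1.0f, 2.0f, 3.0f, 4.0f};

    // 1. Create a library-specific handle.
    cublasHandle_t handle;
    cublasCreate(&handle);

    // 2. Allocate device memory for inputs/outputs.
    float *d_x = nullptr;
    cudaMalloc(&d_x, n * sizeof(float));

    // 3. Convert/transfer inputs to the library's expected layout.
    cublasSetVector(n, sizeof(float), h_x.data(), 1, d_x, 1);

    // 4. Execute the library function: x = alpha * x.
    const float alpha = 2.0f;
    cublasSscal(handle, n, &alpha, d_x, 1);

    // 5. Retrieve outputs back to the host.
    cublasGetVector(n, sizeof(float), d_x, 1, h_x.data(), 1);

    // 6. Release CUDA resources and continue with the application.
    cudaFree(d_x);
    cublasDestroy(handle);
    return 0;
}
```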