Nsight Systems and Nsight Compute

The Nsight tools consist of Nsight Systems and Nsight Compute [1]. Nsight Systems profiles a whole application (Pascal and newer GPUs). Nsight Compute profiles individual CUDA kernels (Volta and newer GPUs).

Environment

  • Host
  • Remote on Singularity
    • Singularity==3.8.1
    • Debian-based container: From: nvidia/cuda:11.2.1-devel-ubuntu20.04

Install Nsight Tools on host

After installation, set up PATH:

export PATH=${PATH}:/path/to/NVIDIA-Nsight-Compute
export PATH=${PATH}:/path/to/nsight-systems-2021.4.1/bin
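As a quick sanity check after editing PATH, a minimal sketch like the following reports where each command-line tool and GUI launcher used in this post resolves. The function name report_tools is only an illustrative helper, not part of the Nsight tools.

```shell
# Print the resolved location of each Nsight command, or a note if it is missing.
report_tools() {
    for tool in nsys nsys-ui ncu ncu-ui; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found on PATH"
        fi
    done
}

report_tools
```

If a tool is reported as missing, the corresponding export line above probably points at the wrong directory.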

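Install Nsight Tools in a Singularity container (remote)

Set perf_event_paranoid to 2 on the remote machine so that Nsight Systems can collect CPU samples. The first command applies the setting immediately, and the second makes it persist across reboots: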

sudo sh -c 'echo 2 >/proc/sys/kernel/perf_event_paranoid'
sudo sh -c 'echo kernel.perf_event_paranoid=2 > /etc/sysctl.d/local.conf'
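To confirm that the setting took effect, the current level can be read back from /proc. The sketch below assumes, as this post does, that a level of 2 or lower is what Nsight Systems needs; check_paranoid is a hypothetical helper name.

```shell
# Classify a perf_event_paranoid level: 2 or lower is treated as acceptable here.
check_paranoid() {
    if [ "$1" -le 2 ]; then
        echo "ok"
    else
        echo "too restrictive"
    fi
}

paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    level=$(cat "$paranoid_file")
    echo "perf_event_paranoid=$level: $(check_paranoid "$level")"
else
    echo "cannot read $paranoid_file"
fi
```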

Append the following in a Singularity definition file:

# container.def

%post
    # Nsight Systems 2021.4.1 and Nsight Compute 2021.3.0
    apt-get update -y
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        apt-transport-https \
        ca-certificates \
        gnupg \
        wget
    rm -rf /var/lib/apt/lists/*
    wget -qO - https://developer.download.nvidia.com/devtools/repos/ubuntu2004/amd64/nvidia.pub | apt-key add -
    echo "deb https://developer.download.nvidia.com/devtools/repos/ubuntu2004/amd64/ /" >> /etc/apt/sources.list.d/nsight.list
    apt-get update -y
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        nsight-systems-2021.4.1 \
        nsight-compute-2021.3.0
    rm -rf /var/lib/apt/lists/*

%environment
    export PATH=${PATH}:/opt/nvidia/nsight-compute/2021.3.0/
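The definition file still has to be built into an image before it can be run. What follows is a minimal sketch, assuming the definition file is saved as container.def (a hypothetical name) and the image is named container.sif as in the rest of this post. build_image is an illustrative helper; the sudo reflects that building a definition with a %post section typically needs root.

```shell
# Build a SIF image from a definition file, with basic guards.
# $1: definition file (hypothetical name, e.g. container.def)
# $2: output image (e.g. container.sif)
build_image() {
    def_file=$1
    sif_file=$2
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not found on PATH" >&2
        return 127
    fi
    if [ ! -f "$def_file" ]; then
        echo "missing definition file: $def_file" >&2
        return 1
    fi
    sudo singularity build "$sif_file" "$def_file"
}

# Usage (run on the remote machine):
#   build_image container.def container.sif
```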

Grant the CAP_SYS_ADMIN capability to the user (named user here) on the remote machine:

sudo singularity capability add --user user CAP_SYS_ADMIN

Profile with Nsight Systems on remote

Run the container with the SYS_ADMIN capability:

singularity run --nv --add-caps=SYS_ADMIN container.sif
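Before profiling, it may help to confirm inside the container that the sampling environment is usable. nsys provides a status subcommand with an --environment option for this. The sketch below wraps it in a hypothetical helper, check_nsys_env, whose only addition is a guard for nsys missing from PATH.

```shell
# Report whether the current environment is suitable for Nsight Systems sampling.
check_nsys_env() {
    if ! command -v nsys >/dev/null 2>&1; then
        echo "nsys not found on PATH"
        return 1
    fi
    nsys status --environment
}

check_nsys_env || echo "environment check did not pass"
```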

Profile a.out and save the report to out.qdrep:

nsys profile --force-overwrite=true --stats=true -o out.qdrep ./a.out

Display profiling report by Nsight Systems on host

Launch the Nsight Systems GUI:

nsys-ui

Open out.qdrep to display the profiling timeline.

[Figure: Nsight Systems GUI showing out.qdrep]

In the GUI, right-click a kernel and click "Analyze the Selected Kernel with Nsight Compute". A command for profiling the selected kernel with Nsight Compute is then displayed, like the one below:

# for example
ncu --kernel-name cuda_parallel_launch_constant_memory --launch-skip 10 --launch-count 1 "./a.out"

Profile a kernel with Nsight Compute on remote

Profile the kernel and save the report to report.ncu-rep (append -o report --set full -f to the above command):

ncu --kernel-name cuda_parallel_launch_constant_memory --launch-skip 10 --launch-count 1 -o report --set full -f "./a.out"
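The rewrite of the GUI-suggested command can be captured in a small helper. This is only a sketch: ncu_full_cmd is a hypothetical function that prints the command for review rather than running it, and its arguments stand in for the kernel name, launch skip count, report name and executable from the example above.

```shell
# Print an ncu command that profiles one launch of a kernel with the full
# section set and writes (overwriting) <report>.ncu-rep.
# $1: kernel name, $2: launches to skip, $3: report name, $4: executable
ncu_full_cmd() {
    printf 'ncu --kernel-name %s --launch-skip %s --launch-count 1 -o %s --set full -f "%s"\n' \
        "$1" "$2" "$3" "$4"
}

ncu_full_cmd cuda_parallel_launch_constant_memory 10 report ./a.out
```

Printing the command first makes it easy to check the kernel name and skip count before spending time on a full profiling run.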

Display profiling report by Nsight Compute on host

Launch the Nsight Compute GUI:

ncu-ui

Open report.ncu-rep to display the kernel profiling report.

[Figure: Nsight Compute GUI showing report.ncu-rep]

Misc

Use nvprof-style commands through nsys:

nsys nvprof ./a.out

  1. “Nsight” seems to be pronounced N-sight