NVSM Health

Getting Started with NVSM Health

Show Health

The nvsm show health command provides a quick assessment of overall system health.

user@dgx-2:~$ sudo nvsm show health

Example output:

      ...
      Checks
      ------
Verify installed DIMM memory sticks.......................... Healthy
Number of logical CPU cores [96]............................. Healthy
GPU link speed [0000:39:00.0][8GT/s]......................... Healthy
GPU link width [0000:39:00.0][x16]........................... Healthy
      ...
      Health Summary
      --------------
      205 out of 205 checks are Healthy
      Overall system status is Healthy

If any health problems are found, they are reflected in the health summary at the end of the nvsm show health output. Details of the individual checks appear above the summary.
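Because the summary ends with a single status line, a script can read the overall result without parsing every check. The following is a minimal sketch, assuming the "Overall system status is ..." line keeps the format shown in the example output above. The helper name overall_status is hypothetical and is not part of NVSM.

```shell
# Hypothetical helper (not part of NVSM): read `nvsm show health` output
# on stdin and print the word(s) following "Overall system status is".
# Assumes the summary line format shown in the example output.
overall_status() {
    sed -n 's/^[[:space:]]*Overall system status is[[:space:]]*//p' | tail -n 1
}

# Example use on a system with NVSM installed:
#   status=$(sudo nvsm show health | overall_status)
#   [ "$status" = "Healthy" ] || echo "Health check needs attention: $status" >&2
```

The helper only filters text, so it can also be run against saved output from an earlier nvsm show health invocation.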

For a complete list of health checks performed by nvsm show health, refer to Table of Health Checks Performed.

Dump Health

The nvsm dump health command produces a health report file suitable for attaching to support tickets.

user@dgx-2:~$ sudo nvsm dump health

Example output:

Writing output to /tmp/nvsm-health-dgx-2-20190430123318.tar.xz
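The archive name in the example appears to embed the host name and a timestamp, so it differs on every run. A script that needs the path can capture it from the command output. The following is a minimal sketch, assuming the "Writing output to" line keeps the format shown above. The helper name dump_path is hypothetical.

```shell
# Hypothetical helper (not part of NVSM): read `nvsm dump health` output
# on stdin and print the archive path from the "Writing output to" line.
dump_path() {
    sed -n 's/^Writing output to[[:space:]]*//p' | tail -n 1
}

# Example use on a system with NVSM installed:
#   archive=$(sudo nvsm dump health | dump_path)
#   echo "Attach $archive to the support ticket"
```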

The file produced by nvsm dump health is an xz-compressed tar archive (.tar.xz). Its contents can be examined with the tar command, as shown in the following example.

user@dgx-2:~$ cd /tmp
user@dgx-2:/tmp$ sudo tar xJf nvsm-health-dgx-2-20190430123318.tar.xz
user@dgx-2:/tmp$ sudo ls ./nvsm-health-dgx-2-20190430123318
date            java         nvsysinfo_commands  sos_reports
df              last         nvsysinfo_log.txt   sos_strings
dmidecode       lib          proc                sys
etc             lsb-release  ps                  uname
free            lsmod        pstree              uptime
hostname        lsof         route               usr
initctl         lspci        run                 var
installed-debs  mount        sos_commands        version.txt
ip_addr         netstat      sos_logs            vgdisplay

This archive includes output from many system logs and commands. For a complete list of logs and commands collected by nvsm dump health, refer to Summary Information.
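To see what a report contains without unpacking it, the archive can be listed in place. The following is a minimal sketch, assuming the archive holds a single top-level directory, as the extraction example above suggests. The helper name list_report_entries is hypothetical. It relies on tar detecting the compression format automatically when reading, which GNU tar and bsdtar both do.

```shell
# Hypothetical helper (not part of NVSM): print the unique entries found
# directly under the archive's top-level directory, without extracting.
list_report_entries() {
    # Strip a leading "./" if present, keep the second path component,
    # drop empty results (the top-level directory itself), and de-duplicate.
    tar -tf "$1" | sed 's|^\./||' | cut -s -d/ -f2 | grep -v '^$' | sort -u
}

# Example use (sudo may be needed if the archive is readable only by root):
#   sudo tar -tf /tmp/nvsm-health-dgx-2-20190430123318.tar.xz | head
#   list_report_entries /tmp/nvsm-health-dgx-2-20190430123318.tar.xz
```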