NVSM Health

Command Details

bash

Brief

None

Description

None

Module

dump

Command-line

bash --version 

Timeout

300 seconds.
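
Every entry in this reference lists a timeout of 300 seconds. How NVSM Health enforces that limit is not described here. As a rough sketch only, assuming a plain subprocess call, a collector could bound each command as follows; `run_with_timeout` is a hypothetical helper, not part of NVSM.

```python
import subprocess


def run_with_timeout(argv, timeout=300):
    """Run argv and return (exit_code, stdout).

    exit_code is None when the command exceeds the timeout.
    run_with_timeout is a hypothetical name used for illustration.
    """
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return None, ""
    return done.returncode, done.stdout
```

For example, `run_with_timeout(["bash", "--version"])` would run the command from the entry above with the documented 300-second limit.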

bash_hello_world

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/hello.bash 

Timeout

300 seconds.
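
Several command lines in this reference start with `${NVSMHEALTH_DUMP_TOOLS}`, whose value is not given in this document. The sketch below shows how such a command line could be expanded into an argument list once the variable is known. The helper name `expand_command` and the path `/opt/example/tools` are placeholders chosen for illustration.

```python
import shlex
from string import Template


def expand_command(command_line, env):
    """Expand ${VAR} references using env, then split into an argv list.

    Unknown variables and shell constructs such as $(...) are left untouched.
    """
    expanded = Template(command_line).safe_substitute(env)
    return shlex.split(expanded)


# Placeholder value; the real location is not stated in this reference.
env = {"NVSMHEALTH_DUMP_TOOLS": "/opt/example/tools"}
argv = expand_command("${NVSMHEALTH_DUMP_TOOLS}/hello.bash", env)
print(argv)
```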

collect_fru

Brief

Run ipmitool fru print command

Description

This runs the “ipmitool fru print” command to obtain FRU (field-replaceable unit) information from the BMC (baseboard management controller). FRU information is important for keeping an inventory of the components installed on the system and their serial numbers.

Module

fru

Used By

Command-line

ipmitool fru print 

Timeout

300 seconds.
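
To illustrate the inventory use case, the following sketch turns `ipmitool fru print` text into one dictionary per FRU device and extracts the serial numbers. The sample text is invented and only follows the usual "Key : Value" layout of ipmitool output. `parse_fru` is a hypothetical helper, not the parser NVSM uses.

```python
def parse_fru(text):
    """Return a list of dicts, one per FRU device, from `ipmitool fru print` text."""
    devices = []
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so values such as dates stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "FRU Device Description":
            current = {key: value}
            devices.append(current)
        elif current is not None:
            # Repeated keys (for example several "Product Extra" lines) keep the last value.
            current[key] = value
    return devices


# Invented sample data for illustration only.
SAMPLE = """\
FRU Device Description : Builtin FRU Device (ID 0)
 Board Mfg Date        : Mon Jan  1 00:00:00 2018
 Product Name          : EXAMPLE-SYSTEM
 Product Serial        : SN0001

FRU Device Description : PSU0 (ID 1)
 Product Serial        : SN0002
"""

serials = {d["FRU Device Description"]: d.get("Product Serial") for d in parse_fru(SAMPLE)}
print(serials)
```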

collect_nvsm

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/collect_nvsm.py 

Timeout

300 seconds.

collect_usb_sysfs

Brief

Collect information for connected USB devices from sysfs

Description

None

Module

usb

Command-line

echo TODO 

Timeout

300 seconds.

date

Brief

None

Description

None

Module

dump

Command-line

date 

Timeout

300 seconds.

date_utc

Brief

None

Description

None

Module

dump

Command-line

date --utc 

Timeout

300 seconds.

dcc_ipmitool_sel_writeraw

Brief

None

Description

None

Module

dump

Command-line

ipmitool -I lanplus -H 192.168.1.42 -U nvsm-admin -P None sel writeraw \
    bin_file 

Timeout

300 seconds.

dcc_passgen

Brief

Run dcc_passgen tool

Description

Run dcc_passgen for DCC BMC. This command requires superuser privileges.

Module

dcs_modules

Command-line

dcc_passgen 

Timeout

300 seconds.

dcs_cam_camera_mapping

Brief

None

Description

None

Module

dump

Command-line

python3 ${NVSMHEALTH_DUMP_TOOLS}/dcs_camera_info.py --cmd camera_mapping \
    --display 0 

Timeout

300 seconds.

dcs_cam_gpus_all

Brief

None

Description

None

Module

dump

Command-line

python3 ${NVSMHEALTH_DUMP_TOOLS}/dcs_camera_info.py --cmd gpus_all --display 0 

Timeout

300 seconds.

dcs_cam_query_gpu_info

Brief

None

Description

None

Module

dump

Command-line

python3 ${NVSMHEALTH_DUMP_TOOLS}/dcs_camera_info.py --cmd query_gpu_info 

Timeout

300 seconds.

df

Brief

None

Description

None

Module

dump

Command-line

df -k 

Timeout

300 seconds.

dmesg

Brief

None

Description

None

Module

dump

Command-line

dmesg 

Timeout

300 seconds.

dmidecode

Brief

None

Description

None

Module

dump

Command-line

dmidecode 

Timeout

300 seconds.

docker_info

Brief

None

Description

None

Module

dump

Command-line

docker info 

Timeout

300 seconds.

docker_ps

Brief

None

Description

None

Module

dump

Command-line

docker ps 

Timeout

300 seconds.

dpkg_list

Brief

None

Description

None

Module

dump

Command-line

dpkg --list 

Timeout

300 seconds.

dpkg_verify

Brief

None

Description

None

Module

dump

Command-line

dpkg --verify 

Timeout

300 seconds.

ethtool

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/ethtool.sh 

Timeout

300 seconds.

fru_dcc_version

Brief

Determine system version using DCC version stored in DCS FRU

Description

This command uses ipmitool to read the DCC version stored in the DCS FRU table. On C1.1 systems this will be “1.1”. This command does not require superuser privileges.

Module

sysfs

Command-line

ipmitool fru print 0 | grep -E 'Product Extra(\s+):' | head -n 3 | awk 'NR==3 \
    {{print $4}}' 

Timeout

300 seconds.
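
The pipeline above keeps the "Product Extra" lines of FRU 0, takes the third one, and prints its fourth whitespace-separated field. A Python sketch of the same steps may make that easier to follow. The sample text is invented, and `dcc_version` is a hypothetical name.

```python
import re


def dcc_version(fru_text):
    """Mirror: grep -E 'Product Extra(\\s+):' | head -n 3 | awk 'NR==3 {print $4}'."""
    matches = [ln for ln in fru_text.splitlines() if re.search(r"Product Extra(\s+):", ln)]
    if len(matches) < 3:
        return None  # the shell pipeline would print nothing here
    fields = matches[2].split()
    # Fields are: "Product", "Extra", ":", then the value.
    return fields[3] if len(fields) > 3 else None


# Invented sample data for illustration only.
SAMPLE = """\
 Product Name          : EXAMPLE
 Product Extra         : aaa
 Product Extra         : bbb
 Product Extra         : 1.1
"""

print(dcc_version(SAMPLE))
```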

gcc

Brief

None

Description

None

Module

dump

Command-line

gcc -v 

Timeout

300 seconds.

gds_check

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_GDS_CUDA_PATH}/gds/tools/gdscheck.py -pvV 

Timeout

300 seconds.

gds_stack_trace

Brief

None

Description

None

Module

dump

Command-line

for x in `nvidia-smi --query-compute-apps=pid --format=csv,noheader` ; do cat \
    /proc/$x/task/*/stack; done 

Timeout

300 seconds.

gds_stats

Brief

None

Description

None

Module

dump

Command-line

for x in `nvidia-smi --query-compute-apps=pid --format=csv,noheader` ; do \
    ${NVSMHEALTH_DUMP_GDS_CUDA_PATH}/gds/tools/gds_stats -p $x -l 3; done 

Timeout

300 seconds.

glxinfo

Brief

None

Description

None

Module

dump

Command-line

ldd /usr/bin/glxinfo 

Timeout

300 seconds.

gpp

Brief

None

Description

None

Module

dump

Command-line

g++ -v 

Timeout

300 seconds.

hca_self_test

Brief

None

Description

None

Module

dump

Command-line

hca_self_test.ofed 

Timeout

300 seconds.

ibdev2netdev

Brief

None

Description

None

Module

dump

Command-line

ibdev2netdev 

Timeout

300 seconds.

ibstat

Brief

None

Description

None

Module

dump

Command-line

ibstat 

Timeout

300 seconds.

ibstatus

Brief

None

Description

None

Module

dump

Command-line

ibstatus 

Timeout

300 seconds.

ibv_devinfo

Brief

None

Description

None

Module

dump

Command-line

ibv_devinfo 

Timeout

300 seconds.

ip_addr_show

Brief

None

Description

None

Module

dump

Command-line

ip addr show 

Timeout

300 seconds.

ip_route_show

Brief

None

Description

None

Module

dump

Command-line

ip route show 

Timeout

300 seconds.

ipmitool_bmc_info

Brief

None

Description

None

Module

dump

Command-line

ipmitool bmc info 

Timeout

300 seconds.

ipmitool_chassis_status

Brief

None

Description

None

Module

dump

Command-line

ipmitool chassis status 

Timeout

300 seconds.

ipmitool_fru

Brief

None

Description

None

Module

dump

Command-line

ipmitool fru 

Timeout

300 seconds.

ipmitool_lan_print

Brief

None

Description

None

Module

dump

Command-line

ipmitool lan print 1 

Timeout

300 seconds.

ipmitool_power_led_status

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/ipmitool_power_led_status.sh 

Timeout

300 seconds.

ipmitool_raw

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/ipmitool_raw.sh 

Timeout

300 seconds.

ipmitool_raw_dgxa100

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/ipmitool_raw_dgxa100.sh 

Timeout

300 seconds.

ipmitool_sdr

Brief

None

Description

None

Module

dump

Command-line

ipmitool sdr 

Timeout

300 seconds.

ipmitool_sdr_dump

Brief

None

Description

None

Module

dump

Command-line

out=$(mktemp); ipmitool sdr dump $out > /dev/null 2>&1; cat $out 

Timeout

300 seconds.

ipmitool_sdr_info

Brief

None

Description

None

Module

dump

Command-line

ipmitool sdr info 

Timeout

300 seconds.

ipmitool_sel_elist

Brief

None

Description

None

Module

dump

Command-line

ipmitool sel elist 

Timeout

300 seconds.

ipmitool_sel_info

Brief

None

Description

None

Module

dump

Command-line

ipmitool sel info 

Timeout

300 seconds.

ipmitool_sel_list

Brief

None

Description

None

Module

dump

Command-line

ipmitool sel list 

Timeout

300 seconds.

ipmitool_sel_time_get

Brief

None

Description

None

Module

dump

Command-line

ipmitool sel time get 

Timeout

300 seconds.

ipmitool_sel_writeraw

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/sel_writeraw.sh 

Timeout

300 seconds.

ipmitool_user_list_1

Brief

None

Description

None

Module

dump

Command-line

ipmitool user list 1 

Timeout

300 seconds.

java

Brief

None

Description

None

Module

dump

Command-line

java -version 

Timeout

300 seconds.

java_hello_world

Brief

None

Description

None

Module

dump

Command-line

java -classpath ${NVSMHEALTH_DUMP_TOOLS}/tools hello 

Timeout

300 seconds.

ldconfig

Brief

None

Description

None

Module

dump

Command-line

ldconfig -p 

Timeout

300 seconds.

lsb_release

Brief

None

Description

None

Module

dump

Command-line

lsb_release -a 

Timeout

300 seconds.

lsblk

Brief

None

Description

None

Module

dump

Command-line

lsblk 

Timeout

300 seconds.

lsblk_discard

Brief

None

Description

None

Module

dump

Command-line

lsblk --discard 

Timeout

300 seconds.

lsblk_topology

Brief

None

Description

None

Module

dump

Command-line

lsblk --topology 

Timeout

300 seconds.

lscpu

Brief

None

Description

None

Module

dump

Command-line

lscpu 

Timeout

300 seconds.

lshw

Brief

None

Description

None

Module

dump

Command-line

lshw 

Timeout

300 seconds.

lslocks

Brief

None

Description

None

Module

dump

Command-line

lslocks 

Timeout

300 seconds.

lsmod

Brief

None

Description

None

Module

dump

Command-line

lsmod 

Timeout

300 seconds.

lspci

Brief

None

Description

None

Module

dump

Command-line

lspci -vvn 

Timeout

300 seconds.

lspci_plain

Brief

None

Description

None

Module

dump

Command-line

lspci 

Timeout

300 seconds.

lspci_tree

Brief

None

Description

None

Module

dump

Command-line

lspci -t 

Timeout

300 seconds.

lsusb

Brief

None

Description

None

Module

dump

Command-line

lsusb 

Timeout

300 seconds.

lsusb_tree

Brief

None

Description

None

Module

dump

Command-line

lsusb -t 

Timeout

300 seconds.

lsusb_verbose

Brief

None

Description

None

Module

dump

Command-line

lsusb --verbose 

Timeout

300 seconds.

mdadm_detail

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/mdadm-detail.sh 

Timeout

300 seconds.

mdadm_examine

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/mdadm-examine.sh 

Timeout

300 seconds.

mlx_fetch_arm_log

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/mlnx_arm_logs.sh 

Timeout

300 seconds.

mlxcables

Brief

None

Description

None

Module

dump

Command-line

mst start && mst cable add && mlxcables 

Timeout

300 seconds.

modinfo

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/modinfo.sh 

Timeout

300 seconds.

mount

Brief

None

Description

None

Module

dump

Command-line

mount 

Timeout

300 seconds.

ntpq

Brief

None

Description

None

Module

dump

Command-line

ntpq -p 

Timeout

300 seconds.

numactl

Brief

None

Description

None

Module

dump

Command-line

numactl --hardware 

Timeout

300 seconds.

nvcc

Brief

None

Description

None

Module

dump

Command-line

nvcc --version 

Timeout

300 seconds.

nvidia_address_text

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/nvidia_address_text.py 

Timeout

300 seconds.

nvidia_debugdump

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/nvidia-debugdump.sh 

Timeout

300 seconds.

nvidia_dkms_log

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/nvidia-dkms-log.sh 

Timeout

300 seconds.

nvidia_driver_ko

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/nvidia_driver_ko.py 

Timeout

300 seconds.

nvidia_settings

Brief

None

Description

None

Module

dump

Command-line

nvidia-settings -q all 

Timeout

300 seconds.

nvidia_smi

Brief

None

Description

None

Module

dump

Command-line

nvidia-smi 

Timeout

300 seconds.

nvidia_smi_query

Brief

None

Description

None

Module

dump

Command-line

nvidia-smi -q 

Timeout

300 seconds.

nvidia_smi_query_unit

Brief

None

Description

None

Module

dump

Command-line

nvidia-smi -q -u 

Timeout

300 seconds.

nvidia_smi_topo

Brief

None

Description

None

Module

dump

Command-line

nvidia-smi topo -m 

Timeout

300 seconds.

nvidia_smi_xml

Brief

None

Description

None

Module

dump

Command-line

nvidia-smi -q -x 

Timeout

300 seconds.

nvidia_vm_health_check_show

Brief

None

Description

None

Module

dump

Command-line

nvidia-vm health-check show 

Timeout

300 seconds.

nvidia_vm_image_show

Brief

None

Description

None

Module

dump

Command-line

nvidia-vm image show 

Timeout

300 seconds.

nvidia_vm_resources_show

Brief

None

Description

None

Module

dump

Command-line

nvidia-vm resources show 

Timeout

300 seconds.

nvme_list

Brief

None

Description

None

Module

dump

Command-line

nvme list 

Timeout

300 seconds.

nvme_list

Brief

Collect list of NVMe devices using the nvme-cli tool

Description

None

Module

nvme

Command-line

nvme list --output-format=json 

Timeout

300 seconds.

nvme_logs

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/nvme-logs.sh 

Timeout

300 seconds.

nvsm_health_show_debug

Brief

None

Description

None

Module

dump

Command-line

nvsm-health --show --log-level=debug 

Timeout

300 seconds.

nvsm_show

Brief

None

Description

None

Module

dump

Command-line

nvsm show -level all 

Timeout

300 seconds.

nvsm_show_alerts

Brief

None

Description

None

Module

dump

Command-line

nvsm show alerts 

Timeout

300 seconds.

nvsm_show_debug

Brief

None

Description

None

Module

dump

Command-line

nvsm --log-level=debug show -level all 

Timeout

300 seconds.

ofed_info

Brief

None

Description

None

Module

dump

Command-line

ofed_info 

Timeout

300 seconds.

perl

Brief

None

Description

None

Module

dump

Command-line

perl -v 

Timeout

300 seconds.

perl_hello_world

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/hello.pl 

Timeout

300 seconds.

ping_compute

Brief

None

Description

None

Module

dump

Command-line

ping -w 5 ngc.nvidia.com 

Timeout

300 seconds.

printenv

Brief

None

Description

None

Module

dump

Command-line

printenv 

Timeout

300 seconds.

ps

Brief

None

Description

None

Module

dump

Command-line

ps -wwo pid,uid,pcpu,pmem,etime,state,ppid,user,args --pid 2 --ppid 2 \
    --deselect 

Timeout

300 seconds.

ps_aux

Brief

None

Description

None

Module

dump

Command-line

ps aux 

Timeout

300 seconds.

psu_info_dgx1

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/psu_info_dgx1.sh 

Timeout

300 seconds.

python

Brief

None

Description

None

Module

dump

Command-line

python --version 

Timeout

300 seconds.

python_hello_world

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/hello.py 

Timeout

300 seconds.

run_bmc_boot_slot_task

Brief

Run ipmitool raw 0x3C 0x3 0x0

Description

Get the BMC boot slot. This command requires superuser privileges.

Module

cec_info

Command-line

ipmitool raw 0x3C 0x3 0x0 

Timeout

300 seconds.

run_cec_boot_status

Brief

Run ipmitool raw 0x3C 0x68 0x00

Description

Get boot status. This command requires superuser privileges.

Module

cec_info

Command-line

ipmitool raw 0x3C 0x68 0x00 

Timeout

300 seconds.

run_cec_version

Brief

Run ipmitool raw 0x3C 0xF 0x9

Description

Get CEC version. This command requires superuser privileges.

Module

cec_info

Command-line

ipmitool raw 0x3C 0xF 0x9 

Timeout

300 seconds.

run_dmidecode

Brief

Run the dmidecode command

Description

Verify system as described by SMBIOS/DMI using the dmidecode tool

Module

dmidecode

Command-line

dmidecode 

Timeout

300 seconds.

run_dmidecode_memory

Brief

Run the dmidecode command

Description

Run the “dmidecode” command to get memory DMI type information. Some flags are added to output in a machine-readable format. This command does not require superuser privileges.

Module

dmidecode

Command-line

dmidecode --type memory 

Timeout

300 seconds.

run_dpkg_grep_kvm

Brief

Run dpkg list and grep for kvm package

Description

None

Module

kvm

Used By

Command-line

bash -c "dpkg -l | grep -c dgx-kvm-sw" 

Timeout

300 seconds.

run_gpu_monitor_status

Brief

Execute GET on nvsm_core

Description

This runs the “nvsm_core --mode=client GET /nvsm/v1/Systems/1/GPUs” command to obtain gpumonitor status information.

Module

vgpu

Command-line

nvsm_core --mode=client GET /nvsm/v1/Systems/1/GPUs 

Timeout

300 seconds.

run_ipmi_fru

Brief

Run ipmitool fru print command

Description

This runs the “ipmitool fru print” command to obtain FRU (field-replaceable unit) information from the BMC (baseboard management controller). FRU information is important for keeping an inventory of the components installed on the system and their serial numbers.

Module

ipmitool

Depends On

Command-line

ipmitool fru print 

Timeout

300 seconds.

run_ipmi_getenables

Brief

Run ipmitool mc getenables command

Description

Check BMC status with ipmitool. This command requires superuser privileges.

Module

ipmitool

Depends On

Command-line

ipmitool mc getenables 

Timeout

300 seconds.

run_ipmi_info

Brief

Run ipmitool mc info command

Description

Check BMC status with ipmitool. This command requires superuser privileges.

Command-line

ipmitool mc info 

Timeout

300 seconds.

run_ipmi_sdr_elist

Brief

Run ipmitool sdr elist command

Description

Check BMC BOM devices with ipmitool. This command requires superuser privileges.

Module

ipmitool

Command-line

ipmitool sdr elist 

Timeout

300 seconds.

run_ipmi_sensor

Brief

Run ipmitool sensor command

Description

Check BMC sensor status with ipmitool. This command requires superuser privileges.

Module

ipmitool

Command-line

ipmitool sensor 

Timeout

300 seconds.
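
As an illustration of checking sensor status, the sketch below parses the pipe-separated rows that `ipmitool sensor` prints (name, reading, unit, status, then thresholds) and lists sensors whose status is neither "ok" nor "na". The sample rows are invented. Skipping hexadecimal statuses, which discrete sensors report, is an assumption of this sketch rather than documented NVSM behavior.

```python
def parse_sensors(text):
    """Return a list of dicts with name, reading, unit and status per sensor row."""
    rows = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4:
            continue
        rows.append(
            {"name": fields[0], "reading": fields[1], "unit": fields[2], "status": fields[3]}
        )
    return rows


def flagged(rows):
    """Names of threshold sensors whose status is not 'ok' or 'na' (assumed rule)."""
    return [
        r["name"]
        for r in rows
        if r["status"] not in ("ok", "na") and not r["status"].startswith("0x")
    ]


# Invented sample data for illustration only.
SAMPLE = """\
CPU Temp         | 45.000     | degrees C  | ok    | na        | 5.000     | 10.000    | 90.000    | 95.000    | na
Fan1             | 0.000      | RPM        | cr    | na        | 500.000   | 1000.000  | na        | na        | na
PSU Status       | 0x0        | discrete   | 0x0180| na        | na        | na        | na        | na        | na
"""

print(flagged(parse_sensors(SAMPLE)))
```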

run_ipmitool

Brief

Run the ipmitool command

Description

This simply runs the “ipmitool” command to make sure that ipmitool is able to access the BMC (baseboard management controller).

Module

bmc

Command-line

ipmitool 

Timeout

300 seconds.

run_lsblk_scsi_device_info

Brief

Run the lsblk utility

Description

Run the “lsblk” utility to get information about SCSI block devices. The -P flag prints the output as key="value" pairs.

Module

lsblk

Command-line

lsblk -S -P -o NAME,HCTL,TYPE,VENDOR,MODEL,REV,TRAN 

Timeout

300 seconds.
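
The -P flag in the command line above makes lsblk print one line of KEY="value" pairs per device. A minimal sketch of reading that format follows. The sample lines are invented, and `parse_lsblk_pairs` is a hypothetical helper name.

```python
import shlex


def parse_lsblk_pairs(text):
    """Parse `lsblk -P` output into a list of dicts, one per device line."""
    devices = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # shlex handles the double quotes, including values that contain spaces.
        pairs = (token.partition("=") for token in shlex.split(line))
        devices.append({key: value.strip() for key, _, value in pairs})
    return devices


# Invented sample data for illustration only.
SAMPLE = (
    'NAME="sda" HCTL="0:0:0:0" TYPE="disk" VENDOR="ATA     " '
    'MODEL="EXAMPLE DISK 1" REV="1.0" TRAN="sata"\n'
    'NAME="sdb" HCTL="1:0:0:0" TYPE="disk" VENDOR="ATA     " '
    'MODEL="EXAMPLE DISK 2" REV="1.0" TRAN="sata"\n'
)

devices = parse_lsblk_pairs(SAMPLE)
print([(d["NAME"], d["MODEL"]) for d in devices])
```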

run_lscpu

Brief

Run lscpu command

Description

Verify hyperthreading and NUMA are enabled

Module

lscpu

Used By

Command-line

lscpu 

Timeout

300 seconds.
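
This reference does not say how the hyperthreading and NUMA check is carried out. One plausible sketch, assuming the check looks at the "Thread(s) per core" and "NUMA node(s)" fields of lscpu, is shown below. The thresholds (more than one thread per core, more than one NUMA node) are assumptions, and the sample output is invented.

```python
def parse_lscpu(text):
    """Turn `lscpu` "Key: value" lines into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def smt_and_numa(info):
    """Return (hyperthreading_enabled, numa_enabled) under the assumed thresholds."""
    smt = int(info.get("Thread(s) per core", "1")) > 1
    numa = int(info.get("NUMA node(s)", "1")) > 1
    return smt, numa


# Invented sample data for illustration only.
SAMPLE = """\
Architecture:        x86_64
CPU(s):              80
Thread(s) per core:  2
Socket(s):           2
NUMA node(s):        2
"""

print(smt_and_numa(parse_lscpu(SAMPLE)))
```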

run_lspci

Brief

Run the lspci command

Description

Run the “lspci” command to list PCI devices. The -vmm flag prints the output in a machine-readable format, and -nn shows both names and numeric IDs for vendors and devices. This command does not require superuser privileges.

Module

lspci

Used By

Command-line

lspci -vmm -nn 

Timeout

300 seconds.
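
With -vmm, lspci prints one "Tag:" line per attribute and separates devices with a blank line. With -nn, the numeric ID follows each name in square brackets. The sketch below parses that layout and selects devices by vendor ID. The sample records are invented except for 10de, which is NVIDIA's PCI vendor ID. The helper names are hypothetical.

```python
import re


def parse_lspci_vmm(text):
    """Parse `lspci -vmm -nn` output into a list of dicts, one per device."""
    devices = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            # Split on the first colon only; slot addresses contain colons too.
            tag, sep, value = line.partition(":")
            if sep:
                record[tag.strip()] = value.strip()
        if record:
            devices.append(record)
    return devices


def vendor_id(record):
    """Extract the bracketed four-digit hex vendor ID, or None if absent."""
    match = re.search(r"\[([0-9a-f]{4})\]\s*$", record.get("Vendor", ""))
    return match.group(1) if match else None


# Invented sample data for illustration only.
SAMPLE = (
    "Slot:\t00:00.0\n"
    "Class:\tHost bridge [0600]\n"
    "Vendor:\tExample Vendor [1234]\n"
    "Device:\tExample Bridge [5678]\n"
    "\n"
    "Slot:\t01:00.0\n"
    "Class:\t3D controller [0302]\n"
    "Vendor:\tNVIDIA Corporation [10de]\n"
    "Device:\tExample GPU [abcd]\n"
)

nvidia_slots = [d["Slot"] for d in parse_lspci_vmm(SAMPLE) if vendor_id(d) == "10de"]
print(nvidia_slots)
```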

run_lspci_n

Brief

Run the lspci command

Description

Run the “lspci” command to list PCI devices. The -vmm flag prints the output in a machine-readable format, and -n shows vendor and device codes as numbers instead of names. This command does not require superuser privileges.

Module

lspci

Used By

Command-line

lspci -vmm -n 

Timeout

300 seconds.

run_lspci_verbose

Brief

Run the lspci command with verbose flags

Description

Run the “lspci” command with verbose flags to show detailed information about PCI devices. This command requires superuser privileges in order to read privileged PCI device registers. Much of the verbose output from lspci is not necessarily in a machine-readable format.

Module

lspci

Command-line

lspci -vvv -nn -D 

Timeout

300 seconds.

run_mlxfwmanager

Brief

Collect firmware version details for Mellanox devices using Mellanox Firmware Manager

Description

None

Module

mlnx

Command-line

mlxfwmanager --query-format xml 

Timeout

300 seconds.

run_net_ifconfig

Brief

Run ifconfig command to show all network interfaces

Description

List all network interfaces, including those that are down

Module

net

Command-line

ifconfig -a 

Timeout

300 seconds.

run_nvidia_smi_gpu_bus_id

Brief

Collect GPUs identified by the NVIDIA System Management Interface (nvidia-smi) tool

Description

None

Module

nvidia_smi

Command-line

nvidia-smi --query-gpu=gpu_bus_id --format=csv,noheader 

Timeout

300 seconds.
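
With `--format=csv,noheader`, the command above prints one bus ID per line. The sketch below reads that list and, as an optional step, rewrites each ID in the shorter form that `lspci -D` uses (four-digit domain, lowercase hex). It assumes that nvidia-smi reports an eight-digit domain whose leading digits are zero. The sample IDs are invented and the helper names are hypothetical.

```python
def parse_bus_ids(text):
    """Return the non-empty lines of nvidia-smi csv,noheader output."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def to_lspci_style(bus_id):
    """Convert e.g. 00000000:3B:00.0 to 0000:3b:00.0.

    Assumes the upper four domain digits are zero, so dropping them loses nothing.
    """
    domain, bus, devfn = bus_id.lower().split(":")
    return f"{domain[-4:]}:{bus}:{devfn}"


# Invented sample data for illustration only.
SAMPLE = "00000000:06:00.0\n00000000:3B:00.0\n"

print([to_lspci_style(b) for b in parse_bus_ids(SAMPLE)])
```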

run_nvidia_smi_p2p_topology

Brief

Collect GPU P2P topology using the nvidia-smi tool

Description

None

Module

nvidia_smi

Command-line

nvidia-smi topo -p2p rw 

Timeout

300 seconds.

run_nvidia_smi_topology

Brief

Collect GPUDirect topology using the nvidia-smi tool

Description

None

Module

nvidia_smi

Command-line

nvidia-smi topo --matrix 

Timeout

300 seconds.

run_smartctl_scan

Brief

Run the smartctl utility

Description

Run the “smartctl” utility to scan for devices. Some flags are added to output in a machine-readable format. This command requires superuser privileges.

Module

smartctl

Command-line

smartctl --scan 

Timeout

300 seconds.
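
Each line of `smartctl --scan` output names a device and its `-d` type, followed by a `#` comment. A short sketch of pulling out the device and type follows. The sample lines are invented but follow that layout, and `parse_smartctl_scan` is a hypothetical helper name.

```python
def parse_smartctl_scan(text):
    """Return (device_path, device_type) tuples from `smartctl --scan` output."""
    devices = []
    for line in text.splitlines():
        # Drop the trailing "# ..." comment, then split "/dev/sda -d scsi".
        spec = line.split("#", 1)[0].split()
        if len(spec) >= 3 and spec[1] == "-d":
            devices.append((spec[0], spec[2]))
    return devices


# Invented sample data for illustration only.
SAMPLE = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
"""

print(parse_smartctl_scan(SAMPLE))
```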

run_storcli_pall

Brief

Run the storcli command

Description

None

Module

storcli

Command-line

storcli64 /c0/pall show all J 

Timeout

300 seconds.

run_storcli_vall

Brief

Run the storcli command

Description

None

Module

storcli

Command-line

storcli64 /c0/vall show all J 

Timeout

300 seconds.

run_storcli_version

Brief

Run the storcli command

Description

None

Module

storcli

Command-line

storcli64 -v -NoLog 

Timeout

300 seconds.

run_xl_info

Brief

Run the “xl info” command for XenServer information

Description

The “xl info” command prints basic information about the running XenServer hypervisor.

Module

xenserver

Used By

Command-line

xl info 

Timeout

300 seconds.

service_cachefilesd_status

Brief

None

Description

None

Module

dump

Command-line

service cachefilesd status 

Timeout

300 seconds.

service_status_all

Brief

None

Description

None

Module

dump

Command-line

service --status-all 

Timeout

300 seconds.

smartctl

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/smartctl.sh 

Timeout

300 seconds.

smartctl_scan

Brief

None

Description

None

Module

dump

Command-line

smartctl --scan 

Timeout

300 seconds.

storcli_cmds

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/storcli_cmds.sh 

Timeout

300 seconds.

sysctl

Brief

None

Description

None

Module

dump

Command-line

sysctl -a 

Timeout

300 seconds.

sysfs_dmi_bios_version

Brief

Determine BIOS version in DMI table via sysfs

Description

This command reads the BIOS version stored in the DMI table via sysfs. The value shows which BIOS version the system (e.g. DGX-1, DGX-2, or DGX Station) is running. This command does not require superuser privileges.

Module

sysfs

Command-line

cat /sys/devices/virtual/dmi/id/bios_version 

Timeout

300 seconds.

sysfs_dmi_product_name

Brief

Determine product name in DMI table via sysfs

Description

This command reads the product name stored in the DMI table via sysfs. The product name is used to determine which platform NVSysinfo is running on, e.g. DGX-1, DGX-2, or DGX Station. This command does not require superuser privileges.

Module

sysfs

Command-line

cat /sys/devices/virtual/dmi/id/product_name 

Timeout

300 seconds.

sysfs_dmi_system_vendor

Brief

Determine system vendor in DMI table via sysfs

Description

This command reads the system vendor name (sometimes also called “Manufacturer”) stored in the DMI table via sysfs. On DGX systems this will be “NVIDIA”, but it might be some other string depending on the system. This command does not require superuser privileges.

Module

sysfs

Command-line

cat /sys/devices/virtual/dmi/id/sys_vendor 

Timeout

300 seconds.
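
The sysfs_dmi_* entries each read one small text file under /sys/devices/virtual/dmi/id. The same values can be read directly, as in the sketch below. `read_dmi_field` is a hypothetical helper, and its `base` parameter exists only so the function can be pointed at a different directory for testing.

```python
from pathlib import Path

DMI_ID_DIR = Path("/sys/devices/virtual/dmi/id")


def read_dmi_field(name, base=DMI_ID_DIR):
    """Return the stripped contents of base/name, or None if it cannot be read."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        return None


# Field names taken from the command lines in this reference.
for field in ("bios_version", "product_name", "sys_vendor"):
    print(field, "=", read_dmi_field(field))
```

On a system without these sysfs files the helper returns None for each field instead of raising.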

timedatectl_status

Brief

None

Description

None

Module

dump

Command-line

timedatectl status 

Timeout

300 seconds.

top

Brief

None

Description

None

Module

dump

Command-line

top -b -n 5 

Timeout

300 seconds.

ulimit

Brief

None

Description

None

Module

dump

Command-line

bash -c "ulimit -a" 

Timeout

300 seconds.

uname

Brief

None

Description

None

Module

dump

Command-line

uname -a 

Timeout

300 seconds.

uptime

Brief

Run uptime command

Description

Check system uptime with the uptime utility

Module

system

Command-line

uptime -p 

Timeout

300 seconds.

virsh_list_all

Brief

None

Description

None

Module

dump

Command-line

virsh list --all 

Timeout

300 seconds.

xenserver_status_report

Brief

None

Description

None

Module

dump

Command-line

${NVSMHEALTH_DUMP_TOOLS}/xenserver-status-report.sh 

Timeout

300 seconds.

xl_info

Brief

None

Description

None

Module

dump

Command-line

xl info 

Timeout

300 seconds.

xrandr

Brief

None

Description

None

Module

dump

Command-line

xrandr --verbose 

Timeout

300 seconds.

xset

Brief

None

Description

None

Module

dump

Command-line

xset -q 

Timeout

300 seconds.