4
votes

For example, when I use a multi-GPU system with CUDA C/C++ and GPUDirect 2.0 P2P, with nested PCI Express switches as shown in the picture, I need to know how many switches lie between any two GPUs, identified by their PCI Bus IDs, in order to optimize data transfers and the distribution of computation.

Or, if I already know the hardware PCIe topology including the PCIe switches, I need to know which physical PCIe slot on the board each GPU card is connected to.

As far as I know, even if I already know the hardware PCIe topology including the PCIe switches, the following identifiers are not hard-bound to the PCIe slots on the board; they may change from one run of the system to the next:

  • CUDA device_id
  • nvidia-smi/nvml GPU id
  • PCI Bus ID

What is the best way to discover the PCIe bus topology, with a detailed device tree and the PCIe slot number on the board, on both Windows and Linux?

7
@Robert Crovella Thank you. But nvidia-smi topo works only on Linux, not on Windows, and I need it on Windows. Also, nvidia-smi topo -m reports only four possible values for any two GPUs, indicating the number of PCIe switches between them: SOC (0 + QPI), PHB (0), PXB (1 or more), PIX (1 internal). If I have a topology with 2, 3 or more levels of PCIe switches, I can't use it. - Alex
Maybe you should be talking to your system vendor. They might have a proprietary hardware abstraction layer or library you could use. The types of system architectures you are asking about are pretty exotic and out of the realms of most standard setups and tools. - talonmies

7 Answers

6
votes

PCI devices (endpoints) have a unique address. This address has 3 parts:

  • BusID
  • DeviceID
  • FunctionID

For example, function 3 of device 12 on bus 3 is written in BDF notation as 03:0C.3. Extended BDF notation adds a domain (mostly 0000) as a prefix: 0000:03:0c.3.
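As a small illustration of the notation, here is a minimal Python sketch that parses and formats BDF addresses. The names `PciAddress`, `parse_bdf` and `format_bdf` are made up for this example, and it assumes the fields are written in hexadecimal, as in the examples above.

```python
import re
from typing import NamedTuple


class PciAddress(NamedTuple):
    """Hypothetical container for one PCI address (all fields are integers)."""
    domain: int
    bus: int
    device: int
    function: int


# Optional "dddd:" domain prefix, then "bb:dd.f"; every field is hexadecimal.
_BDF_RE = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{4,}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<device>[0-9a-fA-F]{2})\."
    r"(?P<function>[0-7])$"
)


def parse_bdf(text: str) -> PciAddress:
    """Parse '03:0C.3' or '0000:03:0c.3'; a missing domain is taken as 0."""
    m = _BDF_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a BDF address: {text!r}")
    return PciAddress(
        int(m.group("domain") or "0", 16),
        int(m.group("bus"), 16),
        int(m.group("device"), 16),
        int(m.group("function"), 16),
    )


def format_bdf(addr: PciAddress) -> str:
    """Format an address in extended BDF notation, lower-case hex."""
    return f"{addr.domain:04x}:{addr.bus:02x}:{addr.device:02x}.{addr.function:x}"
```

With this sketch, `parse_bdf("03:0C.3")` yields bus 3, device 12, function 3, which matches the example above.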

Linux lists these devices in /sys/bus/pci/devices:

paebbels@debian8:~$ ll /sys/bus/pci/devices/
drwxr-xr-x 2 root root 0 Aug 19 11:44 .
drwxr-xr-x 5 root root 0 Aug  5 15:14 ..
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:00:00.0 -> ../../../devices/pci0000:00/0000:00:00.0
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:00:01.0 -> ../../../devices/pci0000:00/0000:00:01.0
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:00:07.0 -> ../../../devices/pci0000:00/0000:00:07.0
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:00:07.1 -> ../../../devices/pci0000:00/0000:00:07.1
...
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:00:18.6 -> ../../../devices/pci0000:00/0000:00:18.6
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:00:18.7 -> ../../../devices/pci0000:00/0000:00:18.7
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:02:00.0 -> ../../../devices/pci0000:00/0000:00:11.0/0000:02:00.0
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:02:01.0 -> ../../../devices/pci0000:00/0000:00:11.0/0000:02:01.0
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:02:02.0 -> ../../../devices/pci0000:00/0000:00:11.0/0000:02:02.0
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:02:03.0 -> ../../../devices/pci0000:00/0000:00:11.0/0000:02:03.0
lrwxrwxrwx 1 root root 0 Aug 19 11:44 0000:03:00.0 -> ../../../devices/pci0000:00/0000:00:15.0/0000:03:00.0

Here you can see that sysfs lists devices 00 to 03 of bus 02 as connected to bus 00, device 11, function 0.

From this information, you can rebuild the complete PCI bus tree. The tree is always the same after a boot, unless you add or remove devices.
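To make the tree reconstruction concrete, here is a minimal Python sketch that works on the symlink targets shown in the listing above. The function names (`parent_map`, `ancestors`, `bridges_between`) are made up for this example. Note that it counts PCI bridges, not physical switch chips: a PCIe switch appears in the tree as one upstream-port bridge plus one downstream-port bridge per port.

```python
from typing import Dict, List, Optional


def parent_map(sysfs_targets: List[str]) -> Dict[str, Optional[str]]:
    """Map each device address to its parent's address.

    Devices that sit directly on a root bus get None as parent.
    """
    parents: Dict[str, Optional[str]] = {}
    for target in sysfs_targets:
        # Keep only components shaped like "dddd:bb:dd.f"; this drops "..",
        # "devices" and the root entry such as "pci0000:00".
        parts = [p for p in target.split("/") if p.count(":") == 2 and "." in p]
        for i, dev in enumerate(parts):
            parents[dev] = parts[i - 1] if i > 0 else None
    return parents


def ancestors(dev: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the bridges above dev, nearest first."""
    chain: List[str] = []
    node = parents.get(dev)
    while node is not None:
        chain.append(node)
        node = parents.get(node)
    return chain


def bridges_between(a: str, b: str,
                    parents: Dict[str, Optional[str]]) -> Optional[int]:
    """Count the bridges on the path from a to b via their nearest common bridge.

    Returns None if the two devices share no bridge, i.e. the path has to
    cross the root complex.
    """
    up_a = ancestors(a, parents)
    up_b = ancestors(b, parents)
    for i, bridge in enumerate(up_a):
        if bridge in up_b:
            # Bridges below the common one on both sides, plus the common one.
            return i + up_b.index(bridge) + 1
    return None
```

On a Linux system, the list of targets could be collected by calling `os.readlink` on every entry of /sys/bus/pci/devices. For the listing above, 0000:02:00.0 and 0000:02:01.0 share the single bridge 0000:00:11.0, while 0000:02:00.0 and 0000:03:00.0 only meet at the root.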

The Windows Device Manager offers the same information. The property dialog shows the device type, vendor and location, e.g. PCI bus 0, device 2, function 0 for an integrated Intel HD 4600 graphics adapter.

Currently, I don't know how to get this information from a script or programming language on Windows, but there are commercial and free tools on the internet that provide it. Maybe there is an API.

5
votes

Here is a version of the script that does not need to parse the registry. All of the information used here is available from Win32_PnPEntity.

Function Get-BusFunctionID { 

    gwmi -namespace root\cimv2 -class Win32_PnPEntity |% {

        if ($_.PNPDeviceID -like "PCI\*") {

            $locationInfo = $_.GetDeviceProperties('DEVPKEY_Device_LocationInfo').deviceProperties.Data

            if ($locationInfo -match 'PCI bus (\d+), device (\d+), function (\d+)') {

                new-object psobject -property @{ 
                    "Name"       = $_.Name
                    "PnPID"      = $_.PNPDeviceID
                    "BusID"      = $matches[1]
                    "DeviceID"   = $matches[2]
                    "FunctionID" = $matches[3]
                }
            }
        }
    }
} 
1
votes

On Windows you can use, for example, the following PowerShell script together with the devcon.exe tool from the Windows Driver Kit (WDK):

Function Get-BusFunctionID { 
    $Devices = .\devcon.exe find PCI\*

    for($i=0; $i -lt $Devices.length; $i++) { 

        # Skip lines that do not start with a PCI device ID (e.g. devcon's trailing summary line)
        if ($Devices[$i] -notmatch '^PCI\\') {
            continue
        }
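        # Split "<device id> : <name>" once and trim the padding around the colon
        $DevInfo = $Devices[$i] -split ':', 2 | ForEach-Object { $_.Trim() }
        $deviceId = $DevInfo[0]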
        $locationInfo = (get-itemproperty -path "HKLM:\SYSTEM\CurrentControlSet\Enum\$deviceID" -name locationinformation).locationINformation

        $businfo = Resolve-PCIBusInfo -locationInfo $locationinfo 

        new-object psobject -property @{ 
            "Name"        = $DevInfo[1];
            "PnPID"       = $DevInfo[0]
            "PCIBusID"      = $businfo.BusID; 
            "PCIDeviceID"   = $businfo.DeviceID; 
            "PCIFunctionID" = $businfo.FunctionID 
        } 
    }
}

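Function Resolve-PCIBusInfo { 
    param ( 
        [parameter(ValueFromPipeline=$true,Mandatory=$true)] 
        [string] 
        $locationInfo 
    ) 
    PROCESS { 
        # Extract the first "n,n,n" triple (bus, device, function) from the location string
        [void]($locationInfo -match "\d+,\d+,\d+")
        $busId,$deviceID,$functionID = $matches[0] -split "," 

        # Cast to [int] so that Sort-Object orders the values numerically, not as strings
        new-object psobject -property @{ 
            "BusID"      = [int]$busID
            "DeviceID"   = [int]$deviceID
            "FunctionID" = [int]$functionID
        } 
    }
}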

Usage example:

Get-BusFunctionID | Where-Object {$_.PCIBusID -eq 0 -and $_.PCIDeviceID -eq 0} | Format-Table
Get-BusFunctionID | Sort-Object PCIBusID, PCIDeviceID, PCIFunctionID | Format-Table -GroupBy PCIBusID
Get-BusFunctionID | Sort-Object PCIBusID, PCIDeviceID, PCIFunctionID | Out-GridView
1
votes

For Windows, here is a ready-to-run PowerShell script:

Function Get-BusFunctionID { 
    #gwmi -query "SELECT * FROM Win32_PnPEntity"
    $Devices = get-wmiobject -namespace root\cimv2 -class Win32_PnPEntity

    for($i=0; $i -lt $Devices.length; $i++) { 

        # Skip devices whose PnP ID does not start with "PCI\"
        if ($Devices[$i].PNPDeviceID -notmatch '^PCI\\') {
            continue
        }
        $deviceId = $Devices[$i].PNPDeviceID
        $locationInfo = (get-itemproperty -path "HKLM:\SYSTEM\CurrentControlSet\Enum\$deviceID" -name locationinformation).locationINformation

        $businfo = Resolve-PCIBusInfo -locationInfo $locationinfo 

        new-object psobject -property @{ 
            "Name"        = $Devices[$i].Name;
            "PnPID"       = $Devices[$i].PNPDeviceID
            "PCIBusID"      = $businfo.BusID; 
            "PCIDeviceID"   = $businfo.DeviceID; 
            "PCIFunctionID" = $businfo.FunctionID 
        } 
    }
}

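Function Resolve-PCIBusInfo { 
    param ( 
        [parameter(ValueFromPipeline=$true,Mandatory=$true)] 
        [string] 
        $locationInfo 
    ) 
    PROCESS { 
        # Extract the first "n,n,n" triple (bus, device, function) from the location string
        [void]($locationInfo -match "\d+,\d+,\d+")
        $busId,$deviceID,$functionID = $matches[0] -split "," 

        # Cast to [int] so that Sort-Object orders the values numerically, not as strings
        new-object psobject -property @{ 
            "BusID"      = [int]$busID
            "DeviceID"   = [int]$deviceID
            "FunctionID" = [int]$functionID
        } 
    }
}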

Usage example:

Get-BusFunctionID | Where-Object {$_.PCIBusID -eq 0 -and $_.PCIDeviceID -eq 0} | Format-Table
Get-BusFunctionID | Sort-Object PCIBusID, PCIDeviceID, PCIFunctionID | Format-Table -GroupBy PCIBusID
1
votes

For network adapters, starting with Windows 8 or Windows Server 2012, you can use the WMI class MSFT_NetAdapterHardwareInfoSettingData:

gwmi -Namespace root\standardcimv2 MSFT_NetAdapterHardwareInfoSettingData | Format-Table Description,BusNumber,DeviceNumber,FunctionNumber

Description                        BusNumber DeviceNumber FunctionNumber
-----------                        --------- ------------ --------------
Red Hat VirtIO Ethernet Adapter #6         0           17              0
Red Hat VirtIO Ethernet Adapter #3         0            9              0
Red Hat VirtIO Ethernet Adapter #5         0           16              0
Red Hat VirtIO Ethernet Adapter #2         0            8              0
Red Hat VirtIO Ethernet Adapter #7         0           18              0
Red Hat VirtIO Ethernet Adapter #8         0           19              0
Red Hat VirtIO Ethernet Adapter            0            3              0
Red Hat VirtIO Ethernet Adapter #4         0           10              0

...or the PowerShell cmdlet Get-NetAdapterHardwareInfo:

Get-NetAdapterHardwareInfo | Format-Table Description,Bus,Device,Function

Description                        Bus Device Function
-----------                        --- ------ --------
Red Hat VirtIO Ethernet Adapter #6   0     17        0
Red Hat VirtIO Ethernet Adapter #3   0      9        0
Red Hat VirtIO Ethernet Adapter #5   0     16        0
Red Hat VirtIO Ethernet Adapter #2   0      8        0
Red Hat VirtIO Ethernet Adapter #7   0     18        0
Red Hat VirtIO Ethernet Adapter #8   0     19        0
Red Hat VirtIO Ethernet Adapter      0      3        0
Red Hat VirtIO Ethernet Adapter #4   0     10        0
0
votes

On Windows, the device number and function number are also encoded in the last part of the PNPDeviceID, which is easier to obtain. However, the encoding is not documented.

Example: PCI\VEN_8086&DEV_2829&SUBSYS_00000000&REV_02\3&267A616A&0&68.

68 (hex) may be the device number 13 (dec) shifted left by 3 bits, with the lowest 3 bits holding the function number 0. Since the function is usually 0, most PNPDeviceID values end in 0 or 8. The 3-bit offset may come from the layout of the PCI CONFIG_ADDRESS register:

The format of CONFIG_ADDRESS is the following:

0x80000000 | bus << 16 | device << 11 | function << 8 | offset

See also Intel chipset documentation, page 5:

  1. The device’s “PCI device number” must be written into bits [15:11] of I/O location CF8h

The 0 before 68 may indicate the bus number.
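The guess above can be written down as a minimal Python sketch. Since the encoding is undocumented, this only illustrates the hypothesis (device number in the upper bits, function number in the lowest 3 bits); the function names `decode_pnp_suffix` and `config_address` are made up for this example.

```python
from typing import Tuple


def decode_pnp_suffix(pnp_device_id: str) -> Tuple[int, int]:
    """Decode the last '&'-separated field of a PNPDeviceID.

    Assumption (not documented by Microsoft): the field is a hex number
    laid out as (device << 3) | function.
    """
    last_field = pnp_device_id.rsplit("&", 1)[-1]
    value = int(last_field, 16)
    return value >> 3, value & 0x7


def config_address(bus: int, device: int, function: int, offset: int = 0) -> int:
    """Build a CONFIG_ADDRESS value using the formula quoted above."""
    return 0x80000000 | (bus << 16) | (device << 11) | (function << 8) | offset


# The example ID from above decodes to device 13, function 0.
example_id = r"PCI\VEN_8086&DEV_2829&SUBSYS_00000000&REV_02\3&267A616A&0&68"
device, function = decode_pnp_suffix(example_id)

# Bits 15:8 of CONFIG_ADDRESS hold (device << 3) | function, which is the
# same byte as the suffix if the hypothesis is right.
suffix_byte = (config_address(0, device, function) >> 8) & 0xFF
```

For the example ID, `suffix_byte` comes out as 0x68, which is consistent with the idea that the suffix mirrors bits 15:8 of CONFIG_ADDRESS.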

0
votes

Among the device properties of Win32_PnPEntity, in the registry key, and in the output of devcon.exe there is an Address property. It is a 32-bit integer (DWORD) whose upper 16 bits hold the device number and whose lower 16 bits hold the function number:

PS C:\Users\Administrator> (gwmi win32_pnpEntity).getDeviceProperties('DEVPKEY_Device_Address').deviceProperties | ?{$_.data} | Format-Table DeviceID,keyName,{$_.data -shr 16},{$_.data -band 0xFFFF}

DeviceID                                                               keyName                $_.data -shr 16 $_.data -band 0xFFFF
--------                                                               -------                --------------- --------------------
PCI\VEN_8086&DEV_7000&SUBSYS_00000000&REV_00\3&267A616A&0&08           DEVPKEY_Device_Address               1                    0
STORAGE\VOLUME\{EF420E5C-5744-11E9-92B5-806E6F6E6963}#0000000000100000 DEVPKEY_Device_Address               0                    1
PCI\VEN_80EE&DEV_CAFE&SUBSYS_00000000&REV_00\3&267A616A&0&20           DEVPKEY_Device_Address               4                    0
PCI\VEN_8086&DEV_2829&SUBSYS_00000000&REV_02\3&267A616A&0&68           DEVPKEY_Device_Address              13                    0
SCSI\DISK&VEN_VBOX&PROD_HARDDISK\4&2617AEAE&0&000000                   DEVPKEY_Device_Address               0                65535
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&18           DEVPKEY_Device_Address               3                    0
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&40           DEVPKEY_Device_Address               8                    0
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&48           DEVPKEY_Device_Address               9                    0
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&50           DEVPKEY_Device_Address              10                    0
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&80           DEVPKEY_Device_Address              16                    0
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&88           DEVPKEY_Device_Address              17                    0
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&90           DEVPKEY_Device_Address              18                    0
PCI\VEN_1AF4&DEV_1000&SUBSYS_00011AF4&REV_00\3&267A616A&0&98           DEVPKEY_Device_Address              19                    0
PCI\VEN_80EE&DEV_BEEF&SUBSYS_00000000&REV_00\3&267A616A&0&10           DEVPKEY_Device_Address               2                    0
SCSI\CDROM&VEN_VBOX&PROD_CD-ROM\4&2617AEAE&0&010000                    DEVPKEY_Device_Address               1                65535
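The split used in the Format-Table expressions above can be sketched in Python as follows. The function name `decode_device_address` is made up for this example, and the device/function reading is assumed to apply only to PCI devices; the non-PCI rows in the output above show that other buses use the value differently.

```python
from typing import Tuple


def decode_device_address(address: int) -> Tuple[int, int]:
    """Split a 32-bit Address value into (upper 16 bits, lower 16 bits).

    For PCI devices these are read as (device number, function number).
    """
    return (address >> 16) & 0xFFFF, address & 0xFFFF
```

For example, an Address value of 0x000D0000 splits into device 13 and function 0, matching the row for PCI\VEN_8086&DEV_2829 in the output above.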

See also the question about the meaning of the address and its answer, and my answer to a related question.