
I'm using several PCIe 3.0 expansion cards (GPUs and InfiniBand interconnects). I'm wondering how lanes are actually managed and whether I can optimize my devices by changing ports or by using adapters (x16 -> x8). An Intel Haswell-EP CPU can manage 40 PCIe 3.0 lanes. On Intel's schematics, the PCIe 3.0 controller seems to be split into two x16 and one x8 sub-bridges.

On some commercial schematics for the Haswell-EP CPU, one can read:

Up to 40 PCIe Gen3 Lanes 2x16 + 1x8 up to 3x8 Graphics.

Are all devices connected to a main PCIe bridge (with the number of lanes automatically negotiated for each device), or does the motherboard connect the devices directly to one of the supposed three sub-bridges (x16, x16 and x8), with the number of lanes then negotiated within each of those sub-bridges?

I do not have direct access to the motherboard to see how the devices are connected, but I suspect that the lanes of the supposed x8 sub-bridge are not used. I would also like to know whether, by using an x16 to x8 adapter, I could harness more lanes and increase my total PCIe bandwidth (even though the maximum theoretical bandwidth for that device would be halved).

[edit]

Example of what I obtain for one CPU socket with lstopo:

HostBridge L#0
  PCIBridge
    PCI 15b3:1011
      Net L#16 "ib0"
      OpenFabrics L#17 "mlx5_0"
  PCIBridge
    PCI 8086:1d6b
  PCIBridge
    PCI 102b:0532
      GPU L#18 "card0"
      GPU L#19 "controlD64"
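
To check which root port each device hangs off and what link width it actually negotiated, I could also read the Linux sysfs attributes current_link_width and current_link_speed (where the kernel exposes them). A rough, untested sketch:

#!/usr/bin/env python3
# Rough sketch: map each PCI device to its root port and print the negotiated
# link width/speed, using the standard Linux sysfs attributes. Not every device
# exposes these attributes, so missing values are reported as "n/a".
import glob
import os

def read_attr(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return "n/a"

for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
    real = os.path.realpath(dev)   # e.g. .../pci0000:00/0000:00:02.0/0000:02:00.0
    addr = os.path.basename(real)
    width = read_attr(os.path.join(real, "current_link_width"))
    speed = read_attr(os.path.join(real, "current_link_speed"))
    # PCI addresses in the resolved path ("dddd:bb:dd.f"); the first one after
    # the host bridge is the root port the device ultimately hangs off.
    chain = [c for c in real.split("/") if c.count(":") == 2]
    root_port = chain[0] if chain else "unknown"
    print(f"{addr}: x{width} @ {speed} (root port {root_port})")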
Perhaps this belongs on electronics.stackexchange.com? SO is software-oriented. – WeaponsGrade
This is for systems programming/performance. But you are right, this is more of a hardware-oriented question. I thought some programmers with experience in GPU/InfiniBand programming might have the answer. Anyway, I'll try electronics.stackexchange.com after the bounty attribution time expires. – jyvet

1 Answer


Are all devices connected to a main PCIe bridge (with the number of lanes automatically negotiated for each device), or does the motherboard connect the devices directly to one of the supposed three sub-bridges (x16, x16 and x8), with the number of lanes then negotiated within each of those sub-bridges?

This is a function of motherboard design, at least in part, so a specific answer cannot be given. But assuming your motherboard has no additional PCIe hardware such as PCIe switches, it's likely that it has at least one PCIe x16 "port" and some number of other "ports", i.e. slots, which may have varying widths: x16, x8, x4, x2, x1, etc.

A modern Intel CPU has an internal PCIe "root complex" which is shared by all the lanes leaving the device. The lanes leaving the device will be grouped into one or more "ports". The PCIe root complex is a logical entity, whereas the ports have both a logical and physical character to them.

There is automatic lane-width negotiation, but it is usually only in place as a support and error-mitigation strategy. An x16 port will expect to negotiate to x16 width if an x16 "endpoint" (i.e. device) is plugged into it (it may also negotiate down to a lower width if errors are detected that can be localized to particular lanes). A port can usually handle a device of lesser width, so if an x8 device is plugged into an x16 port, things will generally "just work", although this does not mean that you have 8 additional lanes you can use "somewhere else".
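
If you want to see what the negotiation actually settled on, one way (a sketch, assuming the standard Linux sysfs attributes max_link_width / current_link_width and their speed counterparts are exposed for your devices) is to compare the maximum and current values and flag links that trained below their maximum width:

#!/usr/bin/env python3
# Sketch: flag PCIe links whose negotiated width is below the device's maximum.
# Relies on the standard Linux sysfs attributes; devices that do not expose
# them are skipped. Speed is printed for information only (links may also drop
# to a lower speed at idle for power management).
import glob
import os

def read_attr(path):
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None

for dev in sorted(glob.glob("/sys/bus/pci/devices/*")):
    cur_w = read_attr(os.path.join(dev, "current_link_width"))
    max_w = read_attr(os.path.join(dev, "max_link_width"))
    cur_s = read_attr(os.path.join(dev, "current_link_speed"))
    if cur_w is None or max_w is None:
        continue
    note = "  <-- trained below max width" if cur_w != max_w else ""
    print(f"{os.path.basename(dev)}: x{cur_w} (max x{max_w}) @ {cur_s}{note}")

sudo lspci -vv reports the same information in its LnkCap/LnkSta lines.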

Reconfiguration of an x16 port into two x8 ports is not something that would normally happen automatically by plugging in an "x16 to x8 adapter", whatever that is. You could certainly reduce an x16 port to an x8 port, but that does not automatically give you 8 extra lanes to use elsewhere.

The subdivision of the 40 lanes exiting your Haswell device into logical "ports" involves both the hardware design of the motherboard and the firmware (BIOS) design. An x16 port cannot automatically be split into two (logical) x8 ports. Some motherboards do have such configuration options, but they are usually selected by some explicit means such as a BIOS setting or the modification of a switch or routing PCB, along with the provision of two slots, one for each of the possible ports.

What is fairly common, however, is the use of PCIe switches. Such switches allow a single PCIe (upstream) port to service two (or more) downstream ports. This does not necessarily imply converting x16 logical character to x8 logical character (although it might, depending on the implementation), but it usually implies that whatever bandwidth limit is in place for the upstream port applies in aggregate to the downstream ports. Nevertheless, this is a fairly common product strategy, and you can find examples of motherboards which have these devices designed in (to effectively provide more slots, or ports), as well as adapters/planars which can be plugged into an existing port (i.e. slot) and provide multiple ports/slots from that single port/slot.
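
As a back-of-the-envelope illustration of that aggregate limit (assuming PCIe 3.0 signalling at 8 GT/s per lane with 128b/130b encoding, and ignoring packet/protocol overhead):

# Rough PCIe 3.0 bandwidth arithmetic: 8 GT/s per lane with 128b/130b encoding
# gives roughly 0.985 GB/s of raw bandwidth per lane, before protocol overhead.
GT_PER_SEC = 8.0
ENCODING = 128.0 / 130.0
GB_PER_LANE = GT_PER_SEC * ENCODING / 8.0   # bits -> bytes, ~0.985 GB/s

def link_bw(lanes):
    return lanes * GB_PER_LANE

upstream = link_bw(16)                        # x16 uplink from the switch to the CPU
downstream = [link_bw(16), link_bw(16)]       # e.g. two x16 devices behind the switch
print(f"upstream x16 link:        ~{upstream:.2f} GB/s")
print(f"two downstream x16 links: ~{sum(downstream):.2f} GB/s of device-side capacity,")
print(f"but traffic to/from the host is capped at ~{upstream:.2f} GB/s in aggregate")

The exact figures depend on payload size and protocol overhead, so treat these as upper bounds.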

In the Linux space, the lstopo command is useful for discovering these topologies. You may need to install the hwloc package in your Linux distro.