I'm trying to load Lustre kernel modules on a Linux instance running kernel 4.15.0-1040-aws from an Ubuntu 18.04 disk image (the 18.04 AMI).

I've downloaded the Lustre client debs for 18.04 from "Ubuntu 18.04 - Lustre 2.12.2" and installed them with dpkg -i:

- lustre-client-modules-4.15.0-45-generic_2.12.2-1_amd64.deb
- lustre-client-utils_2.12.2-1_amd64.deb

The .ko module files get installed in /lib/modules/4.15.0-45-generic/updates/fs/, but they're not picked up by default by modprobe, because they're outside my kernel's default lookup path: /lib/modules/4.15.0-1040-aws.
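A minimal sketch of how the mismatch can be confirmed: each kernel module records the kernel release it was built for in its vermagic string, which `modinfo -F vermagic` prints. The helper names `vermagic_release` and `check_module` are made up for illustration, and the sample vermagic string is an assumed typical value, not output copied from my system.

```shell
#!/bin/sh
# Extract the kernel release (the first field) from a vermagic string.
vermagic_release() {
    printf '%s\n' "$1" | awk '{print $1}'
}

# Compare a module's vermagic string ($1) against a kernel release ($2).
check_module() {
    if [ "$(vermagic_release "$1")" = "$2" ]; then
        echo "ok"
    else
        echo "mismatch"
    fi
}

# The two releases from this question (the vermagic string is an assumed example).
check_module "4.15.0-45-generic SMP mod_unload" "4.15.0-1040-aws"

# On a live system: check every installed lustre.ko against the running kernel.
# Skipped silently if modinfo is missing or no Lustre module is found.
if command -v modinfo >/dev/null 2>&1; then
    find /lib/modules -name 'lustre.ko*' 2>/dev/null | while read -r ko; do
        echo "$ko: $(check_module "$(modinfo -F vermagic "$ko")" "$(uname -r)")"
    done
fi
```

A "mismatch" here means the module was built for a different kernel, so pointing modprobe at the other directory would most likely not help.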

Is there a way to get them loaded, or does my kernel need to match exactly what is provided by the deb? Are users expected to muck around with writing custom lustre .conf files for modprobe?

Edit

I think the answer is probably that the kernel needs to match the modules exactly, which may require recompiling the modules from source. I eventually managed to install the Lustre client on a 4.14.123-111.109.amzn2.x86_64 kernel, but that instance runs the "Amazon Linux 2" image (not Ubuntu 18.04), and I had to use the command amazon-linux-extras install -y lustre2.10.

The other thing I hadn't initially realized is that Amazon's FSx for Lustre is only compatible with Lustre client 2.10.5 and 2.10.6 (see the note at the top of this page), in case that matters. Tricky.

1 Answer

The client kernel modules need to be compiled for the specific kernel running on the system or, in the case of RHEL kernels, at least for a kernel from the same major release (e.g. the RHEL 7.5 kernel 3.10.0-862.x). There are several guides for building Lustre clients from source; see for example "Building Lustre - TLDR Guide" or "Rebuilding the Lustre client RPMs for a new kernel".
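As a rough sketch of what such a rebuild can look like on Ubuntu, assuming the kernel headers for the running kernel are installed under /usr/src/linux-headers-<release> and that the upstream lustre-release tree builds against your kernel: the function names `headers_dir` and `build_lustre_client` are hypothetical, and you would normally check out the release tag you need before configuring. The guides above remain the authoritative steps.

```shell
#!/bin/sh
# Path where Ubuntu's linux-headers-<release> package installs the kernel build tree.
headers_dir() {
    echo "/usr/src/linux-headers-$1"
}

# Build client-only Lustre packages against the running kernel.
# Defined but not invoked here: it needs network access, a build toolchain,
# and the matching kernel headers. Runs in a subshell so the caller's
# working directory is left unchanged.
build_lustre_client() (
    kdir=$(headers_dir "$(uname -r)")
    if [ ! -d "$kdir" ]; then
        echo "kernel headers not found: $kdir" >&2
        exit 1
    fi
    git clone git://git.whamcloud.com/fs/lustre-release.git &&
    cd lustre-release &&
    sh autogen.sh &&
    ./configure --disable-server --with-linux="$kdir" &&
    make debs
)
```

Calling `build_lustre_client` by hand should leave .deb packages for the running kernel, which can then be installed with dpkg -i so that the modules land under /lib/modules/$(uname -r).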

The Lustre 2.12.x clients will normally be able to mount servers running Lustre 2.10.x, so long as the modules are rebuilt for your specific kernel so that they can be loaded. Lustre uses a more sophisticated mechanism (more than just version numbers) to exchange feature compatibility between clients and servers at connection time, and normal Lustre clients/servers allow connection between different releases even if newer features cannot be used.

I can't comment on whether the AWS FSx implementation restricts clients to a specific version. Enforcing a specific client version would be possible with a patch to the server code (the client and server also exchange their Lustre release versions at connection time, in addition to the list of supported features), but the version number is not used for anything other than informational purposes in the standard Lustre 2.10.x or 2.12.x releases.

You can check both the client and server versions, as well as the mutually supported features (connect_flags), for each server by running "lctl get_param {mdc,osc}.*.import" on the client, or for each client by running "lctl get_param {mdt,obdfilter}.*.exports.*.export" on 2.11 and later servers (if you can log in to the servers, which you typically cannot with AWS FSx).
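A small sketch of the client-side check, wrapped so that it fails cleanly on a machine without the Lustre utilities. The function names `show_imports` and `only_connect_flags` are made up for illustration; the lctl invocation is the one given above, with the brace expansion written out so it also works in plain sh.

```shell
#!/bin/sh
# Print the import state for every MDT and OST connection on a mounted client.
show_imports() {
    if ! command -v lctl >/dev/null 2>&1; then
        echo "lctl not found; install the Lustre client utilities first" >&2
        return 1
    fi
    # Quoted so the shell passes the patterns to lctl instead of globbing them.
    lctl get_param 'mdc.*.import' 'osc.*.import'
}

# Keep only the connect_flags lines from import output read on stdin.
only_connect_flags() {
    grep 'connect_flags'
}
```

With a Lustre filesystem mounted, `show_imports | only_connect_flags` narrows the output to the negotiated feature list for each server.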