
Filesystems, testing, and stable trees

By Jake Edge
May 31, 2022
LSFMM

In a filesystem session at the 2022 Linux Storage, Filesystem, Memory-management and BPF Summit (LSFMM), Amir Goldstein led a discussion about the stable kernel trees. Those trees, and especially the long-term support (LTS) versions, are used as a basis for a variety of Linux-based products, but the kind of testing that is being done on them for filesystems is lacking. Part of the problem is that the tests target filesystem developers so they are not easily used by downstream consumers of the stable kernel trees.

His interest in the problem stems from his use of the 5.10 LTS kernel and the XFS filesystem. He realized that XFS is not being maintained in that kernel; only three XFS patches have been backported to it in the past two years or more. There is some history behind that, though most in the room already knew it, he said.

[Amir Goldstein]

He has been backporting XFS patches to 5.10 because there are more than just three bug fixes for XFS since that kernel was released. In something of a disclaimer, he said that it is his responsibility to do those backports; he is not suggesting that others should be doing that work. He has made some progress with the backports and has been doing some testing of them in conjunction with Luis Chamberlain. His intent in the session was to discuss the process for stable kernels and filesystems.

One reason that the stable kernels exist, Goldstein said, is to allow multiple organizations to collaborate and "not duplicate work". That only works if the LTS releases are used by the "big players", so the value of those releases drops if they are not widely used. Many distributions do not use the LTS kernels, but there are some organizations that do. Google Cloud, for one, is following the stable kernel releases, and he has heard that Microsoft is doing the same. Android is also following the stable releases, but that project has no interest in XFS.

The key to having stable kernels with stable filesystems is being able to run fstests (formerly xfstests) on them. That means collaborating on testing, the test suite, and the baselines of which tests are expected to pass and fail. Josef Bacik said that when he worked at Red Hat, one of the pain points was in running the most recent fstests on older kernels, as it would "blow up" in various ways, which was annoying. But running the latest fstests and seeing newer tests fail can also point to patches that you may want to backport "depending on how much pain you are willing to absorb", he said.
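
Maintaining such a baseline boils down to recording which tests are expected to fail on a given stable branch and diffing each new run against that record. Here is a minimal, hypothetical sketch of that comparison in Python; the one-test-per-line results format and the file names are assumptions for illustration, not anything fstests itself defines:

    # Hypothetical sketch: compare an fstests run against a per-branch baseline.
    # Assumes two plain-text files listing failed tests, one name per line
    # (e.g. "generic/475"); this format is illustrative, not part of fstests.
    import sys

    def read_failures(path):
        """Return the set of failing tests listed in a results file."""
        with open(path) as f:
            return {line.strip() for line in f
                    if line.strip() and not line.startswith("#")}

    def main(baseline_path, current_path):
        baseline = read_failures(baseline_path)   # failures already expected on this branch
        current = read_failures(current_path)     # failures from the run under test

        regressions = sorted(current - baseline)  # newly failing: possibly a bad backport
        fixed = sorted(baseline - current)        # newly passing: baseline may need updating

        for test in regressions:
            print(f"REGRESSION: {test}")
        for test in fixed:
            print(f"now passing (update baseline?): {test}")
        return 1 if regressions else 0

    if __name__ == "__main__":
        sys.exit(main(sys.argv[1], sys.argv[2]))

Anything on the "REGRESSION" list after a stable update is a candidate for the kind of closer look discussed in the rest of the session.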

Goldstein said that fstests is mainly used to test upstream kernels; when it is applied to LTS kernels, "things happen", so doing so is not easy. Fstests is not friendly to people trying to test LTS kernels, in contrast to another test framework that he works with, the Linux Test Project (LTP). That project has some practices that could be adopted by fstests; in particular, having a standard way to annotate regression tests, giving the commit that fixed the bug and the kernel version it was fixed in. That way, if a test fails on a different kernel, "you get a hint" that maybe a backport of that commit is needed or, perhaps, that the kernel under test does not support the feature being tested.

LTP also has a simple script that can be run on a kernel branch to determine if it has the commits that appear in the annotations, or has backports of those commits that refer to the original commits. That will give you a list of the tests that should work; the list will be customized to that exact kernel branch, he said.
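
A check of this kind can lean on the fact that stable backports cite the original commit with a "commit <sha> upstream" line in their changelogs, so scanning a branch's history for the annotated commit IDs is enough to build a per-branch list (or, conversely, an expunge list). The sketch below is hypothetical and meant to be run from inside a kernel Git checkout; the annotation table and test names are invented for illustration, while LTP keeps the real metadata in its test definitions:

    # Hypothetical sketch: decide which annotated regression tests should pass
    # on a kernel branch by checking whether the fixing commit, or a stable
    # backport that cites it ("commit <sha> upstream"), is present.
    import subprocess

    # test name -> upstream commit that fixed the bug the test reproduces
    # (placeholder hashes for illustration only)
    ANNOTATIONS = {
        "xfs-regression-001": "0123456789abcdef0123456789abcdef01234567",
        "xfs-regression-002": "89abcdef0123456789abcdef0123456789abcdef",
    }

    def branch_log(branch):
        """Full commit IDs and messages for the branch."""
        out = subprocess.run(["git", "log", "--format=%H%n%B", branch],
                             capture_output=True, text=True, check=True)
        return out.stdout

    def expected_to_pass(branch):
        log_text = branch_log(branch)
        # The upstream ID appears either as the commit itself or in the
        # changelog of a backport that references it.
        return [test for test, sha in ANNOTATIONS.items() if sha in log_text]

    if __name__ == "__main__":
        for test in expected_to_pass("linux-5.10.y"):
            print(test)

Tests that do not show up in the output would go on the expunge list for that branch, rather than being reported as failures.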

Ted Ts'o said that most filesystems are happy to allow the stable developers to choose fixes to incorporate; XFS is a notable exception to that. For ext4, the process works well, he said; every year or so there is a problematic ext4 patch that has to be reverted from the stable trees because it was not suitable for them. Normally, those kinds of patches are spotted during the stable review.

Ts'o and his team have been working on identifying XFS patches to apply to the 5.15 kernel, because that is a kernel of interest for Google, using the same scripts that Greg Kroah-Hartman and Sasha Levin use to identify candidate patches. It has taken longer to do this work than he had hoped, in part because of the time it has taken to get a baseline of which fstests should be passing so that they can detect failures caused by backports. They have been using an automated test system, with around ten different configurations based on input from XFS maintainer Darrick Wong.

It turns out that there were some fstests that only passed if they cherry-picked some of the "hundred-odd out-of-tree commits" that are in Wong's personal fstests tree, but have not yet gotten to the upstream repository. So, Ts'o now has his own fstests branch with the pieces from Wong that were needed.

It is his intent to report on the work that they have done to the XFS mailing list, including a list of the patches that they are proposing to add to 5.15. After that, there will need to be a negotiation about what is considered appropriate testing, Ts'o said, as well as a need to figure out how the XFS maintainers want to proceed. Whether the process will be to propose the fixes for stable and await any explicit nacks from the XFS folks, or whether the XFS maintainers will be explicitly choosing the set of patches to add to stable, is unclear at this point. That is a conversation that he hopes to have soon.

Chamberlain said that in the past, the XFS maintainers have agreed that he and Goldstein could review XFS patches for the stable kernels. But, as noted by Ts'o and others, establishing the baseline takes a lot of thankless work; it also requires fairly large systems, Chamberlain said. Right now, each developer is making their best effort at testing, but the community needs to collaborate more on the testing effort; the next LSFMM session would cover some of that, he said. Candidates for XFS fixes can be sent to him and Goldstein; they will queue the patches up for their testing, which will help give some confidence about whether the patches are good candidates or not.

Jan Kara came in over the Zoom link to say that the distributions, including SUSE where he works, do care about XFS fixes. The SUSE folks pick up XFS fixes and he thinks that Red Hat does the same thing. If those fixes do not end up in the stable kernel, they get backported to the enterprise kernels and then tested. The resources required to do all of that are fairly large. There is a need for developers with "at least a bit of a clue" to look at the patches to see if they make sense to be backported and then do that work if so. Then there is "quite a lot of testing", he said.

Goldstein talked about a tool that he created when he was looking at all of the XFS fixes from 5.10 to 5.17, which turned out to be around 600 patches. The tool uses the public-inbox mailing list archives to collect up all of the relevant patch series and, in particular, the cover letter. That made it much easier to see what dependencies there are and which patches to choose. It is "still human work", but the tool is a great assistant.
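
The series-grouping part of such a tool can be sketched fairly simply. The example below is not Goldstein's actual tool; it assumes the relevant lore.kernel.org traffic has already been downloaded into a local mbox file (for example with b4 or lei; the file name here is made up), then threads the messages and picks out the "[PATCH 0/N]" cover letters:

    # Hypothetical sketch: group mailing-list messages into patch series and
    # list the cover letters.  The mbox file name and subject heuristics are
    # illustrative only.
    import mailbox
    import re
    from collections import defaultdict

    SERIES_RE = re.compile(r"\[PATCH[^\]]*\b(\d+)/(\d+)\]", re.IGNORECASE)

    def cover_letters(mbox_path):
        threads = defaultdict(list)
        for msg in mailbox.mbox(mbox_path):
            subject = str(msg.get("Subject", "")).replace("\n", " ")
            # Group by thread root: the first References entry, or the
            # message itself if it starts a thread.
            refs = (msg.get("References") or "").split()
            root = refs[0] if refs else msg.get("Message-Id", "")
            threads[root].append(subject)

        covers = []
        for subjects in threads.values():
            for subject in subjects:
                m = SERIES_RE.search(subject)
                if m and int(m.group(1)) == 0:   # "[PATCH 0/N]" is the cover letter
                    covers.append((subject, int(m.group(2))))
        return covers

    if __name__ == "__main__":
        for subject, npatches in cover_letters("xfs-fixes.mbox"):
            print(f"{npatches:3d} patches  {subject}")

The cover letters give the context and dependency hints; deciding which series actually belong in a stable backport remains, as Goldstein said, human work.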

Ts'o noted that he does a round of testing of ext4 every three to four months using the latest LTS kernels. The resources required to actually run the tests are modest; for a few dollars of Google Cloud time, he can run multiple configurations of fstests. The expensive part is the developer time to interpret the failures and to figure out if there is a patch that did not get automatically chosen but should have.

Every time he does that round of tests, he finds one to three patches that he needs to manually backport and send to the stable developers. He is not sure whether other filesystem maintainers are doing similar testing, but it is valuable. That kind of testing is also not something that the maintainers themselves would need to do; it might be a good opportunity to bring some newer developers into the filesystem community, he suggested.

There was some more discussion of what needs to be done to make it easier to run fstests on older kernels. Steve French wondered if there needed to be stable branches of fstests that could be kept in sync with the stable kernel releases. Goldstein said that annotations of commits and versions for fixes will be important to make it easier to use fstests on a wider variety of kernel versions.


Index entries for this article
Kernel: Development model/Stable tree
Kernel: Filesystems/Testing
Conference: Storage, Filesystem, Memory-Management and BPF Summit/2022



Filesystems, testing, and stable trees

Posted May 31, 2022 19:50 UTC (Tue) by iabervon (subscriber, #722) [Link] (2 responses)

It seems like fstests should be able to have three results: the operation worked, the operation was cleanly rejected, or something went terribly wrong. For testing stable trees, you'd run the latest fstests against the .0 kernel version and the current kernel version, and want no "terribly wrong" results in the current kernel version and no "cleanly rejected" results from tests that "worked" in the .0 kernel version. Of course, upstream developers would want all of the tests to work on their kernels, since the kernel should support the feature you added. On the other hand, it could be useful for new development to verify that the user space usage you're encouraging doesn't corrupt filesystems with older kernels or otherwise get into incomprehensible states if your new code isn't in the kernel.

Filesystems, testing, and stable trees

Posted May 31, 2022 22:36 UTC (Tue) by Paf (subscriber, #91811) [Link] (1 responses)

This is straightforward - version checking in the test to recognize what result is expected - but requires a lot of work to validate all of these different versions and keep it clean.

Filesystems, testing, and stable trees

Posted Jun 1, 2022 13:44 UTC (Wed) by metan (subscriber, #74107) [Link]

In LTP we do maintain all of that as static data. For example, all that is needed to set the minimal kernel version required for the functionality is to set the 'min_kver' variable in a test; the rest of the magic happens in the test library. Generally it takes a lot of work to properly track all the metadata, such as the kernel commits that fix the regression a test is testing, etc.; however, all this effort actually pays for itself in the long term. After we managed to tag a significant amount of tests with different kinds of metadata, the number of emails and issues asking for help with test failures dropped significantly.

Also I would say that this is more about setting the right guidelines and culture than about a technical problem. Once the implementation and processes are in place, all new tests include the right metadata from the start and everything grows in the right direction effortlessly.

Filesystems, testing, and stable trees

Posted Jun 2, 2022 17:56 UTC (Thu) by amir73il (subscriber, #66165) [Link]

I was contacted by an LTP developer who asked me about
"LTP also has a simple script that can be run on a kernel branch to determine if it has the commits that appear in the annotations, or has backports of those commits that refer to the original commits."

I was misunderstood on that point. What I said is that it would be simple to write such a script to auto create an expunge list for fstests for any kernel branch. This may be better explained in my post to fstests: https://lore.kernel.org/fstests/20220419125637.2502181-1-...

Thank you Jake for another great write up!

Filesystems, testing, and stable trees

Posted Jun 15, 2022 11:05 UTC (Wed) by nix (subscriber, #2304) [Link] (2 responses)

> The key to having stable kernels with stable filesystems is being able to run fstests (formerly xfstests) on them.

This is important, but it would not spot any problems of the class linked higher in this article (which I have a personal interest in, being the person that bug bit). AIUI, fstests creates filesystems and then tests them, all in the same kernel version: situations in which filesystems touched by earlier kernels (or, alternatively, made by older mkfses) have properties that newly-mkfsed ones do not will not be spotted, which is how the bug in question sneaked in (the new kernel could handle stuff *it* had written fine, just not stuff older kernels predating the problematic change had written).

And, of course, almost all filesystems will have been written by "older kernels" by this definition: nobody re-mkfses all their writable filesystems on every kernel upgrade!

(Maybe an fstests run mode in which the mkfses and some of the writes are done by a stable baseline kernel and then the rest are done by the kernel under test might work, but where do you draw the boundary between which things should be done by an older kernel and which by a newer? Bugs of this nature might occur after any of those writes...)

Filesystems, testing, and stable trees

Posted Jun 15, 2022 16:11 UTC (Wed) by Wol (subscriber, #4433) [Link] (1 responses)

This sounds like a problem that hit RAID ... a problem sneaked in: the layout of an asymmetric RAID-10 was accidentally changed. (Asymmetric as in different-sized disks.)

It was really nasty in that you had to run a kernel on the same side of the change as the one the RAID was created by - both a new kernel on an old RAID and an old kernel on a new RAID would lead to data corruption. What do you do? You can't revert the change because any arrays created with a modified kernel would be inaccessible.

They ended up adding a flag to the RAID to identify old or new, and all new kernels now refuse to start an array if that flag is missing. It's easy to add, so new kernels are fine, people are unlikely to revert to old kernels, and asymmetric RAIDs aren't that common ...

Yes, upgrading does show up a completely different class of bug ... :-)

Cheers,
Wol

Filesystems, testing, and stable trees

Posted Jun 17, 2022 14:58 UTC (Fri) by nix (subscriber, #2304) [Link]

That was much worse because it wasn't spotted for ages, so simply reverting the change wasn't possible because many people were running arrays built with the new code :/ I'm ever so glad I wasn't the person who had to fix that.

