Statistics from the 5.2 kernel — and before [LWN.net]

By Jonathan Corbet
June 21, 2019
As of this writing, just over 13,600 non-merge changesets have been pulled into the mainline repository for the 5.2 development cycle. The time has come, once again, for a look at where that work came from and who supported it. There are some unique aspects to 5.2 that have thrown off some of the usual numbers.

1,716 developers contributed changes for the 5.2 kernel, 245 of whom made their first contribution during this cycle. Those 1,716 developers removed nearly 490,000 lines of code, which is a lot, but the addition of 596,000 new lines of code means that the kernel still grew by 106,000 lines. The most active developers this time around were:

Most active 5.2 developers

By changesets
Developer                   Changesets   Share
Thomas Gleixner                    441    3.2%
Alexandre Belloni                  186    1.4%
Yue Haibing                        171    1.3%
Chris Wilson                       168    1.2%
Guenter Roeck                      160    1.2%
Ville Syrjälä                      135    1.0%
Christoph Hellwig                  125    0.9%
Axel Lin                           110    0.8%
David Sterba                       107    0.8%
Christophe Leroy                   106    0.8%
Gustavo A. R. Silva                104    0.8%
Colin Ian King                     101    0.7%
Bart Van Assche                     99    0.7%
Alex Shi                            96    0.7%
Masahiro Yamada                     93    0.7%
David Ahern                         88    0.6%
Maxime Ripard                       88    0.6%
Andy Shevchenko                     83    0.6%
Arnd Bergmann                       79    0.6%
Laurent Pinchart                    78    0.6%

By changed lines
Developer                Lines changed   Share
Greg Kroah-Hartman              132179   14.4%
Thomas Gleixner                 126547   13.8%
Yan-Hsuan Chuang                 47518    5.2%
Gabriel Krisman Bertazi          20170    2.2%
Liam Girdwood                    17379    1.9%
Olaf Weber                       13441    1.5%
Andi Kleen                       12232    1.3%
Hans Verkuil                     11709    1.3%
Chris Wilson                     10025    1.1%
Mauro Carvalho Chehab             9809    1.1%
Vladimir Oltean                   7074    0.8%
Gal Pressman                      6143    0.7%
David Howells                     5652    0.6%
Neil Brown                        5345    0.6%
Linus Walleij                     5023    0.5%
Rob Herring                       5023    0.5%
Tzvetomir Stoyanov                4998    0.5%
Ryder Lee                         4890    0.5%
Tony Lindgren                     4868    0.5%
Neil Armstrong                    4783    0.5%
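
The arithmetic behind the opening figures and the "Share" column can be checked with a few lines of Python. This is only a sketch: the article gives the changeset total as "just over 13,600", so the value 13,600 used below is an approximation, and the function names are made up for illustration.

```python
# Figures quoted in the article. TOTAL_CHANGESETS is approximate: the text
# says "just over 13,600", so computed shares may differ in the last digit.
LINES_ADDED = 596_000
LINES_REMOVED = 490_000
TOTAL_CHANGESETS = 13_600


def net_growth(added: int, removed: int) -> int:
    """Net change in kernel size, in lines."""
    return added - removed


def share(count: int, total: int) -> float:
    """Percentage of the total, rounded to one decimal as in the tables."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    print("net growth:", net_growth(LINES_ADDED, LINES_REMOVED))
    print("Thomas Gleixner share:", share(441, TOTAL_CHANGESETS))
```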

Of the 441 patches that put Thomas Gleixner at the top of the "by changesets" column, 349 were a part of the ongoing effort to add SPDX tags to every kernel source file. That still leaves nearly 100 changes for little issues like the microarchitectural data sampling vulnerabilities, reworking the handling of stack traces, and more. Alexandre Belloni contributed a long list of realtime clock changes, Yue Haibing made code cleanups all over the tree, Chris Wilson made many changes to the i915 graphics driver, and Guenter Roeck made a lot of improvements to the hardware monitoring and watchdog subsystems.

In the "lines changed" column, Greg Kroah-Hartman deleted the rtlwifi driver from the staging tree; it has been superseded by a non-staging driver. One might not think that adding a single SPDX line to files would lead to a lot of changed lines, even when a lot of files are involved, but most of those additions also allowed the removal of a lot of license boilerplate. As a result, Gleixner's SPDX work resulted in the removal of over 100,000 lines from the kernel. Yan-Hsuan Chuang contributed only three changes, but they brought in the new production Realtek driver, which is not small. Gabriel Krisman Bertazi added support for case-insensitive file-name lookups to the ext4 filesystem; that, in turn, brought in a large body of automatically generated code for UTF-8 handling. Liam Girdwood added the Sound Open Firmware system.

The top testers and reviewers this time around are a bit different from the usual crowd:

Test and review credits in 5.2

Tested-by
Developer                      Credits   Share
Andrew Bowers                      132   16.6%
Sebastian Reichel                   45    5.7%
Zhang Lei                           24    3.0%
Leo Yan                             20    2.5%
Robert Walker                       20    2.5%
Arnaldo Carvalho de Melo            14    1.8%
Jon Hunter                          13    1.6%
Jon Masters                         12    1.5%
Jerome Brunet                       11    1.4%
Brice Goglin                        10    1.3%
Stefan Wahren                       10    1.3%
Aaron Brown                         10    1.3%
Mathieu Malaterre                    9    1.1%
Holger Hoffstätte                    9    1.1%
Bart Van Assche                      8    1.0%
Jeffrey Hugo                         8    1.0%
Oleksandr Natalenko                  8    1.0%
Sudip Mukherjee                      8    1.0%
James Smart                          7    0.9%
Michał Mirosław                      7    0.9%
Robert Yang                          7    0.9%

Reviewed-by
Developer                      Credits   Share
Allison Randal                     345    5.4%
Rob Herring                        199    3.1%
Kate Stewart                       174    2.7%
Richard Fontana                    157    2.4%
Alexios Zavras                     120    1.9%
Mukesh Ojha                        116    1.8%
Huang Rui                          107    1.7%
David Sterba                       103    1.6%
Alex Deucher                       100    1.6%
Tvrtko Ursulin                      99    1.5%
Armijn Hemel                        96    1.5%
Florian Fainelli                    85    1.3%
Simon Horman                        79    1.2%
Andrew Lunn                         77    1.2%
Takashi Iwai                        74    1.2%
Chris Wilson                        73    1.1%
Steve Winslow                       65    1.0%
Guenter Roeck                       63    1.0%
Christoph Hellwig                   62    1.0%
Mauro Carvalho Chehab               60    0.9%

Andrew Bowers apparently tested a large number of i40e network-driver patches written by co-workers at Intel. Then, there are some familiar names in the review column. People like Allison Randal, Kate Stewart, and Richard Fontana are all quite well known in our community, but they tend not to show up often in kernel patches. In this case, all of them spent time (along with Alexios Zavras and Armijn Hemel) reviewing the SPDX changes. This is important work: the addition of SPDX lines must reflect the actual license of each file and not inadvertently change that license.

Work on 5.2 was supported by 215 employers (that we know of), the most active of which were:

Most active 5.2 employers

By changesets
Employer                    Changesets   Share
Intel                             1684   12.4%
(Unknown)                         1063    7.8%
Red Hat                            859    6.3%
Google                             730    5.4%
(None)                             633    4.7%
AMD                                544    4.0%
Linutronix                         480    3.5%
SUSE                               423    3.1%
IBM                                411    3.0%
Linaro                             370    2.7%
Huawei Technologies                367    2.7%
Bootlin                            360    2.6%
Mellanox                           354    2.6%
ARM                                305    2.2%
(Consultant)                       295    2.2%
Renesas Electronics                276    2.0%
Oracle                             217    1.6%
NXP Semiconductors                 182    1.3%
Linux Foundation                   160    1.2%
BayLibre                           159    1.2%

By lines changed
Employer                 Lines changed   Share
Linux Foundation                132863   14.5%
Linutronix                      127352   13.9%
Intel                            99721   10.9%
Realtek                          47705    5.2%
(Unknown)                        35515    3.9%
Red Hat                          34869    3.8%
AMD                              23993    2.6%
Collabora Multimedia             22794    2.5%
Linaro                           22696    2.5%
Google                           22450    2.5%
(None)                           22200    2.4%
IBM                              19651    2.1%
MediaTek                         14937    1.6%
Mellanox                         14657    1.6%
Cisco                            14656    1.6%
Samsung                          14260    1.6%
SUSE                             13559    1.5%
SGI                              13441    1.5%
ARM                              12407    1.4%
BayLibre                         11753    1.3%

There are few surprises here, as usual. It is nice to see Realtek showing up as a top contributor, though — a change from the recent past.

When developers apply a Signed-off-by tag to a patch that they did not write, it is usually an indication that they are merging that patch into a subsystem tree. A look at non-author signoffs thus gives an idea of who the gatekeepers for the kernel are. In 5.2, the most active developers in this role, and the companies that supported them, are:

Most non-author signoffs

Individuals
Developer                     Signoffs   Share
David S. Miller                   1214    9.4%
Greg Kroah-Hartman                1191    9.2%
Mark Brown                         643    5.0%
Alex Deucher                       530    4.1%
Martin K. Petersen                 318    2.5%
Andrew Morton                      304    2.4%
Mauro Carvalho Chehab              222    1.7%
Ingo Molnar                        209    1.6%
Michael Ellerman                   208    1.6%
Jason Gunthorpe                    194    1.5%
Jens Axboe                         187    1.5%
Jonathan Cameron                   177    1.4%
Jonathan Corbet                    173    1.3%
Herbert Xu                         165    1.3%
Alexei Starovoitov                 149    1.2%
Jeff Kirsher                       146    1.1%
David Sterba                       143    1.1%
Hans Verkuil                       140    1.1%
Shawn Guo                          138    1.1%
Daniel Borkmann                    126    1.0%

Employers
Employer                      Signoffs   Share
Red Hat                           2256   17.5%
Linux Foundation                  1255    9.7%
Linaro                            1231    9.6%
Intel                             1045    8.1%
Google                             704    5.5%
AMD                                556    4.3%
(None)                             417    3.2%
Mellanox                           397    3.1%
IBM                                394    3.1%
Facebook                           394    3.1%
SUSE                               382    3.0%
Oracle                             357    2.8%
ARM                                333    2.6%
Huawei Technologies                330    2.6%
Samsung                            322    2.5%
(Unknown)                          246    1.9%
Bootlin                            198    1.5%
LWN.net                            173    1.3%
Code Aurora Forum                  146    1.1%
(Consultant)                       141    1.1%
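
For readers who want to see what "non-author signoff" means mechanically, here is a minimal sketch. It is not the tooling used to produce the table above. It assumes the commits have already been extracted as (author email, commit message) pairs, and it treats a Signed-off-by line as a non-author signoff whenever its email address differs from the author's. The function name and the sample identities are hypothetical.

```python
import re
from collections import Counter

# Matches trailer lines such as "Signed-off-by: Some Name <some@addr>".
SIGNOFF_RE = re.compile(
    r"^Signed-off-by:[ \t]*(.+?)[ \t]*<([^>]+)>[ \t]*$", re.MULTILINE
)


def non_author_signoffs(commits):
    """Count signoffs made by someone other than the patch author.

    commits: iterable of (author_email, commit_message) pairs.
    Returns a Counter keyed by the name in the Signed-off-by line.
    """
    counts = Counter()
    for author_email, message in commits:
        for name, email in SIGNOFF_RE.findall(message):
            # Comparing email addresses is an assumption of this sketch;
            # people who use several addresses would need an alias map.
            if email.lower() != author_email.lower():
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the kernel history.
    sample = [
        ("alice@example.org",
         "net: fix a thing\n\n"
         "Signed-off-by: Alice <alice@example.org>\n"
         "Signed-off-by: Maintainer <maint@example.org>\n"),
        ("bob@example.org",
         "net: fix another thing\n\n"
         "Signed-off-by: Bob <bob@example.org>\n"
         "Signed-off-by: Maintainer <maint@example.org>\n"),
    ]
    print(non_author_signoffs(sample).most_common())
```

An author's own signoff is skipped, so only the person who carried the patch into a tree is counted.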

The concentration of subsystem maintainers into a small number of companies has spread out over time, but it is still true that (just) over 50% of the non-author signoffs are made by developers working in five companies.
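
The "five companies" claim can be checked against the published percentages. The sketch below copies the top rows of the employer signoff table and counts how many employers are needed before the cumulative share passes 50%; the function name is made up for illustration, and the result is only as precise as the rounded percentages allow.

```python
# Top rows of the "Employers" signoff table (share in percent).
SIGNOFF_SHARES = [
    ("Red Hat", 17.5),
    ("Linux Foundation", 9.7),
    ("Linaro", 9.6),
    ("Intel", 8.1),
    ("Google", 5.5),
    ("AMD", 4.3),
]


def employers_needed(rows, threshold=50.0):
    """Return how many leading rows are needed to exceed the threshold.

    rows must be sorted by share, largest first. Returns None if the
    rows supplied never exceed the threshold.
    """
    running = 0.0
    for position, (_, pct) in enumerate(rows, start=1):
        running += pct
        if running > threshold:
            return position
    return None


if __name__ == "__main__":
    print(employers_needed(SIGNOFF_SHARES))
```

With these figures the first four employers add up to 44.9%, and adding Google brings the total to 50.4%.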

Code longevity

Finally, it has been a while since we have looked at how much code from each development cycle remains in the 5.2 kernel. This determination is made through the application of a fair amount of brute force: for each file in the kernel tree, git blame is run and the commit that added each line is noted. After associating each commit with the development cycle that brought it into the mainline, it is possible to see how many lines of code were introduced by each cycle. The results look like this:

[Bar chart]

As can be seen here, newer releases are more heavily represented than older ones, which is not surprising. There has been more time for changes to be made to code from older releases, and more recent cycles tend to add more lines in general. Still, it seems clear that a lot of the code we add stays in the kernel for a long time.
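
The brute-force procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the script used for the chart: it assumes the blame data for a file has been captured with `git blame --line-porcelain`, in which each source line is preceded by a header starting with the 40-character commit ID and the line's content is prefixed with a tab. The mapping from commit to development cycle (`release_of`) is a hypothetical dictionary that would have to be built separately.

```python
import re
from collections import Counter

# Header line of a porcelain blame entry:
# "<sha> <original line> <final line> [<lines in group>]"
HEADER_RE = re.compile(r"^([0-9a-f]{40}) \d+ \d+(?: \d+)?$")


def lines_per_commit(porcelain: str) -> Counter:
    """Count how many surviving lines each commit is blamed for."""
    counts = Counter()
    current = None
    # split("\n") rather than splitlines(), so that form feeds and similar
    # characters inside source lines are not treated as line breaks.
    for line in porcelain.split("\n"):
        if line.startswith("\t"):
            # A tab-prefixed line is the content of one source line.
            if current is not None:
                counts[current] += 1
        else:
            match = HEADER_RE.match(line)
            if match:
                current = match.group(1)
    return counts


def lines_per_release(commit_counts, release_of):
    """Fold per-commit counts into per-release totals.

    release_of: dict mapping commit ID to a release name (hypothetical
    input; commits missing from it are grouped under "unknown").
    """
    totals = Counter()
    for sha, count in commit_counts.items():
        totals[release_of.get(sha, "unknown")] += count
    return totals
```

Running `lines_per_commit()` over every file in the tree and summing the results through `lines_per_release()` gives the per-cycle line counts that a chart like the one above would plot.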

The most graphic illustration of that, actually, does not appear in the graph because it would squeeze the rest into insignificance: over 2.5 million lines of code in the kernel — nearly 10% of the total — date back to the initial Git commit in 2005. Much of that code was old even then, and certainly hasn't gotten any younger.

Stepping back one release, there are 215 files in 5.1 that have not seen a single change since the initial commit. This number can only undercount the number of dormant files by a significant factor: any file that has seen even a single whitespace fix, coding-style change, or typo fix will not appear in this list. Indeed, the list was generated from the 5.1 kernel because the SPDX work in 5.2 is obscuring things further. These files may get shiny new license tags, but that won't make them any more alive.
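
One way to build such a list, sketched under assumptions: take a log of the whole history in which each commit ID is followed by the paths it touched (for example, the output of `git log --name-only --format=%H`), and keep the files that are still present and whose only recorded commit is the initial one. The helper names and the `present_files` input are hypothetical, and the sketch makes no attempt to follow renames.

```python
import re
from collections import defaultdict

SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def commits_per_file(log_text: str):
    """Map each path to the set of commit IDs that touched it.

    Expects lines that are either a 40-character commit ID, a path
    touched by the most recent commit ID seen, or blank.
    """
    history = defaultdict(set)
    current = None
    for line in log_text.split("\n"):
        if SHA_RE.match(line):
            current = line
        elif line and current is not None:
            history[line].add(current)
    return history


def dormant_files(history, root_sha, present_files):
    """Files still in the tree whose only commit is the initial one."""
    return sorted(
        path for path in present_files
        if history.get(path) == {root_sha}
    )
```

As the paragraph above points out, a single whitespace or typo fix adds a second commit to a file's history, so a file touched that way drops out of the result even if it is otherwise dormant.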

Code that is this old perhaps just reached perfection years ago and needs no further changes. In general, though, static code is often unmaintained, unused, and possibly insecure. For anybody looking to do a little cleanup work, this list might not be a bad place to start.

Posted Jun 21, 2019 19:47 UTC (Fri) by smurf (subscriber, #17840)

There's the historic tree of imported kernel tarballs and whatnot, which could extend this graph backwards … just add a graft.

Posted Jun 23, 2019 0:15 UTC (Sun) by dankohn (guest, #6006)

This would be incredibly interesting to see. In particular, do any lines remain from Linus's original public posting?

Posted Jun 23, 2019 1:44 UTC (Sun) by karkhaz (subscriber, #99844)

There's a cool visualization to be had here. Each pixel on a diagram is a line of code, red if Linus wrote it and black otherwise. The pixels are grouped by subsystem or something logical. Initially the diagram is just a small red blob, and then the animation starts, black pixels get added to the red core, and while the red area may even continue to grow for a short period of time, it eventually gets covered in black.

It would be cool to see this on a per-subsystem basis. On which subsystem did Linus's code get totally replaced the fastest?

And actually, no reason why this should only be for Linus. It would be cool to see the contributions over time for any large contributor of code, again by subsystem (have they always contributed to the same area, or did they jump around? etc).

Posted Jun 24, 2019 10:02 UTC (Mon) by nix (subscriber, #2304)

It feels like a souped-up version of Gource would be useful for this. (Very souped up. OK possibly more like a rewrite.)

Posted Jun 22, 2019 4:15 UTC (Sat) by pabs (subscriber, #43278)

I wonder how different the code longevity graph would look if cregit were used instead.


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds