Number of swear words by kernel version
Swear-word density by kernel version
The charts above show swear-word statistics for the .c, .h, and .S source files in the Linux kernel. I will update them once a month, and again whenever a new version is released. I was inspired by the Linux kernel fuck count, but unfortunately its data is out of date.
The charts make it clear that the number of swear words has increased substantially since version 2.4. However, the total amount of code has grown even faster, so the average swear-word density per line has gone down.
A note on the method: a word is counted wherever it appears, even inside another word. A more sensible approach would have been possible, but it turns out that the FreeBSD regular expression engine has a serious memory leak, so I did not refine it any further. The same word may be counted several times in a row, because sometimes a programmer has a very, very bad day.
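To make the counting rule concrete, here is a minimal sketch in Python. It is not the script used for the charts. The word list is a harmless placeholder, the function names are made up for illustration, and ignoring case is my assumption, since the text above does not say how case is handled. It counts every occurrence of a word, including occurrences inside longer words, and divides the total by the number of lines to get a density.

```python
# Placeholder word list (assumption): the real list is not given in the text.
WORDS = ["darn", "heck"]


def count_words(text, words=WORDS):
    """Count each word wherever it appears, even inside another word.

    Matching is case-insensitive here, which is an assumption.
    Repeated occurrences are all counted.
    """
    lowered = text.lower()
    return {word: lowered.count(word) for word in words}


def density(text, words=WORDS):
    """Return total matches divided by the number of lines in the text."""
    line_count = len(text.splitlines())
    if line_count == 0:
        return 0.0
    total = sum(count_words(text, words).values())
    return total / line_count


if __name__ == "__main__":
    sample = "/* darn darn DARN */\nint heckler = 0;\n"
    print(count_words(sample))
    print(density(sample))
```

In the sample, `heckler` counts as a match for `heck`, which is exactly the over-counting that the simple substring rule accepts.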
You can find the script, but it is written in a really messy way, so I don't recommend using it.