Design patterns for the Linux kernel

Originally from: http://lwn.net/Articles/336224/

Selected content of interest, loosely translated below:

Maintaining quality is a topic of ongoing interest in the kernel community. That we need to maintain and improve quality is obvious; how best to do so is not. One broad approach that has had some success is to increase the transparency of various aspects of the kernel. This makes the quality of those aspects more apparent, which tends to lead to an improvement in that quality.

This increase in transparency takes a variety of forms:

    • The checkpatch.pl script highlights many deviations from the accepted code formatting style. This encourages people who use the script to fix those style problems. So, by increasing the transparency of the style guide, we increase the uniformity of the code's appearance and, to some extent, its quality.
    • The built-in "lockdep" system dynamically tracks the dependencies between locks and related state (such as whether interrupts are enabled), and reports anything that looks abnormal. An anomaly does not always mean that a deadlock or similar problem is possible, but in many cases it does, and the possibility of deadlock can then be removed. So, by increasing the transparency of the lock dependency graph, quality is improved.
    • The kernel contains various other transparency improvements, such as poisoning unused areas of memory so that invalid accesses become more apparent, or printing symbolic names rather than plain hexadecimal addresses in stack traces so that bug reports are more useful.
    • At a much higher level, the git revision tracking software used to track kernel changes makes it easy to see who did what, and when. The fact that it encourages a comment to be attached to each patch makes it easier to answer the question of why the code is the way it is. This transparency improves understanding of the code, and that is likely to improve quality as more developers are better informed.

There are many other areas where increasing transparency does, or could, improve quality. In this series we will explore one particular area where transparency could be increased in a way that may well improve quality: the documentation of kernel-specific design patterns.

Design Patterns

The design pattern is a concept that was first presented in the field of architecture and was brought into computer engineering, particularly object-oriented programming, through the 1994 book "Design Patterns: Elements of Reusable Object-Oriented Software".

In simple terms, a design pattern describes a particular class of design problem and details an approach to solving it that has proven effective. One benefit of a design pattern is that it combines the problem description and the solution description and gives them a name. A simple, memorable name is particularly valuable: if both the developer and the reviewer know the same names for the same patterns, a significant design decision can be communicated in one or two words, making that decision much more transparent.

There are many effective design patterns in the Linux kernel code. However, most of them are not documented, so they are not easily available to other developers. I would like these patterns to be documented so that they are used more widely and so that developers can reach effective solutions more quickly when they face similar problems.

In this series we will look at three problem areas and find a variety of quite different, important design patterns. The goal is not only to point these patterns out, but to demonstrate their range and value so that other developers are encouraged to document the patterns they see.

The series illustrates the patterns with a large number of examples from the Linux kernel. All code examples come from 2.6.30-rc3.

Reference count

The idea of using a reference counter to manage the lifetime of an object is very common. The core idea is that the counter is incremented whenever a new reference is taken and decremented whenever a reference is released. When the counter reaches 0, the resources used by the object (such as its memory) can be freed.

The mechanisms for managing reference counts seem quite straightforward. However, there are subtleties that make it easy to get them wrong. Partly for this reason, the Linux kernel has had, since 2004, a data type known as "kref" with associated support routines. These encapsulate some of those subtleties and, in particular, make it clear that a given counter is being used as a reference count in a particular way. As noted above, names for design patterns are valuable, and simply providing a name that kernel developers use is a great help to reviewers.

Andrew Morton says:

I care more about being able to say: "Ah, it uses kref. I understand that refcounting idiom, I know it's well debugged, and I know it traps common errors." That's better than: "Oh, this thing implements its own refcounting; I need to review it for the usual errors."

The inclusion of kref in the Linux kernel earns the kernel both a tick and a cross for its explicit support of design patterns. The tick is deserved because kref clearly encapsulates an important design pattern. The cross is deserved because kref covers only part of the story: there are uses of reference counting, as we will see, that do not fit the kref model well. A blessed reference-counting mechanism that does not provide the required functionality can actually encourage mistakes, because people may use a kref where it does not belong and assume it works when in fact it does not.

A useful step towards understanding the complexities of reference counting is to recognize that there are often two distinctly different sorts of reference to an object. In truth there can be three or more, but that is uncommon and can usually be understood by generalizing the case of two. We will call the two types "external" and "internal" references, though in some cases "strong" and "weak" might be more appropriate.

An external reference is the sort we are most accustomed to. External references are counted with "get" and "put" and can be held by subsystems quite distant from the subsystem that manages the object. The existence of a counted external reference has a strong and simple meaning: this object is in use.

By contrast, an internal reference is often not counted, and is held only within the subsystem that manages the object. Different internal references can have very different meanings, and hence very different implications for the implementation.

The most common example of an internal reference is a cache that provides a "lookup by name" service. If you know the name of an object, you can ask the cache for an external reference, provided the object actually exists in the cache. Such a cache holds each object on a list, or on one of several lists, as in a hash table. The presence of the object on such a list is a reference to the object. However, it is probably not a counted reference. It does not mean "this object is in use" but only "this object is hanging around in case someone wants it". Objects are not removed from the list until all external references have been dropped, and possibly not immediately even then. The existence and nature of internal references clearly has significant implications for how reference counting is implemented.

One useful way to classify reference counting styles is by the required implementation of the "put" operation. The "get" operation is always the same: it takes an external reference and produces another external reference. It is implemented by something like:

    assert(obj->refcount > 0);
    increment(obj->refcount);

or, in Linux-kernel C:

    BUG_ON(atomic_read(&obj->refcnt) == 0);
    atomic_inc(&obj->refcnt);

Note that "get" cannot be used on an unreferenced object.
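As a userspace illustration of the "get" operation, the following minimal sketch uses C11 atomics in place of the kernel's atomic_t. The names struct obj, obj_init() and obj_get() are hypothetical and exist only for this example; they are not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical object type, used only for this illustration. */
struct obj {
    atomic_int refcnt;   /* number of counted (external) references */
};

/* Initialise the object with a single reference, held by its creator. */
static void obj_init(struct obj *o)
{
    atomic_init(&o->refcnt, 1);
}

/* "get": turn one existing external reference into two.
 * The caller must already hold a reference, so the count cannot be 0. */
static struct obj *obj_get(struct obj *o)
{
    assert(atomic_load(&o->refcnt) > 0);
    atomic_fetch_add(&o->refcnt, 1);
    return o;
}
```

The assertion plays the role of BUG_ON() above: calling obj_get() on an object whose count has already reached 0 is treated as a programming error rather than something to recover from.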

The "put" operation comes in three variants. Their use cases overlap to some extent, but it is worth keeping them separate for the sake of clarity. The three options, in Linux-kernel C, are:

    1   atomic_dec(&obj->refcnt);

    2   if (atomic_dec_and_test(&obj->refcnt)) { ... do stuff ... }

    3   if (atomic_dec_and_lock(&obj->refcnt, &subsystem_lock)) {
                ... do stuff ...
                spin_unlock(&subsystem_lock);
        }

The "kref" style

Style 2 is the kref style. It is appropriate when an object has no meaningful existence beyond its last external reference. When the reference count reaches 0, the object needs to be freed or otherwise dealt with, hence the need to test for zero with atomic_dec_and_test().
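The kref-style "put" can be sketched in userspace C as follows. This is only an illustration under stated assumptions: struct obj, obj_new() and obj_put() are hypothetical names, C11 atomic_fetch_sub() stands in for the kernel's atomic_dec_and_test(), and free() stands in for whatever release work a real object needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object for this illustration. */
struct obj {
    atomic_int refcnt;
    int payload;
};

/* Allocate an object holding one reference, owned by the caller. */
static struct obj *obj_new(int payload)
{
    struct obj *o = malloc(sizeof(*o));

    if (o) {
        atomic_init(&o->refcnt, 1);
        o->payload = payload;
    }
    return o;
}

/* kref-style "put": drop a reference and free the object when the
 * last one goes away. Returns true if the object was freed. */
static bool obj_put(struct obj *o)
{
    /* atomic_fetch_sub returns the value held before the decrement. */
    int old = atomic_fetch_sub(&o->refcnt, 1);

    assert(old > 0);   /* a put without a matching get is a bug */
    if (old == 1) {
        free(o);
        return true;
    }
    return false;
}
```

Because the decrement and the test for zero happen in one atomic step, exactly one caller sees the count drop to 0 and performs the release.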

Objects that fit this style often have no internal references to worry about, as is the case for most objects in sysfs, which is a heavy user of kref. If an object using the kref style does have internal references, it must not create an external reference from an internal one unless other external references are known to exist. If this is necessary, a primitive is available:

    atomic_inc_not_zero(&obj->refcnt);

This increments the value provided it is not 0, and returns a result indicating success or failure. atomic_inc_not_zero() is a relatively recent addition to the Linux kernel, appearing in late 2005 as part of the lockless page cache work. For this reason it is not yet widely used, and some code that could benefit from it uses spinlocks instead. Sadly, the kref package does not make use of this primitive either.
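The behaviour of such an increment-unless-zero primitive can be sketched in userspace with a C11 compare-and-swap loop. The function name inc_not_zero() is hypothetical, and this is not the kernel's implementation; it only shows the semantics described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Increment *cnt unless it is zero.
 * Returns true if a reference was taken, false if the count was
 * already zero (the object is on its way out and must not be used). */
static bool inc_not_zero(atomic_int *cnt)
{
    int old = atomic_load(cnt);

    assert(old >= 0);   /* a negative count would be a bug elsewhere */
    while (old != 0) {
        /* On failure, 'old' is reloaded with the current value. */
        if (atomic_compare_exchange_weak(cnt, &old, old + 1))
            return true;
    }
    return false;
}
```

The important property is that the test for zero and the increment cannot be separated by a concurrent final "put", so an internal reference can be promoted to an external one without taking a lock.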

An interesting example of this style of reference that does not use kref, and does not even use atomic_dec_and_test() (though it could and arguably should), is the pair of reference counts in struct super_block: s_count and s_active.

s_active fits the kref style exactly. A superblock starts life with s_active set to 1 (in alloc_super()), and once s_active becomes 0, no further external references may be taken. This rule is encoded in grab_super(), though not very clearly. The current code adds a very large value (S_BIAS) to s_count whenever s_active is non-zero, and grab_super() tests whether s_count exceeds S_BIAS rather than whether s_active is 0. The same test could be made just as well with atomic_inc_not_zero(), avoiding the use of spinlocks.

s_count provides a different sort of reference, one with both internal and external aspects. It is internal in that its semantics are much weaker than those of references counted by s_active: a reference counted by s_count only means "this superblock cannot be freed just now", without asserting that it is actually active. It is external in that it behaves much like a kref: it starts at 1, and when it becomes 0 the superblock is destroyed.

So these two reference counts could be replaced by two krefs, provided that:

    • S_BIAS is set to 1
    • grab_super() uses atomic_inc_not_zero() rather than testing against S_BIAS

A whole set of spinlock calls could then go away.

The "kcref" style

The Linux kernel doesn't have a "kcref" object, but that is a name that seems suitable to propose for the next style of reference count. The "c" stands for "cached", as this style is very often used in caches. So it is a Kernel Cached REFerence.

A kcref uses atomic_dec_and_lock() as given in option 3 above. It does this because, on the last put, the object needs to be freed or checked to see if any other special handling is needed. This needs to be done under a lock to ensure no new reference is taken while the current state is being evaluated.

A simple example here is the i_count reference counter in struct inode. The important part of iput() reads:

    if (atomic_dec_and_lock(&inode->i_count, &inode_lock))
            iput_final(inode);

where iput_final() examines the state of the inode and decides if it can be destroyed, or left in the cache in case it could get reused soon.

Among other things, the inode_lock prevents new external references being created from the internal references of the inode hash table. For this reason, converting internal references to external references is only permitted while the inode_lock is held. It is no accident that the function supporting this is called iget_locked() (or iget5_locked()).
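The interplay between the final "put" and a cache lookup can be sketched in userspace C. Everything here is an assumption made for illustration: the spinlock is a toy built on atomic_flag, dec_and_lock() only mimics the semantics of atomic_dec_and_lock(), and struct cache with cache_get() and cache_put() is a hypothetical single-slot stand-in for something like the inode hash table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal spinlock built on C11 atomic_flag (illustration only). */
struct spinlock { atomic_flag held; };

static void spin_lock(struct spinlock *l)
{
    while (atomic_flag_test_and_set(&l->held))
        ;   /* busy-wait until the flag was clear */
}

static void spin_unlock(struct spinlock *l)
{
    atomic_flag_clear(&l->held);
}

/* Drop one reference. Returns true, with the lock held, only when the
 * count reached zero; otherwise returns false with the lock not held. */
static bool dec_and_lock(atomic_int *cnt, struct spinlock *l)
{
    int old = atomic_load(cnt);

    assert(old > 0);   /* a put on an unreferenced object is a bug */

    /* Fast path: not the last reference, so no lock is needed. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(cnt, &old, old - 1))
            return false;
    }

    /* Slow path: possibly the last reference; decide under the lock. */
    spin_lock(l);
    if (atomic_fetch_sub(cnt, 1) == 1)
        return true;
    spin_unlock(l);
    return false;
}

/* Hypothetical single-slot cache, standing in for a hash table. */
struct cache {
    struct spinlock lock;
    atomic_int refcnt;   /* external references to the cached object */
    bool present;        /* internal reference: object is in the cache */
};

/* Lookup: converts the internal reference into an external one.
 * Only legal while the cache lock is held, even if the count is 0. */
static bool cache_get(struct cache *c)
{
    bool found;

    spin_lock(&c->lock);
    found = c->present;
    if (found)
        atomic_fetch_add(&c->refcnt, 1);
    spin_unlock(&c->lock);
    return found;
}

/* Put: on the last reference, decide under the lock whether to keep
 * the object cached or to evict it. */
static void cache_put(struct cache *c, bool keep_cached)
{
    if (dec_and_lock(&c->refcnt, &c->lock)) {
        if (!keep_cached)
            c->present = false;   /* a real cache would free here */
        spin_unlock(&c->lock);
    }
}
```

Because both the lookup and the last-put decision run under the same lock, a lookup can never hand out a reference to an object that the final put has just decided to evict.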

A slightly more complex example is in struct dentry, where d_count is managed like a kcref. It is more complex because two locks need to be taken before we can be sure no new reference can be taken: both dcache_lock and de->d_lock. This requires that either we hold one lock and then atomic_dec_and_lock() the other (as in prune_one_dentry()), or that we atomic_dec_and_lock() the first, then claim the second and retest the refcount, as in dput(). This is a good example of the fact that you can never assume you have encapsulated all possible reference counting styles. Needing two locks could hardly be foreseen.

An even more complex kcref-style refcount is mnt_count in struct vfsmount. The complexity here is the interplay of the two refcounts that this structure has: mnt_count, which is a fairly straightforward count of external references, and mnt_pinned, which counts internal references from the process accounting module. In particular, it counts the number of accounting files that are open on the filesystem (and as such could do with a more meaningful name). The complexity comes from the fact that when there are only internal references remaining, they are all converted to external references. Exploring the details of this is again left as an exercise for the interested reader.

The "plain" style

The final style for refcounting involves just decrementing the reference count (atomic_dec()) and not doing anything else. This style is relatively uncommon in the kernel, and for good reason. Leaving unreferenced objects just lying around isn't a good idea.

One use of this style is in struct buffer_head, managed by fs/buffer.c and <linux/buffer_head.h>. The put_bh() function is simply:

    static inline void put_bh(struct buffer_head *bh)
    {
            smp_mb__before_atomic_dec();
            atomic_dec(&bh->b_count);
    }

This is OK because buffer_heads have lifetime rules that are closely tied to a page. One or more buffer_heads get allocated to a page to chop it up into smaller pieces (buffers). They tend to remain there until the page is freed, at which point all the buffer_heads will be purged (by drop_buffers() called from try_to_free_buffers()).

In general, the "plain" style is suitable if it is known that there will always be an internal reference so that the object doesn't get lost, and if there is some process whereby this internal reference will eventually get used to find and free the object.

Anti-patterns

To wrap up this little review of reference counting as an introduction to design patterns, we will discuss the related concept of an anti-pattern. While design patterns are approaches that have been shown to work and should be encouraged, anti-patterns are approaches that history shows us do not work well and should be discouraged.

Your author would like to suggest that the use of a "bias" in a refcount is an example of an anti-pattern. A bias in this context is a large value that is added to, or subtracted from, the reference count and is used to effectively store one bit of information. We have already glimpsed the idea of a bias in the management of s_count for superblocks. The presence of the bias indicates that the value of s_active is non-zero, which is easy enough to test directly. The bias adds no value here and only obscures the true purpose of the code.

Another example of a bias is in the management of struct sysfs_dirent, in fs/sysfs/sysfs.h and fs/sysfs/dir.c. Interestingly, sysfs_dirent has two refcounts just like superblocks, also called s_count and s_active. In this case, s_active has a large negative bias when the entry is being deactivated. The same bit of information could be stored just as effectively and much more clearly in the flag word s_flags. Storing single bits of information in flags is much easier to understand than storing them as a bias in a counter, and should be preferred.

In general, using a bias does not add any clarity, as it is not a common pattern. It cannot add more functionality than a single flag bit can provide, and it would be extremely rare that memory is so tight that one bit cannot be found to record whatever would otherwise be denoted by the presence of the bias. For these reasons, biases in refcounts should be considered anti-patterns and avoided if at all possible.
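The flag-based alternative might look something like the following userspace sketch. All names here (struct entry, ENTRY_DEACTIVATING and the entry_*() helpers) are hypothetical, and the sketch assumes the caller holds whatever lock protects the entry, so that testing the flag and changing the count cannot race with deactivation.

```c
#include <assert.h>
#include <stdbool.h>

#define ENTRY_DEACTIVATING 0x1u   /* hypothetical flag bit */

/* Hypothetical entry: the counter only counts, the flag carries state.
 * All helpers assume the caller holds a lock protecting the entry. */
struct entry {
    int active;           /* plain count of active references */
    unsigned int flags;   /* state bits such as ENTRY_DEACTIVATING */
};

/* Record that deactivation has started, without touching the count. */
static void entry_deactivate(struct entry *e)
{
    e->flags |= ENTRY_DEACTIVATING;
}

/* Take an active reference unless deactivation has started. */
static bool entry_get_active(struct entry *e)
{
    if (e->flags & ENTRY_DEACTIVATING)
        return false;
    e->active++;
    return true;
}

/* Drop an active reference. */
static void entry_put_active(struct entry *e)
{
    assert(e->active > 0);
    e->active--;
}
```

With the state held in a named flag, the counter always reads as the true number of references, and the condition being tested is visible at the point of use.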

Conclusion

This brings to a close our exploration of the various design patterns surrounding reference counts. Simply having terminology such as "kref" versus "kcref" and "external" versus "internal" references can be very helpful in increasing the visibility of the behaviour of different references and counts. Having code to embody this, as we do with kref and could with kcref, and using this code at every opportunity, would be a great help both to developers, who might find it easier to choose the right model first time, and to reviewers, who can see more clearly what is intended.

The design patterns we have covered in this article are:

  • kref: When the lifetime of an object extends only to the moment that the last external reference is dropped, a kref is appropriate. If there are any internal references to the object, they can only be promoted to external references with atomic_inc_not_zero(). Examples: s_active and s_count in struct super_block.

  • kcref: When the lifetime of an object can extend beyond the dropping of the last external reference, the kcref with its atomic_dec_and_lock() is appropriate. An internal reference can only be converted to an external reference while the subsystem lock is held. Examples: i_count in struct inode.

  • plain: When the lifetime of an object is subordinate to some other object, the plain reference pattern is appropriate. Non-zero reference counts on the object must be treated as internal references to the parent object, and converting internal references to external references must follow the same rules as for the parent object. Examples: b_count in struct buffer_head.

  • biased-reference: When you feel the need to add a large bias to the value in a reference count to indicate some particular state, don't. Use a flag bit elsewhere. This is an anti-pattern.

Next week we will move on to another area where the Linux kernel has proved some successful design patterns and explore the slightly richer area of complex data structures. (Part 2 and part 3 of this series are now available.)

Exercises

As your author has been reminded while preparing this series, there is nothing like a directed study of code to clarify understanding of these sorts of issues. With this in mind, here are some exercises for the interested reader.

  1. Replace s_active and s_count in struct super_block with krefs, discarding S_BIAS in the process. Compare the result with the original using the trifecta of correctness, maintainability, and performance.

  2. Choose a more meaningful name for mnt_pinned and the related functions that manipulate it.

  3. Add a function to the kref library that makes use of atomic_inc_not_zero(), and using it (or otherwise) remove the use of atomic_dec_and_lock() on a kref in net/sunrpc/svcauth.c, a usage which violates the kref abstraction.

  4. Examine the _count reference count in struct page (see mm_types.h for example) and determine whether it behaves most like a kref or a kcref (hint: it is not "plain"). This should involve identifying any and all internal references and related locking rules. Identify why the page cache (struct address_space.page_tree) owns a counted reference, or explain why it should not. This will involve understanding page_freeze_refs() and its usage in __remove_mapping(), as well as page_cache_{get,add}_speculative().

Bonus credit: provide a series of minimal self-contained patches to implement all changes that the above investigations proved useful.


