C++ and the Perils of Double-Checked Locking: Part I


In this two-part article, Scott and Andrei examine double-checked locking.

Google the newsgroups or web for the names of design patterns, and you're sure to find that one of the most commonly mentioned is Singleton. Try to put Singleton into practice, however, and you're all but certain to bump into a significant limitation: as traditionally implemented, Singleton isn't thread safe.

Much effort has been put into addressing this shortcoming. One of the most popular approaches is a design pattern in its own right, the Double-Checked Locking Pattern (DCLP); see Douglas C. Schmidt et al., "Double-Checked Locking," and Douglas C. Schmidt et al., Pattern-Oriented Software Architecture, Volume 2. DCLP is designed to add efficient thread safety to initialization of a shared resource (such as a Singleton), but it has a problem: it's not reliable. Furthermore, there's virtually no portable way to make it reliable in C++ (or in C) without substantively modifying the conventional pattern implementation. To make matters even more interesting, DCLP can fail for different reasons on uniprocessor and multiprocessor architectures.

In this two-part article, we explain why Singleton isn't thread safe, how DCLP attempts to address that problem, why DCLP may fail on both uniprocessor and multiprocessor architectures, and why you can't (portably) do anything about it. Along the way, we clarify the relationships among statement ordering in source code, sequence points, compiler and hardware optimizations, and the actual order of statement execution. Finally, in the next installment, we conclude with some suggestions regarding how to add thread safety to Singleton (and similar constructs) such that the resulting code is both reliable and efficient.
The Singleton Pattern and Multithreading

The traditional implementation of the Singleton pattern (see Erich Gamma et al., Design Patterns: Elements of Reusable Object-Oriented Software) is based on making a pointer point to a new object the first time the object is requested. In a single-threaded environment, Example 1 generally works fine, though interrupts can be problematic. If you are in Singleton::instance, receive an interrupt, and invoke Singleton::instance from the handler, you can see how you'd get into trouble. Interrupts aside, however, this implementation works fine in a single-threaded environment.

// from the header file
class Singleton {
public:
    static Singleton* instance();
    ...
private:
    static Singleton* pInstance;
};

// from the implementation file
Singleton* Singleton::pInstance = 0;

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        pInstance = new Singleton;
    }
    return pInstance;
}

Example 1: In single-threaded environments, this code generally works okay.

Unfortunately, this implementation isn't reliable in a multithreaded environment. Suppose that Thread A enters the instance function, executes the test of pInstance, and is then suspended. At the point where it's suspended, it has just determined that pInstance is null; that is, no Singleton object has yet been created.

Thread B now enters instance and performs the same test. It sees that pInstance is null, so it proceeds to create a Singleton for pInstance to point to. It then returns pInstance to instance's caller.

At some point later, Thread A is allowed to continue running, and the first thing it does is execute the assignment: it conjures up another Singleton object and makes pInstance point to it. It should be clear that this violates the meaning of a Singleton, as there are now two Singleton objects.

Technically, the definition in the implementation file is where pInstance is initialized, but for practical purposes, it's the assignment inside instance that makes it point where we want it to, so for the remainder of this article, we'll treat that assignment as the point where pInstance is initialized.

Making the classic Singleton implementation thread safe is easy. Just acquire a lock before testing pInstance, as in Example 2. The downside to this solution is that it may be expensive. Each access to the Singleton requires acquisition of a lock, but in reality, we need a lock only when initializing pInstance. That should occur only the first time instance is called. If instance is called n times during the course of a program run, we need the lock only for the first call. Why pay for n lock acquisitions when you know that n-1 of them are unnecessary? DCLP is designed to prevent you from having to.

Singleton* Singleton::instance() {
    Lock lock;   // acquire lock (params omitted for simplicity)
    if (pInstance == 0) {
        pInstance = new Singleton;
    }
    return pInstance;
}                // release lock (via Lock destructor)

Example 2: Acquiring a lock before testing pInstance.
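
The Lock class in Example 2 (and in the examples that follow) is left undefined, with its parameters omitted for simplicity. Purely as a rough sketch of what such a scope-based lock might look like, assuming a POSIX platform and hypothetical names, one could write:

#include <pthread.h>

// Hypothetical RAII lock guard, sketched only so the examples read concretely.
class Lock {
public:
    explicit Lock(pthread_mutex_t& m) : mutex_(m) {
        pthread_mutex_lock(&mutex_);     // acquire on construction
    }
    ~Lock() {
        pthread_mutex_unlock(&mutex_);   // release on destruction
    }
private:
    pthread_mutex_t& mutex_;
    Lock(const Lock&);                   // copying a lock guard makes no sense,
    Lock& operator=(const Lock&);        // so leave these undeclared-in-effect (C++98 style)
};

A caller would then hold a mutex for the duration of a scope with something like static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER; followed by Lock lock(m); as the examples in this article do in abbreviated form.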
The Double-Checked Locking Pattern

The crux of DCLP is the observation that most calls to instance will see that pInstance is non-null, hence not even try to initialize it. Therefore, DCLP tests pInstance for nullness before trying to acquire a lock. Only if the test succeeds (that is, if pInstance has not yet been initialized) is the lock acquired. After that, the test is performed again to make sure pInstance is still null (hence, the "double-checked" locking). The second test is necessary because it's possible that another thread happened to initialize pInstance between the time pInstance was first tested and the time the lock was acquired.

Example 3 is the classic DCLP implementation (see Douglas C. Schmidt et al., "Double-Checked Locking," and Douglas C. Schmidt et al., Pattern-Oriented Software Architecture, Volume 2). The papers defining DCLP discuss some implementation issues (that is, the importance of volatile-qualifying the singleton pointer and the impact of separate caches on multiprocessor systems, both of which we address later, as well as the need to ensure the atomicity of certain reads and writes, which we don't discuss in this article), but they fail to consider a much more fundamental problem: ensuring that the machine instructions executed during DCLP are executed in an acceptable order. It is this fundamental problem we focus on here.

Singleton* Singleton::instance() {
    if (pInstance == 0) {        // 1st test
        Lock lock;
        if (pInstance == 0) {    // 2nd test
            pInstance = new Singleton;
        }
    }
    return pInstance;
}

Example 3: The classic DCLP implementation.

DCLP and Instruction Ordering

Consider again pInstance = new Singleton;, the line that initializes pInstance. This statement causes three things to happen (a source-order sketch of these steps appears after the list):

Step 1. Allocate memory to hold a Singleton object.
Step 2. Construct a Singleton object in the allocated memory.
Step 3. Make pInstance point to the allocated memory.
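
Written out in that order, the statement might be sketched as follows (our illustration, not code from the papers; it assumes the <new> header for placement new):

// Conceptual expansion of "pInstance = new Singleton;" in source order.
void* raw = operator new(sizeof(Singleton));   // Step 1: allocate raw memory
Singleton* p = new (raw) Singleton;            // Step 2: construct the object there
pInstance = p;                                 // Step 3: only now publish the pointer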

Of critical importance is the observation that compilers are not constrained to perform these steps in this order! In particular, compilers are sometimes allowed to swap Steps 2 and 3. Why they might want to do that is a question we'll address in a moment. For now, let's focus on what happens if they do.

Consider Example 4, where we've expanded pInstance's initialization line into the three constituent tasks just mentioned and where we've merged Steps 1 (memory allocation) and 3 (pInstance assignment) into a single statement that precedes Step 2 (Singleton construction). The idea is not that a human would write this code. Rather, it's that a compiler might generate code equivalent to this in response to the conventional DCLP source code that a human would write.

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            pInstance = static_cast<Singleton*>(    // Step 3
                operator new(sizeof(Singleton)));   // Step 1
            new (pInstance) Singleton;              // Step 2
        }
    }
    return pInstance;
}

Example 4: pInstance's initialization line expanded into three tasks.

In general, this is not a valid translation of the original DCLP source code, because the Singleton constructor called in Step 2 might throw an exception, and if an exception is thrown, it's important that pInstance has not yet been modified. That's why, in general, compilers cannot move Step 3 above Step 2. However, there are conditions under which this transformation is legitimate. Perhaps the simplest such condition is when a compiler can prove that the Singleton constructor cannot throw (via post-inlining flow analysis, for instance), but that isn't the only condition. Some constructors that throw can also have their instructions reordered such that this problem arises.
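
To make that simplest condition concrete, here is a hypothetical sketch (the member names are ours, not the article's) of the kind of constructor a compiler could prove non-throwing once it has inlined it:

// Hypothetical illustration: after inlining, the compiler sees nothing but
// stores of built-in types, so it can prove that construction cannot throw.
// With the exception path ruled out, performing the pInstance assignment
// (Step 3) before the construction (Step 2) is no longer detectable by a
// single-threaded program, which is all the Standard requires it to preserve.
Singleton::Singleton()
  : someCounter(0),   // 'someCounter' and 'someFlag' are illustrative members only
    someFlag(false)
{}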

Given the above translation, consider the following sequence of events:

Thread A enters instance, performs the first test of pInstance, acquires the lock, and executes the statement made up of Steps 1 and 3. It is then suspended. At this point, pInstance is non-null, but no Singleton object has yet been constructed in the memory pInstance points to.

Thread B enters instance, determines that pInstance is non-null, and returns it to instance's caller. The caller then dereferences the pointer to access the Singleton that, oops, has not yet been constructed.

DCLP works only if Steps 1 and 2 are completed before Step 3 is performed, but there is no way to express this constraint in C or C++. That's the dagger in the heart of DCLP: You need to define a constraint on relative instruction ordering, but the languages give you no way to express the constraint.

Yes, the C and C++ Standards (ISO/IEC 9899:1999 and ISO/IEC 14882:1998(E), respectively) do define sequence points, which define constraints on the order of evaluation. For example, paragraph 7 of Section 1.9 of the C++ Standard encouragingly states:

At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

Furthermore, both Standards state that a sequence point occurs at the end of each statement. So it seems that if you're just careful about how you sequence your statements, everything falls into place.

Oh, Odysseus, don't let thyself be lured by sirens' voices; for much trouble is waiting for thou and thy mates!

Both Standards define correct program behavior in terms of the observable behavior of an abstract machine. But not everything about this machine is observable. For example, consider function Foo in Example 5 (which looks silly, but might plausibly be the result of inlining some other functions called by Foo).

void Foo() {
    int x = 0, y = 0;          // Statement 1
    x = 5;                     // Statement 2
    y = 10;                    // Statement 3
    printf("%d, %d", x, y);    // Statement 4
}

Example 5: This code could be the result of inlining some other functions called by Foo.

In both C and C++, the Standards guarantee that Foo prints "5, 10", but that's about the extent of what we're guaranteed. We don't know whether statements 1-3 will be executed at all and, in fact, a good optimizer will get rid of them. If statements 1-3 are executed, we know that statement 1 precedes statements 2-4 and, assuming that the call to printf isn't inlined and the result further optimized, we know that statement 4 comes last, but we know nothing about the relative ordering of statements 2 and 3. Compilers might choose to execute statement 2 first, statement 3 first, or even to execute them both in parallel, assuming the hardware has some way to do it. Which it might well have. Modern processors have a large word size and several execution units. Two or more arithmetic units are common. (For example, the Pentium 4 has three integer ALUs, PowerPC's G4e has four, and Itanium has six.) Their machine language allows compilers to generate code that yields parallel execution of two or more instructions in a single clock cycle.

Optimizing compilers carefully analyze and reorder your code so as to execute as many things at once as possible (within the constraints on observable behavior). Discovering and exploiting such parallelism in regular serial code is the single most important reason for rearranging code and introducing out-of-order execution. But it's not the only reason. Compilers (and linkers) might also reorder instructions to avoid spilling data from a register, to keep the instruction pipeline full, to perform common subexpression elimination, and to reduce the size of the generated executable (see Bruno De Bus et al., "Post-pass Compaction Techniques").

When performing these kinds of optimizations, C and C++ compilers and linkers are constrained only by the dictates of observable behavior on the abstract machines defined by the language Standards, and, this is the important bit, those abstract machines are implicitly single threaded. As languages, neither C nor C++ have threads, so compilers don't have to worry about breaking threaded programs when they optimize. It should, therefore, not surprise you that they sometimes do.

That being the case, how can you write C and C++ multithreaded programs that actually work? By using system-specific libraries defined for that purpose. Libraries such as POSIX threads (pthreads) (ANSI/IEEE 1003.1c-1995) give precise specifications for the execution semantics of various synchronization primitives. These libraries impose restrictions on the code that library-conformant compilers are permitted to generate, thus forcing such compilers to emit code that respects the execution ordering constraints on which those libraries depend. That's why threading packages have parts written in assembler or issue system calls that are themselves written in assembler (or in some unportable language): You have to go outside Standard C and C++ to express the ordering constraints that multithreaded programs require. DCLP tries to get by using only language constructs. That's why DCLP isn't reliable.
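
To make the contrast concrete, here is a rough sketch (ours, assuming a POSIX platform) of Example 2 with the lock spelled out in terms of the pthreads API. The ordering and visibility guarantees come from the library's specification of pthread_mutex_lock and pthread_mutex_unlock, not from anything the C++ language itself promises:

#include <pthread.h>

static pthread_mutex_t initMutex = PTHREAD_MUTEX_INITIALIZER;

Singleton* Singleton::instance() {
    pthread_mutex_lock(&initMutex);   // pthreads, not C++, guarantees the ordering here
    if (pInstance == 0) {
        pInstance = new Singleton;
    }
    Singleton* p = pInstance;         // read the pointer while the lock is still held
    pthread_mutex_unlock(&initMutex);
    return p;
}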

As a rule, programmers don't like to be pushed around by their compilers. Perhaps you are such a programmer. If so, you may be tempted to try to outsmart your compiler by adjusting your source code so that pInstance remains unchanged until after Singleton's construction is complete. For example, you might try inserting the use of a temporary variable, as in Example 6. In essence, you've just fired the opening salvo in a war of optimization. Your compiler wants to optimize. You don't want it to, at least not here. But this is not a battle you want to get into. Your foe is wily and sophisticated, imbued with strategems dreamed up over decades by people who do nothing but think about this kind of thing all day long, day after day, year after year. Unless you write optimizing compilers yourself, they are way ahead of you. In this case, for example, it would be a simple matter for the compiler to apply dependence analysis to determine that temp is an unnecessary variable, hence, to eliminate it, thus treating your carefully crafted "unoptimizable" code as if it had been written in the traditional DCLP manner. Game over. You lose.

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            Singleton* temp = new Singleton;   // initialize to temp
            pInstance = temp;                  // assign temp to pInstance
        }
    }
    return pInstance;
}

Example 6: Using a temporary variable.

If you reach for bigger ammo and try moving temp to a larger scope (say, by making it file static), the compiler can still perform the same analysis and come to the same conclusion. Scope, schmope. Game over. You lose. So, you call for backup. You declare temp extern and define it in a separate translation unit, thus preventing your compiler from seeing what you are doing. Alas, some compilers have the optimizing equivalent of night-vision goggles: They perform interprocedural analysis, discover your ruse with temp, and again optimize it out of existence. Remember, these are optimizing compilers. They're supposed to track down unnecessary code and eliminate it. Game over. You lose.
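
For the record, the doomed extern-temp maneuver might look roughly like this (our sketch of the attempt, with hypothetical file names, not a recommendation):

// singleton.cpp -- the "hide temp in another translation unit" ruse
extern Singleton* temp;                 // defined elsewhere, so the compiler
                                        // supposedly can't see what we're doing
Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            temp = new Singleton;       // the hope: keep pInstance null until...
            pInstance = temp;           // ...construction is complete
        }
    }
    return pInstance;
}

// temp.cpp -- the separate definition
Singleton* temp = 0;

As the paragraph above notes, interprocedural analysis can still determine that temp is unnecessary and eliminate it.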

So you try to disable inlining by defining a helper function in a different file, thus forcing the compiler to assume that the constructor might throw an exception and, therefore, delay the assignment to pInstance. Nice try, but some build environments perform link-time inlining followed by even more code optimizations (see Bruno De Bus et al., "Post-pass Compaction Techniques;" Robert Cohn et al., "Spike: An Optimizer for Alpha/NT Executables;" and Matt Pietrek, "Link-Time Code Generation"). Game over. You lose.
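
That attempt might be sketched like this (again our illustration, with hypothetical names, and it assumes the helper has access to Singleton's constructor). As the paragraph explains, it is futile once link-time inlining enters the picture:

// createSingleton.cpp -- out-of-line factory, intended to hide the constructor
// so the compiler must assume the call might throw and delay the assignment
Singleton* createSingleton() {
    return new Singleton;
}

// singleton.cpp
Singleton* createSingleton();           // no definition visible here... until link time

Singleton* Singleton::instance() {
    if (pInstance == 0) {
        Lock lock;
        if (pInstance == 0) {
            pInstance = createSingleton();   // link-time inlining can expose and reorder this
        }
    }
    return pInstance;
}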

Nothing you do can alter the fundamental problem: You need to be able to specify a constraint on instruction ordering, and your language gives you no way to do it.
Next Month

In the next installment of this two-part article, we'll examine the role of the volatile keyword, show what impact DCLP has on multiprocessor machines, and conclude with a few suggestions.
References

Bruno De Bus, Daniel Kaestner, Dominique Chanet, Ludo Van Put, and Bjorn De Sutter. "Post-pass Compaction Techniques." Communications of the ACM, 46(8):41-46, August 2003. ISSN 0001-0782. http://doi.acm.org/10.1145/859670.859696.

Robert Cohn, David Goodwin, P. Geoffrey Lowney, and Norman Rubin. "Spike: An Optimizer for Alpha/NT Executables." http://www.usenix.org/publications/library/proceedings/usenix-nt97/presentations/goodwin/index.htm, August 1997.

ANSI/IEEE 1003.1c-1995, 1995. IEEE Standard for Information Technology. Portable Operating System Interface (POSIX), System Application Program Interface (API) Amendment 2: Threads Extension (C Language).

Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.

Matt Pietrek. "Link-Time Code Generation." MSDN Magazine, May 2002. http://msdn.microsoft.com/msdnmag/issues/02/05/Hood/.

Douglas C. Schmidt and Tim Harrison. "Double-Checked Locking." In Robert Martin, Dirk Riehle, and Frank Buschmann, editors, Pattern Languages of Program Design 3, Addison-Wesley, 1998. http://www.cs.wustl.edu/~schmidt/PDF/DC-Locking.pdf.

Douglas C. Schmidt, Michael Stal, Hans Rohnert, and Frank Buschmann. Pattern-Oriented Software Architecture, Volume 2. Wiley, 2000. Tutorial notes based on the patterns in the book are available at http://cs.wustl.edu/~schmidt/posa2.ppt.

ISO/IEC 14882:1998(E) International Standard. Programming Languages - C++. ISO/IEC, 1998.

ISO/IEC 9899:1999 International Standard. Programming Languages - C. ISO/IEC, 1999.

Scott is author of Effective C++ and consulting editor for the Addison-Wesley Effective Software Development series. He can be contacted at http://aristeia.com/. Andrei is the author of Modern C++ Design and a columnist for the C/C++ Users Journal. He can be contacted at http://moderncppdesign.com/.

Related Reading

C++ and the Perils of Double-Checked Locking: Part II

Article Source:
http://www.drdobbs.com/cpp/c-and-the-perils-of-double-checked-locki/184405726
