Dan Farino on MySpace's Architecture
Dan Farino was interviewed by Ryan Slobojan on May 25, 2009.
Hi, I'm Ryan Slobojan, and I'm here with Dan Farino of MySpace. Dan, can you describe your job at MySpace?

No problem. I am the Chief Systems Architect at MySpace. Simply put, I have built many of the custom performance-monitoring and debugging tools we use behind the scenes. When I first got there, the problem was that the system relied on a lot of manual configuration and manual management, with a large number of administrators writing one-off scripts to do things like restarting a server. You might spend thirty minutes or even an hour on a very simple task. So at first I focused on building automated tools from a systems-management perspective, and then on making it easier to handle things like troubleshooting and performance diagnostics.

Could you explain the challenges of debugging and tuning such a large site?

Most of the time the challenge is finding the problem among thousands of servers. So I developed a performance-monitoring system that shows real-time information for every machine in the server farm: CPU usage, the number of requests in the queue, the number of requests processed per second, and so on. That lets us look at the screen and visually spot a red server saying, "Hey, requests are piling up in my queue." The question then becomes, "Fine, we know which machine has a problem, but what do we do now?" If we took a memory dump and sent it to Microsoft, it could take a week before they came back and said, "Hi, your database server was down." We want to see that within a couple of hours, so we put a few simple tools into a right-click menu that let operators, not developers, discover things like "I see hundreds of threads blocked on database access, so I'm guessing it's a database problem." The focus is on making faulty servers highly visible and on building tools that let less specialized administrators identify problems quickly.
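As a rough illustration of the monitoring idea described above, here is a minimal C# sketch of the kind of per-server health record and threshold check such a dashboard might rely on. The type, property names, and threshold values are hypothetical, not MySpace's actual implementation.

```csharp
// Hypothetical per-server health sample, illustrating the kind of real-time
// metrics the dashboard described above might track for each machine.
public record ServerHealth(string HostName, double CpuPercent,
                           int QueuedRequests, double RequestsPerSecond);

public static class HealthCheck
{
    // Flag a server "red" when its request queue backs up or its CPU is pegged.
    // The thresholds here are made up purely for illustration.
    public static bool IsRed(ServerHealth s) =>
        s.QueuedRequests > 100 || s.CpuPercent > 95.0;
}
```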
Very interesting. Can you describe the architecture of MySpace from a technical perspective?

Sure. It probably makes the most sense to start from the front end. We have about 3,000 Windows Server 2003 IIS 6 web servers. The code runs on .NET 2.0, with a small portion on .NET 3.5. Mixing .NET versions can cause various problems, but the front end runs on .NET 2.0 and 3.5. We use SQL Server for back-end storage. However, database queries alone could not meet the site's scalability requirements, so we developed a custom cache component. It is a simple object store that communicates over a custom socket layer, with everything held in unmanaged memory on 64-bit .NET 2.0 machines, because one of the first problems we ran into was the .NET garbage collector. Even on a 64-bit platform, collecting and compacting billions of objects causes significant pauses, so one of the costs of scalability was writing our own unmanaged memory store. The .NET front end talks to this storage layer, which we call the cache tier, and that greatly reduces the load on the database. We have since upgraded the databases from SQL Server 2000 to 64-bit SQL Server 2005, which lets us keep much more data in memory and gives a big performance boost.

Originally the site ran ColdFusion 5 on Windows 2000 with a small number of databases. The first thing we did for database scalability was a vertical partition, or a horizontal partition, whatever you want to call it: we put every one million users into a separate database.
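To make the partitioning scheme concrete, here is a minimal C# sketch of how a user ID might be mapped to one of those one-million-user databases. The method and connection-string naming are assumptions made for illustration, not MySpace's actual code.

```csharp
public static class UserPartitioner
{
    private const int UsersPerDatabase = 1_000_000;

    // Map a user ID to a partition index: users 0-999,999 live in database 0,
    // users 1,000,000-1,999,999 in database 1, and so on.
    public static int GetDatabaseIndex(long userId) =>
        (int)(userId / UsersPerDatabase);

    // Hypothetical helper that turns the partition index into a connection
    // string name such as "UserDb002".
    public static string GetConnectionName(long userId) =>
        $"UserDb{GetDatabaseIndex(userId):D3}";
}
```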
This partitioning improves scalability, lets us add hardware easily, and isolates failures: if one database goes down, only a small slice of users sees an error or a maintenance message while we deal with the problem. We also built a custom distributed file system on Linux to store user-uploaded media and to load-balance it. All of the video and MP3 content sits on this custom DFS layer, which gives us redundancy across data centers, and the data is served straight off disk over HTTP.

What challenges have you faced in scaling the site? You mentioned that you started on ColdFusion servers and now run thousands of IIS servers. How did you make that switch? Do you grow the site simply by adding more servers?

A lot of servers go into the central cache, which takes enormous pressure off the database servers. Also, with a large number of back-end databases you run into a particular problem: if one database server goes down completely, it quickly rejects requests from the web servers, and that is actually not a problem. But if that back-end server merely slows down, imagine a web server serving a large number of different users; sooner or later one of those slow requests will block on the bad database.
The other requests keep working for a while, but eventually another request hits the bad database and blocks as well. Within ten seconds or so, every thread on the web server can end up blocked on that database. So one of the first things I built when I got there was an error-isolation system that runs on every web server and says, in effect, "I will only allow X concurrent requests to any one downstream scope at a time." That prevents a single bad server from taking down the whole site. As we grew, things would otherwise have gotten worse, because every server we added was another potential single point of failure that could stop the site from responding. Now we have that error-isolation module deployed in IIS, and it has increased the site's uptime; combined with the effect of the cache tier, uptime has improved even further.
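Here is a minimal C# sketch of that throttling idea, assuming a semaphore per downstream scope and a made-up concurrency limit. It illustrates the concept of failing fast when a scope is saturated; it is not MySpace's actual error-isolation module.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Illustrative throttle: allow at most MaxConcurrent in-flight requests per
// downstream scope (for example, per database partition), and fail fast
// instead of letting every worker thread block on a slow server.
public static class RequestThrottle
{
    private const int MaxConcurrent = 20; // made-up limit for illustration

    private static readonly ConcurrentDictionary<string, SemaphoreSlim> Scopes = new();

    public static async Task<T> RunAsync<T>(string scope, Func<Task<T>> work)
    {
        var gate = Scopes.GetOrAdd(scope, _ => new SemaphoreSlim(MaxConcurrent));

        // If the scope is already saturated, reject immediately rather than queueing.
        if (!await gate.WaitAsync(TimeSpan.Zero))
            throw new InvalidOperationException($"Scope '{scope}' is saturated; failing fast.");

        try { return await work(); }
        finally { gate.Release(); }
    }
}
```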
It sounds like you have built a lot by extending existing tools, so how has it been working on the Microsoft platform?

It has been very good. We use Microsoft's PowerShell. I mention its earlier code name, Monad, because we started using it in production back at Monad Beta 3. The timing was right, because we had pushed just about all of the recommended Windows management techniques, VBScript, batch files, and so on, to their limits. Those things work fine on a single machine, and they are fine if you want to look at the registry configuration on one other machine. But if you want to run commands across thousands of servers, you hit bottlenecks: you can only reach one server at a time, and the whole job stops if you hit a server that is down. To solve that I wrote a simple remoting layer myself, and then the question became, "How do I make this general-purpose?" We wanted to make it easy to operate on a set of servers, for example execute a command, get some results back, process them, and then maybe run another command, with these operations forming a pipeline.
That was exactly when PowerShell appeared, and it helped enormously. PowerShell is written entirely in .NET and runs all of its commands in the same process, and the advantage is that command A can emit all kinds of things, not just strings, and command B can keep working with them. Arbitrary .NET objects can be passed along the pipeline, which makes for a very powerful programming model for any administrator willing to sit down and write a few commands. So we decided to build commands around concepts like what we at MySpace call a "VIP," a set of servers behind a virtual IP that performs a particular function. If I say the "profiles VIP," I mean all of the hundreds of servers running profiles.myspace.com, and they can then be treated as a unit. So we built a command called "Get-VIP".
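As a rough sketch of what such a command could look like, here is a hypothetical C# cmdlet that emits one object per server behind a named VIP. The class names, the VipServer type, and the lookup logic are invented for illustration and are not MySpace's actual implementation.

```csharp
using System.Collections.Generic;
using System.Management.Automation;

// Hypothetical server record emitted into the pipeline as a real .NET object,
// so downstream commands can filter on its properties instead of parsing text.
public record VipServer(string VipName, string HostName);

[Cmdlet(VerbsCommon.Get, "Vip")]
public class GetVipCommand : Cmdlet
{
    [Parameter(Position = 0, Mandatory = true)]
    public string Name { get; set; }

    protected override void ProcessRecord()
    {
        // A real tool would query the load balancer or a configuration store;
        // here we fabricate a few host names purely for illustration.
        foreach (var server in LookupServers(Name))
            WriteObject(server);
    }

    private static IEnumerable<VipServer> LookupServers(string vip)
    {
        for (int i = 1; i <= 3; i++)
            yield return new VipServer(vip, $"{vip}-web{i:D3}");
    }
}
```

With something along these lines, `Get-Vip profiles` could then be piped into other commands that act on each server object in turn.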
We also have a RunAgent command that executes arbitrary commands on remote machines in parallel and puts whatever objects come back into the pipeline. For example, you can use PowerShell syntax to say "tell me whether this is true or false, whether this file exists or not," pass the result down the pipeline, and keep going, which lets you knock out specific administrative tasks very quickly. People writing scripts don't need to worry about network failures or multithreading, because I have already dealt with those problems: I abstracted away the network connections and the parallelism, so work that used to take thirty minutes or an hour gets done in about five seconds, more reliably, with better control and easier auditing and logging.

That's amazing. You have probably heard this question before: why did you choose Microsoft technology rather than some of the other common options?

Well, that decision was actually made before I arrived. When I got there, the system was running on the ColdFusion platform with about 20 million users. We had very, very good .NET developers, although I'm not sure whether we specifically went looking for .NET talent or whether we simply found the best developers and they said ".NET is a good thing, let's try it." But we had been on the Microsoft platform from very early on, so continuing with .NET was a very natural step. I am not a Java person; my background is closer to Microsoft technology, and Microsoft also gave us a lot of help getting through the rough patches while .NET was still a relatively new technology, so I feel pretty good about it.

You mentioned earlier that you mix .NET 2.0 and .NET 3.5. How do they fit together? Is there a plan to migrate entirely to 3.5?

Microsoft has gone a bit crazy with version numbers lately. I looked at their roadmap a while ago, and in practice 3.5 just adds a bit more on top of the 2.0 base, so once you install 3.5 you naturally have a 2.0 runtime underneath. It is a bit strange, but some of the things in 3.5 look good. For example, Windows Communication Foundation gives us very convenient web-service functionality, including things like transaction control, and lets us drop the old remoting and web-service programming models. We haven't used WCF for very long, but it looks exciting, and it can take a lot of burdens off us; for years we had to write our own network-communication libraries just because remoting and the web-service functionality had some rough edges. So the whole site is built on 2.0 and 3.5, and the tooling is largely built on PowerShell.

One more thing: you also have a whole set of debugging tools that you use across the system.

The one I like most is a tool we call the "Profiler." It is written in C++ on top of Microsoft's CLR profiling interface. The stack-dump side can show real-time information, such as what the current requests are doing, and the instrumentation side can say, "OK, I'm going to check this thread every ten seconds, watch everything it does, and tell you how much time each call takes, every exception, every memory allocation, every lock contention," and all kinds of information you simply cannot get by looking at static state in a debugger.
The value of this profiler is not only telling us what happened when something went wrong, but also telling us, under normal conditions, what effect those requests have on the system. A lot of different people have written a lot of different modules in our system, so even for someone like me who has a good feel for the system there are still surprises, like "I didn't know we actually did that." It is probably the most complex tool we have developed, and it is one of the very few not written in .NET, because you cannot use .NET code to monitor .NET code. There is a lot of C++ in it, and, interestingly, it takes advantage of the Detours library from Microsoft Research, which is technically similar to the way people used to sneak patches into a kernel: you move the existing code elsewhere and then call back into it, so the program never knows it has been patched. We use that approach to get information that Microsoft's interfaces cannot provide but that is sometimes particularly useful.

You mentioned C++, and earlier you talked about VB and VBScript. What languages are used inside MySpace?

We were using VBScript until about two years ago; now C# is our .NET development language. Our code itself is basically all C#. We also have some pioneers experimenting with things like F#, which comes from Microsoft and looks quite promising. I have embedded IronPython scripting in a number of tools, because I think it gives more control over configuration work: instead of hundreds of checkboxes to click through, I now have a few IronPython scripts that do the job. On the whole we use C# for both front-end and back-end development, and we reach for IronPython and PowerShell wherever they make sense.

Are you planning to move to the .NET 3.5 features such as LINQ?

Actually, LINQ impresses me more every day. I can't find any reason not to upgrade all the servers or to keep developers from using it. LINQ seems like a very ingenious technology, and for someone like me who writes SQL queries it's really cool to be able to use LINQ over XML, over objects, or over data in memory. Maybe one day we'll even use a custom provider, who knows. It's a good thing, and I hope to see LINQ in use on the servers soon.
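As a small illustration of the kind of in-memory querying mentioned above, here is a brief C# sketch using LINQ to Objects and LINQ to XML. The data and names are invented for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class LinqExample
{
    static void Main()
    {
        // LINQ to Objects: query an in-memory collection the way you would a table.
        var servers = new[]
        {
            new { Host = "web001", QueuedRequests = 12 },
            new { Host = "web002", QueuedRequests = 210 },
            new { Host = "web003", QueuedRequests = 7 },
        };

        var backedUp = from s in servers
                       where s.QueuedRequests > 100
                       orderby s.QueuedRequests descending
                       select s.Host;

        Console.WriteLine(string.Join(", ", backedUp)); // web002

        // LINQ to XML: the same query style over an XML fragment.
        var xml = XElement.Parse("<farm><server host='web004' cpu='97'/><server host='web005' cpu='40'/></farm>");
        var hot = xml.Elements("server")
                     .Where(e => (int)e.Attribute("cpu") > 90)
                     .Select(e => (string)e.Attribute("host"));

        Console.WriteLine(string.Join(", ", hot)); // web004
    }
}
```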
For the recently released MySpace developer platform, do you have plans to let developers use different languages such as C#, or will they continue to work in scripting languages like JavaScript or VBScript?

Honestly, the future isn't decided. I know the developer platform just released is mostly about small widgets on the site, and only JavaScript is supported at the moment. Perhaps at some point your own rich-client application will be able to call a full API into the system, but I really can't say yet.

For a huge site like MySpace, when you hit a problem, say server load spiking to a peak, is there any way to roll out a patch that makes the problem just disappear, or do you have to change things bit by bit and break the problem into pieces to resolve it?

That's a very good question, and I think you have to look at it in two parts: one is finding the problem, the other is fixing it. Many times we would find, "OK, requests are queueing up on the site, what should we do?" We didn't really know why it was happening, so we just added a few more servers. Well, things seemed to improve, but you don't actually know whether adding servers was the right call; maybe the back-end database just happened to be running a backup at the time. It's not only that you don't know where the fault is, it's that the current fix may not work the next time. So the difficulty at first was how to start identifying these issues, and how to put the people in the front-line Network Operations Center (NOC) in a position to spot them. Once that was solved, it became much easier for them to figure out whom to escalate a problem to.
Of course, sometimes it really is a code problem, and we try to keep the middle-tier and back-end code compatible with the front end. If we release something that behaves fine in QA and passes testing, but then misbehaves on the live site for reasons that aren't clear, the easiest fix is to roll back the release or disable the feature. In general, with the right toolkit in hand, there are very few situations where we can't find the problem. Our toolkit was much smaller at first than it is now, but if you run into a similar problem every day, you end up saying, "OK, let's check it with the debugger," then the next day, "I'll look at it with the debugger again," and on the third day, "Forget it, I'm going to write my own tool," even if it doesn't come in handy right away. The toolkit keeps getting more complete, so our ability to quickly discover and diagnose problems keeps improving.

What does the whole troubleshooting process look like? How has the system adapted to scaling requirements, is there a gap between theory and what's in production, and how has that affected the architecture?

Well, the architecture is basically three tiers: the front end, the cache tier, and the database, so it's hard to improve scalability dramatically in one stroke. We start by saying, "This database query isn't performing well, so let's try a simple fix and put the result in the cache." If that doesn't work, we might have to build another system. That's how we think about it: we have a lot of these systems, and if one of them has a performance problem we can optimize it individually, for example by moving that piece out of the database and onto disk, or into memory. If there's a problem somewhere, we build a new system for it. You don't see that much at MySpace because our early horizontal partitioning was very effective. But when you start aiming for more nines of availability, 99.99 percent and beyond, the decision-making changes a lot.

This may be a question that's two or three years old by now, but when people claim that ".NET cannot scale," how do you respond?

I don't think the problem exists; the key is whether you use the right tools and have experts in the area. Actually, I think .NET is a very mature platform by now. Obviously Java was ahead of .NET in this space, but I believe we are still the largest .NET site on the internet. I don't know how you would compare our uptime and performance with Java sites of a similar size, but none of the problems we run into come from the .NET platform itself. They may be our own bugs or hardware problems, but aside from garbage collection, perhaps, I really haven't hit anything in .NET where I would say, "this is really hard to scale."
But the best thing about .NET is its flexibility: if the garbage collector doesn't manage memory well on your 16 GB machine, you can swap in Berkeley DB or your own unmanaged store. You can enjoy the advantages of .NET without being completely confined to .NET. So I would say .NET scales well, and our scale keeps growing; ask me again in a few years and see what I say then.

Do you use any other Microsoft enterprise products on the server or scalability side?

I'd say we try things when they come out. For example, we've tried the Enterprise Application Blocks, remoting, and web services. If you really want to build at large scale, here's my take: in general we end up with our own implementations, because we don't need such general-purpose solutions; what we really need is performance. What I mean is that after we try something, we learn from it, we may peel out the valuable parts, and then we write a new version tailored to our performance or scalability needs. I don't think Microsoft tests at the scale of our applications, so in a sense we're testing for them. So if something works fine for a program running on dozens of servers but doesn't meet MySpace's scalability requirements, I think that's just to be expected.

Can you give an example where .NET or Microsoft told you what you should do, and you did the opposite?

I think the profiler I wrote is one. It does plenty of things that would definitely violate the warranty if they found out, for example patching CLR code for performance. We have read a lot of the C++ runtime code, we regularly use Reflector to look at the internals, and occasionally we have even decompiled product code to fix things we thought Microsoft did badly. But we try to avoid that as much as possible, so I can't give a strong example of going in the completely opposite direction. Every time we hit one of these little problems we say, "Let's tweak it a bit and hopefully it works," rather than, "Fine, let's adopt a whole new technology."

Can you talk a little more about the cache tier? I'm interested in how you deal with some of the classic caching problems, such as updates.

We're dealing with exactly that right now. At the moment the cache tier is neither write-through nor read-through. Basically the web server does it in two steps: first it checks the cache, and if nothing is there, the web server pulls the object from the database, serializes it, sends it to the user's page, and then asynchronously submits it to the cache, so the next request can repeat the process from the cache. I believe at some point we will move to a more traditional three-tier model where the web servers no longer talk to the database directly, but today we're still basically a simple two-tier web-server-and-database setup with the cache tier as an add-on. As we grow, we'll push more and more access through the cache, and eventually the web servers will connect to the cache servers rather than to the database.
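Here is a minimal C# sketch of that read path, a cache-aside lookup with an asynchronous cache fill. The ICache and IDatabase interfaces and their method names are assumptions made for the illustration, not MySpace's actual API.

```csharp
using System.Threading.Tasks;

// Hypothetical interfaces standing in for the cache tier and the database.
public interface ICache
{
    Task<T?> GetAsync<T>(string key) where T : class;
    Task PutAsync<T>(string key, T value) where T : class;
}

public interface IDatabase
{
    Task<T> LoadAsync<T>(string key) where T : class;
}

public class CacheAsideReader
{
    private readonly ICache _cache;
    private readonly IDatabase _db;

    public CacheAsideReader(ICache cache, IDatabase db) { _cache = cache; _db = db; }

    public async Task<T> GetAsync<T>(string key) where T : class
    {
        // 1. Check the cache first.
        var cached = await _cache.GetAsync<T>(key);
        if (cached is not null) return cached;

        // 2. On a miss, load from the database and serve the page from that.
        var fromDb = await _db.LoadAsync<T>(key);

        // 3. Populate the cache asynchronously so the next request can hit it;
        //    the response does not wait for the cache write to finish.
        _ = _cache.PutAsync(key, fromDb);

        return fromDb;
    }
}
```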
But for now the object store is working quite well, and we've put a lot of thought into it. It just stores objects; it doesn't know what it's holding or where the data came from, which is probably good for performance, though I'm not sure. When we needed a cache a few years ago, we designed it as a service that could be expanded easily. We now have 400 of those servers running, and if something starts to slow down we just expand it. It's nothing special.