This revised article is written for the webmasters of Web 2.0 sites.


When the Internet entered the 2.0 era, when its technology was no longer so unattainable, and when copying became commonplace, the Internet boomed.

MySpace is booming, and ever more MySpace clones are emerging in China.

Just as YouTube was getting off the ground, video websites sprang up all over China.

51

Facebook has changed what Chinese webmasters plagiarize; they no longer bother studying ChinaRen.
..........

When plagiarism becomes a habit, what I want to ask is: imitating webmasters, are you ready?

If you are planning to build a throwaway site or just earn some advertising fees, please skip this article. From a technical point of view, I will talk about imitating Web 2.0 websites.

When neither investment nor traffic is a problem, what I want to ask is: will it really be smooth sailing?

Take SNS websites as an example: when your 2.0 site rushes online, when a round of investment comes in, and when the traffic climbs, where will your troubles be?

I have worked as a technical consultant for several Web 2.0 companies, and will briefly describe the problems they ran into (I use A, B, C, and D in place of their names for privacy). I will not go into page staticization, caching, code security, and similar issues here; the CTO of any serious technical company already knows these things. Let's talk about what happens after development.

Company A

Company A runs an SNS website. The program was written by two young men, with 51.com as the direct target. Development went smoothly, with even more features than 51.com, and promotion went smoothly too (Company A had its own unique promotion method). But when the Alexa ranking climbed, the problems came out: every day around four o'clock P.M., the site slowed to the point where it could not be opened, and all three of the company's servers sat at 100% CPU.

The depressing part is that the network setup was actually a dual-web-server cluster with a separate DB server. The entire bottleneck lay in the database, so I suggested building a database cluster and analyzing the data structures first. It was typical web-programmer work: no database design standards whatsoever. The feature implementation was acceptable, but with that schema, expanding into a cluster was basically impossible. What could be done? Nothing but rework. After a month of modifying the program, the data structures were essentially rebuilt, and the hundreds of thousands of users accumulated in the early stage went away.

Conclusion: in the preliminary design of a Web 2.0 site, we should consider not only the features but also, carefully, the underlying layer and the data structures.

Company B

Company B is also an SNS website, with a program developed by three people. The CEO, a master of economics from a famous university, has real strengths: strong operational ability and promising prospects. To be honest, the company has good potential. The system architecture was okay, and yet the system crashed. Why? It accounted for a massive number of users but not a massive number of files. Users' photo albums and images were all stored in one partition of the web server, one directory per user. When the performance monitor was turned on, the disk I/O was astonishingly high and response time had basically vanished.

As we all know, a file system is itself a kind of database. A single large file is no problem; the key is that here more than 300 GB of small fragmented files were under constant massive read/write load. The system crashed and data was lost: a chain in the file system broke, and all user data was gone. RAID cannot solve every problem. A disk array recovers when a hard disk is damaged, but here the file system itself was corrupted, and RAID cannot recover from that. This was very serious: the system was stopped for a whole month to recover the data. (Carving out individual files is easy, but no software could reorganize such a huge number of them; data-recovery tools generally died while building the index of the directory structure. Even a server with 16 GB of memory was tried for the restoration, without success.) The solution was to modify the program architecture and move to distributed file storage; the code change took 8 days, but migrating the files took nearly a month. 200,000 users were lost this way.
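The fix Company B eventually adopted, distributed file storage, can be sketched roughly as follows. The mount points and layout below are illustrative assumptions, not the company's actual setup; the idea is simply to hash each user onto a storage root and into shallow subdirectories, so no single directory (or single disk) ever accumulates millions of entries:

```python
import hashlib
from pathlib import Path

# Hypothetical storage mounts; in practice these would be separate
# machines or network exports, not local directories.
STORAGE_ROOTS = ["/mnt/store0", "/mnt/store1", "/mnt/store2"]

def shard_path(user_id: int, filename: str) -> Path:
    """Map a user's file into a two-level hashed directory tree on one
    of several storage roots. The same user always lands on the same
    root, so lookups need no central index."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    root = STORAGE_ROOTS[int(digest, 16) % len(STORAGE_ROOTS)]
    return Path(root) / digest[:2] / digest[2:4] / str(user_id) / filename
```

With a scheme like this, adding capacity means adding roots and rebalancing, rather than migrating 300 GB out of one partition under load.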

Conclusion: in the early stage of a Web 2.0 site, massive file storage must be planned for; fixing it later means modifying the whole program architecture. If the preliminary planning is poor, there is basically no way out.

Company C

Company C is a respectable company whose CEO has a technical background. Like Bill Gates, he never finished university; he made a fortune in text messaging from 2001 to 2003, and the small projects he built afterwards impressed me. The company focuses on alumni networking, but leans toward the MySpace style: personal homepages and promotion.

The cause of the system crash was actually very simple. They use Microsoft's SQL Server, and Microsoft's MSDN tells us that SQL Server does not support load-balancing clusters, only failover (disaster-recovery) clusters. Their database was overloaded and pegged at 100%, so all they could do was scale the single machine up; even on a box with four quad-core CPUs, the system still crashed. High interaction is destined to mean high load. The solution: start from the basics and tackle the few most expensive operations; split the database horizontally, grouping users 100,000 at a time; hash the users across the database group; split large tables vertically; and partition the files at the same time. Because the data structures were modified, nearly every part of the program had to be touched. Fortunately the system made no major mistakes during the change, and the losses were not large, but the user experience suffered.
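The horizontal split described above, one group per 100,000 users spread across database servers, can be sketched as a tiny routing function. The shard names and counts below are illustrative assumptions, not Company C's actual topology:

```python
USERS_PER_GROUP = 100_000               # the article's grouping: 100k users per group
SHARDS = ["db0", "db1", "db2", "db3"]   # hypothetical database servers

def shard_for_user(user_id: int) -> str:
    """Route a user to a database server: users are first grouped in
    blocks of 100,000, then the groups are spread over the shards."""
    group = user_id // USERS_PER_GROUP
    return SHARDS[group % len(SHARDS)]
```

The point is that every query path goes through one function like this, so the shard count can grow without touching application logic scattered across the codebase.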

Note: SQL Server can in fact be scaled out, usually through replication and distribution, but then the application must classify its database operations into updates and queries. This introduces a problem of its own: under a high-interaction workload with frequent updates, replication latency can be long, even five minutes! The application must be prepared to cope with that latency.

Conclusion: the preliminary design of a Web 2.0 site should hash the data well, and the program should be written to expand in step with the database.

Company D

Company D is a good company in every respect: CDN acceleration, N servers, and a well-run database (the CTO is a database expert). The reason its system went down: the web tier is easy to cluster, and they ran a four-server web cluster, yet all four machines toppled in turn, and the cause could not be found at first. After careful analysis it emerged, and I suspect it is the most common mistake a CTO makes, or rather a problem most never think of at all: uploads. During an upload, the thread holds its connection for the entire duration of the data transfer; enough concurrent uploads can take a web server down by themselves. The solution was the simplest one: separate uploads (and other expensive operations) onto their own servers, and make uploading asynchronous and distributed. The program changes were not large, but the damage to the user experience from half a month of crawling speeds should not be underestimated.
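The separation Company D ended up with can be sketched as a queue between the request handler and a background worker: the handler accepts the bytes and returns immediately, so no web-serving thread is held for the duration of a slow transfer. Everything below is an illustrative stand-in, not the company's code; in production the worker would live on a dedicated upload server:

```python
import queue
import threading

upload_jobs = queue.Queue()
stored_sizes = []          # stand-in for "written to the storage tier"

def upload_worker():
    """Runs on a dedicated thread (or dedicated machine): drains the
    queue and does the slow work, off the web-serving path."""
    while True:
        data = upload_jobs.get()
        if data is None:                 # shutdown sentinel
            break
        stored_sizes.append(len(data))   # e.g. resize, checksum, move to storage
        upload_jobs.task_done()

def handle_upload(data: bytes) -> str:
    upload_jobs.put(data)                # cheap; returns to the client at once
    return "accepted"
```

The design choice is that the web tier's cost per upload becomes a queue insert, so a burst of slow clients can no longer pin every serving thread at once.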

Conclusion: there is no conclusion. After all, CTOs with real experience of massive traffic are few, and they are all at the big-name websites.

Overall conclusion: imitation is easy, and this is not meant merely as cold water. Finding a few web programmers is easy, and development may even be fast, because a Web 2.0 site is, at bottom, database work, and anyone can operate a database. But making it genuinely scalable is not easy, because writing a program that copes with massive traffic is not easy. Programmers today are too self-assured; real experience is rare. Do not expect a programmer on a monthly salary of 5-10K to surprise you; coping with massive traffic does not come at that price. If you want your 2.0 site to hold up, here are several suggestions:

1. Have database experts design the database. Most programmers do not understand partitioned views, data hashing, or data grouping.

2. Design the program architecture well (this is not difficult with an expert to guide it) so that it stays scalable. For cost reasons, you can hire a part-time system architect to lay out the architecture and identify future bottlenecks.

3. Consider file storage. Its technical content looks very low but is actually very high; a reverse-proxy setup is worth considering. When file storage fails, the site is basically finished. It is not only a RAID problem or a storage-server problem: once the file system itself is corrupted, everything goes with it.

4. Consider China's national conditions; this is the most critical issue. You must plan for the China Telecom / China Netcom split. A CDN cannot solve everything, and it does little for interactive content. Worse, today's dual-line data centers are vulnerable to DDoS attacks. The reason is simple: a dual-line center is a private data center, so it never has much bandwidth, and a casual attack can knock it offline. (A joke on the side: I know the boss of a dual-line data center who bought a firewall rated for 4 Gbps to protect a total bandwidth of 1 Gbps, so attacks at mere megabit scale are handled with ease.)

5. Network latency. This must be taken into account in any distributed system: the program must tolerate data latency, that is, synchronization delay, of anywhere from 0 to 100 seconds. Do not underestimate those tens of seconds; the problem is big. If your site has interactive features such as instant chat, you can imagine the result; for instant chat you can use a reverse proxy (at high cost). Messages and comments are less affected, but if the system relies on caching and static pages for robustness, latency there can be disastrous. Static file updates and rewrites must be done asynchronously.
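That last point, asynchronous static rewriting, can be sketched with a "dirty set": the request path only marks a page as stale (cheap), and a background job rewrites each stale page once, however many times it was marked. The names below are illustrative:

```python
dirty_pages = set()   # pages whose static HTML is out of date
rewritten = []        # stand-in for files actually re-rendered to disk

def mark_dirty(page: str) -> None:
    """Called from the request path; O(1) and never touches the disk."""
    dirty_pages.add(page)

def flush_static() -> None:
    """Called from a background job; duplicate marks collapse into one rewrite."""
    while dirty_pages:
        rewritten.append(dirty_pages.pop())   # e.g. re-render the HTML file here
```

Because marking is decoupled from rendering, a storm of comments on one page costs one rewrite per flush cycle instead of one per comment.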

6. Spread your functions across machines. If you do not have the money for racks of servers, at least split by function: one server for photo albums, another for messages, and so on.

7. Watch your programmers. Without good incentives, a programmer can easily write perfunctory code, which becomes a major problem later: once the architecture is finalized, changing it costs enormous effort. Best of all is a CTO who is 100% committed and 100% accountable to you.

8. File synchronization may be harder than you expect. Look at the TTLs between China Netcom and China Telecom and you will understand: synchronization must support resumable transfer, or your costs will be n times higher. When traffic is high, use a dedicated synchronization server for updates; do not expect your application software to do it on the side. Hand the job to your programmers and tell them exactly what is required.

9. The hardest problem, and the biggest source of loss: no matter how good your relationship with the cyber police is, watch your users and review your content. Having your service shut down can be fatal; I have suffered such losses N times.

10. For the cache and static files, use an independent cache server to maintain, update, and delete the cache and file indexes.
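Point 10 can be sketched as a cache layer that also owns an index of which keys derive from which object, so an update deletes exactly the stale entries instead of flushing everything. The structure below is a minimal in-process illustration of the idea; a real deployment would put it on the independent cache server the author recommends:

```python
cache = {}      # cache key -> rendered value
key_index = {}  # object id -> set of cache keys built from that object

def cache_put(obj_id: int, key: str, value: str) -> None:
    """Store a value and remember which object it was derived from."""
    cache[key] = value
    key_index.setdefault(obj_id, set()).add(key)

def invalidate(obj_id: int) -> None:
    """On update of an object, delete every cache entry derived from it."""
    for key in key_index.pop(obj_id, set()):
        cache.pop(key, None)
```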

Finally, I wish every webmaster great success.
