July 14 news: From 2 p.m. on the 11th to 4 p.m. on the 12th, the Art Dragon Travel Network website was continuously inaccessible. It is understood that the incident began with the failure of an EMC storage device, and that weaknesses in Art Dragon's storage architecture kept it from being repaired for a long time.
The incident has sparked wide discussion of system architecture in the Internet industry. During the outage, Art Dragon could run neither its web services nor its call center business. According to some media calculations, Art Dragon's direct loss of operating income exceeded 147,000, and the potential impact on its customers cannot be estimated.
EMC storage failure triggers a chain reaction
On the afternoon of the 11th, netizens reported errors when accessing the site, and an official notice soon appeared: "System failure, repairing ...". The failure directly affected business operations, because the website and the call center are the company's main operating channels.
At 8 a.m. on the 12th, Art Dragon CEO Cui said that a failure in Art Dragon's storage system had disrupted all services, and that Art Dragon and EMC engineers had been working on the repair for 18 hours.
At this point, much attention turned to EMC, whose storage products Art Dragon uses. According to people close to the site, the downtime was indeed caused by a storage hardware problem, which hung the database and made system recovery take a long time.
On the afternoon of the 12th, the call center resumed air ticket service. Web services began to come back at 4 p.m., and all business was back in operation by 6 p.m.
Incomplete backup architecture prolongs the repair
Opinions differ on the cause of Art Dragon's problem. While many people blamed EMC's hardware, some enterprise technology architects began to speak up in EMC's defense.
Feng Dahui, CTO of the Lilac Garden website, said on Weibo that an EMC product would not stay down for dozens of hours, and one netizen said, "Even as a competitor of EMC, I have to say that this is not just a hardware problem."
Sun, who works at an IT services company, said he was on site on the 12th to take part in the system recovery. By his account, the failure of the EMC storage hardware triggered the whole incident, but because Art Dragon's database backups were inadequate and the storage layer had no disaster recovery scheme, system recovery was slow: although the hardware soon returned to normal, the system still could not work.
According to a more detailed explanation, when enterprises design the equipment architecture of an operating platform, they generally need backups at every level of the system to cope with sudden hardware and software failures: dual-machine hot standby on the server side, disaster recovery at the storage layer, and redundancy at the software layer. That way, problems can be caught in a timely manner.
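The layer-by-layer principle described above can be sketched as a simple check. This is only an illustration: the `Layer` type, the `single_points_of_failure` helper, and the replica counts in the example are hypothetical and do not describe any real deployment.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """One level of the platform, e.g. servers, storage, or software."""
    name: str
    independent_copies: int  # instances that can each take over on their own


def single_points_of_failure(layers):
    """Return the names of layers that have no independent standby."""
    return [layer.name for layer in layers if layer.independent_copies < 2]


# Hypothetical platform: hot-standby servers and redundant software,
# but only one set of storage hardware.
example_platform = [
    Layer("server", 2),
    Layer("storage", 1),
    Layer("software", 2),
]

print(single_points_of_failure(example_platform))
```

Any layer the check returns is one whose failure alone can stop the whole service, no matter how well the other layers are protected.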
In Art Dragon's case, the storage architecture provided only a high-performance cluster as backup, disaster recovery depended on a single set of storage hardware, and the software layer also lacked redundancy. So when the storage failed, the disaster recovery measures prepared in advance could not work.
"All the eggs were put in one basket. The basket failed, and all the eggs broke," Sun said.
Some storage vendors have said on Weibo that hardware cannot guarantee 100% data security and that there is no way to ensure hardware never fails, so enterprises need to reduce the impact of hardware errors on their operations.
As of press time, neither Art Dragon nor EMC had responded to questions about the technical process.
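The vendors' point can be illustrated with some rough arithmetic. The sketch below assumes that replicas fail independently and that any one working replica is enough to keep the service up. The 99% figure is an illustrative assumption, not a measured value for any product, and both function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent replicas when any one suffices."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a probability between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The service is down only when every replica is down at the same time.
    return 1.0 - (1.0 - single) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Illustrative only: a device assumed to be available 99% of the time.
for copies in (1, 2, 3):
    availability = combined_availability(0.99, copies)
    hours = expected_downtime_hours(availability)
    print(f"{copies} copies: {availability:.6f} available, {hours:.2f} h/year down")
```

Real failures are often correlated (shared power, shared software, shared site), so figures from this formula are better read as an optimistic bound than as a forecast.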
Art Dragon goes down again for an upgrade, possibly to strengthen its storage system
On the morning of the 14th, Art Dragon again announced that it would suspend operations for 7 hours to carry out a system upgrade. Before that, Cui had issued a "call for heroes" post on Weibo, inviting consultants, solution providers, and experts to advise on Art Dragon's data center system architecture, disaster recovery plan, and operations and maintenance management. This also shows how seriously Art Dragon is taking the incident.
Industry insiders believe that for an online service provider, staying online and stable at all times is what earns consumers' trust and reliance, and that a complete service stoppage has a considerable impact on customer experience.
According to storage technology sources, the mainstream disaster recovery architectures are already mature, and the main reason organizations adopt different architectures is still cost.
Others argue that this incident will boost the disaster recovery industry, as businesses and government agencies will invest in their data systems once they recognize the consequences of such unexpected failures.