http://www.csdn.net/article/2014-05-08/2819679
Summary: Per Brashers, now the founder of Yttibrium, formerly led Facebook's entire storage division. He has invented a number of storage platforms with far-reaching implications for the industry and holds 21 patented inventions (including pending applications).
The Sixth China Cloud Computing Conference, a top event for the cloud computing and big data industry, will be held May 20-23, 2014 at the Beijing National Convention Center.
The theme of this session is "Cloud computing and big data promote a smart China." Taking an international perspective on global cloud computing trends, the conference will use training courses, thematic forums, and project showcases to analyze core cloud computing and big data technologies in depth and, from the application side, explore practical experience in transportation, manufacturing, healthcare, education, finance, digital entertainment, and other fields. The conference lasts four days and features four characteristics: highlighting industry applications, sharing technology trends, promoting international cooperation, and building a win-win platform. With more refined content, attendance is expected to significantly exceed the more than 12,000 people of the previous session.
The Sixth China Cloud Computing Conference has invited Per Brashers, founder of Yttibrium and former head of Facebook's entire storage division, as a guest speaker.
Per Brashers is now the founder of Yttibrium. From January 2012 to August 2013, Per was chief technical architect of DataDirect Networks' storage solutions division. From May 2011 to November 2012, he was director of Facebook's entire storage division.
Per previously worked at EMC for 11 years. From June 2010 to May 2011, he was a senior technical expert in EMC's backup and data recovery business, designing solutions for customers' complex challenges. From November 2006 to June 2010, as director of EMC's NAS engineering division, he led the MPFS development team to $100 million in revenue. From January 2000 to November 2006, Per was a customer-facing technical business consultant for EMC in the western United States.
About Per Brashers
Per is a highly visionary storage strategist. He has invented several storage platforms with far-reaching implications for the industry, including Open Vault and the cold storage solutions of the Facebook-led Open Compute Project, and has designed multiple interconnect systems for use in data centers. Per is also an author of the pNFS block layout and the architect of one of today's fastest Hadoop storage array systems. He has also converted the traditional 3x replication strategy into an erasure-code solution to maximize storage efficiency.
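As a rough illustration of why that conversion matters, the sketch below compares the raw capacity consumed by 3x replication with that of a (k data + m parity) erasure code. The RS(10, 4) parameters and dataset size are illustrative assumptions, not a description of Facebook's actual layout.

```python
# A minimal sketch (not Facebook's actual configuration) comparing the raw
# capacity needed to store the same logical data under 3x replication versus
# a (k data + m parity) erasure code such as Reed-Solomon.

def replication_overhead(copies: int = 3) -> float:
    """Raw bytes stored per logical byte under n-way replication."""
    return float(copies)

def erasure_code_overhead(k: int, m: int) -> float:
    """Raw bytes stored per logical byte for a (k data + m parity) code."""
    return (k + m) / k

if __name__ == "__main__":
    logical_pb = 100  # illustrative dataset size in PB
    rep = replication_overhead(3)
    ec = erasure_code_overhead(k=10, m=4)   # survives any 4 lost fragments
    print(f"3x replication : {logical_pb * rep:.0f} PB raw ({rep:.2f}x)")
    print(f"RS(10,4) coding: {logical_pb * ec:.0f} PB raw ({ec:.2f}x)")
```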
Per holds 21 patented inventions (including pending applications), most of them in the storage area. He is skilled at solving problems in storage and storage network connectivity across block, file, and object storage, and at using erasure codes to achieve data distribution, flexible scheduling, and improved efficiency. Per is also an expert in deduplication to reduce storage costs. He is adept at translating user needs into actionable solutions, specializing in performance improvements for Hadoop and big data applications. His work spans NAS, SAN, IP network connectivity, backup and recovery solutions, and application performance analysis, and he has served in multiple roles such as standards drafter/editor, troubleshooting expert, and inventor.
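For the deduplication expertise mentioned above, the following is a minimal sketch of content-addressed chunk deduplication under simple assumptions (fixed 4 KB chunks, SHA-256 hashes). It illustrates the general technique only, not any specific product.

```python
# A minimal sketch of content-addressed deduplication: identical chunks are
# stored once and referenced by their hash. Chunk size and SHA-256 are
# illustrative choices, not a description of any specific product.
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunking for simplicity

def dedup_store(data: bytes, store: dict) -> list:
    """Split data into chunks, store each unique chunk once, return the recipe."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # only new content consumes space
        recipe.append(digest)
    return recipe

def dedup_read(recipe: list, store: dict) -> bytes:
    """Reassemble the original data from its chunk recipe."""
    return b"".join(store[d] for d in recipe)

if __name__ == "__main__":
    store = {}
    payload = b"A" * 8192 + b"B" * 4096 + b"A" * 4096  # repeated content
    recipe = dedup_store(payload, store)
    stored = sum(len(c) for c in store.values())
    assert dedup_read(recipe, store) == payload
    print(f"logical {len(payload)} B, stored {stored} B, saved {len(payload) - stored} B")
```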
Per's interests extend beyond storage: he strives to contribute to improving people's living environment, for example through persistent work on inventions related to household air efficiency. His hobbies include organic farming, and he is a home brewer who has earned an honorary master's degree in winemaking science from UC Davis and continues to refine his winemaking skills.
Per has a wealth of storage system design experience and a clear vision of how hardware and software combine. Below are the views he expressed at XLDB 2013 (the Seventh Extremely Large Databases Conference) in September 2013. Although some of these trends may already have materialized, their value should not be overlooked.
Hardware transformation will have a major impact on the software industry
Per shares the changing trends of the hardware landscape from five angles: storage, controller (or network), memory, CPU, and the data center environment, and analyzes the impact of these changes on applications:
I. Storage
1. Trends in storage: SATA-4 by the IETF; hybrid disks will be promising; object storage is steadily taking over; SSDs will still not be fully utilized in the short term; cloud storage will reduce organizations' internal disk purchases; mobile computing will be built entirely on SSDs; new types of disks designed for big data will be born; and organizations will be troubled by new disk-density issues.
2. Disk power consumption will affect adoption. It is believed that 2.5-inch disks will be widely used in big data scenarios, unless someone finds a way to optimize the power consumption of 3.5-inch disks.
3. Disk performance increases are not in sight. In the past, disk access speed grew considerably, but since 2000 it has gone a full 13 years without improvement.
4. History-based extrapolation. In the past, capacity grew linearly, but it has now essentially hit a bottleneck.
5. The future of disk density is worrying. As density increases, the rules of the game change. First, the focus has shifted to the lifetime of the data, and reducing operations has become everyone's goal. Second, the disk technology iteration cycle used to be 3 years but has now stretched to 5, so disk service life must increase, which will undoubtedly affect RMA rates and sales growth. Finally, new controllers need to be designed around the storage space and startup performance of existing data sets.
At a moment of data explosion, such a worrying outlook has a lot to do with application design, mainly in terms of development and data persistence:
1. Development: Do not expect storage capacity to explode in the short term. Increasing reliance on flash memory is an option, but its APIs are not fully mature; using flash disks designed for big data can be a way out, but persistence will be a fatal flaw.
2. Data persistence: New alternatives may not bring an intrinsic boost unless the application starts to break RAID apart from its original design to provide a DEC-like effect, and rack and tier layouts need to be designed as part of the system. Designing for RV (rotational vibration) and mitigating bit-error rates may help improve performance: if the application can correct some bit errors itself and retry only those that cannot be repaired, the system's IOPS will increase significantly, as the sketch below illustrates.
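Below is a minimal, hypothetical sketch of that read path: correctable bit errors are repaired in the application, and only reads that cannot be repaired are retried, so most soft errors never trigger extra I/O. The function names (read_block, correct_with_ecc) and error probabilities are invented for illustration and are not part of any real storage API.

```python
# A hypothetical sketch of an error-aware read path: repair correctable bit
# errors locally and re-issue the I/O only for uncorrectable reads.
import random

MAX_RETRIES = 3

def read_block(block_id: int) -> tuple:
    """Simulated device read: returns (data, had_bit_error)."""
    return (b"payload-%d" % block_id, random.random() < 0.10)

def correct_with_ecc(data: bytes) -> tuple:
    """Simulated ECC pass: most single-bit errors are repairable in place."""
    return (data, random.random() < 0.90)

def resilient_read(block_id: int) -> bytes:
    """Return block data, retrying only when local correction fails."""
    for attempt in range(MAX_RETRIES):
        data, had_error = read_block(block_id)
        if not had_error:
            return data                      # clean read, no extra I/O
        fixed, ok = correct_with_ecc(data)
        if ok:
            return fixed                     # repaired locally, no retry needed
    raise IOError(f"block {block_id}: uncorrectable after {MAX_RETRIES} attempts")

if __name__ == "__main__":
    print(resilient_read(42))
```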
II. Controller/Network
Controller and network trends: stronger performance and smaller size, with 12 Gb/s possibly the final state; SAS and PCIe will be the contenders in this field; PHY add-ins will require more complex configurations; chip sales will be split; DMA/RDMA has matured and device-level cooperation will increase, with organizations targeting the break-up of RAID; T10-DIF and other checksum/safety features; traditional RAID is still the main source of revenue; there will be huge changes in the network, such as SAS/PCIe/silicon photonics and OpenFlow/"agnostic networks".
What is the impact of controller/network changes on applications? This is again viewed in terms of development and data persistence:
Development: New types of communication channels will appear rapidly; open sockets allow additional design, while closed sockets will be replaced by intelligent controllers; new applications and drivers can benefit; new density solutions will bring more IOPS and save energy; flash assist will compensate for the lack of RPM.
Data persistence: Data will eventually become mobile; flat, non-layered topologies will provide better bandwidth; many persistence tasks, such as encryption and error handling, may come under pressure; network convergence means more reserved capacity; QoS is outdated, and new approaches need to be built.
III. Memory
1. This is an era of change: many players are introducing new, denser, slower DRAM alternatives; everyone is looking forward to persistent memory.
2. 3D NAND implementation: Toshiba has demonstrated it, other vendors are ready to release products in 2014, and the target is to displace DRAM (denser "DIMMs" and persistent host memory).
So what do memory changes mean for application design? Again, interpreted from the two aspects of development and data persistence:
Development: Motherboards will allow greater memory capacity, which particularly favors in-memory database development; access time will increase, which may have some impact on in-memory databases; the cost curve is still high.
Data persistence: More write cycles, with thermal and recovery issues resolved; "self-healing" firmware will help prevent data loss by assisting with error handling, but handling old data remains a problem.
IV. CPU
CPU trends: The frequency race has disappeared; the focus on multicore and offload continues to grow; libraries and other compile-time aids are becoming commonplace; low-power builds driven by the mobile market offer many interesting disaggregation options, with network components assembled and released on demand, ushering in the era of the software-defined computer.
Impact on applications. Development: more and more on-card operations; increased performance resulting in extreme density; new libraries will require application validation. Data persistence: more threads, more cores, and more fragments, with thresholds to watch; disaggregation means more error checking; offload may help, but expect more rigorous validation of the approach.
V. Data center
The data center environment mainly involves three parts: data center design, power distribution, and rack/server/storage. According to statistics, 21% of companies operate an intelligent data center, and data centers account for more than 50% of new project investment. As data centers take on a growing share of the IT sector's overall computing, more carbon dioxide (from power generation), heat, and wastewater are discharged. At the same time, high heat and other factors corrode more copper, silver, and other materials, and the rising failure rate of storage media is one of the problems to be solved urgently. In such conditions, the maturity of applications will be affected.
First, development: A data center environment is not built overnight, but equipment suppliers are striving to overcome these issues, since they do not want to see RMA costs rise. Larger devices mean flexible workload migration, and integration with DCIM tools will help the data center function properly.
Second, data persistence: Applications will take on more responsibility for availability, since data center failures are unavoidable. DCIM helps migrate loads and thus avoid downtime.