As Internet technology has developed, a vast amount of information is produced online every day, including semi-structured and unstructured data. By analyzing this information at scale, organizations can find out what their customers really need and why they need it. Apache Hadoop has become the driving force behind the development of the big data industry.
Facebook engineers believe they run the largest Hadoop-based data platform. Jay Parikh, vice president of infrastructure engineering at Facebook, says most of Facebook's website data is stored in a single 100 PB cluster, and that Facebook's clusters are unique compared with other companies' clusters.
The Facebook product team evaluates products by scanning 105 TB of data every 30 minutes, while Facebook manages millions of photos and billions of Like-button traffic logs to recommend content to users based on their preferences.
The following is Facebook's daily data traffic:
2.7 billion Like-button clicks
300 million photos uploaded to Facebook
70,000 queries executed (manual or automated)
More than 500 TB of data growth
Original link: CNET (compiled by Li; revised by Zhang Zhiping)