Big Data in Silicon Valley, Part 1: Excavators and the big data companies that change the world

Source: Internet
Author: User

During this past month in Silicon Valley, I went to startup demo days and on day trips to various big companies, thinking I would escape the endless "big data" talk and excavator jokes back home. I never imagined there would be even more of it here. Hi~ This article comes from the heart of Silicon Valley and shares how big data is really growing on this land.

Next: http://www.china-cloud.com/yunjishu/shujuzhongxin/20141208_44108.html

What is a big data company that changes the world?

In the last two weeks, at Silicon Valley's two big demo conferences, more than 10 startups described themselves as big data companies. Some do consumer behavior, some sports analytics, some NGO financing, some environmental protection, some UX, some credit rating, and of course there is no shortage of mobile advertising. At first glance they all look like impressive, high-end products, but on closer thought some not-so-big details emerge.

For example, one pitch said it "brings big data to teams, media and fans," using Moneyball as its lead-in. After the presentation, I asked how they analyze video to extract all that data. The presenter said they bring in people to watch the videos. Yes, it is manual. Naturally, the next question was: how will you scale in the future to cope with full-length videos from all kinds of sporting events? His answer was simple: hire more people. I then asked how they plan to use the data they collect. The answer: open an API; they do no analysis themselves.

Well, where is the big data in that? Does any company that has data count as a big data company? If the Qingfeng bun shop has been buying and selling for more than half a century, does it have to be called a big data company too?

Yes and no.

First, a summary of the types of "big data company" in Silicon Valley; additions and corrections are welcome:

Data owners and data sources: their defining feature is a business advantage that lets them collect large amounts of data, just as a coal boss monopolizes the mines of a region. In fact, most companies able to generate or collect data belong to this type, such as Vantage QSL or a bun shop that has collected PB-level data.

Big data consulting: these companies are highly technical and provide services ranging from infrastructure planning and maintenance to software development and data analysis, but they do not own data. Cloudera, a startup of fewer than 500 people, is the most famous Hadoop architecture consulting firm.

Big data tools: for example Databricks, which came out of AMPLab, and the Yahoo-led Hortonworks.

Integrated applications: these companies collect or buy data and then combine it with AI to solve more practical pain points.

So, to answer the earlier question: yes, because as long as the bun shop collects enough consumer data it becomes a data owner, and with that much data there is the possibility of insight.

And no, because I believe the future is AI, and data is what AI feeds on. As in many industrial chains, the most difficult and most valuable innovations tend to occur close to the end user, as with the iPhone. The most valuable part of the big data industry is using machines to process data to gain insights, influence the behavior of organizations and individuals, and change the world. Data collection and collation will become standardized and automated in the future, and the ability to use AI for analytics will become ever more critical.

Silicon Valley's main AI companies can currently be divided into three categories: 1. Those that analyze user behavior to improve products and marketing, such as LinkedIn's recommendation system and in-store marketing built on iBeacon; 2. Those that coordinate large numbers of dispersed individuals, using big data for accurate and effective forecasting and planning, such as Uber and the recently launched Amazon Fresh and Grub Market; 3. Those that analyze and recognize various types of data to develop smarter devices and programs, such as Google Brain, self-driving cars, and the smart devices represented by Nest.

What these products obviously have in common is that they try to make machines smarter in order to reduce human workload. This goal is in line with the driving force of technological development, so I consider the fourth type of company listed above, the integrated application, the most promising for changing the world.

What kind of people do these big data companies need?

So what do big data companies, or rather big data companies that really can change the world, need? Let me introduce a term that is hot in Silicon Valley right now: the data scientist.

This position did not arise because data became large and needed better ways of being accessed; that is the work of data engineers. So why did it arise? Precisely to meet the needs of the fourth type of company above. Data is an integral part of AI, and the more the better. Mathematically, the more data we have, the more confidently we can generalize from results on the sample to unseen data. In other words, machine learning performs better and better, and AI becomes more and more intelligent.
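The link between data volume and confidence can be illustrated with a minimal sketch. It rests on assumptions that are not in the article: a single fixed model, independently drawn examples, and a per-example loss bounded in [0, 1], so that Hoeffding's inequality applies. The function name `hoeffding_margin` is hypothetical.

```python
import math


def hoeffding_margin(n_samples: int, delta: float = 0.05) -> float:
    """Return eps such that |sample error - true error| <= eps holds with
    probability at least 1 - delta, for ONE fixed model whose per-example
    loss lies in [0, 1] (an assumption of this sketch).

    Derived from Hoeffding's inequality: P(|gap| > eps) <= 2 * exp(-2 * n * eps^2).
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be strictly between 0 and 1")
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n_samples))


if __name__ == "__main__":
    # The margin shrinks as the sample grows.
    for n in (100, 10_000, 1_000_000):
        print(f"n={n:>9,}  margin={hoeffding_margin(n):.4f}")
```

Under these assumptions the margin falls with the square root of the sample size, so a 100-fold increase in data tightens the bound 10-fold.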

The resulting data scientist role is a very broad one. It requires statistical knowledge for analysis, the ability to select and optimize algorithms, and a deep understanding of the industry. This group of people is the core of developing data products. Most startups in Silicon Valley already treat the role as a necessity, so new recruits can get salaries of almost $100k. The vague definition and frequent misunderstanding of the role have also led some people to joke that a data scientist is a data analyst who lives in the Bay Area.

It is worth mentioning that the rapid growth of data itself has also brought many challenges to data engineers in large-scale processing. These come mainly from the following two aspects:

The rapid growth of data volume. Today, generating data has become extraordinarily easy. Social networks, mobile apps, and almost all Internet-related products are generating large amounts of data every moment. Traditional centralized storage obviously cannot handle such volumes. We need new storage methods, such as cloud storage, and new processing schemes, such as the distributed computing platform Hadoop.

The unstructured nature of the data itself. In traditional data processing we mainly dealt with structured data, such as quantitative data that can be displayed in an Excel table. Now we face more and more unstructured data, such as comments on social networks and user-uploaded audio and video. This data comes in a variety of formats, including text, pictures, video, and audio, and contains a great deal of valuable information, but extracting that information requires deep computation. It calls for a series of new algorithms, such as intelligent analysis and image recognition, for data mining. That is the "big data" challenge.
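The two challenges above, volume and unstructured content, can be tied together with a toy sketch of the map, shuffle, and reduce pattern that Hadoop popularized. This is an in-process imitation in plain Python and does not use any Hadoop API; the function names and the sample comments are hypothetical, and the nested lists merely stand in for data partitions that would live on different machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Map: turn one unstructured text record into (word, 1) pairs."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all values that share the same key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(partitions: List[List[str]]) -> Dict[str, int]:
    """Run map over every record of every partition, then shuffle and reduce.

    In a real cluster each partition would be mapped on a different node;
    here everything runs sequentially in one process.
    """
    mapped = (
        pair
        for partition in partitions
        for record in partition
        for pair in map_phase(record)
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    comments = [
        ["big data big"],                # hypothetical partition 1
        ["data scientists love data"],   # hypothetical partition 2
    ]
    print(word_count(comments))
```

The point of the pattern is that the map and reduce steps work on one record or one key at a time, which is what lets a framework spread them across many machines.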

Silicon Valley startups are now exploring new applications and methods, such as the Internet of Things. Smart devices are only just getting started: Nest, the Nest-acquired Dropcam, Iotera, emberlight and the like are still toys for a small number of people. The power of big data will come into play at scale once every household has installed smart refrigerators, smart bulbs, smart tables, smart sofas and so on.

Another angle is people. If you replace all the devices just mentioned with people, the intersections of their relationships along various dimensions form a huge network, every part of which consists of a lot of data. Analyzing, understanding, and predicting these social relationships will be another interesting application of big data, namely social physics. But judging by how quickly things spread from Silicon Valley to the rest of the country, it feels as if widespread adoption in either area is at least five years away.

Looking ahead, and judging big data in light of earlier technological revolutions and industry developments, the underlying infrastructure of big data will gradually be separated out, modularized, standardized, and even automated, while the middle tier and application layer built on top of it will become the main battlefield for the major companies and their data engineers.

The state of big data operations at Silicon Valley companies

At present, Silicon Valley companies differ considerably in their level and mode of data processing. Apart from a few leaders such as Facebook, most companies either cannot process data on their own or are still setting up a separate data-processing department, which is primarily responsible for everything from basic processing to later-stage analysis and then hands the results to other departments within the company.

For these companies, a fully established data-processing department may still be a long way off. For example, Facebook had a team of more than 30 people spend nearly 4 years building its data-processing platform. Today, Facebook still needs more than 100 engineers to support the day-to-day operation of that platform. Clearly, building the infrastructure for big data analysis is a time-consuming project. Building LinkedIn's big data division also took a full six years.

Generally speaking, a company that builds a data processing platform on its own faces several difficulties:

Not enough good data engineers to build a team

Not enough ability to consolidate data

No easy-to-operate basic hardware and software to support data analysis

These difficulties are making big data analysis increasingly specialized and service-oriented, so we are gradually seeing a "Silicon Valley data processing industry chain" emerge. From data storage and data analysis platforms to data analysis and data visualization, the cost of every link keeps rising. As a result, even highly technical companies are turning to professional data processing companies for these services and devoting more of their talent and resources to core business development.

In addition, companies' requirements for data processing keep rising. They need not only effective results but also data processing that is self-service and self-managing, that guarantees data security, and that delivers solid real-time analysis. These many needs make the advantages of specialized teams stand out even more. The formation of such an integrated service chain also gives many big data companies opportunities.

Silicon Valley is a magical place. Technology concepts here are not immune to being chased after and hyped. But to some extent this passion and attention is the driving force behind Silicon Valley's innovation. Even if there is a lot of hype, even if a wave of big data startups dies on the beach, even if Gartner predicts that the concept of big data will come back down to reality, I believe more people will still commit to the big data industry and develop more intelligent, more influential products. After all, big data itself, unlike a mere pitch, is something that can actually be seen and used.

In Part 2 of this series on big data in Silicon Valley, I interview Zeesha Currimbhoy, head of the AI department at Evernote, and the director of LinkedIn's big data division, and look at big data development at three leading US companies. That will give a more concrete picture of how Silicon Valley companies operate their "excavators" and how they "change the world."

(Responsible editor: Mengyishan)
