Survey shows 76% of data scientists think Hadoop is too slow

According to a survey by the analysis and research firm Paradigm4, 76% of data scientists think Hadoop is too slow. Data scientists say that Hadoop, as an open-source software framework, requires too much programming effort in practical use to meet the requirements of big data applications.

91% of respondents said they carry out complex analysis of big data, and 39% of those said their work has become more difficult. 71% of respondents said the diversity of data types and the sheer volume of data make analysis harder.

76% of respondents mentioned problems with Hadoop: 39% thought it required too much programming effort, 37% said ad hoc queries were too slow, and 30% thought it was too slow for real-time analysis.

Big data is becoming more and more important to businesses today. According to a study by Competitive Edge Research commissioned by Dell, midsize enterprises with 2,000 to 5,000 employees have begun to embrace big data technology, and 80% of midsize companies think they should analyze their data better, believing that big data applications can improve the quality of corporate decision-making.

For small businesses, free and inexpensive tools make it easy to collect and analyze big data and to enhance competitiveness. The Paradigm4 survey began in March, ended in April, lasted one month, and received responses from 111 data scientists in the United States.

What is Hadoop:

A distributed system infrastructure developed by the Apache Software Foundation.

Users can develop distributed programs without understanding the low-level details of distributed computing, taking full advantage of a cluster's capacity for high-speed computation and storage.

Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault tolerant and designed to be deployed on inexpensive (low-cost) hardware, and it provides high-throughput access to application data, making it suitable for applications with very large data sets. HDFS relaxes some POSIX requirements to allow streaming access to file system data.
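
As a rough illustration of that streaming access, the minimal sketch below reads a file from HDFS through Hadoop's Java FileSystem API and copies it to standard output. The NameNode address and file path are placeholders for illustration only, not values from the article.

```java
import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsCat {
  public static void main(String[] args) throws Exception {
    // Illustrative HDFS URI; point it at your own NameNode and file.
    String uri = "hdfs://namenode:9000/user/demo/input.txt";
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create(uri), conf);
    InputStream in = null;
    try {
      // open() returns a stream, so the file is read sequentially
      // (streaming access) rather than loaded into memory at once.
      in = fs.open(new Path(uri));
      IOUtils.copyBytes(in, System.out, 4096, false);
    } finally {
      IOUtils.closeStream(in);
    }
  }
}
```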

The core of Hadoop's design is HDFS and MapReduce: HDFS provides storage for massive amounts of data, while MapReduce provides the computation over that data.
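
To make that division of labor concrete, here is a minimal sketch of the classic word-count job written against Hadoop's MapReduce Java API; the input and output paths are assumed to be HDFS directories supplied on the command line.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: splits each input line into words and emits (word, 1).
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // reuse reducer as combiner: addition is associative
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory (must not exist)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The mapper emits (word, 1) pairs, the framework groups them by word across the cluster, and the reducer sums the counts; HDFS holds both the input and the output, while MapReduce schedules the computation close to where the data is stored.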
