Hadoop

Want to learn about Hadoop? We have a large selection of Hadoop information on alibabacloud.com.

Hadoop Technology Center

Hadoop Technology and Architecture Analysis; Hadoop Programming Primer; Hadoop Distributed File System: Structure and Design; Using Hadoop for Distributed Parallel Programming, Part 1; Distributed Parallel Programming with Hadoop, Part 2; MapReduce: The Free Lunch Is Not Over?; Hadoop Installation and Deployment; Running Hadoop on Ubuntu Linux (Single-node clus ...

Talk about Hadoop and distributed Lucene

Lucene is the most widely used open source search engine. This article does not discuss how Lucene performs real-time updates (http://issues.apache.org/jira/browse/LUCENE-1313) or how to modify Lucene's scoring mechanism to add factors such as PageRank; it only discusses distributed Lucene. When Lucene comes up, Nutch is generally mentioned too. Hadoop was first Doug Cu ...

The data cube is 80 times faster than Hadoop!

Performance test comparison report for the data cube and Hadoop HBase; please download: Temp_13063022073515.doc. Data cube related information: http://www.cstor.cn/proTextdetail_121.html. The tests compared the data cube and HBase at different data volumes, covering data loading, loading throughput, and query performance. From the test results: 1. Data loading: the performance difference between the data cube and HBase on small amounts of data.

A multi-table join strategy based on Hadoop

A multi-table join strategy based on Hadoop. Xu Chen Wang Li Shi Wye. When the Hadoop system handles multi-table join problems, each round writes a large number of intermediate results to local disk, which severely reduces the system's processing efficiency. To solve this problem, a "substitution-query" method is proposed: it builds an index on the joined tables, substitutes index information for the output tuples in the intermediate results, and lets them take part in the multi-table join in index form, which reduces the I/O cost of the intermediate results. It uses a buffer pool, secondary sort ...
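The excerpt gives only the outline of the "substitution-query" idea, so the following is a minimal in-memory Python sketch of the general principle rather than the paper's algorithm. It assumes a chain join (each table joins to the one before it) and uses hypothetical names (`build_index`, `index_join`, `materialize`, and the sample tables); intermediate results carry row positions instead of full tuples, and rows are looked up only once at the end.

```python
from collections import defaultdict


def build_index(table, column):
    """Map each value of `column` to the positions of the rows holding it."""
    index = defaultdict(list)
    for position, row in enumerate(table):
        index[row[column]].append(position)
    return index


def index_join(tables, join_columns):
    """Chain-join tables[i] to tables[i + 1] on join_columns[i].

    join_columns[i] is a (left_column, right_column) pair. Each intermediate
    result is a tuple of row positions, one per table joined so far, rather
    than a tuple of full rows.
    """
    partial = [(position,) for position in range(len(tables[0]))]
    for step, (left_column, right_column) in enumerate(join_columns):
        left, right = tables[step], tables[step + 1]
        index = build_index(right, right_column)
        partial = [
            positions + (match,)
            for positions in partial
            for match in index.get(left[positions[-1]][left_column], [])
        ]
    return partial


def materialize(partial, tables):
    """Replace row positions with the actual rows, once, at the very end."""
    return [
        tuple(tables[t][position] for t, position in enumerate(positions))
        for positions in partial
    ]


# Hypothetical sample data for illustration only.
users = [{"uid": 1, "name": "a"}, {"uid": 2, "name": "b"}]
orders = [{"oid": 10, "uid": 1}, {"oid": 11, "uid": 1}, {"oid": 12, "uid": 3}]
items = [{"oid": 10, "sku": "x"}, {"oid": 11, "sku": "y"}, {"oid": 11, "sku": "z"}]

positions = index_join([users, orders, items], [("uid", "uid"), ("oid", "oid")])
rows = materialize(positions, [users, orders, items])
```

In a real MapReduce job the position tuples would be what each round writes to disk, which is where the I/O saving described in the excerpt would come from.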

Research on storage strategy of small text corpus on Hadoop platform

Research on the storage strategy of small text corpora on the Hadoop platform. normal Zheng Lijie. To resolve the conflict between distributed storage and retrieval speed when small text corpora are stored on the Hadoop platform, this paper proposes a new HSCS (Hadoop Smalltexts Corpus Storage) storage policy. The strategy first uses small-text merging, adding a merge_client layer to the HDFS architecture that merges multiple small text files into one large text file with a directory structure, effectively reducing ...
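The merging step can be illustrated with a small Python sketch. It is not the paper's merge_client; it only shows, under simple assumptions, how many small files can be packed into one large blob together with a directory that records where each file lives. The function names and the (offset, length) directory layout are hypothetical.

```python
import io


def merge_small_files(files):
    """Pack small files into one blob.

    `files` maps a file name to its bytes. Returns (blob, directory), where
    directory maps each name to its (offset, length) inside the blob.
    """
    buffer = io.BytesIO()
    directory = {}
    for name in sorted(files):
        data = files[name]
        directory[name] = (buffer.tell(), len(data))
        buffer.write(data)
    return buffer.getvalue(), directory


def read_small_file(blob, directory, name):
    """Recover one small file from the merged blob using the directory."""
    offset, length = directory[name]
    return blob[offset:offset + length]


# Hypothetical sample data for illustration only.
corpus = {"a.txt": b"hello", "b.txt": b"world!"}
blob, directory = merge_small_files(corpus)
```

Storing one large file plus a compact directory means the file system tracks a single object instead of one per small text, while a lookup by name still costs only a dictionary access and a slice.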

Hadoop and Memcached: performance and power characterization and analysis

Hadoop and Memcached: Performance and Power Characterization and Analysis. Joseph Issa, Silvia Figueira. In this paper, we characterize different workloads running on Hadoop framewo ...

Design and implementation of idle time scheduler in Hadoop platform

Design and implementation of an idle time scheduler on the Hadoop platform. Yang Hao tengfei Li Tianrui Li Yu. As an open source cloud computing platform, Hadoop is widely used in natural language processing, machine learning, large-scale image processing, and other fields. As cloud computing integrates broadly and deeply with various industries, more and more services are becoming time-sensitive. Existing Hadoop schedulers focus more on shortening response times than on meeting a job's deadline. To improve cluster performance on hard real-time jobs, the design and implementation ...

Research on NoSQL database security based on Hadoop

Research on NoSQL database security based on Hadoop. Shanghai Jiao Tong University Chapin. This paper first analyzes the key theory and technology of NoSQL databases. Given the weak security of current NoSQL database products, it then analyzes and defines the user security requirements of NoSQL databases from internal and external perspectives, based on database confidentiality, integrity, availability, and consistency and taking their distributed nature into account. Finally, through source code analysis of Hadoop and HBase, it studies the security mechanisms of the Hadoop platform and the HBase database.

Apache Hadoop

Apache Hadoop. Jerrin Joseph. Hadoop; Hadoop Distributed File System (HDFS); Hadoop MapReduce; Introduction; Architecture; Operations; Conclusion; References

Performance problems of heterogeneous Hadoop clusters in cloud computing

Performance Issues of Heterogeneous Hadoop Clusters in Cloud Computing. B. Thirumala Rao, N. V. Sridevi, V. Krishna Reddy, L. S. S. Reddy. In this paper we address the issues ...

Building a fully distributed Hadoop environment

I. Preparation. Environment: three virtual hosts on VMware, running CentOS_6.4_i386. Software used: Hadoop-1.2.1-1.i386.rpm, jdk-7u9-linux-i586.rpm. Host planning: IP address ...

Overview and improvement of Hadoop MapReduce scheduling in cloud environments

Survey on Improved Scheduling in Hadoop MapReduce in Cloud Environments. B. Thirumala Rao, L. S. S. Reddy. In this paper we study various scheduler improvements possible with Hadoop ...

Hadoop security in the big data era

We live in a time of data explosion: more and more information is produced, in large volumes and complex varieties. According to statistics, in the coming years the data generated by smart cities, intelligent transportation, medical care, and the Internet of Things will be overwhelming. All this data contains a lot of valuable information, but how do we extract it? The usual approach today is to use Hadoop, but Hadoop is not that secure. "Hadoop is actually a software library," said Jon Clay at Trend Micro's CIO summit.

The WAMS Power Data Processing Based on Hadoop

The WAMS Power Data Processing Based on Hadoop. Zhaoyang Qu, Shilin Zhang. For massive WAMS data, this paper used MapReduce to parallelize the data ETL operations for several files, ...

A job transfer scheduling algorithm based on Hadoop

A job transfer scheduling algorithm based on Hadoop. Deng, Van Tong, Peak. In a cloud environment, job submissions to a service cluster are unevenly distributed, so jobs pile up at certain times and job response time exceeds what users will tolerate. To solve this problem, a queue-based job transfer scheduling strategy (JTSA) using two-level queue technology is proposed for the Hadoop platform. The experimental results show that when the number of jobs increases sharply, the total completion time is little affected and can be more significant ...

Research on logistics vehicle transport monitoring data management based on Hadoop

Research on monitoring data management of logistics vehicle transport based on Hadoop. Dalian Maritime University. This paper builds on the original logistics vehicle monitoring and management system, using new Hadoop cluster technology in place of a traditional database to manage the data. Under existing conditions, a 3-node cluster environment is built and, considering the characteristics of the monitoring data, its data format is redesigned on top of HBase, a distributed database system that supports real-time reads and writes. And the use of Hadoop powerful data parallel ...

A record of Hadoop's seven turbulent years of development

There is a saying in the internet world: "If number two cannot beat number one, it should open-source whatever number one depends on to survive." When Yahoo! and Google were in fierce competition, Yahoo! recruited Doug (the founder of Hadoop) to open-source the DFS and MapReduce on which market leader Google depended, and so began Hadoop's childhood. By around 2008, Hadoop had matured. From the start-up to the present, Hadoop after at least 7 years of accumulation, now ...

Sorting massive data on the Hadoop platform (1)

Yahoo! researchers completed a Jim Gray sort benchmark using Hadoop. The benchmark suite contains many related benchmarks, each with its own rules. All of the sort benchmarks measure the time taken to sort different numbers of records; each record is 100 bytes, of which the first 10 bytes are the key and the rest are ...
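The record layout described above (100-byte records, first 10 bytes as the key) can be illustrated with a small single-machine Python sketch. It only shows the key-based ordering, not Hadoop's distributed sort; the function names are hypothetical, and the remaining 90 bytes are treated as an opaque payload because the excerpt is cut off before describing them.

```python
RECORD_SIZE = 100  # bytes per record, as stated in the excerpt
KEY_SIZE = 10      # leading bytes of each record that form the sort key


def split_records(data):
    """Cut a byte string into fixed-size 100-byte records."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("input length must be a multiple of RECORD_SIZE")
    return [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]


def sort_records(data):
    """Return the same records ordered by their 10-byte key.

    The bytes after the key are carried along unchanged as an opaque payload.
    """
    records = split_records(data)
    records.sort(key=lambda record: record[:KEY_SIZE])
    return b"".join(records)


# Hypothetical sample data for illustration only.
payload = b"." * (RECORD_SIZE - KEY_SIZE)
unsorted = (b"b" * KEY_SIZE + payload) + (b"a" * KEY_SIZE + payload)
ordered = sort_records(unsorted)
```

Python compares `bytes` lexicographically by byte value, so sorting on `record[:KEY_SIZE]` orders the records by raw key bytes.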

Hive, Pig, and HBase in the Hadoop ecosystem: relationships and differences

Anyone working with Hadoop will certainly be confused by the open source projects that live on top of it, and I promise that Hive, Pig, and HBase will leave you somewhat confused too. You are not the only one. For example, a newcomer's post asks: when should you use HBase, and when should you use Hive? ...

Research on resource discovery and collection method of OA periodicals based on Hadoop

Research on resource discovery and collection method of OA journal papers based on Hadoop. University of Du Baorie. A large number of the OA journal paper resources on the Internet belong to the Deep Web, which traditional search engines cannot index effectively, so users have difficulty finding the OA journal papers they are looking for. An effective way to solve this problem is to integrate the OA journal paper resources on the Internet and provide users with a unified, transparent search interface; discovering and collecting those resources is an important step in doing so. For mass o ...
