Using MapReduce to build text indexes in Hadoop
Shup
Hadoop is an open-source distributed system infrastructure that lets users develop distributed programs without knowing the underlying details of the distribution. Text indexing is used widely in practice, from the inverted indexes of search engines to operating-system commands. Building text indexes in a Hadoop environment supports search engines and full-text indexing of documents while retaining the benefits of a distributed system. The main value of building the index in Hadoop is that an inverted index built on a distributed platform can be constructed faster, large amounts of data can be stored conveniently, and the system scales well, which gives it an advantage in large-scale systems.
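To make the idea of an inverted index built with MapReduce concrete, here is a minimal sketch in plain Python that simulates the map, shuffle, and reduce phases in a single process. It does not use the Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `build_inverted_index`) and the sample documents are assumptions made for illustration only. In a real Hadoop job the shuffle step is performed by the framework, and the mapper and reducer would run on different nodes.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Mapper (illustrative): emit one (term, doc_id) pair per word occurrence."""
    for word in text.lower().split():
        term = word.strip(".,;:!?\"'()")
        if term:
            yield term, doc_id


def shuffle(pairs):
    """Stand-in for the framework's shuffle: group all doc ids by term."""
    grouped = defaultdict(list)
    for term, doc_id in pairs:
        grouped[term].append(doc_id)
    return grouped


def reduce_phase(term, doc_ids):
    """Reducer (illustrative): build a posting list of (doc_id, frequency)."""
    counts = defaultdict(int)
    for doc_id in doc_ids:
        counts[doc_id] += 1
    return term, sorted(counts.items())


def build_inverted_index(documents):
    """Run the three phases over a dict of {doc_id: text}."""
    pairs = [
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    ]
    grouped = shuffle(pairs)
    return dict(reduce_phase(term, ids) for term, ids in grouped.items())


# Hypothetical sample input, not taken from the article.
docs = {
    "doc1": "Hadoop builds indexes",
    "doc2": "Indexes speed up search in Hadoop",
}
index = build_inverted_index(docs)
```

Looking up a term in `index` returns the documents that contain it together with how often it occurs in each one. Because every mapper call depends only on its own document, the map phase can be spread over many machines, which is where the indexing speedup comes from.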