So far, this series has explored cloud computing on the Google and Amazon platforms. Although the two differ in implementation and structure, both support rapid, scalable deployment. They let you assemble, test, run, and maintain Java applications quickly and economically, which is unprecedented. The cloud, however, is not the only factor affecting the speed of Java development today. Open source solutions also help you assemble applications quickly, because you no longer need to write as much code yourself. The days of hand-writing object-relational mapping (ORM), logging, or testing frameworks are over. Those problems have been solved time and again in the open source world, and the resulting solutions are almost always better than anything you would build yourself.
About this series
The Java development landscape has changed dramatically since Java technology first appeared. Thanks to mature open source frameworks and reliable for-rent deployment infrastructures, Java applications can now be assembled, tested, run, and maintained quickly and economically. In this series, Andrew Glover explores the range of technologies and tools that make this new development paradigm possible.
Throughout the Java development process, open source innovation simplifies application assembly. The new open source database Apache CouchDB (at version 0.10.0 as of this writing) is no exception. Once you have a CouchDB environment set up, it is easy to use. All you need is an HTTP connection; neither a JDBC driver nor a third-party administration console is required. In this article, I'll introduce CouchDB and show you how it can speed up development. To keep installation simple, you will run it on Amazon's EC2 platform, and you will communicate with it through an easy-to-use Groovy module.
A document-oriented database
Relational databases dominate the database market. Other kinds of database, including object-oriented and document-oriented ones, are quite different from their relational counterparts and often play a pivotal role. CouchDB is a document-oriented database. It is schemaless and lets you store documents as JavaScript Object Notation (JSON) strings.
JSON
JSON is a lightweight data-interchange format that also serves as an alternative format for web applications. It is similar to XML but far less verbose. Thanks to its light weight, it is becoming the lingua franca of the Web.
Imagine a parking ticket. A ticket records the following items:
Date of violation
Time
Location
Description of the vehicle
License plate information
Violation
The format of a ticket and the data collected on it vary by jurisdiction. Even standard parking tickets within a single jurisdiction are likely to differ in content. For example, an officer may leave out the time when issuing a ticket, or omit the vehicle model and fill in only the license plate details. The location can be a pair of cross streets (such as the intersection of Fourth and Lexington) or a fixed address (for example, 19993 Main Street). The semantics of the collected information, however, are roughly the same.
The data points of a ticket can be modeled in a relational database, but the details get a bit cumbersome. For example, how do you effectively capture an intersection in a relational database? And when there is no intersection, does the database store a null in the field for the second street (assuming the model keeps each street name in its own column)?
In these cases, a relational database may impose more abstraction than the problem needs. The required information already exists in the form of a document (the ticket). Why not model the data as a document, without adhering to a strict relational schema, and only loosely following the semantics of a high-level schema? This is where CouchDB comes in. It lets you model these domain types flexibly: the result is a self-contained document that has no schema but follows a blueprint roughly similar to that of other documents.
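To make the idea concrete, here is a minimal Java sketch of two parking tickets modeled as schema-free documents. Everything in it is an assumption for illustration: the field names (date, time, location, vehicle, violation), the sample values other than the two locations mentioned above, and the toJson helper, which is a deliberately simplistic serializer and not part of CouchDB or any library.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration only: field names and values are assumptions.
class TicketDocuments {

    // A ticket issued at an intersection, with the time filled in.
    static Map<String, Object> intersectionTicket() {
        Map<String, Object> vehicle = new LinkedHashMap<>();
        vehicle.put("model", "Sedan");
        vehicle.put("plate", "ABC-1234");

        Map<String, Object> ticket = new LinkedHashMap<>();
        ticket.put("date", "2009-10-01");
        ticket.put("time", "14:30");
        ticket.put("location", List.of("Fourth", "Lexington"));
        ticket.put("vehicle", vehicle);
        ticket.put("violation", "Expired meter");
        return ticket;
    }

    // A ticket at a fixed address; the officer left out the time and the model.
    static Map<String, Object> addressTicket() {
        Map<String, Object> vehicle = new LinkedHashMap<>();
        vehicle.put("plate", "XYZ-9876");

        Map<String, Object> ticket = new LinkedHashMap<>();
        ticket.put("date", "2009-10-02");
        ticket.put("location", "19993 Main Street");
        ticket.put("vehicle", vehicle);
        ticket.put("violation", "No parking zone");
        return ticket;
    }

    // Simplistic JSON serializer: handles maps, lists, and plain strings only,
    // and performs no character escaping.
    static String toJson(Object value) {
        if (value instanceof Map) {
            StringJoiner joiner = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                joiner.add("\"" + entry.getKey() + "\":" + toJson(entry.getValue()));
            }
            return joiner.toString();
        }
        if (value instanceof List) {
            StringJoiner joiner = new StringJoiner(",", "[", "]");
            for (Object item : (List<?>) value) {
                joiner.add(toJson(item));
            }
            return joiner.toString();
        }
        return "\"" + value + "\"";
    }

    public static void main(String[] args) {
        System.out.println(toJson(intersectionTicket()));
        System.out.println(toJson(addressTicket()));
    }
}
```

Note that the two documents do not share a fixed set of fields, and that location is a list in one and a plain string in the other. No null placeholder columns are needed; a missing value is simply absent from the document.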
MapReduce
Google's MapReduce is a conceptual framework for processing massive data sets. It is a highly optimized mechanism for solving problems in a distributed fashion across a large number of computers. MapReduce consists of two functions: map and reduce. The map function takes a large input and splits it into smaller pieces (handing the data off to other processes as it does so). The reduce function consolidates all the individual outputs of the map into one final output.
With CouchDB, you can search documents and document properties, and even relate documents to one another much as you would in the relational world. You do this with views rather than SQL. Essentially, a view is a function that you write in the MapReduce style (in JavaScript); that is, you end up writing a map function and a reduce function. Together, these functions filter or extract document data, or exploit the relationships among documents. CouchDB can also speed up view processing: as long as the underlying documents do not change, it needs to run these functions only once.
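The map-then-reduce shape of a view can be sketched in plain Java. This is only an in-memory analogy under stated assumptions: real CouchDB view functions are written in JavaScript and run inside the database, and the ViewSketch class, its method names, and the violation field are hypothetical. The sketch counts tickets per violation type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory analogy of a view; not CouchDB's actual view engine.
class ViewSketch {

    // Map step: emit one (violation, 1) pair for every document that has a
    // "violation" field, and skip documents that lack it.
    static List<Map.Entry<String, Integer>> map(List<Map<String, Object>> docs) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            Object violation = doc.get("violation");
            if (violation != null) {
                emitted.add(Map.entry(violation.toString(), 1));
            }
        }
        return emitted;
    }

    // Reduce step: consolidate the emitted pairs into one total per key.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> emitted) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : emitted) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = List.of(
                Map.of("violation", "Expired meter", "time", "14:30"),
                Map.of("violation", "No parking zone"),
                Map.of("violation", "Expired meter"),
                Map.of("note", "not a ticket"));
        System.out.println(reduce(map(docs)));
    }
}
```

Because the documents have no schema, the map step has to decide for itself which documents are relevant; here it does so by checking whether the field exists.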
The most interesting part of CouchDB is its design. CouchDB embodies the basic (and extremely successful) concepts of the Web itself. It exposes a comprehensive set of RESTful APIs for creating, querying, updating, and deleting documents, views, and databases. This makes CouchDB very simple to use. You need no other drivers or platforms to start developing: a browser can do all the work. That said, rich libraries exist that make working with CouchDB even easier, but under the hood they simply apply REST concepts over HTTP.
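As a rough sketch of what "REST over HTTP" means here, the following Java code builds (but does not send) the requests for a few basic operations using the standard java.net.http API from Java 11. It assumes a CouchDB instance at http://localhost:5984 (CouchDB's default port); the CouchRequests class, the database name parking_tickets, the document ID ticket-1, and the revision value are all hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds plain HTTP requests for CouchDB; names and values are illustrative.
class CouchRequests {

    // Assumed location of a local CouchDB instance.
    static final String BASE = "http://localhost:5984";

    // PUT /{db} creates a database.
    static HttpRequest createDatabase(String db) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + db))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // PUT /{db}/{id} stores a JSON document under the given ID.
    static HttpRequest saveDocument(String db, String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + db + "/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // GET /{db}/{id} retrieves a document.
    static HttpRequest readDocument(String db, String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + db + "/" + id))
                .GET()
                .build();
    }

    // DELETE /{db}/{id}?rev={rev} deletes a document; CouchDB requires the
    // document's current revision.
    static HttpRequest deleteDocument(String db, String id, String rev) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + db + "/" + id + "?rev=" + rev))
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest[] requests = {
            createDatabase("parking_tickets"),
            saveDocument("parking_tickets", "ticket-1", "{\"violation\":\"Expired meter\"}"),
            readDocument("parking_tickets", "ticket-1"),
            deleteDocument("parking_tickets", "ticket-1", "1-abc")
        };
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

To execute a request against a running server, you could pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). The sketch stops short of that so it runs without a database.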
In keeping with the nature of the Web, CouchDB is designed with scalability in mind. It is written in Erlang, a concurrent programming language that supports distributed, fault-tolerant, nonstop applications. The language (now open source) was developed by Ericsson and is widely used in telecommunications environments.