Transferred from: Http://www.csdn.net/article/2012-12-20/2813054-Database
Whether they are building big data applications or just trying to draw a little insight from a mobile app they developed, programmers now need data analysis tools more than ever. This is a good thing, and many companies have built data analysis tools around the needs and skills of programmers. GigaOM reporter Derrick Harris listed 12 such tools, compiled here by CSDN:
Over the past few years, Derrick has seen many startups, projects, and development tools designed to bring advanced data analysis capabilities to programmers. Some let programmers build powerful dashboards with simple scripts; others make data delivery simpler during development. Derrick believes this is a significant trend.
In the world of cloud computing and mobile applications, creating a new business around a simple application is easier than ever. Even in large companies, developers struggle to promote or monetize their applications. During development, however, developers may need to add some data flow so that the application can "catch fire".
Needless to say, most programmers work with large amounts of code rather than flows of data, so they may need a little help. Derrick lists 12 tools for developers (in alphabetical order), but he says some good options may be missing, and attentive readers who spot one are invited to leave a comment on the article.
1. Bitdeli
Bitdeli is a start-up founded in San Francisco in November this year. It can measure any metric of any application using Python scripts. Co-founder and CEO Ville Tuulos tells Derrick that the scripts can be simple or complex, and may even extend to machine learning in the future. Compared with "heavyweight" Hadoop, Bitdeli is regarded as the lightweight "Ruby".
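To make the idea of measuring an application metric with a short Python script concrete, here is a minimal sketch. It does not use Bitdeli's actual API. The event format, the sample data, and the function name `daily_active_users` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical event records: (user_id, ISO date, event name).
EVENTS = [
    ("alice", "2012-12-01", "login"),
    ("bob", "2012-12-01", "login"),
    ("alice", "2012-12-01", "purchase"),
    ("alice", "2012-12-02", "login"),
]


def daily_active_users(events):
    """Count the distinct users seen on each day."""
    users_by_day = defaultdict(set)
    for user_id, day, _event in events:
        users_by_day[day].add(user_id)
    # Return a plain dict ordered by date for stable display.
    return {day: len(users) for day, users in sorted(users_by_day.items())}


if __name__ == "__main__":
    print(daily_active_users(EVENTS))
```

A simple script like this covers the "simple" end of the range. The same event stream could in principle feed more complex analysis later.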
2. Continuuity
Continuuity is the work of former Yahoo chief cloud architect Todd Papaioannou and former Facebook HBase engineer Jonathan Gray. Continuuity wants to let every company run like Yahoo or Facebook. The team created a big data tool that hides the complexity of Hadoop and HBase clusters. It includes a set of development kits that help programmers build big data applications on Hadoop technology, and it lets developers deploy, scale, and manage those applications both inside and outside the firewall. Todd Papaioannou, the company's co-founder and CEO, said that as a start-up, Continuuity is trying to unleash the next wave of big data applications, and that the company's tools can greatly improve scalability across the different parts and phases of software development.
3. Flurry
Flurry is the benchmark in mobile application analytics, and its unique position in the industry earns it as much as $100 million a year. Flurry offers a very comprehensive set of capabilities: it not only helps developers build mobile apps but also helps them analyze all of their data to generate greater returns. That data also powers the company's advertising network, which uses data analysis to help developers push precisely targeted ads to the users who need them. Judged purely on its mobile analytics features, Flurry is clearly in the lead: its feature modules are sensibly organized, its analysis dimensions are comprehensive, and the analysis process is easy to understand.
4. Google Prediction API
The Google Prediction API is probably the coolest tool on the list. It is a cloud-based machine learning tool that helps developers analyze data and add features such as sentiment analysis, spam detection, upsell analysis, suspicious-activity detection, and diagnostics to their applications. The API supports a wide range of programming languages, including .NET, Go, Java, PHP, Ruby, Python, JavaScript, Objective-C, and Apps Script. Google's developer site provides tutorials and development guides, and readers can visit the Prediction API introduction page to learn more.
5. Infochimps
Infochimps is working hard to become an enterprise-level IT company, but it clearly still has some way to go. Even so, the platform that shares the company's name does bring real value to developers. Its tool for configuring and managing a big data environment is called Wukong. It is a Ruby-based command-line interface that lets developers write big data applications that call the Data Delivery Service or Hadoop with a very simple syntax. Developers don't need to learn MapReduce or Flume. Dhruv Bansal, chief strategy officer of Infochimps, said: "The common scenario is that customers use the Infochimps Platform to develop programs that process analytical data, and use Hadoop only when they need to batch-analyze massive amounts of data." Based on this experience, the new version focuses on real-time data processing rather than Hadoop.
6. Keen IO
Keen IO, winner of the Structure Launchpad competition, is dedicated to providing powerful analytics tools for mobile developers. Developers only need to insert a line of code at each location they want to track, and the company says developers can track anything in their applications. From there, they just need to create a dashboard or queries to turn all the data into useful information.
7. Kontagent
Kontagent's core business is an analytics platform for mobile, social, and web applications, all built on Hadoop infrastructure. Earlier this year the company launched a new line of business: a data mining service built on Hive that provides a SQL-like interface for querying data stored in Hadoop. Instead of only tracking predefined variables, customers can dig deeper into whatever data they choose.
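The difference between tracking predefined variables and asking ad hoc questions through a SQL-like interface can be sketched as follows. This is not Kontagent's service or Hive itself: Python's built-in `sqlite3` module stands in for the SQL-like layer, and the `events` table, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical raw events: (user_id, country, event, amount).
ROWS = [
    ("u1", "US", "purchase", 5.0),
    ("u2", "US", "purchase", 3.0),
    ("u1", "US", "login", 0.0),
    ("u3", "FI", "purchase", 4.0),
    ("u3", "FI", "purchase", 2.0),
]


def revenue_by_country(rows):
    """Run an ad hoc SQL query over raw events held in an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (user_id TEXT, country TEXT, event TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    # The question is decided at query time, not when the events were logged.
    query = """
        SELECT country, COUNT(DISTINCT user_id) AS buyers, SUM(amount) AS revenue
        FROM events
        WHERE event = 'purchase'
        GROUP BY country
        ORDER BY revenue DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for country, buyers, revenue in revenue_by_country(ROWS):
        print(country, buyers, revenue)
```

Because the raw events are stored, a new question only requires a new query rather than a new tracked variable.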
8. Mortar Data
Mortar Data's pitch is "Hadoop, no complexity". For a year the company has offered its own cloud service, which integrates Pig and Python in place of MapReduce. In November it released the open-source Mortar framework, which is designed to build a community that makes it easier both to share data sets among members and to build Hadoop pipelines. Mortar Data runs on top of AWS and currently supports data sources from Amazon S3 and MongoDB (hosted on Amazon EC2).
9. Placed Analytics
Placed does away with scripts, APIs, and the other chores that require developers to "run errands", and simply delivers results. In Placed's case, the results show details such as when and where users use mobile apps and websites. This kind of information can be very helpful to advertisers, and it also helps with application design.
10. Precog
Precog provides a service called Labcoat, an interactive development environment for writing analyses in the open-source Quirrel query language (a statistical query language implemented by Precog that resembles the R programming language in many respects). The environment includes a language tutorial and a number of complex functions. Precog's COO tells Derrick that even people without any programming experience can learn to use it within a few hours.
Precog can ingest data from a variety of sources, including SQL databases, Amazon S3, Hadoop, MongoDB, client web applications, and back-end servers. RESTful APIs let developers pull in data from external sources such as Twitter or Facebook, CSV files, or mobile devices. The captured data is saved to a custom database called PrecogDB and can be enriched with demographic, sentiment, location, and other information. In an interview, Precog's CEO and founder John A. De Goes explained: "The architecture of the system is somewhat similar to that of an analytical database; for example, it includes column-oriented storage. The difference, however, is that it supports completely heterogeneous, non-normalized data, and with Quirrel, a language akin to 'R for big data', it is much more convenient to perform many advanced calculations than with RDBMS-based analysis." (Information from InfoQ)
11. Spring for Apache Hadoop
Although Hadoop is written in Java, that does not mean it is easy for Java developers to learn or use. Earlier in 2012, SpringSource announced that it was bringing the Spring framework to the Apache Hadoop project. This makes it easier to build Hadoop-based Java applications with the Spring framework and to integrate them with other Spring applications. It also makes it easier to use JVM-based scripting and to develop applications with Hadoop or related technologies such as Hive and HBase.
12. StatsMix
StatsMix is in the same vein as Bitdeli and Keen IO: it also aims to let developers collect and analyze application data in the programming language they already use. The service can automatically track specific metrics, but it requires developers to add the StatsMix API to their codebase beforehand. The results are presented on a user-defined dashboard, which users can share and also use to consolidate multiple data sources into a single, simple view. (Compiled by @CSDN Wang Peng, reviewed by Zhonghao)
Programmers want to play big data: 12 tools to know