Kettle Management Tools
A web-based management tool developed specifically for Kettle, an excellent ETL tool.
Project Introduction
Kettle, a very good open source ETL tool, is very widely used. It is generally operated and managed through its client, but the problem is that when p
city of Chicago, Expedia, Google, the Weather Channel, BuzzFeed and Facebook. In addition to the free open source version, the company also offers a paid Enterprise version and MongoDB Atlas, a cloud-hosted version. Forrester has named MongoDB a "Leader" for Big Data NoSQL.
SpagoBI
SpagoBI is an open source business in
custom service in the product by MapGuide only; other products are not available). MapGuide uses the FDO (Feature Data Objects) provider to achieve unified access and performance across multiple sources and different spatial data structures, without converting other spatial data into a private spatial data model.
3. Hierarchical comparison of systems
1) Data Access Channel
Comparison objects: FDO, FME, ArcSDE, and MapInfo SpatialWare
Supported types of data formats: FME >= FDO > ArcSDE = SpatialWare;
manager with the ability to control, verify, validate, and distribute these BI objects. SpagoBI features include support for portal, report, OLAP, QbE, ETL, dashboard, document management, metadata management, data mining, and geo-information analysis.
Get Address: http://forge.ow2.org/project/showfiles.php?group_id=204
KNIME
KNIME (Konstanz Information Miner) is a user-friendly, intelligent, and abundant open
developed in C#/WPF, with a simple ETL function.
Skyscraper - a web crawler that supports asynchronous networking and has good extensibility.
JavaScript
Scraperjs - a full-featured web crawler based on JS.
Scrape-it - a web crawler based on Node.js.
Simplecrawler - an event-driven web crawler.
Node-crawler - provides a simple API for building custom web crawlers on top of it.
Js-crawler - a web crawler that supports H
The project suddenly gained 300 stars on GitHub today.
I have worked on data-related projects for many years and have a deep understanding of the various problems in data development. Data processing work mainly includes crawlers, ETL, and machine learning. The development process is the process of building a data processing pipeline by splicing the various modules together. In summary, the steps are: get data, convert, merge, store, send. There are many differences in dat
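As a minimal sketch of the five steps named in this snippet (get data, convert, merge, store, send), the following Python chains one small function per step. All function names, the sample records, and the in-memory list used as a store are illustrative assumptions, not part of any tool described on this page.

```python
import json


def get_data():
    """Get: return raw records from two hypothetical sources."""
    source_a = [{"id": "1", "name": " Ann "}, {"id": "2", "name": "Bob"}]
    source_b = [{"id": "1", "city": "Paris"}, {"id": "2", "city": "Oslo"}]
    return source_a, source_b


def convert(records):
    """Convert: cast ids to int and strip stray whitespace from other fields."""
    return [
        {key: (int(value) if key == "id" else value.strip())
         for key, value in record.items()}
        for record in records
    ]


def merge(left, right):
    """Merge: combine records from both sources that share the same id."""
    by_id = {record["id"]: dict(record) for record in left}
    for record in right:
        by_id.setdefault(record["id"], {"id": record["id"]}).update(record)
    return [by_id[key] for key in sorted(by_id)]


def store(records, sink):
    """Store: append records to a sink (a plain list stands in for a database)."""
    sink.extend(records)
    return sink


def send(records):
    """Send: serialise records as JSON (stands in for a network call)."""
    return json.dumps(records, sort_keys=True)


def run_pipeline():
    """Splice the modules together in the order the text lists them."""
    source_a, source_b = get_data()
    merged = merge(convert(source_a), convert(source_b))
    stored = store(merged, [])
    return send(stored)


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each step as a separate function means a step can be swapped (for example, a real crawler in place of `get_data`) without touching the others.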
(Feature Data Objects) provider implements unified access and performance for multiple sources and different spatial data structures, without converting other spatial data into a private spatial data model.
3. Hierarchical comparison of systems
1) Data Access Channel
Comparison objects: FDO, FME, ArcSDE, and MapInfo SpatialWare
Supported types of data formats: FME >= FDO > ArcSDE = SpatialWare;
As a common spatial data model tool, FDO is equivalent to FME. Currently, FDO supports the following data
Optimization Module Suitable for general application scenarios.
Hadoop is not just a distributed file system for storage, but a framework designed to execute distributed applications on a large cluster composed of general computing devices.
Hive is a Hadoop-based data warehouse platform. With Hive, we can easily perform ETL work.
Hive defines a query language similar to SQL, HQL, and can convert user-written QL into corresponding MapReduce programs
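To make the idea of turning a query into MapReduce steps concrete, here is a conceptual Python sketch of how a grouped count such as `SELECT dept, COUNT(*) FROM employees GROUP BY dept` can be expressed as a map phase, a shuffle, and a reduce phase. The `employees` table, its columns, and every function name are hypothetical; this is not the plan Hive actually generates, only an in-memory illustration of the pattern.

```python
from collections import defaultdict

# Hypothetical rows of an `employees` table: (name, dept).
ROWS = [
    ("ann", "sales"),
    ("bob", "ops"),
    ("cy", "sales"),
    ("di", "ops"),
    ("ed", "sales"),
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for _name, dept in rows:
        yield dept, 1


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, which gives COUNT(*) per group."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Run the three phases in order on an in-memory list of rows."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(ROWS))
```

On a real cluster the map and reduce functions run on many machines and the shuffle moves data between them; the single-process version above only shows the data flow.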
wireless Internet access devices, or publishes and subscribes to channel content via email. Anytime, anywhere, SAS Business Intelligence provides valuable information and answers.
With a customizable development environment, SAS provides a flexible and scalable interface, support services, and software packages. SAS Business Intelligence provides the following components for the end-to-end intelligence generation and delivery process:
1. easy to access and stable enterprise data;
2
Here is a summary of some of the industry's current open source real-time stream processing systems, as a reference for future technical research.
S4
S4 (Simple Scalable Streaming System) is an open source computing platform recently released by Yahoo. It is a general, distributed, extensible, with partition fault toleran
Hive is a Hadoop-based data warehouse platform. With Hive, we can easily perform ETL work. Hive defines a SQL-like query language, HQL, and converts user-written QL into corresponding MapReduce programs that run on Hadoop.
Hive is a data warehouse framework that Facebook open-sourced in August 2008. Its system targets are similar to Pig's, but