Selection of Big Data Technology Routes for Small and Medium-Sized Enterprises
Currently, big data is used mainly in the Internet and e-commerce fields and is gradually being adopted in the telecom and power industries. Most small and medium-sized enterprises have heard a great deal about big data, but its technical threshold is still high. From a technical standpoint, choosing the technical solutions used by large com
Recently, Twitter developed a distributed real-time statistics system, Rainbird.
Usage
Rainbird can be used for real-time data statistics:
1. Count the number of clicks on each page and domain name of a website
2. Monitor internal system operation (track the running status of the monitored servers)
3. Record the maximum and minimum values
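The three uses listed above can be sketched in a few lines of Python. This is a minimal single-process illustration, not Rainbird's actual API: the class names `HierarchicalCounter` and `MinMaxTracker`, the method names, and the key layout (domain, then domain plus path) are all assumptions made for the example.

```python
from collections import defaultdict


class HierarchicalCounter:
    """Hypothetical in-memory counter: one click bumps the page and its domain."""

    def __init__(self):
        self.counts = defaultdict(int)

    def record_click(self, domain, path):
        # Increment the domain-level total and the page-level total together.
        self.counts[domain] += 1
        self.counts[domain + path] += 1


class MinMaxTracker:
    """Hypothetical tracker for the smallest and largest value observed."""

    def __init__(self):
        self.minimum = None
        self.maximum = None

    def observe(self, value):
        if self.minimum is None or value < self.minimum:
            self.minimum = value
        if self.maximum is None or value > self.maximum:
            self.maximum = value


counter = HierarchicalCounter()
counter.record_click("example.com", "/a")
counter.record_click("example.com", "/a")
counter.record_click("example.com", "/b")

latency = MinMaxTracker()
for sample in [4, 9, 2]:
    latency.observe(sample)
```

A distributed system would shard these counters across nodes and persist them; the sketch only shows the bookkeeping per event.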
Performance Requirements
As a distributed application for large websites, it must meet the following performance requirements:
1. High write performance, up to 100,000 WP
following way:
All events generated by the original system should be marked with a timestamp.
As a processor in the pipeline handles the stream, it tracks the maximum timestamp it has seen. If the persisted global clock is behind that maximum, the processor updates the global clock to it. The other processors synchronize their time with the global clock.
When data is played back, the global clock is reset.
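The clock scheme described here can be sketched as follows. This is an illustration under assumptions, not code from the original system: `GlobalClock`, `Processor`, the `timestamp` field name, and the reset value of 0 are all hypothetical, and the "persisted" clock is just an in-memory attribute.

```python
class GlobalClock:
    """Hypothetical stand-in for the persisted global clock."""

    def __init__(self, persisted=0):
        self.now = persisted

    def advance(self, candidate):
        # The clock only moves forward: update it when it is behind.
        if candidate > self.now:
            self.now = candidate
        return self.now

    def reset(self, value=0):
        # Called before replaying data so the replay drives the clock again.
        self.now = value


class Processor:
    """Tracks the maximum event timestamp it has seen."""

    def __init__(self, clock):
        self.clock = clock
        self.max_seen = 0

    def process(self, event):
        self.max_seen = max(self.max_seen, event["timestamp"])
        return self.clock.advance(self.max_seen)


clock = GlobalClock()
processor = Processor(clock)
for ts in [5, 3, 9, 7]:  # events may arrive out of order
    processor.process({"timestamp": ts})
```

Because time is derived from event timestamps rather than the wall clock, replaying the same events after `clock.reset()` reproduces the same clock values.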
Persisted storage rollup
We've discussed how persistent storage can be used
dynamically pushing to your fans is called message distribution (fanout).
History and background
Fashiolista's feed system has undergone three major improvements. The first version was based on the PostgreSQL database, the second version used a Redis database, and the current version uses the Cassandra database. To help readers understand when and why these versions were replaced, I will first introduce some bac
This article is a short note on installing Kong; the system environment is CentOS 6.7.
Please credit the source when reproducing this article: xiaoeight
Introduction
Kong is an API gateway that forwards API traffic between the client and (micro) services, extending functionality through plug-ins. There are two main components of Kong:
Kong Server: Nginx-based server used to receive API requests.
Apache Cassandra: U
(similar to a checkpoint).
Replication: MemSQL currently supports master-slave replication. It uses a local replication protocol to transfer transaction logs to the slave.
Distributed architecture: MemSQL works on the concepts of aggregators and leaf nodes. A leaf node is a MemSQL database. The aggregators are responsible for decomposing a query, sending the pieces to the relevant leaf nodes, and aggregating the results back to the client.
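The aggregator and leaf-node split is a scatter-gather pattern. The sketch below shows the idea for a simple SUM; it does not use MemSQL's real interfaces, and `LeafNode`, `Aggregator`, `partial_sum`, and the sample rows are hypothetical names and data chosen for the example.

```python
class LeafNode:
    """Hypothetical leaf: holds one partition of the rows."""

    def __init__(self, rows):
        self.rows = list(rows)

    def partial_sum(self, column):
        # Each leaf computes its share of the answer locally.
        return sum(row[column] for row in self.rows)


class Aggregator:
    """Hypothetical aggregator: fans a query out and merges partial results."""

    def __init__(self, leaves):
        self.leaves = leaves

    def total(self, column):
        # Scatter the sub-query to every leaf, then gather and combine.
        return sum(leaf.partial_sum(column) for leaf in self.leaves)


leaves = [
    LeafNode([{"amount": 10}, {"amount": 5}]),
    LeafNode([{"amount": 7}]),
    LeafNode([]),
]
aggregator = Aggregator(leaves)
grand_total = aggregator.total("amount")
```

A real aggregator would send the sub-queries over the network in parallel and route only to the leaves that hold relevant partitions; the loop here visits every leaf sequentially for clarity.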
GemFire
The technical principles of the memory-based distributed cluster sys
services in cloud platforms like Cloud Foundry and Heroku.
spring-boot-starter-data-elasticsearch
Support for the Elasticsearch search and analytics engine including spring-data-elasticsearch.
spring-boot-starter-data-gemfire
Support for the GemFire distributed data store including spring-data-gemfire.
spring-
terabytes of data can be loaded into memory for in-memory computing. The computing process itself does not need to read or write data on disk; instead, data is written to disk periodically, in synchronous or asynchronous mode. GemFire stores multiple copies of the data across a distributed cluster. If one machine fails, other machines hold backup copies, so you do not have to worry about data loss, and the data on disk serves as a further backup.
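The idea of keeping several in-memory copies so that one machine can fail without losing data can be sketched like this. It is a toy model, not GemFire's API: `ReplicatedStore`, its methods, the node count, and the placement rule (hash the key, then use the next node as backup) are all assumptions for illustration, and disk persistence is left out.

```python
class ReplicatedStore:
    """Hypothetical in-memory store that keeps `copies` replicas of each key."""

    def __init__(self, node_count=3, copies=2):
        self.nodes = [{} for _ in range(node_count)]
        self.alive = [True] * node_count
        self.copies = copies

    def owners(self, key):
        # Primary is chosen by hash; backups are the following nodes.
        start = hash(key) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.copies)]

    def put(self, key, value):
        for index in self.owners(key):
            if self.alive[index]:
                self.nodes[index][key] = value

    def get(self, key):
        # Read from the first live owner that still holds the key.
        for index in self.owners(key):
            if self.alive[index] and key in self.nodes[index]:
                return self.nodes[index][key]
        raise KeyError(key)

    def fail(self, index):
        # Simulate a machine crash: its memory contents are gone.
        self.alive[index] = False
        self.nodes[index].clear()


store = ReplicatedStore()
store.put("order-42", {"status": "paid"})
store.fail(store.owners("order-42")[0])  # crash the primary
recovered = store.get("order-42")        # served from the backup copy
```

With two copies the store survives any single node failure; losing both owners of a key would lose the in-memory data, which is where the periodic disk writes described above come in.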
is not in-depth enough, it will increase the risk.
Later, VMware recommended an in-memory database product called GemFire, saying it had achieved good results in the 12306 website. They brainwashed us during pre-sales and demonstrated some successful cases, which increased our confidence in the product. What impressed me most is that the product is positioned as an in-memory computing platform that can compute directly on data node
spring-boot-starter-amqp
Support for the "Advanced Message Queuing Protocol" via spring-rabbit.
spring-boot-starter-aop
Support for aspect-oriented programming including spring-aop and AspectJ.
spring-boot-starter-artemis
Support for the "Java Message Service API" via Apache Artemis.
spring-boot-starter-batch
Support for "Spring Batch" including HSQLDB database.
-starter-cache supports Spring's cache abstraction.
8) spring-boot-starter-cloud-connectors supports Spring Cloud Connectors, simplifying connections to services on cloud platforms such as Cloud Foundry or Heroku.
9) spring-boot-starter-data-elasticsearch supports the Elasticsearch search and analytics engine, including spring-data-elasticsearch.
spring-boot-starter-data-gemfire supports GemFire distributed d
, Twitter and digg.com are using Cassandra. The main characteristic of Cassandra is that it is not a single database but a distributed network service composed of many database nodes. A write to Cassandra is replicated to other nodes, and a read is routed to one node to be served. F
Databases are generally divided into the following types: relational (transactional) databases, represented by Oracle and MySQL; key-value databases, represented by Redis and MemcacheDB; document databases such as MongoDB; columnar databases, represented by HBase, Cassandra, and Dynamo; and others such as graph databases, object databases, and XML databases.
Some databases are built on distributed design concepts, such as
on the feasibility of Big Data 3.0 replacing SAP HANA
1. Introduction to Big Data 3.0
In short, Big Data 3.0 is about implementing SQL on big data while balancing performance, ease of use, and scalability. At present, the convergence of "search engine + big data + SQL" is one of the trends.
Let's take a look at some of my other posts:
1. Agile big data based on Facebook Presto + Cassandra: http://blog.csdn.net/china_world/article/details/39966699
2. Small and medium-sized enterprises of large d
. You could build a relational database cluster, but such clusters use shared storage, which is not the type we want. This led to the NoSQL era, in which Google, Facebook, and Amazon are trying to handle more traffic.
NoSQL era
There are a lot of NoSQL databases now, such as MongoDB, Redis, Riak, HBase, Cassandra, and so on. Each one has one of the following features:
No longer use the SQL language, such as MongoDB,
require huge overhead and a performance cap, and it is never possible to use a single machine to support all of the load on Google and Facebook. In this situation we need a new kind of database, because relational databases do not run well on a cluster. You could build a relational database cluster, but such clusters use shared storage, which is not the type we want. This led to the NoSQL era, in which Google, Facebook, and Amazon are trying to handle more traffic.
NoSQL era
There are a lot
Michael Kopp has more than 10 years of architecture and development experience in C++ and Java/JEE. He is now a Compuware technology strategist who specializes in the architecture and performance of large-scale product deployments.
The following is a translation:
Traditional enterprise database vendors often claim that NoSQL lacks professional monitoring and management tools. Their argument is that the databases behind enterprise applications need to be fine-tuned and monitored to ensure stab