Abstract: In this project, a Redis+MySQL+MongoDB architecture provides big data storage and real-time cloud computing. MongoDB shards can be added horizontally and dynamically, so query speed and computing efficiency are maintained after expansion without interrupting the platform's business system. Because data is distributed by the shard key index, each shard computes independently, which makes real-time analysis of big data practical. Frequently accessed data is kept in Redis, which reduces disk I/O, makes the business system more responsive, and meets the higher throughput requirements of highly concurrent application services.
Keywords: mobile location service SaaS; Redis; MongoDB
A mobile location-based application is a value-added service built on the user's location: mobile positioning technology obtains the user's current position, and an electronic map and a business platform provide location-related information services. The SaaS (software as a service) model delivers software over the Internet. Its distinctive advantage is that an enterprise needs almost no initial investment in servers, system development, or other hardware and software. For the many small and medium-sized enterprises that lack capital for information systems, it offers a feasible way to introduce a management information system.
1 Project Introduction
Against this background, this paper proposes a SaaS platform for mobile location services aimed at small and medium-sized enterprises. The platform is intended to reduce costs for enterprises with field staff and off-site services. It combines positioning technology with a smartphone client and uses wireless networks such as the operators' GSM/WCDMA networks to give enterprises the specific location and movement track of staff working off site. It also supports attendance check-in, rapid approval, location labeling, voice group chat, data reporting, regional early warning, geographical analysis, performance review, rapid response to customer needs, and effective staff management, strengthening the enterprise's market position and core competitiveness.
2 Business Data Analysis
As an enterprise mobile Internet application, the mobile location service SaaS platform accumulates a large amount of data in use. This includes static information (mobile phone number, registration information, phone model, etc.), location information (movement track, speed, dwell time, location attributes), data associated with the app (access behavior, social behavior, transaction behavior, etc.), and interaction characteristics (reporting frequency, data type and format, etc.). Its data volume and characteristics differ considerably from those of traditional business systems.
2.1 Data Source Analysis
Data comes from terminals and from the SaaS platform itself. Terminals include Android and iOS smart devices and PCs. The smart terminal is the data collector of the enterprise application and acts as an extension of the human senses in the enterprise's business activities. Some data also originates from PCs, and a large amount of log data is generated while the system runs.
(1) Data collected by the terminal
① Trajectory data: each sample includes the company ID, user ID, latitude and longitude, address, positioning time, and positioning type. Samples are collected at a default interval. With a default working day of 8 hours, each employee produces about 2 080 samples per day. Assuming 10 000 users, this amounts to 2 800,000 samples per day; the stated data size is 184 KB, and 10 000 users occupy approximately 3 GB per day.
② General business data: this includes attendance, work plans, logbooks, applications, event reminders, notifications and announcements, sales reports, etc. A conservative estimate is 15 KB per record per user, giving a data volume of 7 680 KB; 10 000 users generate about ... MB of data per day.
③ Live chat and work microblog data: this data is unstructured and includes voice, pictures, text, location sharing, etc. A conservative estimate is 30 KB per image or voice item and 3 items per user per day, or 90 KB per user per day; 10 000 users then produce approximately 0.9 GB of data per day.
(2) Platform data
As a cloud platform serving many enterprises, the platform must also generate and manage the following kinds of data: enterprises, enterprise organizations, enterprise users, user communication records, personalized notes on those records, group business cards, etc. Platform data is similar to that of common enterprise applications and is not considered further here.
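The estimates above follow one pattern: records per user per day, multiplied by record size and by the number of users. A minimal sketch of that arithmetic follows. The 10-second sampling interval and the 100-byte record size are illustrative assumptions, not figures taken from this paper, and the function names are hypothetical.

```python
def records_per_day(working_hours: int, interval_seconds: int) -> int:
    """Number of samples one user produces in a working day."""
    return working_hours * 3600 // interval_seconds


def daily_volume_bytes(users: int, records_per_user: int, bytes_per_record: int) -> int:
    """Total bytes the platform receives per day for one data category."""
    return users * records_per_user * bytes_per_record


if __name__ == "__main__":
    # Assumed values, for illustration only.
    per_user = records_per_day(working_hours=8, interval_seconds=10)
    total = daily_volume_bytes(users=10_000, records_per_user=per_user, bytes_per_record=100)
    print(per_user, "records per user per day")
    print(round(total / 1e9, 2), "GB per day")
```

With these assumed inputs the sketch yields 2 880 records per user and 2.88 GB per day, the same order of magnitude as the trajectory estimate. Other data categories can be estimated by changing the three inputs.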
2.2 Analysis of Data Characteristics
(1) Mobile. Compared with PC applications, mobile applications change when and where data is acquired. A smart terminal does not tire and can automatically collect and report data such as location information. Mobility also makes data acquisition far more convenient: a photo taken with a phone can be uploaded immediately, without the location constraints of a separate camera or the need to connect to a PC for upload.
(2) Unstructured. Media data such as images and voice is unstructured, for example images of in-store product displays or documents shared through the work microblog. It differs significantly from traditional structured data that requires transaction support.
(3) Platform-level increments. Compared with an enterprise-class application serving a single enterprise, a platform sees far larger data growth. According to the analysis above, 10 000 users add data on the order of gigabytes each day. Some data arrives evenly and some arrives in peaks: attendance records are concentrated around commuting hours, while track data is spread evenly across working hours.
How can the storage and processing challenges posed by this large-volume, unstructured data be met? After comparing candidate technologies against earlier test data, the Redis+MySQL+MongoDB architecture was chosen.
3 Related Technologies
3.1 About Redis
Redis (Remote Dictionary Server) is an open-source key-value storage system written in ANSI C. Like the popular memcached, it stores data in memory. Unlike memcached, Redis supports a richer set of data types and provides rich operations on each data structure. Redis also differs from memcached in that it asynchronously persists updated data to disk or appends modifying operations to a log file. Although Redis is a key/value database, it borrows some strengths of relational databases: it can store list and set types and still perform advanced operations such as sorting, and operations such as INCR (increment) and SETNX (set the value only if the key does not already exist) are guaranteed to be atomic. Master-slave synchronization is implemented on this basis [2]. Redis master-slave replication has the following features: (1) a master can have multiple slaves, and a slave can in turn accept connections from other slaves; (2) replication blocks neither the master nor the slave, so both can serve client requests while data is being synchronized [2].
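The atomic behavior of INCR and SETNX can be illustrated with a small in-memory model. This is only a sketch of the command semantics, not Redis itself and not a Redis client: the class name TinyKeyValueStore is hypothetical, and a single lock stands in for the fact that Redis executes each command as one indivisible step.

```python
import threading


class TinyKeyValueStore:
    """In-memory stand-in that mimics the semantics of INCR and SETNX."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # one lock makes each command atomic

    def incr(self, key):
        """Add 1 to the integer at key (missing keys start at 0); return the new value."""
        with self._lock:
            value = int(self._data.get(key, 0)) + 1
            self._data[key] = value
            return value

    def setnx(self, key, value):
        """Set key only if it does not exist; return True if the value was set."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def get(self, key):
        with self._lock:
            return self._data.get(key)


if __name__ == "__main__":
    store = TinyKeyValueStore()
    # Only the first caller acquires the key, which is how SETNX is used as a simple lock.
    print(store.setnx("lock:report", "worker-1"))
    print(store.setnx("lock:report", "worker-2"))
    # Concurrent increments are never lost because each incr is atomic.
    workers = [
        threading.Thread(target=lambda: [store.incr("hits") for _ in range(1000)])
        for _ in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(store.get("hits"))
```

Against a real server, the corresponding client calls would replace this class; the point here is only that a read-modify-write performed inside one command cannot interleave with another client's command.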
3.2 Introduction to MongoDB and Its Automatic Sharding [3]
MongoDB is a database based on distributed file storage [4], written in C++. The data structures it supports are very loose: documents use BSON, a JSON-like format, so it can store fairly complex data types. MongoDB features collection-oriented storage, a schema-free model, dynamic queries, full index support, replication and failover, and automatic sharding [5]. The core idea of MongoDB is the document model: the document is the basic unit of data in MongoDB and is equivalent to a row in a relational database. A collection in MongoDB is equivalent to a table in a relational database. A single MongoDB instance can host multiple independent databases, each with its own collections and access permissions.
MongoDB's sharding architecture partitions data and stores and processes the parts on different machines. Splitting data across servers makes it possible to store more data and handle larger loads without using more powerful machines. MongoDB supports auto-sharding: the cluster splits and rebalances data automatically. MongoDB sharding provides the following capabilities: (1) automatic balancing as load and data distribution change; (2) dynamic addition of servers; (3) no single point of failure; (4) automatic failover [6].
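The idea of routing documents by a shard key, and letting each shard compute on its own portion, can be sketched in a few lines. This is a deliberately simplified hash-and-modulo model, not MongoDB's actual mechanism (MongoDB assigns key ranges to chunks and moves chunks with a balancer). The shard names beyond shard11 and shard12, the field name user_id, and both function names are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def shard_for(key_value, shards):
    """Pick a shard by hashing the shard key value; the result is stable across runs."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def distribute(documents, shard_key, shards):
    """Group documents by the shard that owns their shard key value."""
    placement = defaultdict(list)
    for doc in documents:
        placement[shard_for(doc[shard_key], shards)].append(doc)
    return placement


if __name__ == "__main__":
    shards = ["shard11", "shard12", "shard13", "shard14"]
    docs = [{"user_id": i, "speed": i % 7} for i in range(1000)]
    placement = distribute(docs, "user_id", shards)
    # Each shard computes a partial result independently; the partials are then merged.
    partial_counts = {name: len(items) for name, items in placement.items()}
    print(partial_counts, sum(partial_counts.values()))
```

One design consequence is worth noting: with plain modulo hashing, adding a shard would relocate most keys, which is why a production system uses chunk ranges that can be migrated a few at a time.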
4 Technical Implementation
4.1 Functional Roles in the Architecture
The REDIS+MYSQL+MONGODB architecture corresponds to the following function roles.
Redis: in-memory cache. It holds the cluster's centralized sessions, the instant messaging offline message queue, the set of instant messages awaiting re-send, user token lifecycle management, a cache of frequently accessed application data, the HTML5 template data cache, and the static application resource cache.
MySQL: transactional data storage. It holds enterprise account data, enterprise business data, and transaction data between enterprises and the platform.
MongoDB: unstructured document storage. It holds pictures, icons, voice, work microblog text, unstructured document data combined with location data, data without a fixed schema that must be extended dynamically, application log data, and data that requires map-reduce computation.
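The role of Redis as a cache for frequently accessed data can be illustrated with the cache-aside pattern: read from the cache first, fall back to the database on a miss, then populate the cache. The sketch below is an assumption about how such a cache might be layered, not the platform's actual code. A dictionary stands in for Redis, a callable stands in for the MySQL query, and the class name CacheAside is hypothetical.

```python
import time


class CacheAside:
    """Read-through helper: serve from the cache, load from the database on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # stand-in for a MySQL query
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}               # key -> (expires_at, value); stand-in for Redis
        self.db_reads = 0              # counts how often the database was hit

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: no disk I/O
        self.db_reads += 1             # cache miss or expired entry
        value = self._load(key)
        self._cache[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    database = {"user:1": "Alice"}
    cache = CacheAside(load_from_db=database.__getitem__, ttl_seconds=30)
    for _ in range(5):
        cache.get("user:1")
    print("database reads:", cache.db_reads)
```

Repeated reads of the same key reach the database only once per expiry window, which is the effect the architecture relies on to reduce disk I/O.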
4.2 Reliability and Availability assurance measures
To ensure the reliability and availability of production data and to avoid single points of failure in Redis, MySQL, and MongoDB, master-slave backups are configured, and on top of them Keepalived provides automatic failover through the VRRP protocol. Redis and MySQL are each configured with a master and a slave, and MongoDB is configured with shards. The detailed configuration is as follows.
For Redis master-slave replication, the slave's configuration file redis.conf must specify the master's IP address and port: slaveof 192.168.10.10 6379
MySQL master-slave configuration:
Master configuration: server-id=1;log-bin=mysql-bin;binlog-do-db=wqt_web
Slave configuration: server-id=2;log-bin=mysql-bin;master-host=192.168.10.3;master-user=slaveuser;master-password=gotop4001680756;master-port=3306, ...
MongoDB shard configuration:
mongod --shardsvr --port 10001 --dbpath=/home/data/shard11/ --logpath /home/data/shard11/mongodb.log --fork
mongod --shardsvr --port 10002 --dbpath=/home/data/shard12/ --logpath
...
mongo 127.0.0.1:20000/admin
Shards must be configured while connected to the admin database. Once the connection succeeds, the shards can be added to the cluster:
db.runCommand({"addshard": "127.0.0.1:10001"})
...
db.runCommand({"addshard": "127.0.0.1:10004"})
This adds 4 shards to the cluster. Sharding must then be enabled for the database:
db.runCommand({"enablesharding": "kingfisher"})
Finally, the sharding rule for the collection is set, which completes the shard configuration:
db.runCommand({"shardcollection": "kingfisher.tablename", "key": {"PrimaryKey": 1}})
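The sequence of admin commands can also be generated programmatically, which keeps the order explicit: add the shards, enable sharding on the database, then shard the collection (MongoDB requires sharding to be enabled on a database before one of its collections is sharded). The sketch below only builds the command documents and does not contact a server. The function name sharding_commands is hypothetical, and the ports 10002 and 10003 are assumed from the statement that four shards are added.

```python
def sharding_commands(shard_hosts, database, collection, shard_key):
    """Return the admin command documents in the order they should be run."""
    commands = [{"addshard": host} for host in shard_hosts]
    commands.append({"enablesharding": database})
    commands.append({
        "shardcollection": f"{database}.{collection}",
        "key": {shard_key: 1},
    })
    return commands


if __name__ == "__main__":
    hosts = [f"127.0.0.1:{port}" for port in range(10001, 10005)]
    for command in sharding_commands(hosts, "kingfisher", "tablename", "PrimaryKey"):
        print(command)
```

With PyMongo, each document could then be sent through the admin database's command method while connected to the mongos router; that step is omitted here because it depends on the deployment.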
4.3 Detailed code
4.3.1 Redis Implementation Case
In communication, Redis serves as a publish/subscribe queue: the Web tier publishes messages to a Redis publish/subscribe channel and the communication center consumes messages from that channel. Because all messages are exchanged in Redis, response speed improves.
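import redis.clients.jedis.Jedis;

// pool is assumed to be a shared JedisPool field of the enclosing class.
public boolean sendMsg(String msg) {
    boolean rebool = true;
    Jedis jedis = null;
    try {
        jedis = pool.getResource();
        // Publish the message to the kingfisher.* channel.
        jedis.publish("kingfisher.*", msg);
    } catch (Exception e) {
        e.printStackTrace();
        rebool = false;
    } finally {
        // Return the connection to the pool; newer Jedis versions use jedis.close() instead.
        if (jedis != null) {
            pool.returnResource(jedis);
        }
    }
    return rebool;
}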
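How a published message reaches its consumers can be modeled with a tiny in-process broker. This is a sketch of publish/subscribe semantics only, not Redis and not the platform's communication center. The class name TinyBroker and the channel name kingfisher.chat are hypothetical, and glob matching from Python's fnmatch module stands in for Redis pattern subscriptions.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


class TinyBroker:
    """Minimal publish/subscribe model with exact and pattern subscriptions."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # exact channel -> callbacks
        self._pattern_subscribers = []         # (glob pattern, callback) pairs

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def psubscribe(self, pattern, callback):
        self._pattern_subscribers.append((pattern, callback))

    def publish(self, channel, message):
        """Deliver message to every matching subscriber; return the delivery count."""
        delivered = 0
        for callback in self._subscribers.get(channel, []):
            callback(channel, message)
            delivered += 1
        for pattern, callback in self._pattern_subscribers:
            if fnmatchcase(channel, pattern):
                callback(channel, message)
                delivered += 1
        return delivered


if __name__ == "__main__":
    broker = TinyBroker()
    inbox = []
    # The communication center listens on every channel under the kingfisher prefix.
    broker.psubscribe("kingfisher.*", lambda ch, msg: inbox.append((ch, msg)))
    broker.publish("kingfisher.chat", "hello")
    print(inbox)
```

Messages are delivered only to subscribers connected at publish time, so anything that must survive a consumer outage needs a separate queue, such as the offline message queue listed among the Redis roles.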
4.3.2 MySQL Implementation
Transactional data storage covers enterprise account data, general enterprise business data, and transaction data between enterprises and the platform. Storage and computation for this part are implemented with Hibernate and Spring.
4.3.3 MongoDB Implementation Case
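(1) Media data is stored using GridFS, MongoDB's grid file system.
from bson.objectid import ObjectId
from gridfs.grid_file import GridOut

class FileService(BaseHandler):
    def get(self):
        # Look up the stored file by the id passed in the request.
        file_id = self.get_argument("id", "")
        f = GridOut(self.mongo.fs, ObjectId(file_id))
        try:
            fn = f.filename.lower()
            ...
            # Stream the file content back to the client.
            self.write(f.read())
    def post(self):
        ...
    def delete(self):
        ...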
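GridFS stores a file as one metadata document plus a sequence of fixed-size chunk documents, which is what allows large media files to live in an ordinary document database. The sketch below models that layout in plain Python without a server. The function names are hypothetical; the field names follow the GridFS layout (length, chunkSize, filename, files_id, n, data), and the 255 KB chunk size is the GridFS default.

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size in bytes


def to_gridfs_documents(file_id, filename, data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Split raw bytes into one files document and a list of chunk documents."""
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in sequence order."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


if __name__ == "__main__":
    payload = bytes(600 * 1024)  # a 600 KB placeholder for an uploaded picture
    files_doc, chunks = to_gridfs_documents("demo-id", "shelf.jpg", payload)
    print(files_doc["length"], len(chunks))
    print(reassemble(chunks) == payload)
```

Because each chunk is an independent document, chunks can be read in order and streamed to the client, and they are distributed across shards like any other collection.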
(2) Work microblog content with a two-dimensional spatial index, plus indexing and querying of track data.
class ListMark(BaseHandler):
    '''
    Search the work microblog list
    '''
    def get(self):
        self.set_header("Content-Type", "application/json")
        ...
class Mark(BaseHandler):
    '''
    Search based on two-dimensional space
    '''
    def get(self):
        self.set_header("Content-Type", "application/json")
        try:
            ...
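What a two-dimensional proximity search returns can be shown with a brute-force version: compute the distance from the query point to every stored mark and keep those within a radius, nearest first. A spatial index produces the same kind of result without scanning every document. The sketch below is an illustration under assumptions, not the handler's elided code: the field names loc and name and both function names are hypothetical, and coordinates are stored as [longitude, latitude] pairs.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two longitude/latitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def near(marks, lon, lat, max_km):
    """Return the marks within max_km of the query point, nearest first."""
    hits = []
    for mark in marks:
        mark_lon, mark_lat = mark["loc"]
        distance = haversine_km(lon, lat, mark_lon, mark_lat)
        if distance <= max_km:
            hits.append((distance, mark))
    hits.sort(key=lambda pair: pair[0])
    return [mark for _, mark in hits]


if __name__ == "__main__":
    marks = [
        {"name": "office", "loc": [116.40, 39.90]},
        {"name": "shop", "loc": [116.41, 39.91]},
        {"name": "branch", "loc": [121.47, 31.23]},
    ]
    print([m["name"] for m in near(marks, 116.40, 39.90, max_km=5)])
```

The same function applies to track data: querying the samples of one user near a customer site is a radius search over that user's recorded positions.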
(3) Map-reduce computation for log analysis.
'''
Scheduled generation of the current day's user access behavior
'''
class CurrDayUser(BaseHandler):
    def get(self):
        ...
'''
Scheduled generation of the current day's service run behavior
'''
class CurrDayService(BaseHandler):
    def get(self):
        ...
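The map-reduce pattern behind such daily log summaries can be sketched in plain Python: a map step emits a key and a value for each log entry, the values are grouped by key, and a reduce step combines each group. This is an in-memory illustration of the pattern, not the handlers' elided implementation. The log field names user_id and day and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and group the emitted values by key."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each group of values into a single result per key."""
    return {key: reducer(key, values) for key, values in groups.items()}


def access_mapper(entry):
    """Emit one count for each access, keyed by user and day."""
    yield (entry["user_id"], entry["day"]), 1


def count_reducer(key, values):
    """Sum the counts emitted for one key."""
    return sum(values)


if __name__ == "__main__":
    logs = [
        {"user_id": "u1", "day": "2014-05-01", "path": "/checkin"},
        {"user_id": "u1", "day": "2014-05-01", "path": "/track"},
        {"user_id": "u2", "day": "2014-05-01", "path": "/checkin"},
    ]
    print(reduce_phase(map_phase(logs, access_mapper), count_reducer))
```

Because the reducer is a plain sum, partial results computed on separate shards can be added together afterwards, which is what lets each shard run the job independently.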
5 Conclusion
The storage architecture described here meets the project's needs for big data storage and real-time cloud computing. MongoDB shards are added horizontally and dynamically, so the platform's service system is not interrupted while query speed and computing efficiency are maintained, and computation runs independently on each shard according to the shard key index, which makes real-time analysis of big data practical. Frequently accessed data is placed in Redis, which reduces disk I/O, makes the business system more responsive, and satisfies the high throughput requirements of highly concurrent application services. Although storing and computing on big data has become simpler, managing the data systems is not easy because versions and technologies keep changing. Operation and maintenance under the new architecture will face new challenges and will need continuous optimization.
Application Based on the Redis+MySQL+MongoDB Storage Architecture