```python
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import requests  # imported by the original script; not used in this snippet
import gevent
import pymysql
from gevent import monkey

monkey.patch_all()  # patch blocking I/O so gevent can schedule around it


class Sqlsave(object):
    """Saves a scraped article into MySQL."""

    def __init__(self):
        sql_dba = {
            'host': 'localhost',
            'db': 'jobole',
            'user': 'root',
            'password': 'jiayuan95814',
            'use_unicode': True,
            'charset': 'utf8',
        }
        self.conn = pymysql.connect(**sql_dba)
        self.cursor = self.conn.cursor()

    def process_item(self):
        sql = self.__get_sql()
        print(sql)
```
```python
        # hand the database write to a gevent greenlet
        gevent.joinall([gevent.spawn(self.__go_sql, sql)])

    def __go_sql(self, sql):
        self.cursor.execute(sql)
        self.conn.commit()

    def __get_sql(self):
        # Test data: an INSERT for one scraped article. The long `cont`
        # value (the article text) and the trailing image URLs and
        # counters are elided here with '...'.
        return """insert into article (cont_id, cont_url, title, publish_time,
                cont, img_url, img_path, like_num, collection_num, comment_num)
            value ('d374f2fc6bd013a58513687fe2fe4e97',
                'http://blog.jobbole.com/111866/',
                'DB sharding (2): global primary key generation strategies',
                '2017-07-16', ...)"""
```

The article stored in the `cont` field follows.

DB sharding (2): global primary key generation strategies (source: Laurence's technical blog)

This article surveys some common global primary key generation strategies and then focuses on a very good scheme used by Flickr. For the split strategies and implementation details of sharding itself, see the previous article in this series, "DB sharding (1): split strategies and a worked example".

Part I: some common primary key generation strategies

Once a database is split across multiple physical nodes, we can no longer rely on the database's own primary key generation mechanism. On the one hand, the auto-generated IDs of a single shard are not globally unique; on the other hand, the application needs to obtain the ID before inserting the data in order to route the SQL to the right shard. Several feasible strategies exist:

1. UUID: using a UUID as the primary key is the simplest scheme, but its drawbacks are glaring. A UUID is very long, so besides consuming a lot of storage space, the main problem lies with indexes: both building the index and querying through the index perform poorly.

2.
Maintain a SEQUENCE table in the database: the idea here is also very simple. Create a SEQUENCE table with a structure similar to:

```sql
CREATE TABLE `SEQUENCE` (
  `tablename` VARCHAR(64) NOT NULL,  -- the length was garbled in the source; 64 is a guess
  `nextid` BIGINT(20) NOT NULL,
  PRIMARY KEY (`tablename`)
) ENGINE = InnoDB;
```

Whenever an ID is needed for a new record of some table, fetch that table's `nextid` from the SEQUENCE table, use it as the ID, and write `nextid + 1` back for the next use. This scheme is simple too, but its downside is just as obvious: since every insert has to consult this table, it easily becomes the system's performance bottleneck. It is also a single point of failure; if its database goes down, the whole application stops working. Master-slave replication has been suggested as a remedy, but that only removes the single point, and cannot relieve the access pressure of a 1:1 read-write ratio. Beyond these there are further schemes, such as partitioning ID ranges across database nodes, plus assorted ID generation algorithms found on the web, which are not recommended here for lack of operability and practical testing. Next we introduce the primary key generation scheme used by Flickr, the best one I know of; it has stood the test of practice and can serve most application systems.
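The SEQUENCE-table strategy in item 2 can be sketched in a few lines. The sketch below uses Python's built-in sqlite3 in place of MySQL/InnoDB so it runs anywhere; the table layout follows the article, while the `next_id` helper name is my own.

```python
import sqlite3

def next_id(conn, tablename):
    """Fetch the next ID for `tablename` and advance the counter by 1."""
    cur = conn.cursor()
    cur.execute("SELECT nextid FROM SEQUENCE WHERE tablename = ?", (tablename,))
    nid = cur.fetchone()[0]
    cur.execute("UPDATE SEQUENCE SET nextid = nextid + 1 WHERE tablename = ?",
                (tablename,))
    conn.commit()
    return nid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SEQUENCE ("
             "tablename TEXT PRIMARY KEY, nextid INTEGER NOT NULL)")
conn.execute("INSERT INTO SEQUENCE VALUES ('article', 1)")
conn.commit()

ids = [next_id(conn, "article") for _ in range(3)]
print(ids)  # [1, 2, 3]
```

Every caller funnels through the single row for `'article'`, which is exactly the bottleneck and single point of failure the article warns about.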
Part II: a very good primary key generation strategy

In 2010 the Flickr development team described the primary key generation strategy Flickr uses, and reported that it had been working very satisfactorily in production; the original post is "Ticket Servers: Distributed Unique Primary Keys on the Cheap". This is the best solution I currently know of. It resembles the general SEQUENCE-table scheme, but it neatly solves both the performance bottleneck and the single point of failure, making it a very reliable and efficient global primary key generator.

Figure 1. The sharded primary key generation scheme used by Flickr.

Flickr's overall idea is to set up two or more database servers dedicated to ID generation, each holding a sequence table that records the current ID of every table. The growth step of the IDs in each sequence equals the number of servers, and the starting values are staggered in order, which effectively hashes ID generation across the server nodes. For example, with two ID generation servers, let the sequence on one start at 1 with a step of 2, and the sequence on the other start at 2 with a step of 2: odd IDs are then generated by the first server and even IDs by the second. This spreads the ID generation load evenly over the two servers, and with some cooperation from the application, the system can automatically switch to the other server when one fails, guaranteeing fault tolerance. A few details of this scheme deserve further explanation:

1.
Flickr's ID generation machines are dedicated servers: each runs only one database, and the tables in that database exist solely to generate sequences. This is because the two relevant MySQL variables, auto-increment-offset and auto-increment-increment, are set at the database instance level.

2. The `stub` field in Flickr's table is just a CHAR(1) NOT NULL placeholder, not a table name. In general, therefore, a sequence table holds only one row and can generate IDs for several tables at once; if some table needs its IDs to be contiguous, give it a sequence table of its own.

3. The scheme relies on MySQL's LAST_INSERT_ID() function, which likewise dictates that the sequence table hold only one row.

4. Using REPLACE INTO to insert the data is very clever: it reuses MySQL's own mechanism to generate the ID, which matters not only because it is so simple, but also because we need the IDs generated exactly the way we configured them (initial value and step).

5. SELECT LAST_INSERT_ID() must be issued on the same database connection as the REPLACE INTO statement to obtain the newly inserted ID; otherwise the returned value is always 0.

6. In this scheme the sequence table uses the MyISAM engine for higher performance. Note that MyISAM takes table-level locks, so reads and writes to the table are serialized, and there is no need to worry about two concurrent reads obtaining the same ID. (The application code needs no synchronization either: every request's thread gets a fresh connection, so there is no shared resource to synchronize.) In an actual comparison test generating IDs from the same sequence table, MyISAM performed far better than InnoDB.

7.
You can operate on the sequence table through plain JDBC for still higher efficiency; experiments show that even Spring JDBC is slower than raw JDBC. To implement this scheme, the application must also do two pieces of work: 1. automatically balance the load across the ID generation servers; 2. make sure that when an ID generation server fails, requests are forwarded to another server for execution.

(End of the article text.) The test INSERT then closes with the image URLs and counters, and the script's entry point follows:

```python
# ...the tail of the __get_sql() test data:
#     'http://jbcdn2.b0.upaiyun.com/2017/03/4bae6998d00f180d42c7da716e3d0bb2.jpg',
#     'full/117976068e2e847f1067d25ea3fa90a3b5a60f3f.jpg',
#     1, 0, 0)"""

if __name__ == '__main__':
    s = Sqlsave()
    s.process_item()
```
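The staggered ticket servers of Part II, together with the two application-side duties just listed (load balancing and failover), can be sketched end to end. This is an illustrative simulation, not Flickr's actual setup: Python's built-in sqlite3 stands in for MySQL, and since SQLite has no auto-increment-offset/auto-increment-increment settings, the offset and step are applied arithmetically instead. The table name `Tickets64` and the `stub` column follow the cited Flickr post; `TicketServer` and `get_id` are my own names.

```python
import sqlite3

class TicketServer:
    """One Flickr-style ticket server, simulated with an in-memory SQLite DB."""

    def __init__(self, offset, step):
        self.offset, self.step = offset, step
        self.db = sqlite3.connect(":memory:")
        # Tickets64 mirrors the post: an auto-increment id plus a one-char
        # stub column, so the table only ever holds a single row.
        self.db.execute("CREATE TABLE Tickets64 ("
                        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
                        "stub CHAR(1) NOT NULL UNIQUE)")

    def next_id(self):
        cur = self.db.cursor()
        # REPLACE INTO advances the auto-increment counter while keeping the
        # table at one row (an AUTOINCREMENT rowid is never reused).
        cur.execute("REPLACE INTO Tickets64 (stub) VALUES ('a')")
        self.db.commit()
        # cur.lastrowid plays the role of MySQL's LAST_INSERT_ID(); as in the
        # article, it is tied to this connection. The arithmetic applies the
        # stagger MySQL would apply via its offset/increment variables.
        return (cur.lastrowid - 1) * self.step + self.offset

def get_id(servers, n):
    """Round-robin over the servers, skipping any that fail."""
    for i in range(n, n + len(servers)):
        try:
            return servers[i % len(servers)].next_id()
        except sqlite3.Error:
            continue  # this server is down; try the next one
    raise RuntimeError("all ticket servers are down")

# Two servers staggered as in the article: odd IDs from one, even from the other.
servers = [TicketServer(offset=1, step=2), TicketServer(offset=2, step=2)]
ids = [get_id(servers, n) for n in range(6)]
print(ids)  # [1, 2, 3, 4, 5, 6]
```

Closing one server's connection makes `get_id` fall through to the survivor, which keeps issuing IDs from its own (even) series, illustrating the fault tolerance the article describes.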
Python: database operations with processes