The recently released Hive 0.13 uses an ACID-semantics transaction mechanism to ensure transactional atomicity, consistency, and durability at the partition level, and relies on ZooKeeper or an in-memory lock mechanism to ensure transaction isolation. New use cases such as streaming data ingest, slowly changing dimensions, and data restatement become possible in this release, although the new version of Hive still has some shortcomings. What exactly has changed in the new version of Hive? Alan Gates brings us a thorough analysis.
What is ACID, and what is it for?
ACID stands for the four properties of database transactions: atomicity (an operation either completes entirely or has no effect at all), consistency (once an application performs an operation, the result of that operation is visible to every subsequent operation), isolation (one user's operations do not cause unexpected side effects for other users), and durability (once an operation completes, it is recorded and preserved, even if the machine or system fails). These properties have long been considered an essential part of transactional functionality.
In the recently released Hive 0.13, the atomicity, consistency, and durability of transactions are guaranteed at the partition level, while isolation is guaranteed by enabling ZooKeeper or the in-memory lock manager. With the addition of transactions in Hive 0.13, it becomes possible to provide full ACID semantics at the row level, so that one application can add rows while another reads from the same partition without the two interfering with each other.
A transaction mechanism with ACID semantics was added to Hive to handle the following use cases:
Streaming ingest of data. Many users rely on tools such as Apache Flume, Apache Storm, or Apache Kafka to write data into their Hadoop cluster at rates of hundreds of rows per second, while Hive can only add a new partition every fifteen minutes to an hour; adding partitions more often quickly clutters the table. These tools could also write into existing partitions, but that leads to dirty reads (readers would see data modified after their query started) and leaves many small files in the partition directories, putting pressure on the NameNode. With this new data-ingest capability, readers get a consistent view of the data while the system avoids creating too many files.
Slowly changing dimensions. In a typical star-schema data warehouse, dimension tables change slowly over time. For example, when a retailer opens a new store, that store needs to be added to the store table, but an existing store may also expand or add new services. Depending on the strategy chosen, these changes result either in inserting new records into the data warehouse or in modifying existing records. Hive does not yet support these operations; once INSERT ... VALUES, UPDATE, and DELETE are supported, slowly changing dimensions will become possible.
Data restatement. Sometimes collected data is incorrect and needs correction. It may be that the first instance of the data is an approximation (90% of servers reporting) with the full data delivered later; some transactions may need to be restated because of later transactions (for example, a customer may purchase a membership and thereby become entitled to discounted prices, including on previous purchases); or a user may, at the end of their business relationship, request that their data be deleted. Once INSERT ... VALUES, UPDATE, and DELETE are supported, data restatement will also become possible. A sketch of what such statements might look like follows.
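As an illustration only (these statements are not yet supported in Hive 0.13, and the table and column names here are hypothetical), updating a slowly changing dimension and deleting data for a restatement might look like this:

    -- Hypothetical example; UPDATE and DELETE arrive in a later Hive release.
    -- Correct a tracked attribute of an existing store in a dimension table.
    UPDATE stores
    SET square_footage = 15000
    WHERE store_id = 42;

    -- Remove a customer's data at the end of the business relationship.
    DELETE FROM purchases
    WHERE customer_id = 1001;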
Shortcomings
In Hive 0.13, INSERT ... VALUES, UPDATE, and DELETE are not yet supported. BEGIN, COMMIT, and ROLLBACK are likewise not yet supported; they are planned for the next release.
This first release, Hive 0.13, supports only the ORC file format. The feature is designed so that transactions can be used by any storage format that can determine how update or delete operations apply to the underlying records (in practice, any format with explicit or implicit row IDs), but so far the integration work has been done only for ORC.
The streaming interface (see below) does not yet integrate well with Hive's existing INSERT INTO operation. If a table is being written through the streaming interface, any data added by INSERT INTO will be lost. INSERT OVERWRITE is still available for such tables, but it deletes data inserted via streaming in the same way it deletes any other data previously written to that partition.
Transactions are turned off by default in Hive 0.13. See the Configuration section below; several key items must be configured manually.
Tables must be bucketed to take advantage of these features (see the sketch after this list). Tables in the same system that do not use transactions and ACID do not need to be bucketed.
Only snapshot-level isolation is supported. When a given query starts, it is given a consistent snapshot of the data. Dirty read, read committed, repeatable read, and serializable are not supported. When BEGIN is introduced, snapshot isolation will be supported for the duration of the transaction rather than just a single query. Other isolation levels may be added depending on user demand.
The existing ZooKeeper and in-memory lock managers are not compatible with transactions. There is currently no plan to address this. To learn how locks are stored for transactions, see the Basic design section below.
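As a minimal sketch of the bucketing requirement (the table name and bucket count are illustrative, and the 'transactional' table property shown here is the convention used by later Hive releases):

    -- Illustrative only: a bucketed ORC table suitable for ACID use.
    CREATE TABLE stores (
      store_id INT,
      name STRING,
      square_footage INT
    )
    CLUSTERED BY (store_id) INTO 8 BUCKETS
    STORED AS ORC
    TBLPROPERTIES ('transactional' = 'true');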
Streaming ingest interface
For more information on using streaming data ingestion, see StreamingDataIngest.
Syntax changes
Several new commands were added to Hive's DDL to support ACID and transactions, and some existing DDL was modified as well.
For example, a new SHOW TRANSACTIONS command has been added; for more information on this command, see ShowTransactions.
SHOW COMPACTIONS is also a new command; see ShowCompactions for details.
The existing SHOW LOCKS command was modified to provide the new lock information associated with transactions. If you are using the ZooKeeper or in-memory lock managers, you will notice little change in this command's output; see ShowLocks for more information. The sketch below shows these commands together.
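A quick illustration of the new inspection commands (output formats omitted; these are the commands as named above):

    -- List open and aborted transactions known to the metastore.
    SHOW TRANSACTIONS;

    -- List compactions that are pending, running, or recently finished.
    SHOW COMPACTIONS;

    -- List current locks, now including transaction-related lock information.
    SHOW LOCKS;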
ALTER TABLE gained a new option to compact a table or partition. Ordinary users should not need to request compaction, because the system detects the need and starts compaction automatically. However, if a compaction is unexpectedly terminated, or a user wants to compact a table at an unusual time, ALTER TABLE allows compaction to be started manually; see AlterTable/PartitionCompact for more information. ALTER TABLE queues the compaction request and returns; if the user wants to watch the progress of the compaction, the SHOW COMPACTIONS command serves that purpose.
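For example, requesting a compaction by hand might look like this (the table and partition names are illustrative):

    -- Queue a minor compaction for one partition of a hypothetical table.
    ALTER TABLE stores PARTITION (region = 'us-west') COMPACT 'minor';

    -- Queue a major compaction for the whole table.
    ALTER TABLE stores COMPACT 'major';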
Basic design
HDFS does not support in-place changes to files, nor does it guarantee read consistency when a writer appends to a file that other users are reading. To provide these features on HDFS, we follow the standard approach used in other data warehousing tools: the data for a table or partition is stored in a set of base files, while new records, updates, and deletes are stored in delta files. A new set of delta files is created for each transaction that alters a table or partition (or, for a streaming agent such as Flume or Storm, for each batch of transactions). At read time, the reader merges the base and delta files, applying the updates and deletes as it reads.
Occasionally these changes need to be merged into the base files. To do this, a set of background threads has been added to the Hive metastore; they determine when compaction is needed, perform the compaction, and then clean up afterwards (deleting the old files). There are two types of compaction, minor and major. Minor compaction takes a set of existing delta files and rewrites them into a single delta file per bucket. Major compaction takes one or more delta files plus the base file for a bucket and rewrites them into a new base file per bucket. All compaction is done in the background and does not prevent concurrent reads and writes of the data. After a compaction, the system waits until all readers of the old files have finished and then removes the old files.
Previously, all the files for a partition (or for a table, if the table is not partitioned) lived in a single directory. With these changes, any partition (or table) written with ACID will have a directory for its base files and a directory for each set of delta files.
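As an illustration (the directory names and transaction-id numbering here are a sketch of the layout, not output from a real cluster), a partition directory might look like:

    /warehouse/stores/region=us-west/
        base_0000022/           <- base files, one per bucket
            bucket_00000
            bucket_00001
        delta_0000023_0000023/  <- delta files from one transaction
            bucket_00000
            bucket_00001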
A new lock manager, DbLockManager, has also been added to Hive. This lock manager stores all lock information in the metastore, and all transactions are stored there as well. This means that locks and transactions remain durable even in the event of a server failure. To guard against clients that crash, leave, or hang, lock holders and transaction initiators must send a periodic heartbeat to the metastore; if the server does not receive a heartbeat from a client within the configured time, the lock or transaction is aborted.
Configuration
Several new configuration keys have been added to the system to support transactions.
hive.txn.max.open.batch controls how many transactions a streaming agent such as Flume or Storm opens at once. The streaming agent then writes that many entries into a single file (per Flume agent or per Storm bolt). Increasing this value therefore decreases the number of files created by the streaming agent, but it also increases the number of open transactions Hive must track, which can hurt read performance.
Worker threads (hive.compactor.worker.threads) spawn MapReduce jobs to carry out compactions; they do not perform the compactions themselves. Once a table is determined to need compaction, increasing the number of worker threads reduces the time it takes to compact it. It also raises the background load on the Hadoop cluster, since more MapReduce jobs run in the background.
Decreasing the check interval (hive.compactor.check.interval) reduces the time before compaction starts on a table or partition that needs it. However, each check of whether compaction is necessary requires several calls to the NameNode per table or partition, so decreasing this value increases the load on the NameNode.
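Putting it together, a minimal sketch of the settings needed to turn transactions on (these are the standard Hive transaction keys; the values are examples only, and the compactor keys are normally set in hive-site.xml on the metastore rather than per session) might look like this:

    -- Enable the transaction manager and lock handling.
    SET hive.support.concurrency=true;
    SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

    -- Run the compactor and give it one worker thread.
    SET hive.compactor.initiator.on=true;
    SET hive.compactor.worker.threads=1;

    -- Example tuning of the keys discussed above.
    SET hive.txn.max.open.batch=1000;
    SET hive.compactor.check.interval=300;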
Table properties
If the owner of a table does not want the system to decide automatically when to compact it, the table property NO_AUTO_COMPACTION can be set to block all automatic compaction operations on that table.
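For example (the table name is illustrative; the property is set through standard TBLPROPERTIES syntax):

    -- Opt a table out of automatic compaction.
    ALTER TABLE stores SET TBLPROPERTIES ('NO_AUTO_COMPACTION' = 'true');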