Rafy Framework - Bulk Import Entities


In some scenarios, developers want to import large quantities of entity data into a database. Although saving a list of entities through the entity repository is convenient, its internal implementation saves them to the database one at a time, which is very inefficient when the number of entities is large. Rafy therefore provides a bulk import plug-in, which uses the bulk import mechanisms in ADO.NET and ODP.NET to import large amounts of data into the database at once.

How to use

Steps
    1. Because the bulk import feature ships as a separate assembly, you need to use NuGet to reference the latest version of the Rafy.Domain.ORM.BatchSubmit assembly before using this feature.

    2. If you plan to import into an Oracle database, you also need to reference the Oracle.ManagedDataAccess (12.1.022 or later) assembly.

    3. Modify the code that needs to save a large number of entities. For example, the original code is as follows:

var books = new BookList();
for (int i = 0; i < 1000000; i++)
{
    var book = new Book
    {
        ChapterList = { new Chapter(), new Chapter() }
    };
    books.Add(book);
}

//Saved directly using the entity repository.
repo.Save(books);

The last statement saves the entity list through the repository. Modify it to create an importer and use that to save the list instead:

//Create a bulk importer to save the list.
repo.CreateImporter().Save(books);

Attention

    • As you can see from the code above, the bulk importer operates on the entire aggregate. That is, when you bulk import a parent entity, all child entities under it are imported in bulk as well.

    • Bulk import supports not only inserting new entities, but also batch updates and batch deletions. It is used in the same way as the repository.

    • For large volumes of data, using bulk import is faster than saving entities directly through the repository, by about two orders of magnitude.

    • The bulk import feature currently supports only two databases: Oracle and SQL Server.

    • When using an Oracle database, after the database has been generated you also need to call the following code to enable bulk import for an aggregate's entities, or an exception will be thrown during the import (for the reason, see the Implementation principle section below). The code is as follows:

Rafy.Domain.ORM.BatchSubmit.Oracle.OracleBatchImporter.EnableBatchSequence(
    RF.Concrete<OriginalDataRepository>()
);

Implementation principle

The following is a brief introduction to how bulk import is implemented.

SQL Server

For bulk persistence to a SQL Server database:

    • Bulk insertion of new data is implemented using System.Data.SqlClient.SqlBulkCopy.

    • Bulk updates of data are implemented using System.Data.SqlClient.SqlDataAdapter.

    • Bulk deletion concatenates a SQL statement directly, placing the Ids of the entities to be deleted into an IN clause. For example:

DELETE FROM Books WHERE Id IN (1, 3, 5, 7 ...);
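The two halves of this approach can be sketched as follows. This is a minimal illustration only: the Books table and its Id/Name columns are hypothetical, the actual SQL Rafy generates may differ, and the SqlBulkCopy call is shown in comments because it needs an open connection.

```csharp
using System;
using System.Data;

class SqlServerBulkSketch
{
    // Builds an in-memory DataTable shaped like a hypothetical Books table;
    // SqlBulkCopy streams such a table to the server in one round trip.
    public static DataTable BuildBooksTable(int count, long startId)
    {
        var table = new DataTable("Books");
        table.Columns.Add("Id", typeof(long));
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < count; i++)
        {
            table.Rows.Add(startId + i, "Book " + i);
        }
        return table;
    }

    // Concatenates the DELETE ... WHERE Id IN (...) statement used for bulk deletion.
    public static string BuildDeleteSql(long[] ids)
    {
        return "DELETE FROM Books WHERE Id IN (" + string.Join(",", ids) + ");";
    }

    static void Main()
    {
        DataTable table = BuildBooksTable(3, 100);

        // Bulk insertion would then be (requires an open SqlConnection):
        // using (var bulk = new System.Data.SqlClient.SqlBulkCopy(connection))
        // {
        //     bulk.DestinationTableName = "Books";
        //     bulk.WriteToServer(table);
        // }

        Console.WriteLine(table.Rows.Count);                          // 3
        Console.WriteLine(BuildDeleteSql(new long[] { 1, 3, 5, 7 })); // DELETE FROM Books WHERE Id IN (1,3,5,7);
    }
}
```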
Oracle

For bulk persistence to an Oracle database:

    • Inserting and updating data both use ODP.NET's native bulk import (array binding) feature.

      See also: the Oracle.ManagedDataAccess.Client.OracleCommand.ArrayBindCount property.

    • Bulk deletion is implemented in the same way as for SQL Server: by concatenating DELETE statements.
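As a rough illustration of how array binding works, the sketch below gathers each column's values for all rows into parallel arrays. The driver-specific calls appear only in comments so the sample has no dependency on the Oracle.ManagedDataAccess assembly, and the Chapters table and its column names are hypothetical.

```csharp
using System;

class OracleArrayBindSketch
{
    // Gathers one column's values for every row into a single array;
    // ODP.NET array binding sends all rows in one statement execution.
    public static long[] CollectIds(int rowCount, long startId)
    {
        var ids = new long[rowCount];
        for (int i = 0; i < rowCount; i++)
        {
            ids[i] = startId + i;
        }
        return ids;
    }

    static void Main()
    {
        long[] ids = CollectIds(3, 101);
        string[] names = { "Chapter 1", "Chapter 2", "Chapter 3" };

        // With the real driver (Oracle.ManagedDataAccess.Client) the arrays are bound like this:
        // var cmd = connection.CreateCommand();
        // cmd.CommandText = "INSERT INTO Chapters (Id, Name) VALUES (:id, :name)";
        // cmd.ArrayBindCount = ids.Length;   // one ExecuteNonQuery inserts all rows
        // cmd.Parameters.Add(":id", OracleDbType.Int64).Value = ids;
        // cmd.Parameters.Add(":name", OracleDbType.Varchar2).Value = names;
        // cmd.ExecuteNonQuery();

        Console.WriteLine(ids.Length); // 3
    }
}
```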

Generating entity Ids when adding a large number of entities

In general, when a new entity is saved through the repository, the repository uses the database's own mechanism to generate its Id: the IDENTITY column in SQL Server, and a per-table SEQUENCE in Oracle. However, when bulk importing a large number of new entities, for performance reasons the Ids for all entities to be saved must be generated at once.

In SQL Server, it is easy to adjust a table's next IDENTITY value with a SQL statement, so the implementation is relatively straightforward: simply advance the next IDENTITY value by 100,000 and use the skipped values as entity Ids.
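The skipped-value scheme amounts to simple arithmetic. The AllocateIds helper below is hypothetical (Rafy's actual API and the exact SQL it issues are not shown in this article); it only illustrates reserving a block of IDENTITY values and handing out the skipped ones.

```csharp
using System;

class IdentityRangeSketch
{
    public const int BatchStep = 100000; // the reserved block size mentioned above

    // Hypothetical helper: given the next IDENTITY value the table would use,
    // reserve a whole block by moving that value forward by BatchStep (on the
    // server this is done with a SQL statement), then hand out the skipped
    // values as the entities' Ids.
    public static long[] AllocateIds(long currentNextIdentity, int entityCount)
    {
        var ids = new long[entityCount];
        for (int i = 0; i < entityCount; i++)
        {
            ids[i] = currentNextIdentity + i;
        }
        return ids;
    }

    static void Main()
    {
        long[] ids = AllocateIds(5001, 3);
        Console.WriteLine(string.Join(",", ids)); // 5001,5002,5003
    }
}
```

Because the database's next IDENTITY value has been moved past the reserved block, none of the handed-out values can be reissued later, so the importer assigns them without further round trips.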

In Oracle, however, adjusting the value of a SEQUENCE is a DDL statement that implicitly commits the transaction, which could cause data errors. So we finally decided: to use the bulk import feature in Oracle, the table's SEQUENCE must have a large increment (for example, ALTER SEQUENCE "seq_table_id" INCREMENT BY 100000 NOCACHE). This way, the importer no longer needs to modify the SEQUENCE step during bulk import and can directly use the skipped values as entity Ids. This is convenient, but the side effect is that when the repository saves single entities, the Ids generated by two consecutive saves will differ by 100,000 and are no longer contiguous.

PS: This article has been included in the Rafy user manual.
