Suppose you have a huge table: hundreds of millions of records, a plain-text dump running to gigabytes, and you need to read the whole thing out in bulk.
For loading, SQL*Loader is the obvious choice, and it parallelizes well; the table itself may already be partitioned. But what if you have to unload the data in batches yourself? Multithreading is the natural idea, but how do you actually split the work? If your first thought is to divide the table by ranges of the primary key, you probably won't get a good result: primary-key values are rarely distributed evenly, so some worker threads stay busy while others finish early. There is a deeper problem too: different threads may end up contending for the same data file, which may well happen in practice. Is there a more direct way? Fortunately, there is.
The idea is that Oracle keeps track of how a table's data is laid out in the data files: file numbers, block counts, and so on. All you have to do is compute the starting and ending ROWID for each slice of the relevant data files. Generally speaking, extents are allocated evenly as the table grows (say, one extent per 20,000 records or so), so this lets you distribute the work almost uniformly across threads while avoiding I/O contention on the same data file.
Here are two versions of the SQL: one from Quest's Sqoop connector for Oracle, the other from earlier experience (you can guess what I was doing).
SELECT data_object_id,
       file_id,
       relative_fno,
       file_batch,
       MIN(start_block_id) start_block_id,
       MAX(end_block_id) end_block_id,
       SUM(blocks) blocks
  FROM (SELECT o.data_object_id,
               e.file_id,
               e.relative_fno,
               e.block_id start_block_id,
               e.block_id + e.blocks - 1 end_block_id,
               e.blocks,
               CEIL(SUM(e.blocks) OVER (PARTITION BY o.data_object_id, e.file_id
                                            ORDER BY e.block_id ASC)
                    / (SUM(e.blocks) OVER (PARTITION BY o.data_object_id, e.file_id)
                       / :numchunks)) file_batch
          FROM dba_extents e,
               dba_objects o,
               dba_tab_subpartitions tsp
         WHERE o.owner = :owner
           AND o.object_name = :object_name
           AND e.owner = :owner
           AND e.segment_name = :object_name
           AND o.owner = e.owner
           AND o.object_name = e.segment_name
           AND (o.subobject_name = e.partition_name
                OR (o.subobject_name IS NULL AND e.partition_name IS NULL))
           AND o.owner = tsp.table_owner(+)
           AND o.object_name = tsp.table_name(+)
           AND o.subobject_name = tsp.subpartition_name(+))
 GROUP BY data_object_id, file_id, relative_fno, file_batch
 ORDER BY data_object_id, file_id, relative_fno, file_batch;
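To see what the FILE_BATCH expression is doing, here is a small Python sketch of the same arithmetic for the extents of a single data file. This is my own illustration, not part of the connector; the extent tuples and function name are hypothetical.

```python
import math

def file_batches(extents, numchunks):
    """Mirror the query's FILE_BATCH expression for the extents of ONE
    data file: CEIL(running_blocks / (total_blocks / numchunks)).
    extents: list of (block_id, blocks) tuples, as reported by DBA_EXTENTS."""
    extents = sorted(extents)                  # ORDER BY block_id ASC
    total = sum(blocks for _, blocks in extents)
    running, batches = 0, {}
    for block_id, blocks in extents:
        running += blocks                      # running SUM(blocks) OVER (...)
        batch = math.ceil(running / (total / numchunks))
        batches.setdefault(batch, []).append((block_id, blocks))
    return batches

# Four equal 8-block extents split into 2 chunks of 2 extents each:
# file_batches([(0, 8), (8, 8), (16, 8), (24, 8)], 2)
# → {1: [(0, 8), (8, 8)], 2: [(16, 8), (24, 8)]}
```

Because the split is proportional to blocks rather than to key values, each batch covers roughly the same amount of physical storage, which is exactly what keeps the worker threads evenly loaded.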
The following SQL goes a step further and kindly computes the starting and ending ROWID of each chunk for you. What more could you ask for?
SELECT grp,
       dbms_rowid.rowid_create(1, data_object_id, lo_fno, lo_block, 0) min_rid,
       dbms_rowid.rowid_create(1, data_object_id, hi_fno, hi_block, 10000) max_rid
  FROM (SELECT DISTINCT grp,
               FIRST_VALUE(relative_fno)
                   OVER (PARTITION BY grp ORDER BY relative_fno, block_id
                         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) lo_fno,
               FIRST_VALUE(block_id)
                   OVER (PARTITION BY grp ORDER BY relative_fno, block_id
                         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) lo_block,
               LAST_VALUE(relative_fno)
                   OVER (PARTITION BY grp ORDER BY relative_fno, block_id
                         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) hi_fno,
               LAST_VALUE(block_id + blocks - 1)
                   OVER (PARTITION BY grp ORDER BY relative_fno, block_id
                         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) hi_block,
               SUM(blocks) OVER (PARTITION BY grp) sum_blocks
          FROM (SELECT relative_fno,
                       block_id,
                       blocks,
                       TRUNC((SUM(blocks) OVER (ORDER BY relative_fno, block_id) - 0.01)
                             / (SUM(blocks) OVER () / :numchunks)) grp
                  FROM dba_extents
                 WHERE segment_name = UPPER(:object_name)
                   AND owner = :owner
                 ORDER BY block_id)),
       (SELECT data_object_id
          FROM dba_objects
         WHERE owner = :owner
           AND object_name = UPPER(:object_name))
 ORDER BY grp;
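The (min_rid, max_rid) pairs this query returns are ready to drive the worker threads directly: each thread scans one ROWID range with a predicate like WHERE ROWID BETWEEN :min_rid AND :max_rid. A minimal Python sketch, assuming the cx_Oracle driver; the function names, DSN, and output paths are my own placeholders:

```python
import threading

def range_query(owner, table):
    # Hypothetical helper: the per-worker query over one ROWID range.
    return (f"SELECT * FROM {owner}.{table} "
            "WHERE ROWID BETWEEN :min_rid AND :max_rid")

def unload_chunk(dsn, owner, table, min_rid, max_rid, out_path):
    """One worker: scan a single ROWID range and dump it as CSV-ish text.
    Assumes the cx_Oracle driver; connection details are placeholders."""
    import cx_Oracle                        # assumed to be installed
    with cx_Oracle.connect(dsn) as conn:
        cur = conn.cursor()
        cur.execute(range_query(owner, table),
                    min_rid=min_rid, max_rid=max_rid)
        with open(out_path, "w") as out:
            for row in cur:
                out.write(",".join(str(c) for c in row) + "\n")

def parallel_unload(dsn, owner, table, rid_ranges):
    """rid_ranges: list of (min_rid, max_rid) pairs from the query above."""
    threads = [threading.Thread(target=unload_chunk,
                                args=(dsn, owner, table, lo, hi,
                                      f"{table}_{i}.txt"))
               for i, (lo, hi) in enumerate(rid_ranges)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Since the ranges were built from extents, no two threads touch the same blocks, so the workers never compete for the same stretch of a data file.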
Reference:
http://agstamy.blogspot.sg/2011/11/spliting-very-large-table-to-pieces.html
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:29986713177238
http://www.quest.com/data-connector-for-oracle-and-hadoop/
https://github.com/QuestSoftwareTCD/OracleSQOOPconnector