How to efficiently split a field into multiple rows
I wanted to split a field in a table into multiple rows, for example turning 'aaa,bbb' into:
'aaa'
'bbb'
The test is as follows:
[SQL]
-- Test data: three rows, each holding a comma-separated list in c2
WITH t1 AS
(
SELECT 3 c1, 'eee,fff,ggg' c2 FROM dual UNION ALL
SELECT 2 c1, 'ccc,ddd' c2 FROM dual UNION ALL
SELECT 1 c1, 'aaa,bbb' c2 FROM dual
)
-- One output row per level, up to (number of commas + 1) levels
SELECT c1, LEVEL, REPLACE(REGEXP_SUBSTR(c2, '[^,]+', 1, LEVEL), ',', '') c2
FROM t1
CONNECT BY LEVEL <= LENGTH(c2) - LENGTH(REPLACE(c2, ',')) + 1
ORDER BY c1, LEVEL
The returned results are as follows:
[SQL]
C1 LEVEL C2
1 1 aaa
1 2 bbb
1 2 bbb
1 2 bbb
2 1 ccc
2 2 ddd
2 2 ddd
2 2 ddd
3 1 eee
3 2 fff
3 2 fff
3 2 fff
3 3 ggg
3 3 ggg
3 3 ggg
3 3 ggg
3 3 ggg
3 3 ggg
3 3 ggg
3 3 ggg
3 3 ggg
-- ===================================================== ====================
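CONNECT BY evidently generates a large amount of duplicate data here; the correct result only appears once DISTINCT is added. The cause is that the CONNECT BY clause has no PRIOR condition tying a child row to its parent, so every row at one level is connected to every row that still qualifies at the next level.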
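A minimal sketch of the DISTINCT variant mentioned above, reusing the same test data. Exactly where DISTINCT was placed is an assumption, since the modified query is not shown, and the alias lv is an illustrative name.

```sql
-- Assumption: DISTINCT is applied to the whole select list of the original query.
WITH t1 AS
(
SELECT 3 c1, 'eee,fff,ggg' c2 FROM dual UNION ALL
SELECT 2 c1, 'ccc,ddd' c2 FROM dual UNION ALL
SELECT 1 c1, 'aaa,bbb' c2 FROM dual
)
SELECT DISTINCT c1, LEVEL AS lv, REGEXP_SUBSTR(c2, '[^,]+', 1, LEVEL) AS c2
FROM t1
CONNECT BY LEVEL <= LENGTH(c2) - LENGTH(REPLACE(c2, ',')) + 1
ORDER BY c1, lv
```

This only collapses the duplicates after they have been generated, so the intermediate row count is unchanged.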
Reflection:
I constructed only three rows of test data, and the longest value, 'eee,fff,ggg', splits into only 3 segments, yet 21 rows were generated. As the number of rows or the number of segments to split grows, the volume of data produced by CONNECT BY becomes massive.
When I used this method on real data in the production database, the problem appeared immediately. There were only 17 rows, and the longest field split into 8 segments, yet 7.38 million rows were generated. Even with DISTINCT, the query was still very slow.
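The 21 rows can be accounted for level by level. The following sketch reuses the test data and counts the rows that CONNECT BY produces at each level; the aliases lv and row_count are illustrative names, not part of the original query.

```sql
-- Count how many rows the unrestricted CONNECT BY generates per level.
WITH t1 AS
(
SELECT 3 c1, 'eee,fff,ggg' c2 FROM dual UNION ALL
SELECT 2 c1, 'ccc,ddd' c2 FROM dual UNION ALL
SELECT 1 c1, 'aaa,bbb' c2 FROM dual
)
SELECT LEVEL AS lv, COUNT(*) AS row_count
FROM t1
CONNECT BY LEVEL <= LENGTH(c2) - LENGTH(REPLACE(c2, ',')) + 1
GROUP BY LEVEL
ORDER BY LEVEL
```

With this test data it should report 3 rows at level 1, 9 at level 2, and 9 at level 3 (3 + 9 + 9 = 21). Each of the 3 level-1 rows connects to all 3 rows that have at least 2 segments, and each of those 9 rows connects to the single row that has 3 segments, which matches the listing above.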
Solution: replace CONNECT BY on the table with a join to a generated sequence of numbers
[SQL]
WITH t1 AS
(
SELECT 3 c1, 'eee,fff,ggg' c2 FROM dual UNION ALL
SELECT 2 c1, 'ccc,ddd' c2 FROM dual UNION ALL
SELECT 1 c1, 'aaa,bbb' c2 FROM dual
)
-- Extract the text between the lv-th and (lv+1)-th comma of the padded string
SELECT c1,
       SUBSTR(t.ca,
              INSTR(t.ca, ',', 1, d.lv) + 1,
              INSTR(t.ca, ',', 1, d.lv + 1) -
              (INSTR(t.ca, ',', 1, d.lv) + 1)) AS d
  FROM (SELECT c1,
               ',' || c2 || ',' AS ca,  -- pad with commas on both sides
               LENGTH(c2 || ',') - NVL(LENGTH(REPLACE(c2, ',')), 0) AS cnt  -- segment count
          FROM t1) t,
       -- Numbers 1..mlc, where mlc is the largest segment count in the table
       (SELECT ROWNUM lv FROM
          (SELECT MAX(LENGTH(c2 || ',') - NVL(LENGTH(REPLACE(c2, ',')), 0)) mlc FROM t1)
        CONNECT BY LEVEL <= mlc
       ) d
 WHERE d.lv <= t.cnt
 ORDER BY c1
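To make the position arithmetic in the query above easier to follow, here is a small sketch that applies the same INSTR/SUBSTR expressions to one padded value, ',eee,fff,ggg,'. The aliases start_pos, seg_len and seg are illustrative names, and the literal 3 stands in for the segment count that the full query computes as cnt.

```sql
-- For each segment number lv, show where the segment starts, how long it is,
-- and the resulting substring of the comma-padded value.
SELECT lv,
       INSTR(ca, ',', 1, lv) + 1 AS start_pos,
       INSTR(ca, ',', 1, lv + 1) - (INSTR(ca, ',', 1, lv) + 1) AS seg_len,
       SUBSTR(ca,
              INSTR(ca, ',', 1, lv) + 1,
              INSTR(ca, ',', 1, lv + 1) - (INSTR(ca, ',', 1, lv) + 1)) AS seg
  FROM (SELECT ',eee,fff,ggg,' AS ca FROM dual),
       (SELECT LEVEL AS lv FROM dual CONNECT BY LEVEL <= 3)
 ORDER BY lv
```

The commas sit at positions 1, 5, 9 and 13, so the segments should start at positions 2, 6 and 10, each with length 3, giving 'eee', 'fff' and 'ggg'. CONNECT BY is still used here, but only against a single-row source, so it produces exactly one row per number.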
Conclusion:
When the table contains only one row, CONNECT BY is generally not a problem. However, if the table contains multiple rows, CONNECT BY generates a large amount of duplicate data.
This type of problem can be solved with a join.