Data cleansing note: converting strings to dates (handling one date field stored in multiple formats)
This is an original work from the "Deep Blue blog". Reposting is welcome, but you must cite the source when you repost; otherwise the author reserves the right to pursue legal liability for copyright infringement.
Deep Blue blog: http://blog.csdn.net/huangyanlong/article/details/46513855
[Background]
While cleaning data, I found three different data formats in one time field of the source system. The likely cause is that the source system accepts data from three or more other systems whose formats are inconsistent: the time field is declared as VARCHAR2, and the receiving side enforces no format when it accepts data uploaded by different systems. The data in this field therefore has to be processed and cleaned by category.
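One way to find out which formats a field actually contains is to reduce every value to its "shape" and count the shapes. The query below is only a sketch: source_table and DATE_TIME are placeholder names, not the real names from the source system. It replaces every digit with 9 so that values sharing a layout fall into the same group.

```sql
-- Sketch only: source_table and DATE_TIME are placeholder names.
-- Mask every digit as '9' so values with the same layout collapse
-- into one shape, then count how many rows have each shape.
SELECT REGEXP_REPLACE(DATE_TIME, '[0-9]', '9') AS value_shape,
       COUNT(*)                                AS row_count
FROM   source_table
GROUP  BY REGEXP_REPLACE(DATE_TIME, '[0-9]', '9')
ORDER  BY row_count DESC;
```

Each distinct value_shape is a candidate format that may need its own handling. Because only digits are masked, values ending in AM and PM show up as separate shapes.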
[Solution]
The CASE expression can classify the different formats and apply a separate conversion to each, for example:
SELECT CASE
         WHEN condition 1 THEN processing method 1
         WHEN condition 2 THEN processing method 2
         ELSE processing method 3
       END alias
FROM source table;
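As a concrete sketch of this template, the query below labels a few inline sample strings by format. The sample values, the column name raw_value, and the label texts are made up for illustration; the inline rows selected from DUAL simply stand in for a real source table.

```sql
-- Sketch only: the inline rows replace a real source table, and the
-- labels are illustrative. Branches are tested top to bottom, so the
-- first matching WHEN decides the class of each value.
SELECT raw_value,
       CASE
         WHEN raw_value LIKE '%-%'    THEN 'dash-separated date'
         WHEN raw_value LIKE '%:000%' THEN 'trailing :000 on the time'
         ELSE                              'space-separated date'
       END AS format_class
FROM (
  SELECT '2017-08-11 20:23:18.0' AS raw_value FROM DUAL
  UNION ALL
  SELECT '2017 01 31 09:00:00:000 PM' FROM DUAL
  UNION ALL
  SELECT '2017 08 11 11:00:12 PM' FROM DUAL
);
```

The order of the WHEN branches matters: the most specific patterns go first, and ELSE catches whatever remains.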
[Experiment]
Create the experiment table as follows:
CREATE TABLE experiment_table (
  ID        VARCHAR2(32) DEFAULT sys_guid(),
  DATE_TIME VARCHAR2(50),
  MEMO      VARCHAR2(32)
);
Insert experiment data to simulate the three time formats:
INSERT INTO experiment_table (DATE_TIME, MEMO) VALUES ('2017-08-11 20:23.0:18.0', '1');
INSERT INTO experiment_table (DATE_TIME, MEMO) VALUES ('2017-05-27 15:12.0:24.0', '1');
INSERT INTO experiment_table (DATE_TIME, MEMO) VALUES ('2017 08 11 11:00:12 PM', '2');
INSERT INTO experiment_table (DATE_TIME, MEMO) VALUES ('2017 05 27 10:10:00 AM', '2');
INSERT INTO experiment_table (DATE_TIME, MEMO) VALUES ('2017 02 08 12:23:00:000 PM', '3');
INSERT INTO experiment_table (DATE_TIME, MEMO) VALUES ('2017 01 31 09:00:00:000 PM', '3');
COMMIT;
SELECT * FROM experiment_table;
Create the target table as follows:
CREATE TABLE target_table (
  ID           VARCHAR2(32),
  RESULT_TIME  DATE,
  LEVEL_NUMBER VARCHAR2(32)
);
If the DATE_TIME strings are converted without handling each format, an error is returned. The following statement classifies each row with CASE and converts it with the matching format:
INSERT /*+ append */ INTO target_table nologging
SELECT ID,
       CASE
         WHEN DATE_TIME LIKE '%-%' THEN
           TO_DATE(REPLACE(DATE_TIME, '.0', ''), 'YYYY-MM-DD HH24:MI:SS')
         WHEN DATE_TIME LIKE '%:000%' THEN
           TO_DATE(REPLACE(DATE_TIME, ':000', ''), 'YYYY MM DD HH:MI:SS AM', 'NLS_DATE_LANGUAGE=American')
         ELSE
           TO_DATE(DATE_TIME, 'YYYY MM DD HH:MI:SS AM', 'NLS_DATE_LANGUAGE=American')
       END RESULT_TIME,
       MEMO LEVEL_NUMBER
FROM experiment_table;
COMMIT;
SELECT * FROM target_table;
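Before running a load like this on real data, it can help to list the rows that none of the branches would convert. The query below is a sketch under two assumptions: the database is Oracle 12.2 or later, where VALIDATE_CONVERSION is available, and the experiment table is named experiment_table. The second and third formats share one branch here, because removing ':000' leaves values without it unchanged.

```sql
-- Sketch only: assumes Oracle 12.2+ and a table named experiment_table.
-- VALIDATE_CONVERSION returns 1 when the value converts with the given
-- format and 0 when it does not, so "= 0" keeps only the problem rows.
SELECT ID, DATE_TIME
FROM   experiment_table
WHERE  CASE
         WHEN DATE_TIME LIKE '%-%' THEN
           VALIDATE_CONVERSION(REPLACE(DATE_TIME, '.0', '') AS DATE,
                               'YYYY-MM-DD HH24:MI:SS')
         ELSE
           VALIDATE_CONVERSION(REPLACE(DATE_TIME, ':000', '') AS DATE,
                               'YYYY MM DD HH:MI:SS AM',
                               'NLS_DATE_LANGUAGE=American')
       END = 0;
```

If this query returns no rows, every value fits one of the known formats; any row it does return points to a format that still needs its own branch.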
A small tip, noted down so it is easy to remember.
Supplement: processing dates with English month names
SELECT TO_DATE('1-JULY-15 22:23:11', 'DD-MON-YY HH24:MI:SS', 'NLS_DATE_LANGUAGE=American') FROM DUAL;