Recently my organization raised a requirement: verify that a submitted user name contains at least two Chinese characters and no full-width characters. After half a day of searching, I decided to solve it with regular expressions. After a long round of testing, however, I found that Oracle's regular expression function REGEXP_LIKE does not support the "\un" syntax, which matches the Unicode character n, where n is a four-digit hexadecimal code. For example, \u00A9 matches the copyright symbol (©). Only standard regular expression syntax is supported, so this approach does not work and the check has to be implemented some other way.
I searched online for a long time without finding a definitive implementation. Comparing length() with lengthb() can detect multibyte characters, but that approach is not precise enough, so I am recording my own method here in the hope that it is useful.
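For full-width characters, the Unicode code points fall in the range \uFF00-\uFFFF, so they are all in the FF segment. You can therefore convert the string with asciistr and look for an escape that begins with \FF: instr(asciistr(replace(str, '\')), '\FF') > 0, where str is the string to check. The replace call strips literal backslashes first, so that a \FF typed into the input cannot be mistaken for an escape.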
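As a rough illustration of the check above, the query below applies it to two sample strings. The sample values are hypothetical and are built with unistr so that the script does not depend on the client encoding. It is a sketch that assumes the database can store these characters. Note that the ideographic (full-width) space U+3000 lies outside the FF segment, so this test does not flag it.

```sql
-- Sketch only: the sample strings are made up for illustration.
select asciistr(s) as escaped,
       case
         when instr(asciistr(replace(s, '\')), '\FF') > 0 then 'has full-width'
         else 'no full-width'
       end as verdict
  from (select unistr('\5F20\4E09ABC') as s from dual          -- two Chinese characters + ASCII "ABC"
        union all
        select unistr('\5F20\4E09\FF21\FF22') as s from dual); -- two Chinese characters + full-width "AB"
```

The first row should come back as "no full-width" and the second as "has full-width", because only the second string produces escapes starting with \FF.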
For Chinese characters the range is too large for that trick, so the check has to be done in a function. I wrote the following:
create or replace function get_chinese(v_name in varchar2) return integer is
  v_count integer;
  v_code  varchar2(10);
begin
  v_count := 0;
  /**
  Author: backpack wandering
  QQ: 380140243
  Purpose: return the number of Chinese characters in a string.
  Principle: the Unicode code points of Chinese characters lie between 4E00 and 9FA5, and asciistr
  writes each such character as a backslash followed by four hex digits. So convert the string with
  asciistr and check whether each run of five consecutive characters falls in this range.
  If it does, it is a Chinese character; otherwise it is some other character.
  Return value: number of Chinese characters
  Return on exception: -1
  */
  for i in 1 .. lengthb(asciistr(v_name)) - 4 loop
    -- if substr(asciistr(v_name), i, 1) = '\' then -- if it is a backslash, check for a Chinese character
    v_code := substr(asciistr(v_name), i, 5);
    -- asciistr emits uppercase hex digits, so both bounds are written in uppercase
    if v_code between '\4E00' and '\9FA5' then
      -- code point range of Chinese characters
      v_count := v_count + 1; -- found a Chinese character
    end if;
    -- dbms_output.put_line(v_code);
    -- end if;
  end loop;
  return v_count;
exception
  when others then
    return -1; -- return -1 on any exception
end get_chinese;
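To tie this back to the original requirement, here is one way the function and the full-width test might be combined in a single query. The sample names and the column aliases are assumptions for illustration only, and the sketch assumes the database character set can store Chinese and full-width characters.

```sql
-- Sketch only: valid means at least two Chinese characters and no full-width characters.
select s as user_name,
       get_chinese(s) as chinese_count,
       case
         when get_chinese(s) >= 2
          and instr(asciistr(replace(s, '\')), '\FF') = 0
         then 'valid'
         else 'invalid'
       end as verdict
  from (select to_char(unistr('\5F20\4E09')) as s from dual        -- two Chinese characters
        union all
        select to_char(unistr('\5F20\4E09\FF21')) as s from dual   -- same, plus a full-width "A"
        union all
        select to_char(unistr('\5F20A')) as s from dual);          -- one Chinese character + ASCII "A"
```

Under those assumptions, only the first sample should be reported as valid: the second contains a full-width character and the third has only one Chinese character.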
Query results: