Use COMPOSE and UNISTR to create accented characters

Many languages, including English, use accented characters. Because these characters are not part of the ASCII character set, it is difficult to write code that uses them without looking up their Unicode values, or using a Unicode editor and converting the text to a known character set.

Oracle9i introduces the COMPOSE function, which accepts a string of Unicode characters and normalizes its text. This means that it can accept a letter followed by a combining mark, such as 'a' (Unicode character 0061, decimal 97) and a grave accent (Unicode character 0300), and create a single character composed from the two. COMPOSE works with the combining marks that are part of the Unicode standard, not with the corresponding standalone accent characters in ASCII. The result of the example above should be Unicode character 00E0 (Latin small letter 'a' with a grave accent).

The most common combining characters in ANSI are:

· U+0300: grave accent (`)

· U+0301: acute accent (´)

· U+0302: circumflex accent (^)

· U+0303: tilde (~)

· U+0308: diaeresis, or umlaut (¨)
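The composition described above can be checked outside the database. The following is a minimal Python sketch, not Oracle code: it assumes that COMPOSE behaves like Unicode canonical composition (NFC) for these inputs, and the helper name `compose` and the `MARKS` table are illustrative only.

```python
import unicodedata

# Combining marks from the list above, keyed by code point.
MARKS = {
    "\u0300": "grave",
    "\u0301": "acute",
    "\u0302": "circumflex",
    "\u0303": "tilde",
    "\u0308": "diaeresis",
}

def compose(text: str) -> str:
    # Stand-in for COMPOSE: canonical composition via NFC (an assumption).
    return unicodedata.normalize("NFC", text)

# Compose the letter 'a' with each combining mark in turn.
composed = {name: compose("a" + mark) for mark, name in MARKS.items()}
for name, ch in composed.items():
    print(f"a + {name:<10} -> {ch} (U+{ord(ch):04X})")
```

Each pair collapses into one precomposed character; 'a' plus U+0300, for instance, becomes U+00E0.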

It is difficult to enter Unicode characters 0061 and 0300 together on a keyboard without special software or keyboard drivers. Therefore, one way to enter a Unicode sequence in plain ASCII text is to use the UNISTR function. This function takes an ASCII string and creates a sequence of Unicode characters in the national character set (usually installed as a 16-bit Unicode or UTF-8 character set). Non-ASCII characters are written as hexadecimal placeholder sequences (a backslash followed by four hex digits), in a way similar to Java's Unicode escapes.

To enter an 'a' followed by a combining grave accent, you can use UNISTR('a\0300') instead of trying to enter the characters directly into your code. This function works under any character set and in any database whose national character set is based on Unicode. You can place multiple combining characters in one call, and you can mix ASCII text and Unicode placeholders in the UNISTR function. For example, you can use the UNISTR function as follows:

select COMPOSE(UNISTR('Unless you are nai\0308ve, meet me at the cafe\0301 with
your re\0301sume\0301.')) from dual;
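The placeholder handling can be imitated outside the database to see what the query is expected to produce. This is a rough Python sketch under stated assumptions: `unistr` here is a hypothetical helper that only expands backslash-plus-four-hex-digit placeholders (it ignores the escaped-backslash case), and NFC normalization stands in for COMPOSE.

```python
import re
import unicodedata

def unistr(text: str) -> str:
    # Hypothetical stand-in for UNISTR: replace each backslash followed by
    # four hex digits with the character at that code point.
    return re.sub(r"\\([0-9A-Fa-f]{4})",
                  lambda m: chr(int(m.group(1), 16)), text)

raw = (r"Unless you are nai\0308ve, meet me at the cafe\0301 "
       r"with your re\0301sume\0301.")

decomposed = unistr(raw)                             # letters + combining marks
composed = unicodedata.normalize("NFC", decomposed)  # stand-in for COMPOSE

print(composed)
# prints: Unless you are naïve, meet me at the café with your résumé.
```

The composed string is four characters shorter than the decomposed one, because each of the four combining marks merges into the letter before it.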

When you combine the output of the UNISTR function with COMPOSE, you can generate a Unicode character without looking up its value. For example:

select 'It is true' from dual where compose(unistr('a\0300')) = unistr('\00e0');

Given UNISTR output, the COMPOSE function returns an NVARCHAR2 string, which is usually based on Unicode. If these characters are used locally, the database attempts to map the Unicode characters to the local character set when an implicit TO_CHAR is applied to the result. Not all characters can be mapped, and some combinations do not work in COMPOSE because the Unicode Consortium has not defined them at the level used by Oracle.
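The two failure modes mentioned here (a composed character with no local mapping, and a pair with no composed form at all) can be illustrated with a small Python sketch. It assumes Latin-1 as the local character set and NFC as a stand-in for COMPOSE; `to_latin1` is an illustrative name. Oracle typically substitutes a replacement character instead of raising an error, so the sketch only reports whether a mapping exists.

```python
import unicodedata

def to_latin1(base: str, mark: str):
    # Compose base + mark, then try to encode the result as Latin-1.
    # Returns the encoded bytes, or None when no mapping exists.
    composed = unicodedata.normalize("NFC", base + mark)
    try:
        return composed.encode("latin-1")
    except UnicodeEncodeError:
        return None

a_grave = to_latin1("a", "\u0300")  # U+00E0 exists in Latin-1
c_acute = to_latin1("c", "\u0301")  # composes to U+0107, outside Latin-1
q_grave = to_latin1("q", "\u0300")  # no composed form; the mark stays separate

print(a_grave, c_acute, q_grave)
```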

To quickly check how characters are displayed in a particular environment, you can run a script similar to the following to see how the composed characters are mapped. You may need to check your NLS_LANG setting to ensure that these characters are returned correctly:

create or replace type hexrange_tbl as table of varchar2(4);
/
show errors;

create or replace function hexrange(n1 varchar2, n2 varchar2)
    return hexrange_tbl pipelined
is
begin
    -- emit every four-digit hex value from n1 through n2
    for i in to_number(n1,'000X') .. to_number(n2,'000X') loop
        pipe row(to_char(i,'FM000X'));
    end loop;
    return;
end hexrange;
/
show errors;

select column_value composer,
       compose(unistr('A\' || column_value)) a,
       compose(unistr('C\' || column_value)) c,
       compose(unistr('E\' || column_value)) e,
       compose(unistr('I\' || column_value)) i,
       compose(unistr('N\' || column_value)) n,
       compose(unistr('O\' || column_value)) o,
       compose(unistr('R\' || column_value)) r,
       compose(unistr('S\' || column_value)) s,
       compose(unistr('U\' || column_value)) u,
       compose(unistr('Y\' || column_value)) y
  from table(hexrange('0300','0327')) x;
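For comparison, a similar grid can be produced without a database. This Python sketch assumes NFC normalization as a stand-in for COMPOSE; `BASES`, `composed_row`, and `table` are illustrative names. A dot marks a pair that has no single-character composed form in the Unicode data shipped with the Python build, which may differ from the Unicode level a given Oracle release uses.

```python
import unicodedata

BASES = "ACEINORSUY"  # the same base letters as the query above

def composed_row(mark_cp: int) -> dict:
    # Map each base letter to its single-character composed form, or None
    # when the pair stays as two characters after normalization.
    row = {}
    for base in BASES:
        s = unicodedata.normalize("NFC", base + chr(mark_cp))
        row[base] = s if len(s) == 1 else None
    return row

# One row per combining mark in the range U+0300 through U+0327.
table = {cp: composed_row(cp) for cp in range(0x0300, 0x0327 + 1)}

for cp, row in table.items():
    cells = " ".join(row[b] or "." for b in BASES)
    print(f"{cp:04X}  {cells}")
```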

For fun, here is a small PL/SQL script that uses COMPOSE and UNISTR to create an effect that many SMS users, hackers, and spammers use to make readable English text difficult to scan: each letter is replaced by a randomly chosen accented version. I use DBMS_RANDOM to randomly select one of the combining characters that work with each letter, and then let SQL compose the pair and convert it back to produce ANSI/Latin-1 output. This script uses the ENAME column of the EMP table.

set serveroutput on;
declare
    -- combining marks that work under ANSI, at least
    -- (the declared size of 50 is an assumption, chosen to match l_ename)
    a_comb nvarchar2(50) := unistr('\0300\0301\0302\0303\0308\030A');
    c_comb nvarchar2(50) := unistr('\0327');
    e_comb nvarchar2(50) := unistr('\0300\0301\0302\0308');
    i_comb nvarchar2(50) := unistr('\0300\0301\0308');
    n_comb nvarchar2(50) := unistr('\0303');
    o_comb nvarchar2(50) := unistr('\0300\0301\0302\0303\0308');
    u_comb nvarchar2(50) := unistr('\0300\0301\0302\0308');
    y_comb nvarchar2(50) := unistr('\0301\0308');
    l_idx integer;
    l_ename nvarchar2(50);
    ch nchar;
    l_junk varchar2(50);
begin
    dbms_random.initialize(to_number(to_char(sysdate,'sssss')));
    for emp_row in (select ename from emp) loop
        l_ename := emp_row.ename;
        l_junk := null;
        for i in 1..length(l_ename) loop
            ch := substr(l_ename,i,1);
            -- pick a random combining mark for this letter and compose the pair
            case lower(ch)
            when 'a' then
                l_junk := l_junk || compose(ch || substr(a_comb,
                    mod(abs(dbms_random.random),length(a_comb)) + 1,1));
            when 'c' then
                l_junk := l_junk || compose(ch || substr(c_comb,
                    mod(abs(dbms_random.random),length(c_comb)) + 1,1));
            when 'e' then
                l_junk := l_junk || compose(ch || substr(e_comb,
                    mod(abs(dbms_random.random),length(e_comb)) + 1,1));
            when 'i' then
                l_junk := l_junk || compose(ch || substr(i_comb,
                    mod(abs(dbms_random.random),length(i_comb)) + 1,1));
            when 'n' then
                l_junk := l_junk || compose(ch || substr(n_comb,
                    mod(abs(dbms_random.random),length(n_comb)) + 1,1));
            when 'o' then
                l_junk := l_junk || compose(ch || substr(o_comb,
                    mod(abs(dbms_random.random),length(o_comb)) + 1,1));
            when 'u' then
                l_junk := l_junk || compose(ch || substr(u_comb,
                    mod(abs(dbms_random.random),length(u_comb)) + 1,1));
            when 'y' then
                l_junk := l_junk || compose(ch || substr(y_comb,
                    mod(abs(dbms_random.random),length(y_comb)) + 1,1));
            else
                l_junk := l_junk || ch;
            end case;
        end loop;
        dbms_output.put_line(to_char(l_junk));
    end loop;
end;
/
show errors;
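The same effect can be tried without an EMP table. The Python sketch below is only an analogue of the script, under stated assumptions: NFC normalization stands in for COMPOSE, the sample names are placeholder values rather than a query result, and `accentuate` and `strip_accents` are illustrative helpers. The fixed seed just makes runs repeatable.

```python
import random
import unicodedata

# Combining marks per letter, copied from the PL/SQL declarations above.
COMBINERS = {
    "a": "\u0300\u0301\u0302\u0303\u0308\u030a",
    "c": "\u0327",
    "e": "\u0300\u0301\u0302\u0308",
    "i": "\u0300\u0301\u0308",
    "n": "\u0303",
    "o": "\u0300\u0301\u0302\u0303\u0308",
    "u": "\u0300\u0301\u0302\u0308",
    "y": "\u0301\u0308",
}

def accentuate(text: str, rng: random.Random) -> str:
    # Append a randomly chosen combining mark to each eligible letter,
    # then compose each pair into a single accented character.
    out = []
    for ch in text:
        marks = COMBINERS.get(ch.lower())
        out.append(ch + rng.choice(marks) if marks else ch)
    return unicodedata.normalize("NFC", "".join(out))

def strip_accents(text: str) -> str:
    # Reverse the effect: decompose, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

rng = random.Random(0)               # fixed seed for repeatable output
names = ["SMITH", "ALLEN", "JONES"]  # sample values, assumed for illustration
junk = [accentuate(n, rng) for n in names]

for original, mangled in zip(names, junk):
    print(original, "->", mangled)
```

Stripping the accents recovers the original names, which is why this kind of text stays readable to people while defeating naive string matching.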

