This is a word-segmentation function I found online; I am reposting it here first so I can study it later.
A basic word-segmentation program
Carrie (thanks, Carrie! :D) wrote this basic word-segmentation program, which to a large extent meets the needs of a considerable number of applications:
--
-- A basic Chinese word segment function
-- Author: carrie
--
CREATE OR REPLACE FUNCTION carriecharseg(input text) RETURNS text AS $$
DECLARE
    retval text;
    i int;
    j int;
BEGIN
    -- coalesce keeps a NULL input from looping forever
    i := coalesce(char_length(input), 0);
    j := 1;  -- substring positions are 1-based
    retval := '';
    LOOP
        EXIT WHEN j > i;
        -- append the j-th character followed by a space
        retval := retval || substring(input from j for 1) || ' ';
        j := j + 1;
    END LOOP;
    RETURN retval;
END
$$ LANGUAGE plpgsql;
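As a point of comparison, the same character-by-character split can be sketched without a loop, using PostgreSQL's built-in `string_to_array` (which splits a string into single characters when the delimiter is NULL) and `array_to_string`. This is only an illustrative alternative, not part of Carrie's code, and the sample sentence is made up for the example. Unlike the loop version, it leaves no trailing space.

```sql
-- Loop-free sketch: split a string into characters and join them with spaces.
-- A NULL delimiter makes string_to_array return one element per character.
SELECT array_to_string(string_to_array('我来自地球', NULL), ' ') AS segmented;
-- segmented: 我 来 自 地 球
```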
The following is an overloaded version of the segmentation function. It distinguishes English words from Chinese characters: each English word is kept together as one token, while each Chinese character becomes its own token:
--
-- A basic Chinese word segment function
-- Author: carrie
--
-- The second argument is unused; it only selects this overload.
CREATE OR REPLACE FUNCTION carriecharseg(input text, int) RETURNS text AS $Q$
DECLARE
    query text := '';
    retval text := '';
    thisval text := '';
    lastval text := '';
    i integer := 0;
    j integer := 1;  -- substring positions are 1-based
BEGIN
    -- replace punctuation with spaces, then fold to lower case
    query := lower(regexp_replace(input, '[[:punct:]]', ' ', 'g'));
    --raise notice '123:%', query;
    -- coalesce keeps a NULL input from looping forever
    i := coalesce(char_length(query), 0);
    LOOP
        thisval := substring(query from j for 1);
        IF ((thisval <> '') AND (thisval <> ' ')) THEN
            -- Parentheses matter here: AND binds tighter than OR, so both
            -- "is a letter" tests must be grouped before they are combined.
            IF (((lastval >= 'a') AND (lastval <= 'z'))
                OR ((lastval >= 'A') AND (lastval <= 'Z'))) AND
               (((thisval >= 'a') AND (thisval <= 'z')) OR
                ((thisval >= 'A') AND (thisval <= 'Z'))) THEN
                -- still inside an English word: append without a space
                retval := retval || thisval;
            ELSE
                -- start a new token
                retval := retval || ' ' || thisval;
            END IF;
        END IF;
        lastval := thisval;
        j := j + 1;
        EXIT WHEN j > i;
    END LOOP;
    RETURN trim(retval);
END
$Q$ LANGUAGE plpgsql;
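The mixed English/Chinese rule above (keep runs of Latin letters together, isolate every other character) can also be approximated with nested `regexp_replace` calls. The following is a rough sketch under that assumption, not a drop-in replacement for Carrie's function; the sample input is invented for illustration, and it assumes the default `standard_conforming_strings = on` so that backslashes in the patterns are taken literally.

```sql
-- Regex-based sketch of the same idea:
--   1. lower-case the input and turn punctuation into spaces
--   2. wrap every character that is not a-z or whitespace in spaces
--   3. collapse runs of whitespace and trim the ends
SELECT trim(
         regexp_replace(
           regexp_replace(
             regexp_replace(lower('Hello,世界 PostgreSQL'), '[[:punct:]]', ' ', 'g'),
             '([^a-z[:space:]])', ' \1 ', 'g'),
           '\s+', ' ', 'g')
       ) AS segmented;
-- segmented: hello 世 界 postgresql
```

As with the loop version, digits are not treated as letters here, so each digit ends up as a separate token.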
Usage notes:
I have been using these two custom segmentation functions recently, mainly to split Chinese text into individual characters.
For example, a Chinese sentence meaning "I come from the Earth" comes out with a space inserted between every character. This approach suits title search well: a title is short, so even a single-character query can still find the relevant titles.
If the application scenario requires very precise queries, you can run the text through this function as a conversion step before making the search call.
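One way the title-search scenario might look in practice is sketched below, using PostgreSQL's built-in full-text search with the `simple` configuration. Everything here is an assumption made for illustration: the `demo_titles` table and its rows are hypothetical, and `string_to_array(..., NULL)` stands in for the segmentation function so the snippet runs on its own. Whether each Chinese character ends up as its own lexeme depends on the database encoding and locale, so treat this as a starting point to test rather than a guaranteed recipe.

```sql
-- Hypothetical table and rows, for illustration only.
CREATE TEMP TABLE demo_titles (id int, title text);
INSERT INTO demo_titles VALUES
    (1, '我来自地球'),
    (2, '火星日记');

-- Split both the stored title and the search term into single characters,
-- then match them with the built-in 'simple' text search configuration.
SELECT id, title
FROM demo_titles
WHERE to_tsvector('simple',
                  array_to_string(string_to_array(title, NULL), ' '))
      @@ plainto_tsquery('simple',
                  array_to_string(string_to_array('地球', NULL), ' '));
```

The same conversion is applied to the stored text and to the search term, which is the "convert before the call" step described above.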