Uses regular expressions to extract a fixed-length string from the specified starting position in the source string.

Source: Internet
Author: User


(BTW: Chinese encodings are complicated and not very reasonable. The high byte is 0xa1-0xfe (0xff is excluded because 0xff, that is 255, plays an important role in the telnet protocol), and the low byte is 0x40-0xfe; GBK extended the high byte range to 0x81-0xfe to accommodate Unicode.)


How to tell whether the last byte was cut in the middle of a Chinese character:
If the cut falls in the middle of a Chinese character, the last byte is a high byte, that is, a byte whose value is 0x81 or greater.
The high byte of a Chinese character is always in 0x81-0xfe; the low byte is not restricted here.
A complete Chinese character: [0x81-0xfe][0x40-0xfe]
A regular expression is therefore used to split the string into Chinese and non-Chinese characters in sequence.
If the cut falls in the middle of a Chinese character, the last item is matched as a single "non-Chinese" byte that is really the high byte of a Chinese character.
Checking whether that last byte lies in [0x81-0xfe] shows whether the string was truncated incorrectly.

<?php

//---------------------------------------------------------------
// File name: preg_substr.php
// Description: uses a regular expression to extract a string of a given length from the source string, starting at the specified position.
//---------------------------------------------------------------

/// Function description
/// Function name: preg_substr
/// Function version: fourth revision
/// Purpose: uses a regular expression to extract a string of a given length from the source string, starting at the specified position.
/// Parameters:
///   $strSource: source string
///   $intStart:  start position in bytes. The default value is 0, the beginning of the string.
///   $intLen:    length to extract, in bytes. The default value is 32.
function preg_substr($strSource, $intStart = 0, $intLen = 32)
{
    is_int($intLen) or die("len isn't an integer");
    is_int($intStart) or die("start isn't an integer");
    $strResult = "";
    // Group 1 is the skipped prefix, group 2 the extracted part (both counted in bytes).
    // The @ hides the warning PCRE raises when a repeat count is too large to compile.
    if ($intStart >= 0 && $intLen > 0
        && @preg_match('/^(.{' . $intStart . '})(.{0,' . $intLen . '})/s', $strSource, $regs)) {
        // If the prefix ends on a lone high byte, the start splits a character: back up one byte.
        preg_match_all('/([\x81-\xFE].|.)/s', $regs[1], $regs1, PREG_PATTERN_ORDER);
        if ($regs1[1] && preg_match('/^[\x81-\xFE]\z/', end($regs1[1]))) {
            $intStart--;
        }

        // Match again from the adjusted start; if the extract ends on a lone high byte, drop it.
        preg_match('/^(.{' . $intStart . '})(.{0,' . $intLen . '})/s', $strSource, $regs);
        preg_match_all('/([\x81-\xFE].|.)/s', $regs[2], $regs1, PREG_PATTERN_ORDER);
        if ($regs1[1] && preg_match('/^[\x81-\xFE]\z/', end($regs1[1]))) {
            $intLen--;
        }

        preg_match('/^(.{' . $intStart . '})(.{0,' . $intLen . '})/s', $strSource, $regs);
        $strResult = $regs[2];
    }
    return $strResult;
}

function preg_substr2($strSource, $intStart = 0, $intLen = 32)
{
    is_int($intLen) or die("len isn't an integer");
    is_int($intStart) or die("start isn't an integer");
    $strResult = ""; // returned as-is when the arguments are out of range
    if ($intStart >= 0 && $intLen >= 0) {
        // Same checks as preg_substr, but substr() does the cutting.
        // If the skipped prefix ends on a lone high byte, back the start up one byte.
        $strResult = (string) substr($strSource, 0, $intStart);
        preg_match_all('/([\x81-\xFE].|.)/s', $strResult, $regs, PREG_PATTERN_ORDER);
        if ($regs[1] && preg_match('/^[\x81-\xFE]\z/', end($regs[1]))) {
            $intStart--;
        }

        // If the extract ends on a lone high byte, cut one byte less.
        $strResult = (string) substr($strSource, $intStart, $intLen);
        preg_match_all('/([\x81-\xFE].|.)/s', $strResult, $regs, PREG_PATTERN_ORDER);
        if ($regs[1] && preg_match('/^[\x81-\xFE]\z/', end($regs[1]))) {
            $strResult = (string) substr($strSource, $intStart, --$intLen);
        }
    }
    return $strResult;
}

$strHTML = <<<HTML
AB
