from:https://stackoverflow.com/questions/3512471/ What-is-a-non-capturing-group-what-does-a-question-mark-followed-by-a-colon
After reading some tutorials I still don't get it.
Could someone explain how ?: is used and what it's good for?
Let me try to explain this with an example.
Consider the following text:
http://stackoverflow.com/
http://stackoverflow.com/questions/tagged/regex
Now, if I apply the regex below over it ...
(http|ftp)://([^/\r\n]+)(/[^\r\n]*)?
... I would get the following result:
Match "http://stackoverflow.com/"
     Group 1: "http"
     Group 2: "stackoverflow.com"
     Group 3: "/"
Match "http://stackoverflow.com/questions/tagged/regex"
     Group 1: "http"
     Group 2: "stackoverflow.com"
     Group 3: "/questions/tagged/regex"
But I don't care about the protocol. I just want the host and path of the URL. So, I change the regex to include the non-capturing group (?:).
(?:http|ftp)://([^/\r\n]+)(/[^\r\n]*)?
Now, my result looks like this:
Match "http://stackoverflow.com/"
     Group 1: "stackoverflow.com"
     Group 2: "/"
Match "http://stackoverflow.com/questions/tagged/regex"
     Group 1: "stackoverflow.com"
     Group 2: "/questions/tagged/regex"
See? The first group has not been captured. The parser uses it to match the text, but ignores it later, in the final result.
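If you want to reproduce this yourself, here is a minimal Python sketch (Python is my choice for illustration, the explanation itself is not tied to one engine). It assumes the sample text is the two URLs on separate lines with the http scheme, since the pattern as written only accepts http or ftp. The names TEXT, CAPTURING, NON_CAPTURING and all_groups are made up for this example.

```python
import re

# Assumed sample text: two http:// URLs, one per line.
TEXT = "http://stackoverflow.com/\nhttp://stackoverflow.com/questions/tagged/regex"

# Same pattern twice; only the first group differs.
CAPTURING = re.compile(r"(http|ftp)://([^/\r\n]+)(/[^\r\n]*)?")
NON_CAPTURING = re.compile(r"(?:http|ftp)://([^/\r\n]+)(/[^\r\n]*)?")


def all_groups(pattern, text):
    """Return the tuple of captured groups for every match in text."""
    return [match.groups() for match in pattern.finditer(text)]


capturing_groups = all_groups(CAPTURING, TEXT)
non_capturing_groups = all_groups(NON_CAPTURING, TEXT)

print(capturing_groups)      # three groups per match, protocol first
print(non_capturing_groups)  # two groups per match, protocol dropped
```

Both patterns match exactly the same text. The only difference is that the protocol no longer takes up a group slot, so the host moves from group 2 to group 1.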
EDIT:
As requested, let me try to explain groups too.
Well, groups serve many purposes. They can help you to extract exact information from a bigger match (which can also be named), they let you rematch a previously matched group, and they can be used for substitutions. Let's try some examples, shall we?
Ok, imagine you have some kind of XML or HTML (be aware that a regex may not be the best tool for the job, but it is nice as an example). You want to parse the tags, so you could do something like this (I have added spaces to make it easier to understand):
   \<(?<TAG>.+?)\> [^<]*? \</\k<TAG>\>
or
   \<(.+?)\> [^<]*? \</\1\>
The first regex has a named group (TAG), while the second one uses a common group. Both regexes do the same thing: they use the value from the first group (the name of the tag) to match the closing tag. The difference is that the first one uses the name to match the value, and the second one uses the group index (which starts at 1).
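As a rough sketch of both backreference styles in Python: note that Python's re module spells the named group (?P<TAG>...) and the named backreference (?P=TAG), rather than the (?<TAG>...) and \k<TAG> forms used by some other engines. The sample markup and the variable names are my own assumptions, and I have dropped the readability spaces and the backslashes before the angle brackets.

```python
import re

# Hypothetical sample markup; the second element has a mismatched closing tag.
SAMPLE = "<b>bold</b> <i>mismatch</b> <em>fine</em>"

# Named group plus named backreference (Python spelling).
NAMED = re.compile(r"<(?P<TAG>.+?)>[^<]*?</(?P=TAG)>")

# Plain numbered group plus \1, which refers to the first group.
NUMBERED = re.compile(r"<(.+?)>[^<]*?</\1>")

named_tags = [m.group("TAG") for m in NAMED.finditer(SAMPLE)]
numbered_tags = [m.group(1) for m in NUMBERED.finditer(SAMPLE)]

print(named_tags)
print(numbered_tags)
```

Both lists contain only the tag names whose closing tag repeats the opening one, so the mismatched element is skipped by either pattern.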
Let's try some substitutions now. Consider the following text:
Lorem ipsum dolor sit amet consectetuer feugiat fames malesuada pretium egestas.
Now, let's use this dumb regex over it:
\b(\S)(\S)(\S)(\S*)\b
This regex matches words with at least 3 characters, and uses groups to separate the first three letters. The result is this:
Match "Lorem"
     Group 1: "L"
     Group 2: "o"
     Group 3: "r"
     Group 4: "em"
Match "ipsum"
     Group 1: "i"
     Group 2: "p"
     Group 3: "s"
     Group 4: "um"
...
Match "consectetuer"
     Group 1: "c"
     Group 2: "o"
     Group 3: "n"
     Group 4: "sectetuer"
...
So, if we apply the substitution string ...
$1_$3$2_$4
... over it, we are trying to use the first group, add an underscore, use the third group, then the second group, add another underscore, and then the fourth group. The resulting string would be like the one below.
L_ro_em i_sp_um d_lo_or s_ti_ a_em_t c_no_sectetuer f_ue_giat f_ma_es m_la_esuada p_er_tium e_eg_stas.
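The same substitution can be sketched in Python. One caveat: Python's re.sub writes group references in the replacement string as \1, \2 and so on (or \g<1>), not $1, so the replacement below is my translation of the substitution string above. The variable names are made up for the example.

```python
import re

TEXT = (
    "Lorem ipsum dolor sit amet consectetuer feugiat fames "
    "malesuada pretium egestas."
)

# Words of at least three characters, first three letters in separate groups.
WORD = re.compile(r"\b(\S)(\S)(\S)(\S*)\b")

# Python equivalent of $1_$3$2_$4: group 1, underscore, group 3, group 2,
# underscore, group 4.
swapped = WORD.sub(r"\1_\3\2_\4", TEXT)

print(swapped)
```

The trailing period is left alone because the final \b forces the match to end at the last letter of the word.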
You can use named groups for substitutions too, using ${name}.
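For a named substitution in Python, the replacement syntax is \g<name> instead of ${name}. Here is a small sketch under that assumption; the group names first, second, third and rest are hypothetical, chosen only for this example, and the text is shortened to three words.

```python
import re

TEXT = "Lorem ipsum dolor"

# Same word pattern as before, but every group has a (made-up) name.
WORD = re.compile(
    r"\b(?P<first>\S)(?P<second>\S)(?P<third>\S)(?P<rest>\S*)\b"
)

# \g<name> in the replacement refers to the group with that name.
named_result = WORD.sub(r"\g<first>_\g<third>\g<second>_\g<rest>", TEXT)

print(named_result)
```

Named references keep working if you later add or remove groups in front of them, which is the main reason to prefer them over indexes in longer patterns.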
To play around with regexes, I recommend http://regex101.com/, which offers a good amount of detail on how the regex works. It also offers a few regex engines to choose from.