(?<=<b>)\w+(?=</b>)
- Regex options: Case insensitive
- Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby 1.9
JavaScript (before ES2018) and Ruby 1.8 support the lookahead (?=</b>), but not the lookbehind (?<=<b>).
- I prefer to think of lookbehind as a check on what comes before the cursor (the current position in the text).
- I prefer to think of lookahead as a check on what comes after the cursor.
Positive lookaround
Essentially, lookaround checks whether certain text can be
matched without actually matching it. Lookaround constructs are therefore called zero-length assertions.
Lookbehind (?<=…) is the only regular expression construct that
traverses the text from right to left instead of from left to right.
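A minimal sketch of the zero-length behaviour, using Python's `re` module (one of the flavors listed above). The subject string is made up for illustration. The tags are checked by the lookaround but are not part of the match.

```python
import re

# A word enclosed in <b>...</b>; the tags are asserted, not consumed.
pattern = re.compile(r"(?<=<b>)\w+(?=</b>)", re.IGNORECASE)

subject = "My <b>cat</b> is furry"  # illustrative input, not from the notes
match = pattern.search(subject)

matched_text = match.group(0)  # only the word itself
matched_span = match.span()    # start/end offsets exclude both tags

print(matched_text, matched_span)  # cat (6, 9)
```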
Negative lookaround
(?!...) and (?<!...), with an exclamation point
instead of an equals sign, are negative lookaround. Negative lookaround
matches when the regex inside the lookaround
fails to match.
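A small sketch of both negative forms in Python. The two patterns and the sample strings are illustrative assumptions, not taken from the notes above.

```python
import re

# Negative lookbehind: whole numbers NOT preceded by a dollar sign.
not_price = re.compile(r"(?<!\$)\b\d+\b")

# Negative lookahead: whole numbers NOT followed by a percent sign.
not_percent = re.compile(r"\b\d+\b(?!%)")

counts = not_price.findall("cost $30 for 15 items")
plain = not_percent.findall("20% of 50")

print(counts)  # ['15']
print(plain)   # ['50']
```

The word boundaries keep the engine from matching only part of a number (for example the "0" of "$30") after the negative assertion rejects the first digit.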
Different levels of lookbehind
Lookahead is completely compatible across flavors: any regex can go inside it, even
a lookahead or lookbehind nested in the lookahead. Lookbehind is different, because regex engines are designed to traverse
the text from left to right, but lookbehind needs to go from right to left.
- Perl and Python still require lookbehind to have a fixed length.
- PCRE and Ruby 1.9 allow alternatives of different lengths inside lookbehind, as long as each alternative has a fixed length. (Does Notepad++ use the PCRE 7.2 regular expression engine?)
- Java takes lookbehind one step further and allows any finite-length
regular expression inside lookbehind, that is, anything except the unbounded quantifiers ‹*›, ‹+›, and ‹{42,}›.
- The .NET Framework is the only one of these flavors
that can actually apply a full regular expression from right to left.
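The fixed-length restriction can be observed directly in Python's `re` module. This sketch only covers Python; the helper name `compiles` and the three test patterns are made up for illustration.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Python rejects lookbehind whose length is not fixed.
        return False

fixed = compiles(r"(?<=ab|cd)x")    # both alternatives are 2 characters long
variable = compiles(r"(?<=a|bc)x")  # alternatives of different lengths
unbounded = compiles(r"(?<=a+)x")   # unbounded quantifier inside lookbehind

print(fixed, variable, unbounded)  # True False False
```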
Lookaround is atomic
(?=(\d+))\w+\1
- Regex options: None
- Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
A capturing group inside lookaround works the same as an ordinary group:
groups are numbered from outer to inner, left to right.
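A sketch of what "atomic" means here, in Python. Once the lookahead has succeeded, the engine does not go back into it to try a shorter capture, so group 1 always holds the full run of digits at the starting position. The two subject strings and the contrasting pattern without lookahead are illustrative assumptions.

```python
import re

pattern = re.compile(r"(?=(\d+))\w+\1")

# Group 1 captures "123"; \w+ backs off to "123x" so that \1 can match "123".
ok = pattern.search("123x123")

# Group 1 captures "123" and is never retried as "12", so \1 cannot match.
failed = pattern.search("123x12")

# Contrast: an ordinary group is not atomic, so \d+ can back off to "12".
plain_group = re.search(r"(\d+)\w+\1", "123x12")

print(ok.group(0), ok.group(1))                    # 123x123 123
print(failed)                                      # None
print(plain_group.group(0), plain_group.group(1))  # 123x12 12
```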
Alternative to Lookbehind
<b>\K\w+(?=</b>)
- Regex options: Case insensitive
- Regex flavors: PCRE 7.2, Perl 5.10
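With ‘\K’, the text matched in front of it is kept out of the overall match.
The part before ‘\K’ is matched as an ordinary block, from left to right; the engine does not step back through the text as it does for lookbehind.
For example:
When (?<=a)a is applied to the string ‘aaaa’, three a's are matched:
the 2nd, 3rd, and 4th. At each position, lookbehind checks one character to the left, and then the next a is matched.
But a\Ka matches only two a's, the 2nd and the 4th.
In the first match attempt the first and second a are consumed, the first is discarded, and the second becomes the match.
The next attempt then starts after it: the third and fourth a are consumed, the third is discarded, and the fourth becomes the match.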
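The difference in match positions can be reproduced in Python. Python's standard `re` module has no `\K`, so this sketch emulates `a\Ka` with a capturing group, `a(a)`, and reads the position of group 1; that emulation is an assumption made for illustration, not the `\K` feature itself.

```python
import re

text = "aaaa"

# Lookbehind consumes nothing, so every a after the first one matches.
lookbehind_starts = [m.start() for m in re.finditer(r"(?<=a)a", text)]

# Emulating a\Ka: the leading a IS consumed, so matches cannot overlap.
# Group 1 stands in for the part that \K would keep.
keep_starts = [m.start(1) for m in re.finditer(r"a(a)", text)]

print(lookbehind_starts)  # [1, 2, 3]  -> the 2nd, 3rd, and 4th a
print(keep_starts)        # [1, 3]     -> the 2nd and 4th a
```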
Solution Without Lookbehind
In Ruby 1.8 or JavaScript (before ES2018), no lookbehind is available.
Solution:
- Use an ordinary regex that matches everything, put the parts in capturing groups, and pick out only the group you want.
- If a replace operation is needed, use a group reference such as \1 or \k<name> in the replacement text
to put back the parts you don't want changed.
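Both points can be sketched in Python with capturing groups only, no lookaround at all. The subject string and the replacement word "pet" are made up for illustration.

```python
import re

subject = "My <b>cat</b> and <B>dog</B>"  # illustrative input

# Match the tags too, but capture only the word and pick that group.
words = re.findall(r"<b>(\w+)</b>", subject, flags=re.IGNORECASE)

# For replacement, capture the parts to keep and put them back with \1 and \2.
replaced = re.sub(r"(<b>)\w+(</b>)", r"\1pet\2", subject, flags=re.IGNORECASE)

print(words)     # ['cat', 'dog']
print(replaced)  # My <b>pet</b> and <B>pet</B>
```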
Simulate lookbehind
var mainregexp = /\w+(?=<\/b>)/;
- First, find the target location with the lookahead-only regex and take the text to the left of the match.
- Second, if that left part ends with what the lookbehind expected, <b>, then the lookbehind matched.
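The two steps above can be sketched as follows. This is a Python port of the idea rather than the JavaScript code the snippet above comes from; the names `MAIN`, `BEHIND`, and `find_bold_word` are hypothetical, and looping over every candidate (instead of testing only the first) is an assumption of this sketch.

```python
import re

# Step 1: the main regex uses lookahead only.
MAIN = re.compile(r"\w+(?=</b>)", re.IGNORECASE)

# Step 2: the simulated lookbehind, anchored at the very end of the left context.
BEHIND = re.compile(r"<b>\Z", re.IGNORECASE)

def find_bold_word(subject):
    """Return the first word between <b> and </b>, or None if there is none."""
    for m in MAIN.finditer(subject):
        left_context = subject[:m.start()]  # everything before the candidate
        if BEHIND.search(left_context):
            return m.group(0)               # simulated lookbehind matched
    return None                             # no candidate passed the check

print(find_bold_word("My <b>cat</b>"))  # cat
print(find_bold_word("My <i>cat</b>"))  # None
```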