Today I learned how to write a regex that doesn’t match a particular string. Or, to be a bit more precise: how to write a regex with a zero-width negative lookahead assertion!
Problem: You want to match something not followed by something else. Let’s say, for example, that you want to match ‘foo/’ not followed by ‘beer’. In other words, ‘foo/’ followed by anything but ‘beer’ is what you’re looking for.
The regex syntax for doing this is (?!X), where X is the pattern that must not come next. It’s available in Ruby 1.8+, Perl 5+, Java 5+ and most likely in your favorite language too.
Here’s an example in Ruby:
# %w takes bare, whitespace-separated words; quotes and commas would become part of the strings
%w[foo/bar foo/foo foo/beer].each do |x|
  if x =~ /foo\/(?!beer).*/
    puts x + " matches"
  end
end
Another example in Java:
import java.util.regex.*;

public class JavaRegex {
    public static void main(String[] args) {
        String[] s = new String[] {"foo/bar", "foo/foo", "foo/beer"};
        Pattern p = Pattern.compile("foo/(?!beer).*");
        for (String string : s) {
            Matcher m = p.matcher(string);
            // matches() is true only if the whole string matches the pattern
            System.out.println(string + " " + m.matches());
        }
    }
}
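The ‘zero-width’ part is easy to miss, so here is a minimal sketch of what it means: the lookahead only peeks at what comes next and never becomes part of the matched text. The class name LookaheadWidth and the helper firstMatch are made up for this illustration; only Pattern, Matcher, find() and group() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadWidth {
    // Hypothetical helper: returns the text consumed by the first match,
    // or null when the regex does not match anywhere in the input.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The lookahead checks that "beer" does not follow, but consumes nothing,
        // so the matched text is just "foo/".
        System.out.println(firstMatch("foo/(?!beer)", "foo/bar"));   // foo/
        // "beer" follows directly, so the lookahead fails and there is no match.
        System.out.println(firstMatch("foo/(?!beer)", "foo/beer"));  // null
        // The lookahead also rejects anything that merely starts with "beer".
        System.out.println(firstMatch("foo/(?!beer)", "foo/beers")); // null
    }
}
```

Note the last case: (?!beer) rules out every continuation beginning with ‘beer’, not only the exact word. If only ‘foo/beer’ as a whole should be excluded, something like (?!beer$) is probably closer to what you want.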
For more info on the ‘lookaround’ or ‘zero-width assertion’ regexp constructs, see Lookahead and Lookbehind Zero-Width Assertions.