The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

How to omit a text line that matches a pattern using RegEx

Having adopted regular expressions as part of my programming arsenal, I have found them to be a valuable tool in many situations; I find them most useful as a code search / replace mechanism.

One thing I was not able to find out is how to omit from the returned strings a line of text that follows a specific pattern.

Example

Let’s say I have the following code snippet:

// This is a junk class

class  Junk
{

}

Now I am doing a global search across the whole project, which contains several similar classes, and I am trying to display all the class names.

I was not able to find a regular expression that skips the commented line that happens to contain the token class. Instead, all my searches for an occurrence of class returned both the comment line and the desired line where the class is actually declared.

Any ideas?
JohnyWest
Tuesday, July 29, 2008
 
 
^[^/]*class
will find lines where class appears before any / character, which skips instances of class inside a C++-style // comment
not the nonno
Tuesday, July 29, 2008
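
A minimal sketch of how the pattern above behaves, written in Python's re module rather than Visual Studio's Find syntax (an assumption made purely for illustration; the sample text and the function name are hypothetical). Note that the pattern rejects any line with a / before class, not only comments, and it would also accept words that merely contain "class".

```python
import re

# Sample lines modeled on the snippet from the original question.
SOURCE = """\
// This is a junk class

class  Junk
{

}
"""

# From the start of the line: any run of non-'/' characters, then "class".
PATTERN = re.compile(r"^[^/]*class")


def lines_declaring_class(text):
    """Return the lines that PATTERN matches from their first character."""
    return [line for line in text.splitlines() if PATTERN.match(line)]


if __name__ == "__main__":
    for line in lines_declaring_class(SOURCE):
        print(line)
```

The comment line is rejected because its first character is already a /, so the [^/]* part cannot reach the word class.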
 
 
thanks!
JohnyWest
Tuesday, July 29, 2008
 
 
In general, regular expressions don't provide a way to match strings that DON'T match a given expression (though some regex implementations may provide that feature; e.g. the grep command provides the -v flag to return all lines that DON'T match the specified regex). Check the documentation for the regex tool that you are using, but prepare to be disappointed.
Jeffrey Dutky
Tuesday, July 29, 2008
 
 
I think this is the case...
I am using Visual Studio's Find in Files, and although the expression ^[^/]*class discussed above lets me skip commented lines that contain the word class, I still cannot find the syntax that will let me skip lines that contain a specific substring.
To be more concrete:


I want to view all the classes that exist in my project except those that I use for testing, whose names by convention end with the string Test.

Searching for the string class returns:

class Junk
class JunkxyzTest

I cannot find out how to write the regex to skip the second line...

Am I missing something here?

Thanks for the help....
JohnyWest
Wednesday, July 30, 2008
 
 
Yet another reason I prefer command line tools. In the shell, I can pipe together several simple things, rather than having to come up with one complex one.

To find lines in myFile that start with a sequence of non-/ chars followed by "class ", but do not end with "Test", I would do:

grep '^[^/]*class ' myFile | grep -v 'Test$'

The two invocations of grep work together to do what you want in a way a single regEx cannot easily do.

Will

Wednesday, July 30, 2008
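
For readers without a shell at hand, the same two-stage idea (keep lines that match one pattern, then drop those that match a second) can be sketched in Python. This is only an illustration of the pipe above; the function name and sample lines are hypothetical.

```python
import re

# Stage 1: same expression as the first grep in the pipe.
KEEP = re.compile(r"^[^/]*class ")
# Stage 2: same expression as the grep -v in the pipe.
DROP = re.compile(r"Test$")


def class_lines_excluding_tests(lines):
    """Keep lines matching KEEP, then discard those that also match DROP."""
    return [ln for ln in lines if KEEP.search(ln) and not DROP.search(ln)]


if __name__ == "__main__":
    sample = [
        "// This is a junk class here",
        "class Junk",
        "class JunkxyzTest",
    ]
    for ln in class_lines_excluding_tests(sample):
        print(ln)
```

Each stage stays a simple expression, which is the point of the pipe: the exclusion is done by the second filter, not by the regex itself.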
 
 
I had high hopes for
class ([a-zA-Z_$][a-zA-Z0-9_$]@)#~(Test)
but that still matches class JimBobTest

I think the ~(X) form can only be used when you have already matched some specific characters; otherwise the X gets swallowed by the preceding matcher.

It looks like you will have to delve into the non-delights of VS macros unless somebody else has a working regex.
not the nonno
Wednesday, July 30, 2008
 
 
I was trying very similar regexps, but none of them skips the undesired substring... Apparently ~(xyz) works only as part of a larger substring and does not take wildcard characters into account.

As the anonymous poster stated, using a pipe is probably the best way to go.
JohnyWest
Wednesday, July 30, 2008
 
 
In AWK, match on something you don't want and then use 'next', i.e. skip to the next line.

So to skip over #defines when parsing a C++ program I have

# Skip preprocessor lines such as #define.
/^#/ { next }

# do_something is a placeholder for the real action.
/function/ { do_something }
Grant Black
Wednesday, July 30, 2008
 
 
Jeffrey Dutky:
"In general, regular expressions don't provide a way to match strings that DON'T match a given expression"

Actually, most regex engines provide zero-width negative lookaround assertions (negative lookahead and negative lookbehind), for exactly this kind of task.

There's a good tutorial here:

http://www.regular-expressions.info/lookaround.html
BenjiSmith
Thursday, July 31, 2008
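
As a rough sketch of the negative lookahead mentioned above, applied to the class / Test example from earlier in the thread: the code below uses Python's re module, which is an assumption for illustration and not the Visual Studio Find in Files syntax discussed by the other posters. The pattern and function name are hypothetical.

```python
import re

# ^[^/]*           no '/' before the keyword (skips // comment lines)
# \bclass\s+       the keyword followed by whitespace
# (?!\w*Test\b)    negative lookahead: the name must not end in "Test"
# (\w+)            capture the class name
CLASS_NOT_TEST = re.compile(r"^[^/]*\bclass\s+(?!\w*Test\b)(\w+)")


def non_test_class_names(text):
    """Return class names declared in text, skipping names ending in Test."""
    names = []
    for line in text.splitlines():
        m = CLASS_NOT_TEST.match(line)
        if m:
            names.append(m.group(1))
    return names


if __name__ == "__main__":
    sample = "// This is a junk class\nclass  Junk\nclass JunkxyzTest\n"
    print(non_test_class_names(sample))
```

The lookahead consumes no characters; it only checks what follows, so the name is still available for the capturing group when the check passes.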
 
 

This topic is archived. No further replies will be accepted.
