The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

A better way to replace these strings?

I've stepped into a somewhat interesting problem today, and I'm wondering if there's a better way than what I've managed.

Here's the toy version:

I have a string that looks like this

I want to swap the asdf's and qwer's around in place.

You can't just do a global replace on 'asdf' then on 'qwerty' because you'll end up stuck with all 'qwertys' and nowhere to go.

I ended up making use of sentinel characters to mark my replacements.  So, replace all 'asdf' with '[qwerty]', all non-marked 'qwerty' with '[asdf]'.  Then, all marked '[asdf]' with 'asdf', all marked '[qwerty]' with 'qwerty'.

What I did works fine, is short and sweet (nine lines of code) and is safe for my input, but I'm wondering just out of interest's sake, does anybody see a more elegant solution?
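
For reference, here is a minimal Python sketch of the naive double replace and of the sentinel approach described above. The function names are hypothetical, the original nine lines were not posted, and the sketch assumes the input never contains '[' or ']'.

```python
import re


def naive_swap(text: str, a: str = "asdf", b: str = "qwerty") -> str:
    """Two plain global replaces: everything ends up as `a`."""
    return text.replace(a, b).replace(b, a)


def swap_with_sentinels(text: str, a: str = "asdf", b: str = "qwerty") -> str:
    """Swap `a` and `b` using bracketed markers (assumes no brackets in the input)."""
    if "[" in text or "]" in text:
        raise ValueError("input already contains a sentinel character")
    marked_a, marked_b = f"[{a}]", f"[{b}]"
    # Step 1: every `a` becomes a marked `b`.
    text = text.replace(a, marked_b)
    # Step 2: every `b` that is not preceded by "[" (i.e. not marked) becomes a marked `a`.
    text = re.sub(r"(?<!\[)" + re.escape(b), lambda m: marked_a, text)
    # Steps 3 and 4: strip the markers.
    text = text.replace(marked_a, a)
    return text.replace(marked_b, b)
```

The lookbehind in step 2 is what keeps the second replace from touching the 'qwerty' inside '[qwerty]'.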

Tuesday, October 02, 2007
Whoops, that was me above.
Derek Illchuk
Tuesday, October 02, 2007
You could just write your own search routine that goes through the string character by character and replaces each sequence as it finds them.
Chris Nahr
Wednesday, October 03, 2007
"So, replace all 'asdf' with '[qwerty]', all non-marked 'qwerty' with '[asdf]'.  Then, all marked '[asdf]' with 'asdf', all marked '[qwerty]' with 'qwerty'."

You can save one replace:

asdf -> [qwerty]
qwerty -> asdf
[qwerty] -> qwerty

Of course you must make sure that [qwerty] can't be a part of your original string.

If it can and you have a split and a join function, you can also do this:

parts = split(str, "asdf")
for each part
    replace(qwerty -> asdf)
str = join(parts, "qwerty")
Wednesday, October 03, 2007
+1 Chris. If performance matters, one pass through the string is better than three.
Wednesday, October 03, 2007
I agree with Chris and onanon. Go through it once. You can either go character by character or you can use a method like indexOf() on both search strings and take the lowest index returned. Make your replacement and then start again from the end of the replaced string.
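
A rough Python sketch of that single pass, using str.find() in place of indexOf(). The function name is hypothetical, and it assumes neither search string is empty or a prefix of the other.

```python
def swap_single_pass(text: str, a: str = "asdf", b: str = "qwerty") -> str:
    """Walk the string once, swapping whichever of `a` or `b` occurs next."""
    if not a or not b:
        raise ValueError("search strings must be non-empty")
    pieces = []  # collected chunks, joined once at the end
    pos = 0
    while True:
        ia = text.find(a, pos)
        ib = text.find(b, pos)
        if ia == -1 and ib == -1:
            break
        # Take the lowest index returned; -1 means "not found".
        if ib == -1 or (ia != -1 and ia < ib):
            hit, found, replacement = ia, a, b
        else:
            hit, found, replacement = ib, b, a
        pieces.append(text[pos:hit])
        pieces.append(replacement)
        # Resume just past the text that was matched.
        pos = hit + len(found)
    pieces.append(text[pos:])
    return "".join(pieces)
```

Collecting the chunks in a list and joining once avoids rebuilding the whole result string on every replacement.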
Wednesday, October 03, 2007
I agree with the others. Even without knowing what you need this for, and without knowing the language you use, we can say for sure that premature optimization will save you some nanoseconds and thus the day. Just make sure that your string type is low-level enough so that you can control the behaviour of the string functions, especially to prevent something like

target = target + leftpart + "asdf"

from reallocating and copying the whole string over and over. And make it generic. Tomorrow you may need to replace 3 substrings, next week it may be 50. All these saved nanoseconds will accumulate more quickly than you may think. Maybe you should even think about using assembler for it...

Seriously, you asked for alternatives and rolling your own function is another one. I'm just surprised that no implementations in $MY_PET_LANGUAGE have been posted so far -- untested, buggy and non-working, of course, with an additional post for correction and a third to correct the corrections. That's the usual kind of answer to such questions.

Isn't it wonderful here? ;)
Wednesday, October 03, 2007
Geez, a NINE LINE Algorithm, and you want to IMPROVE it?

Get the job done, get a life, and get on with it.  Unless you're just bragging, in which case, that looks like a cool solution.
Wednesday, October 03, 2007
Just use regular expressions. Almost all languages support them and they are very efficient/fast.

The .NET version of the regex replace:

Wednesday, October 03, 2007
What about using a GUID as the temporary replacement string, instead of the bracketed one?
We can be quite sure that a GUID will not be part of the string.
Johncharles
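
A minimal Python sketch of the GUID idea, combined with the three-replace sequence posted earlier in the thread. The function name is hypothetical, and it assumes the search strings do not overlap with the hex characters of the generated token.

```python
import uuid


def swap_with_guid(text: str, a: str = "asdf", b: str = "qwerty") -> str:
    """Swap `a` and `b` with three replaces and a random temporary token."""
    token = uuid.uuid4().hex  # 32 hexadecimal characters
    # Extremely unlikely, but cheap to check rather than assume.
    if token in text or a in token or b in token:
        raise ValueError("temporary token collides with the input")
    text = text.replace(a, token)  # asdf   -> token
    text = text.replace(b, a)      # qwerty -> asdf
    return text.replace(token, b)  # token  -> qwerty
```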
Wednesday, October 03, 2007
> I want to swap the asdf's and qwer's around in place.

So are all of the suggested solutions wrong or were the specs wrong?
Wednesday, October 03, 2007
Your own search/replace routine is probably fastest and best, but Secure's suggestion turns into one line of Python (which is cool if you're into that):

'qwerty'.join([s.replace('qwerty', 'asdf') for s in x.split('asdf')])

Regular expressions are usually the way to go, but I'm not sure how to write one in this case. Can someone post a regex that swaps 'qwerty' and 'asdf' in one pass with no temporary string?
Wednesday, October 03, 2007
In Perl, you can do it in one pass with a regexp by using the simple substitution command

s/(asdf|qwerty)/$1 eq 'qwerty' ? 'asdf' : 'qwerty'/ge

(There's your $MY_PET_LANGUAGE solution, Secure -- although Perl isn't really my pet language, and I've tested that it works on the single item of sample data we have.)
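
For anyone not using Perl, roughly the same one-pass substitution can be sketched in Python with re.sub() and a replacement function. The function name and the dictionary are illustrative only; adding more pairs to the dictionary would extend it to more substrings.

```python
import re


def swap_regex(text: str, a: str = "asdf", b: str = "qwerty") -> str:
    """Swap `a` and `b` in one pass with no temporary string."""
    mapping = {a: b, b: a}
    # Longest alternatives first, so a shorter key cannot shadow a longer one.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is looked up and replaced by its counterpart.
    return pattern.sub(lambda m: mapping[m.group(0)], text)
```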
Wednesday, October 03, 2007
From which of course that oh-so-expensive (not) string comparison can be trivially removed:

s/(asdf)|qwerty/$1 ? 'qwerty' : 'asdf'/eg
Wednesday, October 03, 2007

Would you have tested it if I had not ranted about it? And see, you've posted a correction (an optimization in this case). ;)
Wednesday, October 03, 2007
Secure, yes, of course I would have tested it anyway!    I didn't take in your post in detail till I looked back to remind myself who it was who'd mentioned pet-language implementations.

However, my failure to profile anything before attempting optimisation satisfies me that I'm upholding the most important traditions of programming discussions.  :)
Wednesday, October 03, 2007
"If performance matters,"

Not since 1995...
Invader Zim
Thursday, October 11, 2007

This topic is archived. No further replies will be accepted.
