A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.
In one of the recent Stack Overflow podcasts, Joel mentioned CLucene not being good, but Lucene.NET being excellent. He mentioned something about threading issues, but didn't really go into it.
So what's wrong with CLucene?
Thursday, July 03, 2008
Might sound biased coming from me, since I created CLucene. However, CLucene is fine (I've never had any problems with multithreading myself; it probably depends on how you implement it).
CLucene lags behind Java Lucene a bit since CLucene's developer base is much smaller. But there is a lot of new work going on that brings CLucene closer to Java Lucene.
Lucene.NET - of course, if you're developing in .NET and don't have special performance requirements, then this may be a better option, since you won't have to deal with memory management, etc., in C++.
I use an older version of CLucene for my product (because, IIRC, the newer versions have a different license that prevents us from using it) and it simply doesn't work in a multi-threaded app. I eventually had to serialize everything so only one thread was in it at any given time.
I submitted some patches and suggestions to try to help fix things, but when you have critical sections of code being protected by boolean variables, e.g.
static bool gIsLocked = false ;
if ( ! gIsLocked )
gIsLocked = true ;
// do critical processing here
gIsLocked = false ;
then, with apologies to Ben and his team, it suggests that the devs don't really understand the issues involved when writing multi-threaded code.
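To spell out why the flag above is not a lock: two threads can both read gIsLocked as false before either one sets it to true, so both end up in the critical section together. A minimal sketch of the usual fix, assuming a C++11 compiler, is below. The names (gLock, gCounter, addUnderLock, runWorkers) are made up for illustration and are not CLucene code.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical names for illustration only; none of this is CLucene code.
static std::mutex gLock;
static long gCounter = 0;

// Enters the critical section under a real mutex. Testing and taking the
// lock happen as one atomic step, which a plain bool flag cannot provide.
void addUnderLock(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> guard(gLock); // released at end of scope
        ++gCounter;                               // critical processing
    }
}

// Runs several threads over the critical section and returns the final count.
long runWorkers(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    gCounter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back(addUnderLock, iterations);
    for (auto& th : pool)
        th.join(); // all increments are complete once every thread has joined
    return gCounter;
}
```

With the mutex in place, runWorkers(4, 10000) should always come back as 40000. With a bool flag in its place, increments can be lost whenever two threads pass the check at the same moment.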
Friday, July 04, 2008
> I use an older version of CLucene for my product (because,
> IIRC, the newer versions have a different license that
> prevents us from using it)
It looks like they went from GPL to LGPL, which would actually give you more freedom. With LGPL, you can use the library with a commercial application as long as you link to the library dynamically and use it as is without making any changes to its source code. If you make any changes to the library (hence create a derived work), then you must release those changes to everyone free of charge. The rest of your application can remain under the license of your choosing. GPL does not allow this freedom (i.e. your application has to be GPLed as well).
> and it simply doesn't work in a multi-threaded app. I
> eventually had to serialize everything so only one
> thread was in it at any given time.
I think this is what Joel was talking about although, as I said, he didn't go into much detail.
Ben, care to elaborate on this issue?
Friday, July 04, 2008
In response to Taka:
Typical locking code in CLucene looks like this:
if (docAnalyzer != NULL)
Where SCOPED_LOCK_MUTEX defines a mutex guard, which uses pthread or critical sections to lock the THIS_LOCK object.
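For anyone unfamiliar with the idea, a mutex guard of that general kind can be sketched roughly as below for the pthread case. This is only an illustration of the scoped-lock pattern, not the actual definition of SCOPED_LOCK_MUTEX; the names ScopedLock, gIndexLock, gDocCount and addDocument are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical RAII guard; NOT CLucene's actual SCOPED_LOCK_MUTEX definition.
class ScopedLock {
public:
    // Locks the mutex on construction.
    explicit ScopedLock(pthread_mutex_t& m) : mutex_(m) {
        pthread_mutex_lock(&mutex_);
    }
    // Unlocks on destruction, so every exit path from the scope releases it.
    ~ScopedLock() {
        int rc = pthread_mutex_unlock(&mutex_);
        assert(rc == 0);
        (void)rc; // silence the unused-variable warning when NDEBUG is set
    }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    pthread_mutex_t& mutex_;
};

// Hypothetical shared state protected by the guard.
static pthread_mutex_t gIndexLock = PTHREAD_MUTEX_INITIALIZER;
static int gDocCount = 0;

// Increments the shared counter while holding the lock.
int addDocument() {
    ScopedLock guard(gIndexLock); // lock is held until the function returns
    return ++gDocCount;
}
```

The point of the pattern is that the unlock cannot be forgotten: it runs whenever the guard goes out of scope, including on an early return or an exception.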
Perhaps if you are using a VERY old version, there were problems. However, I'd be very disappointed if people were advertising how bad CLucene is based on such old, old code!!
The licensing mentioned: CLucene has become MORE liberal with its licensing. It supports dual LGPL or Apache (you choose which one to use). As code_kungfu points out, LGPL is more liberal than GPL, and is appropriate for library code like CLucene.
I, along with many large projects, use CLucene with multithreading. There have been no reported multithreading issues in the last few years that are not fixed in the latest code.
I hope this clears up any problems with CLucene.
This topic is archived. No further replies will be accepted.