.NET Questions (CLOSED)Questions and Answers on any aspect of .NET. Now closed. |
||
|
This discussion group is now closed.
Have a question about .NET development? Try stackoverflow.com, a worldwide community of great developers asking and answering questions 24 hours a day. The archives of .NET Questions contain years of Q&A. Even older .NET Questions are still online, too. |
I'm having performance issues when using a custom collection that inherits from KeyedCollection<T, V>. After I did some tests it seems the Contains() method is slow, so I decided to check the code for this method (thanx to VS 2008 framework debug). Internally it use the following method:
private int FindEntry(TKey key) { if( key == null) { ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key); } if (buckets != null) { int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF; for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next) { if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i; } } return -1; } which as you can see is doing a linear search to find the item. I'm expecting thousand of items in my collection so I need a better collection that is faster. I guess using a hashtable would help, but I wana hear any experience you might have. Thanx
Not sure where you got that code from: looking at Contains(TKey) with Lutz Reflector gives the code at the end of this post.
If you're using Contains(TItem), that is inherited from Collection<T> so would be slow. Kind of surprising that Contains(TItem) isn't implemented as Contains(GetKeyForItem(TKey)). public bool Contains(TKey key) { if (key == null) { ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key); } if (this.dict != null) { return this.dict.ContainsKey(key); } if (key != null) { foreach (TItem local in base.Items) { if (this.comparer.Equals(this.GetKeyForItem(local), key)) { return true; } } } return false; }
Joe Tuesday, February 26, 2008
What is the type of your key? I would suspect that you are getting lots of collisions because your hashCode function isn't very good. I think the code in the contains method is only doing a linear search through a single bucket which is what a typical hashtable implementation would do: notice the int i = buckets[hashCode % buckets.Length] at the start of the for loop. Its hard to say though, without knowing what is in "entries". Everything in a bucket should have the same hashCode.
Joe you're absolutely right...the issue is because I was using TItem. When I replaced it with the key the performance improved dramatically. I didn't know that Contains(TItem) would inherit from Collection. It should be overriden in the case of Dictionary, as you say, to use GetKeyForItem(TKey).
If you use Contains(TItem) and the dictionary has a few thousand object it's unusable. I got the code using Visual Studio 2008...Microsoft has released the framework code which helps a lot. Thanx
Ted, you're right also that the implementation is correct...
The issues is explained above...my hash code works fine:)
Thumbs down to Microsoft for the quality of the KeyedCollection implementation.
The Contains(TItem) issue was reported here: https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=94359 In addition it is very slow to serialize, because it serializes its internal dictionary as well as its internal list - unnecessary because its easy to recreate on deserialization: https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=302290 Of course it would be relatively simple to write a custom implementation without these problems.
Joe Tuesday, February 26, 2008 |
|
Powered by FogBugz


