The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

(object) design patterns for realtime apps?

I am writing an multithreaded realtime application with a lot of user feedback/control. So I started to design it with object design patterns, which works pretty well. But in some situations I am not satisfied with the performance, which is okay, but not great.

Now I was searching around for a kind of "realtime best practices", maybe some of you could give me some pointers where to look and read more?

I appreciate any input

Husker
Husker Send private email
Sunday, February 05, 2006
 
 
Art Wilkins
Sunday, February 05, 2006
 
 
"Real time" used to be reserved to what is now known as "hard real time": Applications in which 10ms delay in running some computation or process makes your product irrelevant (think: aircraft control; heart pacers; any time sensitive control equipment). You need special O/S and tools for that.

Are you doing "hard real time" or just "online"? (called "soft real time" for some reason these days?)
Ori Berger Send private email
Sunday, February 05, 2006
 
 
@Art: Thanks for that links, yes, this looks promising, will read it.

@Ori: to be more detailed: it is "hard real time", even if it is not life threatening if it fails ;)

It is an audio plugin (VST framework) where you are not allowed to use up 100ms cpu time for 100ms of audio stream (even 10ms is not _super_). You have also a GUI thread and the interaction between these two (or more, it is a quite complex plugin) has a lot of parameters and display responses etc..

OS is WindowsXP and Mac OSX, btw.

It works well, but some parts of my code do not really satisfy me, therefore I wanted to research a bit. But it was difficult to find some points where to start from.

thanks for your input,
Husker
Husker Send private email
Sunday, February 05, 2006
 
 
It depends on what's wrong with the performance. Why is it slow? Out of CPU? Task switching? Locks? Algorithms? Code? IO?
son of parnas
Sunday, February 05, 2006
 
 
Well, actually my impression is that my architecture is not so well.

Here is the specific problem I am working on:

there is a quite huge amount of data chunks ("programs") that have a bulk data, a name and different attributes.

all data is organized in a tree (folders, files, chunks in the file, etc., minimum about 5000, can go up to 50000 or more).

Now parts of the tree can change at any time by the reception of a message ("exchange chunk 529 by this chunk" and such).

Now I have to maintain several lists that are 1-dimensional like "chunks that are loadable", "chunks that are overwritable", "chunks that are yellow", "chunks that are red" (so they fall into several categories).

In these lists a chunk can show up multiple times.

Everytime the "high priority" system gets such a data chunk, it stores it in a intermediate buffer and a low priority task finds it and updates the tree. Then, a lot of listeners get a message (it is updated) by calling a member function and the first one rebuilds the lists (the others take a copy) by walking the tree.

But this takes about 400-800ms (on a 1.6gHz P4) after each single change (even with only 5000 objects) and I want to have this faster. Therefore I am looking for strategies to make this more clever so the user doesn't feel all that things, because walking the tree and fill up vector<>'s is not so good.

It also takes 250ms to build up a dropdown menu holding all that entries.

Oh, and the other point is: I must also lookup things very fast, so I can not make it a linked list. Maintaining both structures would be a even bigger no-no, I guess.
Husker Send private email
Sunday, February 05, 2006
 
 
Husker, have you profiled it? Where is it taking the time? Probably in malloc, right? I bet the tree insert is not too bad, but you are allocating space for samples and it is a large space. Let me know if this is the case. If it is, I'll have more to say.
Art Wilkins
Sunday, February 05, 2006
 
 
Haven't profiled it yet, but will do and let you know.

Oh, and one of the things I am really after is: I am locking that list for every single "get entry" with a semaphore which is (what I think) much too time consuming but I have to interlock it. On the other hand I can not lock it the whole time, because then my other thread is blocked too long.

Therefore reading about locking strategies would be interesting either.
Husker Send private email
Sunday, February 05, 2006
 
 
>Husker, have you profiled it?

+10

You'll probably be quite enlightened after doing some code profiling. Sometimes algorithm changes can make dramatic improvements in performance.

I assume by "tree" you aren't referring to the MS Tree Control as that gets very slow with large numbers of items.

For locking you should only need to use critical sections which is faster than semaphore's.

Also use thread pools instead of creating and destroying threads.
Neville Franks Send private email
Sunday, February 05, 2006
 
 
The threads run during the whole lifetime of my code, not a threadpool implementation. The tree is a little selfwritten node, that uses this to hold childs:

map<string,node*> mItems;

and uses iterators to walk them along. It is more a tree in terms of directory tree than a tree as in avl-tree.

I will make an elaborated profiling day tomorrow and try to find something.
Husker Send private email
Sunday, February 05, 2006
 
 
What you are calling a tree is a STL hash map indexed by strings? And there are tens of thousands of nodes?
Art Wilkins
Sunday, February 05, 2006
 
 
How that f*ck can you post about performance without profiling?
son of parnas
Sunday, February 05, 2006
 
 
parnas, he is off doing the profiling now. Read the responses.
Art Wilkins
Monday, February 06, 2006
 
 
> parnas, he is off doing the profiling now. Read the responses.

That's grand. You post about performance without actually having done a lick of work to figure out what isn't performing.

That certainly deserves a group hug.
son of parnas
Monday, February 06, 2006
 
 
parn-ass, do you have any useful advice to contribute or are you just showing what a twit you are? It's fine if you want to, just carry on.
Art Wilkins
Monday, February 06, 2006
 
 
@son of parnas: if you would have read my original post carefully, you would have found that it is not about special performance of the code. Profiling can not be replaced by knowning your tools and as far it concerns the code itself, it is okay enough. The profiling supports this (you're welcome to read on).

@Art: the profiling results seems to support my first idea. The critical sections, even when they are the fastest synchronisation objects in this context, consume the most time, since I am locking too much. Removing the lockings reduces execution time down to (about) 20 to 40ms, but it (of course) crashes when I do not prevent my other thread to do its work.

About the map: yes, the key is a string, since I want to access them fast when resolving an object by a string path, which is a perfectly valid thing. Every iterating access ('walking the tree') uses map<>::iterator which is nearly as fast as a doubly linked list.

Btw.: the lists are vector<string>, because I need to access them randomly the most time, but I am guessing the number I need before and .reserve() it to decrease the need of massive malloc-ing (this improved also very much in the first place).

So, I am back to my original problem: find a better strategy to keep the access to the tree synchronised without overlocking it. I will read the papers you've hint me and try to figure out a better way (-> strategies for multithreaded realtime code patterns).

I'll keep you posted, if you like.

Thanks for hints and suggestions everyone.
Husker Send private email
Monday, February 06, 2006
 
 
> I am locking too much.

Profiling would have showed this and then you could have asked how do I reduce locking in my program.

Locking is most likely the result of a confused threading model. You may want to look into an approach like seda (http://www.eecs.harvard.edu/~mdw/proj/seda/).

This eliminates the need for most locking.

The normal java/c++ approach to locking is nearly impossible for humans to use in the large. Organizing an application like the erlang (http://www.erlang.org/) will create a safe high performaing app.
son of parnas
Monday, February 06, 2006
 
 
>Profiling would have showed this and then you could
>have asked how do I reduce locking in my program

My original question was this:
>Now I was searching around for a kind of "realtime best
>practices", maybe some of you could give me some pointers
>where to look and read more?

I just checked against the suggestion of Art and found my first assumption valid.

Anyway, thanks for that links, they are looking interesting.

And of course, in the next project everything gets better. :)

Cheers
Husker Send private email
Monday, February 06, 2006
 
 
> do you have any useful advice to contribute or are you
> just showing what a twit you are?

I generally try to do both.

The issue is performance issues in a RT is enormous subject. Before going in one of a thousand different directions it would be nice if the person put in at least some of their own effort so we could explore only a few of the many directions.

That's just the kind of twit I am.

Also, one thing to consider is dynamic memory is usually a hidden source of performance problems both because it can be slow depending on the allocator and because of locking.

If your app uses a lot of dynamic memory, which is fine in my book, then your threads will spend a lot of time in the memory library.

Consider using thread specific memory allocators. Consider using custom memory pools.
son of parnas
Monday, February 06, 2006
 
 
"Consider using thread specific memory allocators. Consider using custom memory pools."

Not a bad suggestion generally.

One of my 'favorite' things to see in people's hard real time threads is new/malloc/orsomehiddenversionofsame...unless they've written their own of course.
TritoneSubstitution Send private email
Monday, February 06, 2006
 
 
I was aware of the different aspects of C++ containers. Therefore the vector is reserve()'d so it does not reallocate. There is no memory allocation during that process I described except the first reservation (I knew that even without profiler).

After thinking and drawing a lot of bubbles and lines on paper I've come to the solution that I should spice up the interface, so specialised code to insert/replace/remove an element at the specific places instead of rebuilding everything should decrease the cpu time. Even if the vector<>::at and vector<>::insert costs a bit more it is better than doing a lots of vector<>::push_back()s.

It is a bit more code (which I generally dislike) but in this situation specialisation seems to be the way, since I can not redesign the whole architecture now.
Husker Send private email
Monday, February 06, 2006
 
 
>  I described except the first reservation (I knew that even without profiler).

Unless your container stores pointers you may be doing copies in out of the container, which hits the memory library.
son of parnas
Monday, February 06, 2006
 
 

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics
 
Powered by FogBugz