The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

10,000+ clients with Java?

I've decided to go ahead with the next generation of our server using Java.  We will need to maintain connections with 10,000+ clients at one time.

I know I can do this in C and/or C++ (I've done it, lots).  And the Java people on my team, as well as other Java people I have spoken to, assure me it will be fine.

The main reasons for going to Java are its concurrency support compared to C++, and developer expertise.

By the way, we'll be using MINA (http://mina.apache.org), as it follows patterns I'm familiar with from previous work.

Doable?  My initial prototyping tells me yes, but I'd like to hear what others have experienced.

Thanks.
nimrod Send private email
Tuesday, March 27, 2007
 
 
I read that Java runtimes (with 1.5) are now about 1.1x those of equivalent C++ code, so only about 10% slower.  Not bad at all. Java had a bad performance reputation at the beginning, but it has really closed the gap in the last few releases.

Go for it.
flameThis
Tuesday, March 27, 2007
 
 
Yes, I was happy with the performance.

For interest's sake, I wrote a small program that did some of the processing we commonly do.  I won't post the details, as it's very application-specific and benchmarks are open to interpretation, but at any rate our results were:

Pure python: 9s
Java 1.4 (no JIT): 8s
Java 1.4: 4s
Java 1.5: 1.1s
Java 1.6: 0.8s
C: 0.45s

Take this with a grain of salt, but these were my observations.  The Java speed is fine, and there is always JNI if we hit a problem area as well.

Thanks - hearing it from someone else, whoever that might be, helps give me that warm fuzzy feeling.
nimrod Send private email
Tuesday, March 27, 2007
 
 
Just out of curiosity, what is your "client"? I mean, fat Win32 vs. Browser vs. other servers vs. mobile or... ?
Greg Send private email
Tuesday, March 27, 2007
 
 
There are two types of clients: a "fat" client written in Java that is the user interface, and agents written in C and Python.
nimrod Send private email
Tuesday, March 27, 2007
 
 
If you need that many clients, your bigger problem will be the sheer number of sockets that you have to have open at any given time.  Some socket management strategies/tools keep a thread for each one... while that may work when you have <50 connections, it doesn't scale up very well.
KC Send private email
Tuesday, March 27, 2007
 
 
Look at the java.nio package - it defines an API (similar to Unix select()) that allows you to multiplex I/O across multiple sockets in a single thread. For 10,000 clients, you'll want that.

Realistically, you probably want multiple processes anyway: front-end processes connecting directly to clients, acting as proxies for the back ends.
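
Roughly, the selector loop looks like this (just a sketch, not production code; the port and buffer size are made up, and a real server would keep per-connection state and handle errors):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SelectorSketch {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();

        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(4096);

        while (true) {
            selector.select();  // blocks until at least one channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();

                if (key.isAcceptable()) {
                    // new client: switch it to non-blocking and watch for reads
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buf.clear();
                    int n = client.read(buf);
                    if (n == -1) {          // client hung up
                        key.cancel();
                        client.close();
                    } else {
                        buf.flip();
                        // hand the bytes to your protocol handler here
                    }
                }
            }
        }
    }
}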
Early30sNerd Send private email
Tuesday, March 27, 2007
 
 
Just curious...
I assume that by "maintain a connection" you mean a TCP/IP socket connection.
What's your target OS for your server?
How does your application handle client connections which are not terminated gracefully (e.g. a modem/router/switch/hub was shut down, a LAN cable removed, etc.)?

We were in a similar situation (but at a relatively smaller scale) and made the mistake of not using the NIO library. We ended up tweaking the OS (Red Hat AS) in order to allow more threads to be created.
jhanx Send private email
Tuesday, March 27, 2007
 
 
NIO is part of what makes the project even feasible with Java.  I'd never consider writing a server using one thread per client (part of my background I guess).

But some of the comments now have me wondering what backend the Java selector uses.  I've done single-threaded servers that handle 10,000 clients using kqueue and epoll - I hope Java uses the best underlying event system for the given OS.

I also came across a MINA mailing list post where someone had tested up to 54,000 concurrent TCP clients.  So I think my requirements are doable.
nimrod Send private email
Tuesday, March 27, 2007
 
 
I don't understand you guys with the single-threaded approach: if one client is being served, all the others would have to wait.

The best approach is to have a single thread which takes in the request and then dispatches it to a worker thread, selected from a thread pool.
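
In Java terms that's roughly the following (a sketch only; handle() is a hypothetical request handler and the pool size is just a placeholder):

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AcceptorWithPool {
    public static void main(String[] args) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(32);  // fixed worker pool
        ServerSocket listener = new ServerSocket(8080);

        while (true) {
            final Socket client = listener.accept();   // the single "acceptor" thread
            workers.execute(new Runnable() {           // dispatch to a pooled worker
                public void run() {
                    handle(client);
                }
            });
        }
    }

    static void handle(Socket client) {
        try {
            // read the request, do the work, write the response
        } finally {
            try { client.close(); } catch (IOException ignored) { /* nothing to do */ }
        }
    }
}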
Achilleas Margaritis
Wednesday, March 28, 2007
 
 
Single-threaded works fine; it depends on what you are doing.  If all your actions are non-blocking, you can scale really well with a single thread (you decide how long an operation can take and still count as non-blocking).  With libpq and async queries (PostgreSQL) you can even write database-driven apps with a single thread.  It takes more work, as you have to build up state machines and manage callbacks, but you'll never have to debug threads.

My current server is like this.  It's a mix of C and Python and handles 5,000 or so clients reasonably well with low latency.

Of course, you only use one CPU here, but you can create a new select()[1] for each CPU you have.

[1] I use select() as a generic term; it could be epoll, kqueue, etc.
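
In the Java/NIO version the same idea maps onto a per-connection state object attached to each SelectionKey - a rough sketch (the protocol states here are made up):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

// One small state machine per connection, attached to its SelectionKey
// via key.attach(new Connection()) after registering the channel.
class Connection {
    enum State { READING_HEADER, READING_BODY, WRITING_REPLY }  // made-up protocol states

    private State state = State.READING_HEADER;
    private final ByteBuffer buf = ByteBuffer.allocate(8192);

    // Called from the single selector thread whenever the channel is readable.
    void onReadable(SelectionKey key) throws IOException {
        SocketChannel ch = (SocketChannel) key.channel();
        if (ch.read(buf) == -1) {       // peer closed the connection
            key.cancel();
            ch.close();
            return;
        }
        switch (state) {
            case READING_HEADER:
                // parse header bytes out of buf; once complete:
                state = State.READING_BODY;
                break;
            case READING_BODY:
                // parse the body; once complete, ask to be told when we can write
                state = State.WRITING_REPLY;
                key.interestOps(SelectionKey.OP_WRITE);
                break;
            default:
                break;
        }
    }
}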
nimrod Send private email
Wednesday, March 28, 2007
 
 
I wouldn't worry about how fast Java is; I would worry about creating a horizontally scaling architecture so you can easily handle the load. Don't try to shave every cycle to support all the clients in one process/thread or even one machine.
son of parnas
Wednesday, March 28, 2007
 
 
It doesn't have to be an either-or decision. There are more options than just "single thread" vs. "different thread for each connection"; these are just the most straightforward to implement. There is also a fixed-size pool of threads, which provides benefits of both approaches but is a little harder to implement.
anony mouse
Wednesday, March 28, 2007
 
 
I agree with son of parnas: architecture is more important than language.  Any single-threaded or single-process approach is going to be unable to scale on an SMP machine.

Is there any mechanism in Java similar to I/O Completion Ports?  Windows, and I believe Solaris 10, have an elegant mechanism available for performing I/O using a limited/fixed thread pool.

As the previous poster suggested, a design that uses a fixed-size thread pool might require a bit more thought, but may perform ideally.  The argument goes something like this:  On a machine with N processor cores, only N threads are *really* running at a time anyway, so the ideal number of threads is usually something like N*K, where K is some constant that probably depends on the application.
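
In Java that's a one-liner to set up (K = 4 here is just an arbitrary example; you'd tune it by measuring):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolSizing {
    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();  // processor cores visible to the JVM
        int k = 4;                                           // application-dependent constant
        ExecutorService pool = Executors.newFixedThreadPool(n * k);
        System.out.println("worker threads: " + (n * k));
        pool.shutdown();
    }
}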
Meganonymous Rex Send private email
Wednesday, March 28, 2007
 
 
>> As the previous poster suggested, a design that uses a fixed-size thread pool might require a bit more thought, but may perform ideally.  The argument goes something like this:  On a machine with N processor cores, only N threads are *really* running at a time anyway, so the ideal number of threads is usually something like N*K, where K is some constant that probably depends on the application. <<

+1

I was going to respond with the same basic thought. Well said.
Berfert Send private email
Wednesday, March 28, 2007
 
 
I haven't seen Erlang mentioned in a while...

How about Erlang?
*myName
Wednesday, March 28, 2007
 
 
Achilleas:

Single-threaded servers can perform very well so long as the programmer employs the non-blocking versions of the socket routines.

In this scenario, a single thread monitors a set of socket descriptors (representing open connections and possibly the listener socket so that new connections can be made) using the select() function.  When I/O completes on one of the sockets, the server iterates through the socket list to see which descriptors are ready for I/O, does the work and then loops back to the select.  I believe that this is typically referred to as "multiplexing."

You simply would not use blocking socket functions in a server like this, so the problem you mention does not exist.

One might say, "OK, but it doesn't scale to multiple processors."  I'm not a UNIX programmer, but I believe that the way traditional servers worked in the absence of threads was to fork off multiple processes, each of which used the scheme described above.  Each of these children is scheduled independently by the kernel, and the number of forks could be tuned to maximize performance in the same way as described in my previous post.

The one advantage that threads have is that they are in the same process address space.  In the traditional fork() world, if it is important for the child worker processes to communicate in some way, they do so via shared memory, pipes, etc.

The goal in all of this is to keep the processor saturated with work.  If you have too few threads/processes, then the CPU goes idle.  If you have too many, the overhead of swapping out stacks for thread/process context switches starts to hurt you.

Any model that uses one thread or one process per connection is probably not scalable to 10,000 users though, unless the connections are not persistent like HTTP.
Meganonymous Rex Send private email
Wednesday, March 28, 2007
 
 
On Windows, you modify a socket's blocking behavior by calling ioctlsocket() with the FIONBIO flag set to TRUE for "non-blocking", and I believe on UNIX the function used is fcntl().

After a socket is set to non-blocking, calling a function like read() will not block. If the call would have blocked because no data was available, on UNIX it fails with EAGAIN.  On Windows, it's WSAEWOULDBLOCK.

One important thing to note for those still listening is that on UNIX, socket calls _can_ fail with EINTR if signals are delivered (like SIGALRM), so you have to watch for this in a real production program.  I am told this behavior varies by platform, so watch out.
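
For anyone doing this from Java, NIO wraps those platform details; a rough equivalent (host and port are placeholders):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

public class NonBlockingRead {
    public static void main(String[] args) throws IOException {
        SocketChannel ch = SocketChannel.open();
        ch.configureBlocking(false);             // NIO's equivalent of fcntl()/ioctlsocket()
        ch.connect(new InetSocketAddress("example.com", 80));

        while (!ch.finishConnect()) {
            // connect still in progress; a real program would wait on a Selector, not spin
        }

        ByteBuffer buf = ByteBuffer.allocate(1024);
        int n = ch.read(buf);   // never blocks: returns 0 if nothing is available yet,
                                // -1 at end-of-stream
        System.out.println("read " + n + " bytes");
        ch.close();
    }
}

As far as I know you never see EINTR at the Java level; the would-block case just shows up as a zero-byte read rather than an error code.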
Meganonymous Rex Send private email
Wednesday, March 28, 2007
 
 
Meganonymous Rex,

I was not referring to blocking sockets but to the processing done on received data. If the server has to run part of an application for each incoming request, then a single-threaded server is not a viable solution: subsequent requests will wait for the processing of the current request, even with a non-blocking solution.
Achilleas Margaritis
Thursday, March 29, 2007
 
 
> I was not referring to blocking sockets but to processing done on received data.

In a typical application, it's only the I/O (network I/O, disk I/O, and database I/O) that's slow: so it's sufficient if the I/O is asynchronous.
Christopher Wells Send private email
Thursday, March 29, 2007
 
 
I worked on software where we did 1000+ transactions per second with the single-threaded approach.  Threads suck.
Early30sNerd Send private email
Thursday, March 29, 2007
 
 
"In a typical application, it's only the I/O (network I/O, disk I/O, and database I/O) that's slow: so it's sufficient if the I/O is asynchronous"

It depends on the application: for a static content server (HTML pages, files, etc.), the single-threaded approach is adequate. But if the code needs to run some calculation for each user which may take some time, then a single-threaded approach is not enough.
Achilleas Margaritis
Friday, March 30, 2007
 
 
Achilleas:  I see what you're saying.  It's probably application-dependent.  The question really is: where's the bottleneck?  What I was trying to say was: if the answer is the network, then multiplexing is probably just as good as anything.
Meganonymous Rex
Friday, March 30, 2007
 
 
>> threads suck

I keep hearing this.  I hear it from fairly smart people who have worked in the industry for a long time.

I've come to the conclusion that "they suck" because many people are confused about the issues and/or are impatient and don't want to spend the extra time up-front to design multithreaded systems properly.

I'm working on debugging a thread related problem right now.  The architect who designed the system can't figure it out.  He says, "threads suck."

No they don't, he just didn't think it through before he started.  People just aren't rigorous.
Meganonymous Rex
Friday, March 30, 2007
 
 
> People just aren't rigorous.

Yes, that's why threads suck. Understand now?
son of parnas
Saturday, March 31, 2007
 
 
1) Java's libraries are designed for scalability. It might not outperform C/C++ solutions on a single CPU but it is said to perform very well across multiple CPUs.

2) It is my understanding that the world's most massive servers use Java on the backend and scale the heck out of it, so you can trust that you're not the first to go down this road and it's likely to work just fine.

3) There has been a lot of talk of multiplexing I/O; that's exactly what NIO does. So yes, Java has that.

4) Another thing to consider is that in my experience disk I/O does not scale well at all. At least under Windows, two tasks run faster sequentially than if they both run at the same time in different threads. This is mostly because the disk seeks all over the place instead of scheduling I/O along the direction the disk rotates. It is my understanding there isn't a good way to fix this under preemptive OSes (correct me if I'm wrong). In any case, this might be a point for going single-threaded if you're I/O bound. If you're mostly computationally bound, I would suggest using multiple threads, because the ability to service more clients simultaneously probably outweighs the overhead of multitasking.

Gili
Gili Tzabari Send private email
Sunday, April 01, 2007
 
 
> It is my understanding there isn't a good way to fix this under preemptive OSs (correct me if I'm wrong).

I'd expect this is up to the disk driver (software) and controller (hardware): I think they should be able to accept multiple concurrent requests, and satisfy them in a physically efficient sequence.
Christopher Wells Send private email
Monday, April 02, 2007
 
 

This topic is archived. No further replies will be accepted.
