A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.
I'm unclear on something: if I have a server where a socket opened in overlapped mode is associated with an I/O completion port, and I then call WSARecvFrom to queue up a "deferred recvfrom", is it possible for WSARecvFrom to return immediately with success? If so, does it still post an I/O completion?
I am operating right now on the assumption that the answers to these questions are YES and YES.
I ask because I'm seeing some interesting behavior with my test app. I have a server that creates a socket and binds it to a UDP port, then spins up a few threads to wait for incoming messages. All the threads do is (assuming they get a successful completion) increment a counter using InterlockedIncrement and then post another receive with WSARecvFrom.
I wrote another app that slams the hell out of that server: it spins up four threads, each of which sends 1000 1 KB datagrams in a tight loop, and then quits.
I sort of expect to see dropped packets, but it seems like the server does better when it's getting hit harder. When it first starts up, my first run of the test program might only get 13 packets through, but the next run gets 1000+.
> YES and YES.
Yes I think so.
> UDP port
If you don't have an outstanding read then to some finite extent received UDP packets will be buffered in the protocol stack, waiting for your read, and then copied into your read buffer when you read.
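That per-socket buffering is tunable with SO_RCVBUF. A quick sketch, shown here with plain Berkeley sockets (the Winsock setsockopt call is the same shape on a SOCKET handle); the function name and buffer size are made up for illustration:

```c
/* Enlarge a UDP socket's kernel receive buffer via SO_RCVBUF so more
   datagrams can queue in the stack while no read is outstanding.
   Plain BSD-socket sketch; Winsock's setsockopt works the same way. */
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

int make_buffered_udp_socket(int bufbytes)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;

    /* Ask the stack for a bigger receive buffer. The kernel may clamp
       (or, on Linux, double) the value, so read it back to see what
       you actually got. */
    setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bufbytes, sizeof bufbytes);

    int actual = 0;
    socklen_t len = sizeof actual;
    getsockopt(s, SOL_SOCKET, SO_RCVBUF, &actual, &len);
    printf("requested %d bytes, got %d\n", bufbytes, actual);
    return s;
}
```

Even with a big buffer it's still finite, of course; once it fills, further datagrams are silently dropped.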
> I sort of expect to see dropped packets
Do you have both processes running on the same machine? This might be an application where you'd like to boost the thread priority of your receiving threads.
>> to some finite extent received UDP packets will be
>> buffered in the protocol stack, waiting for your read,
>> Do you have both processes running on the same machine?
Yes, although that WAS just for testing. Perhaps if I ran the packet slammer app from a different box on my LAN, the network would actually be the bottleneck and my app would see fewer misses. Basically, I had it counting the packets it received, and it was fewer than expected. I didn't make this abundantly clear in my last post.
>> This might be an application where you'd like to boost
>> the thread priority of your receiving threads.
That's interesting; I'm going to play with that. One thing I've been reading about Windows vs. Linux performance (maybe this is old information) is that Linux's prime advantage (again, at least in the past) was that its TCP/IP stack was in the kernel and didn't get paged out at all, or as much as the stack on Windows.
You'd think raising the thread priority on the workers could improve things. However, I'm doubtful that Windows is as bad as some of those tests claimed. Regardless, I know I'd MUCH rather program async I/O for Windows than Linux given the choice - I/O Completion Ports are totally awesome.
Try it and see: send one packet after the socket is opened but before you post your first read.
> Yes, although that WAS just for testing.
I'd say that's an imperfect test.
> TCP/IP stack was in the kernel
That may be, but if you're running two processes on one machine, and your (user-mode) application's receiving threads aren't high priority, then they may become CPU-starved and not read as often as you'd like them to. I'd give my threads "real time priority", bearing in mind that you can kill the machine (needing a hard reboot to recover) if you ever let a real-time-priority thread spin forever.
Some people say yes and some say no: see http://www.google.ca/search?hl=en&safe=off&q=SO_RCVBUF+udp for further details.
The answers are YES and YES. See http://support.microsoft.com/default.aspx?scid=kb;en-us;Q192800 for details.
I think your testing is probably broken. First, run on a different machine. Second, I'd personally run the thrashing app at lower levels of packet sending first and slowly increase it to see when the server stops receiving the datagrams. Third, I'd probably run a packet sniffer on the LAN to make sure that the sender was actually putting X datagrams onto the wire at the point where your receiver starts to receive fewer than expected...
I've done a fair amount of work with high-performance IOCP servers. Most of the stuff that I give away is TCP servers: http://www.lenholgate.com/archives/000637.html but there is an old (and rather nasty) UDP server here: http://www.lenholgate.com/archives/000088.html . I've recently done a lot of performance work on the IOCP framework that I use, and my VOIP clients have been very happy with the UDP-over-IOCP performance (which is why I think your testing is broken. ;) It's sometimes difficult to be sure you're actually testing the right thing with your tests...)
Tuesday, July 11, 2006
Thanks, I will check out the links you've posted. Having taken some time to reflect, I think the testing is almost certainly broken.
My guess is that since the sending application didn't truly have to go over a real network, it was able to send far more datagrams than would otherwise be possible. (I assume the TCP/IP stack is smart enough to short-circuit local delivery.)
I slowed down the sending app a bit and saw near-perfect delivery, so my hypothesis seems right.
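The slowed-down sender amounts to something like the sketch below, shown with portable BSD sockets. The port, datagram count, and pacing interval are all made up for illustration, not taken from the original test app:

```c
/* Paced UDP sender sketch: space datagrams out instead of blasting them
   in a tight loop, so a same-machine receiver's socket buffer isn't
   overrun. Port 9999 and the pacing value are illustrative only. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int send_paced(int count, int pace_us)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;

    struct sockaddr_in to;
    memset(&to, 0, sizeof to);
    to.sin_family      = AF_INET;
    to.sin_port        = htons(9999);            /* illustrative port */
    to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    char payload[1024] = {0};                    /* 1 KB datagram */
    int sent = 0;
    for (int i = 0; i < count; i++) {
        if (sendto(s, payload, sizeof payload, 0,
                   (struct sockaddr *)&to, sizeof to) == sizeof payload)
            sent++;
        usleep(pace_us);   /* the "slow down" that fixed the test */
    }
    close(s);
    return sent;
}
```

Stepping the pace down gradually, as suggested earlier in the thread, would show roughly where the receiver starts losing datagrams.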
I can't emphasize enough how much I appreciate the help. There seems to be a lot of misinformation on this topic: the vast majority of the sample code I've checked out is incorrect for one reason or another, or makes assumptions that I don't believe are safe to make in the real world.
I almost wish there were a section on JOS containing peer-reviewed samples.
Thanks again, Len.
This topic is archived. No further replies will be accepted.