Multithreading + TCP/IP = potential WTF



  •  Long time reader, first-time poster, etc.

    I'm currently working on a custom backend server application (under Windows, using VS2005 / C#) that will have thousands of unique TCP/IP clients connecting to it (potentially tens of thousands per instance, depending on hardware and optimization; if I can handle 5K connections per box, that will be a good start). I have all the concurrency issues taken care of, but I'm looking for a decidedly non-WTF way of handling that many connections at an architectural level.

    The clients typically connect and send requests for data, at which point the server parses the request, pulls in data from elsewhere as needed, and returns it. The server needs to have the ability to "notify" clients of status changes, so I'm unable to just loop between blocking for a read and returning the response to the client. In addition, the connection will remain open until one side kills it; it's not just a simple "request, respond, disconnect" sort of procedure.

    The "textbook" way to handle clients like this is to spawn off a thread as each client connects and handle all the I/O in there. Since I need to be able to send messages across the pipe while blocking for input, I think I'd need two threads per client in order to handle both a send and receive queue (barring something like a shared write queue across all clients, but that causes its own bottlenecks). This doesn't sound like the best way to handle this, however, as the context switching alone would cause a ton of overhead.

    Given this, I have a couple of questions:

    * Am I off base in my assumptions above re: requiring as many as two threads per unique client?

    * If the threads will be blocking 99.99% of the time waiting for I/O, should I be concerned about the cost of context switching to them just to switch away again, or is the Windows (2003+) scheduler "smart" enough to realize there's nothing to do on them and just bypass them entirely?

    * If I need multiple threads and context switching is going to be an issue, does anyone have any tips as to how larger applications like this typically handle thousands of unique connections?

    Sorry for the wall of text, but I wanted to make sure I framed the questions properly. Any input on this would be greatly appreciated.



  • @Corvidae said:

     Long time reader, first-time poster, etc.

    This meme should die. We can see your post count.
    @Corvidae said:
    * Am I off base in my assumptions above re: requiring as many as two threads per unique client?

    One thread for a few clients could work. But one thread per socket would be easiest. Doing two threads is WTFy.
    @Corvidae said:
    * If the threads will be blocking 99.99% of the time waiting for I/O, should I be concerned about the cost of context switching to them just to switch away again, or is the Windows (2003+) scheduler "smart" enough to realize there's nothing to do on them and just bypass them entirely?

    With select() you don't need to block. As for the question as stated: if you use the native socket API the scheduler is informed correctly of the situation and behaves accordingly.
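    In C#, select() surfaces as Socket.Select. A minimal single-threaded sketch of the idea (the echo handling and all names here are illustrative, not your actual protocol):

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Text;

public class SelectLoopSketch
{
    // One pass of the loop: check the listener and every client for readability.
    public static void PumpOnce(Socket listener, List<Socket> clients, int timeoutMicros)
    {
        List<Socket> readable = new List<Socket>(clients);
        readable.Add(listener);

        // Socket.Select trims the list down to the sockets that are actually ready.
        Socket.Select(readable, null, null, timeoutMicros);

        byte[] buffer = new byte[4096];
        foreach (Socket s in readable)
        {
            if (s == listener)
            {
                clients.Add(listener.Accept());           // readiness was signalled, won't block
            }
            else
            {
                int n = s.Receive(buffer);                // also guaranteed not to block
                if (n == 0) { clients.Remove(s); s.Close(); }
                else s.Send(buffer, n, SocketFlags.None); // demo: echo; real code parses here
            }
        }
    }

    // Loopback demo: one client connects, sends "ping", and reads the echo back.
    public static string Demo()
    {
        Socket listener = new Socket(AddressFamily.InterNetwork,
                                     SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        listener.Listen(10);

        Socket client = new Socket(AddressFamily.InterNetwork,
                                   SocketType.Stream, ProtocolType.Tcp);
        client.Connect(listener.LocalEndPoint);

        List<Socket> clients = new List<Socket>();
        PumpOnce(listener, clients, 500000);              // pass 1: accept the connection
        client.Send(Encoding.ASCII.GetBytes("ping"));
        PumpOnce(listener, clients, 500000);              // pass 2: read and echo

        byte[] buf = new byte[16];
        int n = client.Receive(buf);
        return Encoding.ASCII.GetString(buf, 0, n);
    }

    public static void Main()
    {
        Console.WriteLine(Demo());                        // prints "ping"
    }
}
```

    In a real server you'd run PumpOnce in a loop on one (or a few) dedicated threads instead of one thread per socket.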
    @Corvidae said:
    If I need multiple threads and context switching is going to be an issue, does anyone have any tips as to how larger applications like this typically handle thousands of unique connections?

    Look at the code for an open source IRC or HTTP server.



  • Perhaps this is simply me applying my hammer to every nail I see, but wouldn't it be a lot easier if you could adapt the client to actually do the whole "request, respond, disconnect" procedure? At least in my eyes it would make the whole deal a lot easier, because you could just get an off-the-shelf HTTP server and implement the server logic without having to worry about the communication. Add to that all the wonderful performance/caching stuff that is available for HTTP, and scaling probably won't be a problem.



  • @Lingerance said:

    @Corvidae said:
     Long time reader, first-time poster, etc.

    This meme should die. We can see your post count.

    This was only intended to be a "I think I kind of know how things work around here, even though I just created an account" thing, not as an attempt to look trendy with a (lame) meme.

     


    @Lingerance said:

    One thread for a few clients could work. But one thread per socket would be easiest. Doing two threads is WTFy.

    One thread per socket I understand. How can I get separate read and write queues with just one thread, though, considering that one direction can't block the other, short of having state flags and spinlocking the thread as it keeps checking to see if I have anything to read/write? That strikes me as a far worse solution.

     

    @Lingerance said:

    With select() you don't need to block. As for the question as stated: if you use the native socket API the scheduler is informed correctly of the situation and behaves accordingly.

    I'm using C#, but I'll assume for now that it's being converted to raw Win32 socket commands and will work the same way. Thanks for the confirmation (I seem to remember the Win32 scheduler being crap in that regard, but that was a long time ago).

     


    @Lingerance said:

    Look at the code for an open source IRC or HTTP server.
     

    The problem with doing that is licensing pollution; unless I went with an old BSD package or something, I could run into legal issues with that. Then again, POSIX is POSIX, so...

     

    @stratos said:

    Perhaps this is simply me applying my hammer to every nail I see, but
    wouldn't it be a lot easier if you could adapt the client to actually
    do the whole "request, respond, disconnect" procedure.

    I need to be able to keep sessions persistent for security and some other architectural concerns. If I have the clients disconnect after each request, then the server can't update them with additional information unless we have the clients open ports for the server to connect to, which will cause a bunch of problems with firewalls, NAT, etc. Much simpler to just keep the connection alive, provided we can make it scale sufficiently.

     

    Thanks for the input!

     

     



  • @Corvidae said:


    @Lingerance said:

    Look at the code for an open source IRC or HTTP server.
     

    The problem with doing that is licensing pollution; unless I went with an old BSD package or something, I could run into legal issues with that. Then again, POSIX is POSIX, so...


    My comment "look at the code" was meant to be read as "look at how someone else does it", not "copy their code".

    @Corvidae said:

    I need to be able to keep sessions persistent for security and some other architectural concerns. If I have the clients disconnect after each request, then the server can't update them with additional information unless we have the clients open ports for the server to connect to, which will cause a bunch of problems with firewalls, NAT, etc. Much simpler to just keep the connection alive, provided we can make it scale sufficiently.

    You can actually do this with HTTP. There's a technique called "comet" (IIRC) for this purpose.



  • If you don't want two threads per client, you can use one thread per client (for reading from the client), and then push out notifications using a thread pool or a set number of worker threads (say, 5 or 10).
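    A sketch of that layout, showing just the shared notification queue drained by a small fixed pool of writers (per-client reader threads omitted). This assumes .NET 2.0-era primitives, since BlockingCollection<T> doesn't exist until 4.0, and all the names are made up:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hand-rolled blocking queue: pre-4.0 .NET has no BlockingCollection<T>.
public class NotificationQueue
{
    private readonly Queue<KeyValuePair<string, string>> items =
        new Queue<KeyValuePair<string, string>>();
    private bool closed;

    public void Enqueue(string clientId, string message)
    {
        lock (items)
        {
            items.Enqueue(new KeyValuePair<string, string>(clientId, message));
            Monitor.Pulse(items);                 // wake one sleeping worker
        }
    }

    public void Close()
    {
        lock (items) { closed = true; Monitor.PulseAll(items); }
    }

    // Blocks until an item arrives; returns false once closed and drained.
    public bool TryDequeue(out KeyValuePair<string, string> item)
    {
        lock (items)
        {
            while (items.Count == 0 && !closed) Monitor.Wait(items);
            if (items.Count == 0)
            {
                item = new KeyValuePair<string, string>();
                return false;
            }
            item = items.Dequeue();
            return true;
        }
    }
}

public class WriterPoolSketch
{
    // Demo: four writers drain notifications for 100 clients via one shared queue.
    public static List<string> Demo()
    {
        NotificationQueue q = new NotificationQueue();
        List<string> sent = new List<string>();   // stand-in for socket.Send()

        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = new Thread(delegate()
            {
                KeyValuePair<string, string> item;
                while (q.TryDequeue(out item))
                {
                    lock (sent) { sent.Add(item.Key + ":" + item.Value); }
                }
            });
            workers[i].Start();
        }

        for (int c = 0; c < 100; c++) q.Enqueue("client" + c, "status changed");
        q.Close();
        foreach (Thread t in workers) t.Join();
        return sent;
    }

    public static void Main()
    {
        Console.WriteLine(Demo().Count);          // prints 100
    }
}
```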

    Context switching is usually discussed in terms of processes; switching between threads in the same process is comparatively cheap. The .NET framework has also had a lot of work put into threading (especially for the upcoming 4.0 release), so it should be fast.

     

    I wouldn't be concerned about having thousands of threads, but you can always prototype it. Create two programs: one that acts as the server and one that simulates thousands of clients, and do some performance testing on them.



  • @Corvidae said:

    One thread per socket I understand. How can I get separate read and write queues with just one thread, though, considering that one direction can't block the other, short of having state flags and spinlocking the thread as it keeps checking to see if I have anything to read/write? That strikes me as a far worse solution.

     

     

    The one thread will only read. For writing, you will use a thread pool or worker threads inside your notification-pushing module.



  • Why can't asynchronous socket methods be used (BeginReceive, BeginSend, etc.)? These use the thread pool internally. You can use a secondary processing queue that receives inputs from multiple clients, so that you release the thread-pool thread to do the actual processing.
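    For reference, a minimal BeginReceive/BeginSend loop. The echo and the loopback demo are illustrative only; in the real server, the receive callback would hand the payload off to that secondary processing queue instead of echoing:

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

public class AsyncEchoSketch
{
    private readonly Socket socket;
    private readonly byte[] buffer = new byte[4096];

    public AsyncEchoSketch(Socket s)
    {
        socket = s;
        // No thread is tied up waiting: the pool invokes OnReceive when data arrives.
        socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceive, null);
    }

    private void OnReceive(IAsyncResult ar)
    {
        int n = socket.EndReceive(ar);
        if (n == 0) { socket.Close(); return; }

        // Demo: echo the bytes back, then re-arm the receive.
        socket.BeginSend(buffer, 0, n, SocketFlags.None, delegate(IAsyncResult sar)
        {
            socket.EndSend(sar);
            socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceive, null);
        }, null);
    }

    // Loopback round trip to show the callbacks firing.
    public static string Demo()
    {
        Socket listener = new Socket(AddressFamily.InterNetwork,
                                     SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        listener.Listen(10);

        Socket client = new Socket(AddressFamily.InterNetwork,
                                   SocketType.Stream, ProtocolType.Tcp);
        client.Connect(listener.LocalEndPoint);
        new AsyncEchoSketch(listener.Accept());

        client.Send(Encoding.ASCII.GetBytes("hello"));
        byte[] buf = new byte[16];
        int n = client.Receive(buf);              // blocks until the echo comes back
        return Encoding.ASCII.GetString(buf, 0, n);
    }

    public static void Main()
    {
        Console.WriteLine(Demo());                // prints "hello"
    }
}
```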


  • Discourse touched me in a no-no place

    Is it possible to architect this so that the client ASKS for status updates? You'll probably get better client-to-server density that way than by holding connections open, even without considering the possible overhead of having the server update the client on its own.

     Example transaction: (One day I will actually write a network protocol that uses syntax like this.)

    <Client> Hi, can you plz do all this IO crap for me?
    <Server> Yes. Your transaction ID is "Cheesecake"
    <Disconnect>

    ... A few seconds later.
    <Client> I'm Cheesecake. Are you done yet?
    <Server> No
    <Disconnect>

    ... A few seconds later.
    <Client> I'm Cheesecake. Are you done yet?
    <Server> No
    <Disconnect>



    ... A few seconds later.
    <Client> I'm Cheesecake. Are you done yet?
    <Server> Yes. Here, have a big pile of XML.
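    The server side of that transaction could be sketched as a hypothetical in-memory job ledger keyed by transaction ID (the class and method names here are invented for illustration):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical ledger of long-running jobs, keyed by transaction ID.
public class TransactionStore
{
    private readonly Dictionary<string, string> results = new Dictionary<string, string>();
    private readonly Dictionary<string, bool> pending = new Dictionary<string, bool>();
    private int counter;

    // "Hi, can you plz do all this IO crap for me?" -> hand back an ID immediately.
    public string Begin()
    {
        string id = "Cheesecake-" + (++counter);
        pending[id] = true;
        return id;
    }

    // A background worker calls this once the slow I/O finishes.
    public void Complete(string id, string result)
    {
        pending.Remove(id);
        results[id] = result;
    }

    // "I'm Cheesecake. Are you done yet?"
    public string Poll(string id)
    {
        string result;
        if (results.TryGetValue(id, out result)) return result;
        return pending.ContainsKey(id) ? "No" : "Unknown transaction";
    }

    public static void Main()
    {
        TransactionStore store = new TransactionStore();
        string id = store.Begin();
        Console.WriteLine(store.Poll(id));        // prints "No"
        store.Complete(id, "a big pile of XML");
        Console.WriteLine(store.Poll(id));        // prints "a big pile of XML"
    }
}
```

    Since every poll carries the transaction ID, no connection needs to stay open between requests.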

     



  • @Corvidae said:

    The "textbook" way to handle clients like this is to spawn off a thread as each client connects and handle all the I/O in there. Since I need to be able to send messages across the pipe while blocking for input, I think I'd need two threads per client in order to handle both a send and receive queue (barring something like a shared write queue across all clients, but that causes its own bottlenecks). This doesn't sound like the best way to handle this, however, as the context switching alone would cause a ton of overhead.
     

     I don't think this is the textbook way of handling multiple connections. It's more of a textbook way to explain why threading may be useful. As with most things in textbooks, it's a great concept but probably won't work in practice...

     I don't think that having one thread per connection is a good idea. Let's say that you're accepting 5000 connections. You won't be able to process all of them simultaneously, since you'll run into CPU and I/O bottlenecks. If your server has 4 CPUs with 8 cores each, you can process 32 requests simultaneously. That's it. Spawning more threads makes it easy on you as a programmer, but I don't think that it'll work well.

     I'd start out with having one thread accept connections and create a session state object for each connection. This thread should only accept the connection, but not process any data; all it needs to do is plop the session object into a queue. Create a thread pool with enough worker threads to keep all your cores busy and have it process the work queue.

     Each session object should have a handle for the TCP/IP connection, which you can poll for data, and a message queue of data that needs to be sent back to the client. You could either have a homogeneous thread pool where every thread receives (and processes) messages, then sends out any responses that may be queued, or have one pool that receives/processes and one that transmits responses. To make this more efficient, you can use the select() API to pick out only those connections that actually have data waiting instead of looping over the entire set.

     Once you deal with thousands of connections you'll also quickly run into the limitations of the operating system's TCP/IP stack. You'll have to fine-tune all sorts of buffer and connection related settings to be able to hold this many connections open simultaneously.
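    A compact sketch of the session object and acceptor described above. All field and method names are invented for illustration, and the polling uses Socket.Poll as a non-blocking readiness check:

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;

// Per-connection state: the socket plus a queue of responses awaiting transmission.
public class Session
{
    public Socket Connection;
    public Queue<byte[]> Outbound = new Queue<byte[]>();

    public Session(Socket s) { Connection = s; }

    // Non-blocking check: does this socket have bytes to read right now?
    public bool HasInput()
    {
        return Connection.Poll(0, SelectMode.SelectRead);
    }

    // Send queued responses, but never block a worker on a slow client.
    public int FlushOutbound()
    {
        int flushed = 0;
        while (Outbound.Count > 0 && Connection.Poll(0, SelectMode.SelectWrite))
        {
            Connection.Send(Outbound.Dequeue());
            flushed++;
        }
        return flushed;
    }
}

public class AcceptorSketch
{
    // The accept thread only wraps the socket and plops the session into the queue.
    public static Session AcceptOne(Socket listener, Queue<Session> workQueue)
    {
        Session session = new Session(listener.Accept());
        lock (workQueue) { workQueue.Enqueue(session); }
        return session;
    }

    // Loopback demo: accept one client and flush one queued response to it.
    public static int Demo()
    {
        Socket listener = new Socket(AddressFamily.InterNetwork,
                                     SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        listener.Listen(10);

        Socket client = new Socket(AddressFamily.InterNetwork,
                                   SocketType.Stream, ProtocolType.Tcp);
        client.Connect(listener.LocalEndPoint);

        Queue<Session> workQueue = new Queue<Session>();
        Session s = AcceptOne(listener, workQueue);
        s.Outbound.Enqueue(new byte[] { 1, 2, 3 });
        return s.FlushOutbound();
    }

    public static void Main()
    {
        Console.WriteLine(Demo());   // prints 1
    }
}
```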



  • @Lingerance said:

    @Corvidae said:
    If I need multiple threads and context switching is going to be an issue, does anyone have any tips as to how larger applications like this typically handle thousands of unique connections?

    Look at the code for an open source IRC or HTTP server.

    Totally not my area but.... [url=http://haproxy.1wt.eu/]HAProxy[/url] might be worth a look as far as the simultaneous connections issues.



  •  Async I/O is the way to go, as has already been mentioned. If you were coding directly to Win32 then I/O Completion Ports would be the solution.

