When it comes to networking, thread-per-connection is often considered to be Baby's First Concurrency Model: It's obvious, and works for very, very small things, but never scales very far and is quite painful to work with. There are many other ways to do networking, and properly describing them all would go beyond the scope of this writeup, but I will mention that they generally are based on the concept of multiplexing: syscalls like select() or poll() are used to list a set of connections and figure out which ones are ready to read from or write to, without extra userspace overhead.

Well, the [Minecraft] server is thread-per-connection, so it was figured that it would be a good idea to just pass the incoming socket directly to readUTF() and friends, in blocking mode. What this effectively means is that the thread containing the socket will block and yield control as needed, until enough bytes have come in to satisfy readUTF() and relinquish control back to the server. Imagine what happens if we were to give readUTF() a relatively large size, say, 65535, and then very slowly send characters down the wire to the server. The server cannot really do much about it since readUTF() is a library function; it has to wait until readUTF() has finished reading the string. This means that a malicious client could completely tie up a thread by doing this "slow trickle" of data. If enough threads are tied up, then the server will be unable to handle new requests because of resource exhaustion.

The moral of the story is, again: Don't use thread-per-connection. In Java, use NIO or MINA. There are reasonable ways to do networking in every language; let's start using some of them.
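
To make the slow-trickle point above concrete, here is a minimal sketch of how a string in readUTF()'s wire layout (a 2-byte big-endian length followed by the payload) could be read incrementally, so that no call ever waits on the peer. The class name TrickleSafeStringReader, the length cap, and the deadline parameters are all hypothetical and do not come from the Minecraft server or any library. The sketch also assumes plain UTF-8, whereas readUTF() uses Java's "modified UTF-8". The NIO Selector loop that would call feed() whenever a channel becomes readable is not shown.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of any real server or library.
final class TrickleSafeStringReader {
    private final int maxLength;       // largest declared length we are willing to accept
    private final long deadlineNanos;  // absolute deadline for receiving the whole string
    private final ByteBuffer header = ByteBuffer.allocate(2); // big-endian length prefix
    private ByteBuffer body;           // allocated once the length is known

    TrickleSafeStringReader(int maxLength, long timeoutNanos, long nowNanos) {
        this.maxLength = maxLength;
        this.deadlineNanos = nowNanos + timeoutNanos;
    }

    // Consumes whatever bytes are currently available in `in` and returns at once:
    // the decoded string if it is complete, otherwise null.
    String feed(ByteBuffer in, long nowNanos) {
        if (nowNanos - deadlineNanos > 0) {
            throw new IllegalStateException("peer took too long to send the string");
        }
        while (header.hasRemaining() && in.hasRemaining()) {
            header.put(in.get());
        }
        if (header.hasRemaining()) {
            return null; // length prefix still incomplete
        }
        if (body == null) {
            int length = ((header.get(0) & 0xFF) << 8) | (header.get(1) & 0xFF);
            if (length > maxLength) {
                throw new IllegalStateException("declared length " + length + " exceeds cap");
            }
            body = ByteBuffer.allocate(length);
        }
        while (body.hasRemaining() && in.hasRemaining()) {
            body.put(in.get());
        }
        if (body.hasRemaining()) {
            return null; // payload incomplete; the caller goes back to its event loop
        }
        // Assumption: plain UTF-8 instead of readUTF()'s modified UTF-8.
        return new String(body.array(), 0, body.capacity(), StandardCharsets.UTF_8);
    }
}
```

A client that trickles one byte at a time only ever costs the server a small buffer per connection instead of a parked thread, and the cap and deadline give the server a point at which it can drop the connection.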
Admittedly, this seems like a bit of an attention grab from the guy posting this, since he's working on his own Minecraft server implementation, but it seems like Notch has work to do. This is bad. Of course, it would've been much better if this guy had disclosed the vulnerability responsibly (which I doubt he did). Even better, if Minecraft were open source, I'm sure something like this in the code would've surfaced a long time ago and been fixed by now.