Eclipse from the bottom up

The view of the Eclipse world from the bottom of the stack

Monday, January 15, 2007

Responsive Remote Connections

We've had a couple of contributions for fetching contents directly from a URL:
These are great ideas but the problem is that the contributions don't treat accessing the contents from the remote system any differently than accessing contents from the local file system. Unfortunately, there are several issues related to fetching content over remote connections that make this problematic. Some problems were less severe, such as manual cancellation not being instantaneous, while others were more severe, such as Eclipse hanging and requiring a restart.

We encountered these issues several years ago with the CVS client and came up with solutions that allowed the remote connections to remain responsive. I thought it would be worthwhile to summarize the issues and what we came up with to solve the problems. I would be curious to know if others have encountered similar problems and, if so, how they solved them. Also, I'd be curious to know if the situation has improved in later versions of Java (our coding was done against 1.4 or earlier).

Problem 1: Connecting to a Socket may hang indefinitely

When initially connecting to a socket in Java, there is no way to specify a timeout. In most cases, this is not a problem as the connection will either succeed in a fairly short time or fail fairly quickly. However, we found that, under some circumstances, the connection attempt would hang indefinitely.

To work around this, we wrote a createSocket method that performed the open in a separate thread while the caller thread blocked until the socket was opened. However, the caller thread would only sleep for a second at a time and would also only loop for a predefined number of times (default of 60). That way, the connection would only block for a maximum of 60 seconds (or whatever value the user specified as the CVS timeout preference) and would respond to cancellation with a second. For those that are curious, the implementation of the create socket is in the CVS/Core Util class.

Problem 2: Users expect cancellation to happen immediately

Once you have a socket open, you can specify the read and write timeouts for the socket streams. Thus, you can ensure that the connection will be dropped if it has been blocked for too long. However, after waiting for a while, the user may decide to explicitly cancel the operation and would expect this to occur immediately. To make this work, we created a wrapper stream that would only block on the read for a second. If no content arrived in that time, the stream would poll to see if the user canceled. If a cancel did occur, the read was terminated. Otherwise, the read was reattempted until either content became available or the number of attempts exceeded the timeout (default 60).

The other benefit of this stream wrapper is that appropriate progress messages could be shown to the user (e.g. 350 or 2000 bytes read). The stream implementation is internal to the Team/Core plug-in in the PollingInputStream class.

Problem 3: Some streams do not have a timeout

The final problem we found was that some streams do not have a timeout. Specifically, when reading the output from a system process (for the EXT connection method), under some circumstances, the read could block indefinitely. To solve this, we used a similar approach to problem 1. All the reading was performed in a separate thread and any read content was passed between the reading thread and the caller thread using a shared buffer. The implementation for this is found in the TimeoutInputStream class in the Team/Core plug-in.

Thoughts?

As I mentioned earlier, I would be curious to know how others have addressed these issues. Is there facilities available from the JDK that make this easier or are there better approaches that should be considered? Let me know what you think.