I am using the Internet Direct TIdTCPClient component to communicate with a remote service to retrieve a message that is normally about 5k is size. During a typical operation I will send about 400 requests to the service, each one taking about 1 second to complete. Most of the time everything works perfectly. However, about one percent of the time the request takes 189 seconds and I receive no data at all. For ease of discussion, I'll call this a failure.
I am particularly interested in understanding exactly what is happening when the failure occurs so that I can take my evidence to the publisher of the service. First of all, the failure is not reproducible. If I re-send a failed request there is a very high probability (maybe 99 percent) that it will work.
I also capture the request that I send when a failure occurs, so I am able to confirm that the request is well formed.
I am assuming that during a failure that I am getting some data, just not all of it. Here is why. My IdTCPClient has a 30 second timeout (I've even set it to 5 seconds, but that didn't make a difference). When the failure occurs, it always fails after 189 seconds (plus about 500 milliseconds).
As a result, I am thinking that during a failure my component is receiving a trickle of data, which is why my client is not timing out. And, I am assuming that the disconnection is happening at the service since none of my timeout values are ever set to 189 seconds. On the other hand, reading IOHandler.AllData does not raise an exception (not even an EIdConnClosedGracefully exception). Am I interpreting this evidence correctly?
What I want to do is to confirm that I am getting some data, just not all of it, before the service terminates the connection. Furthermore, I want to know what that partial data looks like as I believe it can help identify the source of the failure.
Currently, my request is similar to the following:
//ExceptionName is a temporary global variable
//that I am using while trying to solve this issue
ExceptionName = 'no exception';
try
s := GetRequest(id);
IdTcpClient1.Host := Host;
IdTcpClient1.Port := StrToInt(Port);
IdTcpClient1.ReadTimeout := ReadTimeout;
try
IdTcpClient1.Connect;
except
on e: exception do
begin
ExceptionName := e.ClassName;
raise EConnectionFailure.Create('Connection refused: ' + e.Message)
end;
end;
IdTcpClient1.IOHandler.Writeln(s);
try
Result := IdTcpClient1.IOHandler.AllData;
except
on E: EIdConnClosedGracefully do
begin
ExceptionName := e.ClassName;
//eat this exception
end;
on e: Exception do
begin
ExceptionName := e.ClassName;
raise;
end;
end;
finally
if IdTcpClient1.Connected then
IdTcpClient1.Disconnect;
end;
Using IOHandler.AllData to read the data is very convenient, but I cannot retrieve any data following a failure (AllData returns an empty string). I've tested IOHandler.InputBufferIsEmpty after a failure, and it returns True.
I've also tried other methods to read the data, such as IOHandler.ReadStream (this produced the same result as reading AllData). I also used IOHandler.ReadBytes and IOHandler.ReadByte (in conjunction with IOHandler.CheckForDataOnSource). Nothing has worked.
Am I wrong about partial data transmission? If so, why would I see a consistent 189.nnnn seconds delay before the failure.
If partial data transmission is a possibility, what approach should I take to capture every byte of data received before the failure.
I am using Delphi 2009 for this project and Indy 10, but I don't think the version has anything to do with it. I don't think this is an Indy issue.
Edit: I have inspected the communication between my Indy client and the server using WireShark. When one of these failures occur, after sending my request the server sent two [ACK] packets followed by silence for just over 189 seconds. After that delay, the response included [FIN, PSH, ACK] but no application data.
When the communication worked normally, the two ACK packets returned by the server in response to my request was followed by an application data packet.
Edit: Have reported the issue to the publisher of the Web service, and am waiting for a reply.
Edit: Ok, the publisher of the Web service has responded. They acknowledged problems on their end and have addressed some of those. We are no longer getting timeouts. Most responses are received in about 2 seconds, with a few taking slightly longer. The publisher is working to fix the remaining issues.
Thank you everyone for your input.