Short version: I have a self-hosted WCF service, and I can't get local applications to talk to it via net.tcp.

Detailed: Our environment has several self-hosted services that all communicate with each other via net.tcp. The service in question (A) exposes a net.tcp-based endpoint, which is consumed by at least 3 other services on other boxes.

We recently established a new connection between another service (B) and (A). Both (B) and (A) are hosted on the same box. This connection refuses to work and fails with the error below.

The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. Local socket timeout was '00:59:59.9688006'.

Inner Exception: An existing connection was forcibly closed by the remote host

I have literally copied and pasted the binding config from (A) into (B) with no luck. I have created a test app that is able to replicate this issue. It works from any machine on the network EXCEPT the host machine for (A).

Just for fun, service (B) is able to communicate with service (C) which is also on the same box as (A), using the same binding type. I even put the binding from (C) into (A) and the same behavior is seen by the test app and (B).

I've checked and tested every solution I could find to net.tcp connection issues with no luck.

I checked with IT and our Security Officer; neither of them can think of anything from their perspective that could cause this.

This is the first local connection to (A); so far, all other connections have originated from other boxes.

Update: Server binding snippet

<netTcpBinding>
    <binding name="TMSNetBinding"
             closeTimeout="01:00:00"
             openTimeout="00:00:20"
             receiveTimeout="01:00:00"
             sendTimeout="01:00:00"
             maxBufferPoolSize="600000000"
             maxBufferSize="30000000"
             maxReceivedMessageSize="30000000"
             maxConnections="100"
             portSharingEnabled="false"
             listenBacklog="100"
             transferMode="Buffered"
             hostNameComparisonMode="StrongWildcard">
        <readerQuotas maxArrayLength="25000"
                      maxBytesPerRead="4096"
                      maxDepth="32"
                      maxNameTableCharCount="30000"
                      maxStringContentLength="300000" />
        <security mode="None">
            <transport clientCredentialType="Windows"
                       protectionLevel="EncryptAndSign" />
        </security>
    </binding>
</netTcpBinding>

<endpoint binding="netTcpBinding"
          bindingConfiguration="TMSNetBinding"
          contract="TMS.Internal.ITMSInternalOperations" />

In code, the base service address is "net.tcp://localhost:888/TMS/".

How are you defining the connection string? Using localhost? Or the machine name? - Erik Funkenbusch
It's a resolvable name. address="net.tcp://TMS.dev.company.com:888/TMS" - dsmithpl13
So have you verified that this address resolves to the same IP address on the given machine? Are you sure there is no host file entry that is causing problems? That there is no firewall rule that might be blocking the request from the same machine? There is literally nothing in WCF that could prevent this, so it must be something in the network stack... firewall, DNS, host file entry, route table, something... - Erik Funkenbusch
Yes, yes and yes. I've even updated the client config with localhost, 127.0.0.1 and machine name. All have the same result. - dsmithpl13
That's just it though. If your service is binding to a specific interface, then it may not be available on 127.0.0.1, so if all your testing has been forcing it to that, that could be the problem. - Erik Funkenbusch
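
For anyone retracing the address checks discussed in these comments, the client endpoint being varied would look roughly like the fragment below. This is only a sketch: the endpoint name `TMSInternalClient` is made up for illustration, and it assumes the client reuses the `TMSNetBinding` configuration copied from (A) and that the port and path stay the same for each host name tried.

```xml
<client>
    <!-- Hypothetical endpoint name; address, binding and contract come from the question and comments. -->
    <endpoint name="TMSInternalClient"
              address="net.tcp://TMS.dev.company.com:888/TMS"
              binding="netTcpBinding"
              bindingConfiguration="TMSNetBinding"
              contract="TMS.Internal.ITMSInternalOperations" />
    <!-- Host variants reported as tried from the host machine, all with the same result
         (port and path assumed unchanged):
         net.tcp://localhost:888/TMS
         net.tcp://127.0.0.1:888/TMS
         net.tcp://(machine name):888/TMS -->
</client>
```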

1 Answer

As mentioned, there was another service (C) that works under very similar circumstances. I spent a good bit of time comparing the two, and the only difference I could find was that the self-hosting was implemented differently in the two services. The failing service had an over-engineered abstraction in place, and I wasn't able to identify exactly which part of that mess was causing the issue. However, ripping the whole pile out and replacing it with simple self-hosting resolved the issue.
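
A minimal sketch of what "simple self-hosting" could look like on the config side, not the actual replacement that was used. The service type name `TMS.Internal.TMSInternalService` is a placeholder invented for illustration, the `TMSNetBinding` section from the question is assumed to sit alongside it under `<bindings>`, and declaring the base address in config rather than in code is just one option.

```xml
<system.serviceModel>
    <services>
        <!-- Hypothetical service type name; substitute the real implementation class. -->
        <service name="TMS.Internal.TMSInternalService">
            <host>
                <baseAddresses>
                    <!-- Base address as given in the question. -->
                    <add baseAddress="net.tcp://localhost:888/TMS/" />
                </baseAddresses>
            </host>
            <!-- No address attribute, so the endpoint listens at the base address. -->
            <endpoint binding="netTcpBinding"
                      bindingConfiguration="TMSNetBinding"
                      contract="TMS.Internal.ITMSInternalOperations" />
        </service>
    </services>
</system.serviceModel>
```

With a section like this in place, the hosting process only needs to construct a `ServiceHost` for the service type and call `Open()`, so no custom hosting layer sits between the listener and the service.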