Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!ucsd!ucbvax!EXPO.LCS.MIT.EDU!rws
From: r...@EXPO.LCS.MIT.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: SO_KEEPALIVE considered harmful?
Message-ID: <8905231205.AA00500@expire.lcs.mit.edu>
Date: 23 May 89 12:04:55 GMT
Sender: dae...@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 9

I have a random question that I hope this illustrious audience can answer
definitively for me (or else point me to a definitive source).  Is the BSD
notion of SO_KEEPALIVE on a TCP connection considered kosher with respect to
the TCP specification?  If so, is its use to be encouraged?  Specifically,
it has been suggested that in the X Window System world, X libraries
should automatically be setting SO_KEEPALIVE on connections to X servers.
Is this a reasonable thing to do?

[If this is a totally inappropriate forum for this question, I apologize.]
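For readers unfamiliar with the option being asked about, here is a minimal sketch of what "setting SO_KEEPALIVE" amounts to at the sockets API. It is written in Python rather than the C an X library would use, and the helper names `make_keepalive_socket` and `keepalive_enabled` are made up for illustration; only `setsockopt`, `getsockopt`, `SOL_SOCKET` and `SO_KEEPALIVE` come from the real API. The socket is never connected, so no probes are actually sent.

```python
import socket


def make_keepalive_socket(enable: bool = True) -> socket.socket:
    """Create an unconnected TCP socket and set or clear SO_KEEPALIVE on it.

    Hypothetical helper: a library that turned keep-alives on "automatically"
    would be making this call on behalf of every application using it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1 if enable else 0)
    return sock


def keepalive_enabled(sock: socket.socket) -> bool:
    """Report whether SO_KEEPALIVE is currently on for this socket."""
    # getsockopt returns a nonzero integer when the option is enabled.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

The option is per socket, which is why the question of who sets it (the library or the application) matters: whichever layer makes the call decides for that connection.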
Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!ucbvax!AHWAHNEE.STANFORD.EDU!dcrocker
From: dcroc...@AHWAHNEE.STANFORD.EDU (Dave Crocker)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <8905231641.AA25794@ucbvax.Berkeley.EDU>
Date: 23 May 89 14:57:06 GMT
Sender: dae...@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 18

The use of Keepalives is terrible, but sometimes necessary.  The key word,
here, is "sometimes".

The "terrible" is due to the fact that they add traffic to the net.  An
important point to keep in mind, with TCP connections, is that they may
span the globe, over thin wires.  Extra traffic can have a very serious
effect.  Further, they scale poorly.  The incremental traffic from one
connection may not be onerous, but what about 1000 connections?

Lastly, of course, there is the small fact that there may be a charge for
those extra packets, such as may happen if one of the links along the path
is over a public X.25 network.

If the group proposing the use of Keepalives has already gone through the
exercise of convincing themselves that critical functionality will be lost
if they are not used, then I hope the next question was/is how to minimize
their use.

Dave
Path: utzoo!attcan!uunet!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!NNSC.NSF.NET!craig
From: cr...@NNSC.NSF.NET (Craig Partridge)
Newsgroups: comp.protocols.tcp-ip
Subject: re: SO_KEEPALIVE considered harmful?
Message-ID: <8905231944.AA06042@ucbvax.Berkeley.EDU>
Date: 23 May 89 16:41:15 GMT
Sender: dae...@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 75

> I have a random question that I hope this illustrious audience can answer
> definitively for me (or else point me to a definitive source).  Is the BSD
> notion of SO_KEEPALIVE on a TCP connection considered kosher with respect to
> the TCP specification?  If so, is its use to be encouraged?  Specifically,
> it has been suggested that in the X Window System world, X libraries
> should automatically be setting SO_KEEPALIVE on connections to X servers.
> Is this a reasonable thing to do?

Oh what fun!  Keepalive wars return....

Well, I'm a firm hater of keep-alives, although Mike Karels has persuaded
me that in the current world they are a useful tool for catching clients
that go off into hyperspace without telling you.

I have lots of fellow travellers (actually, I'm probably a fellow traveller
with Phil Karn, president of the "I hate keep-alives" party), witness the
current host requirements text, which is appended.

Craig

    Implementors MAY include "keep-alives" in their TCP
    implementations, although this practice is not universally
    accepted.  If keep-alives are included, the application MUST
    be able to turn them on or off for each TCP connection, and
    they MUST default to off.

    Keep-alive packets MUST NOT be sent when any data or
    acknowledgement packets have been received for the
    connection within a configurable interval; this interval
    MUST default to no less than two hours.

    An implementation SHOULD send a keep-alive segment with no
    data; however, it MAY be configurable to send a keep-alive
    segment containing one garbage octet, for compatibility
    with erroneous TCP implementations.

    DISCUSSION:
        A "keep-alive" mechanism would periodically probe the
        other end of a connection when the connection was
        otherwise idle, even when there was no data to be sent.
        The TCP specification does not include a keep-alive
        mechanism because it could: (1) cause perfectly good
        connections to break during transient Internet
        failures; (2) consume unnecessary bandwidth ("if no one
        is using the connection, who cares if it is still
        good?"); and (3) cost money for an Internet path that
        charges for packets.

        Some TCP implementations, however, have included a
        keep-alive mechanism.  To confirm that an idle
        connection is still active, these implementations send
        a probe segment designed to elicit a response from the
        peer TCP.  Such a segment generally contains SEG.SEQ =
        SND.NXT-1.  The segment may or may not contain one
        garbage octet of data.  Note that on a quiet
        connection, SND.NXT = RCV.NXT and SEG.SEQ will be
        outside the window.  Therefore, the probe causes the
        receiver to return an acknowledgment segment,
        confirming that the connection is still live.  If the
        peer has dropped the connection due to a network
        partition or a crash, it will respond with a reset
        instead of an acknowledgement.

        Unfortunately, some misbehaved TCP implementations fail
        to respond to a segment with SEG.SEQ = SND.NXT-1 unless
        the segment contains data.  Alternatively, an
        implementation could determine whether a peer responded
        correctly to keep-alive packets with no garbage data
        octet.

        A TCP keep-alive mechanism should only be invoked in
        network servers that might otherwise hang indefinitely
        and consume resources unnecessarily if a client crashes
        or aborts a connection during a network partition.
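The probe mechanics described in the draft text above can be sketched as a little sequence-number arithmetic. This is an illustration under stated assumptions, not code from any TCP implementation: the function names are hypothetical, sequence numbers are treated as 32-bit values that wrap, and the two-hour default is the minimum the draft requires.

```python
SEQ_MOD = 2 ** 32              # TCP sequence numbers wrap at 32 bits
DEFAULT_IDLE_SECONDS = 2 * 60 * 60   # draft: default no less than two hours


def keepalive_probe_seq(snd_nxt: int) -> int:
    """Sequence number for a probe segment: SEG.SEQ = SND.NXT - 1."""
    return (snd_nxt - 1) % SEQ_MOD


def in_receive_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), modulo 2**32."""
    return (seq - rcv_nxt) % SEQ_MOD < rcv_wnd


def should_probe(now: float, last_received: float,
                 idle_interval: float = DEFAULT_IDLE_SECONDS) -> bool:
    """Probe only if nothing has arrived within the idle interval."""
    return now - last_received >= idle_interval
```

On a quiet connection the peer's RCV.NXT equals the sender's SND.NXT, so the probe's sequence number falls one octet before the peer's receive window. That out-of-window arrival is what makes the peer answer with an acknowledgment, or with a reset if it no longer knows the connection.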
Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!ucbvax!AHWAHNEE.STANFORD.EDU!dcrocker
From: dcroc...@AHWAHNEE.STANFORD.EDU (Dave Crocker)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <8905250638.AA21706@ucbvax.Berkeley.EDU>
Date: 23 May 89 21:03:13 GMT
Sender: dae...@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 16

I tried to avoid saying that keepalives should be prohibited, except,
perhaps, from an aesthetic point of view.  Since aesthetics often are
altered by reality, it is no great concession to acknowledge the
occasional need for the mechanism.

My point was that they are dangerous and therefore should be used VERY
judiciously.  Craig's note puts this point forward in more detail.

It is worth adding that the excessive use of keepalives has removed a
feature that used to be in TCP and has been recently re-documented by
Bob Braden:  TCP used to be remarkably robust against temporary
outages.  If you were willing to wait, so was TCP.  Now, an outage of
a very short time -- on some implementations, as short as 1-2 minutes --
will abort the connection.

Dave
Path: utzoo!attcan!uunet!lll-winken!ames!think!barmar
From: bar...@think.COM (Barry Margolin)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <20761@news.Think.COM>
Date: 25 May 89 16:32:31 GMT
References: <8905250638.AA21706@ucbvax.Berkeley.EDU>
Sender: n...@Think.COM
Reply-To: bar...@kulla.think.com.UUCP (Barry Margolin)
Organization: Thinking Machines Corporation, Cambridge, MA
Lines: 49

In article <8905250638.AA21...@ucbvax.Berkeley.EDU> dcroc...@AHWAHNEE.STANFORD.EDU (Dave Crocker) writes:
>It is worth adding that the excessive use of keepalives has removed a
>feature that used to be in TCP and has been recently re-documented by
>Bob Braden:  TCP used to be remarkably robust against temporary
>outages.  If you were willing to wait, so was TCP.  Now, an outage of
>a very short time -- on some implementations, as short as 1-2 minutes --
>will abort the connection.

I dispute this claim.  TCP is only robust against temporary outages if
you don't try to use the connection during that period.  For instance,
if I'm using telnet, the connection will stay alive during outages if
I don't type anything to the client or the host doesn't try to send
any output.  If either end tries to use the connection, and the outage
is longer than the TCP acknowledgement timeout, then the connection
will die.  If I happen to know that the network is having trouble I
won't type anything, but how often is this the case?  What it mostly
means is that a temporary outage after I go home won't break my
connections.

TCP's robustness is still a good idea.  It's nice to be able to swap
Ethernet cables without causing all the network connections to die.
But in my experience (which, I admit, isn't all that extensive), any
connection that dies for more than a minute or two probably isn't
going to come back.  What I mostly care about, though, is that the
other end definitely has reinitialized, e.g. it has crashed and been
rebooted.  If it's a telnet server that crashed I can do this by
typing into the client, which will provoke a reset, and the client
will abort.  But if it's the telnet client or an X server that died,
there's often no way to force the other end to try to send something
so it will get a reset.

I think the right solution is a compromise.  What's needed is a way to
send a segment with infinite (or near-infinite, e.g. hours or a day)
retransmissions and slow retransmit rate (one to two minutes).  This
would allow idle connections to stay up across most network failures,
but they will die within a minute or so of the other end rebooting.
And, of course, it should be optional, so that applications that
perform frequent output of their own need not compound their network
use (although since keepalives need only be sent when there are no
normal packets in the retransmit queue, any application whose output
rate is higher than the keepalive rate will never invoke the keepalive
mechanism).

Barry Margolin
Thinking Machines Corp.

bar...@think.com
{uunet,harvard}!think!barmar
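Barry's compromise can be modelled as a small decision loop: probe slowly, tolerate silence for a very long time, and give up at once on a reset. The sketch below is a toy simulation rather than protocol code. The function name, the outcome labels, the 90-second probe interval (inside his "one to two minutes") and the one-day limit are all assumptions chosen for illustration, and round-trip time is ignored.

```python
import itertools
from typing import Iterable, Tuple


def probe_until_decided(outcomes: Iterable[str],
                        probe_interval: int = 90,
                        give_up_after: int = 24 * 3600) -> Tuple[str, int]:
    """Walk through per-probe outcomes and report the connection's fate.

    Each outcome is "lost" (no reply before the next probe), "ack" (the
    peer answered) or "rst" (the peer has rebooted and sent a reset).
    Returns (verdict, seconds elapsed since the first probe was sent).
    """
    elapsed = 0
    for outcome in outcomes:
        if outcome == "rst":
            return ("reset", elapsed)      # clear proof: drop immediately
        if outcome == "ack":
            return ("alive", elapsed)      # peer is there: keep connection
        elapsed += probe_interval          # silence: wait and probe again
        if elapsed >= give_up_after:
            return ("gave_up", elapsed)    # near-infinite, not infinite
    return ("pending", elapsed)


# A fifteen-minute outage followed by recovery leaves the connection up.
outage_then_ack = probe_until_decided(["lost"] * 10 + ["ack"])
# A peer that never answers is abandoned only after a full day.
silent_peer = probe_until_decided(itertools.repeat("lost"))
```

The point of the design is the asymmetry: silence costs one small packet every minute or two and is tolerated for hours, while a reset ends the wait on the very next probe.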
Path: utzoo!utgpu!jarvis.csri.toronto.edu!mailrus!tut.cis.ohio-state.edu!rutgers!bellcore!jupiter!karn
From: karn@jupiter (Phil R. Karn)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <16423@bellcore.bellcore.com>
Date: 25 May 89 22:40:49 GMT
References: <8905250638.AA21706@ucbvax.Berkeley.EDU> <20761@news.Think.COM>
Sender: n...@bellcore.bellcore.com
Reply-To: k...@jupiter.bellcore.com (Phil R. Karn)
Organization: Bell Communications Research, Inc
Lines: 28

>>It is worth adding that the excessive use of keepalives has removed a
>>feature that used to be in TCP and has been recently re-documented by
>>Bob Braden:  TCP used to be remarkably robust against temporary
>>outages. [...]

>I dispute this claim.  TCP is only robust against temporary outages if
>you don't try to use the connection during that period.

TCP becomes quite robust against all outages (whether or not the
connection is idle) once you make a very simple change: get rid of TCP
level timeouts!

I feel very strongly that TCP should *never* just give up of its own
accord; that decision belongs to the application. And, in the event the
application is an interactive one, the decision to abort should be left
to the human user. If he's willing to wait, why shouldn't the system let
him?  (The only case when TCP should abort a connection on its own is
when it has clear proof that the other end has crashed, i.e., by
receiving a valid RST.)

Users of my TCP/IP package on amateur packet radio occasionally report
cases of FTP transfers that resume automatically after network outages
lasting for *days* (e.g., those due to crashes of network nodes in
remote locations that require manual resets). They are most happy to do
without TCP give-up timers, as long as TCP backs off its retransmissions
to avoid channel congestion.

Phil
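Retransmission that backs off but never gives up, as Phil describes, can be sketched as an endless generator of wait intervals. This is not code from his package: the doubling rule is ordinary exponential backoff swapped in as an example, and the name `retransmit_intervals`, the one-second start and the 64-second ceiling are hypothetical values.

```python
import itertools
from typing import Iterator


def retransmit_intervals(initial: float = 1.0,
                         factor: float = 2.0,
                         ceiling: float = 64.0) -> Iterator[float]:
    """Yield retransmission waits forever, growing up to a ceiling.

    There is deliberately no retry limit: the generator never ends, so
    the decision to stop belongs to whoever is consuming it (the
    application or the user), not to the transport.
    """
    interval = initial
    while True:
        yield interval
        interval = min(interval * factor, ceiling)


# The first eight waits; after reaching the ceiling the value repeats.
first_eight = list(itertools.islice(retransmit_intervals(), 8))
```

Because the waits stop growing at the ceiling, a connection stalled for days keeps sending at a steady low rate, which is the behaviour that lets a transfer resume once the path comes back.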
Path: utzoo!attcan!uunet!cs.utexas.edu!tut.cis.ohio-state.edu!ucbvax!AHWAHNEE.STANFORD.EDU!dcrocker
From: dcroc...@AHWAHNEE.STANFORD.EDU (Dave Crocker)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <8906012254.AA23748@ucbvax.Berkeley.EDU>
Date: 26 May 89 13:28:24 GMT
Sender: dae...@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 9

Phil,

As a test-of-concept:

I assume that you have no objection to a TCP implementation's being able
to do keepalives, under the control of the application, where both the
fact of keepalives AND their periodicity can be specified; and the effect
of a timeout is a signal to the application, not an abort?

Dave
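The first half of Dave's question, per-connection control over both the fact and the period of keep-alives, has a rough present-day analogue in socket options that some systems expose. The sketch below is an assumption layered on the thread, not something its 1989 participants had: `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` are platform-specific (Linux provides them), so each is applied only if Python's `socket` module defines it. The helper name `configure_keepalive` is hypothetical.

```python
import socket
from typing import Dict


def configure_keepalive(sock: socket.socket, idle_s: int,
                        interval_s: int, probes: int) -> Dict[str, int]:
    """Enable keep-alives and set their timing where the platform allows.

    Returns the timing options that were actually applied, read back
    from the socket, so the caller can see what took effect.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    wanted = {
        "TCP_KEEPIDLE": idle_s,       # idle seconds before the first probe
        "TCP_KEEPINTVL": interval_s,  # seconds between unanswered probes
        "TCP_KEEPCNT": probes,        # unanswered probes before giving up
    }
    applied: Dict[str, int] = {}
    for name, value in wanted.items():
        option = getattr(socket, name, None)
        if option is None:
            continue                  # not available on this platform
        sock.setsockopt(socket.IPPROTO_TCP, option, value)
        applied[name] = sock.getsockopt(socket.IPPROTO_TCP, option)
    return applied
```

The second half of the question is not covered by these options: when the probes go unanswered the system still ends the connection and reports an error to the application, rather than delivering a signal and leaving the connection open.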
Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!ucbvax!THUMPER.BELLCORE.COM!karn
From: k...@THUMPER.BELLCORE.COM (Phil R. Karn)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <8905262347.AA11535@thumper.bellcore.com>
Date: 26 May 89 23:47:25 GMT
Sender: dae...@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 29

Dave,

Yes, that might be acceptable to me. I'd go a little further, though,
and say that a REMOTE USER (not just the application code) must always
be able to turn off keepalives, even on binary-only systems.  It does no
good to say "the application must be able to disable keepalives" when
I'm having problems with a remote server that I have no administrative
control over.

Much of my animosity toward keepalives came from trying to make a Sun
workstation work properly over SLIP links and amateur packet radio.  I
finally replaced the TCP object modules provided by Sun with ones
compiled from Van's latest TCP, which I had already edited to disable
keepalives.  Works like a charm.

At the last InterOp, I sat next to Dave Borman in a panel session on TCP
performance. Between us, we represented a "dynamic range" of about 6
orders of magnitude in TCP transfer rates (1200 bps amateur packet radio
to 500 Mbps between Crays). This is an exceptional achievement for a
single networking protocol, but it was possible only because TCP was
designed from the beginning to scale well over a wide network
performance range.

But broken mechanisms like keepalives threaten this.  We need a big red
warning light that will flash whenever someone proposes to put a fixed
time interval into a protocol spec, because you can't scale protocols
that have arbitrary timers.

Phil
Path: utzoo!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!ucbvax!A.ISI.EDU!CERF
From: C...@A.ISI.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <[A.ISI.EDU]28-May-89.14:23:38.CERF>
Date: 28 May 89 18:23:00 GMT
References: <2681@elxsi.UUCP>
Sender: dae...@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 24

When TCP was first designed, and for all subsequent versions, it
was thought inappropriate to impose any kind of semantics on the
logical connections established by TCP. In particular, no sense of
absolute timeout for the severing of a connection was desired.

We thought that such notions of "impatience" or "time to give up"
ought to be the choice of the upper level protocol using TCP as the
basis merely for reliable delivery.

A part of this view stemmed from the fact that the networks over
which TCP had to function, for the DoD applications we had in mind,
were potentially very unpredictable as to loss and delay. Mobile
packet radio systems had to function under jamming and radio shadow
effects, for instance.

TCP never unilaterally severed connections but only reported failure
to achieve positive acknowledgement after a time which could be
controlled by the application or upper-level protocol. It was up to
the application to decide whether to sever the connection and, even
then, the choice to do so gracefully or abruptly was also left to the
application.

The use of a feature (X-level NOP) to test the liveness of a TCP
connection is consonant with the model against which the TCP was
designed.

Vint Cerf
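An application-level no-op of the kind Vint mentions can be sketched with Telnet's NOP command, which is the two octets IAC (255) followed by NOP (241). Telnet is used here only as a stand-in for whatever no-op the upper-level protocol offers, the function names are hypothetical, and the demonstration runs over a local `socketpair` rather than a real TCP connection so that it needs no network.

```python
import socket

IAC = 255   # Telnet "Interpret As Command" octet
NOP = 241   # Telnet "No Operation" command
TELNET_NOP = bytes([IAC, NOP])


def send_app_nop(sock: socket.socket) -> None:
    """Send one application-level no-op.

    The data is harmless to the peer, but it gives the transport
    something to deliver, so a dead path or a rebooted peer eventually
    shows up as an error reported to the application.
    """
    sock.sendall(TELNET_NOP)


def demo_nop_roundtrip() -> bytes:
    """Send a no-op across a local socket pair and return what arrives."""
    sender, receiver = socket.socketpair()
    try:
        send_app_nop(sender)
        return receiver.recv(len(TELNET_NOP))
    finally:
        sender.close()
        receiver.close()
```

Because the application chooses when to send the no-op and what to do when it fails, the timing and the decision to give up both stay above TCP, which is the division of labour the message above describes.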
Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!rutgers!tut.cis.ohio-state.edu!ucbvax!OKEEFFE.BERKELEY.EDU!karels
From: kar...@OKEEFFE.BERKELEY.EDU (Mike Karels)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: SO_KEEPALIVE considered harmful?
Message-ID: <8906082328.AA04514@okeeffe.Berkeley.EDU>
Date: 8 Jun 89 23:28:37 GMT
Sender: dae...@ucbvax.BERKELEY.EDU
Organization: The Internet
Lines: 39

Sorry, I can't let this go by without commenting on Phil's message
and this discussion, even though the discussion has mostly died down.
(I haven't been reading tcp-ip very often, but noticed this subject line
going by.)

Last time Phil and I talked about keepalives in person, I asked him
whether he had problems with telnet/rlogin servers accumulating on
his systems if they didn't use keepalives.  We certainly accumulate junk,
including xterm programs, waiting for input from a half-open connection.
Phil told me that he doesn't have problems, because he runs a "wall"
every night to force output to all users, and of course breaking connections
that time out.  In other words, Phil violently objects to servers requesting
keepalives from TCP, but allows the system manager (himself) to force
them above the application level.

And before people jump up to point out the difference in time scales,
the current BSD code sends no keepalive packets until a connection has been
idle for 2 hr, and that interval is easily changeable.  One proposal
for the Host Requirements document was to wait for 12 hr.  I think
that's a bit high, but the difference is only a factor of 6.
Compare the number of keepalive packets with the number of packets
exchanged by an xterm and an X server over the course of a week if used
4 hours a day!

Phil says:

    ... I'd go a little further, though,
    and say that a REMOTE USER (not just the application code) must always
    be able to turn off keepalives, even on binary-only systems.  It does no
    good to say "the application must be able to disable keepalives" when
    I'm having problems with a remote server that I have no administrative
    control over.

I'm sorry, Phil, but remote users have no more right to override system
management policies than do local users (at least on *our* systems!).
On some of the systems where I have guest accounts, local or remote users
are logged off if they aren't active for two hours.  I don't like that,
either, but I don't claim that the managers of those systems have no right
to enforce such a policy.

Mike
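Mike's time-scale comparison can be put into rough numbers for the keep-alive side. The sketch below counts the most probes one connection could send in a week under his scenario, assuming (these are assumptions, not figures from the thread) that the 20 idle hours each day form one unbroken stretch, that every probe is answered, and that an answered probe restarts the idle timer. The function name is hypothetical, and the xterm traffic it would be compared against is not estimated here.

```python
def max_probes_per_week(idle_threshold_hours: int,
                        active_hours_per_day: int = 4,
                        days: int = 7) -> int:
    """Upper bound on keep-alive probes sent by one connection in a week.

    A probe goes out each time the connection has been quiet for the
    full idle threshold, so one idle stretch of H hours produces at
    most H // threshold probes.
    """
    idle_hours_per_day = 24 - active_hours_per_day
    probes_per_day = idle_hours_per_day // idle_threshold_hours
    return probes_per_day * days


bsd_two_hour = max_probes_per_week(2)       # the BSD default Mike cites
proposed_twelve_hour = max_probes_per_week(12)  # the 12 hr proposal
```

Under these assumptions the two-hour threshold gives at most 70 probes a week and the twelve-hour proposal at most 7, with each answered probe costing one reply packet as well.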