6
votes

According to the FTP protocol (RFC 959), when an FTP client connects to an FTP server, a control connection is established between the two. When the client then sends a request such as LIST, RETR, or STOR, a data connection between the client and the server must be established first, and the FTP data is transferred over that data connection.

So my question is: why do we need the second connection, the data connection? Why aren't all requests, responses, and FTP data transferred over the control connection?

3
Keep in mind that the control/data split in FTP dates back to around 1971. – ninjalj
Ok, it looks like it was to be able to select an appropriate byte-size for the data connection (e.g., on 36-bit computers). See: RFC310 and RFC327. And RFC354 for a first version of "modern" FTP. – ninjalj

3 Answers

7
votes

The decision to have separate control and data connections in FTP was taken at the Data and File Transfer Workshop at MIT on April 14-15, 1972.

RFC310 "Another Look At Data And File Transfer Protocols" was published on April 3 to prepare for the workshop. Some relevant information from that RFC:

  • The CPYNET protocol used on TENEX systems closed the control connection and opened a new one with a possibly different byte-size. The selection of byte-size could be important for some computers, e.g., the 36-bit PDP-10.
  • Ad-hoc protocols on top of TELNET where the receiving process had to inspect every byte were considered slow. Using separate connections was suggested to avoid that overhead.
  • In the Data Transfer Protocol (equivalent to modern-day data connections in FTP), block mode was considered too costly just to provide control/data separation and EOF indication. Again, opening/closing a separate data connection was suggested as an alternative (which would also allow selection of an appropriate byte-size).
  • For FTP to be useful, efficiency was considered important, and again separate connections, perhaps with different byte-sizes, were suggested, noting the ambiguity that the closing of a connection could indicate either EOF or an error.
  • For use in TIPs/IMPs (Terminal Interface Message Processors), some of which had no file system, and had devices listening on specific sockets, it was considered convenient to allow sending data to a specified socket.
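
To illustrate the "close the data connection to signal EOF" idea from the points above, here is a minimal sketch. It uses a local socket pair in place of a real FTP data connection, and the names `receive_until_close` and `demo` are made up for this example. Note that, as RFC310 pointed out, the receiver cannot tell from the close alone whether it saw a clean EOF or an error.

```python
import socket


def receive_until_close(conn: socket.socket) -> bytes:
    """Read from conn until the peer closes it; the close marks end of file."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:  # empty read: the sender closed the connection
            break
        chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bytes:
    # A connected socket pair stands in for the data connection (assumption).
    sender, receiver = socket.socketpair()
    sender.sendall(b"file contents")
    sender.close()  # no framing or length prefix: closing is the EOF marker
    data = receive_until_close(receiver)
    receiver.close()
    return data
```

Because the data connection carries nothing but file bytes, the receiver never has to inspect the stream for control sequences.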

RFC327: "Data and File Transfer Workshop Notes", published on April 27, briefly summarizes the discussions and decisions taken in the workshop. Speed and efficiency of file transfer were considered important, with byte-size and data format conversions being considered some of the most important factors affecting speed and efficiency. Finally, it was decided to use separate control and data connections. Other decisions were taken: the control connection would be a TELNET connection, the control connection would use ASCII human-readable commands and responses, and DTP (the Data Transfer Protocol) would stop existing as a separate entity and become the protocol used on the data connection of FTP.

Finally, RFC354: "The File Transfer Protocol", published on July 8, 1972, became the first incarnation of the FTP RFC to feature separate control and data connections. It used a SOCK command, instead of our familiar PORT and PASV commands.

Addendum

Inter-server file transfer (AKA FTP bounce/FXP) appeared in RFC542 "File Transfer Protocol for the ARPA Network", published on August 12, 1973, with the introduction of the PASV command.

Finally, in RFC765 "File Transfer Protocol", published in June 1980, FTP was modified to use TCP instead of NCP, replacing the SOCK command with the PORT command.
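
For readers unfamiliar with PORT and PASV, the sketch below shows how the data connection's endpoint is described on the control connection in modern FTP (RFC 959): as six comma-separated decimal numbers, four for the IPv4 address and two for the high and low bytes of the port. The function names are hypothetical, and the regular expression is a simplification that assumes a well-formed reply.

```python
import re
from typing import Tuple


def encode_port_argument(ip: str, port: int) -> str:
    """Build the h1,h2,h3,h4,p1,p2 argument of a PORT command."""
    octets = ip.split(".")
    if len(octets) != 4 or not 0 < port < 65536:
        raise ValueError("expected an IPv4 address and a 16-bit port")
    # The port is split into its high byte (p1) and low byte (p2).
    return ",".join(octets + [str(port >> 8), str(port & 0xFF)])


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract (ip, port) from a 227 reply to PASV."""
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if match is None:
        raise ValueError("no host/port tuple found in reply")
    nums = [int(n) for n in match.groups()]
    return ".".join(str(n) for n in nums[:4]), nums[4] * 256 + nums[5]
```

For example, port 5001 on 192.168.1.2 is written as `192,168,1,2,19,137`, since 19 * 256 + 137 = 5001.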

0
votes

When you have a separate data connection you can, for instance, transfer a file between two FTP servers directly rather than through your client machine. People rarely use this feature today, but it could have been useful in the past, when data transfer was really slow.

0
votes

Because that's just not how FTP works.

Also, there are some benefits from this arrangement, including:

  • There's no need for complicated framing on the control connection.

  • Handling special cases, like cancelling a data connection, is simpler.

  • You can have multiple transfers running at a time without having to establish multiple control connections.

  • It enables a trick, known as FXP, that can allow you to make two FTP servers exchange data directly between each other.
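
As a rough illustration of the FXP trick, the sketch below lists the commands a client would send on its two control connections to make a source server push a file straight to a destination server. It is only a plan builder, not a working client: the function name and the "source"/"destination" labels are invented for this example, it assumes the destination's 227 reply is well-formed, and many servers today refuse data connections to third-party addresses.

```python
import re
from typing import List, Tuple


def fxp_plan(dest_pasv_reply: str, path: str) -> List[Tuple[str, str]]:
    """Return (server, command) steps for a server-to-server transfer.

    dest_pasv_reply is the 227 reply the destination gave to PASV.
    """
    match = re.search(r"\((\d+(?:,\d+){5})\)", dest_pasv_reply)
    if match is None:
        raise ValueError("no host/port tuple found in reply")
    hostport = match.group(1)  # h1,h2,h3,h4,p1,p2 as sent by the destination
    return [
        ("destination", "PASV"),             # destination listens for data
        ("source", "PORT " + hostport),      # source is told to connect there
        ("destination", "STOR " + path),     # destination stores what arrives
        ("source", "RETR " + path),          # source sends the file
    ]
```

The file bytes never pass through the client, which only relays the address from one control connection to the other.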