FTP

What is FTP

FTP stands for “File Transfer Protocol”. It is a communication protocol for transferring files over a network, and it has existed since the early days of the Internet.

It remains one of the best-known file transfer protocols on the Internet.

Other ways of moving files between machines include NetBIOS-based file sharing and NFS. These realize file transfer by virtually mounting an external file system so that it appears as part of the OS file system.

FTP is a client/server type of file transfer.

File transfer with FTP is command-based, but there are now many dedicated FTP clients with a GUI for both Windows and Linux environments, so you can transfer files without being aware of the commands.

FTP communicates using two connections, one for control and one for data transfer. Port 21 (FTP) on the server is used for control, and port 20 (FTP-DATA) is the server-side port for data transfer in PORT mode (active mode), which is described below.

First, a control connection is established, and user authentication is performed over it. The control connection is established by a request from the client to the FTP server: the client connects to port 21 on the server from any free port on its own side.

After the control connection is established and the user is authenticated, the data connection is established. Because FTP keeps the control connection separate from data transmission and reception, other commands, such as a command to stop the transfer, can be issued even during data transfer.

FTP PORT mode (active mode)

The FTP protocol has two transfer modes, PORT mode (active mode) and PASV mode (passive mode). This section explains PORT mode.

PORT mode is the original default transfer mode of the protocol, although many current FTP clients let you switch modes or use PASV mode by default.

In this mode, unlike the control connection, the data connection is established from port 20 on the FTP server to an arbitrary port on the client.

To set up a connection from the FTP server side, the server needs to know the IP address and port number of the client. The client notifies the FTP server of these with the PORT command, which is sent over the control connection.
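As an illustration of that notification, the following Python sketch builds the argument of a PORT command. It assumes the comma-separated format defined in RFC 959, in which the four bytes of the IPv4 address are followed by the high and low bytes of the port number. The function name build_port_command and the sample address are hypothetical.

```python
def build_port_command(ip: str, port: int) -> str:
    """Return the PORT command line for an IPv4 address and a TCP port."""
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    if not 0 < port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # The port is sent as two decimal numbers: port = high * 256 + low.
    high, low = divmod(port, 256)
    return "PORT " + ",".join(str(n) for n in octets + [high, low])


# A client at 192.168.1.10 that is listening on port 50000.
print(build_port_command("192.168.1.10", 50000))  # PORT 192,168,1,10,195,80
```

The server recovers the port as 195 * 256 + 80 = 50000 and opens the data connection to that address and port.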

A word of caution here: the direction of the data connection is from the FTP server side to the client side. The connection is opened from the FTP server regardless of whether a file is being downloaded or uploaded.

You may wonder, “When uploading a file, doesn’t communication start from the client?” Even for an upload, however, the data connection is opened from the FTP server.

The direction of this data connection matters.

First, there is the issue of security.

The port number of the client is arbitrary (1,024 or higher), so for FTP to work, the firewall (FW) must allow incoming connections from the FTP server to whichever port is used for the data connection. Opening every port from 1,024 upward to incoming connections is very risky.

It also becomes a problem if the client is behind a NAT router.

NAT and IP masquerade translate IP addresses and port numbers. Therefore, the value passed with the PORT command also needs to be rewritten.

The reason is that the FTP server cannot reach the client at the private IP address and port number notified by the PORT command. It has to connect to the IP address and port number produced by the NAT translation instead.

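However, the argument of the PORT command is written as decimal text, so if the values change, the length of the string may change as well. A router that rewrites the argument therefore has to change the length of the packet and adjust the TCP sequence numbers that follow, which requires complex processing.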

Therefore, some NAT routers do not support rewriting the PORT command. In other words, FTP communication in PORT mode may not be possible for a client behind such a NAT router.
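The rewriting that a NAT router would have to perform can be sketched in Python as follows. This is only an illustration of the string handling, not of a real router implementation. The function name rewrite_port_command, the private address, and the translated address 203.0.113.5 with port 1025 are hypothetical values chosen for the example.

```python
import re

# Six comma-separated decimal numbers: four address bytes and two port bytes.
PORT_PATTERN = re.compile(r"PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)")


def rewrite_port_command(command: str, public_ip: str, public_port: int) -> str:
    """Replace the address and port in a PORT command with the translated ones."""
    if PORT_PATTERN.fullmatch(command.strip()) is None:
        raise ValueError(f"not a PORT command: {command!r}")
    high, low = divmod(public_port, 256)
    return "PORT " + ",".join(public_ip.split(".") + [str(high), str(low)])


original = "PORT 192,168,1,10,195,80"  # private address, port 50000
rewritten = rewrite_port_command(original, "203.0.113.5", 1025)
print(rewritten)
print(len(original), len(rewritten))  # the two lengths differ
```

Because the rewritten command is shorter than the original in this example, the router cannot simply overwrite bytes in place. This is the source of the complexity mentioned above.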

FTP PASV mode (passive mode)

In PASV mode, both the control connection and the data connection originate from the client.

In PORT mode, there were problems with the FW and NAT. PASV mode was designed to enable communication even when the client is behind a FW or NAT router.

In PORT mode, the data connection is established from the FTP server. In PASV mode, by contrast, the data connection is established from an arbitrary port on the client to a listening port on the FTP server.

In this mode, the client sends the PASV command to the FTP server over the control connection. The FTP server replies with its own IP address and the port number on which it is listening.

The client then establishes the data connection to the IP address and port given in that response.
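How a client might extract the address and port from that response can be sketched in Python as follows. The sketch assumes the reply uses code 227 and carries six comma-separated numbers in parentheses, as in RFC 959. The wording of the reply text varies between servers, so only the numbers are parsed. The function name parse_pasv_reply and the sample reply are hypothetical.

```python
import re
from typing import Tuple

# Four address bytes followed by two port bytes, enclosed in parentheses.
PASV_PATTERN = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Return (ip, port) taken from a '227 Entering Passive Mode' reply."""
    if not reply.startswith("227"):
        raise ValueError(f"not a PASV reply: {reply!r}")
    match = PASV_PATTERN.search(reply)
    if match is None:
        raise ValueError(f"no address found in: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError(f"field out of range in: {reply!r}")
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte * 256 + low byte
    return ip, port


print(parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,195,149)"))
# ('192.0.2.10', 50069)
```

The client would then open a TCP connection from any free local port to the returned address and port.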

In other words, every connection in PASV mode is opened from the client side, so the FW and NAT router on the client side see only outgoing connections.
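In practice the mode is usually chosen through a client setting rather than by typing commands. As one example, Python's standard ftplib module provides FTP.set_pasv() for this purpose. The sketch below assumes a reachable server. The helper names make_client and list_directory are hypothetical, as are the host, user name, and password passed to them.

```python
from ftplib import FTP
from typing import List


def make_client(passive: bool = True) -> FTP:
    """Create an unconnected FTP client and select the data connection mode."""
    ftp = FTP()            # no host is given, so nothing is connected yet
    ftp.set_pasv(passive)  # True: PASV mode, False: PORT (active) mode
    return ftp


def list_directory(host: str, user: str, password: str,
                   passive: bool = True) -> List[str]:
    """Log in over the control connection and fetch a directory listing."""
    ftp = make_client(passive)
    try:
        ftp.connect(host, 21)      # control connection to port 21
        ftp.login(user, password)  # authentication on the control connection
        lines: List[str] = []
        ftp.retrlines("LIST", lines.append)  # opens a data connection
        ftp.quit()
        return lines
    finally:
        ftp.close()  # releases the socket even if an earlier step failed
```

Calling list_directory(host, user, password, passive=False) against a server would make the same request in PORT mode, which is a simple way to check whether a FW or NAT router on the path blocks the incoming data connection.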

The response to the PASV command contains the IP address and listening port number of the FTP server, but this is not a problem for the router on the client side: as long as the server has a global IP address, the response does not need to be converted by NAT or IP masquerade.

At first glance, PASV mode seems to solve the NAT problem, but a problem arises when the FTP server, like the client, is behind a NAT router. The response to the PASV command then carries the private IP address of the server, which the client cannot reach.

In summary, FTP communication through NAT (IP masquerade) poses two problems:

(1) In PORT mode, the IP address and listening port number of the client are notified by the PORT command from the client.

(2) In PASV mode, the IP address and listening port number of the FTP server are notified in the response to the PASV command.

If the router rewrites the notified IP address to its global IP address on the WAN side (and the port number as well, where that is also translated), the NAT problem in FTP communication can be solved.

Most recent routers support case (1), but some products do not yet support case (2).
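For case (2), the conversion on the server side can be sketched in Python as follows. The sketch assumes that the router forwards the listening port to the FTP server unchanged, so only the IP address in the response needs to be replaced. The function name rewrite_pasv_reply and both addresses are hypothetical.

```python
import re

# Four address bytes followed by two port bytes, enclosed in parentheses.
PASV_ADDRESS = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def rewrite_pasv_reply(reply: str, wan_ip: str) -> str:
    """Replace the private address in a PASV reply with the WAN-side address."""
    match = PASV_ADDRESS.search(reply)
    if match is None:
        raise ValueError(f"no address found in: {reply!r}")
    port_fields = list(match.groups()[4:])  # assumption: port is kept as is
    new_fields = wan_ip.split(".") + port_fields
    return (reply[:match.start()]
            + "(" + ",".join(new_fields) + ")"
            + reply[match.end():])


private_reply = "227 Entering Passive Mode (192,168,0,20,195,149)"
print(rewrite_pasv_reply(private_reply, "203.0.113.5"))
# 227 Entering Passive Mode (203,0,113,5,195,149)
```

As with the PORT command, the rewritten text can differ in length from the original, so a router that supports this conversion has to handle the same packet length adjustment.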