Figure 1 – Architecture of FTP Backup

General Information

This page documents the FTP Backup mechanism implemented for each Mount Olive website. Figure 1 illustrates the architecture of the FTP backup process. The rightmost black box represents the IONOS server machine that runs the Mount Olive websites. Four websites are supported: 1) the Mount Olive church live website, 2) the Mount Olive school live website, 3) the Mount Olive church test website, and 4) the Mount Olive school test website. The live websites display current information for the church and school respectively. The test websites are used for testing and development work.

The IONOS machine is connected to the internet and uses an address of C.D.E.F, where the letters stand for numerical values (e.g., 145.22.37.9). The FTP complex is connected to the internet through an Internet Service Provider (ISP) layer 2 device, generally either a cable modem or a DSL modem. This device is connected to the FTP complex router/firewall/DHCP server (henceforth, simply the router), which is assigned an internet address of G.H.I.J, where again the letters stand for numerical values. This address is provided to the router by the ISP DHCP server.

The addresses used on the local network are from a private address block (in Classless Inter-Domain Routing notation, for example, 10.0.0.0/8 or 192.168.0.0/16) and are generally either of the form 10.x.y.z or 192.168.x.y. Addresses in these blocks are not routable over the internet. For the purposes of this document, it is assumed that local network addresses are of the form 192.168.x.y, where the letters x and y stand for numerical values in the range 0-255.
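The difference between local and internet addresses can be checked with a short sketch that uses Python's standard ipaddress module. The sample addresses are illustrative only; 145.22.37.9 is the example value given above for C.D.E.F, and the helper name is hypothetical.

```python
import ipaddress


def is_local_only(address: str) -> bool:
    """Return True if the address lies in a private (non-internet) range."""
    return ipaddress.ip_address(address).is_private


# Illustrative addresses: two local-network forms and one internet address.
for addr in ("192.168.1.1", "10.0.0.5", "145.22.37.9"):
    kind = "private (local network only)" if is_local_only(addr) else "public"
    print(f"{addr}: {kind}")
```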

The router is also assigned a local address so it can communicate with machines on the FTP complex local area network. Generally, this address is statically assigned and here is assumed to be 192.168.1.1. The FTP server machine also has a local address, 192.168.A.B, which is used by other machines, including the router, to communicate with it.

FTP-based backup of a Mount Olive website comprises the movement of website information from the IONOS/Mount Olive server machine to the FTP server machine on the FTP complex local network. This movement is managed on the IONOS side by a plugin installed in the website’s WordPress base. Currently, this plugin is BackWPup, which supports not only FTP backup, but also the Amazon S3 protocol for storing data blobs in a cloud. An FTP server runs on the FTP server machine in the FTP complex. It accepts data from the BackWPup plugin and stores it on the server machine’s local disks.

FTP is an ancient protocol that was designed before firewalls became common. Consequently, it has features that require uncommon configuration of both the backup plugin that runs on the IONOS machine and the FTP server. Specifically, the firewall component of the router does not allow traffic coming from the internet through unless one of two conditions applies: 1) A local machine has sent packets to a machine outside the local network using an internet address and specific port. The firewall notices this and allows replies from that internet address and port to travel through it. 2) A port on the firewall has been “opened”. Traffic from an internet machine to an open port is forwarded to a specific machine on the local network for processing. Because the backup plugin initiates the FTP traffic from the internet side, the second condition is the one that matters here: the ports that the plugin connects to must be open in order for data to transit the router/firewall and reach the FTP server.
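The two conditions can be pictured with a toy model. The following sketch is not a real firewall API; the class and method names are hypothetical, the addresses are the placeholder strings used on this page, and port 22021 is the command port value from the server configuration shown later on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Firewall:
    """Toy model of the router's inbound filtering (illustration only)."""
    # Open ports: public port -> (local address, local port).
    forwards: dict = field(default_factory=dict)
    # (remote address, remote port) pairs that a local machine has contacted.
    outbound_seen: set = field(default_factory=set)

    def record_outbound(self, remote_addr: str, remote_port: int) -> None:
        """Remember that a local machine sent packets to this remote endpoint."""
        self.outbound_seen.add((remote_addr, remote_port))

    def inbound_allowed(self, src_addr: str, src_port: int, dst_port: int) -> bool:
        """Apply the two conditions: reply to local traffic, or open port."""
        is_reply = (src_addr, src_port) in self.outbound_seen
        is_forwarded = dst_port in self.forwards
        return is_reply or is_forwarded


fw = Firewall(forwards={22021: ("192.168.A.B", 22021)})
print(fw.inbound_allowed("C.D.E.F", 40000, 22021))  # open port
print(fw.inbound_allowed("C.D.E.F", 40000, 22019))  # port not opened in this model
```

In this model the second call is rejected because only the command port has been opened, which is why the data ports also need to be opened on the real router.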

FTP uses two ports during an FTP session. The first port is used for command traffic (identified here as the command channel) and the second for data traffic. In order for FTP packets to transit the firewall and reach the FTP server, both ports must be open on the firewall. The command port is generally fixed in value and is used by the plugin to contact the FTP server to start a session. The standard value of the FTP command port is 21. However, it is common for hacker bots to probe this port looking for vulnerabilities in firewall and local network configuration. Consequently, it is good security practice to use a non-standard value for this port to discourage such penetration attempts. This has an effect on how the client (in this case the plugin) is configured (see below).

At least one data port of an FTP session is chosen dynamically during the FTP session. Two options are available. In FTP active mode, the server initiates the data connection to the client using port 20 as the source data port and a random value specified by the client over the command channel as the destination data port. In FTP passive mode, the FTP server chooses the port on which it will accept the data connection and sends this value to the client over the command channel. The client then initiates the data connection using this port as the destination and a random data port it chooses as the source. The server can be restricted to choosing from a specified range of ports, and the firewall is configured to keep the ports in that range open so that data connection requests by the client make it through the firewall.

Using active mode causes problems. In particular, in the case of communications between the FTP server and the IONOS server, the IONOS machine, on which the plugin acts as the FTP client, may itself be behind a firewall (not shown in Figure 1). The plugin chooses a random port that is unlikely to be open on the IONOS firewall. Therefore, when the server attempts to open the data connection (from its source port of 20 to the random destination port on the IONOS machine), the IONOS firewall doesn’t allow the connection through, since it has not opened the random port specified by the plugin.

Therefore, using passive mode is necessary for FTP transfers between the FTP server and the plugin. The server can limit the value of its data port to a small range that the FTP complex firewall keeps open for FTP transfers. The server chooses a value in this range and communicates it to the client, which then uses it as the destination port when opening the data connection to the FTP server.

FTP server operation

When passive mode is chosen, the FTP server sends its address to the client so the client can open a connection to it using one of the ports in the appropriate range. (FTP allows the server IP address used for data connections to be different from the address used for the command channel. This supports transfer of data to a machine other than the one used to control the FTP transfer.) However, the FTP server is assigned a private local network address that is not routable over the internet. Consequently, it must send to the client the internet address of the router, which will then forward the data packets to the FTP server using its local address. Since the FTP server doesn’t know a priori what the internet address of the router is, this address must be specified in the FTP server configuration file.
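In the FTP protocol, the address and port for a passive-mode data connection travel to the client in a "227" reply of the form (h1,h2,h3,h4,p1,p2), where the port equals p1 * 256 + p2. The sketch below builds and parses such a reply. The function names are hypothetical, 203.0.113.10 is a documentation-only stand-in for the router address G.H.I.J, and 22019 is a data port from the configuration discussed on this page.

```python
import re


def format_pasv_reply(public_ip: str, port: int) -> str:
    """Build the reply text a server sends when entering passive mode."""
    p1, p2 = divmod(port, 256)  # high byte and low byte of the port
    octets = ",".join(public_ip.split("."))
    return f"227 Entering Passive Mode ({octets},{p1},{p2})."


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract the (address, port) a client should connect to."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError("not a passive mode reply")
    *host, p1, p2 = match.groups()
    return ".".join(host), int(p1) * 256 + int(p2)


reply = format_pasv_reply("203.0.113.10", 22019)
print(reply)                    # 227 Entering Passive Mode (203,0,113,10,86,3).
print(parse_pasv_reply(reply))  # ('203.0.113.10', 22019)
```

The address in the reply is the router's internet address rather than the server's local address, which is why the server must be told that address in its configuration.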

The configuration file information used by the vsftpd FTP server to set up FTP transfers for Mount Olive machine backups is shown here. Only the configuration information relevant to this discussion is given:

connect_from_port_20=YES
listen_port=22021
listen_ipv6=NO
listen=YES
pasv_enable=YES
pasv_address=G.H.I.J
pasv_min_port=22019
pasv_max_port=22020

The first line tells the FTP server to use port 20 as its source port when it initiates a data connection to the FTP client (this applies to active mode). The second line specifies the use of port 22021 as the command port, instead of port 21. The third line indicates that the server should not listen for IPv6 connections. The fourth line tells the FTP server to listen on an IPv4 port. The fifth line enables passive mode for all FTP sessions. The sixth line specifies the internet address of the router (in a real configuration file the letters would be replaced by values in the range 0-255). The seventh line specifies the minimum port value to use as a data port. The eighth line specifies the maximum port value to use as a data port. In this example, only two server data port values are allowed.
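As a cross-check between the server configuration and the router, the following sketch reads the option lines shown above and lists the ports that would need to be open on the firewall. It is an illustration only: the function names are hypothetical and the parser handles just the simple option=value form used here.

```python
CONFIG_TEXT = """\
connect_from_port_20=YES
listen_port=22021
listen_ipv6=NO
listen=YES
pasv_enable=YES
pasv_address=G.H.I.J
pasv_min_port=22019
pasv_max_port=22020
"""


def parse_options(text: str) -> dict[str, str]:
    """Parse option=value lines, skipping blank lines and # comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        options[key] = value
    return options


def ports_to_open(options: dict[str, str]) -> list[int]:
    """Return the command port plus every port in the passive data range."""
    command_port = int(options.get("listen_port", "21"))
    low = int(options["pasv_min_port"])
    high = int(options["pasv_max_port"])
    if low > high:
        raise ValueError("pasv_min_port must not exceed pasv_max_port")
    return sorted({command_port, *range(low, high + 1)})


print(ports_to_open(parse_options(CONFIG_TEXT)))  # [22019, 22020, 22021]
```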

Plugin (FTP client) configuration

The FTP client (in this case the plugin) also requires configuration in order to connect to the FTP server and transfer backup data to the FTP server machine disks. The plugin configuration data must include the firewall address (which stands as the proxy for the FTP server address), and the user identity and password on the FTP server machine, so that the plugin can log in to the account under which the backup data is stored. In addition, it must specify the command port that the FTP server uses. It also must include the directory path of the folder used on the server to store the backup data files. Finally, it must configure the client (plugin) to use FTP passive mode when opening an FTP connection to the FTP server.
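The same settings can be seen at work in a minimal client sketch written with Python's standard ftplib module. This is not the plugin's code (BackWPup is configured through its WordPress settings panes); it only shows where each configuration item is used in an FTP session. The class name, function name, user name, password, and directory are hypothetical placeholders.

```python
from dataclasses import dataclass
from ftplib import FTP


@dataclass
class BackupTarget:
    host: str        # internet address of the router (proxy for the FTP server)
    port: int        # non-standard command port used by the FTP server
    user: str        # account on the FTP server machine
    password: str
    remote_dir: str  # folder on the server that stores the backup files


def upload_backup(target: BackupTarget, local_path: str, remote_name: str) -> None:
    """Upload one backup file using passive mode."""
    with FTP() as ftp:
        ftp.connect(target.host, target.port, timeout=30)
        ftp.login(target.user, target.password)
        ftp.set_pasv(True)  # the client opens the data connection
        ftp.cwd(target.remote_dir)
        with open(local_path, "rb") as backup_file:
            ftp.storbinary(f"STOR {remote_name}", backup_file)


# Placeholder values only; replace before use.
target = BackupTarget("G.H.I.J", 22021, "backup-user", "change-me", "/backups")
```

Calling upload_backup(target, "site-backup.zip", "site-backup.zip") would attempt a real connection, so it is left out of the sketch.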

Two other categories of information are not directly related to the FTP protocol but are nevertheless necessary to support the backup process. The first comprises information used to schedule the backups on the plugin. The second comprises information required to handle impermanent IP addresses assigned to the firewall. These are covered in the next sections.

Scheduling backups by the plugin

The backup plugin enables WordPress admins to define jobs that trigger backup activity. Each job is defined using a set of configuration panes, one of which is a schedule for when to run the job. Figure 2 shows a typical schedule pane.

Figure 2 – WordPress cron job schedule pane

The first set of items in this pane selects the scheduler used to trigger the backup activity. Three choices are presented: 1) manual scheduling (i.e., the admin must trigger the job manually), 2) using the WordPress cron facility to schedule the job (under this choice is the option to use EasyCron.com, a paid scheduling service), and 3) using a URL link that an external scheduler can reference to start the job.

The WordPress cron scheduler is the most convenient to use, since it is built into the WordPress functionality that underlies the webserver itself. Configuration for the job’s use of the WordPress cron scheduler is shown in Figure 2 in the section titled “Schedule”. Four scheduling options are given: 1) monthly, 2) weekly, 3) daily, and 4) hourly. The monthly and weekly options are configured by specifying a day (a day of the month or day of the week) and the hour and minute at which to start the job. The daily option allows the admin to specify the hour and minute at which to start the job and the hourly option allows the admin to specify the minute at which to start the job.

While the WordPress cron scheduler is convenient to use, it has a significant drawback. WordPress only runs when an HTTP request arrives from a browser. When this happens, WordPress checks to see if any WordPress cron jobs are scheduled to run and if so runs them. However, if no HTTP request arrives when a job managed by the WordPress cron scheduler is supposed to run, then the WordPress cron scheduler itself doesn’t run and therefore the job doesn’t run. While the live sites generally get enough traffic to make this problem moot, the test websites do not. Consequently, another scheduling option must be used to back up these sites.
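The effect of request-triggered scheduling can be illustrated with a small model. This is a simplified sketch, not WordPress code: it assumes that a due job runs at the first HTTP request arriving at or after its scheduled time. Times are minutes after midnight, and the request times are made-up sample data.

```python
from typing import Optional


def actual_run_time(due: int, request_times: list[int]) -> Optional[int]:
    """Return when a job due at `due` runs, or None if no request follows."""
    later = [t for t in sorted(request_times) if t >= due]
    return later[0] if later else None


due = 120  # job scheduled for 02:00

busy_site = [115, 121, 130]  # frequent visitors
quiet_site = [30, 600]       # one early visit, then nothing until 10:00

print(actual_run_time(due, busy_site))   # 121
print(actual_run_time(due, quiet_site))  # 600
print(actual_run_time(due, []))          # None
```

On the quiet site the job is delayed by eight hours, and with no requests at all it never runs, which matches the situation described for the test websites.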

IONOS server configuration is managed through the IONOS Mount Olive account. IONOS provides a configuration application under this account that allows its customers to manage various aspects of server behavior using a web browser. After logging into the account, the admin is presented with a web page displaying several choices for configuring the account. One of these is the “Hosting” choice. Clicking on this option takes the admin to a page that specifies several categories of configuration, including “Webspace”, “SFTP & SSH”, “Databases”, among others. At the bottom of the page is a list of “More features” and the first service listed is “Cron Job Manager”. Clicking on this text displays a page that allows the admin to create new cron jobs and also to modify existing cron jobs. Note that these jobs are not managed by the WordPress cron scheduler. Rather, they are managed by the underlying operating system cron facility. Therefore, these jobs run whether or not WordPress is triggered by an HTTP request.

Accessing the “Cron Job Manager” page displays all currently defined jobs as well as a button near the top right corner of the page that allows admins to create new cron jobs. Cron job configuration allows an admin to specify an HTTP URL that is referenced when the job runs as well as scheduling data. A typical specification of scheduling data is shown in Figure 3.

Figure 3 – IONOS cron job scheduling pane

The scheduling precision of an IONOS cron job is significantly less than that provided by the WordPress cron scheduler. In particular, after defining the day on which the job should run, the only time options available are: 1) Morning, 2) Afternoon, 3) Evening, and 4) Night. These comprise 6-hour periods during which the cron scheduler may run the job.

To use the IONOS cron scheduler, the backup plugin scheduling option denoted as “with a link” is used (see Figure 2). A URL is supplied for this option that, when referenced, runs the backup job. The IONOS cron job specifies this URL as one of its configuration options. When the IONOS cron job runs, it references this URL (using an HTTP GET operation), which triggers the WordPress cron scheduler, which in turn runs the WordPress cron job.
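What the external scheduler does when it references the link amounts to a plain HTTP GET. The sketch below shows this with Python's standard urllib module. The function name is hypothetical, and the real job URL is the one supplied by the plugin for the “with a link” option, so no URL format is assumed here. The fetch parameter exists only so the function can be exercised without network access.

```python
import urllib.request


def trigger_backup(job_url: str, fetch=urllib.request.urlopen) -> int:
    """Issue an HTTP GET against the job URL and return the HTTP status code."""
    if not job_url.startswith(("http://", "https://")):
        raise ValueError("job URL must be an HTTP or HTTPS URL")
    with fetch(job_url, timeout=30) as response:
        return response.status
```

A scheduler outside WordPress would call trigger_backup with the link copied from the plugin's schedule pane. A successful status code only indicates that the request was accepted, not that the backup completed.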

Dynamic DNS

One potential problem is related to the allocation of an IP address to the FTP complex firewall by the ISP DHCP server. As mentioned above, configuration of passive mode on the FTP server requires specification of the firewall IP address using the pasv_address configuration option. However, some ISPs periodically change the address assigned to the firewall. This complicates FTP server configuration, since providing a fixed IP address for the value of pasv_address is not possible.

One possible solution is to use Dynamic DNS (DDNS) to assign a DNS name to the firewall and provide that name for the value of pasv_address. A DDNS service tracks the firewall IP address and, when the address changes, updates the mapping between the DNS name assigned to the FTP complex and the firewall IP address.
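One way to detect that a fixed pasv_address value has gone out of date is to compare it with the address currently published under the DDNS name. The sketch below is an assumption about how such a check could look, not a documented part of this setup. The function name, the DNS name, and the sample addresses are hypothetical. The vsftpd documentation also describes a pasv_addr_resolve option for using a hostname in pasv_address; whether it suits this setup has not been verified here.

```python
import socket


def pasv_address_is_stale(configured_ip: str, ddns_name: str,
                          resolve=socket.gethostbyname) -> bool:
    """Return True if the DDNS name no longer maps to the configured address."""
    return resolve(ddns_name) != configured_ip


# Offline demonstration with a stand-in resolver (documentation-only addresses).
def fake_resolver(name: str) -> str:
    return "203.0.113.25"


print(pasv_address_is_stale("203.0.113.10", "ftp-complex.example.org", fake_resolver))
print(pasv_address_is_stale("203.0.113.25", "ftp-complex.example.org", fake_resolver))
```

With the default resolver the function performs a real DNS lookup, so a periodic job on the FTP server machine could use it to flag when the configuration needs updating.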

(Note: this problem will only occur if the FTP server is inside the Mount Olive IT network. Currently, this is not a problem. This section is simply a placeholder in case the problem arises. If so, a solution will be documented.)