Verifying a downloaded file's integrity requires some way of confirming that the local copy of the file is identical to the remote copy. When downloading a file from a MOVEit Transfer host, MOVEit Automation automatically sends a "hash" of the downloaded file to MOVEit Transfer to make sure the file it received is identical to the file on MOVEit Transfer. This "hash" is essentially a fingerprint of the file, constructed so that two different files are very unlikely to have the same hash.
FTP servers do not support a standard method for providing such an assurance for clients who download files from them. However, many FTP server operators opt to use a method which provides a partial assurance of the downloaded file's integrity. This method involves making available a file on the FTP server which contains a list of hashes of the other available files. The client can then check the hashes listed in that file against the files it downloaded.
Any hash algorithm can be used for this method, but the most frequently used is MD5. MOVEit Automation now supports checking downloaded files against an MD5 hash file on a source FTP server. The option is available on FTP and SSH hosts, and can be overridden on sources that use FTP or SSH hosts. When properly configured, MOVEit Automation checks for an MD5 hash file (normally named MD5SUM, although a different name can be supplied) and verifies its downloaded files against the hashes contained in it. If a file does not match the hash listed for it, MOVEit Automation generates an error and stops processing that file. If MD5 checking is set to Required, the same happens when a file has no hash listed for it.
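The checking rules described above can be sketched as a small shell function. This is only an illustration of the logic, not MOVEit Automation's implementation. The function name verify_download and the "optional" mode are assumptions made for the example, and the sketch assumes the md5sum and awk utilities are installed and that file names contain no backslashes or newlines.

```shell
# verify_download FILE HASHFILE MODE
# MODE is "required" or "optional" (hypothetical names for this sketch).
# Returns 0 if the file passes the check, 1 if it should be treated as an error.
verify_download() {
    file=$1
    hashfile=$2
    mode=$3
    name=$(basename "$file")

    # Look up the hash listed for this file name. md5sum writes each line
    # as "HASH  NAME", or "HASH *NAME" for files read in binary mode.
    expected=$(awk -v n="$name" '
        { listed = $0; sub(/^[0-9a-fA-F]+ [ *]/, "", listed) }
        listed == n { print tolower($1); exit }' "$hashfile")

    if [ -z "$expected" ]; then
        # No hash listed: an error only when checking is required.
        if [ "$mode" = "required" ]; then
            echo "ERROR: no hash listed for $name" >&2
            return 1
        fi
        echo "SKIPPED: no hash listed for $name"
        return 0
    fi

    # Hash the downloaded copy and compare it with the listed value.
    actual=$(md5sum < "$file" | awk '{ print $1 }')
    if [ "$actual" != "$expected" ]; then
        echo "ERROR: hash mismatch for $name" >&2
        return 1
    fi
    echo "OK: $name"
}
```

For example, after downloading a file and the hash file into the current directory, a call such as "verify_download somefile.zip MD5SUM required" (somefile.zip being a placeholder name) would report an error if the file's hash is missing from the list or differs from it.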
Note that because this method relies on downloading a list of file hashes, it cannot provide complete integrity verification, since there is no way for MOVEit Automation to make sure that the MD5 hash file was not altered in transit. It does, however, provide more verification than a normal FTP transfer, and under normal circumstances, will provide a defense against files that are somehow corrupted during transport.
Setting Up MD5 Hash Files on an FTP Server
The majority of FTP server operators that provide MD5 files use a program called "md5sum" to generate those files. This program takes a list of files and generates a list of MD5 hashes for those files. The output is then redirected to a file, normally called MD5SUM, which resides in the same directory as the files it lists. FTP operators wishing to provide MD5 hash files for a MOVEit Automation client should use this program. It is included in most UNIX distributions today and is widely available on the internet for other systems. Use your favorite search engine to find a copy for your specific system if you don't already have it installed.
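To give a sense of what such a hash file contains, the following sketch runs md5sum on a single scratch file. The file name sample.txt and the temporary directory are made up for the example, and the exact output layout assumes the GNU version of md5sum.

```shell
# Create a scratch directory holding one sample file (hypothetical name).
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/sample.txt"

# md5sum prints one line per file: a 32-character hexadecimal hash,
# two spaces, and then the file name exactly as it was given.
(cd "$workdir" && md5sum sample.txt)
```

Because the file name is recorded as given on the command line, running md5sum from inside the content directory keeps the listed names free of directory prefixes.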
To generate an MD5 hash file with md5sum, run it with the list of files you wish to create hashes for, and redirect its output to a file called MD5SUM. For example, to create a hash file for all the files in the /ftproot/products directory on a server, issue the following commands:
cd /ftproot/products
md5sum * > MD5SUM
Issuing a command like this for all important FTP content directories on a frequent basis will help provide added assurance that the files downloaded by clients such as MOVEit Automation are identical to the files on your FTP server.
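When a command like this is rerun in a directory that already contains an MD5SUM file, the * pattern also matches MD5SUM itself, so the new list includes a meaningless hash for the hash file. The pattern also matches any subdirectories, which md5sum reports as errors. One possible way to avoid both, offered as a sketch rather than a required procedure, is a small function that skips those entries and replaces the hash file only once the new list is complete. The function name regenerate_md5sum and the temporary file name are assumptions made for this example, which also assumes GNU md5sum and mktemp.

```shell
# regenerate_md5sum DIR
# Rebuilds DIR/MD5SUM from the regular files directly inside DIR.
regenerate_md5sum() {
    dir=$1
    (
        cd "$dir" || exit 1

        # Write to a hidden temporary file first so that clients never
        # download a half-written MD5SUM. The * pattern below does not
        # match names that begin with a dot.
        tmp=$(mktemp ./.md5sum.XXXXXX) || exit 1

        for f in *; do
            # Skip subdirectories and anything else that is not a regular file.
            [ -f "$f" ] || continue
            # Skip the hash file itself.
            [ "$f" = "MD5SUM" ] && continue
            md5sum -- "$f"
        done > "$tmp"

        # mktemp creates the file readable only by its owner. Relax that so
        # the FTP server can serve it (an assumption; adjust to local policy).
        chmod 644 "$tmp"
        mv -- "$tmp" MD5SUM
    )
}

# Example call for the directory used above:
# regenerate_md5sum /ftproot/products
```

A function like this could be called from a scheduled job for each important content directory, which fits the advice to regenerate the hash files on a frequent basis.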