How to incrementally mirror an FTP site on a regular schedule

Daily business data, like NAV, is often published on FTP servers that keep each file for only a limited time (a week, for example). We need to download the data for the day, and we also want the historical data to remain intact. Today I will show you how to download and mirror the data from an FTP server with wget without losing access to old data, and how to schedule the job with cron so that it runs periodically. On each run, only new or changed files are downloaded.

wget -m --retry-connrefused --password='password' ftp://login@address/ -o log

login -> FTP user name
password -> FTP password
address -> IP address or hostname of the FTP server
log -> Name of the log file that wget writes its messages to
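
The -m (mirror) option is what makes the download incremental. According to the wget manual it is currently equivalent to -r -N -l inf --no-remove-listing, that is, recursion with unlimited depth plus time-stamping, so a file is fetched again only when the server copy is newer than the local one (wget also compares sizes). In this mode wget does not remove local files that have disappeared from the server, which is why expired data stays in the mirror. The sketch below is only a toy model of that rule using two local directories, not wget's actual implementation; it compares timestamps only, and the function name mirror_dir and its two directory arguments are made up for illustration:

```shell
#!/bin/bash
# Toy model of an incremental, non-deleting mirror between two local
# directories. mirror_dir is a hypothetical helper, not part of wget.
mirror_dir() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    for f in "$src"/*; do
        [ -f "$f" ] || continue          # skip sub-directories and empty globs
        name=$(basename "$f")
        # Copy only if the file is missing locally or the source is newer.
        if [ ! -e "$dst/$name" ] || [ "$f" -nt "$dst/$name" ]; then
            cp -p "$f" "$dst/$name"      # -p keeps the source timestamp
            echo "copied $name"
        fi
    done
    # Nothing is ever removed from $dst, so expired files stay available.
}
```

Calling mirror_dir twice in a row on unchanged directories copies nothing the second time, and a file deleted from the source directory is still present in the destination afterwards.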

Save the above command in a file named mirror-data and make it executable:

chmod 755 mirror-data
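
A slightly fuller mirror-data file could look like the sketch below. It adds a shebang line and changes into a fixed directory before calling wget, because wget saves the mirror under the current directory and cron typically starts jobs in the user's home directory. MIRROR_DIR, its $HOME/ftp-mirror default, and the DRY_RUN switch are assumptions added for illustration; login, password and address are the same placeholders as above. By default the sketch only prints the command, so set DRY_RUN=0 once the real values are filled in:

```shell
#!/bin/sh
# mirror-data: sketch of a wrapper around the wget command shown earlier.
LOGIN='login'            # placeholder: FTP user name
PASSWORD='password'      # placeholder: FTP password
ADDRESS='address'        # placeholder: IP address or hostname
MIRROR_DIR="${MIRROR_DIR:-$HOME/ftp-mirror}"   # assumed target directory
DRY_RUN="${DRY_RUN:-1}"  # assumption: 1 = print only, 0 = really download

mirror_ftp() {
    if [ "$DRY_RUN" = "1" ]; then
        # Show what would run, without revealing the password.
        echo "cd $MIRROR_DIR && wget -m --retry-connrefused --password=... ftp://$LOGIN@$ADDRESS/ -o log"
        return 0
    fi
    mkdir -p "$MIRROR_DIR" || return 1
    # Run in a subshell so the caller's working directory is untouched.
    (
        cd "$MIRROR_DIR" || exit 1
        wget -m --retry-connrefused --password="$PASSWORD" \
            "ftp://$LOGIN@$ADDRESS/" -o log
    )
}

mirror_ftp
```

With this layout the log file is written inside the mirror directory and is overwritten on every run, as with the one-line version.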

Run crontab -e to edit your crontab (the list of jobs for the cron scheduler) and add the following line, replacing /path with the directory that contains mirror-data:

0 2 * * 1-5 /path/mirror-data

This runs the script every weekday (Monday to Friday) at 2:00 AM.
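
If you prefer not to edit the crontab interactively, the entry can also be added from a script. The sketch below is one way to do that, under stated assumptions: add_cron_line is a hypothetical helper, and /path/mirror-data is the same placeholder path as above. It reads an existing crontab on standard input and appends the line only when an identical line is not already there, so running it twice does not create a duplicate:

```shell
#!/bin/sh
# Cron entry from the text; /path/mirror-data is a placeholder path.
CRON_LINE='0 2 * * 1-5 /path/mirror-data'

# Read an existing crontab on stdin and write it to stdout,
# appending CRON_LINE only if an identical line is not already present.
# add_cron_line is a hypothetical helper name.
add_cron_line() {
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string, -x: whole-line match, -q: no output
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$CRON_LINE"; then
        printf '%s\n' "$CRON_LINE"
    fi
}
```

It could then be applied with crontab -l 2>/dev/null | add_cron_line | crontab - which replaces your crontab with the merged version; it is worth checking the result with crontab -l afterwards.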

Author: Angsuman Chakraborty

