Mirror a website using lftp
I'm helping an organization with some website work. But I'm not the only one working on the website, and there's no version control. I wanted an easy way to make sure all my files were up-to-date before I started work on one ... a way to mirror the website, or at least specific directories, to my local disk.
Normally I use rsync -av over ssh to mirror directories, but this website is on a server that only offers ftp access. I've been using ncftp to copy files up one by one, but although ncftp's manual says it has a mirror mode and I found a few web references to that, I couldn't find anything telling me how to activate it.
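For comparison, the rsync command I'd normally use looks something like this (hostname and paths are placeholders, not the real site):

rsync -av user@example.com:htdocs/ ~/web/webmirror/

The trailing slashes tell rsync to sync the directories' contents rather than nesting a copy of htdocs inside the destination.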
Making matters worse, there are some large files that I don't need to mirror. The first time I tried to use get * in ncftp to fetch one directory, it spent 15 minutes trying to download a huge PowerPoint file, then stalled and lost the connection. There are some big .doc and .docx files, too. And ncftp doesn't seem to have a way to exclude specific files.
Enter lftp. It has a mirror mode (with documentation, even!) which includes a -X option to exclude files matching specified patterns.
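Once connected, a mirror command with excludes looks something like this (the directory names here are placeholders for illustration):

mirror --only-newer -X '*.ppt' -X '*.doc*' -X '*.pdf' htdocs/somedir /home/you/web/webmirror/somedir

--only-newer skips files whose local copies are already current, and each -X pattern keeps matching files out of the transfer entirely.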
lftp also includes a -e option to pass commands -- like "mirror" -- to it on the command line. But the documentation doesn't say whether you can use more than one command at a time.
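In principle, semicolons ought to separate commands, so a one-shot mirror might look something like this (user, password, and host are placeholders, and the semicolon chaining is my guess, not something I verified):

lftp -u 'user,password' -e "mirror --only-newer htdocs/somedir $HOME/web/webmirror/somedir; bye" ftp.example.com

Since I couldn't confirm that from the documentation, it seemed safer to start up an lftp session and pass a series of commands to it.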
And that works nicely. Just set up the list of directories you want to mirror, and you can write a nice shell function to put in your .zshrc or .bashrc:
sitemirror() {
  commands=""
  for dir in thisdir thatdir theotherdir
  do
    commands="$commands
mirror --only-newer -vvv -X '*.ppt' -X '*.doc*' -X '*.pdf' htdocs/$dir $HOME/web/webmirror/$dir"
  done

  echo Commands to be run:
  echo $commands
  echo

  lftp <<EOF
open -u 'user,password' ftp.example.com
$commands
bye
EOF
}
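One subtlety: the newline embedded in the $commands string matters. Each mirror command has to land on its own line of the heredoc, since lftp, like a shell, reads one command per line.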
Super easy -- all I do is type sitemirror and wait a little.
Now I don't have any excuse for not being up to date.