00:59:49 #startmeeting Fedora Classroom - intro to rsync
01:00:08 #topic Intro
01:00:44 The idea for this class came up when nirik was doing his session on preupgrade
01:00:57 * nirik waves
01:01:05 There was the suggestion there (which is a good one) to back up your stuff prior to doing the upgrade
01:01:06 hi
01:01:49 But there weren't any ways presented to do so. I think that rsync is an easy way to back stuff up. (and do lots of other things)
01:02:05 So that's the first topic.
01:02:11 like keep mirrors in sync (usually ;-) )
01:02:15 #topic Using rsync for backup
01:02:25 onekopaka: that's another topic :)
01:02:53 the basic purpose in life of rsync, as you might guess by its name, is to keep two things in sync.
01:03:17 I'm just going to discuss the client side of rsync in this class, setting up an rsync server is beyond the scope
01:03:28 (but look at man rsyncd.conf if you're interested)
01:03:59 so the easiest form of rsync is to back up via ssh
01:04:22 all you need is another machine to ssh to, and you can sync files between them
01:04:30 no rsync server is required.
01:04:49 both machines obviously need to have ssh installed, and rsync :)
01:05:18 and the form to do that would be rsync -av /some/dir/ user@machine:/some/dir
01:05:29 I used a trailing slash in the source spec of that rsync command
01:05:52 that's important, as it says to consider all of the files in the directory, rather than the directory itself.
01:06:12 did i lose anyone there?
01:06:21 question
01:06:26 sure
01:06:44 with the remote location, are single quotes needed around the destination?
01:06:57 i've always used user@machine:'/some/dir'
01:07:01 not unless it has some weird characters in it.
01:07:08 cool
01:07:22 i'm done
01:07:46 the -a flag is actually a combination flag for several others
01:07:56 basically it says to "archive" the directory
01:08:04 preserve timestamps, permissions, owner, etc
01:08:35 from the rsync manpage: archive mode; equals -rlptgoD (no -H,-A,-X)
01:09:17 the -v is just for verbose
01:09:28 it outputs all the files as it does it.
01:10:28 if there are already files in the destination by the same name, then they are only copied if they are different.
01:10:52 and the beauty of rsync is that only the differences are copied, not the entire file again.
01:11:04 any questions/comments?
01:11:43 guess not.
01:11:53 am I correct to understand it is a unidirectional sync?
01:12:06 BounceCat: it is.
01:12:13 i.e. one side is copied to the other.
01:12:25 but you can make the source the remote machine if you want.
01:12:34 ok
01:12:47 just flip the locations
01:12:51 yeah
01:12:57 put the "user@...." stuff first
01:13:22 #topic other uses for rsync
01:14:03 There are a number of other uses for rsync as well. If you have multiple webservers and a master repository of content for them, you could use rsync in cron in order to keep them in sync.
01:14:38 In Fedora Infrastructure, one of the uses we have is to gather httpd logs from multiple machines and aggregate them into a single location
01:15:14 for web servers, you'll need to remember to modify etags handling if you rsync. (default etag includes inum info, which is not valid across hosts.)
01:16:00 fenris02: etags have always been mysterious for me, maybe you can enlighten me how they're used after class?
01:16:06 http://developer.yahoo.net/blog/archives/2007/07/high_performanc_11.html
01:16:49 basically, any time i need to keep two things in sync, or copy things from one place to another, i generally use rsync.
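
The trailing-slash rule and the "only copied if they are different" behaviour from the backup topic can be tried without a second machine, because rsync also accepts two local paths. What follows is a minimal sketch and not something shown in the session: the temporary directory layout and the file names are assumptions made up for illustration, and the -i (--itemize-changes) flag is used only to list what a second run would update.

```shell
#!/bin/sh
# Local demo of "rsync -av" and the trailing-slash rule.
# Everything lives in a throwaway temp directory; all names are hypothetical.
set -eu

# Skip quietly if rsync is not installed on this machine.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync not found, skipping demo" >&2
    exit 0
fi

work=$(mktemp -d)
mkdir -p "$work/src/docs"
echo "hello" > "$work/src/docs/a.txt"

# Trailing slash on the source: copy the contents of docs/ into with_slash/
rsync -av "$work/src/docs/" "$work/with_slash"

# No trailing slash: copy the directory itself, giving without_slash/docs/
rsync -av "$work/src/docs" "$work/without_slash"

# Run the first sync again with nothing changed. -i itemizes updated
# entries, so an unchanged tree is expected to produce an empty list.
second_run=$(rsync -ai "$work/src/docs/" "$work/with_slash")
echo "items updated on second run: '${second_run}'"

# Inspect the two layouts side by side.
find "$work/with_slash" "$work/without_slash" -type f
```

Swapping either local path for user@machine:/some/dir gives the ssh form from the session. Flipping the two arguments reverses the direction of the copy.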
01:17:09 You can even use rsync locally, I just did that today, when setting up a home directory for a user
01:17:22 both the source and destination can be local filesystem paths
01:17:47 the only restriction is that you can't use two remote paths (and I *have* had use for that, but maybe I'm unique :) )
01:18:46 jds2001: well in that case, why not just ssh into the remote and rsync locally there?
01:18:59
01:19:07 onekopaka: it's been more like user@host:/path user@otherhost:/path
01:19:17 jds2001: oh.
01:19:22 and host and otherhost couldn't talk directly to each other.
01:19:30 jds2001: well.
01:19:33 jds2001: that sucks.
01:19:43 i think i tried that once for poo and giggles
01:20:16 anyhow.
01:20:20 ssh user@host1 "tar cfz -" | ssh user@host2 "cd /path/to/; tar xzf -"
01:20:29 fenris02: exactly :)
01:20:43 nasty. rsync to local, and then out again is far nicer if you have space
01:21:24 the uses for rsync are as limitless as your imagination :)
01:21:55 #topic Syncing with public rsync servers
01:22:20 There are servers that run a server that serves the rsync protocol without need for an account on them
01:22:38 that's the server side I'm not covering here (but can cover later if there's interest)
01:22:55 mainly, these are upstream servers for mirrors.
01:23:42 so the source side of those would be rsync://host/some/dir
01:24:15 if you eliminate a source dir, you will get a list of 'modules' that the server offers
01:24:42 im sorry, if you eliminate a destination dir.
01:24:46 and a source dir :)
01:25:12 so if I do rsync rsync://mirrors.tummy.com
01:25:14 I get
01:25:24 [jds2001@rugrat convert2]$ rsync rsync://mirrors.tummy.com
01:25:24 fedora Fedora - RedHat community project
01:25:24 epel Fedora EPEL - RedHat community project
01:25:24 fedora-enchilada Fedora - RedHat community project
01:25:24 pub The full mirror
01:25:48 those are the modules defined in nirik's rsyncd.conf
01:26:01 * nirik nods
01:26:40 If I wanted to sync fedora-enchilada (which is the complete content of the master mirrors), I could do the following (and I'll explain all of these switches):
01:27:48 rsync -avH --delay-updates --delete --delete-after rsync://mirrors.tummy.com/fedora-enchilada/ /some/path
01:28:02 some of those are important when syncing mirrors.
01:28:09 -H means to preserve hardlinks
01:28:19 everyone know what hardlinks are?
01:28:30 I guess.
01:28:39 i take that as no :)
01:28:58 they are ways for multiple directory entries to refer to the same inode on a filesystem.
01:29:11 I'm not sure about the difference between hard and symbolic.
01:29:17 that's my issue.
01:29:31 oh, good question
01:29:50 so a symbolic link is simply a pointer to another location on the filesystem.
01:30:00 it has an inode of its own.
01:30:13 oh, okay.
01:30:23 the big difference is that hardlinks are restricted to the same filesystem (since they are directory entries that point to the same inode)
01:30:37 and symbolic links can point anywhere you please.
01:30:46 (including places that don't exist)
01:31:18 the big use of hardlinks is identical files in multiple places
01:32:03 for example, multilib stuff. The i386 and x86_64 versions are in the x86_64 directory, but the i386 version is also in the i386 directory
01:32:11 on a fedora mirror, the space savings are huge
01:32:12 identical selinux labels, ownership and permissions.
01:33:12 [jstanley@monster fedora]$ du -sh .
01:33:12 302G .
01:33:19 [jstanley@monster fedora]$ du -shl .
01:33:20 398G .
01:33:44 wow
01:33:44 so hardlinks on my mirror are saving me 96GB of actual disk.
01:34:01 * onekopaka doesn't have 302GB of space anywhere.
01:34:46 onekopaka: and that's not everything :)
01:34:59 nirik: how big is everything about?
01:35:14 * jds2001 doesn't carry debug, ppc
01:35:16 not sure off hand.
01:35:30 no biggie
01:35:45 so the next thing says --delay-updates
01:36:15 what that does is delay updating any files until the rsync run is completed. This keeps your mirror consistent during the sync process
01:36:21 either all updated, or all not.
01:37:10 --delete and --delete-after say to delete any files that are on the local filesystem that aren't on the remote, but only after the sync has completed.
01:37:25 (the default is to do it beforehand to save space)
01:37:58 and then the source and destination.
01:38:12 I mentioned that I didn't carry everything on my mirror.
01:38:25 I do that through the use of excludes.
01:38:48 rsync --exclude-from=/home/jstanley/mirrorsync/fedora-excludes -avH --delay-updates --delete --delete-after rsync://mirror.hiwaay.net/fedora-linux-updates/ /mirror/fedora/updates
01:38:57 that's my real rsync line from my mirror
01:39:24 so --exclude-from= names a file with a newline separated list of things that I wish to exclude
01:39:33 no -c ? extraneous?
01:40:13 im not sure what -c does
01:40:34 pj
01:40:35 oh
01:40:36 -c, --checksum skip based on checksum, not mod-time & size
01:40:48 that creates load on the rsync server
01:40:55 and time and size is sufficient.
01:41:03 some rsync servers actually forbid -c
01:41:26 (I ran into that when I found I had a bunch of corrupted files on my mirror)
01:41:37 I quickly changed upstream mirrors obviously :)
01:41:45 good to know. thanks.
01:42:43 so in my exclude file, I have things like
01:42:44 ppc/
01:42:44 ppc64/
01:43:02 so any directory called ppc will be excluded
01:43:03 * nirik tsks. No ppc machines? :)
01:43:30 nirik: i actually have one on the shelf :)
01:43:39 * jds2001 may have to get it down sometime :()
01:43:40 :)
01:43:45 dual 1.25GHz G4 :)
01:44:15 much better than mine. Feel free to send it my way. :)
01:44:45 * nirik notes to get back to the topic that many packages build on librsync to provide rsync functionality inside them. rdiff-backup, etc.
01:45:36 any questions?
01:47:17 well that's all that I had....
01:47:52 next up is mether talking about bodhi and koji, sure to be interesting
01:47:58 but for now...
01:48:01 #endmeeting
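
The mirror-sync switches from the session (-H, --delete with --delete-after, and --exclude-from) can be seen working together on a small local tree. This is only a sketch under stated assumptions: the "upstream" and "mirror" directories, the package file names, and the exclude file location are all invented for the demo, and --delay-updates is left out because it only matters while a long-running sync is in progress.

```shell
#!/bin/sh
# Local stand-in for a mirror sync using -H, --delete-after and --exclude-from.
# Directory and file names are hypothetical, chosen only for this demo.
set -eu

# Skip quietly if rsync is not installed on this machine.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync not found, skipping demo" >&2
    exit 0
fi

work=$(mktemp -d)
src="$work/upstream"
dst="$work/mirror"
mkdir -p "$src/i386" "$src/x86_64" "$src/ppc" "$dst"

# One package hardlinked into two directories (the multilib case).
echo "package payload" > "$src/i386/foo.i386.rpm"
ln "$src/i386/foo.i386.rpm" "$src/x86_64/foo.i386.rpm"
echo "ppc payload" > "$src/ppc/foo.ppc.rpm"

# A file that exists only on the mirror side; --delete-after removes it.
echo "stale" > "$dst/old-file.txt"

# Newline-separated exclude list, one pattern per line.
cat > "$work/excludes" <<'EOF'
ppc/
ppc64/
EOF

rsync -avH --delete --delete-after \
    --exclude-from="$work/excludes" \
    "$src/" "$dst"

# Print inode numbers: with -H both copies of foo.i386.rpm share one inode.
ls -i "$dst/i386/foo.i386.rpm" "$dst/x86_64/foo.i386.rpm"
```

Without -H the two .rpm paths would arrive as separate files and take twice the space, which is the effect the du -sh versus du -shl comparison measures on a real mirror.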