What can I do with ZFS?



  • On my FreeBSD server, I have a 1TB drive with a ZFS file system, configured with the defaults during installation.

     # zfs list
    NAME                 USED  AVAIL  REFER  MOUNTPOINT
    zroot               4.33G   895G    96K  /zroot
    zroot/ROOT          1.61G   895G    96K  none
    zroot/ROOT/default  1.61G   895G  1.61G  /
    zroot/tmp            120K   895G   120K  /tmp
    zroot/usr           2.72G   895G    96K  /usr
    zroot/usr/home      1.96G   895G  1.96G  /usr/home
    zroot/usr/ports      781M   895G   781M  /usr/ports
    zroot/usr/src         96K   895G    96K  /usr/src
    zroot/var           1.72M   895G    96K  /var
    zroot/var/audit       96K   895G    96K  /var/audit
    zroot/var/crash       96K   895G    96K  /var/crash
    zroot/var/log       1.19M   895G  1.19M  /var/log
    zroot/var/mail       120K   895G   120K  /var/mail
    zroot/var/tmp        136K   895G   136K  /var/tmp
    
    

    Through a series of unlucky circumstances, I've recently happened upon a fresh 3TB drive. I'll be formatting it to ZFS and adding it to the system.

     # geom disk list
    Geom name: ada0
    Providers:
    1. Name: ada0
       Mediasize: 3000592982016 (2.7T)
       Sectorsize: 512
       Stripesize: 4096
       Stripeoffset: 0
       Mode: r0w0e0
       descr: WDC WD30EZRX-00D8PB0
       lunid: 50014ee2b6e88a0e
       ident: WD-WCC4N1XYTRYS
       fwsectors: 63
       fwheads: 16
    
    Geom name: ada1
    Providers:
    1. Name: ada1
       Mediasize: 1000204886016 (932G)
       Sectorsize: 512
       Mode: r1w1e3
       descr: Hitachi HDS721010CLA330
       lunid: 5000cca39cc5a08e
       ident: JP2940N10DBS6V
       fwsectors: 63
       fwheads: 16
    
    

    But how exactly should I use it?

    I've read a little bit about ZFS and have a vague idea about its capabilities and commands. Obviously, I want to use my new drive to ensure safety of my server data.

    So, is snapshotting + backup the way to go? Or maybe I want RAID? RAID-Z?

    Should I add everything into one pool, or should I create a new pool for my new drive? If I add the drive into my existing pool, and set the "copies" option to 2, will that help me if one of the drives dies?

    Do I need to do something extra to access ZFS' data integrity features (anti-bitrot stuff)? Any backup tool I can use instead of setting up crons?

    Basically, if there are any ZFS aficionados around, please share some tips on how to use this fancy new thing.



  • I know a fair amount about ZFS, so I'll try to answer all of the basic questions.

    Two of the most useful features for simple use at home are compression and snapshots.

    On Linux, compression is off by default, but I'm not sure about BSD. You can check it with zfs get compression <pool> (zfs get all <pool> will show a bunch of properties). I think the standard recommended setting is lz4.

    The amount of space taken up by snapshots depends on how much the data changes. Snapshots are very cheap if the data doesn't change much. I do daily snapshots, but I use that disk mostly for media, so changes aren't common. You can also set up scripts to delete old snapshots (such as keeping only one snapshot per week once they're over a month old).

    You can add the disk to the existing pool and get the total amount of space of both disks combined, but you won't have any redundancy. I can't remember offhand if you can add the new disk as a mirror of the existing disk, but if you do, you'll only have as much space as you have on the smaller disk. raidz, which is basically RAID 5, won't help with fewer than 3 disks anyway.

    Another major feature of ZFS is send/receive, which lets you copy snapshots from one pool to another. You can also do incremental send/receive, so you only copy the differences between two snapshots. To send/receive over a network, you just pipe the send over SSH (which runs the receive).
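The retention idea mentioned above (keep every snapshot for a month, then only one per week) could be sketched roughly like this. It is only a sketch under a few assumptions: snapshots are named dataset@YYYY-MM-DD, a "week" is a 7-day bucket counted from 1970-01-01 rather than a calendar week, and the function name snapshots_to_thin is made up for illustration. It only prints candidates and never destroys anything itself.

```shell
# stdin:  snapshot names like dataset@YYYY-MM-DD, one per line.
# $1:     the current date as YYYY-MM-DD.
# stdout: snapshots that are more than 30 days old and are not the oldest
#         one in their 7-day bucket, i.e. candidates for "zfs destroy".
snapshots_to_thin() {
    sort | awk -v today="$1" '
        # Days since 1970-01-01 for a Gregorian calendar date.
        function days(y, m, d,    era, yoe, doy) {
            if (m <= 2) y -= 1
            era = int(y / 400)
            yoe = y - era * 400
            doy = int((153 * (m > 2 ? m - 3 : m + 9) + 2) / 5) + d - 1
            return era * 146097 + yoe * 365 + int(yoe / 4) - int(yoe / 100) + doy - 719468
        }
        function daynum(s,    p) {
            split(s, p, "-")
            return days(p[1] + 0, p[2] + 0, p[3] + 0)
        }
        BEGIN { now = daynum(today) }
        {
            n = split($0, part, "@")
            # Names that are not plain dates (e.g. 2016-03-16-2) are left alone.
            if (n != 2 || part[2] !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/) next
            dn = daynum(part[2])
            if (now - dn <= 30) next            # recent: keep every snapshot
            key = part[1] SUBSEP int(dn / 7)    # one bucket per dataset and week
            if (key in seen) print              # bucket already has its keeper
            else seen[key] = 1
        }
    '
}
```

The input would typically come from something like `zfs list -H -o name -t snapshot`, and the output should be reviewed before anything is passed to `zfs destroy`.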



  • @Dragnslcr said:

    On Linux, compression is off by default, but I'm not sure about BSD. You can check it with zfs get compression <pool> (zfs get all <pool> will show a bunch of properties). I think the standard recommended setting is lz4.

    # zfs get compression zroot
    NAME   PROPERTY     VALUE     SOURCE
    zroot  compression  lz4       local
    
    

    Seems BSD got this covered.

    Is this a good idea, though? I never use compression on other FS-s because of speed. Basically, space is cheap, speed is at a premium.

    Are things different on ZFS?

    @Dragnslcr said:

    The amount of space taken up by snapshots depends on how much the data changes. Snapshots are very cheap if the data doesn't change much. I do daily snapshots, but I use that disk mostly for media, so changes aren't common. You can also set up scripts to delete old snapshots (such as keeping only one snapshot per week once they're over a month old).

    @Dragnslcr said:

    Another major feature of ZFS is send/receive, which lets you copy snapshots from one pool to another. You can also do incremental send/receive, so you only copy the differences between two snapshots. To send/receive over a network, you just pipe the send over SSH (which runs the receive).

    Yeah.

    Some of the workflows I've been reading about are

    • create snapshot
    • send/receive snapshot into a different pool on a different machine (pipe over network if needed)
    • delete snapshots locally
    • organize backup rotation on the other end

    All this sounds like a lot of error-prone scripting. Need to look into some utility for this.

    @Dragnslcr said:

    You can add the disk to the existing pool and get the total amount of space of both disks combined, but you won't have any redundancy. I can't remember offhand if you can add the new disk as a mirror of the existing disk, but if you do, you'll only have as much space as you have on the smaller disk. raidz, which is basically RAID 5, won't help with fewer than 3 disks anyway.

    I'm leaning towards just creating a new pool for the new disk. Then I can cross-backup stuff between disks as needed.

    That sounds like a pedestrian solution, but at least I'll know what's going on and that it'll work.



  • @cartman82 said:

    speed is at a premium

    CPU time is a lot cheaper than I/O time when you're talking about the differences caused by compression. Unless you're using xz -9e or something.



  • @ben_lubar said:

    CPU time is a lot cheaper than I/O time when you're talking about the differences caused by compression. Unless you're using xz -9e or something.

    Hmmm... good point.



  • Unless the machine is using almost 100% of the CPU all the time, compression is almost always worth it. ZFS is good at it, so you'll get a pretty good compression ratio with probably no noticeable performance hit.

    That snapshot workflow sounds pretty reasonable. I haven't bothered setting up scripts to delete old snapshots yet, but like I said, the ones I have barely take up any disk space.


  • @Dragnslcr said:

    simple use at home are compression and snapshots.

    👍

    @cartman82 said:

    Basically, space is cheap, speed is at a premium.

    Compressing it is usually also cheap(ish), especially if you're not doing much on your server.

    @cartman82 said:

    Need to look into some utility for this.

    I use FreeNAS, which takes care of almost all of this for you.

    As an added benefit, you should be able to import your existing ZFS pools during the install, and the install can happen to a USB flash drive as well, so if it doesn't work out for you, just unplug and you're back to whatever you had before.

    I'm not going to fanboy FreeNAS too much, but since it's the only BSD I've worked with so far:

    • Automatic snapshot generation and cleaning
    • Automatic ZFS sending to another FreeNAS box (once you generate SSH keys etc).
    • Most everything you would want to do is abstracted behind a GUI, so it's slightly more newb-friendly (👋)
    • Other fun toys you can do, in addition to all the NAS-y things like Samba and AFP shares.

    Back on topic though, it's comparable to LVM (I think) but for BSD.



  • Here's what I did so far.

    To describe the initial situation: I had my /usr/home directory (the equivalent of /home on Linux) as one dataset in my zroot pool. Inside, I had my /usr/home/cartman and /usr/home/git directories, which hold the precious data I want to preserve. I also had a bare HDD attached to the system. See the listings in the OP.

    First, I initialized a new pool on my empty drive.

    # zpool create storage /dev/ada0
    

    My new disk is now formatted and available under /storage.

    I set the compression, as I was shown it is a good idea:

    # zfs set compression=lz4 storage
    

    Then I created a new dataset which I'll use for backups.

    # zfs create storage/backup
    
     # zfs list -r storage
    NAME                       USED  AVAIL  REFER  MOUNTPOINT
    storage                   1.96G  2.63T    96K  /storage
    storage/backup            1.96G  2.63T   176K  /storage/backup
    

    A dataset is basically the equivalent of a partition, only much lighter and hierarchical. ZFS can only work its magic at the level of datasets, not directories or files. That's why you want to create plenty of these, to get the level of granularity you need.

    So the problem now was that my /usr/home/git directory wasn't a dataset, but an ordinary directory inside the zroot/usr/home dataset. And there doesn't seem to be an automated way to convert a nested directory into a nested dataset.

    So I did something like

    # mv /usr/home/git /tmp
    # zfs create zroot/usr/home/git
    # rsync -a /tmp/git/ /usr/home/git
    # rm -r /tmp/git
    

    ...Only with a lot more faffing around and mistakes.
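A slightly more cautious variant of the steps above can be written as a dry-run helper that only prints the commands. This is a sketch, not what was actually run in the thread: the function name plan_dir_to_dataset and the ".pre-dataset" suffix are made up for illustration. Setting the old directory aside next to its original location keeps the move inside the same dataset, so it is a plain rename, whereas moving it to /tmp copies everything because /tmp is its own dataset (zroot/tmp in the listing).

```shell
# Print (do not run) the steps for turning an existing directory into its own
# dataset. $1 = dataset to create, $2 = the directory it will be mounted on.
plan_dir_to_dataset() {
    dataset=$1
    dir=${2%/}                  # drop a trailing slash, if any
    stash="$dir.pre-dataset"    # hypothetical name for the set-aside copy

    echo "mv $dir $stash"           # rename within the parent dataset
    echo "zfs create $dataset"      # new dataset mounts at the old path
    echo "rsync -a $stash/ $dir/"   # copy the contents into the new dataset
    echo "rm -r $stash"             # only after checking the copy
}
```

For example, `plan_dir_to_dataset zroot/usr/home/git /usr/home/git` prints the four commands, which can then be run one at a time and checked along the way.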

    Anyway, after a similar transformation with my home folder, I ended up with this:

    # zfs list -r zroot
    NAME                     USED   AVAIL  REFER  MOUNTPOINT
    zroot                    4.34G   895G    96K  /zroot
    ...
    zroot/usr/home           1.96G   895G  4.73M  /usr/home
    zroot/usr/home/git       1.96G   895G  1.96G  /usr/home/git
    zroot/usr/home/cartman   172K    895G   172K  /usr/home/cartman
    

    So now I could create my first snapshot...

    # zfs snapshot zroot/usr/home/git@2016-03-16
    

    ... and send it to my backup directory on the other drive.

    # zfs create storage/backup/git
    # zfs send zroot/usr/home/git@2016-03-16 | zfs recv storage/backup/git/daily
    

    It seems I got my backup all right:

    # zfs list -t snapshot
    NAME                                  USED  AVAIL  REFER  MOUNTPOINT
    storage/backup/git/daily@2016-03-16    72K      -  1.96G  -
    zroot/usr/home/git@2016-03-16            0      -  1.96G  -
    

    So theoretically, I put this in cron, and I'm good to go?

    Not sure. I'm still confused. Will this create a new snapshot every time in the destination? Should I send incremental snapshots and send that? Or just overwrite everything each time?

    Also, what if my BSD + ZFS go kaboom? I wouldn't mind having a copy of this stuff in some more portable format. I'll probably handle that when I add samba sharing.



  • @cartman82 said:

    Also, what if my BSD + ZFS go kaboom? I wouldn't mind having a copy of this stuff in some more portable format. I'll probably handle that when I add samba sharing.

    Since you already have rsync installed, try rsnapshot. I really liked it when I used it.


  • @cartman82 said:

    Should I send incremental snapshots and send that?

    That would definitely be quicker, depending on usage.
    For a profile measuring in the single-digit gigs, it probably won't matter too much.

    @cartman82 said:

    Also, what if my BSD + ZFS go kaboom?

    Yes, that's the point of redundancy with backups and raid. Eventually I'll have enough bits to build a second server to be a backup target for the current one.

    Things were really scary when I had only one disk (over USB) as the only location for a lot of personally critical data...

    Like @Captain said, there's tools to transmit and sync your tree to more canonical systems if needed. IIRC you could even technically mount ext2 disks in your system and rsync it to that too.

    It just depends on what you want your recovery to be: swap out a failed disk (raid), download your snapshots (zfs snapshots), recreate all of your datasets on a fresh zfs pool and rsync it back over (backup held in portable format only).



  • Here's more experimenting.

    I created another snapshot.

    # zfs snapshot zroot/usr/home/git@2016-03-16-2
    

    Then tried to pave over the old one

    # zfs send zroot/usr/home/git@2016-03-16-2 | zfs recv storage/backup/git/daily
    
    cannot receive new filesystem stream: destination 'storage/backup/git/daily' exists
    must specify -F to overwrite it
    warning: cannot send 'zroot/usr/home/git@2016-03-16-2': Broken pipe
    

    Ok...

    # zfs send zroot/usr/home/git@2016-03-16-2 | zfs recv -F storage/backup/git/daily
    
    cannot receive new filesystem stream: destination has snapshots (eg. storage/backup/git/daily@2016-03-16)
    must destroy them to overwrite it
    warning: cannot send 'zroot/usr/home/git@2016-03-16-2': Broken pipe
    

    So, this obviously isn't the way it was designed to work.

    # zfs send -i 2016-03-16 zroot/usr/home/git@2016-03-16-2 | zfs recv -F storage/backup/git/daily
    

    This worked like a charm and is obviously the way to go. The -F switch rolls the destination back to its most recent snapshot before receiving, so the backup gets paved over even if you make changes to the destination.

    Unfortunately, there doesn't seem to be a relative reference syntax like with git (eg. send changes since the previous snapshot), so cron will have to be a bit more clever than it's ideal.
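One way to get a "previous snapshot" reference without hard-coding a name is to ask ZFS for the snapshot list sorted by creation time and take the last entry. The helper below is a small sketch of that idea; the function name latest_snapshot_name is made up for illustration, and it only does the text processing, so it can be tried without touching a pool.

```shell
# Read snapshot names sorted oldest first (one per line) on stdin and print
# the short name (the part after "@") of the newest one. Prints nothing if
# the input is empty.
latest_snapshot_name() {
    tail -n 1 | sed 's/^.*@//'
}

# Intended use (not executed here):
#   prev=$(zfs list -H -o name -t snapshot -s creation -d 1 storage/backup/git/daily \
#          | latest_snapshot_name)
#   zfs send -i "$prev" zroot/usr/home/git@2016-03-16-2 | zfs recv -F storage/backup/git/daily
```

Taking the name from the destination rather than the source means the incremental always starts from a snapshot the backup side really has.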


  • @cartman82 said:

    cron will have to be a bit more clever than it's ideal.

    Yeah, IIRC the FreeNAS interface does this a bit more dynamically through python scripts.

    You're walking the path of others, is it for the experience?



  • @Tsaukpaetra said:

    You're walking the path of others, is it for the experience?

    I'd be happy to take an automated ZFS backup solution. Haven't looked much into it yet, but everything so far seems pretty ad-hoc and unpolished. @captain's thing is more for doing off-site backups, to a different server, not really this zfs snapshot wrangling.


    BTW,

     # zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    storage/backup/git/daily@2016-03-16       8K      -  1.96G  -
    storage/backup/git/daily@2016-03-16-2     8K      -  1.96G  -
    storage/backup/git/daily@2016-03-16-3    72K      -  1.96G  -
    zroot/usr/home/git@2016-03-16              0      -  1.96G  -
    zroot/usr/home/git@2016-03-16-2            0      -  1.96G  -
    zroot/usr/home/git@2016-03-16-3            0      -  1.96G  -
    
    

    This will quickly become a problem without some culling. On both sides.

    Also, why are only snapshots on the backup drive taking space?


  • @cartman82 said:

    Haven't looked much into it yet, but everything so far seems pretty ad-hoc and unpolished.

    Like this?

    (Huh, apparently none of my iSCSI clients have been changing anything on their disks for a while...)
    (Also, I'm not replicating these snapshots anywhere, yet, but that column would indicate its status if I were)

    @cartman82 said:

    zfs snapshot wrangling

    I think you can configure snapshot wrangling to target the local machine actually, so long as it's to a different pool.

    @cartman82 said:

    This will quickly become a problem without some culling. On both sides.

    All the better to have an established system manage it for you. ;)

    @cartman82 said:

    Also, why are only snapshots on the backup drive taking space?

    They're probably containing the metadata for the reference? Can't recall.



  • FreeNAS seems a bit heavyweight if I'm not setting up a NAS. But...

    Oh God, no. They're gonna do that whole OSS-y thing with initial letter branding (like KDE and GNOME), aren't they?



  • @cartman82 said:

    Oh God, no. They're gonna do that whole OSS-y thing with initial letter branding (like KDE and GNOME), aren't they?

    Are you fucking kidding me? What are you, 7 years old?

    Also note they are actually trying to develop a support based business around this thing. I bet zey zont have many clientz.


  • @cartman82 said:

    [IMG]

    Are you fucking kidding me? What are you, 7 years old?

    I was literally going to post this, Haha.

    I do hope their little "joke" doesn't continue too far...



  • You know what? I think I'll just setup my own cron, thank you. That way I won't find all my documents have "s" mysteriously replaced with "z" a year down the road.


  • @cartman82 said:

    That way I won't find all my documents have "s" mysteriously replaced with "z" a year down the road.

    😆

    @cartman82 said:

    I think I'll just setup my own cron, thank you.

    Yeah, if all you want is a simple job set, that's really all you need for it.
    All the management cruft is for those that need a lot more going on under the hood. ;)


  • @cartman82 said:

    Are you fucking kidding me? What are you, 7 years old?

    MORTAL KOMBAAAAT!

    ... shit, now I have to find that damned silly song on YT and play it some 50 times...



  • @cartman82 said:

    What are you, 7 years old?

    Probably more like zeven years old.


  • @cvi said:

    Probably more like zeven yearsz old.

    Fikzed that for you



  • @cartman82 said:

    zfs send -i 2016-03-16 zroot/usr/home/git@2016-03-16-2 | zfs recv -F storage/backup/git/daily

    Here's an example line that I have for sending daily snapshots from one server to another:

    /sbin/zfs send -R -v -I pool@2016-03-16 pool@2016-03-17 | ssh user@server "sudo /sbin/zfs receive -F -v -u pool2/backups/pool"

    This is using the Linux version of ZFS, so you might have to tweak it a bit.

    @cartman82 said:

    Unfortunately, there doesn't seem to be a relative reference syntax like with git (eg. send changes since the previous snapshot), so cron will have to be a bit more clever than it's ideal.

    Here's a really good tutorial about ZFS (it's specifically about the Linux version, but all of the concepts are the same): https://pthree.org/2012/04/17/install-zfs-on-debian-gnulinux/


  • And in case you need it, here's a shell line that should get you the name of the most recent snapshot, which you can use with the -I option:

    /sbin/zfs get creation -t snapshot -d 1 -H -p -o value,name pool | sort -r | head -n 1 | awk '{ print $2 }' -



  • What are you doing about old snapshots?

    If you are doing this every day, seems like you'll end up with thousands of daily snapshots. Is that even a problem?
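If the goal is simply to cap the number of snapshots, a count-based cull is about the simplest policy there is. The sketch below is an assumption-laden illustration, not something anyone in the thread is running: the function name snapshots_beyond_newest is made up, and it only prints names, leaving the actual `zfs destroy` to the caller.

```shell
# Read snapshot names sorted oldest first (one per line) on stdin and print
# every name except the newest $1, i.e. the candidates for "zfs destroy".
snapshots_beyond_newest() {
    awk -v keep="$1" '
        { name[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }
    '
}
```

It could be fed from `zfs list -H -o name -t snapshot -s creation -d 1 zroot/usr/home/git` and its output passed to `xargs -n 1 zfs destroy` once it looks right. On the source side, the snapshot that the backup was last synced to has to survive the cull, since the next incremental send starts from it.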


  • @Dragnslcr said:

    /sbin/zfs get creation -t snapshot -d 1 -H -p -o value,name pool | sort -r | head -n 1 | awk '{ print $2 }' -

    if you happen to be allergic to awk.... this should work as well.

    /sbin/zfs get creation -t snapshot -d 1 -H -p -o value,name pool | sort -r | head -n 1 | cut -f2



  • Nothing at the moment, though that is something I need to implement.

    @cartman82 said:

    BTW,

     # zfs list -t snapshot
    NAME                                    USED  AVAIL  REFER  MOUNTPOINT
    storage/backup/git/daily@2016-03-16       8K      -  1.96G  -
    storage/backup/git/daily@2016-03-16-2     8K      -  1.96G  -
    storage/backup/git/daily@2016-03-16-3    72K      -  1.96G  -
    zroot/usr/home/git@2016-03-16              0      -  1.96G  -
    zroot/usr/home/git@2016-03-16-2            0      -  1.96G  -
    zroot/usr/home/git@2016-03-16-3            0      -  1.96G  -
    
    

    This will quickly become a problem without some culling. On both sides.

    Also, why are only snapshots on the backup drive taking space?

    The REFER column is the amount of data in the snapshot. The USED column is how much disk space the snapshot is actually consuming. There's usually a small amount of metadata for a snapshot, so you'll often see a snapshot taking up 8K. The only time a snapshot will use up more than that is if there were changes to the files.
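To put a number on the USED column, the per-snapshot values can be added up per dataset. This is a rough sketch that assumes tab-separated input as produced by `zfs list -H -p -t snapshot -o name,used` (exact byte counts); the function name snapshot_unique_bytes is made up for illustration. Since a snapshot's USED only counts space unique to that one snapshot, the total is a lower bound: blocks shared by several snapshots are not included, and the dataset's usedbysnapshots property is the figure to trust for the real total.

```shell
# stdin:  lines of "<snapshot name><TAB><used bytes>"
# stdout: "<dataset><TAB><sum of the used bytes of its snapshots>", sorted
snapshot_unique_bytes() {
    awk -F '\t' '
        { split($1, part, "@"); total[part[1]] += $2 }
        END { for (ds in total) printf "%s\t%.0f\n", ds, total[ds] }
    ' | sort
}
```

With the listing quoted above (8K + 8K + 72K on the backup side, 0 on the source side), this would report 90112 bytes for storage/backup/git/daily and 0 for zroot/usr/home/git.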



  • Here's the cron I went with:

    0 8 * * * /root/scripts/zfs-backup.sh zroot/usr/home/git storage/backup/git
    

    /root/scripts/zfs-backup.sh:

    #!/usr/bin/env bash
    
    # eg: zroot/usr/home/user
    SOURCE="$1"
    
    # eg: storage/backup/user
    DESTINATION="$2"
    
    fatal() {
    	# Quote "$@" so the message is printed exactly as passed
    	echo "$@" >&2
    	exit 1
    }
    
    debug() {
    	[[ -z $ZFS_BACKUP_DEBUG ]] || echo "$@"
    }
    
    usage_and_exit() {
    	# Missing arguments are an error: report on stderr, exit non-zero
    	echo "Usage: $0 <source> <destination>" >&2
    	exit 1
    }
    
    validate_ds() {
    	[[ -z $1 ]] && usage_and_exit
    	zfs list "$1" > /dev/null 2>&1 || fatal "Invalid dataset: $1"
    }
    
    get_latest_snapshot() {
    	local full_name=$(zfs list -r -H -S creation -o name -t snapshot "$1" | head -n 1)
    	local name="${full_name##*@}"
    	echo "$name"
    }
    
    main() {
    	validate_ds "$SOURCE"
    	validate_ds "$DESTINATION"
    
    	local today from_snap latest_snap
    	today="$(date +%Y-%m-%d)"
    	from_snap="$(get_latest_snapshot "$DESTINATION")"
    	latest_snap="$(get_latest_snapshot "$SOURCE")"
    
    	if [[ $latest_snap != "$today" ]]; then
    		debug "No today's snapshot found, making one for $today"
    		zfs snapshot "${SOURCE}@${today}"
    	fi
    
    	if [[ $from_snap == "$today" ]]; then
    		fatal "Already synced for $today"
    	fi
    
    	# An incremental send needs a starting snapshot on the destination
    	[[ -n $from_snap ]] || fatal "No snapshot on $DESTINATION yet, do the initial send manually"
    
    	debug "Syncing from $from_snap to $today"
    
    	zfs send -i "$from_snap" "${SOURCE}@${today}" | zfs recv -F "${DESTINATION}"
    }
    
    main
    
    

    If everything goes well, every day this should create a new snapshot and sync it up into the backup. It should also be able to handle skipping a day or encountering custom unexpected snapshots. The only thing missing is handling the initial situation, where there aren't any snapshots.



  • Nice find on the -S option. I hadn't noticed that one, and not needing sort and awk definitely makes things simpler.

    I'm not certain that -i will work correctly if you have a snapshot in between that you've created manually. You can use -I instead, which does send all snapshots in between.

    You might have to do the initial send manually. You can just take a snapshot (or use one that gets created by your script, if it gets that far) and then do a send without the -i option.
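The two suggestions above (use -I, and do a full send when the destination has nothing yet) could be folded into one decision. The helper below is only a sketch of that decision, printing the pipeline instead of running it; the function name backup_command is made up for illustration, and keeping -F on the full send is an assumption based on the earlier error in the thread, where receiving into an already existing dataset was refused without it.

```shell
# Print the send/receive pipeline for one backup run.
# $1 = source dataset, $2 = destination dataset,
# $3 = newest snapshot name already on the destination ("" if none),
# $4 = snapshot name to send.
backup_command() {
    if [ -z "$3" ]; then
        # Nothing on the destination yet: send a full stream.
        echo "zfs send $1@$4 | zfs recv -F $2"
    else
        # -I (capital) also sends any snapshots taken between $3 and $4.
        echo "zfs send -I $1@$3 $1@$4 | zfs recv -F $2"
    fi
}
```

In the cron script, the third argument would be whatever get_latest_snapshot returns for the destination, so an empty result selects the full send.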

