ZFS: Deduplicating is not a myth!

  • Posted on February 9, 2013 at 20:44

Long time no see!

After leaving it alone for, I guess, almost 2 years, I took another look at ZFS.
In ‘the early days’ ZFS only had a Linux implementation using FUSE. I liked that implementation from a nerdish point of view, but not so much as a serious replacement for XFS on my operational Linux machines.

Since SSDs are commonly available now, and all my operational servers have at least 16 cores, it was time to reevaluate the possibilities of ZFS on Linux.

I was not disappointed! My oh my…

 

The ZFSonLinux Gentoo 64bit Walk-through:
(Using VMware Fusion 5)

  • Create a Gentoo Linux 64bit VM with at least 4GB RAM and the following disks:
    • 1 boot disk, 80GB
    • 4 data disks, each 2TB. Single file, do not preallocate disk space! Important!
    • 1 cache disk, 20GB. Preallocating is advised, but not necessary.
  • Install Gentoo
  • Install sys-fs/zfs sys-fs/zfs-kmod
  • add ‘modules_3_6="zfs"’ to /etc/conf.d/modules
  • insmod /lib64/modules/3.6.11-gentoo/addon/zfs/zfs/zfs.ko
  • zpool create -f deduptestvol raidz /dev/sdf /dev/sdg /dev/sdh /dev/sdi
  • zpool add -f deduptestvol cache /dev/sdj
  • zfs set atime=off deduptestvol
  • zfs set dedup=on deduptestvol

The ZFS pool, named deduptestvol, should now be up and running. Typically it is mounted automatically under /, at /deduptestvol.
Let’s check:
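A minimal sketch of how such a check could look from the shell, assuming the pool name deduptestvol from the steps above. The guard around the commands is my own addition, so the snippet does nothing harmful on a machine without ZFS installed:

```shell
#!/bin/sh
# Pool name taken from the walk-through above.
POOL=deduptestvol

if command -v zpool >/dev/null 2>&1; then
    # Layout and health of the pool, including the raidz vdev and the cache device.
    zpool status "$POOL"
    # Confirm the two properties that were set on the pool.
    zfs get atime,dedup "$POOL"
    # Show the mount point and the space currently in use.
    zfs list "$POOL"
else
    echo "zpool not found; is sys-fs/zfs installed and the module loaded?" >&2
fi
```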

 

Testing the dedup-capabilities

  • I made a second volume of the same size, also RAIDZ, but without deduplication
  • Created one file of exactly 1000 megabytes
    (# dd if=/dev/urandom of=/data bs=1M count=1000)
  • Copied that file 32 times
  • Then I copied that whole directory to the volume with deduplication switched on.
  • On my host machine, I looked at the disk space consumed.
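The test steps above can be sketched as a small script. This is a scaled-down illustration, not the exact commands I ran: SRC_DIR, DST_DIR, SIZE_MB and COPIES are made-up names, the directories default to temporary ones, and the file size defaults to 1 MB instead of 1000 MB. The script keeps the original file next to its copies, so the source directory ends up with COPIES + 1 files.

```shell
#!/bin/sh
# Illustrative parameters (assumptions, not from the original test).
SRC_DIR=${SRC_DIR:-$(mktemp -d)}   # stands in for the plain volume
DST_DIR=${DST_DIR:-$(mktemp -d)}   # stands in for the dedup volume
SIZE_MB=${SIZE_MB:-1}              # the real test used 1000
COPIES=${COPIES:-32}

# One file of random data, written in 1 MiB blocks.
dd if=/dev/urandom of="$SRC_DIR/data" bs=1048576 count="$SIZE_MB" 2>/dev/null

# Duplicate the file COPIES times inside the source directory.
i=1
while [ "$i" -le "$COPIES" ]; do
    cp "$SRC_DIR/data" "$SRC_DIR/data.$i"
    i=$((i + 1))
done

# Copy the whole directory contents to the second volume.
cp -R "$SRC_DIR/." "$DST_DIR/"

# Count the files on both sides.
SRC_COUNT=$(ls "$SRC_DIR" | wc -l | tr -d ' ')
DST_COUNT=$(ls "$DST_DIR" | wc -l | tr -d ' ')
echo "files in source: $SRC_COUNT"
echo "files in target: $DST_COUNT"
```

To repeat the real test, SRC_DIR and DST_DIR would point at the two mounted volumes and SIZE_MB would be set to 1000.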

Some proof

Some recursive MD5 sums over both volumes:
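One way to compute such a recursive checksum, as a sketch: tree_md5 is a hypothetical helper name, and it assumes GNU md5sum and find are available, as they are on a stock Gentoo install. It hashes every file, sorts the result by path so the order is stable, and then hashes that list into a single value per tree.

```shell
#!/bin/sh
# tree_md5 DIR: print one MD5 value summarizing all regular files under DIR.
# (Hypothetical helper, not part of ZFS or coreutils.)
tree_md5() {
    (
        cd "$1" &&
        find . -type f -exec md5sum {} + |
            LC_ALL=C sort -k 2 |
            md5sum |
            cut -d ' ' -f 1
    )
}

# When called with two directories, compare them.
if [ "$#" -eq 2 ]; then
    if [ "$(tree_md5 "$1")" = "$(tree_md5 "$2")" ]; then
        echo "identical"
    else
        echo "different"
    fi
fi
```

Called with the mount points of the plain and the deduplicated volume as its two arguments, it should print "identical" if the copy arrived intact.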

 

Conclusion

For storing 32 gigabytes of data on RAIDZ (ZFS's counterpart to traditional RAID5):

The ‘normal’ ZFS volume consumed 44G of virtual disk space.
The deduplicated ZFS volume consumed 3.6G of virtual disk space.
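To put a number on the difference, here is a quick calculation using only the two figures above. The variable names are mine, and this is the ratio of consumed virtual disk space as seen from the host, which need not match the dedup ratio that ZFS itself reports in the DEDUP column of zpool list.

```shell
#!/bin/sh
# Figures from the measurement above, in gigabytes.
PLAIN_GB=44
DEDUP_GB=3.6

# Ratio of space consumed without dedup to space consumed with dedup.
RATIO=$(LC_ALL=C awk -v p="$PLAIN_GB" -v d="$DEDUP_GB" 'BEGIN { printf "%.1f", p / d }')
echo "space ratio: ${RATIO}x"
```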

“ZFS is the shit!”

2 Comments on ZFS: Deduplicating is not a myth!

  1. zfs says:

    This is really “Absolutely Fabulous”. Now let's get a beer (or dozens because it will just be one in the long run.)
