Quick Guide to Setting up NetApp Deduplication

I’m relatively new to NetApp deduplication (formerly A-SIS), so this article won’t be an advanced treatise on NetApp deduplication or its deep inner workings. Instead, this is intended to be a quick guide to setting up NetApp deduplication for others, like myself, who may be familiar with Data ONTAP but not necessarily deduplication.

Obviously, the first step will be to ensure that your NetApp storage system is licensed for deduplication. As of March 10, NetApp made the NearStore option, which was a prerequisite for deduplication, free. Yes, you read that right: free. Since NearStore is a prerequisite, you’ll need to be sure to license that first:

license add <Code for NearStore>
license add <Code for Deduplication>

Once deduplication is licensed, then you can enable it on a per-volume basis using the “sis on” command:

sis on /vol/<volname>

Note, however, that deduplication only works on volumes at or below a maximum size that depends on the storage system model. These volume size limits are laid out in TR-3505. The volume must never have been larger than the applicable limit, so you can’t shrink an oversized volume down to the limit and then run deduplication.

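Enabling deduplication only covers data written from that point on. To deduplicate the data already in the volume, start a scan with “sis start -s /vol/<volname>”. Once it’s running, you can check the status with: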

sis status /vol/<volname>

After it’s finished running, you can see your space savings like this:

df -s /vol/<volname>

After running deduplication on a small NFS volume that housed only three VMs, the “df -s” command showed a space savings of 64%. That’s pretty impressive!
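For anyone who wants to sanity-check the percentage, here is a minimal Python sketch of the arithmetic. It assumes (my understanding, not something confirmed in this post) that the figure reported by “df -s” is the saved space divided by the sum of used and saved space. The function name and the sample numbers are hypothetical, chosen only to reproduce a 64% result.

```python
def percent_saved(used_kb: int, saved_kb: int) -> float:
    """Return the share of logical data removed by deduplication, in percent.

    Assumption: percent saved = saved / (used + saved) * 100.
    """
    total_kb = used_kb + saved_kb
    if total_kb == 0:
        return 0.0  # empty volume: nothing to save
    return 100.0 * saved_kb / total_kb


# Hypothetical figures, not taken from a real "df -s" run.
used_kb = 36_000_000
saved_kb = 64_000_000
print(f"{percent_saved(used_kb, saved_kb):.0f}% saved")  # prints "64% saved"
```

Read this way, 64% saved means the volume would have needed almost three times as much physical space without deduplication.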

Moving forward, deduplication will run automatically every night at midnight, as shown by this command:

sis config /vol/<volname>

That should be enough to get most everyone started. Feel free to post comments or corrections below.


Scott,

Nearstore is free, but how much does the Deduplication license ballpark for, any idea, just a rough estimate is all?

Thanks,

Ron

Cool posting, Scott!

When I was working at the university, we beta’d A-SIS about a year ago. We were using VMFS datastores on FCP luns, so the savings were not apparent to the VirtualCenter Admin as far as free space goes that he might use. (It turns out that it is possible to use the extra volume free space to store extra snapshots or more headroom for SnapMirror, but you may get some real WOW from the VM admins to show it off live on NFS stores!)

Are you seeing the NFS volume free space increase after you run the A-SIS process, or is a management/vpxa service restart necessary? Also, was this filesystem ACTIVE during your A-SIS deduplication processing window? What was the CPU impact, if so? Just curious… (we were seeing up to 83% by separating Windows temp/swap to a VMFS datastore on a volume where snapshots were disabled - no sense trying to dedupe that stuff)

Wes

Ron,

The deduplication license should be free, as well, if I’m not mistaken.

Wes,

Good to hear from you! We are indeed seeing the space savings show up on the NFS datastore, although it does take a little while for the numbers to change. I imagine that a vpxa restart would cause the numbers to change more rapidly, but I’m in no hurry. The space savings on the block (LUN) side are more difficult to see, but as you’ve pointed out you can use that space to store more Snapshots or even to provision additional LUNs in the same FlexVol.

Hey…are you still interested in getting together on that Kerberized NetApp NFS stuff we discussed via e-mail a while back?

Correct. Both licenses are free.

Indeed, I am. Funny thing is, that stuff is VERY well implemented internally. I hear you’re out here on occasion, so look me up, or contact me via email to follow up.

wes

Is this functionality for OnTAP 7.3 only, or can we still get it with 7.2.3? We have a FAS3020 that I’m thinking this should work on, but I’m having a hard time coming up with the “free” license codes.

I thought I’d just contact support, plug the codes in and forge ahead, but that’s not working out the way I thought it would.

Sam,

This is available in Data ONTAP 7.2.3, although NetApp is making an upgrade to Data ONTAP 7.2.4 mandatory for users of deduplication due to a recently uncovered bug. You should be able to contact your NetApp sales rep and get the licenses for NearStore and deduplication without that much difficulty. (Note that I said, “should”.)

After deduplicating existing data with ASIS, the “old” stuff is usually still hidden in snapshots. These snapshots have to expire (if they were automatic) and then space savings will increase even more. I started a series with our experiences with VMware / ASIS on my personal blog (http://21stcenturystorage.cebis.net/).
BTW great work, Scott - your blog is a valuable source for many admins in our company :-)
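
The snapshot effect described above can be illustrated with a toy Python model. This is only a sketch under simplifying assumptions: blocks are plain integers, a block stays allocated as long as the active file system or any snapshot references it, and the function name and block numbers are hypothetical. It is not how Data ONTAP tracks blocks internally.

```python
from typing import Iterable, Set


def blocks_in_use(active: Set[int], snapshots: Iterable[Set[int]]) -> int:
    """Count blocks referenced by the active file system or by any snapshot."""
    referenced = set(active)
    for snap in snapshots:
        referenced |= snap  # a snapshot keeps its blocks allocated
    return len(referenced)


# Hypothetical volume: six blocks before deduplication, captured in one snapshot.
before_dedup = {1, 2, 3, 4, 5, 6}
old_snapshot = set(before_dedup)

# After deduplication the active file system needs only three of those blocks.
after_dedup = {1, 2, 3}

print(blocks_in_use(after_dedup, [old_snapshot]))  # 6: the snapshot still holds the duplicates
print(blocks_in_use(after_dedup, []))              # 3: savings appear once the snapshot expires
```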

Christoph,

Yeah, I’ve noticed posts from your blog showing up in my RSS feeds over the last few days–looks like you’ve got some good stuff so far, keep up the good work. I’m glad you and your team are finding my site helpful.

Take care!

Any clues if Ontap 7.3 will remove/change any of the asis/flexvol limitations?

The limits according to TR-3505 are just plain artificial and silly low. They would force us into extra management work just to be able to use ASIS. Being used to Data Domains, the NetApp numbers seem like a “buy bigger than you need” practice from the company.

Dejan,

Talk to your NetApp SE or sales rep for more information–I have information on this but I can’t discuss it with you because it’s under NDA. Your best bet is to take that matter up directly with your NetApp team.

As for the comparison with Data Domain, one key thing to remember is that with NetApp we are deduplicating primary data. It is my understanding that Data Domain deduplicates backup data only, so there are some important differences in what the two companies are trying to achieve with this functionality.

Scott,

This post was invaluable in letting me know that A-SIS licensing was now free and how to get it set up. I had recently upgraded our FAS3020s to OnTAP 7.2.4 and our ESX servers to 3.5.0 so I was ready to rock. On Friday I created my first A-SIS volumes and started to Storage VMotion VMs into them. I’m using iSCSI and I’m turning off Space Reservation for the LUNs (and experimenting with file Space Reservation for the volumes) to reap the storage savings as well as give the A-SIS metadata the room it needs to work. I have all VMware workload on these, so of course I’m seeing a great return rate (40+% so far). Thanks!

Aharden,

I’m glad you found the article helpful. Thanks for reading!

Hey Scott,
First off, I echo several others’ comments; this is a great resource for all of us NetApp devotees out there.

Can you provide or point me to any resources that detail the effect SIS has on *existing* SnapMirror operations? I’ve found plenty about SIS + SM = good, but most sites talk about turning on SM after SIS.

We are in the process of reorganizing several sets of data via SM that I can’t afford to stop and restart. However, because of savings we’ve seen (up to 71% - and that’s not even VMware!), we’d like to turn it on everywhere.

Theory says it should work just fine and begin transferring the deduped blocks with a big hit in the snap reserve until the existing snaps expire, but unfortunately, I don’t have the capacity to test that theory.

Jeff,

Thanks for the feedback.

I don’t know of any resources off the top of my head. I do agree that adding deduplication to an existing SnapMirror relationship will most certainly cause a spike in snap reserve usage, but I’m also wondering about the replication traffic. I guess transferring pointers should be easy and low-impact, but I don’t know that for certain.

I’ll see if I can get in touch with some friends at NetApp and dig something up for you. Or perhaps a NetApp reader would care to comment here…?