SAN


The last couple of articles I wrote about using NetApp deduplication (in particular, the article on using NetApp deduplication with block storage) have raised some questions that are worth addressing. Although NetApp deduplication works just fine with block-based storage, there are some considerations with regard to how the LUNs should be provisioned when deduplication will come into play.

Fortunately for me, someone at NetApp decided that it would be a good idea to document the five basic configurations of using NetApp deduplication with block storage. As seen in this comment to an earlier article, Larry Freeman points to this document on the NetApp Communities site (has anyone else noticed the similarity between VMware Communities and NetApp Communities?) that outlines the five basic configurations and where the freed blocks go in each configuration. Excellent; that saves me some work!

The most common configurations I’ve seen are configurations D (LUN not space reserved, space guarantee set to Volume) and E (LUN not space reserved, space guarantee set to None). Customers like to see the LUNs “shrink” after deduplication, and leaving the LUN non-space-reserved is the only way to make that happen.

The only things we need now are for NetApp to a) remove the volume size limitations and b) get us deduplication at the aggregate level. Then we’d really be set!


I came across this great article on the HP Fibre Channel VirtualConnect module tonight. Excellent work! Combine this with my TechTarget article on Ethernet networking with HP VirtualConnect and you have some good resources for using VirtualConnect in your environment.


Building on my earlier article on setting up NetApp deduplication, I wanted to follow up with some information on using NetApp deduplication with block storage (LUNs presented via Fibre Channel or iSCSI).

For the most part, using NetApp deduplication with block storage is a lot like I described earlier:

  • You (obviously) still need the NearStore and deduplication (A-SIS) licenses installed on the controller(s).
  • You will still turn deduplication on using the “sis on” command for the FlexVol containing the LUNs.
  • Limitations on the size of the FlexVol still apply.
  • You use the “sis status” command to check on the status of deduplication, and the “sis config” command to see the deduplication schedule.

OK, so what’s different? Well, it has to do with how LUNs are provisioned on a NetApp storage system. I’ve blogged before about managing LUN space requirements on a NetApp, and about using LUN clones vs. FlexClones. That second article, in particular, really goes into detail on how LUNs are implemented on top of NetApp’s file system, WAFL. Since WAFL represents each LUN as a single file, LUNs are also normally “space reserved,” meaning that the maximum size of the LUN is allocated at the time of creation. If you create a 50GB LUN, then Data ONTAP creates a 50GB file right away. (For readers out there who are well-versed in NetApp storage, I know that’s a bit of a simplification, but bear with me.)

What does this have to do with deduplication? Great question. If the LUN is space reserved—meaning that the maximum size of the LUN is allocated up front and remains allocated to the LUN—then the file that represents the LUN won’t ever decrease in size to reflect deduplication savings, and deduplication therefore does you absolutely no good whatsoever. This is not to say that deduplication doesn’t work, just that it won’t help you at all.

Fortunately, there’s an easy fix for this. When creating the LUN, simply uncheck the box marked “Space Reserved” and allow Data ONTAP to allocate space to the LUN out of the containing FlexVol on an as-needed basis. Because the file that represents the LUN can grow in size, it can also shrink in size, and deduplication will cause the file that represents the LUN to decrease in size. This then allows you to provision additional LUNs from the same FlexVol to take advantage of the space savings resulting from deduplication.

I know that seems a bit confusing; I’ll probably post another article with some more in-depth discussions of the details. (Either that, or I’ll encourage my NetApp readers to chime in below in the comments.)

So, in summary, when using NetApp deduplication with block storage:

  • you’ll set up and configure deduplication on the FlexVol containing your LUN(s) just as described in my earlier article;
  • you’ll uncheck the “Space Reserved” checkbox when creating the LUNs to be deduplicated;
  • you won’t see the space savings from the host’s perspective and therefore can’t store more data in that LUN than the size of the LUN; but
  • you will be able to provision additional LUNs in that same FlexVol that can be presented back to the host for additional storage.
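
To make the space accounting in this summary concrete, here is a small Python sketch. It is a deliberately simplified model and not how Data ONTAP actually tracks blocks: it ignores Snapshots, fractional reserve, and WAFL metadata, and the `Lun` class, its fields, and the sample numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lun:
    size_gb: float        # size presented to the host
    written_gb: float     # data the host has actually written
    deduped_gb: float     # blocks freed by deduplication (hypothetical figure)
    space_reserved: bool  # the "Space Reserved" checkbox


def lun_footprint_gb(lun: Lun) -> float:
    """Space the LUN takes out of its FlexVol in this simplified model."""
    if lun.space_reserved:
        # The full size stays allocated, so dedup savings never reach the volume.
        return lun.size_gb
    return lun.written_gb - lun.deduped_gb


def volume_free_gb(vol_size_gb: float, luns: list[Lun]) -> float:
    """Free space left in the FlexVol after accounting for every LUN."""
    return vol_size_gb - sum(lun_footprint_gb(lun) for lun in luns)


reserved = Lun(size_gb=50, written_gb=40, deduped_gb=15, space_reserved=True)
thin = Lun(size_gb=50, written_gb=40, deduped_gb=15, space_reserved=False)

print(volume_free_gb(100, [reserved]))  # 50
print(volume_free_gb(100, [thin]))      # 75
```

In both cases the host still sees a 50GB LUN; only the free space left in the FlexVol differs, and that free space is what you would use to provision another LUN.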

I hope this helps clarify some of the questions or issues surrounding the use of NetApp deduplication with block storage. Feel free to add information, experiences with deduplication and block storage, or ask additional questions in the comments below.

UPDATE: There are some additional considerations about how to provision LUNs along with NetApp deduplication that warrant a more in-depth discussion. Look for a follow-up post within the next few days.


NetApp Blog Aggregator?

VMware’s done a great job of rolling together the VMware news and views with their VMware blog aggregator, Planet V12n. Does anyone know of something equivalent for NetApp? Where is “Planet NetApp”?


If you’ve worked with Network Appliance storage before, you’re probably already familiar with the idea of snap reserve (storage space set aside to accommodate Snapshots) and fractional reserve (used with LUNs).  I’m going to hold the in-depth discussion of why you need snap reserve and fractional reserve for a different day, but I did want to pass on these commands that were shared with me by a colleague of mine.  These Data ONTAP commands, available with Data ONTAP 7.2 or later (some commands are available in Data ONTAP 7.1), will help you manage the space requirements for LUNs on a NetApp storage area network (SAN).

I’ll try to explain the commands along the way, but I would recommend you review the documentation available from the NOW site for more complete information.

vol options <volname> fractional_reserve 0

This command sets the fractional reserve to zero percent, down from the default of 100 percent.  Note that fractional reserve only applies to LUNs, not to NAS storage presented via CIFS or NFS.

snap autodelete <volname> trigger snap_reserve

This sets the trigger at which Data ONTAP will begin deleting Snapshots.  In this case, Snapshots will start getting deleted when the snap reserve for the volume gets nearly full.  The current size of the snap reserve can be viewed for a particular volume with the “snap reserve <volname>” command.

snap autodelete <volname> defer_delete none

This command instructs Data ONTAP not to exhibit any preference in the types of Snapshots that are deleted.  Other options for this setting include “user_created” (delete user-created Snapshot copies last) and “prefix” (delete Snapshot copies with a specified prefix string last).

snap autodelete <volname> target_free_space 10

With this setting in place, Snapshots will be deleted until there is 10% free space in the volume.

snap autodelete <volname> on

Now that the Snapshot autodelete options have been configured, this command will actually turn the functionality on.

vol options <volname> try_first snap_delete

When a FlexVol runs into an issue with space, this option tells Data ONTAP to first try to delete Snapshots in order to free up space.  This command works in conjunction with the next command:

vol autosize <volname> on

This enables Data ONTAP to automatically grow the size of a FlexVol if the need arises.  This command works hand-in-hand with the previous command; Data ONTAP will first try to delete Snapshots to free up space, then grow the FlexVol according to the autosize configuration options.  Between these two options—Snapshot autodelete and volume autogrow—you can reduce the fractional reserve from the default of 100 and still make sure that you don’t run into problems taking Snapshots of your LUNs.
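
If you need to apply the same settings to several volumes, the sequence above can be generated with a few lines of Python. This is only a sketch that assembles the command strings from this article; it does not connect to a storage system, and the function name and the volume name `vol_luns` are hypothetical.

```python
def lun_space_commands(volname: str, target_free_space: int = 10) -> list[str]:
    """Return the Data ONTAP commands from this article for one volume.

    The commands are listed in the same order as in the article; the
    target_free_space default of 10 is the article's example value.
    """
    return [
        f"vol options {volname} fractional_reserve 0",
        f"snap autodelete {volname} trigger snap_reserve",
        f"snap autodelete {volname} defer_delete none",
        f"snap autodelete {volname} target_free_space {target_free_space}",
        f"snap autodelete {volname} on",
        f"vol options {volname} try_first snap_delete",
        f"vol autosize {volname} on",
    ]


# Print the commands for one (hypothetical) volume so they can be reviewed
# before being pasted into a console session.
for command in lun_space_commands("vol_luns"):
    print(command)
```

Generating the list and reviewing it before running anything keeps a human in the loop, which seems prudent given that these settings control when Snapshots get deleted.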

If you have a NOW login, you can get more information on Snapshot autodelete here; more information on volume autogrow is available here.  Be aware that SnapDrive may require different settings in order to accommodate its functionality, as it moves LUN management out of the storage system and onto the host.  Finally, the values presented here are only examples; be sure to use values that are appropriate for your environment.

Credit for compiling this list goes to my colleague Chauncey Willard.  Good work!


A Good Look at ESX Server I/O

This article showed up in my RSS feeds; it’s a fairly in-depth look at the ESX Server I/O stack, written by Nick Triantos.  Nick is with Network Appliance; I believe in their Global Services division.  I’ve bookmarked it for a more comprehensive review later, when I can really dedicate myself to the information he’s sharing here.

As an aside, Nick’s blog recently moved; he used to be here, but now can be found here instead.

Great information—thanks, Nick!


I guess I’m on a bit of a NetApp kick this week.  After discussing (or perhaps revisiting) the idea of recovering files inside VMs using NetApp Snapshots (first here late last year, then again here), I wanted to take a closer look at full VM recovery using NetApp Snapshots.

First of all, it should go without saying that you should never use any of the procedures I’m describing here without first testing them yourself.  While they worked fine for me, they may not work fine for you.  Don’t just assume they will!  Do the due diligence and test it in your environment first; you’ll be glad you did.

Second, before using NetApp Snapshots to recover VM data (file-level or full VM), be sure you are getting good, consistent Snapshots.  The Network Appliance Technical Reports Library has a number of excellent articles on this subject; I’ll refer you there for more information.

I’ll break this article into two sections, one for block-level storage (I’m using iSCSI, but the process should be almost identical for Fibre Channel) and one for NAS/NFS.  Please note that I’m not focusing so much on the specific steps that are required as I am on general concepts and any gotchas that may arise during the process.

Full VM Recovery using Block Storage

To recover a full VM using block-level storage, a number of steps have to be taken:

  1. Create a LUN clone (or a FlexClone) of the original LUN based on a Snapshot. 
  2. Enable resignaturing on the ESX host(s) that will need to see the cloned LUN.
  3. Mount the cloned LUN(s) on the ESX host(s) and copy the appropriate VM files from the clone to the production LUN.

For the first two steps, I’ll refer you back to one of my first articles on VMware data recovery with Snapshots, which has more information on the necessary commands and settings.

For the third step, you’ll need to log in to the Service Console (typically via SSH) and copy the desired VM(s), along with all their files, from the cloned datastore to the production datastore, overwriting whatever is in the destination (you typically wouldn’t need to recover a full VM unless the production VM was hosed, right?).  Once the files have been copied back over to the production datastore, unmount the cloned datastore and destroy it.

You should now be able to boot up your VM at the state it was in at the time of the Snapshot used to recover it.  Unless the Snapshot was a cold Snapshot (taken while the VM was powered off), the VM will perform a file system check (chkdsk or fsck) when it boots up.

Full VM Recovery using NFS

The procedure for recovering full VMs when using NFS is even easier:

  1. Using an NFS client, mount the NFS export and navigate to the hidden “.snapshot” directory.
  2. In the “.snapshot” directory, find the Snapshot from which you wish to recover the VM.
  3. Copy that VM’s files (the entire folder) out of the “.snapshot” directory into the production filesystem, replacing the current contents (again, this assumes that what’s in the production filesystem is no good, else why would you be recovering a full VM?).
  4. Unmount the NFS export from your NFS client.

The recovered VM should now boot and be back to the point in time at which the Snapshot was taken.  Again, unless the Snapshot was a cold Snapshot, the VM will likely perform a file system check upon boot.  This is normal and expected.
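
The NFS procedure can also be sketched in Python, run from any NFS client that has the export mounted. Treat this as an illustration under assumptions rather than a tested recovery tool: the function name, the mount point, the Snapshot name, and the VM folder name are all hypothetical, and the sketch assumes the VM is powered off and that the production copy really is disposable.

```python
import shutil
from pathlib import Path


def restore_vm_from_snapshot(export_root, snapshot_name, vm_folder):
    """Replace a VM folder with the copy held in a Snapshot.

    export_root   -- where the NFS export is mounted on this client
    snapshot_name -- directory name under ".snapshot" (for example "hourly.0")
    vm_folder     -- name of the VM's folder on the datastore
    """
    export_root = Path(export_root)
    source = export_root / ".snapshot" / snapshot_name / vm_folder
    target = export_root / vm_folder

    if not source.is_dir():
        raise FileNotFoundError(f"no such VM folder in Snapshot: {source}")

    # Remove the damaged production copy, then copy the Snapshot copy back.
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(source, target)
    return target


# Example call (hypothetical paths and names):
# restore_vm_from_snapshot("/mnt/vmware_nfs", "hourly.0", "webserver01")
```

Note that the production folder is deleted before the copy starts, so a failed copy leaves nothing in place; copying to a temporary folder first and renaming afterward would be a safer variation.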

I suppose you could even do this second procedure from a CIFS client, assuming that CIFS and NFS were both configured on the storage system and an appropriate CIFS share existed.  (Please note that I’ve never tried this, so I can’t tell you what the results might be.)  In that case, use the “~snapshot” directory instead of “.snapshot”.

And that’s it—there you have two ways of recovering entire VMs using Network Appliance Snapshots.  As always, feel free to hit me up in the comments with any questions, thoughts, corrections, or rants (just keep the rants on-topic, please!).  Thanks for reading!


This session is titled “Best Practices for Architecting VMware Consolidated Backup Enabled Solutions,” and it’s being led by a couple of different presenters, one of whom is Dan Anderson, who also led the VCB session on Partner Day and leads the VCB labs (and led the VCB labs last year at VMworld 2006).  So far, it looks like this session won’t be a repeat of Monday’s Partner Day session, which is good.

The session started out with a review of VCB architecture, VCB components, and the interplay between the various portions of the VCB solution: the VCB proxy server, the pre-freeze and post-thaw scripts, file-level backups versus full VM backups, etc.  I already knew this material, and it had already been covered in detail in previous sessions.

The discussion then moved into a mention of the various items that affect the design of a VCB solution.  This would include things like the SAN architecture (Fibre Channel vs. iSCSI), software components (VCB version, VC version, ESX version, third-party backup software version), recovery mechanism, backup types, etc.  Dan mentioned again the command-line interface for VMware Converter that was mentioned on Monday; I really need to dig up that information so that I can explore that possibility in greater detail.

It was stated again today that there are bad candidates for VMware snapshots—high I/O, high transaction VMs are bad candidates for snapshots, and therefore are bad candidates for use with VCB.  (Recall that VCB uses snapshots to unlock the base VMDK for use by the VCB proxy, so a VM that is a bad candidate for snapshots is therefore a bad candidate for VCB-enabled backup solutions.)

I ended up leaving the session early because it turned out that a great deal more of the information that John and Dan were presenting was identical to the information that was presented on Monday at the partner session.  This is not a reflection on the presenters, or the session materials; I’d just already seen most, if not all, of the materials before.


A Collection of VMworld 2007 Links

Here are some VMworld links that I’ve been collecting over the last day or so.  As I have time, I’ll try to expand upon them and add my own thoughts and views, but wanted to mention them here briefly, at least:

I’ll try to add some more information and my thoughts on some of these issues as soon as possible.  In the meantime, I’m off to my final session here at VMworld 2007 on Day 2!


Strange VCB Error

While in the process of verifying the operation of VMware Consolidated Backup (version 1.0.3) today using the command-line vcbMounter.exe utility, I kept receiving an error from vcbMounter and the full VM backups would fail.  Nothing seemed obvious at first, so I added the “-L 6” parameter to the command line, which was something like this:

vcbmounter -h vcserver.example.com -u username -p password
-r e:\mnt -a ipaddr:10.1.1.107 -t fullvm -L 6

Nothing terribly complicated there, just a simple full VM backup of the VM whose IP address is 10.1.1.107.  (For those of you that aren’t familiar with the vcbMounter.exe command-line syntax, it looks worse than it actually is.  Trust me.)  Upon running this command with the increased logging, I kept getting these errors:

[2007-08-07 12:13:47.418 'App' 2144 warning] Could not
obtain inquiry page 128 for device on path 0, target 4, lun 0
 
[2007-08-07 12:13:47.418 'App' 2144 warning] Sending SCSI inquiry
failed: Unknown error. (No proper error code was returned.)
 
[2007-08-07 12:13:47.418 'App' 2144 warning] Could not
obtain inquiry page 128 for device on path 0, target 5, lun 0
 
[2007-08-07 12:13:47.418 'App' 2144 warning] Sending SCSI inquiry
failed: Unknown error. (No proper error code was returned.)

The odd thing was, target ID 4 and target ID 5 were local SCSI targets, not anything SAN-related.  In fact, they were the system (C:) and data (D:) drives that had been created when the server was built and Windows Server 2003 was installed.

Google turned up nothing obvious, so I decided to try running the command directly against an ESX server.  The modified command now looked like this:

vcbmounter -h esxserver.example.com -u root -p password
-r e:\mnt -a ipaddr:10.1.1.107 -t fullvm -L 6

The operation still failed, but now I had a critical piece of missing information:

[2007-08-07 12:29:43.798 'BlockList' 2052 error] Your VirtualCenter or the ESX server hosting the virtual machine you are dealing with needs to be upgraded to work with this version of VCB. (VCB attempted to invoke the method "acquireLeaseExt" on a remote object of type "vim.host.DiskManager", but this method is unknown to this object type.)

Aha!  A quick review of the environment showed that the ESX host this particular VM was running on was indeed running version 3.0.1.  After a quick VMotion to a nearby host running ESX Server 3.0.2 and a repeat of the command (changed to target the new host, obviously), the backup operation worked.  I moved the guest back to the original host, and the operation failed again.  This pattern held true regardless of whether the vcbMounter.exe command targeted the VirtualCenter server (which was running version 2.0.2) or the ESX Server.  Anytime the VM was hosted on the ESX server running 3.0.1, the command failed.
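
With “-L 6” logging the output gets noisy, and the one line that mattered here was logged at the “error” level while the SCSI inquiry noise was logged at “warning.” The following Python sketch splits log lines of the shape shown above into fields and filters them by level. It assumes each entry sits on a single line (the entries above are wrapped for display), and the field name `ident` is my own label for the number after the component name; I don’t know whether it is a thread or process ID.

```python
import re
from typing import NamedTuple, Optional

# Matches lines shaped like: [timestamp 'component' number level] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"'(?P<component>[^']+)' (?P<ident>\d+) (?P<level>\w+)\] (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    component: str
    ident: int  # number after the component name (meaning assumed, see above)
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None if the line does not match the pattern."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return LogEntry(
        match["timestamp"],
        match["component"],
        int(match["ident"]),
        match["level"],
        match["message"],
    )


def entries_at_level(lines, level):
    """Keep only the parsed entries logged at the given level."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry.level == level]


sample = [
    "[2007-08-07 12:13:47.418 'App' 2144 warning] Could not obtain inquiry "
    "page 128 for device on path 0, target 4, lun 0",
    # Message truncated after its first sentence for brevity.
    "[2007-08-07 12:29:43.798 'BlockList' 2052 error] Your VirtualCenter or "
    "the ESX server hosting the virtual machine you are dealing with needs "
    "to be upgraded to work with this version of VCB.",
]

for entry in entries_at_level(sample, "error"):
    print(entry.component, entry.message)
```

Filtering on the level first would have pointed straight at the 'BlockList' entry instead of the local SCSI warnings.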

<aside>Now why didn’t the error message just say that the first time, instead of complaining about local SCSI disks?</aside>

A quick review of the VCB 1.0.3 release notes turns up this fairly brief blurb:

VMware Consolidated Backup (VCB) 1.0.3 is compatible with VirtualCenter 2.0.2 and ESX Server 3.0.2 (or newer) only. This release is not supported if used with older version of ESX Server and VirtualCenter.

OK, fair enough; I should have read the release notes more closely before getting too deep into the testing.  But what this means is that customers won’t be able to start using VCB 1.0.3 until all their hosts have been upgraded to ESX Server 3.0.2 and VirtualCenter has been upgraded to 2.0.2.  I don’t know at this time if VCB 1.0.2 will work against both the newer and older versions; if not, that will put organizations in a situation where they may end up with two different sets of VCB proxy servers: one set to support the hosts running ESX Server 3.0.1, and another set to support hosts running ESX Server 3.0.2.  And that doesn’t even take VirtualCenter into consideration!
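
A pre-flight check along these lines would have caught the problem before any backup was attempted. This Python sketch compares version strings against the minimums in the release notes; it reads “or newer” as applying to both VirtualCenter and ESX Server, which is my assumption, and the function names and host names are hypothetical. How you collect the version strings is up to you.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as "3.0.2" into (3, 0, 2)."""
    return tuple(int(part) for part in text.split("."))


# Minimums taken from the VCB 1.0.3 release notes.
MIN_ESX = parse_version("3.0.2")
MIN_VC = parse_version("2.0.2")


def vcb_103_supported(esx_version: str, vc_version: str) -> bool:
    """True if both versions meet the VCB 1.0.3 minimums."""
    return parse_version(esx_version) >= MIN_ESX and parse_version(vc_version) >= MIN_VC


def hosts_needing_upgrade(hosts: dict[str, str]) -> list[str]:
    """Names of ESX hosts that are older than the VCB 1.0.3 minimum."""
    return sorted(
        name for name, version in hosts.items() if parse_version(version) < MIN_ESX
    )


# Hypothetical inventory: host name -> ESX Server version.
inventory = {"esx01.example.com": "3.0.1", "esx02.example.com": "3.0.2"}
print(hosts_needing_upgrade(inventory))  # ['esx01.example.com']
```

Comparing tuples of integers rather than raw strings avoids the usual trap where "3.0.10" sorts before "3.0.2".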

Anyone out there testing VCB 1.0.2 against the newer releases of ESX and VC?  The results would tell us whether we can keep using the existing VCB infrastructure until all the hosts have been upgraded and then upgrade VCB, or whether a parallel VCB infrastructure will have to be established to support the newer releases of ESX and VC.
