FibreChannel


This is a short post, but one that I hope will stir some discussion.

Earlier this evening, I read Maish’s blog post titled My Wish for dvFabric—a dvSwitch for Storage. In that blog post Maish describes a self-configuring storage fabric that would simplify how we provision storage in a vSphere environment. Here’s how Maish describes the dvFabric:

So how do I envision this dvFabric? Essentially the same as a dvSwitch. A logical entity to which I attach my network cards (for iSCSI/NFS) or my HBA’s (for FcoE or FC). I define which uplink goes to which storage, what the multi-pathing policy is for this uplink, how many ports should be used, what is the failover policy for which NIC, which NFS volumes to mount, which LUNS to add – I gather you see what I am getting at.

It’s a pretty interesting idea, and one with a great deal of merit. So here’s the “Thinking Out Loud” part: is Target Driven Zoning (or Peer Zoning) the answer to a large part of Maish’s dvFabric?

If you don’t know what Target Driven Zoning (TDZ) or Peer Zoning are, I recommend you go have a look at Erik Smith’s introductory blog post on Target Driven Zoning. Based on Erik’s description of TDZ, it certainly seems like it could be used to help on the block side of the house with Maish’s dvFabric idea.

So what do you think—am I way off here?


Welcome to Technology Short Take #19, the first Technology Short Take for 2012. Here’s this year’s first collection of links, articles, and thoughts regarding virtualization, storage, networking, and other data center technology-related topics. I hope you find something useful!

Networking

  • While configuration limits aren’t the most exciting reading, they are important from time to time. Here are some configuration limits for the UCS 6100 and 6200 series.
  • Understanding the differences—both positive and negative—between the various approaches to solving a particular challenge is a key skill. That’s why I like this article on HP Flex-10 versus NIOC for VDI. The author (Dwayne) weighs the pros and cons of both approaches in helping to shape network traffic for VDI deployments using 10Gb Ethernet.
  • It would appear that my recent VXLAN and OTV connectivity posts (incorrect VXLAN post here, corrected VXLAN post here, and OTV/VXLAN post here) sparked a discussion about whether we really need to concern ourselves with traffic trombones. On one side we have Brad Hedlund speculating that the network should be treated like a large virtual I/O fabric; on the other side we have Greg Ferro countering that we do need to be concerned about the topology of the network. I can see both sides of the argument, but at this stage of the game, I’m inclined to agree more with Greg. In the future (it’s unclear how far in the future) I think that Brad’s points will be more valid, but not right now.
  • This post by Ivan Pepelnjak on VXLAN, IP multicast, OpenFlow, and control planes highlights some of the current limitations with VXLAN and thus reinforces why I think that Brad’s arguments are a bit ahead of their time.
  • A few folks had some write-ups on Embrane Heleos: Greg Ferro, Jason Edelman, Brad Hedlund, Brad Casemore, and Ivan Pepelnjak. My question (and this is spurred in part by some comments by Brad Casemore): is this another Cisco spin-in move?

Servers/Operating Systems/Applications

Storage

Virtualization

And that’s it for this time around; as always, I hope you’ve found something useful here. Courteous comments are always welcome; feel free to speak up below.


This is BRKSAN-3707, Advanced SAN Services, presented by Mike Dunn.

According to TIP’s Storage Research, the top five storage initiatives are deduplication, technology refresh, tiered storage build-out, archiving, and consolidation.

The three main sections of the presentation are SAN consolidation with virtualization, tiered storage and backup design, and Fibre Channel over Ethernet (FCoE).

The presentation starts with SAN consolidation using virtualization. This is really a discussion of virtual SANs (VSANs), which allow you to consolidate SANs onto the same hardware while still providing logical separation of fabrics. In order to move between VSANs, you need to use Inter-VSAN Routing (IVR).

When is IVR needed? When an initiator in one VSAN needs to talk to a target in another VSAN. IVR maintains isolation while allowing for resource sharing.

A common use for IVR is to provide common SAN services, like a shared tape library. IVR would allow media servers in individual VSANs to talk to a shared tape library in a “common” VSAN.

Setting up IVR involves creating an IVR topology. This means you need to manually define the VSANs that will be used for IVR on each switch (all switches that perform IVR will need identical configuration). After defining the IVR topology, you activate it. Then you create your IVR zones and IVR zonesets, just as you would create regular zones and zonesets.

IVR works by creating a virtual domain in each VSAN that represents the other VSANs in the topology. Likewise, it creates virtual devices in each VSAN that represent the devices in the other VSANs. This means that logically the initiator thinks the target is in the same VSAN.

Keep in mind that IVR doesn’t perform FC ID translation, so domain IDs have to be unique across all VSANs.

IVR does have a Network Address Translation (NAT) mode (IVR NAT). With IVR NAT, the virtual switch is given a randomly available domain ID; this means that you don’t need unique domain IDs across all VSANs. IVR NAT is the preferred method of IVR moving forward.
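To make the uniqueness requirement concrete, here is a minimal Python sketch of the check you would effectively be doing by hand before enabling IVR without NAT. The function name and the sample topology are hypothetical; nothing here talks to a switch.

```python
from collections import defaultdict


def find_domain_id_conflicts(vsan_domains):
    """Return {domain_id: [vsan, ...]} for domain IDs used in more than one VSAN.

    vsan_domains maps a VSAN number to the set of switch domain IDs in use
    in that VSAN. Without IVR NAT, any domain ID that appears in two VSANs
    in the IVR topology is a conflict.
    """
    seen = defaultdict(list)
    for vsan, domains in sorted(vsan_domains.items()):
        for domain_id in sorted(domains):
            seen[domain_id].append(vsan)
    return {d: vsans for d, vsans in seen.items() if len(vsans) > 1}


# Hypothetical topology: VSAN 10 and VSAN 20 both use domain ID 33.
topology = {10: {33, 34}, 20: {33, 48}, 30: {64}}
print(find_domain_id_conflicts(topology))
```

An empty result means the domain IDs are already unique; anything else is a list of overlaps that you would either renumber or sidestep by using IVR NAT.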

Some operating systems or devices need persistent FC IDs, so IVR NAT allows for static definitions of domain IDs and FC IDs.

Another use case for IVR is SAN extension. You can use IVR to isolate the “remote site” VSAN from the production VSAN, limiting edge VSAN events to only that VSAN. The recommended configuration uses a transit VSAN that connects the two data centers. This keeps the VSANs in each data center isolated to only that data center. (Think of it like a /30 network between two routers.)

A question was asked about using FCIP and whether IVR is needed in this sort of situation. In this case, IVR would not be necessary.

Mike next launches into a quick review of SAN designs. In a core-edge design, there are core switches where storage is attached and edge switches where hosts are attached. This sort of design generally tops out at about 1,700 devices.

For larger environments, you can use an edge-core-edge design. Storage devices have their own edge switches, as do servers, and the edge-to-edge traffic passes through the core. This sort of design tops out at about 4,200 devices.

That discussion was a lead-in to a discussion of NPV/NPIV. This is a topic I’ve covered previously, so I didn’t take notes on this section.

Mike did share some good information on the maximum number of logins per port (42 in switch mode, 114 in NPV mode—watch this value if you are using nested NPIV, which is the term for NPIV hosts connecting to an NPV mode switch) and logins per switch (MDS 9124/9124e/9134/9148).

After discussing NPV/NPIV, Mike moves on to discuss a feature called FlexAttach. FlexAttach resolves the issue of needing to modify zoning and zonesets when an HBA or server needs to be replaced. Basically, any host connecting to an F-port configured as a FlexAttach port will assume the virtual WWPN assigned to that F-port. This eliminates the need to reconfigure zones or zonesets if you replace the server or HBA connected to that F-port. If you’re familiar with the behavior of HP VirtualConnect, this appears to be very similar in behavior. FlexAttach is supported on the MDS platform, but is not supported on the Nexus platform.
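As a way of picturing the behavior, here is a small Python model of a FlexAttach port as described in the session. It is only a conceptual sketch: the class, the port name, and the WWPNs are all made up for illustration and do not reflect how NX-OS implements the feature.

```python
class FlexAttachPort:
    """Models an F-port that presents a fixed virtual WWPN to the fabric."""

    def __init__(self, name, virtual_wwpn):
        self.name = name
        self.virtual_wwpn = virtual_wwpn
        self.physical_wwpn = None  # whatever HBA is currently plugged in

    def attach(self, physical_wwpn):
        """Attach an HBA; return the WWPN that the fabric (and zoning) sees."""
        self.physical_wwpn = physical_wwpn
        return self.virtual_wwpn


# Hypothetical zone that references the virtual WWPN, not the HBA's own WWPN.
zone_members = {"20:00:00:25:b5:00:00:01"}
port = FlexAttachPort("fc1/1", "20:00:00:25:b5:00:00:01")

before = port.attach("10:00:00:00:c9:aa:aa:aa")  # original HBA
after = port.attach("10:00:00:00:c9:bb:bb:bb")   # replacement HBA

# The zone still matches after the swap because the fabric-visible WWPN is unchanged.
print(before in zone_members, after in zone_members)
```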

(Side question: Is FlexAttach leveraged in UCS for vHBA configurations?)

That wraps up the first section; Mike now moves into a discussion of tiered storage and backup design. In this section he will discuss Data Mobility Manager (DMM), SANTap, and Storage Media Encryption (SME).

To perform data migrations, there are different approaches:

  • Server/software-based
  • Array-based
  • Appliance-based

Each of these approaches has advantages and disadvantages. Cisco’s solution is DMM, which is a SAN-based migration solution. DMM does both online and offline data migration, uses FC redirects to allow transparent insertion/removal, and is very fast (4.2TB/hr).
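For rough planning, the quoted rate translates into a simple estimate. This Python sketch assumes the 4.2TB/hr figure is a sustained best case, which is an assumption on my part; real migrations will vary with array and fabric load.

```python
def estimated_migration_hours(data_tb, rate_tb_per_hour=4.2):
    """Estimate migration time in hours at a sustained rate (default 4.2 TB/hr)."""
    if data_tb < 0 or rate_tb_per_hour <= 0:
        raise ValueError("data size must be >= 0 and rate must be > 0")
    return data_tb / rate_tb_per_hour


# Hypothetical example: a 21 TB migration at the quoted rate.
print(round(estimated_migration_hours(21), 1))
```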

FC Redirect is a target-based mechanism for transparently redirecting traffic to a target. With regard to DMM specifically, when using DMM for data migration, FC redirect is used to redirect traffic to the DMM process itself. DMM then sends the I/Os to both the original (source) and destination locations on the SAN. In this regard, it sounds like DMM is performing a form of write-splitting.

In synchronous mode, when handling I/Os to a “migrated” area of the LUN, writes are mirrored. If I/Os are to an “in process” area, writes are queued temporarily until the region has been migrated. I/Os to “unmigrated” areas are simply sent directly to the source LUN.
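The three cases can be summarized in a few lines of Python. This is a sketch of the decision logic as I understood it from the session, not DMM's actual implementation; the enum and function names are invented for illustration.

```python
from enum import Enum


class RegionState(Enum):
    MIGRATED = "migrated"
    IN_PROCESS = "in process"
    UNMIGRATED = "unmigrated"


def route_write(state):
    """Return where a host write goes in synchronous mode, given the region state."""
    if state is RegionState.MIGRATED:
        # Already copied: keep both LUNs identical by mirroring the write.
        return ["source", "destination"]
    if state is RegionState.IN_PROCESS:
        # Currently being copied: hold the write until the region is done.
        return ["queue"]
    # Not yet copied: the later bulk copy will pick up this data from the source.
    return ["source"]


for state in RegionState:
    print(state.value, "->", route_write(state))
```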

In a dual-fabric configuration, each fabric requires its own DMM. Each DMM can handle multiple VSANs.

DMM can also run in an asynchronous mode. In this mode, DMM uses Modified Region Logs (MRLs) to track changes to the source LUN. Any “dirty region” in the MRL is copied across to the target. There is no write penalty as there is with synchronous mode described earlier.
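Here is a minimal Python sketch of the dirty-region idea. The region size, class name, and method names are assumptions for illustration; the session did not cover how DMM sizes or stores its MRLs.

```python
class ModifiedRegionLog:
    """Tracks which fixed-size regions of the source LUN were written to."""

    def __init__(self, region_size):
        self.region_size = region_size  # bytes per region (assumed value)
        self.dirty = set()

    def record_write(self, offset, length):
        """Mark every region touched by a write as dirty."""
        if length <= 0:
            return
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        self.dirty.update(range(first, last + 1))

    def drain(self):
        """Return the dirty regions to copy to the target, then clear the log."""
        regions = sorted(self.dirty)
        self.dirty.clear()
        return regions


mrl = ModifiedRegionLog(region_size=1024)
mrl.record_write(offset=1000, length=100)  # spans regions 0 and 1
mrl.record_write(offset=4096, length=1)    # region 4
print(mrl.drain())
```

Because the host write only has to update the log and the source LUN, it never waits on the destination, which is why this mode avoids the write penalty of synchronous mirroring.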

A question was raised about what happens when the data migration is complete. At that point, you’ll halt the I/O on the server, complete the job in DMM, and then rezone the fabric to point your host(s) to the new storage target.

You can use the DMM asynchronous mode to migrate data between data centers as well. To prevent having to span a VSAN to the remote site (generally not recommended), you can add another VSAN (a replication VSAN) and a third MSM (a module in the FC switch that runs DMM) to handle the inter-site traffic.

The 120-day evaluation licenses within NX-OS will enable DMM with full functionality for 120 days.

The presentation next shifts to Storage Media Encryption (SME). SME encrypts media for SAN-attached tapes, VTLs, and disk arrays. It uses AES-256 encryption and it is FIPS 140-2 certified. The solution can use a Cisco key management solution or RSA Key Manager. SME is a licensed feature and is only supported on certain platforms (requires certain modules, i.e., the MSM-18/4 or the SSN16).

SME uses FC redirects to transparently insert itself into the data stream to perform encryption.

Cisco’s key management solution, Key Management Center, is part of Cisco Fabric Manager. It handles archiving, replicating, recovering, and purging media keys.

Encrypting disks using SME will be available in NX-OS 5.2(1).

The next topic up is SANTap, which as many readers already know is leveraged by EMC RecoverPoint for heterogeneous storage replication. SANTap is a licensed feature but does not use FC redirects. Instead, SANTap uses a host VSAN and a target VSAN. In the host VSAN, SANTap creates a DVT (Data Virtual Target), which uses the WWPN of the real target port. In the target VSAN, SANTap creates a VI (Virtual Initiator), which uses the WWPN of the real host port. SANTap issues I/Os (or splits I/Os) from the host to the DVT and passes a copy to an additional fabric-based appliance (i.e., a RecoverPoint appliance).

Mike did not have any information on SANTap support for the SCSI commands used by VMware for the VAAI/VAAIv2 offloads in vSphere 4 and vSphere 5. (Bummer!)

The last section of the presentation was on Fibre Channel over Ethernet (FCoE). The information contained in this section was review and stuff that I’ve already covered elsewhere.


In this post, I want to pull together all the steps necessary to take a Converged Network Adapter (CNA)-equipped server and connect it, using FCoE, to a Fibre Channel-attached storage array. There isn’t a whole lot of “net new” information in this post, but rather I’m summarizing previous posts, organizing the information, and showing how these steps relate to each other. I hope that this helps someone understand the “big picture” of how FCoE and Fibre Channel relate to each other and how they interoperate (which, quite frankly, is one of the key factors for the adoption of FCoE).

The steps involved come from an environment with the following components:

  • A Dell PowerEdge R610 server running VMware ESXi and containing an Emulex CNA
  • A Cisco Nexus 5010 switch running NX-OS 4.2(1)N1(1).
  • A Cisco MDS 9134 Fibre Channel switch running NX-OS 5.0(1a).
  • An older EMC CX3-based array with Fibre Channel ports in the storage processors.

We’ll start at the edge (the host) and work our way to the storage. All these steps assume that you’ve already taken care of the physical cabling.

  1. Depending upon how old the software is on your hosts, you might need to install updated drivers for your CNA, as I described here. If you’re using newer versions of software, the drivers will most likely work just fine out of the box.
  2. The closest piece to the edge is the FCoE configuration on the Nexus 5010 switch. Here’s how to set up FCoE on a Nexus 5000. Be sure that you map VLANs correctly to VSANs; for every VSAN that needs to be accessible from the FCoE-attached hosts, you’ll need a matching VLAN. Further, as pointed out here, the VLAN and VLAN trunking configuration is critical to making FCoE work properly anyway.
  3. The next step is connecting the Nexus 5010 to the MDS 9134 Fibre Channel switch. Read this to see how to configure NPV on a Nexus 5000 if you are going to use NPV mode (and read this for more information on NPV and NPIV). Whether you use NPV or not, you’ll also need to set up connections between the Nexus and the MDS; here’s how to set up SAN port channels between a Cisco Nexus and a Cisco MDS.
  4. Once the Nexus and the MDS are connected, then you’ll need to perform the necessary zoning so that the hosts can see the storage. Before starting the zoning, you might find it helpful to set up device aliases. After your device aliases are defined, you can create the zones and zonesets. This post has information on how to create zones via CLI; this post has information on how to manage zones via CLI.
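The VLAN-to-VSAN requirement in step 2 is easy to get wrong, so here is a minimal Python sketch of the sanity check. The function name and the sample mapping are hypothetical; you would fill in the mapping from your own switch configuration.

```python
def unmapped_vsans(required_vsans, vlan_to_vsan):
    """Return the VSANs that FCoE hosts need but that no VLAN is mapped to.

    vlan_to_vsan maps an FCoE VLAN ID to the VSAN it carries.
    """
    mapped = set(vlan_to_vsan.values())
    return sorted(set(required_vsans) - mapped)


# Hypothetical example: hosts need VSANs 100 and 200, but only VLAN 100 is mapped.
print(unmapped_vsans([100, 200], {100: 100}))
```

An empty list means every VSAN the hosts need has a matching VLAN.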

At this point—if everything is working correctly—then you are done and you should be ready to present storage to the end hosts.

I hope this helps put the steps involved (all of which I’ve already written about) in the right order and in the right relationship to each other. If there are any questions, clarifications, or suggestions, please feel free to speak up in the comments.


Last year, I posted a couple of articles on configuring and managing Cisco MDS Fibre Channel zones via the CLI:

New User’s Guide to Configuring Cisco MDS Zones via CLI
New User’s Guide to Managing Cisco MDS Zones via CLI

In those posts, I discussed the use of the fcalias command to create aliases for World Wide Port Names (WWPNs) on the fabric. A couple of people suggested via Twitter and blog comments that I should use device aliases instead of the fcalias command. As a follow-up to those posts, here is some information on using device aliases on a Cisco MDS Fibre Channel switch.

To create a device alias, you’ll use the device-alias database command in global configuration mode. Once you are in database configuration mode, you can create device aliases using the device-alias command, like this:

mds(config)# device-alias database
mds(config-device-alias-db)# device-alias name <Friendly name> pwwn <Fibre Channel WWPN>
mds(config-device-alias-db)# exit
mds(config)# end

There is an additional step required after defining the device aliases. You must also commit the changes to the device alias database, like this:

mds(config)# device-alias commit

This commits the changes to the device alias database and makes the device aliases active in the switch.
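If you have more than a handful of WWPNs, it can help to generate the commands rather than type them. The following Python sketch only builds the same command sequence shown above, starting from global configuration mode; the function name, alias name, and WWPN are hypothetical, and you would still paste the result into the switch yourself.

```python
import re

# Eight colon-separated hex bytes, e.g. 10:00:00:00:c9:12:34:56
WWPN_RE = re.compile(r"^([0-9a-f]{2}:){7}[0-9a-f]{2}$")


def device_alias_commands(aliases):
    """Build the device-alias command sequence for a {name: wwpn} mapping."""
    lines = ["device-alias database"]
    for name, wwpn in aliases.items():
        wwpn = wwpn.lower()
        if not WWPN_RE.match(wwpn):
            raise ValueError(f"not a valid WWPN: {wwpn}")
        lines.append(f"device-alias name {name} pwwn {wwpn}")
    lines.append("exit")                 # back to global configuration mode
    lines.append("device-alias commit")  # make the pending aliases active
    return lines


# Hypothetical alias and WWPN, for illustration only.
for line in device_alias_commands({"esx01-hba0": "10:00:00:00:C9:12:34:56"}):
    print(line)
```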

Once a device alias is created, it applies to that WWPN regardless of VSAN. This means that you only have to define a single device alias for any given WWPN, whereas with the fcalias command a different alias needed to be defined for each VSAN. All other things being equal (and they’re not equal, as you’ll see in a moment), that alone is worth switching to device aliases in my opinion.

Using device aliases also provides a few other key benefits:

  • Device aliases are automatically distributed to other Cisco-attached switches. For example, I defined the device aliases on a Cisco MDS 9134 that was attached to the Fibre Channel expansion port of a Cisco Nexus 5010 switch. The Nexus switch automatically picked up the device aliases. As best I can tell, this is controlled by the device-alias distribute global configuration command (or its reverse, the no device-alias distribute, which would disable device alias distribution).
  • Once a device alias is defined for a WWPN, anytime the WWPN is displayed the device alias is also displayed. So in the output of various commands like show flogi database, show fcns database, or show zone you will see not only the WWPN, but also that WWPN’s associated device alias.
  • You can use the device alias as the destination with the fcping command.

All in all, I see a lot of value in using device aliases over simple Fibre Channel aliases. I’ll grant that some of this value is more readily apparent only in homogeneous Cisco storage networks, but even in single-switch networks I personally would use device aliases.

To those who suggested I look at device aliases, I thank you! You’ve made my job easier.

As always, I welcome your feedback! Feel free to speak up in the comments with corrections, clarifications, or suggestions.


As part of an ongoing effort to expand the functionality of the vSpecialist lab in the EMC RTP facility, we recently added a pair of Cisco MDS 9134 Fibre Channel switches. These Fibre Channel switches are connected to a pair of Cisco Nexus 5010 switches, which handle Unified Fabric connections from a collection of CNA-equipped servers. To connect the Nexus switches to the MDS switches, we used SAN port channels to bond multiple Fibre Channel interfaces together for both redundancy and increased aggregate throughput. Here is how to configure SAN port channels to connect a Cisco Nexus switch to a Cisco MDS switch.

If you are interested, more in-depth information can be found here on Cisco’s web site.

Although I’ve broken out the configuration for the MDS and the Nexus into separate sections, the commands are very similar. In my situation, the MDS 9134 was running NX-OS 5.0(1a) and the Nexus 5010 was running NX-OS 4.2(1)N1(1).

Configuring the Cisco MDS 9134

To configure the MDS 9134 with a SAN port channel, use the following commands.

First, create the SAN port channel with the interface port-channel command, like this:

mds(config)# interface port-channel 1

You can replace the “1” at the end of that command with any number from 1 to 256; it’s just the numeric identifier for that SAN port channel. The SAN port channel number does not have to match on both ends.

Once you’ve created the SAN port channel, then add individual interfaces with the channel-group command:

mds(config)# interface fc1/16
mds(config-if)# channel-group 1

The “1” specified in the channel-group command has to match the number specified in the earlier interface port-channel command. This might seem obvious, but I wanted to point it out nevertheless.

Repeat this process for each interface you want to add to the SAN port channel. In my example, I used two interfaces.

When you add an interface to the SAN port channel, NX-OS reminds you to perform a matching configuration on the switch at the other end, then use the no shutdown command to make the interfaces (and the SAN port channel) active. Let’s look first at the commands for configuring the Nexus, then we’ll examine what it looks like when we bring the SAN port channel online.

Configuring the Cisco Nexus 5010

The commands here are very similar to the MDS 9134. First, you need to create the SAN port channel using the interface san-port-channel command (note the slight difference in commands between the MDS and the Nexus here):

nexus(config)# interface san-port-channel 1

As with the MDS, the number at the end simply serves as a unique identifier for the SAN port channel and can range from 1 to 256.

Then add interfaces to the SAN port channel using the channel-group command:

nexus(config)# interface fc2/1
nexus(config-if)# channel-group 1
nexus(config-if)# interface fc2/2
nexus(config-if)# channel-group 1

As I’ve shown above, simply repeat the process for each interface you want to add to the SAN port channel. As on the MDS, NX-OS reminds you to perform a matching configuration on the opposite end of the link and then issue the no shutdown command.
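Since the two platforms differ only in the interface keyword, the command sequences for both ends can be produced from one small helper. This Python sketch generates exactly the commands shown in the two sections above and nothing more; the function name is hypothetical, and the 1 to 256 range check comes from the limits mentioned earlier.

```python
def san_port_channel_commands(platform, channel_id, interfaces):
    """Build SAN port channel commands for an MDS ("mds") or a Nexus ("nexus")."""
    if platform not in ("mds", "nexus"):
        raise ValueError("platform must be 'mds' or 'nexus'")
    if not 1 <= channel_id <= 256:
        raise ValueError("channel_id must be between 1 and 256")
    # The MDS uses "port-channel"; the Nexus uses "san-port-channel".
    keyword = "port-channel" if platform == "mds" else "san-port-channel"
    lines = [f"interface {keyword} {channel_id}"]
    for interface in interfaces:
        lines.append(f"interface {interface}")
        lines.append(f"channel-group {channel_id}")
    return lines


for line in san_port_channel_commands("nexus", 1, ["fc2/1", "fc2/2"]):
    print(line)
```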

Bringing Up the SAN Port Channel

Once a matching configuration is performed on both ends, then you can use the no shutdown command (which you can abbreviate to simply no shut) to activate the interfaces and the SAN port channel. After activating the interfaces, a show interface port-channel (on the MDS) or a show interface san-port-channel (on the Nexus) will show you the status of the SAN port channel. Only the first few lines of output are shown below (this output is taken from the Nexus):

nexus# sh int san-port-channel 1
san-port-channel 1 is trunking (Not all VSANs UP on the trunk)
    Hardware is Fibre Channel
    Port WWN is 24:01:00:05:9b:7b:0c:80
    Admin port mode is auto, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 1
    Speed is 4 Gbps
    Trunk vsans (admin allowed and active)  (1)
    Trunk vsans (up)                        ()
    Trunk vsans (isolated)                  ()
    Trunk vsans (initializing)              (1)

A couple of useful pieces of information are available here:

  • First, you can see that the SAN port channel is not fully up; it’s still initializing. This is shown by the “Not all VSANs UP on the trunk” message, as well as by the “Trunk vsans (initializing)” line.
  • Second, you can see that only a single member is up. Note the speed of the SAN port channel is listed as 4 Gbps.
  • Third, note that this is a trunking port, meaning that it could carry multiple VSANs. This is noted by the “Port mode is TE” line as well as the first line of the output (“san-port-channel 1 is trunking”).

As it turns out, I’d cabled the connections wrong; after I fixed the connections and gave the SAN port channel a small amount of time to initialize, the output was different (this output is taken from the MDS):

mds# sh int port-channel 1
port-channel 1 is trunking
    Hardware is Fibre Channel
    Port WWN is 24:01:00:05:73:a7:72:00
    Admin port mode is auto, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 1
    Speed is 8 Gbps
    Trunk vsans (admin allowed and active)  (1)
    Trunk vsans (up)                        (1)
    Trunk vsans (isolated)                  ()
    Trunk vsans (initializing)              ()

Now you can see that both members of the SAN port channel are active (“Speed is 8 Gbps”) and that all VSANs are trunking across the SAN port channel.
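If you want to check this from a script rather than by eye, the “Trunk vsans” lines are regular enough to parse. The Python sketch below works against the output format shown above; how the switch prints multiple VSANs or ranges is an assumption here, so entries are kept as plain strings rather than expanded.

```python
import re

# Matches lines such as: Trunk vsans (up)                        (1)
TRUNK_RE = re.compile(r"Trunk vsans \(([^)]*)\)\s+\(([^)]*)\)")


def trunk_vsan_states(output):
    """Map each trunk state label to the list of VSAN entries reported for it."""
    states = {}
    for match in TRUNK_RE.finditer(output):
        label, vsans = match.groups()
        states[label] = [v.strip() for v in vsans.split(",") if v.strip()]
    return states


def is_fully_up(states):
    """True when every allowed-and-active VSAN is also listed as up."""
    allowed = states.get("admin allowed and active", [])
    return bool(allowed) and states.get("up", []) == allowed


# The trunk lines from the first (still initializing) output above.
sample = """
    Trunk vsans (admin allowed and active)  (1)
    Trunk vsans (up)                        ()
    Trunk vsans (isolated)                  ()
    Trunk vsans (initializing)              (1)
"""
print(is_fully_up(trunk_vsan_states(sample)))
```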

At this point, you are now ready to proceed with creating VSANs, zones, and zonesets. Refer to these articles for more information on MDS zone creation and management via CLI:

New User’s Guide to Configuring Cisco MDS Zones via CLI
New User’s Guide to Managing Cisco MDS Zones via CLI

As always, questions, clarifications, or corrections are welcome—just add them below in the comments. Thanks!


I ran into an issue in the lab today with some VMware ESX 4.0 hosts and some older CLARiiON CX3 arrays. I’d been working to fix up the lab so that it more properly reflects a “best practices” configuration with dual SAN fabrics, dual-homed HBAs and CNAs, network connections spread across multiple physical switches, etc.—you know, all the wonderful things that we recommend to our customers.

As a result of cross-connecting both the HBAs and the CLARiiON’s storage processors to both SAN fabrics, I ended up with more paths to the LUNs than I had previously. This was, of course, fully expected. Upon browsing the properties for the datastore in the vSphere Client, however, I still saw only four paths—two target ports on each storage processor—when I expected to see more. Upon closer inspection, I determined that I wasn’t seeing the ports on the array that I had recently connected to the second fabric.

My first thought was that the SAN zoning was incorrect. I went back and double-checked the SAN zoning to ensure that all the initiators were, in fact, zoned to see all the targets. OK, so the zoning is correct. Why didn’t the extra paths show up?

I double-checked the physical layer; everything was fine there.

That left only the array itself. Logging into Navisphere, I saw in the Connectivity Status window that all the initiators were logged in and registered. (It turns out there is something else in the Connectivity Status window that I should have noticed, but didn’t. Read on to find out what I missed.) Hmmm…so that’s not it. I manually edited the initiators in the Connectivity Status window so that all the paths were linked to the same host, thinking perhaps that would resolve the problem, but it didn’t help. So, thinking that perhaps de-registering and re-registering the initiators might help, I enabled engineering mode in Navisphere so I could do just that. After enabling engineering mode but before I de-registered the initiators, I poked around to see if anything else stuck out at me.

(If you’re not familiar with engineering mode on a CLARiiON, just look at the results from a Google search like this. It should give you all the information you need.)

While browsing through Navisphere with engineering mode enabled, I noticed something I hadn’t noticed before: the VMware ESX host I was troubleshooting was showing up in two different storage groups. This is an error, as a host is only allowed to be in a single storage group at a time. In this case, some of the initiators were showing up in the desired storage group, but two of the initiators were showing up in the ~management storage group. Ah ha! Those initiators were my missing paths. But how to fix it?

It turns out that by looking at the properties of the storage group and then looking at the Hosts tab, there is now (with engineering mode enabled) an Advanced button that allows you to select the specific paths for each host that should be enabled for that storage group. When I opened the Advanced Properties dialog box for the storage group, there’s a separate tab for each host in the storage group that lists all the connection paths that should be included. And, sure enough, my two missing paths were there, unchecked! When I checked them and then went back to review the paths from the VMware ESX host, all six expected paths were now present and accounted for.

Now, I’m told that removing the host from the storage group and then re-adding it to the storage group would accomplish the same effect. That, however, is a disruptive process; this method is non-disruptive (as far as I can tell). I’m also told that the Reconnect button—found in the Host Connectivity status window, accessible by right-clicking a specific host and selecting Connectivity Status—will accomplish the same result as well. I can’t speak for either of these two options, but I do know that entering engineering mode and enabling all the paths works and works without disruption.

Oh, and remember how I mentioned that there was something I overlooked in the Connectivity Status window? I learned after the fact—after I’d already fixed the problem—that initiators that are blue in the Connectivity Status window are initiators that are not in a storage group. I don’t think knowing that up front would have helped all that much, but it’s still handy to know.

So there you have it: if you enable new paths from a host to a storage array and the paths don’t show up, use engineering mode to ensure that all the paths are enabled for the host in the storage group.

I encourage you to speak up in the comments if you have additional information or other tips/tricks pertaining to this issue. Thanks for reading!

UPDATE: A reader has posted in the comments that the Reconnect option will reestablish all paths to the host without disruption. Thanks, Tim!


Some time ago I posted a “how to” article on how to configure FCoE on a Nexus 5000 switch. At that time, I did not put the Nexus 5000 into NPV mode but rather connected it to a Cisco MDS Fibre Channel switch without using NPV. In this entry, I’d like to follow up on that article and show you how to configure NPV on a Nexus 5000.

If you aren’t familiar with NPV (N_Port Virtualization) and how it’s different than NPIV (N_Port ID Virtualization), check out this article titled “Understanding NPIV and NPV”; it should help clear things up. Because NPV makes the Nexus 5000 look like an NPIV-enabled host, one potential use for NPV, as in this case, is when connecting the Nexus 5000 to a non-Cisco Fibre Channel switch. Using NPV helps ease interoperability concerns. In this instance, I’m connecting the Nexus 5000 to a Brocade Fibre Channel switch (actually an EMC Connectrix).

Note that I tested these instructions on a Nexus 5010 using both NX-OS 4.1(3)N2(1) as well as NX-OS 4.2(1)N1(1).

The first step is to enable NPV on the Nexus 5000. As far as I can tell, in order to enable NPV you must first enable FCoE using the feature command:

switch(config)# feature fcoe

This loads various Fibre Channel modules and makes possible other features, including NPV and NPIV. Enabling NPV erases the switch configuration and reboots the switch, so be sure you are connected via a console connection before enabling NPV with the feature command:

switch(config)# feature npv

Immediately after enabling NPV, the Nexus 5000 will reboot (you’re warned and given the option to proceed or cancel). The warning indicates that the switch’s configuration will be removed, but the minimally-configured switches I used in my testing retained their configuration. Granted, I hadn’t performed any Fibre Channel or FCoE configurations yet, so perhaps that’s the configuration to which the warning was referring.

Once NPV is enabled on the switch, you can then configure Fibre Channel uplink ports as NP ports (proxy N_ports); these are also referred to as external interfaces. To configure a Fibre Channel port as an NP port, use these commands:

switch# config t
switch(config)# interface fc slot/port
switch(config-if)# switchport mode np
switch(config-if)# no shut
switch(config-if)# exit
switch(config)# exit

You should then be ready to physically connect to the upstream Fibre Channel switch, which—if you recall correctly from my earlier NPV/NPIV post—needs to be NPIV-enabled. In this particular case, I was uplinking the Nexus 5010 to an EMC-rebranded Brocade switch (a Connectrix DS-300B running Fabric OS 6.1.0). To show whether the port on the Connectrix was enabled for NPIV, I used the portcfgshow command:

rtp-fc-sw-01:admin> portcfgshow port number

Look for the line that says “NPIV Capability”; the value should be reported as “ON”. If the value is not “ON”, you’ll need to use the portcfgnpivport command to enable NPIV on the specified port, like this:

rtp-fc-sw-01:admin> portcfgnpivport port number 1

The “1” at the end of that command enables NPIV; a “0” would disable NPIV for that port.

Once NPIV is enabled on the upstream Fibre Channel switch, when you physically connect the configured external interface then the Fibre Channel link should come up. I used the show int fc slot/number command on the Nexus to verify that the port was up; on the Connectrix, I used the portshow port command. In addition, I was also able to see the Nexus switch logged into the Fibre Channel fabric on the Connectrix using the nsshow command.

Once you have Fibre Channel connectivity via the external interfaces, then configuring FCoE to hosts connected to the Nexus follows the same set of instructions laid out in my earlier FCoE how-to article.

From that point on, it’s only a matter of configuring zones (see here for help with zones on a Cisco MDS) and presenting storage. Those are different posts for a different day…


I’ve had these FCoE-related articles sitting around in my Yojimbo database for a while, but I’m only now getting around to doing something with them. There’s some great information in these posts, but be sure to check the comments as well; there’s equally good information to be found there.

FCoE Multi-hop: Why wait?
Re-examining FCoE and iSCSI Pros and Cons
FCoE vs. iSCSI: The Cagefight!
Gartner on FCoE. Whoa There, Sparky
8Gb Fibre Channel or 10Gb Ethernet w/ FCoE?


I was browsing through an EMC technical document titled “EMC CLARiiON Integration with VMware ESX Server” (download it here) a little while ago and I came across a phrase in the document that caught my attention:

“VMware ESX/ESXi support both Fibre Channel and iSCSI storage. However, VMware and EMC do not support connecting VMware ESX/ESXi servers to CLARiiON Fibre Channel and iSCSI devices on the same array simultaneously.”

What? No Fibre Channel and iSCSI from the same array to a VMware ESX/ESXi host simultaneously? That piqued my curiosity, so I contacted a few people within EMC to question the veracity of that statement. It turns out that the answer is more complicated than it might seem at first glance.

For those of you who aren’t interested in the deep technical details, here’s the short explanation behind this behavior:

  • VMware fully supports the use of both Fibre Channel and iSCSI from the same array to the same VMware ESX/ESXi host simultaneously.
  • VMware does not support presenting the same LUN via both protocols concurrently to the same host. (I qualified this directly with VMware.)
  • For a Celerra, you can use both Fibre Channel (via the CLARiiON side of the array) and iSCSI (via the Celerra side of the array) simultaneously. This is a fully supported configuration.
  • A CLARiiON array can easily present the same LUN via both Fibre Channel and iSCSI, but then VMware wouldn’t support it (see earlier bullet).
  • With a CLARiiON array, it is possible to present some LUNs via Fibre Channel and some LUNs via iSCSI to the same VMware ESX/ESXi host (i.e., LUN A via Fibre Channel and LUN B via iSCSI), but EMC will only support it if you file an RPQ. Without an RPQ, it’s an unsupported configuration. An RPQ, by the way, is a request to qualify a certain configuration for support.
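For those who think in decision tables, the short explanation above can be sketched as a small Python function. This is only my own illustration of the bullets, not anything published by VMware or EMC; the names Config, iscsi_source, and support_status are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One ESX/ESXi host using Fibre Channel and iSCSI against the same array."""
    iscsi_source: str              # "clariion" or "celerra" (hypothetical labels)
    same_lun_both_protocols: bool  # is any single LUN reached via both protocols?
    rpq_filed: bool = False        # has an RPQ been filed with EMC?


def support_status(cfg: Config) -> str:
    """Return the support position described in the bullets above."""
    # VMware does not support one LUN presented via both protocols to one host.
    if cfg.same_lun_both_protocols:
        return "unsupported by VMware"
    # Fibre Channel from the CLARiiON side plus iSCSI from the Celerra side is fine.
    if cfg.iscsi_source == "celerra":
        return "supported"
    # Different LUNs over different protocols from a CLARiiON needs an RPQ.
    if cfg.iscsi_source == "clariion":
        return "supported with RPQ" if cfg.rpq_filed else "unsupported without RPQ"
    raise ValueError(f"unknown iSCSI source: {cfg.iscsi_source!r}")


print(support_status(Config("celerra", same_lun_both_protocols=False)))
print(support_status(Config("clariion", same_lun_both_protocols=False)))
print(support_status(Config("clariion", same_lun_both_protocols=False, rpq_filed=True)))
```

The same-LUN check comes first on purpose: it is VMware's restriction, so it applies no matter which side of the array serves iSCSI.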

I’m confident that some other array vendors out there will be very quick to jump on this post and harp on this limitation until the cows come home. I would just ask this question: is it really as big of a limitation as it seems? I’ll come back to that question in a moment.

With the short explanation in mind, here are the more in-depth details. If you like the longer, more technical explanation, then read on!

From EMC’s side, the root of the restriction about using both Fibre Channel and iSCSI devices on the same array simultaneously stems from the interaction of host registration and storage groups.

Host registration is a requirement in the CLARiiON world. In order to present storage to a host from a CLARiiON array, you must first register the host’s initiators with the array in Navisphere. Once the host has been registered, then you can proceed with presenting storage to that host. In theory the CLARiiON could operate without registering hosts and initiators, but EMC chose to require registration. EMC made this choice in order to help simplify host management.

Requiring host registration is a bit different from some of the other storage arrays on the market. It’s not better or worse, just different. (Remember, pros and cons come with every technology decision.)

If you’re like me, you’re probably wondering at this point how requiring host registration simplifies anything. Instead of having to manage multiple paths, multiple initiators, and individual hosts every time you want to present storage to a host, you only need to register the host—and all of its initiators—and then you can refer to that same object (the host) over and over again as needed. Yes, host registration does mean a bit more work up front, but the idea is that it will save some work down the road. I guess you can think of host registration kind of like defining aliases in your Fibre Channel zoning configuration: it’s a bit more work up front, but it simplifies things later down the road. If you didn’t create device aliases in your Fibre Channel switch, you’d end up having to re-enter Fibre Channel WWPNs multiple times. You create the aliases so that it’s easier later. The same applies to host registration. Again, it’s a matter of choices.

One might also say that registration is a security measure, albeit a weak one. Rather than allow just any Fibre Channel-attached or iSCSI-attached host to see storage, the array requires that it know about the host (via host registration) in order to present storage to the host. This provides an additional layer of security to ensure that only authorized hosts are presented storage from the array.

Now you have a fairly decent idea of why host registration is necessary. So how does host registration occur? Host registration can occur either manually or automatically. Starting with version 4.0, both VMware ESX and VMware ESXi will automatically register with a CLARiiON array running any recent version of FLARE (ESX 3i version 3.5 also supports this form of push registration). FLARE release 28 and earlier will show these hosts as “Manually registered, unmanaged”; starting with FLARE 29, these hosts are listed as “Manually registered, managed”. In either case, the registration occurs automatically. If the host is Fibre Channel-attached, then the Fibre Channel initiators will be included in the automatic registration. The same goes for iSCSI initiators. Normally, this is a good thing because it saves the administrator the extra steps of registering the host with the storage array. (Also, because VMware ESX/ESXi hosts register automatically, there is no need to install the Navisphere Agent.)

In this case, though, the automatic registration causes a problem. Why? This goes back to the second item I said I needed to discuss: storage groups. Specifically, storage groups have two characteristics that come into play here:

  1. First, any given host—not just VMware ESX/ESXi hosts, but all types of hosts—can only be connected to a single storage group at any given time.
  2. Second, while the CLARiiON can present Fibre Channel LUNs and iSCSI LUNs simultaneously (including presenting the same LUN via both protocols simultaneously), there is no way within a single storage group to specify which LUNs should be accessed via Fibre Channel and which LUNs should be accessed via iSCSI. That kind of control would be necessary because VMware won’t support accessing the same LUN via both protocols at the same time (see earlier VMware support statement).

Do you see how all the pieces come together? The only way to control which LUNs should be presented via which protocol is to use multiple storage groups—but a host can only be in a single storage group at a time. With only a single host object for any given VMware ESX/ESXi host, that host can only see either Fibre Channel LUNs (by being in a storage group containing Fibre Channel LUNs) or iSCSI LUNs (by being in a storage group containing iSCSI LUNs), but not both. Hence, the statement in the CLARiiON document I referenced in the very beginning of this blog post that outlines using either Fibre Channel or iSCSI but not both. This behavior is required to enforce the single-protocol LUN access required by VMware.
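If it helps to see the two storage group characteristics side by side, here is a toy Python model of them. It is a sketch of the behavior as I have described it, not of how Navisphere or FLARE are actually implemented; ToyArray, the storage group names, and the host name esx01 are all made up for the illustration.

```python
class StorageGroupError(Exception):
    """Raised when a host is connected to a second storage group."""


class ToyArray:
    """Toy model: one storage group per host, no per-protocol LUN filter."""

    def __init__(self):
        self.group_luns = {}  # storage group name -> set of LUN names
        self.host_group = {}  # host name -> the single group it is connected to

    def create_group(self, name, luns):
        self.group_luns[name] = set(luns)

    def connect_host(self, host, group):
        # Characteristic 1: a host can be in only one storage group at a time.
        if host in self.host_group:
            raise StorageGroupError(
                f"{host} is already connected to {self.host_group[host]}"
            )
        self.host_group[host] = group

    def visible_luns(self, host, protocol):
        # Characteristic 2: the protocol argument is deliberately ignored,
        # because the group cannot say "this LUN via FC, that one via iSCSI".
        return set(self.group_luns[self.host_group[host]])


array = ToyArray()
array.create_group("sg-mixed", ["LUN_A", "LUN_B"])
array.create_group("sg-iscsi", ["LUN_B"])
array.connect_host("esx01", "sg-mixed")

# Every LUN in the group is reachable over every registered initiator.
same_over_both = array.visible_luns("esx01", "fc") == array.visible_luns("esx01", "iscsi")

# Trying to split LUNs across two groups for the same host fails.
try:
    array.connect_host("esx01", "sg-iscsi")
    second_group_allowed = True
except StorageGroupError:
    second_group_allowed = False

print(same_over_both, second_group_allowed)  # True False
```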

As with all things, there is a workaround. Because it is a workaround, an RPQ is necessary to get full support.

To work around this problem, you’ll need to ignore the automatic host registration (or disable it) and instead create two manually registered “pseudo-hosts”: one with the Fibre Channel initiators and one with the iSCSI initiators. These “pseudo-hosts” will need different fake IP addresses (if they both use the same IP address, Navisphere will treat them as the same host, thus defeating the purpose of the workaround). Put the Fibre Channel initiators into the Fibre Channel storage group(s), and put the iSCSI initiators into the iSCSI storage group(s). Each “pseudo-host” will see only the LUNs presented to its own storage group, so the physical host will see both Fibre Channel and iSCSI LUNs at the same time. And, as required by VMware, any given LUN would be accessed only via Fibre Channel or iSCSI but not both. Remember that you need to file an RPQ in order to get support on this configuration.
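The pseudo-host idea can also be sketched in a few lines of Python. Again, this is a toy model under my own assumptions: it simply treats the IP address as the key that identifies a host object, and the IP addresses, initiator names (wwpn-1, iqn-1), and storage group names are placeholders rather than real values.

```python
from collections import defaultdict


def register(registrations):
    """Group initiators into host objects keyed by IP address.

    registrations: iterable of (ip, protocol, initiator_id) tuples.
    Two registrations with the same IP collapse into one host object.
    """
    hosts = defaultdict(list)
    for ip, protocol, initiator in registrations:
        hosts[ip].append((protocol, initiator))
    return dict(hosts)


def luns_by_protocol(hosts, host_to_group, group_luns):
    """Return {protocol: LUNs the physical server reaches via that protocol}."""
    seen = defaultdict(set)
    for ip, initiators in hosts.items():
        luns = group_luns[host_to_group[ip]]
        for protocol, _initiator in initiators:
            seen[protocol] |= set(luns)
    return dict(seen)


group_luns = {"sg-fc": {"LUN_A"}, "sg-iscsi": {"LUN_B"}}

# Same IP address: both initiators end up in a single host object.
merged = register([
    ("10.0.0.50", "fc", "wwpn-1"),
    ("10.0.0.50", "iscsi", "iqn-1"),
])

# Different fake IP addresses: two pseudo-hosts, one per protocol.
pseudo = register([
    ("192.0.2.1", "fc", "wwpn-1"),
    ("192.0.2.2", "iscsi", "iqn-1"),
])
access = luns_by_protocol(
    pseudo,
    {"192.0.2.1": "sg-fc", "192.0.2.2": "sg-iscsi"},
    group_luns,
)

# No LUN is reachable via both protocols.
overlap = access["fc"] & access["iscsi"]

print(len(merged), len(pseudo), access, overlap)
```

With the two pseudo-hosts in separate storage groups, LUN_A is reachable only over Fibre Channel and LUN_B only over iSCSI, which is the single-protocol-per-LUN access that VMware requires.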

For VMware ESX/ESXi 4.0 hosts (and ESX 3i version 3.5 hosts), you can disable automatic registration using the Disk.EnableNaviReg advanced configuration option. Setting this value to 0 disables the automatic registration with Navisphere. (Here are screenshots for VMware ESX 3i and VMware ESX/ESXi 4.) If you disable the automatic registration, then you only need to manually register the Fibre Channel and iSCSI initiators as separate “pseudo-hosts” and you’re ready to go.

Let me reiterate that if you are presenting iSCSI LUNs via the Celerra and not the CLARiiON, none of this applies. Presenting Fibre Channel LUNs via the CLARiiON and iSCSI LUNs via the Celerra to the same VMware ESX/ESXi host is fine. The workaround that I’ve described only applies when you want to present some LUNs via Fibre Channel and some LUNs via iSCSI from a CLARiiON to a single VMware ESX/ESXi host.

Earlier you’ll recall that I asked this question: is this really a limitation? There are a couple of viewpoints:

  • One viewpoint states there is no need for both Fibre Channel and iSCSI connectivity to the same array. Since you already have Fibre Channel connectivity to the array, what’s the point in using iSCSI? Conversely, if you already have iSCSI connectivity to an array, why invest in establishing Fibre Channel connectivity? Since you can’t use it for failover (that would violate the VMware support position), running another block protocol against the same array and same sets of disks doesn’t add a great deal of value.
  • A second viewpoint argues that the ability to provide a differentiation of service based on the different performance characteristics of Fibre Channel and iSCSI (and NFS, but we’re focusing on block protocols for this discussion) is valuable, and thus the need to be able to easily present LUNs via either protocol from the same array to the same host is a worthwhile function. There are a number of potential use cases here—test/development environments, Tier 2 applications, varying SLAs, etc. This is especially true if you are using different disk pools (fast Fibre Channel drives or EFDs vs. slower SATA drives) on the same array.

I can see both sides of the coin. Personally, I tend to side more with the second viewpoint and would prefer to see the CLARiiON have the ability to easily present Fibre Channel and iSCSI to the same host, especially when multiple disk pools are involved. I think that CLARiiON engineering is now evaluating this possibility; as more information emerges, I’ll be sure to keep you posted.

Courteous and professional comments, clarifications, or corrections are always welcome!
