FibreChannel

Two technologies that seem to have come to the fore recently are NPIV (N_Port ID Virtualization) and NPV (N_Port Virtualization). Judging just by the names, you might think that these two technologies are the same thing. While they are related in some aspects and can be used in a complementary way, they are quite different. What I’d like to do in this post is help explain these two technologies, how they are different, and how they can be used. I hope to follow up in future posts with some hands-on examples of configuring these technologies on various types of equipment.

First, though, I need to cover some basics. This is unnecessary for those of you that are Fibre Channel experts, but for the rest of the world it might be useful:

  • N_Port: An N_Port is an end node port on the Fibre Channel fabric. This could be an HBA (Host Bus Adapter) in a server or a target port on a storage array.
  • F_Port: An F_Port is a port on a Fibre Channel switch that is connected to an N_Port. So, the port into which a server’s HBA or a storage array’s target port is connected is an F_Port.
  • E_Port: An E_Port is a port on a Fibre Channel switch that is connected to another Fibre Channel switch. The connection between two E_Ports forms an Inter-Switch Link (ISL).

There are other types of ports as well—NL_Port, FL_Port, G_Port, TE_Port—but for the purposes of this discussion these three will get us started. With these definitions in mind, I’ll start by discussing N_Port ID Virtualization (NPIV).

N_Port ID Virtualization (NPIV)

Normally, an N_Port would have a single N_Port_ID associated with it; this N_Port_ID is a 24-bit address assigned by the Fibre Channel switch during the FLOGI process. The N_Port_ID is not the same as the World Wide Port Name (WWPN), although there is typically a one-to-one relationship between WWPN and N_Port_ID. Thus, for any given physical N_Port, there would be exactly one WWPN and one N_Port_ID associated with it.

What NPIV does is allow a single physical N_Port to have multiple WWPNs, and therefore multiple N_Port_IDs, associated with it. After the normal FLOGI process, an NPIV-enabled physical N_Port can subsequently issue additional commands to register more WWPNs and receive more N_Port_IDs (one for each WWPN). The Fibre Channel switch must also support NPIV, as the F_Port on the other end of the link would “see” multiple WWPNs and multiple N_Port_IDs coming from the host and must know how to handle this behavior.

Once all the applicable WWPNs have been registered, each of these WWPNs can be used for SAN zoning or LUN presentation. There is no distinction between the physical WWPN and the virtual WWPNs; they all behave in exactly the same fashion and you can use them in exactly the same ways.
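To make the login sequence concrete, here is a minimal toy model of the switch side of the link. It is only a sketch: the class name `ToyFPort`, the WWPNs, and the way the 24-bit N_Port_ID is packed from domain, area, and port bytes are illustrative assumptions, and a real switch allocates IDs in its own way. The additional NPIV logins are modeled as FDISC requests, which is the usual mechanism, although the text above only calls them "additional commands."

```python
from dataclasses import dataclass, field


@dataclass
class ToyFPort:
    """Toy switch-side F_Port that hands out 24-bit N_Port_IDs (hypothetical model)."""
    domain_id: int                 # 8-bit switch domain ID
    area: int                      # 8-bit area, assumed here to be one per physical port
    npiv_enabled: bool = True
    logins: dict = field(default_factory=dict)   # WWPN -> N_Port_ID

    def _next_id(self) -> int:
        # Pack domain / area / port into one 24-bit address.
        port = len(self.logins)
        if port > 0xFF:
            raise RuntimeError("no free N_Port_IDs left on this port")
        return (self.domain_id << 16) | (self.area << 8) | port

    def flogi(self, wwpn: str) -> int:
        """Initial fabric login from the physical N_Port."""
        if self.logins:
            raise RuntimeError("FLOGI already completed on this port")
        self.logins[wwpn] = self._next_id()
        return self.logins[wwpn]

    def fdisc(self, wwpn: str) -> int:
        """Additional NPIV login that registers one more (virtual) WWPN."""
        if not self.npiv_enabled:
            raise RuntimeError("this F_Port does not support NPIV")
        if not self.logins:
            raise RuntimeError("FLOGI must happen before any FDISC")
        if wwpn in self.logins:
            raise RuntimeError("WWPN is already logged in")
        self.logins[wwpn] = self._next_id()
        return self.logins[wwpn]


f_port = ToyFPort(domain_id=0x0A, area=0x01)
physical_id = f_port.flogi("20:00:00:25:b5:00:00:01")   # made-up WWPNs
vm_id = f_port.fdisc("20:00:00:25:b5:00:00:02")
print(f"{physical_id:06x} {vm_id:06x}")   # prints "0a0100 0a0101"
```

Each WWPN ends up with its own N_Port_ID on the same physical link, which is the property the zoning and LUN presentation discussion relies on.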

So why might this functionality be useful? Consider a virtualized environment, where you would like to be able to present a LUN via Fibre Channel to a specific virtual machine only:

  • Without NPIV, it’s not possible, because the N_Port on the physical host has only a single WWPN (and N_Port_ID). Any LUNs have to be zoned and presented to this single WWPN, and since every VM on the host shares that one physical N_Port, WWPN, and N_Port_ID, those LUNs are visible to all VMs on the host.
  • With NPIV, the physical N_Port can register additional WWPNs (and N_Port_IDs). Each VM can have its own WWPN. When you build SAN zones and present LUNs using the VM-specific WWPN, then the LUNs will only be visible to that VM and not to any other VMs.
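The difference between the two cases can be sketched with a small lookup. This is a simplified model, not how an array implements masking: the function `visible_luns`, the WWPN labels, and the LUN names are all hypothetical.

```python
def visible_luns(masking: dict, wwpns: set) -> set:
    """Return every LUN presented to at least one of the given WWPNs."""
    return set().union(*(masking.get(wwpn, set()) for wwpn in wwpns))


# Without NPIV: both VMs sit behind the host's single WWPN.
masking_without_npiv = {"host-wwpn": {"lun-db", "lun-web"}}
print(visible_luns(masking_without_npiv, {"host-wwpn"}))   # both LUNs, for every VM

# With NPIV: each VM has its own WWPN, so LUNs can be presented per VM.
masking_with_npiv = {"vm-db-wwpn": {"lun-db"}, "vm-web-wwpn": {"lun-web"}}
print(visible_luns(masking_with_npiv, {"vm-db-wwpn"}))     # only lun-db
```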

Virtualization is not the only use case for NPIV, although it is certainly one of the easiest to understand.

<aside>As an aside, it’s interesting to me that VMotion works and is supported with NPIV as long as the RDMs and all associated VMDKs are in the same datastore. Looking at how the physical N_Port has the additional WWPNs and N_Port_IDs associated with it, you’d think that VMotion wouldn’t work. I wonder: does the HBA on the destination ESX/ESXi host have to “re-register” the WWPNs and N_Port_IDs on that physical N_Port as part of the VMotion process?</aside>

Now that I’ve discussed NPIV, I’d like to turn the discussion to N_Port Virtualization (NPV).

N_Port Virtualization (NPV)

While NPIV is primarily a host-based solution, NPV is primarily a switch-based technology. It is designed to reduce switch management and overhead in larger SAN deployments. Consider that every Fibre Channel switch in a fabric needs a different domain ID, and that the total number of domain IDs in a fabric is limited. In some cases, this limit can be fairly low depending upon the devices attached to the fabric. The problem, though, is that you often need to add Fibre Channel switches in order to scale the size of your fabric. There is therefore an inherent conflict between trying to reduce the overall number of switches in order to keep the domain ID count low while also needing to add switches in order to have a sufficiently high port count. NPV is intended to help address this problem.

NPV introduces a new type of Fibre Channel port, the NP_Port. The NP_Port connects to an F_Port and acts as a proxy for other N_Ports on the NPV-enabled switch. Essentially, the NP_Port “looks” like an NPIV-enabled host to the F_Port on the other end. An NPV-enabled switch will register additional WWPNs (and receive additional N_Port_IDs) via NPIV on behalf of the N_Ports connected to it. The physical N_Ports don’t have any knowledge this is occurring and don’t need any support for it; it’s all handled by the NPV-enabled switch.

Obviously, this means that the upstream Fibre Channel switch must support NPIV, since the NP_Port “looks” and “acts” like an NPIV-enabled host to the upstream F_Port. Additionally, because the NPV-enabled switch now looks like an end host, it no longer needs a domain ID to participate in the Fibre Channel fabric. Using NPV, you can add switches and ports to your fabric without adding domain IDs.
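The domain ID savings are simple arithmetic, sketched below. The function name and the switch counts are made up for illustration; the only rule taken from the text is that a switch in NPV mode does not consume a domain ID.

```python
def domain_ids_needed(core_switches: int, edge_switches: int, edge_in_npv_mode: bool) -> int:
    """Count the domain IDs a fabric consumes; NPV-mode edge switches use none."""
    return core_switches + (0 if edge_in_npv_mode else edge_switches)


# Hypothetical fabric: 2 core switches plus 16 edge switches.
print(domain_ids_needed(2, 16, edge_in_npv_mode=False))  # 18
print(domain_ids_needed(2, 16, edge_in_npv_mode=True))   # 2
```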

So why is this functionality useful? There is the immediate benefit of being able to scale your Fibre Channel fabric without having to add domain IDs, yes, but in what sorts of environments might this be particularly useful? Consider a blade server environment, like an HP c7000 chassis, where there are Fibre Channel switches in the back of the chassis. By using NPV on these switches, you can add them to your fabric without having to assign a domain ID to each and every one of them.

Here’s another example. Consider an environment where you are mixing different types of Fibre Channel switches and are concerned about interoperability. As long as there is NPIV support, you can enable NPV on one set of switches. The NPV-enabled switches will then act like NPIV-enabled hosts, and you won’t have to worry about connecting E_Ports and creating ISLs between different brands of Fibre Channel switches.

I hope you’ve found this explanation of NPIV and NPV helpful and accurate. In the future, I hope to follow up with some additional posts—including diagrams—that show how these can be used in action. Until then, feel free to post any questions, thoughts, or corrections in the comments below. Your feedback is always welcome!

Disclosure: Some industry contacts at Cisco Systems provided me with information regarding NPV and its operation and behavior, but this post is neither sponsored nor endorsed by anyone.

Fibre Channel over Ethernet (FCoE) is receiving a great deal of attention in the media these days. Fortunately, setting up FCoE on a Nexus 5000 series switch from Cisco isn’t too terribly complicated, so don’t be too concerned about deploying FCoE in your datacenter (assuming it makes sense for your organization). Configuring FCoE basically consists of three major steps:

  1. Enable FCoE on the switch.
  2. Map a VSAN for FCoE traffic onto a VLAN.
  3. Create virtual Fibre Channel interfaces to carry the FCoE traffic.

The first step is incredibly easy. To enable FCoE on the switch, just use this command:

switch(config)# feature fcoe

The next part of the FCoE configuration is mapping a VSAN to a VLAN. What VSAN should you use? Well, if you are connecting to an existing Fibre Channel fabric, perhaps on a Cisco MDS switch, you’ll need to make sure that the VSANs between the Nexus and the MDS are appropriately matched. Otherwise, traffic on one VSAN on the Nexus won’t be able to reach devices on another VSAN on the MDS. If there’s enough demand, I’ll post a quick piece on this step as well.

Note that this FCoE VSAN-to-VLAN mapping is a required step; if you don’t do this, the FCoE side of the interfaces won’t come up (as you’ll see later in this post). Assuming the VSAN is already defined, perform these steps to map the VSAN to a VLAN:

switch(config)# vlan XXX
switch(config-vlan)# fcoe vsan YYY
switch(config-vlan)# exit

Obviously, you’ll want to replace XXX and YYY with the correct VLAN and VSAN numbers, respectively.

After you’ve enabled FCoE and mapped FCoE VSANs onto VLANs, then you are ready to create virtual Fibre Channel (vfc) interfaces. Each physical Nexus port that will carry FCoE traffic must have a corresponding vfc interface. Generally, you will want to create the vfc interface with the same number as the physical interface, although as far as I know you are not required to do so. It just makes management of the interfaces easier. The commands to create a vfc interface look like this:

switch(config)# interface vfc ZZ
switch(config-if)# bind interface ethernet 1/ZZ
switch(config-if)# no shutdown
switch(config-if)# exit

At this point the vfc interface is created, but it won’t work yet; you’ll need to place it into a VSAN that is mapped to an FCoE-enabled VLAN. If you don’t, the show interface vfc <number> command will report this:

vfc13 is down (VSAN not mapped to an FCoE enabled VLAN)

As I mentioned earlier, if you haven’t mapped the FCoE VSAN onto a VLAN, you won’t be able to fix this problem. If you have mapped the FCoE VSAN onto a VLAN, then you only need to assign the vfc interface to the appropriate VSAN with these commands:

switch(config)# vsan database
switch(config-vsan-db)# vsan <number> interface vfc <number>
switch(config-vsan-db)# exit

At this point, the vfc interface will report up, and you should be able to see the host’s connection information with the show flogi database command.
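If you have many ports to configure, the steps above are easy to template. Here is a minimal Python sketch that only assembles the same commands shown in this post, in the same order; the function name `fcoe_config` and the example numbers are hypothetical. It assumes the VSAN already exists and that the vfc number matches the physical port number, and it does not talk to a switch.

```python
def fcoe_config(vlan: int, vsan: int, port: int, slot: int = 1) -> list:
    """Build the FCoE command sequence from this post for one physical port."""
    return [
        "feature fcoe",                              # step 1: enable FCoE
        f"vlan {vlan}",                              # step 2: map the VSAN onto a VLAN
        f"fcoe vsan {vsan}",
        "exit",
        f"interface vfc {port}",                     # step 3: create and bind the vfc
        f"bind interface ethernet {slot}/{port}",
        "no shutdown",
        "exit",
        "vsan database",                             # finally: place the vfc in the VSAN
        f"vsan {vsan} interface vfc {port}",
        "exit",
    ]


# Example values are placeholders, not recommendations.
print("\n".join(fcoe_config(vlan=100, vsan=100, port=13)))
```

Review the generated lines before pasting them into a configuration session.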

From this point—assuming that your storage is attached to a traditional Fibre Channel fabric, which is likely to be the case in the near future—you only need to create zones with the WWNs of the FCoE-attached hosts in order to grant them access to the storage. Refer to my posts on creating zones and managing zones on a Cisco MDS for more information on this task.

In my own experience, once FCoE is properly configured on the Nexus 5000 switch, creating zones and zonesets on the Cisco MDS Fibre Channel switch and creating and masking LUNs on the Fibre Channel-attached storage are very straightforward. This, as has been stated on several previous occasions, is one of the strengths of FCoE: its compatibility with existing Fibre Channel installations is outstanding.

Feel free to submit any questions or clarifications in the comments below.

Toward the end of August 2009, I posted an article on how to configure Cisco MDS zones via the command-line interface (CLI). This article is a follow-up to that article; in this post, I’ll review some commands that are helpful in managing those zones.

As with the first post, this post probably won’t be very helpful to users who are well-versed in the Cisco MDS family of Fibre Channel switches. That’s why I’ve tagged it as a “new user’s” post. Similarly, I’m not going into the need for zones, as that is covered amply elsewhere.

First, I find it extremely handy to be able to rename Fibre Channel aliases using the fcalias rename command like this:

switch(config)# fcalias rename <old alias> <new alias> vsan XXX

You can also rename zones:

switch(config)# zone rename <old zone name> <new zone name> vsan XXX

And you can rename zonesets:

switch(config)# zoneset rename <old zoneset name> <new zoneset name> vsan XXX

In my earlier article I talked about the zoneset clone command, but you can also clone aliases and individual zones. I’m not yet convinced of the value of being able to clone an individual alias, and if you are using single initiator/single target zoning I’m not 100% sure how helpful it will be to clone a specific zone. Still, the functionality is there if you need it.

Adding a new alias, zone, or zoneset is similar to modifying an existing alias, zone, or zoneset. For example, to add a new alias to an existing zone, you would use these commands:

switch(config)# zone name existing-zone-name-here vsan XXX
switch(config-zone)# member fcalias new-alias-to-add
switch(config-zone)# exit

Likewise, adding a new zone to an existing zoneset is similar to defining a new zoneset:

switch(config)# zoneset name existing-zoneset-name vsan XXX
switch(config-zoneset)# member new-zone-to-add
switch(config-zoneset)# member another-new-zone
switch(config-zoneset)# exit

Managing zones via the CLI can be a bit daunting; as the number of aliases and zones increases, it becomes more difficult to work with all of them and find only the ones in which you are interested at the moment. Here, using the include keyword can be rather handy. Consider this command:

switch# show zone | include server-name
zone name server-name-storage vsan XXX
  fcalias name server-name vsan XXX
zone name server-name-storage2 vsan XXX
  fcalias name server-name vsan XXX

The matching text is server-name in each line, so you can see that the include keyword acts a bit like grep. This makes it much easier to filter out only the zones you want or need to see, instead of having to wade through all the currently defined zones. This is not an MDS-specific trick; it’s also applicable in IOS and NX-OS. And it works not only with zones, but also with zonesets, FC aliases, etc.
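If you have captured show zone output to a file or a string, the same filtering is easy to reproduce off the switch. This is a rough sketch: the function name `include` just mirrors the CLI keyword, the sample output is made up, and it does a plain substring match, which is a simplification of what the switch does.

```python
def include(output: str, pattern: str) -> list:
    """Keep only the lines that contain the pattern, like `| include`."""
    return [line for line in output.splitlines() if pattern in line]


# Made-up sample of captured `show zone` output.
sample = "\n".join([
    "zone name server-name-storage vsan 10",
    "  fcalias name server-name vsan 10",
    "  fcalias name stor-array-processor-a vsan 10",
    "zone name other-host-storage vsan 10",
    "  fcalias name other-host vsan 10",
])

for line in include(sample, "server-name"):
    print(line)
```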

Cisco MDS experts, feel free to post additional suggestions on managing zones via the CLI in the comments below so that all readers can benefit. Thanks for reading!

Earlier I posted some notes on meetings I’d had with Virsto and Xangati. In this post I’d like to discuss some additional meetings I’ve had with Virtual Instruments and Tranxition.

Virtual Instruments

Virtual Instruments makes a solution that is intended to help troubleshoot and optimize storage environments. I had the opportunity to grab some coffee with them this morning and hear about what they’re doing and how they’re doing it. As a company carved out of Finisar and taken private, their goal is to help drive higher levels of virtualization by providing more visibility into the storage fabric.

Clearly, this message will really only resonate with larger customers, and that is their target market: multiple hundreds of terabytes into the single petabyte range. At this scale, providing visibility into the thousands of virtual machines across hundreds of ESX/ESXi hosts attached to hundreds of Fibre Channel ports is almost impossible. Virtual Instruments tackles this with a multi-prong approach:

  • First, they use a SAN tap to plug into the Fibre Channel fabric and mirror traffic information to a collection device for analysis. If you’re a networking person, you can think of this as using a SPAN port to mirror traffic. This is done on the storage side to reduce the scale due to fan in-fan out ratios.
  • Second, they gather SNMP information from the Fibre Channel switches. This enables visibility at the switch level.
  • Third and finally, Virtual Instruments collects information from VMware vCenter Server. This information provides the final piece necessary to correlate per-host and per-VM traffic to the information being gathered by the fabric taps and the switch monitoring.

What this allows Virtual Instruments to do is to feed information back to vCenter Server to enable I/O-based recommendations for VM movement. It also enables visibility into path utilization so that multipathing information can be configured for optimal performance. Finally, more detailed storage information is exposed that enables organizations to more effectively place VM storage on Tier 1, Tier 2, or Tier 3 according to its storage needs. In some cases, in fact, money saved on buying additional Tier 1 storage can more than pay for an implementation of Virtual Instruments.

Overall, this is a very interesting solution, albeit limited in scope to larger environments. If this describes your organization, though, it may definitely be worth a closer look.

Tranxition

Tranxition makes software to do “personality virtualization.” Apparently they’ve been around since 1998 and are just now becoming more visible, creating a partner program, and starting to expand coverage. Their key product is Adaptive Persona, which some have said can be called “Softricity for user personality data”. The product seems to work a lot like ThinApp in that it creates a virtual file system and virtual Registry that captures all user personality data. This user personality data, which can reside either inside or outside the traditional user profile file system structure, is then continuously streamed back to a central server. When a user logs off, whatever data has not been synchronized to the server is then copied up to the server, and the local system is scrubbed of user personality data. Then, when that same user logs on to a different system, Tranxition streams down only those portions of the user personality that are needed at that moment. All other data is fetched “on demand”. This helps speed up the logon process by decoupling the size of the profile from the time required to log on.

Overall, I was fairly impressed with the product. They seem to have done a reasonably good job of taking the principles behind application virtualization and applied them to user personality management. If anyone has any additional feedback on Tranxition (vendors, please disclose yourselves!), I’d love to hear it in the comments.

I’m a bit new to the Cisco MDS family of Fibre Channel switches, so I’m sure that this information is “old hat” to the storage pros out there who’ve done it a million times. Hence, I’m labeling this one as a “new user” article. The topic of this post is how to use the command-line interface (CLI) to configure zones on a Cisco MDS 9000 series Fibre Channel switch.

I won’t go into great detail on the purpose of zones and that sort of thing; I’m sure it’s been covered in excruciating detail elsewhere. (Knowledgeable readers with any links to that sort of information are encouraged to share those links in the comments.) Instead, I’ll just focus on the mechanics of how it’s done.

First, create some aliases for your own use instead of having to remember the Fibre Channel World Wide Port Names (WWPNs). This will make life a lot easier, in my opinion. You create aliases using the fcalias command, like this (where applicable in this command and all other commands in this post, replace XXX with the appropriate VSAN number):

switch(config)# fcalias name stor-array-processor-a vsan XXX
switch(config-fcalias)# member pwwn AA:BB:CC:DD:EE:FF:00:11
switch(config-fcalias)# exit
switch(config)#

Obviously, you’ll replace the fake WWPN I used in the command above with the correct WWPN for that device. Repeat this process for all the storage processor ports, server HBAs, etc. From this point forward, you can use the alias in place of the WWPN when creating zones. See, isn’t that easier?

Next, create zones. Each zone should have a single initiator and (ideally) a single target, although multiple targets are usually acceptable. To create a zone, use the zone and member commands like this:

switch(config)# zone name first-new-zone vsan XXX
switch(config-zone)# member fcalias stor-array-processor-a
switch(config-zone)# member fcalias server-hba
switch(config-zone)# exit
switch(config)#

Since each zone contains only a single initiator, you’ll need to repeat this process for each initiator.

Once you have all the zones created, next create a zoneset. You can create a new zoneset just using the zoneset command, or you can clone an existing zoneset with the zoneset clone command. In this case, I’ll clone an existing zoneset:

switch(config)# zoneset clone existing-zoneset new-zoneset vsan XXX

From here, you have a copy of the existing zoneset, which already had all the previously defined zones as members. Add the new zones you’ve defined to the zoneset like this:

switch(config)# zoneset name new-zoneset vsan XXX
switch(config-zoneset)# member first-new-zone
switch(config-zoneset)# member second-new-zone
switch(config-zoneset)# exit

Finally, activate the zoneset:

switch(config)# zoneset activate name new-zoneset vsan XXX

Then save the configuration with copy running-config startup-config (copy runn start for short) and you should be good to go! All you need to do now is configure and present storage from the storage array to the initiators. But that’s another topic for another post…
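Since single-initiator zoning means repeating the same few commands for every HBA, it can help to generate them. The sketch below only emits the commands used in this post and its follow-up (fcalias name, zone name, zoneset name, zoneset activate name); the function name, alias names, WWPNs, and the zone naming scheme are all hypothetical, and it builds a new zoneset rather than cloning an existing one.

```python
def zoning_commands(vsan: int, aliases: dict, target: str, initiators: list, zoneset: str) -> list:
    """Build alias, zone, and zoneset commands with one zone per initiator."""
    cmds = []
    for alias, pwwn in aliases.items():
        cmds += [f"fcalias name {alias} vsan {vsan}", f"member pwwn {pwwn}", "exit"]

    zone_names = []
    for initiator in initiators:
        zone = f"{initiator}-{target}"          # assumed naming convention
        zone_names.append(zone)
        cmds += [
            f"zone name {zone} vsan {vsan}",
            f"member fcalias {target}",
            f"member fcalias {initiator}",
            "exit",
        ]

    cmds.append(f"zoneset name {zoneset} vsan {vsan}")
    cmds += [f"member {zone}" for zone in zone_names]
    cmds.append("exit")
    cmds.append(f"zoneset activate name {zoneset} vsan {vsan}")
    return cmds


# Fake WWPNs, in the same spirit as the placeholder used earlier in this post.
aliases = {
    "stor-array-processor-a": "AA:BB:CC:DD:EE:FF:00:11",
    "server1-hba": "AA:BB:CC:DD:EE:FF:00:21",
    "server2-hba": "AA:BB:CC:DD:EE:FF:00:22",
}
commands = zoning_commands(10, aliases, "stor-array-processor-a",
                           ["server1-hba", "server2-hba"], "new-zoneset")
print("\n".join(commands))
```

As with any generated configuration, read it through before applying it, and remember to save the configuration afterward.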

UPDATE: I’ve posted a follow-up to this article on managing zones via the CLI.

By Aaron Delp
Twitter: aarondelp
FriendFeed (Delicious, Twitter, & all my blogs in one spot): aarondelp

This week I’ve had the privilege of attending a Cisco Nexus 5000/7000 class. I have learned a tremendous amount about FCoE this week and after some conversations with Scott about the topic, I wanted to tackle it one more time from a different point of view. I have included a list of some of Scott’s FCoE articles at the bottom for those interested in a more in-depth analysis.

Disclaimer: I am by no means an FCoE expert! My working knowledge of FCoE is about four days old at this point. If I am incorrect in some of my situations below, please let me know (keep it nice and professional, people!) and I will be happy to make adjustments.

If you are an existing VMware customer today with FC as your storage transport layer, should you be thinking about FCoE? How would you get started? What investments can you make in the near future to prepare you for the next generation?

These are questions I am starting to get from my customers in planning/white board sessions around VMware vSphere and the next generation of virtualization. The upgrade to vSphere is starting to prompt planning and discussions around the storage infrastructure.

Before I tackle the design aspect, let me start out with some hardware and definitions.

Cisco Nexus 5000 series switch: The Nexus 5K is a Layer 2 switch that is capable of FCoE and can provide both Ethernet and FC ports (with an expansion module). In addition to Ethernet switching, the switch also operates as an FC fabric switch providing full fabric services, or it can be set for N_Port Virtualization (NPV) mode. The Nexus 5K can’t be put in FC switch mode and NPV mode at the same time. You must pick one or the other.

N_Port Virtualization (NPV) mode: NPV allows the Nexus 5K to act as a FC “pass thru” or proxy. NPV is great for environments where the existing fabric is not Cisco and merging the fabrics could be ugly. There is a downside to this. In NPV mode, no targets (storage arrays) can be hung off the Nexus 5K. This is true for both FC and FCoE targets.

Converged Network Adapter (CNA): A CNA is a single PCI card that contains both FC and Ethernet logic, negating the need for separate cards, separate switches, etc.

Now that the definitions and terminology are out of the way, I see four possible paths if you have FC in your environment today.

1. FCoE with a Nexus 5000 in a non-Cisco MDS environment (merging)

In this scenario, the easiest way to get the Nexus on the existing non-Cisco FC fabric is to put the switch in NPV mode. You could put the switch in interop mode (and all the existing FC switches), but it is a nightmare to get them all talking and you often lose vendor specific features in interop mode. Plus, to configure interop mode, the entire fabric has to be brought down. (You do have redundant fabrics, right?)

With the Nexus in NPV mode, what will it do? Not much. You can’t hang storage off of it. You aren’t taking advantage of Cisco VSANs or any other features that Cisco can provide. You are merely a pass thru. The zoning is handled by your existing switches; your storage is off the existing switches, etc.

Why would you do this? By doing this, you could put CNAs in new servers (leaving the existing servers alone) to talk to the Nexus. This will simplify the server side infrastructure because you will have fewer cables, cards, switch ports, etc. Does the cost of the CNA and new infrastructure offset the cost of just continuing the old environment? That is for you to decide.

2. FCoE with a Nexus 5000 in a non-Cisco MDS environment (non-merging)

Who says you have to put the Nexus into the existing FC fabric? We have many customers that purchase “data centers in a box”. By that I mean a few servers, FC and/or Ethernet switches, storage, and VMware all in one solution. This “box” sits in the data center and the network is merged with the legacy network, but we stand up a Cisco SAN next to the existing non-Cisco SAN and just don’t let them talk to each other. In this instance, we would use CNAs in the servers, Nexus as the switch, and you pick a storage vendor. This will work just like option 3.

3. FCoE with a Nexus 5000 in a Cisco MDS environment

Now we’re talking. Install the Nexus in FC switch mode, merge it with the MDS fabric, put CNAs in all the servers and install the storage off the Nexus as either FC or FCoE. You’re off to the races!

You potentially gain the same server side savings by replacing FC and Ethernet in new servers with CNAs. You are able to use all of the Cisco sexy features of FCoE. Nice solution if the cost is justified in your environment.

4. Keep the existing environment and use NFS to new servers

What did I just say? Why would I even consider that option?

OK, this last one is a little tongue-in-cheek for customers that are already using FC. NFS vs. traditional block storage for VMware is a bit of a religious debate. I know you aren’t going to sway me and I know I’m not going to sway you.

I admit I’m thinking NetApp here in a VMware environment; I’m a big fan so this is a biased opinion. NetApp is my background but other vendors play in this space as well. I bet Chad will be happy to leave a comment to help tell us why (and I hope he does!).

Think of it this way. You’re already moving from FC cards to CNAs. Why not buy regular 10Gb Ethernet cards instead? Why not just use the Nexus 5K as a line-rate, non-blocking 10Gb Ethernet switch? This configuration is very simple compared to FCoE at the Nexus level and management of the NetApp is very easy! Besides, you could always turn up FCoE on the Nexus (and the NetApp) at a future date.

In closing, I really like FCoE but as you can see it isn’t a perfect fit today for all environments. I really see this taking off in 1-2 years and I can’t wait. Until then, use caution and ask all the right questions!

If you are interested in some more in-depth discussion, here are links to some of Scott’s articles on FCoE:

Continuing the FCoE Discussion
Why No Multi-Hop FCoE?
There Might Be an FCoE End to End Solution

Last week’s partner boot camp for the Cisco Unified Computing System (UCS) was very helpful. It has really helped me gain a better understanding of the solution, how it works, and its advantages and disadvantages. I’d like to share some random bits of information I gathered during the class here in the hopes that it will serve as a useful add-on to the formal training. I’m sorry the thoughts aren’t better organized.

  • Although the UCS 6100 fabric interconnects are based on Nexus 5000 technologies, they are not the same. It would be best for you not to compare the two, or you’ll find yourself getting confused (I did, at least) because there are some things the Nexus 5000 will do that the fabric interconnects won’t do. Granted, some of these differences are the result of design decisions around the UCS, but they are differences nonetheless.
  • You’ll see the terms “northbound” and “southbound” used extensively throughout UCS documentation. Northbound traffic is traffic headed out of the UCS (out of the UCS 6100 fabric interconnects) to external Ethernet and Fibre Channel networks. Southbound traffic is traffic headed into the UCS (out of the UCS 6100 fabric interconnects to the I/O modules in the chassis). You may also see references to “east-to-west” traffic; this is traffic moving laterally from chassis to chassis within a UCS.
  • For a couple of different reasons (reasons I will expand upon in future posts), there is no northbound FCoE or FC connectivity out of the UCS 6100 fabric interconnects. This means that you cannot hook your storage directly into the UCS 6100 fabric interconnects. This, in turn, means that purchasing a UCS alone is not a complete solution—customers need supporting infrastructure in order to install a UCS. That supporting infrastructure would include a Fibre Channel fabric and 10Gbps Ethernet ports.
  • Continuing the previous thought, this means that—with regard to UCS, at least—my previous assertion that there is no such thing as an end-to-end FCoE solution is true. (Read my correction article and you’ll see that I qualified the presence of end-to-end FCoE solutions as solutions that did not include UCS.)
  • The I/O Modules (IOMs) in the back of each chassis are fabric extenders, not switches. This is analogous to the Nexus 5000-Nexus 2000 relationship. (Again, be careful about the comparisons, though.) You’ll see the IOMs occasionally referred to as fabric extenders, or FEXs. As a result, there is no switching functionality in each chassis—all switching takes place within the UCS 6100 fabric interconnects. Some of the implications of this architecture include:
    1. All east-to-west traffic must travel through the fabric interconnects, even for east-to-west traffic between two blades in the same chassis.
    2. When you use the Cisco “Palo” adapter and start creating multiple virtual NICs and/or virtual HBAs, the requirement for all east-to-west traffic applies to each individual vNIC. This means that east-to-west traffic between individual vNIC instances on the same blade must also travel through the fabric interconnects.
    3. This means that in ESX/ESXi environments using hypervisor bypass (VMDirectPath) with Cisco’s “Palo” adapter, inter-VM traffic between VMs on the same host must travel through the fabric interconnects. (This is not true if you are using a software switch, including the Nexus 1000V, but rather only when using hypervisor bypass.)
  • Each IOM can connect to a single fabric interconnect only. You cannot uplink a single IOM to both fabric interconnects. For full redundancy, then, you must have both fabric interconnects and both IOMs in each and every chassis.
  • Each 10Gbps port on a blade connects to a single IOM. To use both ports on a mezzanine adapter, you must have both IOMs in the chassis; to have both IOMs in the chassis, you must have both fabric interconnects. This makes the initial cost much higher (because you have to buy everything), but incremental cost much lower.
  • If you want to use FCoE, you must purchase the Cisco “Menlo” adapter. This will provide both a virtual NIC (vNIC) and a virtual HBA (vHBA) for each IOM populated in the chassis (i.e., populate the chassis with a single IOM and you get one vNIC and one vHBA, use two IOMs and get two vNICs and two vHBAs).
  • If you use the Cisco “Oplin” adapter, you’ll get 10Gbps Ethernet only. There is no FCoE support; you would have to use a software-based FCoE stack.
  • The Cisco “Palo” adapter offers the ability to use SR-IOV to present multiple, discrete instances of vNICs and vHBAs. The number of instances is based on the number of uplinks from the IOMs to the fabric interconnects. The formula for calculating this number is 15 * (IOM uplinks) - 2. So, for two uplinks, you could create a total of 28 vNICs or vHBAs (any combination of the two, not 28 each).
  • Blades within a UCS are designed to be completely stateless; the full identity of the system can be assigned dynamically using a service profile. However, to take full advantage of this statelessness, organizations will also have to use boot-from-SAN. This further echoes the need for organizations to dramatically re-architect in order to really exploit the value of UCS.
  • There are Linux kernels embedded everywhere: in the blades’ firmware, in the firmware of the IOMs, in the chassis, and in the fabric interconnects. On the blades, this embedded Linux version is referred to as pnuOS. (At the moment, I can’t recall what it stands for. Sorry.)
  • In order to reconfigure a blade, the UCS Manager boots into pnuOS, reconfigures the blade, and then boots “normally.” While this is kind of cool, it also makes the reconfiguration of a blade take a lot longer than I expected. Frankly, I was a bit disappointed at the time it took to associate or de-associate a service profile to a blade.
  • To monitor the status of a service profile association or de-association, you’ll use the FSM (Finite State Machine) tab within UCS Manager.
  • You’ll need a separate block of IP addresses, presumably on a separate VLAN, to supply a management address for each blade. Cisco folks won’t like this analogy, but consider these the equivalent of Enclosure Bay IP Addressing (EBIPA) in the HP c7000 environment.
  • The UCS Manager software is written in Java. Need I say anything further?
  • UCS Manager uses the idea of a “service profile” to control the entire identity of the server. However, admins must be careful when creating and associating service profiles. A service profile that has two vNICs assigned would require a blade in a chassis with two IOMs connected to two fabric interconnects, and that service profile would fail to associate to a blade in a chassis with only a single IOM. Similarly, a service profile that defines both vNICs and vHBAs (assuming the presence of the “Menlo” or “Palo” adapters) would fail to associate to a blade with an “Oplin” adapter because the “Oplin” adapter doesn’t provide vHBA functionality. The onus is upon the administrator to ensure that the service profile is properly configured for the hardware. Once again, I was disappointed that the system was not more resilient in this regard.
  • Each service profile can be associated to exactly one blade, and each blade may be associated to exactly one service profile. To apply the same type of configuration to multiple blades, you would have to use a service profile template to create multiple, identical service profiles. However, a change to one of those service profiles will not affect any of the other service profiles cloned from the same template.
  • UCS Manager does offer role-based access control (RBAC), which means that different groups within the organization can be assigned different roles: the network group can manage networking, the storage group can manage the SAN aspects, and the server admins can manage the servers. This effectively addresses the concerns of some opponents that UCS places the network team in control.
  • While UCS supports some operating systems on the bare metal, it really was designed with virtualization in mind. ESX 4.0.0 (supposedly) installs out of the box, although I have yet to actually try that myself. The “Palo” adapter is built for VMDirectPath; in fact, Cisco makes a big deal about hypervisor bypass (that’s a topic I’ll address in a future post). With that in mind, some of the drawbacks—such as how long it takes to associate or de-associate a blade—become a little less pertinent.
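The “Palo” sizing formula from the list above is easy to check with a few lines of Python. This is only a sketch of the arithmetic as it was given in class; the function name is my own, and I haven’t verified which uplink counts the hardware actually supports.

```python
def max_palo_virtual_interfaces(iom_uplinks: int) -> int:
    """Combined number of vNICs and vHBAs, using 15 * (IOM uplinks) - 2."""
    if iom_uplinks < 1:
        raise ValueError("at least one IOM uplink is required")
    return 15 * iom_uplinks - 2


# Two uplinks gives 28 interfaces in total (any mix of vNICs and vHBAs).
print(max_palo_virtual_interfaces(2))
```

Keep in mind the result is a combined budget for vNICs and vHBAs, not a per-type limit.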

I guess that about does it for now. I’ll update this post with more information as I recall it over the next few days. I also encourage other readers who have attended similar UCS events to share any additional points in the comments below.

This post is a follow-up from my post yesterday titled “No Such Thing as an End-to-End FCoE Solution”.

After publishing that post, I managed to get in touch with some very smart people who were willing to spend some time with me and educate me on the various intricacies involved here. In order to help you, my readers, understand the various pieces and parts, I’ll need to first provide some definitions.

Fibre Channel Forwarder (FCF): In its simplest form, this is another term for an FCoE switch. A Nexus 5000 would be an example of an FCF.

Multi-hop FCoE: There are a couple of different definitions here. One definition would be having multiple FCFs connected together (i.e., a Nexus 5000 connected to another Nexus 5000). A second definition would be having multiple Layer 2 hops between an FCoE initiator or target and an FCF. Note that the switches handling those hops must be IEEE DCB capable.

FCoE Initialization Protocol: FIP, as it’s more commonly known, was included in the FC-BB-5 FCoE standard that was finalized in early June.

OK, now that I have some definitions in place, I can discuss how it might be possible (eventually) to build an end-to-end FCoE solution.

Disclaimer: I don’t claim to be an FCoE expert; I’m just trying to understand it better myself and help others understand it better. If I’m misrepresenting something, let me know (courteously and professionally) in the comments, or drop me an e-mail.

The question “Can I build an end-to-end FCoE solution?” has multiple answers:

  • If you have only a single FCF and everything is plugged into that FCF, then you can build a pure FCoE solution today. The Nexus 5000 can function as the FCF, and both the CNAs and targets that are available will work. Obviously, this is not a very scalable solution.
  • If you have multiple FCFs, or if you have multiple Layer 2 hops between initiators or targets and the FCFs, then you might or might not be able to build an end-to-end FCoE solution. In this scenario, FIP-enabled initiators and targets would be able to find and communicate with each other, but non-FIP-enabled initiators and targets would not (unless they were plugged into the same FCF). At this point I am unclear about connectivity between pre-FIP initiators or targets on the same IEEE DCB-capable Layer 2 switch (not an FCF); I suspect they would not be able to communicate.
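To make the two cases above a little more concrete, here is a rough Python sketch of the reachability rule as I currently understand it. The `Endpoint` type, its fields, and the device names are all made up for illustration, and UCS is deliberately left out. For pre-FIP devices that share a DCB-capable Layer 2 switch (not an FCF), the sketch simply encodes my suspicion that they can’t communicate; that part is not confirmed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    """A hypothetical FCoE initiator or target."""
    name: str
    fip_enabled: bool
    # Name of the FCF this endpoint plugs into directly, or None when it
    # reaches an FCF through intermediate DCB-capable Layer 2 switches.
    direct_fcf: Optional[str]


def can_communicate(a: Endpoint, b: Endpoint) -> bool:
    """Apply the two rules described above to a pair of endpoints."""
    # Case 1: both endpoints are plugged into the same FCF.
    if a.direct_fcf is not None and a.direct_fcf == b.direct_fcf:
        return True
    # Case 2: multiple FCFs or extra Layer 2 hops; both ends need FIP.
    return a.fip_enabled and b.fip_enabled


host = Endpoint("host", fip_enabled=False, direct_fcf="n5k-1")
array = Endpoint("array", fip_enabled=False, direct_fcf="n5k-1")
remote_array = Endpoint("remote-array", fip_enabled=False, direct_fcf="n5k-2")

print(can_communicate(host, array))         # same FCF
print(can_communicate(host, remote_array))  # different FCFs, no FIP
```

Treat this as a way to reason about a design on paper, not as a statement of what any particular switch will do.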

All of these statements are applicable until you bring UCS into the mix. For UCS, my earlier statement stands: with UCS, you cannot have an end-to-end FCoE solution today. That will change at some point in the future, but no one has shared any information with me regarding just how far in the future that might be.

If you have a pure Nexus 5000 environment, with FCoE-capable storage and servers with CNAs, you’d probably be able to make it work. With FIP support in that environment, you’d definitely be able to make it work. When you add UCS, though, it becomes very different. I hope to be able to discuss that in greater detail in the near future.

So, my earlier statement wasn’t entirely true; it is possible to build an end-to-end FCoE solution. Today, that solution would be very limited in size; once FIP support is baked into the initiators, the targets, and the FCFs, then the solution size will be able to scale.

As always, comments and clarifications are welcome!

Update: See this follow-up post for more information.

I mentioned yesterday on Twitter that I’d had something of a revelation with regard to Fibre Channel over Ethernet (FCoE). This is probably nothing new to the experienced storage intelligentsia, but I’m just a simple guy so this was a big deal. After a spirited discussion in the Cisco UCS class about how to best leverage “FCoE-capable” storage, I have come to this realization: there is no such thing as an end-to-end FCoE solution.

If you’re impatient and want the short story, here it is: Even if you have an FCoE-capable storage array and you have FCoE converged network adapters (CNAs), you still can’t build an end-to-end FCoE solution. Why? Because you must put a standard Fibre Channel switch into the mix in order to provide fabric services like zoning, etc., because equipment like the UCS 6100 fabric interconnects and the Nexus 5000 don’t provide those services.

Here’s the longer version. We were having a discussion in the Cisco UCS training class revisiting the northbound FCoE connectivity issue that I discussed here. It turns out that the UCS 6100 fabric interconnect runs in NPV (or end-host) mode, so you can’t hook up any sort of storage target, FC or FCoE, directly to the UCS 6100 fabric interconnect. Even if you were to enable the UCS 6100 fabric interconnect to run in switch mode (something that’s not possible today), you still couldn’t hook a storage target, FC or FCoE, to the fabric interconnect because the fabric interconnect doesn’t provide any fabric services. Further, even if you were to leave the UCS 6100 fabric interconnect in NPV mode and add a Nexus 5000 switch to the mix, you can’t hook the UCS 6100 and the Nexus 5000 together because FCoE isn’t multi-hop capable (yet). If I understand correctly, the FC-BB-5 standard includes FIP, which will address this limitation. However, according to the information I’m getting here (and I’m fully open to more information from others who are “in the know”), even that won’t fully address the problem because neither the UCS 6100 nor the Nexus 5000 will offer fabric services. So, you will still need a traditional Fibre Channel switch, like a Cisco MDS 9000 series, to provide fabric services.

The end result is that, today, it’s impossible to build an end-to-end FCoE solution. You will still need a traditional Fibre Channel switch somewhere in the mix, either to connect the FCoE equipment together (for example, to link a UCS 6100 fabric interconnect to a Nexus 5000) and/or to provide fabric services.

<aside>Now, there seems to be some confusion within Cisco, as the UCS resources to which I’ve been speaking are confirming my conclusions, but others (consider this tweet by Brad Hedlund) are saying it’s not true. I don’t know who’s correct—I can only go on what I’m being given.</aside>

As a result, it seems completely futile and useless for storage vendors to offer FCoE support on their storage arrays until these issues are addressed. In my mind, this further cements FCoE as an “edge-only” solution. Adding fabric services to the Nexus 5000 and/or UCS 6100 fabric interconnects would address this problem, and perhaps that’s something that is now enabled and made possible via the FC-BB-5 standard and FIP. If so, I have yet to hear a timeline in which these limitations will be addressed.

Either way, if you’re thinking of deploying FCoE today, be sure to keep this in mind or you could find yourself in for a surprise.

Courteous comments and clarifications are welcome!

I’m about halfway through the first day of Unified Computing System (UCS) training in San Jose, CA, and I’ve learned of what I think is a fairly significant limitation. The issue centers around what Cisco refers to as “northbound” traffic and how Fibre Channel over Ethernet (FCoE) is handled with northbound traffic.

Recall that a central part of UCS is the UCS 6100 series fabric interconnect. The 6100 series fabric interconnect has connectivity in two directions:

  • Southbound connectivity is connectivity aimed back at the fabric extenders in the blade chassis themselves.
  • Northbound connectivity is connectivity headed outside the UCS to other systems and networks.

All southbound traffic is 10Gbps Ethernet with FCoE. Northbound traffic can be 10Gbps Ethernet or Fibre Channel, but not FCoE. Based on the information I’ve been given (and if I’m incorrect please let me know in the comments), you cannot directly connect an FCoE-enabled storage array to a UCS. Even if your storage array has native FCoE interfaces, you can’t plug them into the UCS 6100 series fabric interconnects because that’s considered northbound traffic and you can’t use FCoE with northbound traffic.
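One way to keep the two directions straight is to write the rule down as a small lookup table. This Python sketch only restates what I was told in class; the dictionary, the protocol labels, and the function name are my own invention and have nothing to do with any UCS Manager API.

```python
# Protocols the 6100 fabric interconnect carries in each direction,
# as described in the training class (labels are illustrative only).
SUPPORTED_PROTOCOLS = {
    "southbound": {"ethernet", "fcoe"},
    "northbound": {"ethernet", "fc"},
}


def is_supported(direction: str, protocol: str) -> bool:
    """Return True if the given protocol can be used in the given direction."""
    return protocol.lower() in SUPPORTED_PROTOCOLS[direction.lower()]


# A storage array with native FCoE ports would sit on the northbound side.
print(is_supported("northbound", "fcoe"))
```

Under this rule, the FCoE array fails the northbound check, which is exactly the limitation described above.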

I have a feeling customers who have purchased storage arrays with FCoE interfaces with the intention of hooking the arrays up directly to a UCS are going to be a bit upset when this information becomes more widely known.

If I’m working from incorrect or incomplete information, please feel free to speak up in the comments.
