nehalem


By Aaron Delp
Twitter: aarondelp
FriendFeed (Delicious, Twitter, & all my blogs in one spot): aarondelp

I wanted to relay some information regarding choosing memory speeds and types for the new Intel Xeon 5500 (Nehalem family) processors. As stated in my previous article on the Nehalem CPUs, there are some decisions that need to be made when choosing the memory and processor combinations. Let’s start off with what the memory architecture looks like.

  • The current Xeon 5500 family is a two-socket configuration.
  • Memory can run at 1333 MHz, 1066 MHz, or 800 MHz.
  • Memory is currently produced in single-, dual-, and quad-rank configurations. Dual rank is faster than single rank; quad rank is currently limited to 1066 MHz.
  • Each CPU socket has 3 memory channels for a total of 6 channels per server.
  • Each channel can accept up to 3 DIMMs. This is why servers are currently built with either 12 sockets (2 DIMMs per channel x 3 channels per processor x 2 processor sockets) or 18 sockets (3 DIMMs per channel x 3 channels per processor x 2 processor sockets).
  • Some servers come in a 16 DIMM arrangement. Please see this IBM Paper for more information.
  • The maximum memory speed is limited by processor. For example, the X5570 has a max memory speed of 1333 MHz, the E5540 has a max memory speed of 1066 MHz, etc.
  • As more DIMMs are added to a channel, the memory speed of that system drops.
  • Better performance is achieved when the memory is “balanced” (the total amount of memory across channels is the same).
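The slot-count arithmetic in the list above can be checked with a few lines of Python. This is only a minimal sketch; the function and constant names are made up for illustration.

```python
# Illustrative sketch: DIMM slot counts for a Xeon 5500 server.
# The 3 channels per socket figure comes from the list above;
# the names used here are hypothetical.
CHANNELS_PER_SOCKET = 3


def total_dimm_slots(dimms_per_channel, sockets=2):
    """Return the number of DIMM slots for the given layout."""
    return dimms_per_channel * CHANNELS_PER_SOCKET * sockets


print(total_dimm_slots(2))  # 2 DIMMs per channel
print(total_dimm_slots(3))  # 3 DIMMs per channel
```

With two sockets this gives the 12-slot and 18-slot layouts described above.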

Take a look at the Memory section of the HP QuickSpecs for the BL460 G6 server. I found this to be a great source.

So, what does all of that mean? It means that for best performance you should install the memory using the following guidelines:

  • Ideally, install DIMMs in sets of 6, 1 per channel (populate both sockets with CPUs!). Use DIMMs that are dual rank and have the fastest speed you can purchase that the processor supports.
  • Populate the first slot in all channels first, then populate the 2nd slots in all channels, etc. Don’t put all three DIMMs in one channel and leave other channels empty.
  • Balance the amount of memory in each channel whenever possible (for example, 3 x 4GB on two channels and 1 x 4GB plus 1 x 8GB on the last channel, which gives 12GB per channel).
  • If at all possible, try to keep the system away from the 800MHz memory speed.
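The guidelines above can be sketched in Python as follows. This is a rough illustration under assumptions: the (socket, channel, slot) numbering and the function names are hypothetical, and real servers label and order their slots in vendor-specific ways, so check the server documentation for the actual population order.

```python
# Illustrative sketch of the population guidelines above.
# Slot numbering (socket, channel, slot) is hypothetical; real servers
# use vendor-specific slot labels and ordering.


def population_order(sockets=2, channels=3, slots_per_channel=3):
    """Return (socket, channel, slot) tuples: slot 1 of every channel
    first, then slot 2 of every channel, and so on."""
    order = []
    for slot in range(1, slots_per_channel + 1):
        for socket in range(1, sockets + 1):
            for channel in range(1, channels + 1):
                order.append((socket, channel, slot))
    return order


def place_dimms(num_dimms, **layout):
    """Return the slots to fill for num_dimms DIMMs."""
    order = population_order(**layout)
    if num_dimms > len(order):
        raise ValueError("more DIMMs than slots")
    return order[:num_dimms]


def is_balanced(gb_per_channel):
    """True when every channel holds the same total amount of memory."""
    return len(set(gb_per_channel)) == 1


# Six DIMMs land in slot 1 of each of the six channels.
print(place_dimms(6))
# The mixed-size example from the list: 3 x 4GB, 3 x 4GB, 4GB + 8GB.
print(is_balanced([3 * 4, 3 * 4, 4 + 8]))
```

Filling by slot first, then by channel, is what keeps a single channel from being loaded with three DIMMs while other channels sit empty.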

Here is a link to an awesome IBM white paper explaining everything.

Here’s an example 12 DIMM slot Nehalem configuration:

| Processor (clock speed) | Max Mem Speed | Bank 1 in Channel Populated | Bank 2 in Channel Populated |
| --- | --- | --- | --- |
| X5570 (2.93 GHz) | 1333 MHz | 1333 MHz | 1066 MHz * |
| X5560 (2.80 GHz) | 1333 MHz | 1333 MHz | 1066 MHz * |
| X5550 (2.66 GHz) | 1333 MHz | 1333 MHz | 1066 MHz * |
| E5540 (2.53 GHz) | 1066 MHz | 1066 MHz | 1066 MHz |
| E5530 (2.40 GHz) | 1066 MHz | 1066 MHz | 1066 MHz |
| E5520 (2.26 GHz) | 1066 MHz | 1066 MHz | 1066 MHz |
| E5506 (2.13 GHz) | 800 MHz | 800 MHz | 800 MHz |
| E5504 (2.00 GHz) | 800 MHz | 800 MHz | 800 MHz |
| E5502 (1.86 GHz) | 800 MHz | 800 MHz | 800 MHz |

Here’s an example 18 DIMM slot Nehalem configuration:

| Processor (clock speed) | Max Mem Speed | Bank 1 in Channel Populated | Bank 2 in Channel Populated | Bank 3 in Channel Populated |
| --- | --- | --- | --- | --- |
| X5570 (2.93 GHz) | 1333 MHz | 1333 MHz | 1066 MHz * | 800 MHz |
| X5560 (2.80 GHz) | 1333 MHz | 1333 MHz | 1066 MHz * | 800 MHz |
| X5550 (2.66 GHz) | 1333 MHz | 1333 MHz | 1066 MHz * | 800 MHz |
| E5540 (2.53 GHz) | 1066 MHz | 1066 MHz | 1066 MHz | 800 MHz |
| E5530 (2.40 GHz) | 1066 MHz | 1066 MHz | 1066 MHz | 800 MHz |
| E5520 (2.26 GHz) | 1066 MHz | 1066 MHz | 1066 MHz | 800 MHz |
| E5506 (2.13 GHz) | 800 MHz | 800 MHz | 800 MHz | 800 MHz |
| E5504 (2.00 GHz) | 800 MHz | 800 MHz | 800 MHz | 800 MHz |
| E5502 (1.86 GHz) | 800 MHz | 800 MHz | 800 MHz | 800 MHz |

* According to the HP QuickSpecs for the BL460 G6, HP is able to keep the speed at 1333 MHz with 2 DIMMs per channel. A BIOS update is required to achieve this, and it is HP specific.
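Both tables above follow one pattern: the memory runs at the lower of the processor's maximum memory speed and the ceiling set by the number of DIMMs per channel. Here is a small Python sketch of that lookup. The names are hypothetical, and it deliberately leaves out the HP-specific BIOS exception, the quad-rank limit, and the rated speed of the DIMMs themselves.

```python
# Illustrative sketch: effective memory speed from the tables above.
# Not modeled: the HP BIOS exception, quad-rank DIMMs, DIMM speed ratings.

# Maximum memory speed (MHz) per processor, copied from the tables.
CPU_MAX_MEM_MHZ = {
    "X5570": 1333, "X5560": 1333, "X5550": 1333,
    "E5540": 1066, "E5530": 1066, "E5520": 1066,
    "E5506": 800, "E5504": 800, "E5502": 800,
}

# Ceiling (MHz) imposed by the number of DIMMs populated per channel.
DPC_MAX_MHZ = {1: 1333, 2: 1066, 3: 800}


def effective_mem_speed(cpu, dimms_per_channel):
    """Return the memory speed in MHz for a processor and channel load."""
    return min(CPU_MAX_MEM_MHZ[cpu], DPC_MAX_MHZ[dimms_per_channel])


for cpu in ("X5570", "E5540", "E5506"):
    print(cpu, [effective_mem_speed(cpu, dpc) for dpc in (1, 2, 3)])
```

The X5570 steps down through all three speeds, while the E5506 stays at 800 MHz no matter how the channels are populated.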

Common Questions:

Q: What kind of performance decrease will I see by lowering the clock speed of my memory? For example, using 6 x 2GB DIMMs (running at 1333 MHz) vs 12 x 1GB DIMMs (running at 1066 MHz) to save a little money.

A: According to the IBM white paper listed above, we have two main areas of performance to worry about: latency and throughput. The latency difference between 1333 MHz and 800 MHz is about 10%. Memory throughput is another story though. The difference between 1333 MHz and 1066 MHz is about 9%. The difference from 1066 MHz to 800 MHz is 28%!

Q: What kind of performance increase will I see in a “balanced” (same amount of memory per channel) system?

A: Again, according to the IBM paper, you will see a performance increase if the system is balanced. An exact number isn’t given.

Q: Which is fastest? Single, dual, or quad rank DIMMs?

A: According to the IBM white paper, dual rank outperforms single rank by 7% in SPECjbb2005. Quad rank DIMMs decrease the clock speed to 1066 MHz so they are not faster at this time.

Q: What if I only populate one processor?

A: You want to populate both sockets if performance is a concern. Adding the second processor not only makes the second set of DIMM sockets available, it also doubles the memory bandwidth.
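To put a rough number on the bandwidth point, here is a back-of-the-envelope Python sketch. It assumes a standard 64-bit (8 bytes per transfer) DDR3 channel and computes theoretical peak figures only; these are not the measured results from the IBM paper, and real throughput will be lower.

```python
# Illustrative sketch: theoretical peak memory bandwidth.
# Assumption: each DDR3 channel is 64 bits wide (8 bytes per transfer).
# These are paper maximums, not measured numbers.
BYTES_PER_TRANSFER = 8
CHANNELS_PER_SOCKET = 3


def peak_bandwidth_gbs(mem_speed_mhz, sockets):
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_second = (
        mem_speed_mhz * 1e6 * BYTES_PER_TRANSFER * CHANNELS_PER_SOCKET * sockets
    )
    return bytes_per_second / 1e9


one_socket = peak_bandwidth_gbs(1333, sockets=1)
two_sockets = peak_bandwidth_gbs(1333, sockets=2)
print(f"1 socket: {one_socket:.1f} GB/s, 2 sockets: {two_sockets:.1f} GB/s")
```

Because each processor has its own three channels, the peak figure scales directly with the number of populated sockets.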


By Aaron Delp
Twitter: aarondelp

Hello everyone! It’s Aaron again. I’m sorry for falling off the radar for a bit. A new generation of Intel processors is upon us and I felt the need to come out of seclusion to share some recent findings regarding the new architecture. Today’s article will explore the new processor offerings. I will be following this up with one (or more depending on the length) about the memory architecture and interconnects.

There is one simple reason why I wrote this article. You can no longer pick a processor based on clock speed. The Nehalem processors have “levels” now and each level provides additional features and functionality lacking in the lower levels. You will need to be careful when choosing a processor if you are looking for certain features. Here is a quick table listing the models and the features:

| Processor (clock speed) | Watts | Max Mem Speed | Turbo Mode and Hyper-Threading |
| --- | --- | --- | --- |
| X5570 (2.93 GHz) | 95W | 1333 MHz | Yes |
| X5560 (2.80 GHz) | 95W | 1333 MHz | Yes |
| X5550 (2.66 GHz) | 95W | 1333 MHz | Yes |
| E5540 (2.53 GHz) | 80W | 1066 MHz | Yes |
| E5530 (2.40 GHz) | 80W | 1066 MHz | Yes |
| E5520 (2.26 GHz) | 80W | 1066 MHz | Yes |
| E5506 (2.13 GHz) | 80W | 800 MHz | No |
| E5504 (2.00 GHz) | 80W | 800 MHz | No |
| E5502 (1.86 GHz) | 80W | 800 MHz | No |
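One way to use the table above is to filter on the features you need instead of on clock speed. The following Python sketch does that; the class, field, and function names are hypothetical, and the data is just the table re-typed.

```python
# Illustrative sketch: pick Xeon 5500 models by feature, not clock speed.
# Data is copied from the table above; all names are hypothetical.
from typing import NamedTuple


class Xeon5500(NamedTuple):
    model: str
    watts: int
    max_mem_mhz: int
    turbo_ht: bool  # Turbo Mode and Hyper-Threading


MODELS = [
    Xeon5500("X5570", 95, 1333, True),
    Xeon5500("X5560", 95, 1333, True),
    Xeon5500("X5550", 95, 1333, True),
    Xeon5500("E5540", 80, 1066, True),
    Xeon5500("E5530", 80, 1066, True),
    Xeon5500("E5520", 80, 1066, True),
    Xeon5500("E5506", 80, 800, False),
    Xeon5500("E5504", 80, 800, False),
    Xeon5500("E5502", 80, 800, False),
]


def candidates(min_mem_mhz=800, need_turbo_ht=False):
    """Return model names that meet the memory speed and feature needs."""
    return [
        m.model
        for m in MODELS
        if m.max_mem_mhz >= min_mem_mhz and (m.turbo_ht or not need_turbo_ht)
    ]


print(candidates(need_turbo_ht=True))
print(candidates(min_mem_mhz=1333))
```

Asking for Turbo Mode and Hyper-Threading rules out the three lowest models, and asking for 1333 MHz memory leaves only the X-series parts.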

I find the Max Memory Speed particularly interesting. As you will see in the next article, memory speed can get pretty complex very quickly. The more memory that is installed in the system, the lower the clock speed on the memory. The days of installing in matched pairs and forgetting about it are gone.

What are Turbo Mode and Hyper-Threading, you ask? Hyper-Threading, as far as I can tell (please leave feedback if this is incorrect!), is the same old Hyper-Threading we knew and loved from past chipsets. Turbo mode is interesting though. Think of it as “Burst Mode” for processors. If your OS supports it, the CPU will increase the clock speed as long as you are within the thermal/power thresholds for the chip. The ability to go into Turbo mode depends on the number of active cores. If you are using most of the cores, the chip will be less likely to go into Turbo mode.

UPDATE: Keith from Intel has provided a great explanation of Turbo mode from a hardware perspective in the comments section. I wanted to include it here as a direct quote. Thanks Keith!

Turbo mode is mostly independent of OS support. On CPUs that support Turbo, it is implemented as the P0 p-state in the CPU. It looks & smells like a CPU that is simply running in the highest-frequency P-state. The PCU (power control unit) in Nehalem handles the rest.
