Solaris MPxIO: LUN distribution

Lately my work has had me learning a lot about Solaris storage configuration. I have a “test” environment with 13 HP ProLiant servers running Solaris x86 and an HP EVA8400 storage array.

The EVA8400 is advertised as active-active (AA) storage: a LUN can be accessed via both controllers. But if you read the fine print you will see that it is actually asymmetric active-active (AAA) storage, which is a cheaper class of AA storage. While every LUN can still be accessed via both controllers, each LUN still has an owning controller. If an I/O request arrives at the non-owning controller, it is transferred to the owner and only then performed. This of course has an impact on performance. To avoid it, the EVA monitors the requests, and if more than 60% of the requests for a LUN come through the non-owning controller, ownership is changed.
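To make the ownership rule concrete, here is a minimal Python sketch of the decision as described above. It is only an illustration and not HP's actual firmware logic: the function name, the counters, and the idea of evaluating the ratio over a single sample window are my own assumptions.

```python
def should_transfer_ownership(owner_requests: int,
                              proxy_requests: int,
                              threshold_percent: int = 60) -> bool:
    """Return True when more than threshold_percent of the requests
    for a LUN arrived through the non-owning controller.

    owner_requests: requests received directly by the owning controller
    proxy_requests: requests received by the non-owning controller
                    (hypothetical counters for one sample window)
    """
    total = owner_requests + proxy_requests
    if total == 0:
        return False  # no traffic, nothing to decide
    # Integer comparison avoids floating-point edge cases at exactly 60%.
    return proxy_requests * 100 > threshold_percent * total


print(should_transfer_ownership(50, 50))  # half via the non-owner: stays put
print(should_transfer_ownership(30, 70))  # 70% via the non-owner: transfer
```

Under this rule a host that sends all I/O for a LUN through one controller will eventually pull ownership of that LUN to that controller.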

The preferred access method for AAA storage is asymmetric logical unit access (ALUA), a method that lets the target (the storage system) assign different access characteristics to different paths based on the owning controller (this is known as target port group support, TPGS). For example, if Controller_A owns LUN_1, all paths for LUN_1 to Controller_A will be marked as Active/Optimized, while all paths to Controller_B will be marked as Active/NonOptimized.
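The Controller_A / LUN_1 example can be sketched in a few lines of Python. This is a toy model of the idea only, not how any array or MPxIO implements TPGS, and the port names (A1, A2, B1, B2) are made up for illustration.

```python
from enum import Enum


class AccessState(Enum):
    ACTIVE_OPTIMIZED = "Active/Optimized"
    ACTIVE_NON_OPTIMIZED = "Active/NonOptimized"


def path_states(lun_owner: str, target_ports: dict) -> dict:
    """Map each target port to the access state it would report for a LUN.

    lun_owner:    name of the controller that owns the LUN
    target_ports: {port name: controller name} (hypothetical layout)
    """
    return {
        port: (AccessState.ACTIVE_OPTIMIZED if controller == lun_owner
               else AccessState.ACTIVE_NON_OPTIMIZED)
        for port, controller in target_ports.items()
    }


# Two ports per controller, purely illustrative.
ports = {"A1": "Controller_A", "A2": "Controller_A",
         "B1": "Controller_B", "B2": "Controller_B"}

for port, state in path_states("Controller_A", ports).items():
    print(port, state.value)
```

All ports of one controller form one target port group, so they always share the same state for a given LUN.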

Knowing all this, it is clear that the ideal setup for AAA storage is one where half of the LUNs are owned by one controller and the other half by the second controller. This balances the load across both controllers and gets the maximum performance out of the storage.

Now, on the client side some systems have the Veritas suite installed, but a couple of servers use Solaris native MPxIO as the multipathing solution. According to its specification MPxIO is ALUA aware, and it comes with a nice feature that balances traffic across all Active paths. The EVA8400 controllers have 4 x 4Gbps Fibre Channel ports each, while my servers have 2 x 8Gbps HBAs. Spreading the traffic across all 4 target ports would help me get the performance I need.

Everything sounds perfect in theory, but I’m having trouble making this work in practice. For some reason MPxIO does all the I/O over Controller A and, since path priority cannot be set manually with MPxIO, after a while all LUNs are moved to this controller, leaving the second controller idle. Since TPGS implementations are vendor specific, there seems to be some incompatibility between the HP and Sun implementations. Or maybe there are some hidden undocumented options that have to be set; so far I have had no luck finding them. :-/

During the tests, to see how the LUNs were spread, I wrote a script that counts Active paths per target port. Maybe someone else will find it useful. Download can be found here. And here is how it looks in action:

root@xdb1-ora:~# get_san_paths.py
Initiator ports found:    2
Target ports found:       8
LUNs found:              87
Path \ LUNs:
  50001fe150229f78: 77 LUNs
  50001fe150229f79: 77 LUNs
  50001fe150229f7c: 10 LUNs
  50001fe150229f7d: 10 LUNs
  50001fe150229f7a: 77 LUNs
  50001fe150229f7b: 77 LUNs
  50001fe150229f7e: 10 LUNs
  50001fe150229f7f: 10 LUNs
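
For readers who just want the idea behind the count, here is a minimal Python sketch of the per-port tally. It is not the script itself: it assumes the path information has already been collected into (LUN, target port, access state) records, and the record layout, state strings, and sample names below are invented for illustration.

```python
def count_optimized_luns(paths):
    """Count distinct LUNs with an optimized path on each target port.

    paths: iterable of (lun, target_port, access_state) tuples
           (hypothetical record layout, one tuple per discovered path)
    """
    luns_per_port = {}
    for lun, port, state in paths:
        luns = luns_per_port.setdefault(port, set())
        if state == "active optimized":
            # A set is used because the same LUN is seen once per
            # initiator port and must not be counted twice.
            luns.add(lun)
    return {port: len(luns) for port, luns in luns_per_port.items()}


# Made-up sample records, not real output from my servers.
sample_paths = [
    ("lun_1", "port_a1", "active optimized"),
    ("lun_1", "port_a1", "active optimized"),      # same path via second HBA
    ("lun_1", "port_b1", "active non-optimized"),
    ("lun_2", "port_a1", "active optimized"),
    ("lun_2", "port_b1", "active non-optimized"),
    ("lun_3", "port_a1", "active non-optimized"),
    ("lun_3", "port_b1", "active optimized"),
]

for port, count in count_optimized_luns(sample_paths).items():
    print(f"  {port}: {count} LUNs")
```

An uneven result across the ports of the two controllers, like the 77 versus 10 split above, is the sign that LUN ownership has drifted to one side.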