linux – MegaRAID: performance difference between volumes

I have a MegaRAID SAS 9361-8i with 2x 240 GB SATA 6 Gbps SSDs in RAID1, 4x 10 TB SAS 12 Gbps HDDs in RAID6, and 4x 480 GB SATA 6 Gbps SSDs in RAID5:

-----------------------------------------------------------------------------
DG Arr Row EID:Slot DID Type  State BT       Size PDC  PI SED DS3  FSpace TR 
-----------------------------------------------------------------------------
 0 -   -   -        -   RAID1 Optl  N  223.062 GB dflt N  N   dflt N      N  
 0 0   -   -        -   RAID1 Optl  N  223.062 GB dflt N  N   dflt N      N  
 0 0   0   8:2      13  DRIVE Onln  N  223.062 GB dflt N  N   dflt -      N  
 0 0   1   8:5      16  DRIVE Onln  N  223.062 GB dflt N  N   dflt -      N  
 1 -   -   -        -   RAID6 Optl  N   18.190 TB enbl N  N   dflt N      N  
 1 0   -   -        -   RAID6 Optl  N   18.190 TB enbl N  N   dflt N      N  
 1 0   0   8:0      9   DRIVE Onln  N    9.094 TB enbl N  N   dflt -      N  
 1 0   1   8:1      11  DRIVE Onln  N    9.094 TB enbl N  N   dflt -      N  
 1 0   2   8:3      10  DRIVE Onln  N    9.094 TB enbl N  N   dflt -      N  
 1 0   3   8:4      12  DRIVE Onln  N    9.094 TB enbl N  N   dflt -      N  
 2 -   -   -        -   RAID5 Optl  N    1.307 TB dflt N  N   dflt N      N  
 2 0   -   -        -   RAID5 Optl  N    1.307 TB dflt N  N   dflt N      N  
 2 0   0   8:6      14  DRIVE Onln  N  446.625 GB dflt N  N   dflt -      N  
 2 0   1   8:7      17  DRIVE Onln  N  446.625 GB dflt N  N   dflt -      N  
 2 0   2   8:9      15  DRIVE Onln  N  446.625 GB dflt N  N   dflt -      N  
 2 0   3   8:10     18  DRIVE Onln  N  446.625 GB dflt N  N   dflt -      N  
-----------------------------------------------------------------------------

---------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache Cac sCC       Size Name 
---------------------------------------------------------------
0/0   RAID1 Optl  RW     Yes     NRWBD -   ON  223.062 GB VD0  
1/1   RAID6 Optl  RW     Yes     RWBD  -   ON   18.190 TB VD1  
2/2   RAID5 Optl  RW     Yes     NRWBD -   ON    1.307 TB VD2  
---------------------------------------------------------------

---------------------------------------------------------------------------------------
EID:Slt DID State DG       Size Intf Med SED PI SeSz Model                     Sp Type 
---------------------------------------------------------------------------------------
8:0       9 Onln   1   9.094 TB SAS  HDD N   N  512B HUH721010AL5200           U  -    
8:1      11 Onln   1   9.094 TB SAS  HDD N   N  512B HUH721010AL5200           U  -    
8:2      13 Onln   0 223.062 GB SATA SSD N   N  512B Micron_5100_MTFDDAK240TCC U  -    
8:3      10 Onln   1   9.094 TB SAS  HDD N   N  512B HUH721010AL5200           U  -    
8:4      12 Onln   1   9.094 TB SAS  HDD N   N  512B HUH721010AL5200           U  -    
8:5      16 Onln   0 223.062 GB SATA SSD N   N  512B Micron_5100_MTFDDAK240TCC U  -    
8:6      14 Onln   2 446.625 GB SATA SSD N   N  512B Micron_5100_MTFDDAK480TCC U  -    
8:7      17 Onln   2 446.625 GB SATA SSD N   N  512B Micron_5100_MTFDDAK480TCC U  -    
8:9      15 Onln   2 446.625 GB SATA SSD N   N  512B Micron_5100_MTFDDAK480TCC U  -    
8:10     18 Onln   2 446.625 GB SATA SSD N   N  512B Micron_5100_MTFDDAK480TCC U  -    
---------------------------------------------------------------------------------------
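For reference, the three listings above are the TOPOLOGY, VD LIST and PD LIST sections of the controller summary; assuming the adapter is controller 0, they can be reproduced with:

```shell
# Full summary of controller 0: topology, virtual drives, physical drives
storcli /c0 show
```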

Testing sequential write speed on these VDs:

# lvcreate -ntest1 -L32G vg /dev/sda
# lvcreate -ntest2 -L32G vg /dev/sdb
# lvcreate -ntest3 -L32G vg /dev/sdc
# for i in 1 2 3; do sleep 10; dd if=/dev/zero of=/dev/vg/test$i bs=128M count=256 oflag=direct; done
34359738368 bytes (34 GB, 32 GiB) copied, 120.433 s, 285 MB/s  (test1/VD 0)
34359738368 bytes (34 GB, 32 GiB) copied, 141.989 s, 242 MB/s  (test2/VD 1)
34359738368 bytes (34 GB, 32 GiB) copied, 26.4339 s, 1.3 GB/s  (test3/VD 2)

And read speed:

# for i in 1 2 3; do sleep 10; dd if=/dev/vg/test$i of=/dev/zero bs=128M count=256 iflag=direct; done
34359738368 bytes (34 GB, 32 GiB) copied, 35.7277 s, 962 MB/s  (test1/VD 0)
34359738368 bytes (34 GB, 32 GiB) copied, 147.361 s, 233 MB/s  (test2/VD 1)
34359738368 bytes (34 GB, 32 GiB) copied, 16.7518 s, 2.1 GB/s  (test3/VD 2)

Running dd in parallel:

# sleep 10; for i in 1 2 3; do dd if=/dev/zero of=/dev/vg/test$i bs=128M count=256 oflag=direct & done
34359738368 bytes (34 GB, 32 GiB) copied, 28.1198 s, 1.2 GB/s  (test3/VD 2)
34359738368 bytes (34 GB, 32 GiB) copied, 115.826 s, 297 MB/s  (test1/VD 0)
34359738368 bytes (34 GB, 32 GiB) copied, 143.737 s, 239 MB/s  (test2/VD 1)

# sleep 10; for i in 1 2 3; do dd if=/dev/vg/test$i of=/dev/zero bs=128M count=256 iflag=direct & done
34359738368 bytes (34 GB, 32 GiB) copied, 16.8986 s, 2.0 GB/s  (test3/VD 2)
34359738368 bytes (34 GB, 32 GiB) copied, 35.7328 s, 962 MB/s  (test1/VD 0)
34359738368 bytes (34 GB, 32 GiB) copied, 153.147 s, 224 MB/s  (test2/VD 1)
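For what it's worth, dd with a single outstanding direct I/O is a fairly pessimistic sequential benchmark; a tool such as fio (if available; the device path and job parameters here are only an example) can drive a deeper queue and help separate controller limits from submission latency:

```shell
# Sequential read, 1 MiB blocks, queue depth 32, direct I/O, 30 seconds;
# run once per test LV and compare against the dd numbers above.
fio --name=seqread --filename=/dev/vg/test2 --rw=read --bs=1M \
    --iodepth=32 --ioengine=libaio --direct=1 --runtime=30 --time_based
```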

The values for VD 0 and VD 1 are abysmal. Remarkably, VD 2 showed similarly poor values until I deleted and recreated it, which I cannot do with the other two volumes, as they contain data.

The only limit I can readily explain is the read speed of VD 2, which is roughly three times the usable SATA link speed; that makes sense for a RAID5 with four disks. The read speed of VD 0 is a bit below twice the SATA link speed, which could be either a limitation of the media or non-optimal interleaving of requests across the RAID1 mirrors, but either would still be acceptable.
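As a sanity check on those multiples, a minimal back-of-the-envelope calculation (assuming 8b/10b encoding on the 6 Gbps links, i.e. roughly 600 MB/s of payload per link):

```shell
# Usable payload of a SATA 6 Gbps link after 8b/10b encoding, in MB/s:
link_mbps=$(( 6000 * 8 / 10 / 8 ))          # 600 MB/s per link
echo "per-link payload:  ${link_mbps} MB/s"
# A 4-disk RAID5 stripe holds 3 data strips; with rotated parity,
# sequential reads can draw on all four members:
echo "RAID5 read range:  $(( 3 * link_mbps ))-$(( 4 * link_mbps )) MB/s"
# A 2-disk RAID1 can serve reads from both mirrors:
echo "RAID1 read cap:    $(( 2 * link_mbps )) MB/s"
```

The observed 2.1 GB/s for VD 2 falls inside the 1800-2400 MB/s RAID5 range, and the 962 MB/s for VD 0 sits just under the two-mirror cap.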

The other numbers make no sense to me. The controller is obviously able to move data faster, and the fact that parallel throughput is not significantly different from each volume measured in isolation also suggests that the bottleneck is not a shared data path.

My interpretation of the situation is that creating the volumes from the BIOS instead of from StorCLI somehow gave them a sub-optimal configuration. Comparing the output of storcli /c0/v0 show all and storcli /c0/v2 show all shows no unexplained differences, so my fear is that the error lies somewhere deeper in the stack.
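To make that comparison reproducible, the two dumps can be diffed directly (controller and VD numbers as above; volatile fields such as timestamps may need filtering out):

```shell
# Capture both VDs' full property dumps and compare them line by line
storcli /c0/v0 show all > /tmp/vd0.txt
storcli /c0/v2 show all > /tmp/vd2.txt
diff -u /tmp/vd0.txt /tmp/vd2.txt
```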

  • Is there a known configuration gotcha or bug that would explain this behaviour?
  • Is there a tool to analyze configurations for bottlenecks, or, failing that,
  • Can I somehow export internal configuration values to allow me to compare them between volumes?