Increasing the queue depth?
Hi, I have moved this blog to the new site frankdenneman.nl.
Please visit the new home of this article: http://frankdenneman.nl/2009/03/increasing-the-queue-depth/.
Apologies for the inconvenience!
Hi Frank
Could you explain this formula a bit more?
The execution throttle can be set to 136.5 => T = 2048 = (1 * Q * 15), so Q = 136.5
Where does the 136.5 come from? I understand this would be the QLogic execution throttle value, but I don't get what the value means.
thanks
Hi Paul,
I changed the text; I forgot to add the queue depth to that sentence. So read "queue depth" where it said "execution throttle".
The target port can accept 2048 outstanding IOs in its queue before shutting down communications. The ESX host in the example has 15 LUNs presented over that storage controller, and IO is issued over one active path per LUN.
Now you can calculate a queue depth that will give you the best performance without flooding the queue.
If you set the queue depth to 136 (2048 / 15, rounded down), the ESX host is able to issue 136 * 15 = 2040 IOs.
But (hopefully) you do not have just one ESX host, so you will need to divide the target port queue depth by the number of hosts as well.
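The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the division described in this reply: the function name and parameters are made up for the example, the two-host case is hypothetical, and it assumes every host sees the same LUNs behind the same target port.

```python
def max_queue_depth(target_port_queue, luns, hosts=1, active_paths=1):
    """Largest per-LUN queue depth Q such that
    hosts * active_paths * Q * luns does not exceed the target port queue."""
    if min(target_port_queue, luns, hosts, active_paths) < 1:
        raise ValueError("all inputs must be positive integers")
    # Integer division rounds down, so the target port queue is never exceeded.
    return target_port_queue // (hosts * active_paths * luns)


# Example from the reply: 2048 outstanding IOs, 15 LUNs, one host, one active path.
single_host = max_queue_depth(2048, luns=15)
print(single_host, single_host * 15)

# Hypothetical second host sharing the same target port and LUNs.
two_hosts = max_queue_depth(2048, luns=15, hosts=2)
print(two_hosts, two_hosts * 15 * 2)
```

The value for the target port queue has to come from the SAN vendor; 2048 is just the number used in the example.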
Does the rest of the post make any sense?
Hi Frank, yes, it makes perfect sense now. It was just what I was looking for, cheers!
Paul.
On my blog someone suggested to keep the “DSNRO” lower than the QD. This way it won’t be possible for just one VM to fill up the entire queue for a host.
Great article again by the way, keep them coming!
Advice please:
I am an ESX admin, not a storage admin.
The institution where I work is using 1 LUN per VM (sometimes more than 1, e.g. a LUN per vmdk).
Their environment currently consists of 3 hosts running 86 VMs connected to 100 LUNs. Most LUNs have 2 paths. The SAN is active/passive.
According to your formula this would equal 2048/100/3*2=13.65.
What settings should I be using for QD and DSNRO? And should I be telling the storage admins to stop assigning 1 LUN per VM? What are the implications of continuing down this path?
One VM per LUN is a really nice situation.
The VM has its own path and does not need to compete with other VMs sharing the LUN.
If more than one VM issues IO to the LUN, the VMkernel controls the IO with a "fairness" scheme: if two VMs issue IOs to the LUN, the VMkernel decides which VM gets to issue its IO. This depends on sector proximity and the number of consecutive sequential IOs allowed from one VM. You just gave me an idea for a new article.
Did you ever calculate the storage utilization rate with your VM-to-LUN ratio?
How much free space do you have on every LUN?
In ESX you will always communicate over one active path.
The DSNRO setting does not apply in your situation, because you are hosting one VM per LUN.
The QD applies.
Please contact your SAN vendor to ask for the target port queue depth before setting the QD.
How many storage controllers does your SAN have?
“When the queue depth is reached, (QFULL) the storage controller issues an IO throttling command to the host…”
How do you see that at the host level?
Log-wise, what should I grep for?
Thx,
Good question!
To my knowledge the QFULL condition is handled by the QLogic driver itself; the QFULL status is not returned to the OS. But some storage devices return BUSY rather than QFULL, and BUSY errors are logged in /var/log/vmkernel.
But not every BUSY error is a QFULL error!
A KB article exists with SCSI sense codes: KB here
If you have suspicious error codes in the logs, you can check them at vmproffesional.com SCSI Error Decoder
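As a rough starting point for the "what should I grep for" question, here is a small Python sketch that pulls lines mentioning BUSY out of the vmkernel log. It is an assumption-heavy illustration: the function names are invented, the exact wording of the log messages is not given in this thread, and a plain substring match will also catch BUSY lines that have nothing to do with a full queue.

```python
def find_busy_lines(lines, needle="BUSY"):
    """Return the lines that contain the given marker (plain substring match)."""
    return [line.rstrip("\n") for line in lines if needle in line]


def scan_log(path="/var/log/vmkernel", needle="BUSY"):
    """Read a log file and return the lines that mention the marker.

    The default path is the one named in the reply above; the message
    format itself is not assumed, only the presence of the word.
    """
    with open(path, errors="replace") as log:
        return find_busy_lines(log, needle)
```

Calling scan_log() on the host returns the candidate lines, which can then be checked against the SCSI sense code references mentioned above.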
Hi Frank,
Great post!
Your formula implies a 100% virtual situation.
What about those physical guests connected to the same storage controllers?
When determining your QD settings you have to take them into account also.
-Arnim
Thx for the compliment,
Yes, the formula implies a 100% virtual situation, just to keep it “simple”.
When other systems connect to the storage arrays, you will need to take them into account as well.
Microsoft published two excellent documents; both are must-reads for Windows admins:
Disk Subsystem Performance Analysis for Windows
and
Performance Tuning Guidelines for Windows Server 2003
In the Disk Subsystem document, the NumberOfRequests setting is explained:
NumberOfRequests
Both SCSIport and Storport miniport drivers can use a registry parameter to designate how much concurrency is allowed on a device by device basis.
The default is 16, which is much too small for a storage subsystem of any decent size unless quite a number of physical disks are being presented to the operating system by the controller.
Up until today I have worked on virtual infrastructures that contained only Windows systems, so I do not know the default Linux and Solaris queue depths yet.
But I'm working on a new virtual infrastructure for a new client of mine, which will host Windows, Linux and Solaris VMs. So I will post the settings soon.
As you write, you use the same value for QueueDepth, DSNRO and ExecutionThrottle. And you say that the max ExecutionThrottle is 64.
So you never set QueueDepth higher than 64?
Please tell me how to configure the adapter queue depth value (AQLEN) in ESX 4.
Hi Aru,
Changing the queue depth in ESX 4 is quite similar to changing the queue depth in ESX 3.5. For the example I used the popular QLogic qla2300_707 driver.
1. Open a connection to the service console.
2. Issue the command: vicfg-module -s ql2xmaxqdepth=64 qla2300_707
3. Reboot the system.
Here ql2xmaxqdepth=64 sets the queue depth to 64.
The vSphere SAN guide also lists how to change the QD of an Emulex HBA; it's on pages 70-71: http://www.vmware.com/pdf/vsphere4/r40/vsp_40_san_cfg.pdf
Good luck!