Now that you have your ScaleIO environment up and running after following the posts above, of course you want to see what kind of performance you will get out of it. There are many ways to go about doing this, and I’ll show you one method I’ve been using and also some of the results of said tests.
ScaleIO is handled a bit differently in a VMware environment, where it’s using an iSCSI connection instead of the native ScaleIO protocols. Because of the extra layers that are used (underlying storage (HDD/SSD/PCIe)->VMFS->VMDK->ScaleIO->iSCSI->VMFS->VMDK->Guest FS) you will probably see different performance numbers in a VMware environment compared to a physical one.
However, the method I’ve been using can be used for both physical and virtual ScaleIO environments, so read on.
To gain the best performance out of your ScaleIO environment, a few settings on the ScaleIO VMs need to be set first.
First of all, jumbo frames. Enable them using the following command:
ifconfig eth0 mtu 9000
It’s also recommended to increase the txqueuelen value from 1000 to 10000, like this:
ifconfig eth0 txqueuelen 10000
These are reset during reboot though, so please add them into your network configuration files if you want to continue using them after a reboot.
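Jumbo frames only help if every hop between the nodes accepts them, so it's worth a quick check after changing the MTU. Below is a minimal sketch, not something from the ScaleIO documentation. It builds a ping command that forbids fragmentation and uses the largest payload that fits in one frame (MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header). The function names and the 192.0.2.10 address are placeholders, and it assumes the Linux iputils version of ping.

```shell
#!/bin/sh
# Largest ICMP payload that fits in a single frame for a given MTU:
# MTU minus 20 bytes (IPv4 header) minus 8 bytes (ICMP header).
jumbo_ping_payload() {
    echo $(( $1 - 28 ))
}

# Print (not run) the check command for a peer node.
# -M do forbids fragmentation, so the ping fails if any hop has a smaller MTU.
jumbo_ping_command() {
    peer="$1"
    mtu="${2:-9000}"
    echo "ping -M do -c 3 -s $(jumbo_ping_payload "$mtu") $peer"
}

# 192.0.2.10 is a placeholder; use the address of another ScaleIO node.
jumbo_ping_command 192.0.2.10 9000
```

Run the printed command from one node against another. If it reports that the message is too long, or gets no replies, something in the path is still at a smaller MTU.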
Here’s a protip! If you want to run the same command on all your ScaleIO nodes, there’s a tool called admincli.py on the MDM nodes that you can use, like this:
/opt/scaleio/ecs/mdm/diag/admincli.py --command "ifconfig eth0 mtu 9000"
When I created my ScaleIO VMs, I created VMDKs on top of a VMFS volume on top of the underlying storage. These VMDKs should always be created as Eager Zeroed Thick. I also used a paravirtualized SCSI adapter here. It's not included in the official user guide, but it seems to have increased performance a bit more. You can also try increasing the vCPU count from 2 to 4 for a bit more performance, but of course that eats up more CPU. Don't touch the RAM though; you probably won't ever need more than the 1GB that's allocated.
If you are using SSDs or PCIe Flash as underlying storage, the recommendation for scripted installs is to use the SSD profile. To do that, you run the following command during installation:
./install.py --all --vm --license=YOURLICENSEHERE --profile ssd
However, if you’ve already installed ScaleIO on top of your SSDs and would like to add the correct SSD configuration to your already existing environment, add the following into your /opt/scaleio/ecs/sds/cfg/conf.txt file on each SDS node:
tgt_net__recv_buffer=4096
tgt_net__send_buffer=4096
tgt_cache__size_mult=3
tgt_thread__ini_io=500
tgt_thread__tgt_io_main=500
tgt_umt_num=1200
tgt_umt_os_thrd=6
tgt_net__worker_thread=6
tgt_asyncio_max_req_per_file=400
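If you have several SDS nodes, editing the file by hand on each one gets tedious. Here is a rough sketch of a helper that appends the settings only when they are missing, so it's safe to run twice. It assumes conf.txt takes one key=value pair per line. The function names are my own and not part of ScaleIO. Keys that already exist are left alone, even if the value differs. The demo writes to a scratch file, not the real conf.txt.

```shell
#!/bin/sh
# Append key=value to a config file unless a line for that key already exists.
set_conf_value() {
    file="$1"; key="$2"; value="$3"
    if grep -q "^${key}=" "$file" 2>/dev/null; then
        echo "skip ${key} (already present)"
    else
        echo "${key}=${value}" >> "$file"
        echo "add  ${key}=${value}"
    fi
}

# The nine SSD settings listed above, applied to the file given as $1.
apply_ssd_profile() {
    conf="$1"
    set_conf_value "$conf" tgt_net__recv_buffer 4096
    set_conf_value "$conf" tgt_net__send_buffer 4096
    set_conf_value "$conf" tgt_cache__size_mult 3
    set_conf_value "$conf" tgt_thread__ini_io 500
    set_conf_value "$conf" tgt_thread__tgt_io_main 500
    set_conf_value "$conf" tgt_umt_num 1200
    set_conf_value "$conf" tgt_umt_os_thrd 6
    set_conf_value "$conf" tgt_net__worker_thread 6
    set_conf_value "$conf" tgt_asyncio_max_req_per_file 400
}

# Demo against a scratch file; the second run should only print "skip" lines.
demo_conf="$(mktemp)"
apply_ssd_profile "$demo_conf"
apply_ssd_profile "$demo_conf"
rm -f "$demo_conf"
```

On a real node you would point apply_ssd_profile at /opt/scaleio/ecs/sds/cfg/conf.txt, after taking a backup copy of the file.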
Then restart your SDS by issuing the following command on each node:
The test bed I have consists of 4 ScaleIO VMs, each using an XtremSF Flash PCIe card as backend storage. I’ve created one volume of 2TB, given my 4 ESXi hosts access to it, formatted it with VMFS5 and created 4 Ubuntu VMs, each with one extra drive located on top of that ScaleIO volume. That second VMDK is also created using Eager Zeroed Thick. It looks something like this:
In each Ubuntu VM, I’ve installed the load generating tool “fio”, which gives easy access to set things like block size, percent of read/write, if it should be random or not, etc. I’ve attached an example fio configuration file here:
[4k_random_read_90]
# overwrite if true will create file if it doesn't exist
# if file exists and is large enough nothing happens
# here it is set to false because file should exist
#rw=
# read       Sequential reads
# write      Sequential writes
# randwrite  Random writes
# randread   Random reads
# rw         Sequential mixed reads and writes
# randrw     Random mixed reads and writes
rw=randrw
# ioengine=
# sync       Basic read(2) or write(2) io. lseek(2) is
#            used to position the io location.
# psync      Basic pread(2) or pwrite(2) io.
# vsync      Basic readv(2) or writev(2) IO.
# libaio     Linux native asynchronous io.
# posixaio   glibc posix asynchronous io.
# solarisaio Solaris native asynchronous io.
# windowsaio Windows native asynchronous io.
ioengine=libaio
# direct If value is true, use non-buffered io. This is usually
# O_DIRECT. Note that ZFS on Solaris doesn't support direct io.
direct=1
# bs The block size used for the io units. Defaults to 4k.
bs=4k
# nrfiles= Number of files to use for this job. Defaults to 1.
#filename - Set the device special file you need
filename=/dev/sdb
size=200g
iodepth=64
numjobs=4
rwmixread=90
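Once the 4k job works, you will probably want to try other block sizes and read/write ratios too. Below is a small sketch that stamps out job files with the same settings as the one above, changing only the block size and read percentage. The function name make_fio_job and the scratch output directory are my own choices for illustration. Everything else (the /dev/sdb target, 200g size, iodepth and numjobs) is copied from the job file above, so adjust it to your setup.

```shell
#!/bin/sh
# Write a fio job file for a given block size and read percentage.
# $1 = block size (e.g. 4k), $2 = read percentage (0-100), $3 = output dir.
# Prints the path of the file it wrote.
make_fio_job() {
    blocksize="$1"; readpct="$2"; outdir="${3:-.}"
    jobname="${blocksize}_random_read_${readpct}"
    cat > "${outdir}/fio_${jobname}" <<EOF
[${jobname}]
rw=randrw
ioengine=libaio
direct=1
bs=${blocksize}
filename=/dev/sdb
size=200g
iodepth=64
numjobs=4
rwmixread=${readpct}
EOF
    echo "${outdir}/fio_${jobname}"
}

# Demo: three block sizes at a 90/10 read/write mix, written to a scratch dir.
jobdir="$(mktemp -d)"
for b in 4k 8k 64k; do
    make_fio_job "$b" 90 "$jobdir"
done
```

Keep in mind that these jobs write straight to /dev/sdb, so only point them at a drive you don't mind overwriting.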
Paste the content above into a file and name it “fio_4k_random_read_90” so you’ll know it’s for 4KB blocks, random read/write with a R/W ratio of 90/10. Then run it like this:
When running the fio workload from one Ubuntu VM, you will see some performance numbers immediately, and probably really good ones at that. What’s really cool though is that when you run more than one fio workload, you’ll most probably see even more performance coming out of those HDDs/SSDs/PCIe cards that you have. So start up your engines!
When measuring performance, it’s easy to get lost in all the numbers flying by when using fio, so I suggest using the included ScaleIO dashboard. You can find the dashboard on the ScaleIO VM itself, under /opt/scaleio/ecs/mdm/bin/dashboard.jar. Just copy that to your own workstation and run it from there. When started, point it to your ScaleIO cluster IP, password not needed:
When connected, you’ll see something similar to this:
Yup, that’s one hundred thousand IOPS being handled by 4 ScaleIO VMs! Pretty crazy considering many other storage solutions would love to have numbers like this, and here we are with just 4 virtual machines and a few flash cards. I can definitely see a future for this product. What do you think?
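To put that IOPS number into perspective, you can turn it into throughput and a read/write split with some quick arithmetic. The sketch below assumes the 100,000 IOPS came from the 4KB, 90/10 job shown earlier, which is my assumption for the example and not something the dashboard tells you. The helper names are made up, and the math is integer-only, so results are rounded down.

```shell
#!/bin/sh
# Throughput in MiB/s for a given IOPS figure and block size in KiB.
iops_to_mibps() {
    echo $(( $1 * $2 / 1024 ))
}

# Read share of a mixed workload: total IOPS times read percentage.
read_iops() {
    echo $(( $1 * $2 / 100 ))
}

# Assumed inputs: 100,000 IOPS, 4 KiB blocks, 90% reads.
iops=100000
bs_kib=4
readpct=90

reads="$(read_iops "$iops" "$readpct")"
echo "throughput: $(iops_to_mibps "$iops" "$bs_kib") MiB/s"
echo "reads:      ${reads} IOPS"
echo "writes:     $(( iops - reads )) IOPS"
```

With those inputs it works out to roughly 390 MiB/s in total, split into 90,000 read and 10,000 write IOPS. Swap in your own numbers from the dashboard to compare setups.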
Please comment below with your setup and your results!