Splunk on VMware and EMC ScaleIO – A quick index performance test

The last few weeks I’ve been getting acquainted with Splunk, a powerful tool for searching, analysing and visualising logs, infrastructure events, live application performance and any other type of machine-generated data. I read the performance blog post that Splunk had previously done on physical bare-metal hardware and Amazon EC2 instances, and wanted to see what I could get in a virtual environment on top of EMC’s scale-out block storage ScaleIO (which I’ve written several posts on here).

Generally speaking, virtualising Splunk has been frowned upon, as Splunk consumes a lot of resources, and its appetite grows as you add more data ingestion and more searches. Physical bare-metal servers have been the de facto standard for Splunk servers for years, but I still wanted to see what we could do with virtual instances of it. Here’s the setup:

- 4 Splunk 6.0 servers, configured in a VMware environment with 12 vCPUs and 12 GB RAM each, as recommended in the Splunk Enterprise installation guide.
- Each Splunk server has a ScaleIO volume attached to it for the entire /opt/splunk directory, containing the Splunk installation and all log and index files.
- These ScaleIO volumes run on top of EMC’s XtremSF PCIe flash cards (a rough sketch of the volume setup follows below).
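
For reference, here’s roughly how one of those volumes can be set up. This is a sketch, not the exact commands from my environment: the protection domain, storage pool, volume name, size and SDC IP are all placeholders, and the exact scli flags can vary between ScaleIO versions:

# Create and map a volume from the ScaleIO cluster (names/sizes are placeholders)
scli --add_volume --protection_domain_name pd1 --storage_pool_name pool1 --size_gb 300 --volume_name splunk-idx1
scli --map_volume_to_sdc --volume_name splunk-idx1 --sdc_ip 10.0.0.11

# On the Splunk VM the volume shows up as a /dev/scini* device;
# plain ext4 without any tuning, mounted where Splunk will be installed
mkfs.ext4 /dev/scinia
mkdir -p /opt/splunk
mount /dev/scinia /opt/splunk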

For the tests I used SplunkIt, the standard tool for performance testing of Splunk. It generates a large log file, which Splunk then indexes while the tool measures the indexing performance.

To configure SplunkIt like I did, edit the file called “pyro.properties” like this:

### SPLUNKIT PROPERTIES ###
# SPLUNK_HOME, the absolute path to the Splunk installation on this machine,
# e.g: on Linux: /home/user/splunk, usually ending in "/splunk"
# e.g. on Windows: C:\Program Files\Splunk

SPLUNK_HOME = /opt/splunk

# Host or IP of the server machine (this machine), as seen by the SplunkIt search user
# Server test process will bind to this address
# User's server_host (defined in splunkit-user/pyro.properties) must match this for proper test operation
# If left blank, will default to this machine's hostname

server_host = 127.0.0.1

# Admin-level login credentials of the Splunk instance
username = admin
password = yourpasswordhere

static_filesize_gb = 150

Then, create the log file by running the following command in the splunkit-server directory:

python bin/gendata.py
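
Generating a 150 GB file (per static_filesize_gb above) takes quite a while, so you may want to run it detached from your terminal so an SSH disconnect doesn’t kill it. Nothing SplunkIt-specific here, just standard shell:

# Run the data generation in the background and follow its progress
nohup python bin/gendata.py > gendata.log 2>&1 &
tail -f gendata.log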

When the data has been generated, start the index test by running this command in the same directory:

python bin/indextest.py

Now log in to your Splunk instance, go to the Splunk-on-Splunk tab, and you should see something like this:

[Screenshot: estimated indexing rate graph from the Splunk-on-Splunk app]

That graph shows you the current estimated indexing rate, which is always interesting (this one shows close to 30,000 KB/sec). But if you want to compare your indexing performance to other benchmarks, you can click the “View results” link to get to another search, and enter the following search term:

index=_internal host="localhost.localdomain" source="*metrics.log" eps="*" group=per_index_thruput series=splunkit_idxtest

This gives you a view of your current eps (events per second), which you can then compare to other benchmarks, like the ones I mentioned at the beginning of this post.
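
If you’d rather get a single average figure directly, you can wrap the same search in a stats command. Here’s a sketch using the Splunk CLI, assuming a default local install and the admin credentials you set in pyro.properties:

/opt/splunk/bin/splunk search 'index=_internal source="*metrics.log" eps="*" group=per_index_thruput series=splunkit_idxtest | stats avg(eps) AS average_eps' -auth admin:yourpasswordhere

That average is the kind of number shown per server in the table below.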

So what eps values did I get out of my virtualised Splunk Enterprise environment? Pretty good ones, I must say. And note that this is on ScaleIO, a shared scale-out block storage, not individual independent local drives in each server. Also, it’s one volume per server, not a striped volume across multiple virtual drives. So no LVM or anything like that, and regular ext4 filesystems without any tuning. Your basic server setup, so to speak :)

System         Splunk Version   Virtual Hardware       Average EPS
Splunk-Index1  6.0              12 vCPUs, 12 GB RAM    86,931 eps
Splunk-Index2  6.0              12 vCPUs, 12 GB RAM    90,242 eps
Splunk-Index3  6.0              12 vCPUs, 12 GB RAM    87,199 eps
Splunk-Index4  6.0              12 vCPUs, 12 GB RAM    92,792 eps

So as you can see, we’re surpassing the performance numbers of the tests mentioned before, which is great! However, it will be even more interesting when we continue with massive log input and then add searches on top, to see whether we can maintain performance. And according to the performance numbers we get from the ScaleIO environment (see below), we’re nowhere near saturating the disks right now, which hopefully means we can add the searches without a heavy impact on the indexing performance.
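
A quick way to sanity-check disk headroom on the VMs themselves, outside of the ScaleIO dashboard, is to watch the ScaleIO device with iostat while the index test runs (using /dev/scinia from the setup sketch above; if %util stays well below 100, there’s room left for search load):

# Extended device statistics every 5 seconds for the ScaleIO device
iostat -xd 5 /dev/scinia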

[Screenshot: ScaleIO performance numbers during the index test]

Stay tuned for the next installment of these posts :)


About Jonas Rosland

Solutions Architect in the Office of the CTO at EMC

7 Responses to Splunk on VMware and EMC ScaleIO – A quick index performance test

  1. Rob Steele says:

    Great test Jonas, and even better results! I’m about to deploy Splunk and ScaleIO myself, this has been a big help.

  2. virtualprouk says:

    Great test my friend, and going by the results in the other blog post you are beating dedicated hardware with RAID 0 and RAID 1+0 by a clear 10,000 events per second. Were there other results around average search as per the Splunk performance post? Interested to know: was it 4 VMs across 4 physical hosts, or was there some consolidation in there (4 VMs on 2 hosts, 4 VMs on 1 host)?

    • Jonas Rosland says:

      It was one Splunk VM per physical host, so you could divide your Splunk environment up like this and still run other applications next to them.

  3. matthiasby says:

    Great work! Interesting insights.

  4. matthiasby says:

    well done – nice test

  5. Ian A. Thompson says:

    Reblogged this on Ian Thompson's Technology Blog and commented:
    Great way to do testing on Splunk. Glad to see EPS numbers on ScaleIO in the 80-90K region. Splunk doesn’t like to talk EPS, mostly because horizontal scaling will let you achieve EPS^n factor. Unfortunately, so many users are educated by appliance vendors that push hardware based on EPS alone. Ends up being a bigger issue than it really is. Numbers like this provide great ammunition when talking to people who don’t yet fully understand how Splunk works.

  6. Pingback: Cody Hosterman | Symmetrix VMAX and Splunk introduction
