Characterizing Workloads in Netapp

by Ankam Ravi Kumar · December 8, 2015

Characterizing Workloads in Netapp

One of the most important things when analyzing Data ONTAP performance is being able to characterizing the workload that you’re seeing. This can be done a couple of ways, but i will outline one here. Characterizing Workloads in Netapp

Note that this is not a primer on analyzing storage bottlenecks – for that, you will have to attend insight this year! – but rather a way to see how much data a system is serving out and what kind of data it is.

When it comes to characterizing workloads on a system, everyone has different approach and that’s totally fine. I tend to start wit a few basic questions.

Where is the data being served from?
how much is being served?
what protocols are doing the work?
what kind of work are the protocols doing?

You can do this with the stats command if you know what counters to look for. If you find a statistic and you don’t know what is it means, we have a good help system at hand. To wit and to use stats explain counters:

#rsh ravi "priv set -q diag; stats explain counters readahead:readahead:incore"

Counters for object name: readahead
Name: incore
Description: Total Number of blocks that were already resident
Properties: delta
Unit: none

Okay, So that’s not the greatest explanation, but it’s a start! More commonly-used counters are going to have better descriptions.
When it comes to analyzing your workload, a good place to start are the read_io_type counters in the

# rsh ravi "priv set -q diag; stats show wafl:wafl:read_io_type"
 wafl:wafl:read_io_type.cache:0%
 wafl:wafl:read_io_type.ext_cache:0%
 wafl:wafl:read_io_type.disk:0%
 wafl:wafl:read_io_type.bamboo_ssd:0%
 wafl:wafl:read_io_type.hya_hdd:0%
 wafl:wafl:read_io_type.hya_cache:0%

Here is an explanation of these counters:

cache are reads served from the Filer’s RAM
ext_cache are reads served from the Filer’s Flash Cache
disk are reads served from any model of spinning disk
bamboo_ssd are reads served from any model of solid-state disk
hya_hdd are reads served from the spinning disks that are part of a Flash Pool
hya_cache are reads served from the solid-state disks that are part of a Flash Pool

Note that these counters are present regardless of your system’s configuration. For example, the Flash Cache counters will be visible even if you don’t have Flash Cache installed. Characterizing Workloads in Netapp

Now that you know where your reads are being served from, you might be curious as to how big the reads are, and if they’re random or sequential. Those statistics are available, but it takes a bit more grepping and cutting to get them. Here’s a little from my volume, sefiles. Note that I need to prefix this capture with priv set:

# rsh ravi "priv set -q diag; stats start -I perfstat_nfs"
# rsh ravi "priv set -q diag; stats stop -I perfstat_nfs"

Okay, so there’s a lot of output there. It’s data for the entire system. That said, it’s a great way to learn — you can see everything the system sees. And if you dump it to a text file, it will save you using rsh over and over again! Let’s use rsh again and drill down to the volume stats, which is the most useful way to investigate a particular volume:

# rsh ravi "priv set -q diag; stats show volume:sefiles"
 volume:sefiles:instance_name:sefiles
 volume:sefiles:instance_uuid:a244800b-5506-11e1-91cc-123478563412
 volume:sefiles:parent_aggr:aggr1
 volume:sefiles:avg_latency:425.77us
 volume:sefiles:total_ops:93/s
 volume:sefiles:read_data:6040224b/s
 volume:sefiles:read_latency:433.44us
 volume:sefiles:read_ops:92/s
 volume:sefiles:write_data:0b/s
 volume:sefiles:write_latency:0us
 volume:sefiles:write_ops:0/s
 volume:sefiles:other_latency:12.17us
 volume:sefiles:other_ops:1/s
 volume:sefiles:nfs_read_data:6040224b/s
 volume:sefiles:nfs_read_latency:433.44us
 volume:sefiles:nfs_read_ops:92/s
 volume:sefiles:nfs_write_data:0b/s
 volume:sefiles:nfs_write_latency:0us
 volume:sefiles:nfs_write_ops:0/s
 volume:sefiles:nfs_other_latency:15.09us
 volume:sefiles:nfs_other_ops:0/s

I’ve removed some of the internal counters for clarity. Now, let’s take a look at a couple of things: the volume name is listed as selfies in the instance_name variable. During the time I was capturing statistics, the average amount of operations per second was 93 (“total_ops“). Read latency (“read_latency“) was 433us, which is 0.433ms. Notice that we did no writes (“write_ops”) and basically no metadata either (“other_ops”).

If you’re interested in seeing what the system is doing in general, I tend to go protocol-by-protocol. If you grep for nfsv3, you can start to wade through things:

# rsh ravi "priv set -q diag; stats show nfsv3:nfs:nfsv3_read_latency_hist"
 nfsv3:nfs:nfsv3_read_latency_hist.0 - <1ms:472
 nfsv3:nfs:nfsv3_read_latency_hist.1 - <2ms:7
 nfsv3:nfs:nfsv3_read_latency_hist.2 - <4ms:10
 nfsv3:nfs:nfsv3_read_latency_hist.4 - <6ms:8
 nfsv3:nfs:nfsv3_read_latency_hist.6 - <8ms:7
 nfsv3:nfs:nfsv3_read_latency_hist.8 - <10ms:0
 nfsv3:nfs:nfsv3_read_latency_hist.10 - <12ms:1

During our statistics capture, approximately 472 requests were served at 1ms or better. 7 requests were served between 2ms and 1ms. 10 requests were served between 4ms and 2ms, and so on. If you keep looking, you’ll see write latency as well:

# rsh ravi "priv set -q diag; stats show nfsv3:nfs:nfsv3_write_latency_hist"
 nfsv3:nfs:nfsv3_write_latency_hist.0 - <1ms:1761
 nfsv3:nfs:nfsv3_write_latency_hist.1 - <2ms:0
 nfsv3:nfs:nfsv3_write_latency_hist.2 - <4ms:2
 nfsv3:nfs:nfsv3_write_latency_hist.4 - <6ms:1
 nfsv3:nfs:nfsv3_write_latency_hist.6 - <8ms:0
 nfsv3:nfs:nfsv3_write_latency_hist.8 - <10ms:0
 nfsv3:nfs:nfsv3_write_latency_hist.10 - <12ms:0

So that’s 1761 writes at 1ms or better. Your writes should always show excellent latencies as writes are acknowledged once they’re committed to NVRAM — the actual writing of the data to disk will occur at the next consistency point.

Keep going and you’ll see latency figures that are concatenated for all types of requests — reads, writes, other (i.e. metadata):

# rsh ravi "priv set -q diag; stats show nfsv3:nfs:nfsv3_latency_hist"
 nfsv3:nfs:nfsv3_latency_hist.0 - <1ms:2124
 nfsv3:nfs:nfsv3_latency_hist.1 - <2ms:5
 nfsv3:nfs:nfsv3_latency_hist.2 - <4ms:16
 nfsv3:nfs:nfsv3_latency_hist.4 - <6ms:18
 nfsv3:nfs:nfsv3_latency_hist.6 - <8ms:13
 nfsv3:nfs:nfsv3_latency_hist.8 - <10ms:6
 nfsv3:nfs:nfsv3_latency_hist.10 - <12ms:1

Once you’ve established latencies, you may be interested to see the I/O sizes of the actual requests. Behold:

# rsh ravi "priv set -q diag; stats show nfsv3:nfs:nfsv3_read_size_histo"
 nfsv3:nfs:nfsv3_read_size_histo.0-511:0
 nfsv3:nfs:nfsv3_read_size_histo.512-1023:0
 nfsv3:nfs:nfsv3_read_size_histo.1K-2047:0
 nfsv3:nfs:nfsv3_read_size_histo.2K-4095:0
 nfsv3:nfs:nfsv3_read_size_histo.4K-8191:328
 nfsv3:nfs:nfsv3_read_size_histo.8K-16383:71

These statistics are represented like the latency were: 0 requests were served that were between 0-511 bytes in size, 0 between 512 & 1023 bytes, and so on; then, 328 requests were served that were between 4KB & 8KB in size and 71 requests between 8KB & 16KB.

When it comes to protocol operations, we break things down by the unique operation functions — including metadata operations. I’ll show the count numbers, but it’ll also show percentages of requests and individual latency for each: Characterizing Workloads in Netapp

# rsh ravi "priv set -q diag; stats show nfsv3:nfs:nfsv3_op_count"
 nfsv3:nfs:nfsv3_op_count.getattr:4
 nfsv3:nfs:nfsv3_op_count.setattr:0
 nfsv3:nfs:nfsv3_op_count.lookup:0
 nfsv3:nfs:nfsv3_op_count.access:1
 nfsv3:nfs:nfsv3_op_count.readlink:0
 nfsv3:nfs:nfsv3_op_count.read:424
 nfsv3:nfs:nfsv3_op_count.write:1767
 nfsv3:nfs:nfsv3_op_count.create:0
 nfsv3:nfs:nfsv3_op_count.mkdir:0
 nfsv3:nfs:nfsv3_op_count.symlink:0
 nfsv3:nfs:nfsv3_op_count.mknod:0
 nfsv3:nfs:nfsv3_op_count.remove:0
 nfsv3:nfs:nfsv3_op_count.rmdir:0
 nfsv3:nfs:nfsv3_op_count.rename:0
 nfsv3:nfs:nfsv3_op_count.link:0
 nfsv3:nfs:nfsv3_op_count.readdir:0
 nfsv3:nfs:nfsv3_op_count.readdirplus:0

In this case, most of my requests were reads & writes; there was very little metadata operation (just 4 attribute lookups). If you want to know what these operations represent, you can read some of my previous blog entries or check out the RFC that defines your protocol.

One last thing that I forgot to add is trying to figure out if your workload is random or sequential. The easiest way that I’ve found to do that is to look in the readahead stats, as the readahead engine is a particularly useful way of determining how your reads are operating. We don’t need to focus on writes as much because any write much larger than 16KB is coalesced in memory and written sequentially — even if it started as random at the client. (I’ll talk about RAVE, the new readahead engine in DOT 8.1+, in later posts.)

I’ve filtered out some of the irrelevant stuff. So let’s gather what we can use and see:

# rsh ravi "priv set -q diag; stats show readahead:readahead:total_read_reqs"
 readahead:readahead:total_read_reqs.4K:629
 readahead:readahead:total_read_reqs.8K:77
 readahead:readahead:total_read_reqs.12K:0
 readahead:readahead:total_read_reqs.16K:4
 readahead:readahead:total_read_reqs.20K:1
 readahead:readahead:total_read_reqs.24K:7
 readahead:readahead:total_read_reqs.28K:8
 readahead:readahead:total_read_reqs.32K:0
 readahead:readahead:total_read_reqs.40K:0
 readahead:readahead:total_read_reqs.48K:0
 readahead:readahead:total_read_reqs.56K:0
 readahead:readahead:total_read_reqs.64K:77662

These statistics work much the same as our previous ones. 629 read requests were serviced that were 4KB blocks, 77 were serviced that were 8KB blocks, etc. The numbers continue right up to 1024KB. (Keep this in mind whenever someone tells you that WAFL & ONTAP can only read or write 4KB at a time!) Now if you want to see sequential:

# rsh ravi "priv set -q diag; stats show readahead:readahead:seq_read_reqs"
 readahead:readahead:seq_read_reqs.4K:36%
 readahead:readahead:seq_read_reqs.8K:46%
 readahead:readahead:seq_read_reqs.12K:0%
 readahead:readahead:seq_read_reqs.16K:0%
 readahead:readahead:seq_read_reqs.20K:0%
 readahead:readahead:seq_read_reqs.24K:0%
 readahead:readahead:seq_read_reqs.28K:0%
 readahead:readahead:seq_read_reqs.32K:0%
 readahead:readahead:seq_read_reqs.40K:0%
 readahead:readahead:seq_read_reqs.48K:0%
 readahead:readahead:seq_read_reqs.56K:0%
 readahead:readahead:seq_read_reqs.64K:96%

We can see that, of the 4KB reads that we serviced, 36% were sequential. Mathematically this means that ~64% would be random, which we can confirm here:

# rsh ravi "priv set -q diag; stats show readahead:readahead:rand_read_reqs"
 readahead:readahead:rand_read_reqs.4K:63%
 readahead:readahead:rand_read_reqs.8K:53%
 readahead:readahead:rand_read_reqs.12K:0%
 readahead:readahead:rand_read_reqs.16K:100%
 readahead:readahead:rand_read_reqs.20K:100%
 readahead:readahead:rand_read_reqs.24K:100%
 readahead:readahead:rand_read_reqs.28K:100%
 readahead:readahead:rand_read_reqs.32K:0%
 readahead:readahead:rand_read_reqs.40K:0%
 readahead:readahead:rand_read_reqs.48K:0%
 readahead:readahead:rand_read_reqs.56K:0%
 readahead:readahead:rand_read_reqs.64K:3%

Here, of our 4KB requests, 63% of them are random. Of our 8KB requests, 53% were random. Notice that the random percentages and sequential percentages for each block histogram will add up to ~100% depending on rounding. If you dig deeper into the stats, readahead can provide you with some useful information about how your data came to be read:

# rsh ravi "priv set -q diag; stats readahead:readahead"
 readahead:readahead:requested:2217395
 readahead:readahead:read:82524
 readahead:readahead:incore:25518
 readahead:readahead:speculative:79658
 readahead:readahead:read_once:82200

You can interpret those numbers thus. Again, I’ve removed the irrelevant stuff:

requested is the sum of all blocks requested by the client
read are the blocks actually read from disk (where the readahead engine had a “miss”)
incore are the blocks read from cache (where the readahead engine had a “hit”)
speculative are blocks that were readahead because we figured the client might need them later in the I/O stream
read_once are blocks that were not read ahead and was not cached because it didn’t match the cache policy (e.g. flexscale.enable.lopri_blocks being off)

With this information, you should have a pretty good idea about how to characterize what your Filer is doing for workloads. Although I’ve focused on NFSv3 here, the same techniques can be applied to the other protocols. Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp Characterizing Workloads in Netapp

Your feedback is more valuable to us ….

Thanks for your wonderful Support and Encouragement