Optimizing VMware ESXi iSCSI Performance with iSCSI Buffers
Optimizing VMware ESXi iSCSI performance involves a multi-faceted approach, touching on network configuration, ESXi settings, and even your storage array's capabilities. Two of the most common places to start are network configuration (crucial for iSCSI) and ESXi host configuration. In this blog post, I’ll focus on improving iSCSI performance via ESXi host advanced configuration settings.

To improve ESXi iSCSI performance via advanced settings, you're primarily looking at parameters that control how the ESXi host interacts with the iSCSI storage at a deeper level. These settings should always be modified with caution, preferably after consulting VMware (Broadcom) documentation or your storage vendor's recommendations, as incorrect changes can lead to instability or worse performance.

Recommended steps for adjusting ESXi advanced settings:

Understand your workload: Identify if your workload is sequential or random, small block or large block, read-heavy or write-heavy. This influences which settings might be most beneficial.
Identify bottlenecks: Use esxtop, vCenter performance charts, and your storage array's monitoring tools to pinpoint where the bottleneck lies (host CPU, network, storage array controllers, disks).
Consult documentation: Always refer to VMware's (Broadcom) official KBs and your storage vendor's best practices guides.
Change one setting at a time: Make only one change, then thoroughly test and monitor the impact. This allows you to isolate the effect of each change.
Make incremental adjustments: Don't make drastic changes. Increase or decrease values incrementally.
Test in a lab: If possible, test performance changes in a lab environment before implementing them in production.
Be prepared to revert: Make a note of default values before making changes so you can easily revert if issues arise.
There are several ESXi advanced settings (VMkernel parameters) that can influence iSCSI performance, for example, iSCSI session cloning, DSNRO, iSCSI adapter device queue depth, MaxIoSize, and others. I’ll focus on the relatively new configuration setting available from vSphere 7.0 U3d onwards, which allows adjusting iSCSI socket buffer sizes.

iSCSI Socket Buffer Sizes
There are two advanced parameters for adjusting iSCSI socket buffer sizes: SocketSndBufLenKB and SocketRcvBufLenKB. These parameters control the size of the TCP send and receive buffers, respectively, for iSCSI connections and are configurable via ESXi host advanced settings (go to Host > Configure > System > Advanced Settings and search for ISCSI.SocketSndBufLenKB and ISCSI.SocketRcvBufLenKB). The receive buffer size affects read performance, while the send buffer size affects write performance. For high-bandwidth (10Gbps+) or high-latency networks, increasing these buffers can significantly improve TCP throughput by allowing more data to be "in flight" over the network. This is related to the bandwidth delay product (BDP); see details below.

What Value Should I Use?
The default values are 600KB for SocketSndBufLenKB and 256KB for SocketRcvBufLenKB, and both parameters can be raised up to 6MB. My recommendation is to calculate the BDP in your environment, adjust the iSCSI socket buffer sizes, test them, and monitor the results with esxtop (see a how-to below). Note that larger buffers consume more memory on the ESXi host. While generally not a major concern unless extremely large values are used, it's something to be aware of.

Bandwidth Delay Product (BDP)
Now, let’s take a closer look at the bandwidth delay product (BDP). BDP is a fundamental concept in networking that represents the maximum amount of data that can be in transit (on the "wire") at any given time over a network path.
It's essentially the "volume" of the network pipe between two points.

Why Is BDP Important for TCP/iSCSI?
Transmission Control Protocol (TCP), which iSCSI relies on, uses a "windowing" mechanism to control the flow of data. The TCP send and receive buffers, which bound the TCP window, dictate how much data can be sent before an acknowledgment (ACK) is received.

If your TCP buffers are smaller than the BDP, the TCP window will close before the network pipe is full. This means the sender has to stop and wait for ACKs, even if the network link has more capacity. This leads to underutilization of bandwidth and reduced throughput.

If your TCP buffers are equal to or larger than the BDP, the sender can keep sending data continuously, filling the network pipe. This ensures maximum throughput and efficiency.

When Is BDP Configuration Most Relevant?
BDP configuration is important for:

High-bandwidth networks: 10/25/40/50/100Gbps iSCSI networks
High-latency networks: Stretched clusters, long-distance iSCSI, cloud environments, or environments with multiple network hops between ESXi and storage

For typical 1Gbps iSCSI networks with low latency, the default buffer sizes are usually sufficient, as the BDP will likely be smaller than the defaults. However, as network speeds increase, accurately sizing your TCP buffers becomes more critical for maximizing performance.

How to Calculate BDP
BDP = Bandwidth (BW) × Round Trip Time (RTT)

Where:

Bandwidth (BW): The data rate of the network link, typically measured in bits per second (bps) or bytes per second (Bps). In ESXi contexts, this refers to the speed of your iSCSI NICs (e.g., 1Gbps, 10Gbps). If you use bits per second, divide the result by 8 to get the BDP in bytes.
Round Trip Time (RTT): The time it takes for a packet to travel from the sender to the receiver and back again, measured in seconds (or milliseconds, which then need conversion to seconds for the formula). This accounts for network latency.
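
As a rough illustration of the formula above, here is a minimal Python sketch that computes the BDP for a link and turns it into a candidate socket buffer size in KB. The function and constant names are hypothetical. The 6144 KB cap assumes that the 6MB maximum means 6 × 1024 KB, and treating the default as a floor is a choice made for this example, not a VMware (Broadcom) recommendation.

```python
import math

# Defaults and maximum as quoted in this post. The 6144 KB figure is an
# assumption that "6MB" means 6 * 1024 KB.
DEFAULT_SND_KB = 600
DEFAULT_RCV_KB = 256
MAX_BUF_KB = 6144


def bdp_bytes(bandwidth_gbps: float, rtt_ms: float) -> float:
    """Bandwidth delay product in bytes: BW (bits/s) x RTT (s) / 8."""
    bandwidth_bps = bandwidth_gbps * 1e9
    return bandwidth_bps * rtt_ms / 1000.0 / 8.0


def suggested_buffer_kb(bandwidth_gbps: float, rtt_ms: float,
                        default_kb: int, max_kb: int = MAX_BUF_KB) -> int:
    """Round the BDP up to whole KB, then clamp it to [default_kb, max_kb].

    Never going below the default is an illustrative choice only.
    """
    bdp_kb = math.ceil(bdp_bytes(bandwidth_gbps, rtt_ms) / 1024)
    return min(max(bdp_kb, default_kb), max_kb)


if __name__ == "__main__":
    # Hypothetical links: (speed in Gbps, measured RTT in milliseconds)
    for gbps, rtt in [(1, 0.5), (10, 1.0), (100, 1.0)]:
        snd = suggested_buffer_kb(gbps, rtt, DEFAULT_SND_KB)
        rcv = suggested_buffer_kb(gbps, rtt, DEFAULT_RCV_KB)
        print(f"{gbps} Gbps, {rtt} ms RTT: BDP = {bdp_bytes(gbps, rtt):.0f} bytes, "
              f"candidate send = {snd} KB, receive = {rcv} KB")
```

For a 10Gbps link with a 1 ms RTT, the BDP works out to 1,250,000 bytes, or 1221 KB when rounded up, which is well above both defaults. Treat the result as a starting point for testing rather than a final value, and measure RTT on the actual iSCSI path rather than assuming it.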

Monitoring with esxtop
Modifying ESXi advanced settings can yield significant performance benefits, but it requires a deep understanding of your environment and careful, methodical execution and monitoring. I highly recommend watching esxtop metrics for storage performance to monitor the results and see the outcomes of the above changes.

How to Use esxtop
The most common way is to SSH into your ESXi host, but you can also access the ESXi command line directly from the ESXi host console. Once you are there, type esxtop and press Enter. You'll see the CPU view by default. To get to the disk-related views, press one of the following keys:

d (Disk Adapter View/HBA View): Shows performance metrics for your storage adapters (HBAs, software iSCSI adapters, etc.). This is useful for identifying bottlenecks at the host bus adapter level.
u (Disk Device View/LUN View): Displays metrics for individual storage devices (LUNs or datastores). This is often the most useful view for identifying shared storage issues.
v (Disk VM View/Virtual Machine Disk View): Shows disk performance metrics per virtual machine. This helps you identify which VMs are consuming the most I/O or experiencing high latency.

Once you're in a disk view (d, u, or v), you can monitor these key storage metrics:

Latency metrics (the most important):

DAVG/cmd (device average latency): This tells you how long the storage array itself is taking to process commands. High DAVG often indicates a bottleneck on the storage array (e.g., slow disks, busy controllers, insufficient IOPS).
KAVG/cmd (kernel average latency): This represents the time commands spend within the ESXi VMkernel's storage stack. High KAVG often points to queuing issues on the ESXi host. Look at QUED along with KAVG. If KAVG is high and QUED is consistently high, it suggests the ESXi host is queuing too many commands because the path to the storage (or the storage itself) can't keep up.
This could be due to a low configured queue depth (Disk.SchedNumReqOutstanding, iscsivmk_LunQDepth) or a saturated network path.
GAVG/cmd (guest average latency): This is the end-to-end latency seen by the virtual machine's guest operating system. It's the sum of DAVG and KAVG. This is what the VM and its applications are experiencing. If GAVG is high, you then use DAVG and KAVG to pinpoint where the problem lies.

Thresholds: While specific thresholds vary by workload and expectation (e.g., database VMs need lower latency than file servers), general guidelines are:

~10-20ms sustained: Starting to see performance impact.
>20-30ms sustained: Significant performance issues are likely.
>50ms sustained: Severe performance degradation.

I/O metrics:

CMDS/s (commands per second): The total number of SCSI commands (reads, writes, and others like reservations). This is often used interchangeably with IOPS.
READS/s and WRITES/s: The number of read and write I/O operations per second.
MBREAD/s and MBWRTN/s: The read and write throughput in megabytes per second. This tells you how much data is being transferred.

Queuing metrics:

QUED (queued commands): The number of commands waiting in the queue on the ESXi host. A persistently high QUED value indicates a bottleneck further down the storage path (either the network, the iSCSI adapter, or the storage array itself). This is a strong indicator that your queue depth settings might be too low, or your storage can't handle the incoming load.
ACTV (active commands): The number of commands currently being processed by the storage device.
QLEN (queue length): The configured queue depth for the device/adapter.

Conclusion
Modifying iSCSI socket buffer sizes is another method to tune the ESXi iSCSI connection for better performance. Together with other ESXi tunables, it can bring better performance to your storage backend.
If the iSCSI connection is already tuned for maximum performance, another option is to implement a more modern protocol such as NVMe over TCP, which Pure Storage fully supports with our arrays.

Hyper-V: The Municipal Fleet Pickup: Familiar, Capable, and Still Worth Considering
Hyper-V remains a practical, cost-efficient option for Windows-centric environments, offering strong features and seamless Azure integration. This blog explores where it shines, where it struggles, and how Pure ensures enterprise-grade data protection no matter which virtualization road you take.

Who’s Driving Virtualization? Kicking Off the Road Trip
Over the years, VMware vSphere has been the gold standard: the reliable luxury sedan of the datacenter. It’s delivered a smooth ride with powerful features, a robust ecosystem, and enough polish to keep your operations humming. Many of us have built entire practices, architectures, and skill sets around that platform.

But with the Broadcom acquisition, the road has changed. New licensing structures, evolving product bundles, and operational shifts have created uncertainty, the equivalent of finding out that your well-loved sedan suddenly takes only premium-priced fuel and requires a new maintenance shop.

So what are your options?

Stick with VMware? That’s still a perfectly valid choice, especially if you double down on modernizing how you run it. Enhancing vSphere with Pure’s FlashArray, FlashBlade, and Fusion gives you ways to simplify, automate, and reduce costs, while maintaining that familiar driving experience.
Look for an alternative? That’s where things get interesting. Because changing hypervisors isn’t like changing lanes on the highway; it’s more like switching to a whole different vehicle, with a new dashboard, new handling, new maintenance, and a different driving style.

Framing the Conversation: The Virtualization Vehicle Metaphor
In our session, we used a driving metaphor to illustrate these choices:

🚛 Hyper-V: The Municipal Fleet Pickup
Reliable, widely available, low-cost. If you know Windows, you know Hyper-V, and the licenses may already be in your toolbox.
🚙 Nutanix AHV: The Retro-Modern Concept Car
Streamlined, integrated, designed for simplicity. An HCI approach that reimagines what virtualization can look like.
🚐 Azure Local: The Electric Sprinter Van
Hybrid-ready, with a familiar dashboard if you live in the Microsoft ecosystem. Built for flexible, modern routes.
🚗 AWS Outposts: The Off-Road Luxury SUV
The same AWS powertrain, but adapted to handle rugged hybrid terrain on-premises.
🏎️ KVM: The EV Sports Car That’s Actually a Customized Japanese Compact (or maybe a well-used Ranger)
Flexible, open-source, highly modifiable, but definitely for drivers who are ready to get their hands dirty and do their own tuning.

Each of these “vehicles” comes with a different mix of:

✅ migration effort
✅ operational changes
✅ skill requirements
✅ and data protection needs

Key Takeaways from Accelerate
Here’s what stood out during our live session and the conversations that followed:

✅ There is no drop-in replacement for VMware. Each platform brings its own challenges and benefits.
✅ Migration is not just technical. It’s cultural, operational, and often requires reskilling your team.
✅ Modernizing with vSphere is still a strong path. With storage, automation, and security improvements, you can get more from what you already own.
✅ Pure is built to be your co-pilot. No matter which hypervisor you choose, we’re there to help you protect, manage, and move data seamlessly.

One theme that resonated was that the driver matters as much as the car. Your organization’s skills, processes, and risk tolerance all shape which road makes sense. You can’t pick a new hypervisor in a vacuum; you have to look at what you can maintain, what you can train for, and what you can support.

Where We’re Going with This Series
We had a ton of material packed into Accelerate, far more than fits in a single session recap. So here on Pure Community, I’ll be breaking down each of these hypervisors in detail, one at a time. Here’s what you can expect:

🚛 Hyper-V
We’ll dig into its Windows ecosystem strengths, where it works well, and what trade-offs come with a move from VMware.
🚙 Nutanix AHV
Here we’ll look at how a platform that was once strictly an integrated HCI offering can deliver simplicity alongside enterprise-grade capabilities on Pure Storage, and where it may leave gaps.
🚐 Azure Local
Let’s explore its strengths as a hybrid-ready strategy, the native integrations, and what to watch out for when moving workloads from traditional hypervisors.
🚗 AWS Outposts
Together we’ll break down why Outposts is not just a “VMware replacement,” but really an AWS extension, with its own Day 2 and migration realities.
🏎️ KVM
We’ll explore the open-source options, why so many see it as a cost-saver, and the skills you’ll need to manage it at scale.

So, Who’s Driving?
My biggest takeaway from Accelerate is this: Your hypervisor journey is less about the technology, and more about the people and processes behind it. Every route (modernize VMware, switch to a new platform, or blend hybrid options) has trade-offs. But with the right planning, the right skills, and the right partners, you can make the journey smoother.

Pure is committed to being your co-pilot, whichever path you choose. Whether you’re rolling out Fusion, looking to modernize with FlashArray, or exploring migration options, our ecosystem and integrations are designed to keep your data resilient, performant, and simple to manage.

Join the Discussion
I’d love to hear from you:

🚗 Are you staying on VMware?
🚗 Modernizing your vSphere environment?
🚗 Kicking the tires on an alternative hypervisor?

What worries you? What excites you? Drop a comment below to keep the conversation going. Let’s keep mapping this road trip, together.