This article is intended for administrators wishing to better understand SvSAN best practices and other factors that affect both performance and failover.
Resolution/Information
Storage
Storage Mediums
If using SvSAN predictive storage caching (included in SvSAN Advanced Edition), a tiered (or 'hybrid') configuration of memory, solid-state disk and hard disk is recommended: it provides both capacity and performance at a lower cost than using only memory and SSD. Memory and SSD provide the predictive storage caching; HDD provides storage capacity. A relatively small amount of SSD in the SvSAN storage medium mix can offer significant performance benefits while keeping costs to a minimum. The cost versus benefit of the different storage mediums is detailed further in the white paper: Closing the Performance Gap – SvSAN's Caching Features.
A wide range of hardware configurations is suitable for use with SvSAN. StorMagic can provide example bills of materials for various hardware vendors.
If mirroring to hosts with different storage mediums (which is supported) – for example, mirroring a single HDD to an NVMe – write performance will be limited to the speed of the slower storage medium.
Predictive storage caching
For SSD read/write caching, SvSAN supports 2.5" form factor drives or PCIe cards; memory provides fast read caching. Resilient cache stores (RAID 1 or better) are best practice, to provide internal resiliency when a node is down for maintenance; however, a single SSD per node is supported. The cache storage can be protected by hardware RAID, or by the software RAID 1 provided by SvSAN. Because the cache services both read and write I/Os, mixed-use SSDs are recommended, or at least an appreciation of the IOPS capabilities of the SSDs in use. Enterprise-grade SSDs or better are recommended due to their higher endurance rating; consumer-grade SSDs are therefore not recommended for caching.
The SvSAN SSD caching layer is atomic. This is not a discardable caching layer, which is why we strongly recommend a minimum of RAID 1.
Write caching behavior – all writes are completed to both sides of the mirror, and are then serviced directly off the cache tier. In the background this data is coalesced, where possible, and flushed to HDD sequentially. It is beneficial to maintain some headroom in your cache in order to ensure that writes can be serviced in an optimal manner.
Read caching behavior – SvSAN tracks frequently read blocks of data and then dynamically pulls those up into free capacity within the SSD and memory cache tiers, if present. As the workload changes this will dynamically change over time. However, with a fairly static workload the hit rate will improve over time, thus minimizing impact from things like nightly/weekly backups. Memory caching works best for workloads where the number of small random reads is small.
https://support.stormagic.com/hc/en-gb/articles/5887719016989-SvHCI-SvSAN-Caching
UNMAP support for SSD-backed targets
SvSAN supports UNMAP for targets that have SSD as the underlying storage. UNMAP is the name for TRIM in the SCSI command set. UNMAP increases SSD write performance and prolongs the life of SSDs by reducing write amplification. An UNMAP command from the operating system informs the SSD which sectors contain deleted data, so that those sectors can be reused without that data first having to be written to other parts of the SSD. It is best practice to enable UNMAP on SSD-based targets when the prerequisites are met.
UNMAP cannot be used on targets with SSD caching enabled or those encrypted using SvSAN data encryption. See Target UNMAP for a full list of prerequisites and further information about this feature.
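Enabling UNMAP on the target itself is configured in SvSAN as described in the Target UNMAP documentation referenced above. As a hedged illustration of the initiator/guest side only, the Windows commands below check that delete notifications (TRIM/UNMAP) are enabled and manually retrim free space on a volume backed by an SSD-based target; the drive letter E: is a placeholder.

# Check that Windows sends delete notifications (TRIM/UNMAP) to the storage.
# DisableDeleteNotify = 0 means delete notifications are enabled.
fsutil behavior query DisableDeleteNotify

# Manually issue TRIM/UNMAP for free space on a volume (E: is a placeholder drive letter).
Optimize-Volume -DriveLetter E -ReTrim -Verbose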
Pool storage assignment: direct disk mapping versus virtual hard drives
The use of direct mapped disks with SvSAN is recommended: raw device mapping (in vSphere); pass-through LUN (in Hyper-V). This prevents file system layering of virtual hard disks (such as VMDKs on top of an existing VMFS in a VMware vSphere environment).
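SvSAN's own deployment tools normally handle this disk assignment. As a hedged illustration of the underlying vSphere operation only, a physical-mode RDM can be attached to a VSA with PowerCLI; the VM name "VSA1" and the naa device identifier are placeholders.

# Attach a local disk to the VSA as a physical-mode raw device mapping (RDM).
$vsa = Get-VM -Name "VSA1"
New-HardDisk -VM $vsa -DiskType RawPhysical -DeviceName "/vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx"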
If virtual hard disks are used instead, they are created eager-zeroed at VSA deployment to provide the best performance, although they take longer to deploy than lazy-zeroed disks.
If the underlying VMFS datastore fills, it may raise an alarm even though the mirrored VMDK that SvSAN presents may have plenty of free space. In this situation, it may be necessary to disable the alarm on the underlying datastore.
RAID card
StorMagic recommends the use of internal hardware RAID within a host to ensure internal node protection when a node is down for maintenance. SvSAN supports any RAID card that is supported by the hypervisor. SvSAN supports all RAID types. If using parity RAID (for example, RAID 5 or RAID 6), it is important to employ a battery-backed cache/NVRAM module to enable good write performance, by using write-back rather than write-through mode. Note that SvSAN performance is dependent on the speed of the underlying hardware.
Witness for mirrored targets
To provide full data integrity in case of failure of a mirrored target, it is recommended to use a witness. A witness holds mirror state and acts as arbiter when the two sides of the mirror are uncertain as to which has the correct state. A witness is a third-party machine with the StorMagic Witness Service deployed to it. The Witness Service can be deployed to either a Windows or Linux OS, on either a physical machine or a virtual machine (outside the HA cluster). StorMagic also offers a Witness appliance VM.
When you use a witness, your mirrored targets should use the Majority mirror isolation policy.
A witness must not run on a machine that hosts a VSA in the SvSAN cluster that the witness protects.
For stretch cluster configurations, ideally the witness should be located at a third site to provide quorum in a site-failure scenario.
Snapshots (VMware/Hyper-V)
StorMagic does not recommend taking snapshots of the VSA VM with VMware/Hyper-V snapshots or with backup tools such as Veeam. Doing so stuns (freezes) the VSA in order for the VMware APIs to complete the snapshot.
This has an impact on the guest VMs running on the SvSAN storage, stunning their I/O and impacting production.
SvSAN VSAs back up their configuration to each other automatically when mirroring, so they do not need to be managed by external backup/snapshot tools.
https://support.stormagic.com/hc/en-gb/articles/5729811555869-Should-I-backup-my-SvSAN-VSAs
Networking
This section describes two typical networking configurations. We recommend the first, using three SvSAN network interfaces and directly connected NICs. The second uses two interfaces and switched networking.
Where possible, one NIC per switch for mirroring and iSCSI is recommended, as this allows the VSA to handle balancing, failover and bandwidth. Other methods prevent SvSAN from maximizing bandwidth: technologies such as LACP and EtherChannel do not increase the bandwidth available, since they are limited to carrying a single stream over one cable per 'bundle' and do not balance by frames.
In a 10 Gb network environment with two or more 10 Gb links used in this manner, it can be advantageous to enable Round Robin and change its IOPS setting to 1 at the hypervisor level (see https://kb.vmware.com/s/article/2069356). This can boost performance by leveraging both VSAs; a sketch of the change follows.
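A hedged PowerCLI sketch of this change is shown below; the host name and the vendor string used to select the SvSAN LUNs are placeholder assumptions, and the esxcli commands in the linked KB article are the authoritative method.

# Set Round Robin with an IOPS limit of 1 on the SvSAN iSCSI LUNs of one host.
$vmhost = Get-VMHost "esxi01.example.com"
Get-ScsiLun -VmHost $vmhost -LunType disk |
    Where-Object { $_.Vendor -match "StorMag" } |   # placeholder vendor filter for SvSAN targets
    Set-ScsiLun -MultipathPolicy RoundRobin -CommandsToSwitchPath 1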
SvSAN supports any network topology, including directly connected ('back to back' links), and the following types of switched connections: LACP, EtherChannel, VLAN/subnetting, etc.
Three SvSAN network interfaces
This example configuration uses directly connected network interface cards, so two different storage IP subnets are used to prevent routing issues. Switched connections are also fully supported. The table below shows example network address allocations.
| HOST 1 | | HOST 2 | |
|---|---|---|---|
| Purpose | IP address | Purpose | IP address |
| Host 1 management connection (Hostname: host1.example.com) | 10.1.100.11/24 | Host 2 management connection (Hostname: host2.example.com) | 10.1.100.12/24 |
| Host 1 storage connection 1 | 192.168.1.1/24 | Host 2 storage connection 1 | 192.168.1.2/24 |
| Host 1 storage connection 2 | 192.168.2.1/24 | Host 2 storage connection 2 | 192.168.2.2/24 |
| VSA 1 management connection (Hostname: VSAhost1.example.com) | 10.1.100.13/24 | VSA 2 management connection (Hostname: VSAhost2.example.com) | 10.1.100.14/24 |
| VSA 1 iSCSI and mirror connection 1 | 192.168.1.11/24 | VSA 2 iSCSI and mirror connection 1 | 192.168.1.12/24 |
| VSA 1 iSCSI and mirror connection 2 | 192.168.2.11/24 | VSA 2 iSCSI and mirror connection 2 | 192.168.2.12/24 |
| Default gateway | 10.1.100.254/24 | Default gateway | 10.1.100.254/24 |
| DNS name server (primary) | 10.1.100.2/24 | DNS name server (primary) | 10.1.100.2/24 |
| DNS name server (secondary) | 10.1.100.3/24 | DNS name server (secondary) | 10.1.100.3/24 |
The use of back-to-back cables frees up switch ports, enables load balancing, and potentially enables 10 Gb speeds without the need for a 10 Gb switch. Note that the back-to-back links are on two vSwitches and two different IP subnets; this is important to prevent the host from trying to route to a NIC it cannot reach over the directly connected cabling.
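A hedged PowerCLI sketch of the host-side networking for this layout is shown below (host 1 only); the vSwitch, port group and NIC names are placeholders chosen to match the example table above.

# One vSwitch per back-to-back link, each with its own NIC and IP subnet.
$vmhost = Get-VMHost "host1.example.com"
$vs1 = New-VirtualSwitch -VMHost $vmhost -Name "vSwitch-Storage1" -Nic "vmnic2"
$vs2 = New-VirtualSwitch -VMHost $vmhost -Name "vSwitch-Storage2" -Nic "vmnic3"

# VMkernel ports for the host's iSCSI connections on the two storage subnets.
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $vs1 -PortGroup "Storage1" -IP 192.168.1.1 -SubnetMask 255.255.255.0
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $vs2 -PortGroup "Storage2" -IP 192.168.2.1 -SubnetMask 255.255.255.0
# VM port groups for the VSA's iSCSI/mirror NICs would be added to the same two vSwitches.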
Two SvSAN network interfaces
This example configuration teams NICs together using switched networking. The table below shows example network address allocations.
| HOST 1 | | HOST 2 | |
|---|---|---|---|
| Purpose | IP address | Purpose | IP address |
| Host 1 management connection (Hostname: host1.example.com) | 10.1.100.11/24 | Host 2 management connection (Hostname: host2.example.com) | 10.1.100.12/24 |
| Host 1 storage connection 1 | 192.168.1.1/24 | Host 2 storage connection 1 | 192.168.1.2/24 |
| VSA 1 management connection (Hostname: VSAhost1.example.com) | 10.1.100.13/24 | VSA 2 management connection (Hostname: VSAhost2.example.com) | 10.1.100.14/24 |
| VSA 1 iSCSI and mirror connection 1 | 192.168.1.11/24 | VSA 2 iSCSI and mirror connection 1 | 192.168.1.12/24 |
| Default gateway | 10.1.100.254/24 | Default gateway | 10.1.100.254/24 |
| DNS name server (primary) | 10.1.100.2/24 | DNS name server (primary) | 10.1.100.2/24 |
| DNS name server (secondary) | 10.1.100.3/24 | DNS name server (secondary) | 10.1.100.3/24 |
This configuration uses two switches and two physical NICs to minimize the networking configuration. To obtain 10 Gb speeds or faster, at least a 10 Gb switch would be required.
Multipathing
SvSAN presents the mirrored storage as an active-active multipath disk device over iSCSI, so both nodes can be used for read I/Os. Multipathing is handled by the hypervisor, and all multipath policies are supported. VSAs integrate with VMware ESXi and, by default, set a Fixed path policy with the local path preferred, so that each VM reads from its local host's disk rather than crossing the physical network. If a VM is migrated to the other host, its I/O switches to that host's local storage. Alternatively, in a heavy read I/O environment, Round Robin can be used with fast interconnect networking to read from both nodes.
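To confirm which path selection policy is in effect on the SvSAN devices, a hedged PowerCLI check such as the one below can be used; the host name and vendor filter are placeholder assumptions.

# List the SvSAN multipath devices, their current path selection policy and path count.
$vmhost = Get-VMHost "esxi01.example.com"
Get-ScsiLun -VmHost $vmhost -LunType disk |
    Where-Object { $_.Vendor -match "StorMag" } |   # placeholder vendor filter
    Select-Object CanonicalName, MultipathPolicy, @{N="Paths"; E={($_ | Get-ScsiLunPath).Count}}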
See also
VMware doc: Path Selection Plug-Ins and Policies
Jumbo frames
SvSAN supports jumbo frames (up to 9000 MTU), and it is recommended to employ them with 10 Gb (or faster) networking.
For VMware vSphere, a 9000 MTU must be set on the vSwitch, the VMkernel port, the SvSAN logical NIC and every physical interconnect (physical switch) in between.
For Microsoft Hyper-V, the jumbo packet setting usually includes the Ethernet header and is therefore commonly set to 9014; for SvSAN, however, the size should be set to 9000, not 9014.
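The hedged sketch below shows where the 9000 MTU is applied on each platform; the host, vSwitch, VMkernel and adapter names are placeholders, the accepted jumbo packet values vary by NIC driver, and the physical switches must be configured separately.

# VMware vSphere (PowerCLI): set MTU 9000 on the storage vSwitch and VMkernel port.
Get-VirtualSwitch -VMHost "esxi01.example.com" -Name "vSwitch-Storage1" | Set-VirtualSwitch -Mtu 9000 -Confirm:$false
Get-VMHostNetworkAdapter -VMHost "esxi01.example.com" -Name "vmk1" | Set-VMHostNetworkAdapter -Mtu 9000 -Confirm:$false

# Microsoft Hyper-V (PowerShell): set the jumbo packet size on the physical storage NIC to 9000, not 9014.
Set-NetAdapterAdvancedProperty -Name "Storage1" -RegistryKeyword "*JumboPacket" -RegistryValue 9000
# The MTU of the SvSAN logical NIC itself is set from the VSA's own management interface.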
iSCSI port bindings
Port binding is designed to lock iSCSI paths to specific endpoints and hardware, which can cause issues in this virtualized scenario. Because SvSAN creates a 'virtual' SAN across two separate pieces of physical hardware, using port binding would prevent some iSCSI failover scenarios, and it is therefore not recommended.
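To verify that no network port bindings are configured on the software iSCSI adapter, a hedged check via esxcli through PowerCLI is shown below; the host and adapter names are placeholders.

# List iSCSI network port bindings for the software iSCSI adapter; the list should be empty.
$esxcli = Get-EsxCli -VMHost "esxi01.example.com" -V2
$esxcli.iscsi.networkportal.list.Invoke(@{adapter = "vmhba64"})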
VSA resources
Number of vCPUs
By default, SvSAN deploys VSAs with a single vCPU. Additional vCPUs will be required if using predictive storage caching ('caching') or data encryption.
| SvSAN Usage | Number of vCPUs required | Notes |
|---|---|---|
| No caching or data encryption | 1 | This is the default number of vCPUs set on deployment. |
| Caching, no data encryption | 2 | If using SSD read/write and memory read caching in a high-performance environment, adding one vCPU to the VSA spreads the work across two cores: one core handles the caching processes while the other is dedicated to processing I/O without interruption. Because data encryption is not in use, more than two vCPUs is unnecessary and would impact SvSAN's I/O processing, and therefore performance. |
| Data encryption, no caching | More than 1 | When using data encryption, consider adding vCPUs for high performance. SvSAN creates one encryption thread per core, each offering a maximum throughput of around 45 MB/s, so an extra four cores provides around 180 MB/s. You can hot-add CPU cores and the VSA will dynamically add encryption threads accordingly (see the hot-add example below). This increases encryption performance up to the point where the CPU is no longer the limiting factor (the network or storage speed becomes the bottleneck), which can be determined by performance testing. |
| Caching and data encryption | More than 2 | Two vCPUs for caching, plus extra vCPUs for data encryption (see the data encryption note above). |
https://support.stormagic.com/hc/en-gb/articles/5978263848861-SvHCI-SvSAN-Encryption
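As a hedged illustration of hot-adding vCPUs to a VSA for additional encryption throughput, the PowerCLI line below can be used; the VM name and core count are placeholders, and CPU hot-add must already be enabled on the VSA VM.

# Hot-add vCPUs to the VSA; SvSAN creates one encryption thread per core.
Set-VM -VM "VSA1" -NumCpu 4 -Confirm:$false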
vCPU clock speed
Reservations set on deployment are as follows:
- If CPU < 2 GHz, reservation = CPU core speed. For example, a VSA deployed on a 1.8 GHz CPU core gets a reservation of 1.8 GHz
- If CPU ≥ 2 GHz, reservation = 2 GHz (note this is a maximum value set on deployment, but can be manually increased post-deployment)
Remaining CPU resources are then available to guest VMs.
It is best practice that the CPU reservation for the VSA match the base core clock speed of a single core of the host hardware. For example, for a host CPU clock speed of 3 GHz, you should increase (post-deployment) the reservation of the VSA to 3 GHz to match.
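A hedged PowerCLI sketch for raising the reservation post-deployment is shown below; the VM name and the 3000 MHz value (matching the 3 GHz example above) are placeholders.

# Reserve 3000 MHz (3 GHz) of CPU for the VSA to match one core of the host CPU.
$vsa = Get-VM -Name "VSA1"
Get-VMResourceConfiguration -VM $vsa | Set-VMResourceConfiguration -CpuReservationMhz 3000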
Increasing CPU clock speed
In the past, the bottleneck in virtualized environments was often the storage (due to the seek times of spinning disks and the random I/O produced by many applications sharing the same storage) or the speed of the networking. Today, this can be alleviated by using high-performance storage (such as NVMe) together with fast networking (10 Gb or greater). To ensure that the SvSAN appliance does not become the bottleneck, a minimum CPU speed of 2 GHz is recommended. Faster processing means the VSA needs less time to encapsulate the SCSI protocol data units or I/Os submitted by the guest VMs into TCP packets for transfer over the network.
Below are measurements of read and write performance at two different clock speeds set on the StorMagic VSA: 2.7 GHz and 2.0 GHz. The SvSAN VSA was running on an Intel Gold 6150 and the storage type was fast NVMe. All I/O was 4k 100% random, with a queue depth of 128.
In the graphics below, the % CPU utilization figures are for the guest VM, not the VSA.
For a single worker, accessing a simple target, the graphs show that a significant performance increase was obtained using the 2.7 GHz CPU.
4k random read performance with a 2.0 GHz limit on VSA CPU
4k random write performance with a 2.0 GHz limit on VSA CPU
4k random read performance with a 2.7 GHz limit on VSA CPU
4k random write performance with a 2.7 GHz limit on VSA CPU
With a second worker (a worker on each host), SvSAN's active-active mirror allows both nodes to be leveraged for reads, and 4k random reads of approximately 160,000 I/Os per second were achieved from the pair of nodes with 2.7 GHz CPUs:
4k random read performance with a 2.7 GHz limit on VSA CPU - 1x worker per node and data locality
https://support.stormagic.com/hc/en-gb/articles/5986612738717-SvSAN-Performance
VSA memory allocation
The default memory of 1 GB is adequate when using SvSAN without caching.
Note: with VSAs on Microsoft Hyper-V serving more than two targets, it is recommended to increase the VSA memory to 2 GB even when caching is not in use.
If caching is used, VSA memory should be increased according to the sizes of memory cache and SSD cache you require.
This section describes VSA memory requirements when using SvSAN predictive storage caching. Both memory and SSD read/write caching are available as a purchasable feature.
The following table shows the memory required by the VSA for a given memory read cache size and SSD read/write cache size. Some examples:
- Memory caching only. For 8 GB of usable memory for memory caching, the VSA requires 12 GB of memory.
- SSD caching only. For a 1 TB SSD cache, the VSA requires 4 GB of memory.
- SSD and memory caching. For 8 GB of usable memory for memory caching, and a 1 TB SSD cache, the VSA requires 14 GB of memory.
If neither SSD caching nor memory caching is used, the VSA needs 1 GB, which is the default amount of memory assigned on VSA deployment.
For memory caching, the VSA will use all the memory available to it.
StorMagic offers remote professional services to architect, deploy and analyze production workloads for virtualized environments, or to advise on cache sizing in detail.
VSA memory required for caching (all figures in GB)
The maximum SSD cache size (GB) runs across the columns; the memory cache size (GB) runs down the rows.

| Memory cache size (GB) | 0² | 250 | 500 | 1000 | 1500 | 2000 | 2500 | 3000 | 3500 | 4000 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0¹ | 1 | 3 | 3 | 4 | 6 | 7 | 8 | 9 | 10 | 12 |
| 1 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 12 | 13 |
| 2 | 5 | 5 | 6 | 7 | 8 | 9 | 11 | 12 | 13 | 14 |
| 3 | 6 | 6 | 7 | 8 | 9 | 11 | 12 | 13 | 14 | 15 |
| 4 | 7 | 7 | 8 | 9 | 11 | 12 | 13 | 14 | 15 | 17 |
| 5 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 17 | 18 |
| 6 | 10 | 10 | 11 | 12 | 13 | 14 | 15 | 17 | 18 | 19 |
| 7 | 11 | 11 | 12 | 13 | 14 | 15 | 17 | 18 | 19 | 20 |
| 8 | 12 | 12 | 13 | 14 | 15 | 17 | 18 | 19 | 20 | 22 |
| 9 | 13 | 13 | 14 | 15 | 17 | 18 | 19 | 20 | 22 | 23 |
| 10 | 14 | 15 | 15 | 17 | 18 | 19 | 21 | 22 | 23 | 24 |
| 11 | 15 | 17 | 17 | 18 | 19 | 21 | 22 | 23 | 24 | 25 |
| 12 | 17 | 18 | 18 | 19 | 21 | 22 | 23 | 24 | 25 | 26 |
| 13 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 |
| 14 | 20 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 |
| 15 | 21 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 |
| 16 | 22 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 |
| 17 | 23 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 32 |
| 18 | 24 | 25 | 25 | 26 | 27 | 28 | 29 | 30 | 32 | - |
| 19 | 25 | 26 | 26 | 27 | 28 | 29 | 31 | 32 | - | - |
| 20 | 26 | 27 | 27 | 28 | 29 | 31 | 32 | - | - | - |
| 21 | 27 | 28 | 28 | 29 | 31 | 32 | - | - | - | - |
| 22 | 28 | 29 | 30 | 31 | 32 | - | - | - | - | - |
| 23 | 30 | 30 | 31 | 32 | - | - | - | - | - | - |
| 24 | 31 | 31 | 32 | - | - | - | - | - | - | - |
| 25 | 32 | 32 | - | - | - | - | - | - | - | - |
¹ Memory caching disabled
² SSD caching disabled
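Once the required figure has been read from the table, the VSA's memory allocation can be increased from the hypervisor. A hedged PowerCLI example is shown below; the VM name is a placeholder, the 14 GB value corresponds to the 8 GB memory cache plus 1 TB SSD cache example above, and the change typically requires the VSA to be powered off (follow normal SvSAN node maintenance practice so the mirror remains available).

# Increase the VSA's memory to the value read from the table.
Set-VM -VM "VSA1" -MemoryGB 14 -Confirm:$false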
Other performance considerations
Hypervisor performance versus power consumption
There is a trade-off between hypervisor performance and power consumption. Consider whether your settings are suitable for your needs.
ESXi
In ESXi, you can control performance in the Edit Power Policy Settings window:
Or, using vSphere PowerCLI:
# Ignore certificate warnings and connect to the host
Set-PowerCLIConfiguration -InvalidCertificateAction Ignore -WarningAction SilentlyContinue
Connect-VIServer 10.10.0.102

# Retrieve the host view and apply the required power policy
$view = Get-VMHost 10.10.0.102 | Get-View
(Get-View $view.ConfigManager.PowerSystem).ConfigurePowerPolicy($x)

where $x is the key of the policy you want to set:
- 1 = static (high performance)
- 2 = dynamic (balanced)
- 3 = low (low power)
- 4 = custom
Hyper-V
In Hyper-V, you can control this within Control Panel, under Power Options.
Or, using the powercfg command-line tool (from PowerShell or Command Prompt):
# Find the GUID of the required power plan
powercfg -list

# Power scheme GUIDs and alias names:
# 381b4222-f694-41f0-9685-ff5bb260df2e  SCHEME_BALANCED  -- Balanced
# 8c5e7fda-e8bf-4a96-aa85-6dda8227f32c  SCHEME_MIN       -- High performance
# a1841308-3541-4fab-bc81-f71556f20b4a  SCHEME_MAX       -- Power saver

# Set "Power saver" as the active power plan (for maximum performance, use the High performance GUID instead)
powercfg -setactive a1841308-3541-4fab-bc81-f71556f20b4a
Server BIOS
Performance and power profiles can also be controlled in the server BIOS/UEFI, accessed directly or through the hardware management interface (iLO, iDRAC, IMM, XCC, CIMC, etc.); configure these as per the hardware vendor's recommendations.
Network performance
SvSAN presents synchronously-mirrored multipath block-based disk devices over iSCSI. SvSAN has two network traffic types:
- Management – VSA management
- iSCSI – used by iSCSI initiators to access the iSCSI storage targets
You can define which traffic type is allowed over each of your network interfaces.
In addition, you can define which interfaces are allowed to be used for mirror traffic, including specifying mirror preferred, failover or excluded interfaces.
This flexibility lends itself to many different network topologies, depending on whether storage is being shared only to the same hosts, to other external hosts, to other hypervisors, or to guest VMs within a hypervisor.
For more information see Network interfaces.
Storage Path Failover Timing
StorMagic SvSAN presents synchronously mirrored multipath block based disk devices over iSCSI.
Storage paths are presented to the initiators rather than using a Virtual IP (VIP) address.
VMware ESXi
SvSAN mirrors are active-active, so each node can be read from independently, with a concept of quorum leadership under the hood. The failure behavior is detailed below.
StorMagic has seen scenarios, when a VSA is powered off, relating to path failover responsiveness. For example, when pinging a VM on SvSAN shared storage and powering off the leader VSA, a packet drop may be observed. This relates to the combination of the internal SvSAN election, path failover on the ESXi side, and the TCP stack in use.
StorMagic therefore recommends disabling Delayed ACK:
https://knowledge.broadcom.com/external/article/313543/esxesxi-hosts-might-experience-read-or-w.html
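As a hedged sketch only, Delayed ACK can be disabled on the software iSCSI adapter with esxcli via PowerCLI; the host and adapter names are placeholders, the procedure in the linked Broadcom article is authoritative (on some ESXi versions the setting is changed through the vSphere Client advanced options instead), and a host reboot is typically required for the change to take effect.

# Disable Delayed ACK on the software iSCSI adapter.
$esxcli = Get-EsxCli -VMHost "esxi01.example.com" -V2
$esxcli.iscsi.adapter.param.set.Invoke(@{adapter = "vmhba64"; key = "DelayedAck"; value = "false"})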
Microsoft Hyper-V
StorMagic recommends setting both of the registry values below to 5 (a combined PowerShell example is shown after the two settings).
iSCSI Adapter
This modifies the iSCSI software adapter on each host. The value determines how long requests are held in the device queue and retried if the connection to the target is lost.
Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0000\Parameters\
Value name: LinkDownTime
Value data: 5
0000 is the instance number, which could be different in your environment.
TCP tuning
This value tunes TCP retransmission behavior (see https://technet.microsoft.com/en-gb/library/cc938210.aspx).
Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\
Value name: TcpMaxDataRetransmissions
Value data: 5
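A hedged PowerShell sketch that applies both values is shown below; as noted above, the 0000 instance number is a placeholder and may differ in your environment, and a reboot is typically required for the changes to take effect.

# Set LinkDownTime on the iSCSI software adapter instance (0000 is a placeholder instance number).
$iscsiKey = "HKLM:\SYSTEM\CurrentControlSet\Control\Class\{4D36E97B-E325-11CE-BFC1-08002BE10318}\0000\Parameters"
Set-ItemProperty -Path $iscsiKey -Name LinkDownTime -Value 5 -Type DWord

# Set TcpMaxDataRetransmissions for the TCP/IP stack.
$tcpKey = "HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
Set-ItemProperty -Path $tcpKey -Name TcpMaxDataRetransmissions -Value 5 -Type DWord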