Source Address: https://msandbu.wordpress.com/2014/10/31/netscaler-and-real-performance-tuning/
The original author is clearly not a native English speaker, so some passages were laborious to work through, but I am very grateful to the original author.
========= Translated content start =========
Yesterday I spoke at the Citrix User Group in Norway about NetScaler and performance tuning. Forty-five minutes is not much time to cover performance tuning, but I don't think I was too far off.
Here is the agenda of the talk:
* TCP overview, Multipath TCP, and Path MTU (path maximum transmission unit)
* SSL overview and tuning
* Auto-negotiation and duplex
* NetScaler VPX
* Jumbo frames and LACP
* Last but not least, mobile data streams
Apart from mobile data streams (MobileStream), which tie more closely into NetScaler's back-end features, most of this article is about the core NetScaler optimization features. I will write a separate blog post about mobile data streams.
First, TCP profiles. The default TCP settings on NetScaler are essentially unchanged since 1999, so out of the box NetScaler favors compatibility over raw performance. Of course, many factors play into this: the kind of architecture you use, packet-loss scenarios, bandwidth, network jitter, firewalls, and so on.
The main point is that the default TCP profile does not enable the following:
* Window scaling (scaling up the TCP window lets more data be in flight at once, so more data can be sent before waiting for acknowledgements)
* Selective acknowledgement (SACK), so that not all data has to be retransmitted after a loss; for example, if packets 1 2 3 4 5 are sent and the receiver never gets 3, only 3 is retransmitted, not 3, 4 and 5
* Nagle's algorithm (small pieces of data are collected until roughly an MTU's worth has accumulated before anything is sent)
The ICA protocol, for example, is quite chatty and uses many small packets (with a lot of header overhead), so the standard TCP settings are not a good fit for it.
That is what the built-in profile nstcp_xa_xd_profile is for (all of the features mentioned above are enabled in it). But of course you also have mobile users who jump between different WLANs or drop packets because of poor radio conditions. The default TCP profile uses TCP Reno, which halves the congestion window whenever packet loss is detected, and that does mobile users no favors.
So Citrix added a TCP congestion-control algorithm called Westwood+, which tries to estimate the bandwidth currently available to the device and shrinks the congestion window based on that estimate. This means mobile users get back up to higher transfer speeds faster after congestion.
Version 10.5 also adds an option to enable MPTCP (Multipath TCP). What does that mean? It is for devices that have two radios (one for mobile data, one for Wi-Fi) and can use both at the same time: the same device can then run two TCP subflows for one connection. It is just a profile setting and easy to switch on.
The catch is that the application itself has to be written to use MPTCP.
So go to System -> Profiles -> TCP Profiles (you can use an existing profile or create a new one).
Tick Window Scaling, and, if you need them, MPTCP, SACK, and Nagle.
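If you prefer the command line, a profile with the same settings can be created roughly like this. This is only a sketch: the profile name is made up, and the parameter names are from memory of the 10.5 CLI, so check them against your build before relying on them.

```
# Create a TCP profile with window scaling, SACK, Nagle and MPTCP enabled
add ns tcpProfile tcp_tuned_example -WS ENABLED -SACK ENABLED -nagle ENABLED -mptcp ENABLED

# Optionally switch the congestion-control flavor to Westwood for mobile users
set ns tcpProfile tcp_tuned_example -flavor Westwood

# Verify the result
show ns tcpProfile tcp_tuned_example
```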
Of course, Nagle also has a downside: it waits until up to a full MTU of data has been collected before sending it, so when a mobile user does lose a packet there is, in theory, more data that has to be retransmitted. Also, don't use Nagle in front of SQL traffic. :)
The cool thing is that TCP profiles are applied per vserver and per service, so you can pick the profile that fits each type of service, as in the sketch below.
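A minimal sketch of binding a profile, assuming a load-balancing vserver and a service with made-up names (vs_web and svc_web_01):

```
# Apply the TCP profile on the client side (vserver) and on the server side (service)
set lb vserver vs_web -tcpProfileName tcp_tuned_example
set service svc_web_01 -tcpProfileName tcp_tuned_example
```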
Another area is SSL tuning, where there are a few tips as well. The first is the quantum size, which defaults to 8 KB: NetScaler hands data to the SSL chips for encryption in 8 KB chunks. This can be raised to 16 KB so that more data is encrypted in one go.
A 16 KB quantum size is therefore a good fit when serving large file downloads, while regular web pages, which consist of lots of small objects, are usually better off with 8 KB.
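On the CLI, the quantum size is a global SSL parameter; a sketch (double-check the allowed values on your build):

```
# Raise the SSL quantum size from the default 8 KB to 16 KB
set ssl parameter -quantumSize 16384

# Confirm the change
show ssl parameter
```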
We also talked about auto-negotiation and duplex. In theory this should just work, but on some devices it still causes problems, so you often end up setting the speed and duplex manually on both the NetScaler and the switch/router/firewall it connects to.
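For example, pinning an interface instead of relying on auto-negotiation might look like this (the interface number and speed are placeholders; the device on the other end must be configured to match):

```
# Force interface 1/1 to 1 Gbit/s full duplex
set interface 1/1 -speed 1000 -duplex FULL

# Check the resulting settings
show interface 1/1
```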
There is quite a lot to tune on a VPX that you don't have to think about on an MPX.
For example, VPX supports multiple packet engines, meaning you can have dedicated engines that handle packet processing, policy evaluation, encryption and so on, separate from management. A regular VPX comes with two vCPUs by default (one for management, one for packet processing), so if you have a VPX 3000, two vCPUs and 2 GB of RAM may not be enough. If you run on XenServer or VMware, you can give the VM more vCPUs and memory to get more packet engines. (Note: Hyper-V does not support this; there the limit is two vCPUs, 2 GB of RAM and two virtual NICs, and you cannot add a third NIC.)
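On XenServer, for instance, the extra resources are assigned from the host with the xe CLI. This is a rough sketch, assuming the VM is named NetScaler-VPX and is shut down first; the name, UUID and sizes are placeholders:

```
# Look up the VPX VM's UUID
xe vm-list name-label=NetScaler-VPX

# Give it four vCPUs (raise the maximum first, then the startup count)
xe vm-param-set uuid=<vm-uuid> VCPUs-max=4
xe vm-param-set uuid=<vm-uuid> VCPUs-at-startup=4

# Give it 4 GB of RAM
xe vm-memory-limits-set uuid=<vm-uuid> static-min=4GiB static-max=4GiB dynamic-min=4GiB dynamic-max=4GiB
```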
Of course, if you run the VPX on Hyper-V, make sure you are using the latest drivers and that VMQ (Virtual Machine Queueing) is actually working.
With VMQ, the physical NIC gives the VPX a dedicated queue for its traffic; without it, the VPX ends up on the default queue, which is usually shared with the other virtual machines on the host. Many Broadcom NIC drivers do not support VMQ.
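On the Hyper-V host this can be checked from PowerShell; a sketch, where the adapter and VM names are placeholders:

```
# See which physical NICs have VMQ available and enabled
Get-NetAdapterVmq

# Enable VMQ on the NIC behind the virtual switch the VPX uses
Enable-NetAdapterVmq -Name "Ethernet 1"

# Allow the VPX's virtual NICs to use a VMQ queue
Get-VMNetworkAdapter -VMName "NetScaler-VPX" | Set-VMNetworkAdapter -VmqWeight 100
```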
Also worth mentioning is LACP (NIC teaming, port channels, 802.3ad), which lets you aggregate multiple physical NICs for more bandwidth and redundancy. Note that this also has to be configured on the switch, and it is only supported on MPX and SDX.
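A sketch of what that looks like on the NetScaler side (the interface numbers and LACP key are placeholders; the switch ports must be configured to match):

```
# Put two 10G interfaces into the same LACP channel
set interface 10/1 -lacpMode ACTIVE -lacpKey 1
set interface 10/2 -lacpMode ACTIVE -lacpKey 1

# Check the resulting link-aggregation channel
show channel
```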
Version 10.5 also adds support for jumbo frames, which lets you raise the MTU to 9000 bytes; the headers then take up a relatively smaller share of each frame, there is more room for payload, and fewer ACKs are needed.
This, too, is only supported on MPX and SDX, because a VPX depends on what the underlying hypervisor (ESXi, for example) provides.
It is set per interface. Note that your switches and servers must support jumbo frames as well, and once traffic heads out over the WAN the feature may stop working, because it ends up on the carrier's routers (most carriers only support the default MTU size).
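Per interface, it could look roughly like this (the interface number is a placeholder, and again this only applies to MPX/SDX):

```
# Raise the MTU on a data interface to 9000 bytes for jumbo frames
set interface 10/1 -mtu 9000

# Verify the interface MTU
show interface 10/1
```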
Note that NetScaler also has a Path MTU Discovery feature, which lets it detect the smallest MTU along the path it is going to use ahead of time. It uses ICMP to determine the lowest MTU toward the next hop; the problem is that, because it relies on ICMP, a firewall at the next hop may keep it from working. The feature is mainly there to avoid IP fragmentation in the network.
That's it for now; there are still plenty more NetScaler features to dig into. :)
========= Translated content end =========
The original author's English is rather rough in places, and some passages are translated quite literally, but many thanks to the author.
"Translation" NetScaler Real performance tuning