How apropos that on Valentine’s Day Weekend, I have the privilege of penning a letter to the area of IT I love most: Performance. How many fields could call upon you at any given moment to draw from disciplines as varied as statistical modeling, storage tuning, network traffic analysis, CPU microarchitecture, queuing theory, kernel tracing, or assembly debugging? Needless to say, there is never a dull moment in the life of a Performance Engineer.
In our industry, performance is critical to survival in a way that may appear extreme to those in other tech-driven industries. Managing the performance of Belvedere Trading’s infrastructure is my primary responsibility, and as Tom DeMarco put it in the opening chapter of his book Controlling Software Projects: Management, Measurement, and Estimation (1982), “You can’t control what you can’t measure.” This blog entry outlines how we “control” the performance of our trading infrastructure.
The Foundation: Time
Very few prop firms host their entire trading ecosystem on a single host. Generally, there will be a ticker plant, a set of strategy servers, and a group of order execution gateways. Tracking common business metrics like tick-to-trade requires time-stamping events at microsecond or better granularity across the entire environment. This type of event correlation mandates accurate time synchronization. Aside from highly customized commercial variants, NTP cannot reliably maintain the microsecond-level accuracy needed. Therefore, we employ PTPv2, along with NICs that support PTPv2 hardware timestamping, to maintain end-to-end time sync. Boundary clocks enabled on intervening network devices ensure that any queuing delay is accounted for along the PTP packet’s path. PTP Grandmaster appliances from companies like Microsemi (our chosen provider) and Spectracom provide the timing source (NTP, PTP, PPS, and/or IRIG) for an entire network (See Figure 1). Pay special attention to the internal oscillator provided with your appliance, as the heat sensitivity and holdover capability of each type (e.g., TCXO, OCXO, or Rubidium) vary widely.
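To make this concrete, on a Linux host the open source linuxptp stack is one common way to consume PTPv2 with hardware timestamping. The fragment below is a minimal sketch of a ptp4l client configuration; the interface name and option choices are illustrative assumptions, not our production settings:

```ini
[global]
# Act only as a PTP slave; never advertise this host as a master
slaveOnly         1
# Use the NIC's hardware timestamping engine rather than software timestamps
time_stamping     hardware
# End-to-end delay mechanism; use P2P if your boundary clocks expect it
delay_mechanism   E2E

# Interface facing the grandmaster (name is hypothetical)
[eth0]
```

A companion daemon such as phc2sys then disciplines the system clock from the NIC’s PTP hardware clock, completing the end-to-end sync chain.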
With accurate time sync in place, network packet capture and analysis can offer valuable insight into the performance characteristics of a trading infrastructure. Commercial providers like Corvil (our chosen provider), NetScout, and SevOne offer complete solutions capable of decoding common market data and order execution protocols. Several companies roll their own solutions using capture cards from Endace or Solarflare along with open source packages like those offered at ntop.org. No matter which avenue you choose, the reporting capability must be carefully considered. Empirical studies show that system and network latency rarely exhibit a normal (Gaussian) distribution – the distribution is often bimodal and positively skewed (See Figure 2); therefore, dashboards which highlight little more than an arithmetic mean and standard deviation discard key actionable information. So ensure that your solution offers percentile reporting – e.g., median, 75%ile, 95%ile, 99%ile, 99.9%ile, and max. Last, but not least, make sure to track standard error conditions such as TCP Zero Window and Retransmits, which inflict particularly adverse effects on latency.
Common ways of inserting a network capture and analysis device into the network include switch port mirroring, optical taps, and multiport Layer 1 switches. While port mirroring is cheapest and most readily available, it works by actively copying packets to a monitoring port – this alters original packet timing, removes visibility of bad packets, and drops packets in the case of accidental oversubscription. Optical taps solve these issues by only passively monitoring traffic, adding very little latency in the process (See Figure 3). However, they degrade the signal as it passes through the device, and require the use of optical fiber for any area of the network that one wishes to tap. Multiport Layer 1 switches, such as those from Exablaze and Metamako, provide similarly low latency tap capability, but for both copper and fiber connections, and they regenerate the signal as it travels through the device. Additionally, tap configuration is dynamic, programmable, and remotely configurable. Weigh the pros and cons of each alternative to determine which best fits your network capture needs.
Lastly, the servers hosting your trading software should provide low overhead application telemetry. Operating systems such as Linux provide relatively low overhead timing calls like gettimeofday to facilitate this; though most firms use the rdtscp x86 instruction for even finer granularity and lower overhead (all modern Intel chips provide consistent and invariant TSCs for rdtscp accuracy across sockets). However, recording application metrics in an intrusive manner would render meaningless all the effort put into obtaining metrics in unobtrusive ways; therefore, make sure that you employ low latency coding principles when writing the logging portion of your trading applications. Central storage of application logs (along with system and network logs) via tools like Splunk provides indispensable visualization and reporting capabilities, as well as anomaly detection.
This blog entry provides only a surface-level overview of how Belvedere Trading currently manages the performance of its trading infrastructure. We continue to search for ways to improve our performance management capabilities; for example, we are in the process of beefing up our R&D Lab for this very purpose. Wait . . . you do have an R&D Lab, don’t you? ;-)