2017 Summer Intern Projects

The 2017 summer intern team worked on various projects to help improve testing at Belvedere. We worked both individually and as a team on our projects. Each of the projects involved understanding the business value, defining an approach, overcoming obstacles, and celebrating successes! Throughout the summer we worked with the team leads and product owners to accomplish our goals. Here is an overview of some of the main projects we tackled.


Mass Quoting

TRex is an exchange simulator, implemented by a previous intern class, that allows internal testing of trading software. Our project was to extend TRex to include the placement of mass quotes on the Chicago Mercantile Exchange (CME) simulator. Doing so involved three main components: implementing mass quote FIX messages, designing a system for Mass Quote Protections, and extending the existing interface so the frontend can communicate with the C++ backend.

The majority of the work went into implementing the FIX messages defined by CME for sending and acknowledging mass quotes and cancellations. This required an extension of the existing decoder classes to accommodate new message types and additional logic inside the simulator’s order book. We focused on reusing as much existing functionality as possible to prevent duplication of business logic. Quotes are treated as orders once inside the order book, which allowed us to avoid altering the order book data structure itself. We also refactored the existing implementation so the integration of the frontend interface was as painless as possible.
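As a rough illustration of treating quotes as orders, the sketch below expands one two-sided quote into a bid order and an ask order that an unchanged order book could accept. It is written in Python for brevity (the simulator itself is C++), and every name in it (QuoteEntry, Order, quote_to_orders) is hypothetical rather than taken from TRex. Skipping a side whose size is zero is an assumption of the sketch, not a statement about CME semantics.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QuoteEntry:
    """One two-sided quote from a mass quote message (hypothetical shape)."""
    quote_id: str
    asset_id: int
    bid_price: float
    bid_size: int
    ask_price: float
    ask_size: int


@dataclass(frozen=True)
class Order:
    """A plain order, the only thing the order book has to understand."""
    order_id: str
    asset_id: int
    side: str  # "BUY" or "SELL"
    price: float
    size: int


def quote_to_orders(entry: QuoteEntry) -> List[Order]:
    """Expand a quote into up to two orders (assumption: size 0 means no order)."""
    orders = []
    if entry.bid_size > 0:
        orders.append(Order(f"{entry.quote_id}-B", entry.asset_id, "BUY",
                            entry.bid_price, entry.bid_size))
    if entry.ask_size > 0:
        orders.append(Order(f"{entry.quote_id}-S", entry.asset_id, "SELL",
                            entry.ask_price, entry.ask_size))
    return orders
```

Because the output is ordinary orders, the order book data structure needs no new cases; only the code at the boundary knows that a quote was involved.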

Thanks to the design foresight of previous developers, a large portion of the work was relatively easy to accomplish! Implementing FIX message decoders required a single method definition and some struct forwarding. Adding frontend communication only involved hooking up interfaces to order book functions.

The most time-consuming roadblock was working with an unfamiliar codebase. Incorporating new changes without altering existing functionality proved difficult, especially considering the complexity of the logic used by the CME and the ambiguity with which parts are stated. Given that CME does not post their documentation with the intent of someone else simulating the exchange, it can be difficult to nail down precisely the semantics of some of the messages. We overcame this challenge through CME support and help from Belvedere full-time employees, each of whom had a more thorough knowledge of the exchange.

Another aspect of this project involved re-designing the TRex exchange simulator's web interface to display mass quotes. The previous version of the web application, which was created by last summer's interns, only had the ability to show orders, not mass quotes.

The web interface looks like a simple webpage, with an Excel-like table that displays an order on each row and updates in real time. However, we soon realized there was a lot more going on behind the scenes. The system has three main parts: a back-end application, written in C++; a web server, written in Python; and a webpage, written using the React and Redux frameworks. Information flows between these parts through a variety of protocols. For example, a new order is sent out from the back-end as a "slice" message, interpreted by the server as a Python object, serialized inside the server into a JSON string, and sent to the webpage as a Server-Sent Event. There, the order is de-serialized into a JavaScript object and sent through the Redux store. Finally, the order is displayed in the table for the users to view.
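The server-side step of that pipeline, turning a Python object into a JSON string and framing it as a Server-Sent Event, might look roughly like the sketch below. The OrderUpdate fields and the to_sse helper are hypothetical names chosen for illustration; only the SSE framing itself (an "event:" line, a "data:" line, and a terminating blank line) follows the standard format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class OrderUpdate:
    """Hypothetical Python-side view of an order decoded from the back end."""
    order_id: str
    asset_id: int
    side: str
    price: float
    size: int


def to_sse(event_name: str, update: OrderUpdate) -> str:
    """Serialize an update to JSON and frame it as one Server-Sent Event."""
    payload = json.dumps(asdict(update), separators=(",", ":"))
    # An SSE frame is a set of "field: value" lines ended by a blank line.
    return f"event: {event_name}\ndata: {payload}\n\n"


frame = to_sse("order", OrderUpdate("A1", 42, "BUY", 99.5, 10))
```

On the webpage, a frame like this would typically be received through an EventSource listener and parsed with JSON.parse before being dispatched to the store.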

Our work on the front end involved creating a parallel process for mass quotes. The hardest part was understanding the steps in the process well enough to replicate them. All the code was written in dynamically-typed languages, so it was sometimes hard to understand what sort of objects a function would expect to receive. In addition, the codebase relied on a number of JavaScript frameworks, each of which introduced its own set of rules and conventions. Several weeks of learning were required before we began to produce substantive work, despite the guidance provided by the exceedingly patient three full-time gurus assigned to help our team.


BTProtocol Generic Exchange

The goal of the BTProtocol Generic Exchange project was to allow users and algorithms to use TRex to simulate trading with any asset from any exchange. To accomplish this, we began by familiarizing ourselves with the TRex and gateway codebase. Then, we divided the project such that one of us implemented generic exchange support for the gateway, while the other implemented the support in TRex. In doing so, we were able to work in parallel as there were two distinct components. Also, we were able to write unit tests for each of our respective parts, as well as additional test cases.

We began by implementing the heartbeat, login, and logout logic in both TRex and the gateway. By doing so, we were able to connect the gateway to TRex over TCP using a console app. A console app is a simple interactive script that allows a user to perform basic functions, such as place, modify, and cancel orders. This was our first major success as we were able to send heartbeats back and forth every thirty seconds and stay connected until the gateway side logged out gracefully. We then implemented the main functionality of placing, modifying, and canceling orders in both TRex and the gateway. As this went well, we were quickly able to connect and test the new protocol functionality over the console app and the TRex front end. We also wrote and implemented regression tests allowing us to find and fix bugs in our protocol as well as the existing TRex code. In addition to connecting the gateway to TRex and being able to place, modify, and cancel orders, we found that as a team, we worked well together! While working in parallel, we could discuss implementation methods, which made the project much more fun (while improving code quality).
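The login, heartbeat, and logout bookkeeping described above can be pictured as a small state machine. The sketch below is a simplified Python illustration, not the BTProtocol implementation: the Session class and its method names are hypothetical, and the rule that a peer is considered dead after two silent intervals is an assumption. Only the thirty-second interval comes from the text. Time is passed in explicitly so the logic can be tested without a real clock or socket.

```python
from enum import Enum

HEARTBEAT_INTERVAL_S = 30.0  # from the text: heartbeats every thirty seconds


class State(Enum):
    LOGGED_OUT = "logged_out"
    LOGGED_IN = "logged_in"


class Session:
    """Tracks login state and heartbeat timing for one connection."""

    def __init__(self, now: float) -> None:
        self.state = State.LOGGED_OUT
        self.last_sent = now
        self.last_received = now

    def on_login(self, now: float) -> None:
        self.state = State.LOGGED_IN
        self.last_sent = self.last_received = now

    def on_logout(self) -> None:
        self.state = State.LOGGED_OUT

    def on_message(self, now: float) -> None:
        """Any inbound message, heartbeat included, shows the peer is alive."""
        self.last_received = now

    def mark_sent(self, now: float) -> None:
        """Record that we sent something (a heartbeat or any other message)."""
        self.last_sent = now

    def heartbeat_due(self, now: float) -> bool:
        """True when logged in and silent for a full interval."""
        return (self.state is State.LOGGED_IN
                and now - self.last_sent >= HEARTBEAT_INTERVAL_S)

    def peer_timed_out(self, now: float, missed: int = 2) -> bool:
        """True when the peer has been silent for `missed` whole intervals (assumed policy)."""
        return (self.state is State.LOGGED_IN
                and now - self.last_received >= missed * HEARTBEAT_INTERVAL_S)
```

A connection loop would call heartbeat_due and peer_timed_out on each tick, sending a heartbeat or dropping the connection as needed.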

Along the way, we hit a couple of major challenges. At first, we were working under the assumption that we would be able to fetch the assets from every exchange all at once while the exchange was starting up. However, doing this would require fetching nearly one million assets from the database, which was far too time-intensive. As a result, one of our team leads implemented a system in which the gateway would provide an exchange to emulate so we would only need to fetch assets from that specific exchange. Another major impediment was that the QA testing took longer than anticipated. This was largely due to uncovering multiple existing bugs in TRex. Fortunately, we were able to identify the bugs and quickly fix them, turning this roadblock into a success.
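The idea of fetching assets only for the exchange being emulated, rather than for every exchange at startup, can be sketched as a small on-demand cache. This is an illustrative Python sketch under assumptions: AssetCache and the fetch callable are hypothetical stand-ins for whatever database query the real system uses.

```python
from typing import Callable, Dict, List


class AssetCache:
    """Loads assets for one exchange on first request instead of all at startup."""

    def __init__(self, fetch: Callable[[str], List[str]]) -> None:
        # `fetch` is a hypothetical database query that returns the assets
        # for a single exchange.
        self._fetch = fetch
        self._by_exchange: Dict[str, List[str]] = {}

    def assets_for(self, exchange: str) -> List[str]:
        """Return the assets for `exchange`, querying the database at most once."""
        if exchange not in self._by_exchange:
            self._by_exchange[exchange] = self._fetch(exchange)
        return self._by_exchange[exchange]
```

With this shape, startup cost depends on the size of the one exchange the gateway asks for, not on the total number of assets in the database.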

As for next steps, we hope to get the brand new BTProtocol into the hands of real users and see how they interact with the system. This could allow us to add new features to the exchange simulator, as well as change the way we test new code.


OneTick Market Data Replay

The OneTick Market Data Replay project allows historical market data to be used when testing algorithms. All users, from developers to traders, can test algorithms against historical market data from specific time periods using existing infrastructure in our systems.

The difficulty of this project mainly revolved around design; primarily, the code had to be both reusable and (of course) ~functional~. Achieving this meant hours upon hours spent reading nebulous OneTick documentation and, when code inevitably segfaulted, posting questions to the OneTick support website. It was here that we met the best support person of all time. All time. After writing wrapper classes for OneTick, the time came to implement existing interfaces so that data could play into current systems. Now, it's just an issue of continuously refactoring the code until it's beautiful and (of course) ~functional~.

Many aspects of the OneTick project went well. We were able to successfully use the OneTick API to run queries and process data for multiple assets based on specific time ranges. We also integrated the code with the current Market Data controller interface. Finally, we went to OneTick support looking for answers and ended up finding a friend on the support team.

While at a high level the objective of the project seems straightforward, this journey has not been without its trials and tribulations. We spent a lot of time learning to navigate the massive, already existing codebase as well as the OneTick library. There have also been occasions where some people may have accidentally deleted their entire local repository. Maybe. Possible next steps for this project will be all about adding more parametrization to the queries so that we can have more control from here on and (of course) refactoring. Never stop refactoring.


Feed Comparator

The feed comparator is a debugging tool used to ensure that the data being displayed is actually correct after market data has been updated. This ensures backwards compatibility and keeps traders from trading on bad data. Comparing this data by hand is difficult because there are thousands and thousands of updates. This tool was motivated by previous bugs found during market data feed refactors.

The goal was to write a Python script that takes in two files of market data and returns a new file containing all the differences between the two. We wanted to make the error messages as specific as possible in order to be able to identify bugs in market data code quickly. We checked for missed updates, differences in ask and bid books, and skipped assets. The script relied heavily on Python dictionaries that tracked specific information for each asset such as missed and matched updates.

In general, our original strategy of storing updates in a dictionary of lists keyed by asset id number was very successful. This allowed us to delete an update from storage if it appeared on both feeds and keep track of unmatched updates. It’s important to note that while updates for a specific asset must come in order, the assets themselves may come in different orders, which is why it’s necessary to have a mechanism to store unmatched updates. The biggest roadblock was the amount of information we had to store. We tried to track missed and matched updates, as well as catch updates that were unmatched before they hit the comparator. This took many iterations of updating dictionaries and other data structures. The next step for the feed comparator is to integrate with our regression testing suite to incorporate market feeds into testing.
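A stripped-down version of the dictionary-of-lists strategy is sketched below. It is a simplified illustration, not the actual comparator: the (asset id, payload) record shape and the compare_feeds name are assumptions, the feeds are plain in-memory sequences rather than files, and checks on per-asset ordering and book contents are left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Update = Tuple[int, str]  # (asset id, update payload); a hypothetical record shape


def compare_feeds(feed_a: Iterable[Update],
                  feed_b: Iterable[Update]) -> Dict[str, List[Update]]:
    """Return the updates that appear on only one of the two feeds."""
    # Store every update from feed A in a dictionary of lists keyed by asset id.
    pending: Dict[int, List[str]] = defaultdict(list)
    for asset_id, payload in feed_a:
        pending[asset_id].append(payload)

    only_in_b: List[Update] = []
    for asset_id, payload in feed_b:
        stored = pending.get(asset_id, [])
        if payload in stored:
            stored.remove(payload)  # seen on both feeds: delete from storage
        else:
            only_in_b.append((asset_id, payload))

    # Whatever is still stored never showed up on feed B.
    only_in_a = [(asset_id, payload)
                 for asset_id, payloads in pending.items()
                 for payload in payloads]
    return {"only_in_a": only_in_a, "only_in_b": only_in_b}
```

Keying the storage by asset id is what lets the two feeds interleave assets in different orders: an update is matched against the stored updates for its own asset, no matter where it sits in the overall stream.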


Code Refactoring

One of the major takeaways from our projects was the value of refactoring code. As soon as we familiarized ourselves with our projects, we realized the struggles of working with such a large codebase. Even though the existing code was well written, as more features were added, the code became increasingly difficult to understand and to extend. This is where refactoring came into play.

At its core, code refactoring is simply making code easier to understand and more versatile. However, it’s much more complicated than it seems. When we refactor code, the goal is for someone else to be able to see exactly what the code is doing without any comments; in essence, the code should be self-documenting. We accomplished this by properly modularizing classes and functions, planning a good hierarchy for classes and interfaces, and naming variables descriptively. All of us were responsible for maintaining the codebase and ensuring that we were producing quality code.

Sometimes, more action is needed than just refactoring; in fact, at a certain point, code can spiral so far out of control that it is better to just rewrite it. This was the case for a code generator we used to generate a very lengthy C++ file that abstracted the CME FIX protocol. It was poorly written, and when it was extended to support a few more exchanges, it became an absolute disaster. Although it was fully functional, it was very difficult to read or extend. Because of this, we spent two weeks rewriting the parser from scratch and building it in such a way that adding new exchanges would not be difficult. Situations like these really highlight the importance of refactoring code in the long run.


Conclusion

Looking back at our summer in Chicago, it is clear that we had an amazing opportunity. We accomplished most of our goals, but above all, we know how our projects will be used by the employees at Belvedere. This was the most rewarding feeling! The projects were challenging, and with the support of our team we reached our goals and can leave the internship proud of our work.