How I made $500k with machine learning and HFT (high frequency trading) | Jesse Spaulding
This post details what I did to make approximately $500k from high frequency trading from 2009 to 2010. Since I was trading completely independently and am no longer running my program, I’m happy to tell all. My trading was mostly in Russell 2000 and DAX futures contracts.
The key to my success, I believe, was not in a sophisticated financial equation but rather in the overall algorithm design which tied together many simple components and used machine learning to optimize for maximum profitability. You won’t need to know any sophisticated terminology here because when I set up my program it was all based on intuition. (Andrew Ng’s amazing machine learning course was not yet available – btw if you click that link you’ll be taken to my current project: CourseTalk, a review site for MOOCs)
First, I just want to demonstrate that my success was not simply the result of luck. My program made 1000-4000 trades per day (half long, half short) and never got into positions of more than a few contracts at a time. This meant the random luck from any one particular trade averaged out pretty fast. The result was I never lost more than $2000 in one day and never had a losing month:
And here’s a chart to give you a sense of the daily variation. Note this excludes the last 7 months because – as the figures stopped going up – I lost my motivation to enter them.
My trading background
Prior to setting up my automated trading program I’d had 2 years’ experience as a “manual” day trader. This was back in 2001 – it was the early days of electronic trading and there were opportunities for “scalpers” to make good money. I can only describe what I was doing as akin to playing a video game / gambling with a supposed edge. Being successful meant being fast, being disciplined, and having good intuitive pattern recognition abilities. I was able to make around $250k, pay off my student loans and have money left over. Win!
Over the next five years I would launch two startups, picking up some programming skills along the way. It wouldn’t be until late 2008 that I would get back into trading. With money running low from the sale of my first startup, trading offered hopes of some quick cash while I figured out my next move.
A trading API
In 2008 I was “manually” day trading futures using software called T4. I’d been wanting some customized order entry hotkeys, so after discovering T4 had an API, I took on the challenge of learning C# (the programming language required to use the API) and went ahead and built myself some hotkeys.
After getting my feet wet with the API I soon had bigger aspirations: I wanted to teach the computer to trade for me. The API provided both a stream of market data and an easy way to send orders to the exchange – all I had to do was create the logic in the middle.
Below is a screenshot of a T4 trading window. What was cool is that when I got my program working I was able to watch the computer trade on this exact same interface. Watching real orders popping in and out (by themselves with my real money) was both thrilling and scary.
The design of my algorithm
From the outset my goal was to set up a system such that I could be reasonably confident I’d make money before ever making any live trades. To accomplish this I needed to build a trading simulation framework that would – as accurately as possible – simulate live trading.
While trading in live mode required processing market updates streamed through the API, simulation mode required reading market updates from a data file. To collect this data I set up the first version of my program to simply connect to the API and record market updates with timestamps. I ended up using 4 weeks’ worth of recent market data to train and test my system on.
With a basic framework in place I still had the task of figuring out how to make a profitable trading system. As it turns out my algorithm would break down into two distinct components, which I’ll explore in turn:
- Predicting price movements; and
- Making profitable trades
Predicting price movements
Perhaps an obvious component of any trading system is being able to predict where prices will move. And mine was no exception. I defined the current price as the average of the inside bid and inside offer and I set the goal of predicting where the price would be in the next 10 seconds. My algorithm would need to come up with this prediction moment-by-moment throughout the trading day.
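As a rough illustration of that prediction target, here is a minimal Python sketch (not the original C# code). It computes the mid price from the inside bid and offer, then the price change over the following 10 seconds from a list of recorded quotes. The `Quote` layout and the function names are assumptions made up for this example.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical record layout: (timestamp in seconds, inside bid, inside offer).
Quote = Tuple[float, float, float]


def mid_price(bid: float, offer: float) -> float:
    """Current price as defined above: the average of inside bid and inside offer."""
    return (bid + offer) / 2.0


def future_move(quotes: List[Quote], i: int, horizon: float = 10.0) -> Optional[float]:
    """Mid-price change between quote i and the quote in effect `horizon` seconds later.

    Returns None when the recording ends before the horizon is reached.
    """
    times = [q[0] for q in quotes]
    target = times[i] + horizon
    if target > times[-1]:
        return None
    # The last quote at or before the target time is the one still in effect then.
    j = bisect_right(times, target) - 1
    return mid_price(quotes[j][1], quotes[j][2]) - mid_price(quotes[i][1], quotes[i][2])
```

Running `future_move` over every recorded quote gives the "what actually happened" series that each indicator can be scored against.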
Creating & optimizing indicators
I created a handful of indicators that proved to have a meaningful ability to predict short term price movements. Each indicator produced a number that was either positive or negative. An indicator was useful if more often than not a positive number corresponded with the market going up and a negative number corresponded with the market going down.
My system allowed me to quickly determine how much predictive ability any indicator had so I was able to experiment with a lot of different indicators to see what worked. Many of the indicators had variables in the formulas that produced them and I was able to find the optimal values for those variables by doing side by side comparisons of results achieved with varying values.
The indicators that were most useful were all relatively simple and were based on recent events in the market I was trading as well as the markets of correlated securities.
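To make the "more often than not" test concrete, here is a small Python sketch. It scores an indicator by how often its sign agrees with the subsequent price move, then compares a few candidate values of one indicator variable side by side. The `momentum` indicator is a stand-in invented for this example; the post does not say which indicators were actually used.

```python
from typing import List, Sequence


def hit_rate(indicator: Sequence[float], moves: Sequence[float]) -> float:
    """Fraction of samples where the indicator's sign matches the later move's sign.

    Samples where either value is zero carry no directional information and are skipped.
    """
    agree = total = 0
    for x, m in zip(indicator, moves):
        if x == 0 or m == 0:
            continue
        total += 1
        if (x > 0) == (m > 0):
            agree += 1
    return agree / total if total else 0.5


def momentum(prices: Sequence[float], lookback: int) -> List[float]:
    """Hypothetical indicator: price change over the previous `lookback` samples."""
    return [prices[i] - prices[i - lookback] if i >= lookback else 0.0
            for i in range(len(prices))]


def best_lookback(prices: Sequence[float], moves: Sequence[float],
                  candidates: Sequence[int]) -> int:
    """Side-by-side comparison: the candidate value with the highest hit rate."""
    return max(candidates, key=lambda n: hit_rate(momentum(prices, n), moves))
```

A hit rate near 0.5 means the indicator says nothing about direction. The sign agreement is only a first filter, since it ignores how large the moves are.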
Making exact price move predictions
Having indicators that simply predicted an up or down price movement wasn’t enough. I needed to know exactly how much price movement was predicted by each possible value of each indicator. I needed a formula that would convert an indicator value to a price prediction.
To accomplish this I tracked predicted price moves in 50 buckets that depended on the range that the indicator value fell in. This produced unique predictions for each bucket that I was then able to graph in Excel. As you can see the expected price change increases as the indicator value increases.
Based on a graph such as this I was able to make a formula to fit the curve. In the beginning I did this “curve fitting” manually but I soon wrote up some code to automate this process.
Note that not all the indicator curves had the same shape. Also note the buckets were logarithmically distributed so as to spread the data points out evenly. Finally note that negative indicator values (and their corresponding downward price predictions) were flipped and combined with the positive values. (My algorithm treated up and down exactly the same.)
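The bucketing step might look roughly like the following Python sketch. It assumes the buckets are spaced evenly in log space over the indicator's magnitude and that negative indicator values are folded onto the positive side by flipping the sign of the observed move. The function names and the choice to clamp out-of-range values into the end buckets are assumptions for illustration.

```python
import math
from bisect import bisect_right
from typing import List, Optional, Sequence


def log_bucket_edges(lo: float, hi: float, n_buckets: int = 50) -> List[float]:
    """n_buckets + 1 edges spaced evenly in log space between lo and hi (both > 0)."""
    step = (math.log(hi) - math.log(lo)) / n_buckets
    return [math.exp(math.log(lo) + k * step) for k in range(n_buckets + 1)]


def bucket_means(indicator: Sequence[float], moves: Sequence[float],
                 edges: Sequence[float]) -> List[Optional[float]]:
    """Average observed move per bucket of |indicator|, with up and down folded together.

    A negative indicator value contributes its move with the sign flipped, so a
    correct "down" call counts the same as a correct "up" call. Empty buckets are None.
    """
    n = len(edges) - 1
    sums = [0.0] * n
    counts = [0] * n
    for x, m in zip(indicator, moves):
        if x == 0:
            continue
        folded = m if x > 0 else -m
        # Values outside the edge range are clamped into the first or last bucket.
        k = max(0, min(bisect_right(edges, abs(x)) - 1, n - 1))
        sums[k] += folded
        counts[k] += 1
    return [sums[k] / counts[k] if counts[k] else None for k in range(n)]
```

The list returned by `bucket_means` is the set of points one would plot against the bucket midpoints before fitting a curve through them.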
Combining indicators for a single prediction
An important thing to consider was that each indicator was not entirely independent. I couldn’t simply add up all the predictions that each indicator made individually. The key was to figure out the additional predictive value that each indicator had beyond what was already predicted. This wasn’t too hard to implement but it did mean that if I was “curve fitting” multiple indicators at the same time I had to be careful; changing one would affect the predictions of another.
In order to “curve fit” all of the indicators at the same time I set up the optimizer to step only 30% of the way towards the new prediction curves with each pass. With this 30% jump I found that the prediction curves would stabilize within a few passes.
With each indicator now giving us its additional price prediction I could simply add them up to produce a single prediction of where the market would be in 10 seconds.
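One way to read the 30% stepping is as a damped, simultaneous refit of per-bucket prediction tables. The Python sketch below follows that reading and is an interpretation, not the original implementation. Each indicator's curve is refit to whatever part of the move the other indicators have not yet explained, and every curve then moves only `step` of the way toward its new target. The data layout (`buckets[j][t]` holding the bucket index of indicator `j` at sample `t`) is an assumption.

```python
from typing import List, Sequence


def fit_additive(buckets: List[List[int]], moves: Sequence[float], n_buckets: int,
                 passes: int = 20, step: float = 0.3) -> List[List[float]]:
    """Fit curves[j][b]: the extra predicted move from indicator j in bucket b."""
    n_ind = len(buckets)
    curves = [[0.0] * n_buckets for _ in range(n_ind)]
    for _ in range(passes):
        # Combined prediction for every sample under the current curves.
        totals = [sum(curves[j][buckets[j][t]] for j in range(n_ind))
                  for t in range(len(moves))]
        targets = []
        for j in range(n_ind):
            sums = [0.0] * n_buckets
            counts = [0] * n_buckets
            for t, m in enumerate(moves):
                b = buckets[j][t]
                # What is left to explain once the other indicators have had their say.
                residual = m - (totals[t] - curves[j][b])
                sums[b] += residual
                counts[b] += 1
            targets.append([sums[b] / counts[b] if counts[b] else curves[j][b]
                            for b in range(n_buckets)])
        # Move every curve only part of the way toward its new target.
        for j in range(n_ind):
            for b in range(n_buckets):
                curves[j][b] += step * (targets[j][b] - curves[j][b])
    return curves


def predict(curves: List[List[float]], sample_buckets: Sequence[int]) -> float:
    """Single combined prediction: the sum of each indicator's contribution."""
    return sum(curves[j][b] for j, b in enumerate(sample_buckets))
```

In this sketch, two identical indicators fitted with `step=1.0` flip between over- and under-predicting on alternate passes, while `step=0.3` settles with the pair sharing the prediction between them rather than counting it twice.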
Why predicting prices is not enough
You might think that with this edge on the market I was golden. But you need to keep in mind that the market is made up of bids and offers – it’s not just one market price. Success in high frequency trading comes down to getting good prices and it’s not that easy.
The following factors make creating a profitable system difficult:
- With each trade I had to pay commissions to both my broker and the exchange.
- The spread (difference between highest bid and lowest offer) meant that if I were to simply buy and sell randomly I’d be losing a ton of money.
- Most of the market volume was other bots that would only execute a trade with me if they thought they had some statistical edge.
- Seeing an offer did not guarantee that I could buy it. By the time my buy order got to the exchange it was very possible that that offer would have been cancelled.
- As a small market player there was no way I could compete on speed alone.
Building a full trading simulation
So I had a framework that allowed me to backtest and optimize indicators. But I had to go beyond this – I needed a framework that would allow me to backtest and optimize a full trading system; one where I was sending orders and getting in positions. In this case I’d be optimizing for total P&L and to some extent average P&L per trade.
This would be trickier and in some ways impossible to model exactly but I did as best as I could. Here are some of the issues I had to deal with:
- When an order was sent to the market in simulation I had to model the lag time. The fact that my system saw an offer did not mean that it could buy it straight away. The system would send the order, wait approximately 20 milliseconds and then only if the offer was still there was it considered as an executed trade. This was inexact because the real lag time was inconsistent and unreported.
- When I placed bids or offers I had to look at the trade execution stream (provided by the API) and use those to gauge when my order would have gotten executed against. To do this right I had to track the position of my order in the queue. (It’s a first-in first-out system.) Again, I couldn’t do this perfectly but I made a best approximation.
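The two points above can be sketched in Python as follows. This is a simplified model under stated assumptions: the lag is a fixed 20 milliseconds, the quote snapshots and trade prints come from the recorded data file, and cancellations by orders ahead in the queue are not modelled. `buy_fills_after_lag` and `QueuedBid` are hypothetical names.

```python
from bisect import bisect_right
from typing import List, Tuple


def buy_fills_after_lag(snapshots: List[Tuple[float, float]], send_time: float,
                        price: float, lag: float = 0.020) -> bool:
    """Does a buy at `price` fill, given time-sorted (timestamp, best_offer) snapshots?

    The order only counts as executed if an offer at or below `price` is still
    showing when the order arrives, `lag` seconds after it was sent.
    """
    times = [s[0] for s in snapshots]
    k = bisect_right(times, send_time + lag) - 1
    if k < 0:
        return False
    return snapshots[k][1] <= price


class QueuedBid:
    """A resting bid with an estimate of how many contracts sit ahead of it (FIFO)."""

    def __init__(self, price: float, size: int, ahead: int) -> None:
        self.price = price
        self.remaining = size
        self.ahead = ahead

    def on_trade(self, trade_price: float, trade_size: int) -> int:
        """Apply one trade print; return how many of our contracts it filled."""
        if trade_price > self.price or self.remaining == 0:
            return 0
        if trade_price < self.price:
            # The market traded through our price, so assume our level was cleared.
            self.ahead = 0
            filled = self.remaining
        else:
            # Trades at our price first consume the contracts queued ahead of us.
            consumed = min(self.ahead, trade_size)
            self.ahead -= consumed
            filled = min(self.remaining, trade_size - consumed)
        self.remaining -= filled
        return filled
```

Ignoring cancellations makes this estimate pessimistic about queue position, which is one of the places where a comparison against live logs would be needed to calibrate the model.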
To refine my order execution simulation what I did was take my log files from live trading through the API and compare them to log files produced by simulated trading from the exact same time period. I was able to get my simulation to the point that it was pretty accurate and for the parts that were impossible to model exactly I made sure to at least produce outcomes that were statistically similar (in the metrics I thought were important).
Making profitable trades
With an order simulation model in place I could now send orders in simulation mode and see a simulated P&L. But how would my system know when and where to buy and sell?
The price move predictions were a starting point but not the whole story. What I did was create a scoring system for each of 5 price levels on the bid and offer. These included one level above the inside bid (for a buy order) and one level below the inside offer (for a sell order).
If the score at any given price level was above a certain threshold that would mean my system should have an active bid/offer there – below the threshold then any active orders should be cancelled. Based on this it was not uncommon that my system would flash a bid in the market then immediately cancel it. (Although I tried to minimize this as it’s annoying as heck to anyone looking at the screen with human eyes – including me.)
The price level scores were calculated based on the following factors:
- The price move prediction (that we discussed earlier).
- The price level in question. (Inner levels meant greater price move predictions were required.)
- The number of contracts in front of my order in the queue. (Fewer was better.)
- The number of contracts behind my order in the queue. (More was better.)
Essentially these factors served to identify “safe” places to bid/offer. The price move prediction alone was not adequate because it did not account for the fact that when placing a bid I was not automatically filled – I only got filled if someone sold to me there. The reality was that the mere fact of someone selling to me at a certain price changed the statistical odds of the trade.
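The post does not give the scoring formula, so the Python sketch below is purely illustrative. It combines the four factors linearly for a bid and compares the score with a threshold to decide whether an order should be resting at that level. Every weight, the `ScoreParams` fields, and the linear form itself are assumptions.

```python
from dataclasses import dataclass


@dataclass
class LevelState:
    depth: int   # 0 = most aggressive level, larger = further from the inside
    ahead: int   # contracts queued in front of the order
    behind: int  # contracts queued behind the order


@dataclass
class ScoreParams:
    # Hypothetical values; in the post these variables were tuned by the optimizer.
    inner_penalty: float = 0.5
    depth_relief: float = 0.1
    ahead_weight: float = 0.002
    behind_weight: float = 0.001
    threshold: float = 0.0


def bid_score(predicted_move: float, level: LevelState, p: ScoreParams) -> float:
    """Higher is safer: a larger predicted up move, fewer ahead, more behind."""
    # Inner levels demand a larger predicted move before bidding there.
    required = max(p.inner_penalty - p.depth_relief * level.depth, 0.0)
    return (predicted_move - required
            - p.ahead_weight * level.ahead
            + p.behind_weight * level.behind)


def desired_action(score: float, has_order: bool, p: ScoreParams) -> str:
    """Place an order above the threshold, cancel below it, otherwise leave things alone."""
    if score > p.threshold and not has_order:
        return "place"
    if score <= p.threshold and has_order:
        return "cancel"
    return "hold"
```

An offer-side score would mirror this with the sign of the predicted move flipped, in keeping with the algorithm treating up and down the same way.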
The variables used in this step were all subject to optimization. This was done in the exact same way as I optimized variables in the price move indicators except in this case I was optimizing for bottom line P&L.
What my program ignored
When trading as humans we often have powerful emotions and biases that can lead to less than optimal decisions. Clearly I did not want to codify these biases. Here are some factors my system ignored:
- The price at which a position was entered – In a trading office it’s pretty common to hear conversation about the price at which someone is long or short as if that should affect their future decision making. While this has some validity as part of a risk reduction strategy it really has no bearing on the future course of events in the market. Therefore my program completely ignored this information. It’s the same concept as ignoring sunk costs.
- Going short vs. exiting a long position – Typically a trader would have different criteria that determine where to sell a long position versus where to go short. However from my algorithm’s perspective there was no reason to make a distinction. If my algorithm expected a downward move, selling was a good idea regardless of whether it was currently long, short, or flat.
- A “doubling up” strategy – This is a common strategy where traders will buy more stock in the event that their original trade goes against them. This results in your average purchase price being lower and it means when (or if) the stock turns around you’ll be set to make your money back in no time. In my opinion this is really a horrible strategy unless you’re Warren Buffett. You’re tricked into thinking you are doing well because most of your trades will be winners. The problem is when you lose you lose big. The other effect is it makes it hard to judge if you actually have an edge on the market or are just getting lucky. Being able to monitor and confirm that my program did in fact have an edge was an important goal.
Since my algorithm made decisions the same way regardless of where it entered a trade or if it was currently long or short it did occasionally sit in (and take) some large losing trades (in addition to some large winning trades). But, you shouldn’t think there wasn’t any risk management.
To manage risk I enforced a maximum position size of 2 contracts at a time, occasionally bumped up on high volume days. I also had a maximum daily loss limit to safeguard against any unexpected market conditions or a bug in my software. These limits were enforced in my code but also in the backend through my broker. As it happened I never encountered any significant problems.
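A pre-trade check along those lines could be sketched in Python as below. The 2-contract cap comes from the post. The daily loss figure is a placeholder, since the actual limit is not stated, and the rule that only position-reducing orders are allowed once the limit is hit is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RiskLimits:
    max_position: int = 2           # contracts, long or short (from the post)
    max_daily_loss: float = 1000.0  # placeholder; the real limit is not given


def order_allowed(position: int, side: str, qty: int,
                  daily_pnl: float, limits: RiskLimits) -> bool:
    """Pre-trade check: `position` is signed (+ long, - short); `side` is "buy" or "sell"."""
    new_position = position + qty if side == "buy" else position - qty
    if abs(new_position) > limits.max_position:
        return False
    if daily_pnl <= -limits.max_daily_loss:
        # Past the loss limit, only orders that shrink the position get through.
        return abs(new_position) < abs(position)
    return True
```

A check like this in the program is only one layer; the limits enforced at the broker still apply if the program itself misbehaves.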
Running the algorithm
From the moment I started working on my program it took me about 6 months before I got it to the point of profitability and began running it live. Although to be fair a significant amount of that time was spent learning a new programming language. As I worked to improve the program I saw increased profits for each of the next four months.
Each week I would retrain my system based on the previous 4 weeks’ worth of data. I found this struck the right balance between capturing recent market behavioral trends and ensuring my algorithm had enough data to establish meaningful patterns. As the training began taking more and more time I split it out so that it could be performed by 8 virtual machines using Amazon EC2. The results were then coalesced on my local machine.
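The retraining schedule can be sketched in Python as follows. It picks the trading days in the 4 weeks before a retrain date and deals a list of work items out across 8 workers. What exactly was split across the virtual machines is not stated in the post, so `split_jobs` is generic, and exchange holidays are not handled.

```python
from datetime import date, timedelta
from typing import List, Sequence


def training_days(retrain_day: date, weeks: int = 4) -> List[date]:
    """Weekdays in the `weeks` weeks before `retrain_day` (which is itself excluded)."""
    d = retrain_day - timedelta(weeks=weeks)
    days = []
    while d < retrain_day:
        if d.weekday() < 5:  # Monday=0 .. Friday=4
            days.append(d)
        d += timedelta(days=1)
    return days


def split_jobs(items: Sequence, n_workers: int = 8) -> List[list]:
    """Deal work items round-robin into one list per worker."""
    return [list(items[k::n_workers]) for k in range(n_workers)]
```

Round-robin dealing keeps the per-worker loads within one item of each other, which matters when the slowest worker sets the total training time.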
The high point of my trading was October 2009 when I made almost 100k. After this I continued to spend the next four months trying to improve my program despite decreased profit each month. Unfortunately by this point I guess I’d implemented all my best ideas because nothing I tried seemed to help much.
With the frustration of not being able to make improvements and not having a sense of growth I began thinking about a new direction. I emailed 6 different high frequency trading firms to see if they’d be interested in purchasing my software and hiring me to work for them. Nobody replied. I had some new startup ideas I wanted to work on so I never followed up.
UPDATE – I posted this on Hacker News and it has gotten a lot of attention. I just want to say that I do not advocate anyone trying to do something like this themselves now. You would need a team of really smart people with a range of experiences to have any hope of competing. Even when I was doing this I believe it was very rare for individuals to achieve success (though I had heard of others.)
There is a comment at the top of the page that mentions “manipulated statistics” and refers to me as a “retail investor” that quants would “gleefully pick off”. This is a rather unfortunate comment that’s simply not based in reality. Setting that aside there are some interesting comments: http://news.ycombinator.com/item?id=4748624
from Hacker News 200: http://jspauld.com/post/35126549635/how-i-made-500k-with-machine-learning-and-hft?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+hacker-news-feed-200+%28Hacker+News+200%29