[I wrote this in 2015 as we worked with a hedge fund to leverage Lambda and Kinesis to calculate real-time VWAP for a set of one thousand ticker symbols. Since then, this space has exploded with opportunity!]
As an exercise to jump-start what we're doing with market data, we've set aside a number of considerations, and we're talking through the code. We should just be coding, but a little planning isn't bad either. Nissim Karpenstein contributed to this post. Nissim is Bronze Drum's expert on risk, pricing, and financial products. He was Technology Director at Hudson Bay Capital before partnering with Bronze Drum, and has worked as a financial markets applications engineer in New York City finance for the past 20 years.
Since publishing this post, and successfully transcompiling portions of QuantLib from C++ to asm.js, we've worked with a number of firms on derivatives pricing in the cloud, as well as real-time applications in social media and finance. What strikes me in 2017 is that it's still early, yet I can already see firms in financial services separating from the pack, largely by adopting cloud-scale analytics. If you've had a lot of advice around the status quo, and you'd like to speak with people who can't seem to help but think differently, contact us. We'd like to share our point of view. Since I wrote this article in 2015, a great deal has changed:
- Lambda functions now support multiple languages, including Python, Java, and C#.
- In 2016 the Amazon Kinesis team introduced Kinesis Analytics. Kinesis Analytics greatly simplifies the work of analyzing real-time data, and takes it to a place Nissim and I were working towards in 2015: real-time pattern analysis and anomaly detection. One of our customers streams thirteen million feeds into Kinesis (not all market data), and spends in excess of $8K/day on Kinesis, yet generates far more in revenue.
- In 2016 AWS announced the F1 instance type, which enables customers to launch and program Xilinx FPGAs, and enables ISVs to offer FPGA AMIs in the AWS Marketplace.
[Here's the original post]
Let's assume there are two types of functions we can write for AWS Lambda:
- Functions that take input and return output.
- Functions that operate on data in a data store and modify it or enhance it.
- If we are going to calculate bid/ask spread, we just need a simulated market data feed. Each record in that feed has a bid and an ask, and the spread is just the difference: spread = ask - bid.
- If we want to enhance a tick history with VWAP, we could keep a running total of volume and a running total of volume-weighted price, then calculate the ratio: VWAP = sum(volume * last) / sum(volume).
- If we use this data structure for equity market data, we could enhance that structure by keeping the running totals on each tick: agg_volume = previous tick agg_volume + volume; agg_vol_wtd_price = previous tick agg_vol_wtd_price + (volume * last); and VWAP = agg_vol_wtd_price / agg_volume.
- To do this we'll need to get the previous tick, which has already run through the function. If there's no previous tick we can initialize agg_volume = volume and agg_vol_wtd_price = (volume * last).
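As a rough illustration of the running totals described above, here is a minimal Python sketch. The `Tick` record layout and the function names `spread` and `enhance` are assumptions made for this example; only agg_volume, agg_vol_wtd_price, volume, and last come from the formulas in the post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tick:
    # Hypothetical record layout, assumed for illustration only.
    symbol: str
    bid: float
    ask: float
    last: float      # last traded price
    volume: int      # size of this trade
    agg_volume: int = 0
    agg_vol_wtd_price: float = 0.0
    vwap: float = 0.0


def spread(tick: Tick) -> float:
    """Bid/ask spread is just the difference between ask and bid."""
    return tick.ask - tick.bid


def enhance(tick: Tick, previous: Optional[Tick]) -> Tick:
    """Fill in the running totals and VWAP using the previous enhanced tick."""
    if previous is None:
        # No previous tick: initialize the totals from this tick alone.
        tick.agg_volume = tick.volume
        tick.agg_vol_wtd_price = tick.volume * tick.last
    else:
        tick.agg_volume = previous.agg_volume + tick.volume
        tick.agg_vol_wtd_price = (
            previous.agg_vol_wtd_price + tick.volume * tick.last
        )
    # Guard against a zero running volume (for example, a quote-only record).
    if tick.agg_volume:
        tick.vwap = tick.agg_vol_wtd_price / tick.agg_volume
    return tick


first = enhance(Tick("XYZ", bid=9.5, ask=10.0, last=10.0, volume=100), None)
second = enhance(Tick("XYZ", bid=11.5, ask=12.0, last=12.0, volume=300), first)
print(spread(first), first.vwap, second.vwap)
```

Because each enhanced tick carries its own running totals, the function only ever needs the single previous tick rather than the whole history.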
AWS Lambda Functions for Analyzing Market Data
- How do we pull the previous tick from our database?
- How do we know that it's been run through the function already?
- How do we know that we are processing the first tick of the day?
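One possible way to approach these three questions, sketched in Python under several assumptions: each tick carries a monotonically increasing sequence number `seq` and a timestamp already expressed in the exchange's time zone, and the running totals are kept in a key-value store keyed by symbol and trading date. A plain dict stands in for that store here; the function name `process` and the state fields are hypothetical.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple

# (symbol, trading date as YYYY-MM-DD) -> running state for that day.
StateKey = Tuple[str, str]
Store = Dict[StateKey, dict]


def process(store: Store, symbol: str, seq: int, ts: datetime,
            last: float, volume: int) -> Optional[float]:
    """Update the day's running totals and return VWAP.

    Returns None when the tick was already processed.
    """
    key = (symbol, ts.date().isoformat())
    state = store.get(key)  # "previous tick" lookup: one read by key

    # Already run through the function: its sequence number is not newer.
    if state is not None and seq <= state["seq"]:
        return None

    if state is None:
        # No state for this symbol and date: first tick of the day.
        agg_volume = volume
        agg_vol_wtd_price = volume * last
    else:
        agg_volume = state["agg_volume"] + volume
        agg_vol_wtd_price = state["agg_vol_wtd_price"] + volume * last

    store[key] = {
        "seq": seq,
        "agg_volume": agg_volume,
        "agg_vol_wtd_price": agg_vol_wtd_price,
    }
    return agg_vol_wtd_price / agg_volume if agg_volume else 0.0


store: Store = {}
day1 = datetime(2015, 6, 1, 9, 30)
day2 = datetime(2015, 6, 2, 9, 30)
v1 = process(store, "XYZ", 1, day1, 10.0, 100)
v2 = process(store, "XYZ", 2, day1, 12.0, 300)
replay = process(store, "XYZ", 2, day1, 12.0, 300)
v3 = process(store, "XYZ", 3, day2, 20.0, 50)
print(v1, v2, replay, v3)
```

Keying the state by trading date means the first tick of a new day finds no entry and starts fresh totals, and storing the last sequence number lets a replayed record be skipped. A real deployment would also need to handle concurrent writers and out-of-order delivery, which this sketch ignores.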