I took note of this article because I wondered at what point financial exchanges would migrate to the public cloud. That might make for an interesting Twitter debate.
Regardless of where exchanges end up, the article is worth reading because it highlights some of the limits of the fiber networks between Chicago and New York. As I update this post in 2017, a Chinese firm has attempted to purchase the Chicago Stock Exchange.
The article also brings up another aspect of data streams worth considering: today the vast majority of analysis within firms is done very slowly, by humans. While capital markets present themselves as cutting edge and high speed, everything beyond the exchange slows down. The vast majority of data analysis is still batch analysis, and because many firms run their applications in the data center, the volume of data they handle remains small.
It's true that hedge funds may be analyzing much of the same capital markets data, and it's true that alternative data sources can provide insights. But to an extent this is "Seven Minute Abs." Our point of view is that to gain a competitive advantage, firms need to re-architect for continuous processing of data at scale.
We need to give the computers more work to do. By designing a solution that continuously ingests and processes data we can task the computers with finding more of the kinds of opportunities we seek. Put the computers to work, and when they find things that are interesting we can examine the data and form a strategy.
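The contrast with batch analysis can be made concrete. A minimal sketch of "putting the computers to work" on a continuous stream: maintain running statistics incrementally and flag interesting ticks as they arrive, rather than re-scanning history in a nightly batch. The function name and thresholds here are illustrative, not from any particular product.

```python
import math

def stream_outliers(prices, threshold=3.0, warmup=10):
    """Yield (index, price) for prices more than `threshold` standard
    deviations from the running mean, computed incrementally (Welford's
    algorithm), so no batch re-scan of history is ever needed."""
    n, mean, m2 = 0, 0.0, 0.0
    for i, x in enumerate(prices):
        if n >= warmup:
            std = math.sqrt(m2 / (n - 1))
            if std > 0 and abs(x - mean) > threshold * std:
                yield i, x
        # Welford's online update: constant work per tick
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
```

A human (or a downstream strategy) then only examines the ticks the machine surfaces, instead of all of them.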
Today it's not difficult to get started with continuous data ingestion and analysis. We've onboarded customers to AWS who were up and running with 1,000-node clusters within a few weeks, including a continuous delivery pipeline for their applications and the ability to run jobs when spot compute prices offered a good discount.
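The spot-pricing decision can be reduced to a simple rule. A hedged sketch: in practice the current price would come from the EC2 `DescribeSpotPriceHistory` API (via boto3's `describe_spot_price_history`); here it is just a parameter, and the 40% cutoff is an illustrative assumption.

```python
def worth_launching(spot_price, on_demand_price, max_fraction=0.4):
    """Launch the batch job only when the spot price is at most
    `max_fraction` of the on-demand price, i.e. a discount of at
    least 60% under the default setting (an assumed policy)."""
    return spot_price <= on_demand_price * max_fraction
```

A scheduler polling prices with this kind of guard is what lets jobs run only when compute "could be had at a good discount."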
If you'd like to hear more about this capability, I've blocked out time each week to speak with new customers. You can connect with me via the "chat" icon at the bottom of each page, or schedule time on my calendar here.
The other aspect of this story I found interesting is that microwave links offer lower latency than fiber. The speed at which signals travel depends on the medium, and microwaves moving through air travel faster than light through glass fiber. Fiber routes are also often far from optimal, adding distance. It's worth noting that just a few milliseconds is considered plenty of time for arbitrage between the Chicago Exchange and the NYSE.
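A back-of-the-envelope calculation shows why those few milliseconds matter. The distance, refractive index, and route factor below are rough assumptions for illustration, not measured figures for any actual link.

```python
C_AIR_KM_S = 299_792       # roughly the vacuum speed of light; air is close
C_FIBER_KM_S = 200_000     # ~2/3 c inside glass fiber (refractive index ~1.5)
CHI_NYC_KM = 1_150         # approximate great-circle distance (assumed)
FIBER_ROUTE_FACTOR = 1.2   # assumed detour of the cable over the straight line

microwave_ms = CHI_NYC_KM / C_AIR_KM_S * 1000
fiber_ms = CHI_NYC_KM * FIBER_ROUTE_FACTOR / C_FIBER_KM_S * 1000
# One-way: ~3.8 ms by microwave vs ~6.9 ms by fiber under these assumptions;
# the gap of a few milliseconds is the arbitrage window the article describes.
```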
We wonder to what extent AWS Lambda functions at the edge of the network will be able to use algorithms generated by deep learning. We also find that technology like Athena changes the way firms think about data processing. Athena lets you store all your data in S3, which is designed for eleven nines of durability and is easy to replicate across regions. The price keeps dropping; last time I checked it was around $0.02 per GB per month.

Consider that the entire history of the Standard and Poor's index is around 10 MB uncompressed, while the NYSE TAQ tick-data set is about 12 GB per day. For many firms the issue isn't even the cost of the storage; it's the limits of their SQL database clusters. One firm we worked with spent hours each day loading, then unloading, a large SQL cluster simply because the cluster could not hold all the data. With Athena, as soon as you write the data to S3, you can query it.
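To put those figures together: at roughly $0.02 per GB-month (the price quoted above, which will have changed since this was written), a full year of TAQ data is cheap to keep online. A quick arithmetic check:

```python
GB_PER_DAY = 12              # approximate NYSE TAQ daily volume, per the text
TRADING_DAYS_PER_YEAR = 252  # typical US trading calendar
PRICE_PER_GB_MONTH = 0.02    # S3 price assumed above; subject to change

year_of_taq_gb = GB_PER_DAY * TRADING_DAYS_PER_YEAR   # ~3 TB of tick data
monthly_cost = year_of_taq_gb * PRICE_PER_GB_MONTH    # dollars per month
# ~3,024 GB costs about $60 per month to keep in S3; compare that with the
# daily load/unload cycle of a SQL cluster that cannot hold the data at all.
```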
Once you remove the enormous cost and size constraints of traditional data processing, the needs of your business determine what to process and how. In our experience, many firms suddenly realize that with cloud-scale technology like Athena they need to think bigger about the data sets they examine.