Stock Exchange and Testing Real-Time Critical Services
Next: Testing in Physical World
TL;DR Chaos engineering cannot be applied in all cases. For mission-critical, complex systems that cannot be tested during live operation, testing can be performed while they are not operational. For systems that are always online, partial experiments are the only option. TechRank can be used to pinpoint the areas most in need of verification.
Testing Resilience
Not all complex and sensitive systems can be tested in production. For example, chaos engineering principles cannot be applied directly to financial markets and stock exchanges. It’s not possible to falsify, drop or arbitrarily delay bids; doing so would cause some participants to lose great deals of money.
However, if a complex system is not constantly tested, its fragility grows with time until a complete collapse in some unpredictable way becomes not only possible but likely. Recovery from such a collapse can take a long time, or may leave permanent marks on the overall economy and on human lives.
Modern stock exchanges and financial markets are mostly driven by automated computer programs. Increasingly, machine learning and other artificial intelligence models are used to make decisions on behalf of a myriad of independent operators. How could such a complex, interconnected, ever-changing system consisting of many separate entities be tested?
Building a complete copy where tests are executed would be very expensive. Automated trading is also a latency race where even the length of each cable affects which bid from which robot wins. That’s why actors co-locate their computers next to the stock exchange machines. A test environment would have to be an exact copy for any meaningful results. Add to that, the system changes all the time with new physical installations and new software updates. Not doable.
A better approach is that during the night, when the stock exchange is closed, the real system is tested with different failure scenarios. The stock exchange computers and all interlinked machines participate in a simulation where the day’s events are replayed, but bid prices are altered and additional events are added to the mix. Additional events could be news about major fraud at big banks, great swings in the availability and price of key raw materials, trade wars being declared, false nuclear alerts, an unlikely performance artist being elected president of a major country, and so on. What effect, if any, might such events, or sequences of such events, have? Will the markets whistle all the way to the bottom, will nothing happen, or will something quite unexpected occur?
These training sessions are a way of smoking out vulnerabilities in the overall system. When the system loses stability, the events before and after can be studied, and ways to mitigate or recover can be tested in a rerun. At a minimum, this reveals what these scenarios are and when markets should be temporarily suspended. The participating AI-based robot traders can use the results as further training data. As an end result, we are constantly teaching a complex and changing system how to mitigate or prevent crashes.
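The nightly replay idea can be sketched in a few lines. This is a hypothetical toy harness, not a real exchange simulator: the functions, field names and thresholds are all invented for illustration. Recorded events are replayed with perturbed prices, shock events (fraud news, raw-material squeezes) are injected, and the resulting price path is checked for instability.

```python
import random

# Toy sketch of a nightly replay harness (all names and thresholds are
# illustrative assumptions, not a real trading system's API).

def replay_with_shocks(day_events, shocks, price_noise=0.05, seed=42):
    """Replay recorded price events, altered by noise and injected shocks."""
    rng = random.Random(seed)
    timeline = sorted(day_events + shocks, key=lambda e: e["time"])
    prices = []
    shock_factor = 1.0
    for event in timeline:
        if event.get("kind") == "shock":
            # A shock event scales all subsequent prices by its impact.
            shock_factor *= event["impact"]
        else:
            noisy = event["price"] * shock_factor * (1 + rng.uniform(-price_noise, price_noise))
            prices.append(noisy)
    return prices

def is_unstable(prices, max_drawdown=0.3):
    """Flag a run whose peak-to-trough drop exceeds the allowed drawdown."""
    peak, worst = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        worst = max(worst, (peak - p) / peak)
    return worst > max_drawdown

day = [{"time": t, "price": 100.0} for t in range(100)]
crash = [{"time": 50, "kind": "shock", "impact": 0.6}]
print(is_unstable(replay_with_shocks(day, crash)))  # True: a 40% shock trips the drawdown check
print(is_unstable(replay_with_shocks(day, [])))     # False: noise alone stays within bounds
```

A run that trips the instability check is exactly the kind of scenario worth studying in a rerun, as described above.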
Several analogies come to mind.
Human societies have always been complex, interconnected structures. Perhaps gossip, fairy tales, magic tricks, theatre and various small swindles are the ways in which we test ourselves in production. They teach us to take everything with a grain of salt, and most people learn to avoid the bigger frauds. The earlier ‘Fraud Corps’ concept is based on this insight.
There is a perhaps weaker analogy between the proposed way of running simulations on stock market computers and the human brain. If you read a good book, watch a sci-fi or fantasy movie, or narrowly escape an accident, the likelihood that you’ll relive a similar experience the following night is much higher than on any other night. This is the brain’s way of running nightly simulations that prepare the mind for various unlikely scenarios so they can be overcome. It prepares the wiring in the brain for the eventuality that the monsters below your bed one day turn out to be real.
A different hypothesis for dreams comes from an observation in machine learning. There, the problem is that with small datasets the trained models can become too good at predicting results that exactly match the data already seen, which makes them unable to function in new situations. This failure is called overfitting. A number of mechanisms are used to prevent it, such as augmenting the training images by generating new but slightly warped ones (cropped, stretched, mirrored), or adding forgetfulness to the model by dropping a small number of connections between neural nodes on each training iteration (called dropout).
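The two regularisation tricks mentioned above can be illustrated with minimal, library-free sketches; real frameworks such as Keras or PyTorch provide both built in, and the specific warps and rates here are arbitrary choices for demonstration.

```python
import random

def augment(image, rng):
    """Make a slightly warped copy of a 2D image: mirror it or shift it."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]        # horizontal mirror
    return [row[-1:] + row[:-1] for row in image]  # circular 1-pixel shift

def dropout(activations, rate, rng):
    """Zero a random fraction of activations, scaling survivors so the
    expected sum is unchanged (so-called inverted dropout)."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
img = [[1, 2, 3], [4, 5, 6]]
print(augment(img, rng))               # a warped copy of the original
print(dropout([1.0] * 10, 0.3, rng))   # roughly 3 of 10 activations zeroed
```

Both tricks inject controlled noise so the model cannot memorise the training set verbatim, which is exactly the role the dream hypothesis assigns to hallucinatory sleep.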
The hypothesis is that without wonky and often hallucinatory dreams, the brain would learn the day’s activities too well and fail to generalise for later life.
https://www.cell.com/patterns/fulltext/S2666-3899(21)00064-7
Unrelated side note: Of course, sleep can have multiple functions at the same time.
https://www.scientificamerican.com/article/deep-sleep-gives-your-brain-a-deep-clean1/
Studies have found that specialised immune cells are more active in the brain during sleep, and that cerebrospinal fluid (the fluid in the brain and spinal cord) washes in and out of our brains during sleep, clearing them of metabolic residue that, when allowed to accumulate, damages the cells. Sleep works for the rest of the body too: during so-called deep sleep, blood pressure drops and the brain goes into a kind of hibernation, releasing extra blood supply, oxygen and nutrients that the body can use for healing and growth elsewhere.
As a summary: as the amount of automation and use of machine learning models in financial markets increases, and more players are added with more interconnections and dependencies, the resulting system becomes more and more complex. Finally it becomes necessary for stock exchanges and other dynamic trading platforms to start seeing dreams during off-hours in order to remain operational during the daytime.
Hallucinations and bad input prevent overfitting.
Scenario Selection
What type of scenarios should then be used?
It would make sense to try to identify the areas of the system where small changes have the biggest impact on the overall system. In ecosystems these are called keystone species. Are there similar keystone industries, raw materials, key suppliers, key technologies or companies whose fall would have a devastating impact on the overall system? How would one identify them?
The TechRank methods presented in the previous chapter are ways of discovering the most highly connected technologies and raw materials. Large, unexpected swings in the supply or demand of these have the greatest effect on the overall market, so they should be the focus of testing. Simulation results will show the strategic points where, to reduce the fragility of the system, one should try to move to alternative materials or composites, or to develop recycling technologies that keep reusing materials already in circulation.
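The TechRank method itself is defined in the previous chapter; as a generic illustration only, a PageRank-style iteration over a small, invented technology-dependency graph surfaces the shared inputs that many products depend on, which are natural keystone candidates for simulation scenarios.

```python
# Illustrative sketch, not the actual TechRank algorithm: products pass
# ranking weight to the technologies and raw materials they depend on.

def rank(edges, damping=0.85, iters=50):
    """edges: dict mapping each node to the list of nodes it depends on."""
    nodes = set(edges) | {d for deps in edges.values() for d in deps}
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n, deps in edges.items():
            for d in deps:  # n distributes its weight to its dependencies
                new[d] += damping * score[n] / len(deps)
        score = new
    return sorted(score, key=score.get, reverse=True)

deps = {
    "smartphones": ["lithium", "semiconductors"],
    "electric_cars": ["lithium", "semiconductors"],
    "data_centres": ["semiconductors"],
}
print(rank(deps)[:2])  # ['semiconductors', 'lithium'] — the shared inputs rank highest
```

Semiconductors come first because all three products depend on them; this is the kind of node whose supply shock should be inflated in the nightly simulations.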
Nations may also think about repurchasing companies in these areas, should they have fallen into the hands of adversarial nations.
As a second option, rather than attempting complex analyses of an extremely interconnected ecosystem, a much simpler mechanism is to take some events from the day’s messages and inflate them out of proportion during the simulation. If the simulation results in unexpectedly large swings or changes, we have found a weak spot. Alternatively, we can completely remove some set of events, chosen by different criteria, just to test whether they have any material impact.
One can also use indirect signals. It is reasonable to estimate that many of the keystone companies and industries are mentioned relatively often in news and other stock information messages, indicating that they are well connected. So, pick some companies that are well represented in the news and inflate the news about them to see the results.
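The inflate-or-drop scenario generator described above can be sketched as two small transformations. The event schema and `magnitude` field are invented for illustration; a real feed would carry richer structure.

```python
# Hypothetical scenario generators: exaggerate a chosen subset of the
# day's news events, or remove it entirely, before feeding the nightly run.

def inflate(events, pick, factor):
    """Scale the magnitude of events matching a predicate."""
    return [dict(e, magnitude=e["magnitude"] * factor) if pick(e) else e
            for e in events]

def drop(events, pick):
    """Remove events matching a predicate, to test their material impact."""
    return [e for e in events if not pick(e)]

news = [
    {"topic": "chip_shortage", "magnitude": 0.2},
    {"topic": "earnings_beat", "magnitude": 0.1},
]
about_chips = lambda e: e["topic"] == "chip_shortage"
print(inflate(news, about_chips, factor=10))  # chip news blown out of proportion
print(drop(news, about_chips))                # chip news removed entirely
```

Running the simulation on both perturbed feeds and comparing the outcomes against the unmodified replay shows whether the chosen events matter at all.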
Partition Method
What if stock exchanges or similar systems need to work 24*7, i.e. never shut down? In that case, scenario testing needs to be done on smaller subsets. Parts of the system are taken “offline” for a while during slow hours when the load is smaller, and simulations are run on the offline part to see how that entity behaves. This is similar to, for example, marine mammals’ sleep: they can switch off half of their brain for a few minutes to have dreams while the other half stays awake and takes the animal regularly to the surface for breathing.
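The rotation behind this partition method can be sketched as a simple round-robin schedule. The partition and window names are invented placeholders: each slow-hour window, one partition is taken offline and simulated while the rest keep serving, so every partition eventually gets its turn to dream.

```python
from itertools import cycle

# Illustrative round-robin scheduler for partial offline testing of a
# 24*7 system (partition and window names are hypothetical).

def schedule(partitions, windows):
    """Assign one partition per slow-hour window, cycling through all."""
    rotation = cycle(partitions)
    return [(w, next(rotation)) for w in windows]

plan = schedule(["shard-A", "shard-B", "shard-C"],
                ["Mon 02:00", "Tue 02:00", "Wed 02:00", "Thu 02:00"])
print(plan)  # shard-A comes up again on Thursday after the cycle wraps
```

Like the dolphin's alternating hemispheres, the schedule guarantees that no partition goes untested for long while the system as a whole never stops breathing.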
We’ve now covered a few different options for testing services in the future. There are still services and structures that cannot be messed with in any manner from the outside. We’ll cover those in the next post.