Thursday, February 4, 2016

Machine learning US investment options

As we remarked in the previous post, assessing the financial promise of a given company or other market instrument is a task for a professional. An amateur may point out one or two obvious things, but will have to guess on pretty much all the major questions. So the natural thought would be to seek out professionals.

There are a few services available with regard to investment:


  1. Portfolio management. There are some calculations based on past prices and a list of assumptions, which lead to a prescription of how much money one should commit to a given type of investment. The central idea is that of diversification: not everything can fail at once. In more mathematical terms, the average of N independent random variables with equal variance has a standard deviation sqrt(N) times smaller than each one individually. This slightly improves the performance compared to the S&P 500, and there are plenty of websites that offer this kind of portfolio with the formulas hard-coded in: betterment.com, Acorn. Note that such services do not improve the expectation value of your returns; they trade expectation value for "protection from risk", in other words, an apparent reduction in the standard deviation of your returns. They spare your nerves during market crashes. Nobody knows what happens to the actual probability, if that can even be defined (see below). The nice thing about those websites is that they seem to do the rebalancing for you, which also gives a slight improvement over buy & hold (B&H) of a fixed number of shares.
  2. Mutual funds. These will ask you for a $1 commission or 0.3%, and attempt to do something that they often do not completely disclose, but that is possibly more complicated than just portfolio allocation and rebalancing. You can choose the area of the market - there are plenty of mutual funds available for each one. Then the "professionals" who manage the mutual funds decide what to buy and when. Hopefully they can not only improve over the blind allocation of portfolio managers, but also time their purchases (or play with derivatives). However, all of that zoo does not bring them an easy victory - as one sees on the websites that show mutual funds' performance, many of them jump like crazy on the scale of 1 year, and it's hard to say conclusively whether their strategy still works. Also, some of them have large barriers to entry ($100K, still not as large as for hedge funds).
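The sqrt(N) claim from the portfolio management item can be checked with a small simulation. This is only a sketch under idealized assumptions: the "assets" are independent Gaussian returns with unit variance and equal weights, which real market instruments are not, and the function name `portfolio_std` is made up for this illustration.

```python
import math
import random
import statistics


def portfolio_std(n_assets, n_trials=5000, seed=0):
    """Standard deviation of the equal-weight average of n_assets
    independent, unit-variance Gaussian returns (an idealized assumption)."""
    rng = random.Random(seed)
    averages = [
        sum(rng.gauss(0.0, 1.0) for _ in range(n_assets)) / n_assets
        for _ in range(n_trials)
    ]
    return statistics.pstdev(averages)


# Compare the simulated spread with the 1/sqrt(N) prediction.
for n in (1, 4, 16):
    print(n, round(portfolio_std(n), 3), round(1 / math.sqrt(n), 3))
```

Once the returns are correlated, the reduction is weaker than 1/sqrt(N), which is one reason the protection from risk is only apparent.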
Most of the intuition in the comparison comes from a simple random process. Such a process has a fixed mean step size M and a much larger random noise with standard deviation S per step. Let one step happen over 1 day. Then after 252 days (roughly 1 year of trading days) the ratio of the mean to the standard deviation of the total return grows to sqrt(252)*M/S. This number is (roughly) called the Sharpe ratio. The general logic is that we want to choose a portfolio with the biggest M, but almost all the time the apparent Sharpe ratio of such a portfolio will be too small to trust that choice. The mnemonic is that 1/Sharpe^2 is the number of years of data that we need to be confident in our choice. For individual instruments, as well as for most of the simple strategies, Sharpe<1, which implies that we need >1 year of uniform data to confirm that our algorithm is working at all.

But the problem is that the data on the stock market is strongly non-uniform, especially for buy & hold (B&H) strategies. The events that happen every few months strongly influence the whole market and change the sentiment about companies and other instruments. Of course, there are short-term changes as well, but the hope is that they repeat often enough for our training to take them into account. The month/year-scale events, however, tend to be unique and unpredictable. So, in short, our conclusion is that past data alone does not provide sufficient evidence for the reliability of any B&H strategy. If that argument were not enough, there is also selection bias, which is carefully accounted for in the Quantopian.com environment, but not in any of the cheesy portfolio allocation formulas.
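To make the numbers concrete, here is a minimal sketch of the Sharpe estimate and the 1/Sharpe^2 mnemonic. One way to read the mnemonic: after T years the mean of the total return is 252*T*M and its standard deviation is sqrt(252*T)*S, so their ratio is Sharpe*sqrt(T), which reaches 1 (a one-sigma effect) at T = 1/Sharpe^2. The function names and the example values M = 0.04% and S = 1% per day are assumptions chosen for illustration, and the risk-free rate is ignored.

```python
import math
import statistics

TRADING_DAYS = 252  # roughly one year of trading days


def sharpe_from_moments(mean_daily, std_daily):
    """Annualized Sharpe ratio sqrt(252)*M/S of the simple random process."""
    return math.sqrt(TRADING_DAYS) * mean_daily / std_daily


def annualized_sharpe(daily_returns):
    """Same quantity, with M and S estimated from a list of daily returns."""
    return sharpe_from_moments(
        statistics.fmean(daily_returns), statistics.stdev(daily_returns)
    )


def years_needed(sharpe):
    """Years of uniform data until the mean stands one sigma above the noise."""
    return 1.0 / sharpe ** 2


s = sharpe_from_moments(0.0004, 0.01)  # illustrative M and S, not market data
print(round(s, 3), round(years_needed(s), 2))
```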

Still, it's a good sanity check to find out what your strategy's Sharpe ratio is. It's very random, though. A somewhat more stable number is achieved via training-test splitting: optimizing the parameters on the training set and endlessly cross-validating on the test set. We expect that "speculation" strategies can be reliably assessed in this way, as they are the ones about timing the purchases, which should be a universal technique largely independent of the long-term trends of the market. But B&H strategies are not expected to provide particularly insightful numbers even after laborious cross-validation. We will cover a few results for best performer/worst performer strategies in the next post.
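A rough sketch of the training-test idea, under loud assumptions: the data is synthetic Gaussian noise rather than market prices, the timing rule (`strategy_returns`) and the candidate lookbacks are hypothetical, and a single split stands in for full cross-validation. The parameter is chosen on the training half only and then scored on the test half.

```python
import math
import random
import statistics


def sharpe(returns, days=252):
    """Annualized mean/std ratio; 0.0 if the series has no variation."""
    s = statistics.pstdev(returns)
    return 0.0 if s == 0 else math.sqrt(days) * statistics.fmean(returns) / s


def strategy_returns(returns, lookback):
    """Hypothetical timing rule: hold the asset on day t only if the sum of
    the previous `lookback` daily returns was positive, otherwise hold cash."""
    out = []
    for t in range(lookback, len(returns)):
        in_market = sum(returns[t - lookback:t]) > 0
        out.append(returns[t] if in_market else 0.0)
    return out


def pick_lookback(train, candidates):
    """Choose the lookback with the best Sharpe ratio on the training data."""
    return max(candidates, key=lambda k: sharpe(strategy_returns(train, k)))


rng = random.Random(1)
data = [rng.gauss(0.0004, 0.01) for _ in range(2000)]  # synthetic returns
split = len(data) // 2
train, test = data[:split], data[split:]

candidates = (5, 20, 60)
best = pick_lookback(train, candidates)
print("chosen lookback:", best)
print("train Sharpe:", round(sharpe(strategy_returns(train, best)), 2))
print("test Sharpe:", round(sharpe(strategy_returns(test, best)), 2))
```

Since the synthetic series carries no real timing signal, the differences between lookbacks on the training half are mostly noise, and the gap between the training and test numbers gives a feel for how random a single Sharpe estimate is.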

In conclusion, I'd like to note that the above Sharpe ratio analysis relies on the assumption of a simple random process, which is in one way definitely not true for stock market instruments. If the mean M and the standard deviation S were as they are measured from the data, a typical stock would travel a distance of order S*sqrt(N) from its origin over N days for N<252/Sharpe^2. But a real stock hardly ever travels that far. If one estimates the anticorrelation coefficient sum_{i<j} (X_i - M)(X_j - M) = 0.5*(sum_j X_j - N*M)^2 - 0.5*sum_i (X_i - M)^2, then for a stock that barely moves the first term is small and the result is close to -0.5*N*S^2, which is almost the most negative value possible...
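The identity behind the anticorrelation estimate is easy to verify numerically. The sketch below compares the direct pairwise sum with the closed form and then evaluates an artificial series that alternates +1 and -1 (an assumption chosen so that the total displacement is zero), for which the sum reaches its lower bound of -0.5*N*S^2. The function names are made up for this illustration.

```python
import random


def pair_sum(x, m):
    """Direct sum over pairs i < j of (x_i - m) * (x_j - m)."""
    n = len(x)
    return sum(
        (x[i] - m) * (x[j] - m) for i in range(n) for j in range(i + 1, n)
    )


def pair_sum_closed_form(x, m):
    """0.5 * (total displacement)^2 - 0.5 * (sum of squared deviations)."""
    displacement = sum(x) - len(x) * m
    return 0.5 * displacement ** 2 - 0.5 * sum((xi - m) ** 2 for xi in x)


rng = random.Random(0)
steps = [rng.gauss(0.0, 1.0) for _ in range(50)]
print(pair_sum(steps, 0.0), pair_sum_closed_form(steps, 0.0))

# Artificial series that returns to its origin: N = 20 steps, S = 1, M = 0.
zigzag = [1.0, -1.0] * 10
print(pair_sum(zigzag, 0.0))  # equals -0.5 * N * S^2
```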
