I am attempting to use machine learning to find i) good companies AND ii) the right time to buy the shares. So rather than analyse lots of different companies’ statements, trying to identify characteristics in the text, I focused on one company. A good company is not always a good investment: Solid State’s share price had trundled along at circa 60p a share for 10 years from 1997, before crashing down to 12p (March 2009) and then bouncing up to 879p (July 2015). Such a wild price range seems more than a little ironic for a company called Solid State.
Solid State plc share price (log scale; source: Stockopedia)
But the point is you can lose money in a good company if you buy at the wrong time. Conversely, identifying the turning point from the March 2009 low would have given you a 73x return in just over 6 years. Remember that’s for a listed company that makes things: battery products, electronic components and industrial computers.
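As a quick check on that arithmetic, here is a minimal sketch using only the two prices quoted above. The holding period of 6 years and 4 months is my reading of "March 2009 to July 2015" and is an assumption, so treat the annualised figure as approximate.

```python
# Prices in pence, taken from the text above.
low_price = 12.0    # March 2009 low
high_price = 879.0  # July 2015 high

# Return multiple from the low to the high.
multiple = high_price / low_price

# Assumption: "just over 6 years" is taken as 6 years and 4 months.
years = 6 + 4 / 12

# Compound annual growth rate implied by that multiple.
annualised = multiple ** (1 / years) - 1

print(f"Return multiple: {multiple:.1f}x")
print(f"Annualised return: {annualised:.0%}")
```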
Therefore it seemed worth trying to analyse the text to look for a “turning point”. Or perhaps look for what changed: was there a “catalyst” that moved the company from a low performing state to a high performing state? Financial numbers only tell you about historic performance, whereas the text might hint that after many years of mediocre performance something really rather fundamental had changed, which would see a huge future return.
I webscraped all the Solid State half year and full year results plus trading statements released by the company since 2000, 52 announcements in all. I then labelled each announcement according to the share price one month later, compared with the closing price on the day before the announcement came out:
- Down 10% or more a month later, I labelled “negative”.
- Up 10% or more a month later, I labelled “positive”.
- If trading within 10% of that closing price, higher or lower, a month later, I labelled “neutral”.
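The labelling rule above is simple enough to sketch in code, which is roughly what automating it would involve. This is a minimal illustration, not the process actually used here (the labels were applied by hand): the function name `label_announcement`, its parameters and the treatment of a move of exactly 10% are assumptions, and fetching the two prices is left out.

```python
def label_announcement(prev_close: float,
                       close_one_month_later: float,
                       threshold: float = 0.10) -> str:
    """Label an announcement by the one month share price move.

    prev_close: closing price the day before the announcement.
    close_one_month_later: closing price one month after it.
    threshold: size of the move (as a fraction) needed for a
        "positive" or "negative" label. Assumed to be inclusive.
    """
    change = close_one_month_later / prev_close - 1
    if change >= threshold:
        return "positive"
    if change <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical prices in pence, for illustration only.
print(label_announcement(100.0, 115.0))  # up 15%
print(label_announcement(100.0, 96.0))   # down 4%
print(label_announcement(100.0, 80.0))   # down 20%
```

With a function like this, labelling 5,000 statements becomes a question of getting clean price data rather than of time spent.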
I did this labelling manually, which took a lot of time. If I wanted to label 5,000 or even 50,000 statements in this way, I would need to find some way to automate this. But I figured it was worth doing a small sample by hand, to see if I could find interesting results first.
But it didn’t work. First I tried counting the most frequently occurring words in each bucket of “positive”, “neutral” and “negative” statements – no signal. Then I tried searching for relative frequencies of specific sentiment words like “difficult” or “confident”. Again not much of a signal, certainly no obvious catalyst. The problem was that the statements which caused a negative share price reaction were written in a relatively neutral tone.
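A minimal sketch of these two counting steps, not the code actually used for the analysis. The function names, the simple regex tokeniser and the toy statements are all assumptions made up for the example; real statements would be the scraped announcements grouped by label.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(statements: list[str], n: int = 10) -> list[tuple[str, int]]:
    """Most frequently occurring words across a bucket of statements."""
    counts = Counter()
    for statement in statements:
        counts.update(tokenize(statement))
    return counts.most_common(n)


def relative_frequency(statements: list[str], words: set[str]) -> dict[str, float]:
    """Share of all tokens in the bucket accounted for by each chosen word."""
    counts = Counter()
    total = 0
    for statement in statements:
        tokens = tokenize(statement)
        total += len(tokens)
        counts.update(t for t in tokens if t in words)
    return {w: (counts[w] / total if total else 0.0) for w in words}


# Toy buckets with invented sentences, for illustration only.
buckets = {
    "negative": ["Trading was difficult.", "Difficult conditions persisted."],
    "positive": ["The board is confident about the outlook."],
}
for label, statements in buckets.items():
    print(label, top_words(statements, n=3))
    print(label, relative_frequency(statements, {"difficult", "confident"}))
```

Comparing the output across buckets is the whole test: if the same words dominate every bucket, as they did here, there is no signal.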
I could try more sophisticated machine learning techniques, but actually I think my original approach might be flawed. In the same way that looking at just numbers and ignoring the text is throwing away useful information, just looking at text and removing all the numbers is equally flawed.
So I went back and read the statements from the key time period of 2008 / 2009 with my own eyes. Unlike the computer, I found that some obvious positives jumped out. In the full year statement in September 2008 the company was reporting a small increase in gross profit margins to 29.4% despite difficult conditions and revenues falling by 13%. Moreover the outlook statement noted a strong start to the year versus a very quiet comparable period 12 months earlier. The high gross margins meant that despite the decline in revenues, all three operating divisions were still trading profitably and the company was able to pay a 2p dividend (remember a few months later the shares would hit a 12p low, implying a historic yield of 17%).
This suggests to me a signal worth investigating further: try to identify companies reporting good numbers (e.g. gross margins improving to close to 30% and paying a dividend) despite some obvious negatives mentioned in the text (e.g. difficult trading and falling revenues). That is, positive numbers but negative sentiment in the text.
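To make that idea concrete, here is a rough sketch of what such a screen might look like. Everything in it is an assumption for illustration: the `Result` fields, the short list of negative words, the 1% cut-off and the example figures are invented, and a real version would need a proper sentiment lexicon and reported figures.

```python
import re
from dataclasses import dataclass

# Assumed word list; a real screen would use a fuller sentiment lexicon.
NEGATIVE_WORDS = {"difficult", "challenging", "decline", "fell", "falling", "weak"}


@dataclass
class Result:
    company: str
    gross_margin: float        # e.g. 0.294 for 29.4%
    prior_gross_margin: float  # same measure for the prior year
    dividend_per_share: float  # in pence
    text: str                  # the narrative part of the statement


def negative_word_share(text: str) -> float:
    """Fraction of word tokens that appear in NEGATIVE_WORDS."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in NEGATIVE_WORDS for t in tokens) / len(tokens)


def is_candidate(result: Result, min_negative_share: float = 0.01) -> bool:
    """Flag positive numbers combined with downbeat text."""
    numbers_ok = (result.gross_margin > result.prior_gross_margin
                  and result.dividend_per_share > 0)
    text_downbeat = negative_word_share(result.text) >= min_negative_share
    return numbers_ok and text_downbeat


# Invented example, not real company data.
example = Result("Example plc", 0.30, 0.29, 2.0,
                 "Trading conditions were difficult and revenues fell.")
print(is_candidate(example))
```

The same structure with the test reversed (upbeat text, poor numbers) would give the warning flag rather than the buy signal.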
Originally I was thinking of using text analysis to highlight contradictions, and viewed these suspiciously. For instance the company where the Chief Executive claims in the text that they are a “world class” widget basher, but the numbers show that the company barely breaks even. Either widget bashing is not a high return industry, or the company is not world class. Or HBOS management claiming that they had a “strong balance sheet” in 2008. HBOS management perhaps didn’t know what a strong balance sheet was (they were being supported by the Bank of England Special Liquidity Scheme at the time). Otherwise a more cynical interpretation is that management thought that it was OK to write misleading financial statements. Whatever the answer, the contradiction is a warning flag to steer clear.
Even without machine learning, I have learnt the hard way to avoid companies that report upbeat sentiment alongside negative numbers. This type of upbeat loss making company tends to be a “blue sky” opportunity, selling a great story about graphene, fuel cells, Georgian natural gas – probably there’s a bitcoin related company doing exactly the same thing right now.
But previously I had not thought about the mirror image contradiction: downbeat textual sentiment but resilient numbers. Intuitively this makes sense though, if you think about a company that is going through temporarily bad times, but which is fundamentally sound and will benefit hugely when conditions improve. It’s almost like applying machine learning to value investing.
Embracing the unpopular
Most machine learning algorithms I have seen attempt to exploit speed, i.e. they read the statement quicker than a human being and place a buy or a sell order before the human can.
I’m not looking to build something that relies on speed. I’m looking to exploit other human weaknesses, such as the fear of going against the herd. Being conventionally right doesn’t make you money, but being conventionally wrong is OK, because you don’t get fired. Being unconventional and right makes you lots of money, but it’s hard, because if you go against the herd and discover the herd is right, you get trampled to death. Most of my best investments (Adnams brewery in the 1990s recession, Berlin property in early 2007, Bank of Georgia 2010) have felt rather uncomfortable. A machine learning algorithm doesn’t feel uncomfortable; it doesn’t know what is shunned or unpopular.
So if my machine learning algorithm identified Solid State as
- good value
- suffering difficult trading conditions
it might have bought on the results day. If it had done, many people would have said the computer was “wrong”, because the share price immediately fell. In fact the shares halved in value from 26p on the day of the announcement to 13p half a year later. That doesn’t seem very clever. Maybe if I worked at an asset manager they would have unplugged the computer and I would have got fired. Howard Marks, who manages over $100bn of assets at Oaktree, likes to say that in finance being wrong and being too early are indistinguishable. That seems a problem that a computer wouldn’t have the answer for.
However, that problem also seems like an opportunity. As noted, that particular results statement was the turning point. When the share price hit the 12p low there was no announcement. 6 years later (July 2015) the share price reached a high of 879p.