Small Data is More Important than Big Data

Published on: 18 July, 2019

In the context of investing, the current fad for big data and machine learning seems overblown. Successful investing entails understanding basic probabilities well, and whilst machines can be used to crunch these probabilities, to date the old-fashioned method of reading annual reports to calculate them has worked pretty well.

“Small data is big data in disguise. The reason we can often make good predictions from a small number of observations…is that our priors [prior experiences] are so rich. Whether we know it or not, we appear to carry around in our heads, surprisingly accurate priors about movie grosses,…poem lengths, political terms of office…and human life spans. We don’t need to gather them explicitly; we absorb them from the world.” – Brian Christian & Tom Griffiths in ‘Algorithms to Live By: The Computer Science of Human Decisions’ (2016) [square brackets are our insertions]

The Big Data craze on the London Underground
Last week, whilst travelling on the London Underground, I was whiling away a train journey reading the adverts pasted on the walls of the carriage. When I saw that most of the adverts for professional education courses were centred around Big Data (e.g. “Learn Python in 7 days” or “Masters in Big Data” or “Advanced Courses in Statistics”, etc.), I couldn’t help remembering how 20 years ago the same adverts on the London Underground trains were for courses in Java, web design or graphic design.

At that time, as a freshly minted graduate, I had gone to my then manager, Steve Norton, and asked him for time off to learn Java. To my chagrin, Steve sat me down and gave me a lecture on why I needed to focus on understanding how to analyse businesses and corporate strategy rather than joining the web bandwagon. Twenty years on, as I see today’s graduates climb onto the Big Data bandwagon, I realise how right Steve was.

Steve’s insight is as apt today as it was 20 years ago: small data, i.e. an understanding of how companies and industries operate, is for investors far more powerful than any amount of big data crunching. Let me illustrate with an example.

The world reveals itself to us through small data
Let us begin with a simple game. Assume that we have two buckets: A & B. Each bucket has balls of two colours: red and black. Each of us gets one chance to pick one ball from one bucket (we can choose which bucket to pick from). Whoever picks the first black ball wins the game.

Bucket A has 100 balls of which 90 are red and the rest are black. Bucket B has 1000 balls of which 990 are red and the rest are black.

If the goal is to pick a black ball then obviously you should pick from Bucket A, since the probability of drawing black is 10% in Bucket A versus 1% in Bucket B. In other words, you are 10x more likely to win the game if you focus on Bucket A than on Bucket B.
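
For readers who like to see the arithmetic spelt out, here is a minimal sketch of the calculation above. The function name `black_probability` is mine, introduced purely for illustration; the ball counts are the ones given in the game.

```python
from fractions import Fraction


def black_probability(total_balls: int, red_balls: int) -> Fraction:
    """Chance of drawing a black ball in a single draw from a bucket."""
    black_balls = total_balls - red_balls
    return Fraction(black_balls, total_balls)


# Ball counts taken from the game described in the text.
p_a = black_probability(total_balls=100, red_balls=90)     # Bucket A
p_b = black_probability(total_balls=1000, red_balls=990)   # Bucket B

print(f"Bucket A: {float(p_a):.0%} chance of black")
print(f"Bucket B: {float(p_b):.0%} chance of black")
print(f"Bucket A is {p_a / p_b}x as likely to yield a black ball")
```

Using exact fractions rather than floating-point numbers keeps the 10x ratio exact, which is all this toy example needs.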

Sounds simple enough, doesn’t it? Now, let’s bring this game to the real world. Bucket A is the paints sector in India and Bucket B is the NBFC sector in India. Black balls are like Rob Kirby’s Coffee Can stocks (see Kirby’s celebrated paper from 1984 or my book ‘Coffee Can Investing’ from 2018 for more details) which can give you high returns with low volatility. Red balls are like typical Nifty stocks i.e. you will get returns of around 13% per annum over ten years (i.e. half as much as the black balls) with moderate-to-high volatility.
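
To get a feel for what that return gap means over a decade, here is a rough compounding sketch. It rests on an assumption of mine: I read “half as much” as applying to the annual rate, so black balls are taken to compound at about 26% per annum. If the phrase was meant differently, the second figure changes. The helper `growth_multiple` is a hypothetical name used only for this illustration.

```python
def growth_multiple(annual_return: float, years: int) -> float:
    """Value of 1 unit invested after compounding for `years` years."""
    return (1 + annual_return) ** years


RED_RETURN = 0.13               # from the text: around 13% per annum
BLACK_RETURN = 2 * RED_RETURN   # ASSUMPTION: one reading of "half as much"
YEARS = 10

red_multiple = growth_multiple(RED_RETURN, YEARS)
black_multiple = growth_multiple(BLACK_RETURN, YEARS)

print(f"Red ball:   1 grows to about {red_multiple:.1f} in {YEARS} years")
print(f"Black ball: 1 grows to about {black_multiple:.1f} in {YEARS} years")
```

Under that assumption, doubling the annual rate roughly triples the ten-year outcome rather than merely doubling it, which is why the hunt for black balls matters so much.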

If the NBFC sector is like Bucket B and Bucket B is 10x less likely to be a winning bet compared to Bucket A, then why do so many smart investors bet on the NBFC sector? Conversely, if the paints sector is like Bucket A, why do so few investors bet on paint companies? [I can think of only one private equity (PE) company in India that has invested in the paints sector in the past five years whereas nearly every PE company has an NBFC investment.] In other words, why don’t investors understand the basic probabilities of successful investing?

Investment implications
There are a number of reasons why so much capital flows into the NBFC sector and so little into the paints sector:

  1. The barriers to setting up an NBFC in India are relatively low (in what remains a country with scarce provision of credit) compared to setting up a paint company. In paints, thanks to Asian Paints, it is a big challenge for a new entrant to get its distribution into a competitive position [to understand more about this read Chapters 2 & 3 of my book ‘The Unusual Billionaires’ (2016)]. As a result, there are many more NBFCs clamouring for capital than there are paint companies. [Remember, Bucket B has 10x as many balls as Bucket A.] Economists call it “adverse selection” i.e. the inferior agent more proactively seeks out the money. The same thing happens when hospital blood banks offer cash for blood donations.
  2. NBFCs – even the most successful ones – are highly capital consumptive because they rarely generate enough surplus on an annual basis to finance their loan book growth. Paint companies – even second rung ones – typically have Return on Capital above their Cost of Capital, which means that they routinely generate free cashflow. Hence paint companies seldom need money whilst NBFCs routinely need money.
  3. In the game that I sketched out at the beginning of this note, I assumed that we could easily distinguish whether you had picked a red ball or a black ball. But what if you could fool me into believing that the red ball you had drawn was actually a black ball? If you can easily fool me then it is no longer obvious that you should choose Bucket A (90% of balls are red) over Bucket B (99% of balls are red). Applying this to the real world, it is far easier for NBFCs to fool investors about the quality of their business (by evergreening their bad debts) than it is for a paint company (whose free cashflows or Return on Capital will give a pretty clear indication of its health). Hence in the real world investors get drawn to the NBFC sector thinking that it is like Bucket A (not Bucket B). In other words, investors see a Bajaj Finance or a Cholamandalam and end up being fooled by other NBFCs into believing that they are the next BAF or Chola.
  4. Leaving aside accounting jugglery, the fact that NBFCs are unlikely to produce as high a proportion of black balls as the paints sector takes time to hit home because of the capital structure of NBFCs. As we wrote in our 20 November 2018 blog “…investing in NBFCs is like playing European roulette. You will make money most of the time – typically in large, free market economies NBFCs go bust once every decade or so. But then one day you will (yes, “will” is the apt word) lose everything you invested in the NBFC.
    To cite a more prosaic game that most of us played as kids – investing in NBFCs is like playing Snakes & Ladders. There is that super long snake in the penultimate cell who brings you back to zero (although only a minority of the cells contain snakes). On the other hand, investing in clean, well-managed companies in the real world is like playing Ludo – it is more laborious but we know where we are going and ultimately get there.”
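
Points 3 and 4 above can be put into numbers with a small sketch. Two inputs are assumptions of mine, not figures from any data set: `FOOL_RATE` (the share of red balls in Bucket B that get passed off as black) and `ANNUAL_BUST_PROB` (a crude reading of “go bust once every decade or so” as a 10% chance in any given year). The function names are likewise hypothetical.

```python
def apparent_black_rate(true_black: float, fool_rate: float) -> float:
    """Share of draws that LOOK black when a fraction `fool_rate`
    of the red balls is successfully passed off as black."""
    return true_black + (1 - true_black) * fool_rate


def true_black_given_apparent(true_black: float, fool_rate: float) -> float:
    """Bayes' rule: chance that a ball which looks black really is black."""
    return true_black / apparent_black_rate(true_black, fool_rate)


def survival_probability(annual_bust_prob: float, years: int) -> float:
    """Chance of getting through `years` consecutive years without a bust,
    treating each year as an independent draw."""
    return (1 - annual_bust_prob) ** years


TRUE_BLACK_B = 0.01       # from the game: 1% of Bucket B is black
FOOL_RATE = 0.10          # ASSUMPTION: 10% of red balls are disguised as black
ANNUAL_BUST_PROB = 0.10   # ASSUMPTION: "once every decade or so"

looks_black = apparent_black_rate(TRUE_BLACK_B, FOOL_RATE)
really_black = true_black_given_apparent(TRUE_BLACK_B, FOOL_RATE)
survive_decade = survival_probability(ANNUAL_BUST_PROB, 10)

print(f"Bucket B looks {looks_black:.1%} black")
print(f"...but only {really_black:.1%} of those 'black' balls are genuine")
print(f"Chance of ten bust-free years: {survive_decade:.1%}")
```

With these illustrative inputs, Bucket B appears to offer roughly the same odds as Bucket A even though fewer than one in ten of its apparent black balls is real, and the odds of a clean decade are well under half. Different assumptions will give different numbers; the shape of the argument is what matters.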

To conclude, whilst the base probability of making a successful investment in a given sector can be readily calculated from publicly available data on databases like ACE (which also includes data from unlisted companies), most investors rarely think about the world like this. And if you call these sorts of probabilities small data (because you need only a small amount of data to calculate them), you can see why small data is more powerful than big data.
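
As a sketch of how such a base rate might be tallied from a table of records, consider the snippet below. It makes no claim about how ACE or any other database is structured. The `base_rates` helper and the `(group, is_success)` record layout are my own assumptions, and the sample simply reuses the bucket counts from the game rather than real company data.

```python
from collections import Counter


def base_rates(records):
    """Return {group: share of successes} for (group, is_success) pairs."""
    totals, wins = Counter(), Counter()
    for group, is_success in records:
        totals[group] += 1
        wins[group] += int(is_success)
    return {group: wins[group] / totals[group] for group in totals}


# Placeholder sample: the bucket counts from the game, NOT real company data.
sample = (
    [("Bucket A", True)] * 10 + [("Bucket A", False)] * 90
    + [("Bucket B", True)] * 10 + [("Bucket B", False)] * 990
)

rates = base_rates(sample)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} base rate of success")
```

In practice the hard part is deciding what counts as a “success” for each company, which is exactly the judgement that years of reading annual reports builds up.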

In fact, what legendary investors do is read the annual reports of thousands of companies over the course of several decades. That allows them to build a mental database of the base probabilities of successful investing in any given sector. Looking at the world this way, you realise that successful investing is essentially the translation of years of reading/absorbing reams of data into probabilities. In effect, it is when big data becomes small data that you actually get value out of it.

Saurabh Mukherjea is the author of “The Unusual Billionaires” and “Coffee Can Investing: the Low Risk Route to Stupendous Wealth”.      

Note: the above material is neither investment research, nor investment advice. Marcellus Investment Managers is regulated by the Securities and Exchange Board of India as a provider of Portfolio Management Services and as an Investment Advisor.

Copyright © 2019 Marcellus Investment Managers Pvt Ltd, All rights reserved.


