Three Longs & Three Shorts

Can an algorithm determine just how rich an Indian is merely from their name?

William Shakespeare famously wrote in Romeo and Juliet, “What’s in a name? that which we call a rose by any other name would smell as sweet.” It turns out that a lot can be gathered from a name, at least going by Rishabh Srivastava’s big data analytics story on Scroll. Rishabh is the founder of an AI-focused company that has analysed more than 100 million names from publicly available sources to develop algorithms that predict audience demographics, including age, affluence and ethnicity. Those algorithms are making their way into how marketers and lenders scan the vast population to target the right audience for their products and services.
“Is someone named Rhea Ahluwalia more likely to be affluent than someone named Panna Lal? If your instinctive answer was “Yes”, you would – ceteris paribus – be right. Indian names, both first and last, contain enormous signalling value. They can be used by marketers, analysts and entrepreneurs to assess the affluence, age, ethnicity and gender distributions of their audience.
Age prediction: Intuitively, most Indians recognise that names such as Shubham and Rishabh are younger and more modern names, while names like Om and Shashi are older names. We quantified this by creating an age histogram for individual names. For instance, a cursory look at the graph below shows that the name Abhishek skews young. We used the skew of these histograms to create an age-index for each first name in India. The lower the age-index for a name, the younger a name is likely to be. The graph below shows the age-index for selected names….
Affluence estimation: We also overlaid electoral roll data with data scraped from property sites to discover correlations between where different communities lived and where property and rental prices were the highest. For instance, property prices in Bankner in Delhi are far lower than property prices in areas like Rajouri Gardens and Rohini. We could draw correlations between these prices and the distributions of different communities in these areas. The image below, for instance, lists the nine most common surnames in selected communities in Delhi….
Ethnicity prediction: Rishabh Srivastava, my name, is highly likely to be that of a North Indian, while a name like Ronojoy Sen is highly likely to be that of someone who is ethnically Bengali….
Applications: We built these tools mostly as a fun learning project, but have since learnt that they have a number of commercially useful applications. Using names can help brands measure the demographics and affluence of their audience, as well as that of the publishers they advertise with. Similarly, publishers can use names to find out what kind of stories attract what kind of audiences…. Calculating the affluence index of users acquired from campaigns in real time can help brands stop a campaign that acquires the wrong kinds of users before too many marketing dollars have been spent. Name analytics can also be used by e-commerce and content websites to target specific demographics with specific campaigns. For example, Marathi users can be targeted with campaigns linked to Ganesh Chaturthi. Similarly, Bengali users can be targeted with campaigns linked to Durga Puja.”
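
As a rough illustration of the age and affluence steps described in the excerpt, the Python sketch below computes both indices as a name's average value relative to the overall average. This is an assumption made purely for illustration: the article speaks of the "skew" of age histograms and of correlations with property prices but gives no formula. The ratio of means used here, the function names (relative_mean, age_index, affluence_index) and all sample ages, areas and prices are hypothetical, not the author's method or data.

```python
from collections import defaultdict


def relative_mean(pairs):
    """Map each key to (mean value for that key) / (mean value overall).

    `pairs` is an iterable of (key, numeric value) tuples.
    A result below 1.0 means the key sits below the overall average.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for key, value in pairs:
        sums[key] += value
        counts[key] += 1
    total_count = sum(counts.values())
    if total_count == 0:
        raise ValueError("need at least one record")
    overall_mean = sum(sums.values()) / total_count
    return {key: (sums[key] / counts[key]) / overall_mean for key in sums}


def age_index(voters):
    """voters: iterable of (first_name, age) pairs, e.g. from an electoral roll.

    Assumed definition: mean age of people with the name / mean age of everyone.
    """
    return relative_mean(voters)


def affluence_index(residents, price_by_area):
    """residents: iterable of (surname, area) pairs.
    price_by_area: dict mapping area -> property price (any consistent unit).

    Assumed definition: mean local price where a surname lives / mean overall.
    """
    return relative_mean(
        (surname, price_by_area[area]) for surname, area in residents
    )


# Invented sample data, for illustration only.
sample_voters = [("Abhishek", 22), ("Abhishek", 28), ("Om", 60), ("Om", 70)]
sample_prices = {"Area A": 20000.0, "Area B": 5000.0}
sample_residents = [
    ("Surname X", "Area A"),
    ("Surname X", "Area A"),
    ("Surname X", "Area B"),
    ("Surname Y", "Area B"),
]

print(age_index(sample_voters))
print(affluence_index(sample_residents, sample_prices))
```

Under these assumed definitions, an index below 1 marks a name as younger (or less affluent) than the average of the data set, and an index above 1 as older (or more affluent). A real pipeline would also need far larger samples per name before any such ratio is meaningful.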


Note: the above material is neither investment research, nor financial advice. Marcellus does not seek payment for or business from this in any shape or form. Marcellus Investment Managers is regulated by the Securities and Exchange Board of India as a provider of Portfolio Management Services and as an Investment Advisor.