Whilst it is increasingly obvious that AI and workplace automation are adversely affecting the Indian job market, as shown by multiple rounds of layoffs at tech companies (and quieter job cuts at Financial Services and Media companies), Neha Arya points out that the world has seen multiple rounds of tech-driven job destruction and creation over the past century:

“Technological advancements reshape models of interaction among individuals, firms, governments, and across these groups. These changes often influence the fundamental nature of work, along with several dynamics of the labour market. Lin (2011) used historical US Census data for the period 1965-2000 (that used the Dictionary of Occupation Titles) to show the novel job titles it captured, for example, “web developer”, “chat room host”, and “radio pharmacist”. Using Lin’s approach, Autor et al. (2021) found that over 60% of employment in 2018 was under job titles that did not exist in 1940.

However, while technology, such as automation, can displace workers from existing jobs or tasks, it also creates new work, reinstating demand for workers with specific expertise (Acemoglu and Restrepo 2018). Identifying these new occupational titles is especially crucial for developing countries like India, which have a large, informal and vulnerable workforce and are experiencing rapid technological advancements.”

Ms. Arya then drills deeper into a specific type of worker who is essential for the success of AI – the data worker:

“Generative AI (GenAI) models are swiftly becoming popular, like OpenAI’s ChatGPT, which became the fastest-growing web application in history with 100 million monthly active users within 2.5 months and the attainment of 500 million users in a short span of time…

The term “generative” underscores the fact that these AI systems can create or generate new material autonomously without human input…However, a huge amount of human labour goes into development of these AI systems. Many of these AI systems (including ChatGPT, Google’s Gemini, DALL-E, among others) are based on a complex “human-in-the-loop” (HITL) model…HITL uses the judgement of human data workers for annotation, labelling, and categorising raw data (like text files, images or videos) to train machine learning models (IBM, 2025). Data curators, labellers, content moderators, validators, and human feedback providers work to ensure that AI does not perform poorly or dangerously (for instance, in autonomous cars). Accuracy of these data is crucial for efficiency and better predictability/performance of AI models. Thus, data workers are the backbone of AI systems, ensuring their functionality, accuracy, and safety – ironically while themselves working in precarious, fragmented, and often invisible conditions.

Why is this an important contemporary and future concern for India? To meet cost-efficiency goals, businesses increasingly rely on gig workers in the AI supply chain – often outsourcing tasks to crowd workers via digital labour platforms (DLPs) or smaller firms employing data workers.

For instance, at the announcement of Amazon’s Mechanical Turk or MTurk (a virtual labour marketplace/crowdwork platform) in 2006, Amazon’s CEO Jeff Bezos referred to it as “artificial artificial intelligence”. It signified that the “Human Intelligence Tasks” (HITs) available on the platform were microtasks (often simple and repetitive) to be performed by a reserve army of cheap labour.

A prominent outcome of data workers using MTurk (called ‘turkers’) for a project (by Jia Deng et al.), was the release of ImageNet dataset, the largest labelled image dataset, in 2009. It was fuelled by the work of millions of workers across the globe, who manually labelled a million images for very low wages.

A 2016 survey by the Pew Research Center, of almost 3,000 turkers from the US revealed that over 50% of all workers reported hourly earnings below $5 (Pew, 2016). Unsurprisingly, a large majority of data workers are in the Global South, where wages are significantly lower. For instance, in Kenya, data workers mostly receive hourly wages of only US$2, while in Argentina hourly wages go as low as US$1.7.”

Data workers are a great example of what economists call the “substitution effect” or “reinstatement effect” of new technologies i.e. the ability of new technologies to create new types of jobs whilst laying waste to pre-existing professions.

So how many data workers does India have? Ms Arya says that India’s traditional labour market surveys are NOT designed to identify the existence of such workers:

“India’s annual Periodic Labour Force Survey (PLFS) uses the National Classification of Occupations (NCO) (2015), in which “data entry clerks” are captured by “Family 4132”…Essentially, the categories include traditional clerical data input roles and do not explicitly cover modern AI-related data work. Overall, therefore, these workers remain statistically invisible, as is the case for digital platform-based gig workers.”

But, says Ms Arya, if you visit online job portals in India you can see tens of thousands of data labelling, data annotation and bot training jobs advertised:

“In 2024, an estimated 50,000 Indian (freelance) annotators were present on international digital platforms, and 20,000 full-time annotators within India, according to this Economic Times report (citing data from TeamLease).

The same report also states that the global market for data annotations is valued at an estimated US$8.22 billion, and is expected to grow swiftly at nearly 26.2% annually by 2028. From US$250 million in 2020-21, India is expected to service over US$7 billion of the global annotation market by 2030 (National Association of Software and Service Companies (NASSCOM), 2021). Even India’s ‘National Strategy for Artificial Intelligence’ identifies data annotation work as having the potential of “absorbing a large portion of the workforce that may find itself redundant due to increasing automation” (NITI Aayog, 2018).”

Disclaimer: Alphabet (parent company of Google) and Amazon are part of Marcellus’ Global Compounders Portfolio, a strategy offered by the IFSC branch of Marcellus Investment Managers Private Limited and regulated by IFSCA. Accordingly, Marcellus, its employees, immediate relatives, and clients may hold interests or positions in these stocks. Any references to these companies are made solely for informational and educational purposes, in the context of the article discussed.

If you want to read our other published material, please visit https://marcellus.in/blog/

Note: The above material is neither investment research, nor financial advice. Marcellus does not seek payment for or business from this publication in any shape or form. The information provided is intended for educational purposes only. Marcellus Investment Managers is regulated by the Securities and Exchange Board of India (SEBI) and is also an FME (Non-Retail) with the International Financial Services Centres Authority (IFSCA) as a provider of Portfolio Management Services. Additionally, Marcellus is also registered with US Securities and Exchange Commission (“US SEC”) as an Investment Advisor.


