Fraud and Deception Detection: Text-Based Analysis

Many of us at Marcellus are fans of Jason Voss’ book, “The Intuitive Investor”, which explains how meditation helps investors become better at their jobs by allowing them to access higher levels of clarity of thought and intuition. In this piece, Jason – himself a very successful investor in the US – turns his mind to a different topic: fraud detection.
Jason begins by making the simple – but oft forgotten – point that most investors do NOT have the time (or the resources) to read annual reports in real time: “Companies communicate information through words more than numbers. For example, from 2009 to 2019, the annual reports of the Dow Jones Industrial Average’s component companies tallied just over 31.8 million words and numbers combined, according to AIM Consulting.
Now, JP Morgan’s 2012 annual report is 237,894 words. Let’s say an average reader can read and comprehend about 125 words per minute. At this rate, it would take a research analyst approximately 31 hours and 43 minutes to thoroughly read the report. The average mutual fund research analyst in the United States makes around $70,000 per year, according to WallStreetMojo. So that one JP Morgan report costs a firm more than $1,100 to assess…”
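The arithmetic behind that estimate can be checked with a short sketch. The word count, reading speed and salary come from the passage above; the figure of 2,000 working hours per year is our own assumption (the source does not say how the salary was converted into an hourly rate), and the function names are purely illustrative.

```python
# Figures quoted in the passage above.
WORDS = 237_894      # words in JP Morgan's 2012 annual report
WPM = 125            # words read and comprehended per minute
SALARY = 70_000      # average analyst salary in USD per year

# Assumption (not in the source): 50 weeks x 40 hours of work per year.
HOURS_PER_YEAR = 2_000


def reading_minutes(words, wpm):
    """Minutes needed to read `words` at `wpm` words per minute."""
    return words / wpm


def reading_cost(words, wpm, salary, hours_per_year):
    """Salary cost in USD of the time spent reading."""
    hours = reading_minutes(words, wpm) / 60
    return hours * salary / hours_per_year


minutes = reading_minutes(WORDS, WPM)
hours, mins = divmod(round(minutes), 60)
cost = reading_cost(WORDS, WPM, SALARY, HOURS_PER_YEAR)

print(f"Reading time: {hours} h {mins} min")
print(f"Cost: ${cost:,.0f}")
```

Under the 2,000-hour assumption the cost comes out a little above $1,100, in line with the quote. A 2,080-hour year (52 weeks of 40 hours) would give a figure slightly below $1,100, so the exact number depends on how the hourly rate is derived.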
So, if investors can’t read most annual reports as soon as they are published, how can they figure out whether the management teams they are backing are telling the truth about what is happening inside the company? Does listening to management speak on conference calls or on TV help? Jason has two layers of bad news for us – not only are investors bad at detecting the lies told by management teams, they are also overconfident about their lie-detection skills: “…research demonstrates we cannot detect deception through body language or gut instinct. In fact, a meta-analysis of our deception-spotting abilities found a global success rate just 4% better than chance. We might believe that as finance pros we are exceptional. We would be wrong.
In 2017, we measured deception detection skills among finance professionals. It was the first time our industry’s lie detection prowess had ever been put to the test. In short: ouch! Our overall success rate is actually worse than that of the general population: We did not score 54%, we earned an even-worse-than-a-coin-toss 49.4%.
But maybe our strengths are in our own sector. Put us in a finance setting, say on an earnings call, and we’ll do much better, right? Nope, not really. In investment settings, we could detect deception just 51.8% of the time.
There is more bad news here (sorry): Finance pros have a strong truth bias. We tend to trust other finance pros way more than we should. Our research found that we only catch a lie in finance 39.4% of the time. So that 51.8% accuracy rate is due to our tendency to believe our fellow finance pros.”
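To see how a truth bias can prop up the overall accuracy figure, here is a rough sketch. Overall accuracy is a weighted average of how often truthful statements are correctly judged truthful and how often lies are caught. The 51.8% and 39.4% figures are from the quote; the 50/50 split between truthful and deceptive statements is our assumption (the excerpt does not report the mix used in the study), so the implied truth-side accuracy is illustrative only.

```python
def implied_truth_accuracy(overall, lie_accuracy, truth_share=0.5):
    """Back out accuracy on truthful statements from the weighted average

        overall = truth_share * truth_accuracy + (1 - truth_share) * lie_accuracy

    truth_share is the fraction of statements that are truthful
    (an assumed value, not reported in the excerpt).
    """
    lie_share = 1 - truth_share
    return (overall - lie_share * lie_accuracy) / truth_share


# Figures from the quote: 51.8% overall accuracy, 39.4% of lies caught.
truth_acc = implied_truth_accuracy(overall=0.518, lie_accuracy=0.394)
print(f"Implied accuracy on truthful statements: {truth_acc:.1%}")
```

If half the statements were truthful, finance professionals would be correctly labelling roughly 64% of the true ones simply because they tend to believe what they hear, which is what lifts the blended figure above the 39.4% lie-detection rate.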
So what to do? Jason says that the use of tech can help us get better at our jobs: “Starting with James W. Pennebaker’s pioneering work, researchers have applied natural language processing (NLP) to analyze verbal content and estimate a transcript’s or written document’s credibility. Computers extract language features from the text, such as word frequencies, psycholinguistic details, or negative financial terms, in effect, dusting for language fingerprints. How do these automated techniques perform? Their success rates are between 64% and 80%.
In personal interactions, as we noted, people can detect lies approximately 54% of the time. But their performance worsens when assessing the veracity of text. Research published in 2021 found that people have about a 50% or coin-flip chance to identify deception in text. A computer-based algorithm, however, had a 69% chance.”
Building further on this insight, Jason says that he has built a tech tool which analyses the language used by management teams to figure out whether they are lying: “…we developed Deception And Truth Analysis (D.A.T.A.) with Orbit Financial. Based on a 10-year investigation of those deception technologies that work in and out of sample — hint: not reading body language — D.A.T.A. examines more than 30 language fingerprints in five separate scientifically proven algorithms to determine how these speech elements and language fingerprints interact with one another.
The process is similar to that of a standard stock screener. That screener identifies the performance fingerprints we want and then applies these quantitative fingerprints to screen an entire universe of stocks and produce a list on which we can unleash our financial analysis. D.A.T.A. works in the same way.
A key language fingerprint is the use of articles like a, an, and the, for example. An excess of these is more associated with deceptive than truthful speech. But article frequency is only one component: How the articles are used is what really matters. And since articles are directly connected to nouns, D.A.T.A is hard to outmaneuver. A potential dissembler would have to alter how they communicate, changing how they use their nouns and how often they use them. This is not an easy task…”
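As a toy illustration of what one “language fingerprint” might look like in code, the sketch below measures how often a, an and the appear in a text and then screens a small set of documents on that measure, much as a stock screener filters on a financial ratio. This is not D.A.T.A.’s method: the function names, the sample sentences and the 0.2 threshold are all hypothetical, and the quote itself stresses that article frequency is only one component and that usage matters more than the raw count.

```python
import re

ARTICLES = {"a", "an", "the"}


def article_rate(text):
    """Share of word tokens in `text` that are articles (a, an, the)."""
    # Simple tokenizer: lowercase runs of letters and straight apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in ARTICLES for token in tokens) / len(tokens)


def screen(documents, threshold):
    """Names of documents whose article rate exceeds `threshold`,
    highest rate first. The threshold is a hypothetical cut-off."""
    rates = {name: article_rate(text) for name, text in documents.items()}
    flagged = [name for name, rate in rates.items() if rate > threshold]
    return sorted(flagged, key=rates.get, reverse=True)


# Made-up sentences, purely for illustration.
transcripts = {
    "call_a": "The results were a result of the plan and the team.",
    "call_b": "Revenue grew and margins improved.",
}

for name, text in transcripts.items():
    print(name, round(article_rate(text), 3))

print(screen(transcripts, threshold=0.2))
```

A real system would combine many such features and validate them out of sample; a high article rate on its own says nothing about whether a speaker is being truthful.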

If you want to read our other published material, please visit https://marcellus.in/blog/

Note: The above material is neither investment research, nor financial advice. Marcellus does not seek payment for or business from this publication in any shape or form. The information provided is intended for educational purposes only. Marcellus Investment Managers is regulated by the Securities and Exchange Board of India (SEBI) and is also an FME (Non-Retail) with the International Financial Services Centres Authority (IFSCA) as a provider of Portfolio Management Services. Additionally, Marcellus is registered with the US Securities and Exchange Commission (“US SEC”) as an Investment Advisor.

