Glenn Pomerantz and Christopher Kim, BDO

This is an extract from the third edition of GIR's The Practitioner’s Guide to Global Investigations.

Forensic data analysis

Forensic analysis of data refers to analysis of electronically stored data. The most commonly analysed data are accounting and financial, but several non-financial categories of data are also very useful to investigators. Each is explored below.

Data analysis generally has three applications in the investigative process:

  • to initially detect fraud or non-compliance (e.g., monitoring performed by internal audit);
  • to corroborate an allegation in order to justify launching an investigation (e.g., proving that an allegation received via a hotline appears to have merit); and
  • to perform certain parts of the investigation (e.g., analysis of payments made to suspicious vendors).

Each of these is explained further below. First, however, a few points about data analytics need to be made.

Data analytics rarely prove that fraud or non-compliance occurred. Rather, data analysis identifies transactions or activities that have the characteristics of fraud or non-compliance, so that they can be examined further. These are often referred to as anomalies in the data.

If an investigation ultimately leads to employee terminations or legal proceedings to recover losses, it is critical to have properly analysed the anomalies that data mining has identified. Could the anomaly in the data, or an anomaly in a document, while often identified as a characteristic of fraud, also simply indicate a benign deviation? Failing to investigate and rule out non-fraudulent explanations for anomalies can have consequences that many investigators have learned about the hard way.

Identifying and exploring all realistically possible non-fraudulent, non-corrupt explanations for an anomaly is also called reverse proof. Examining and eventually ruling out all of the valid possible non-fraudulent explanations for an anomaly in the data or documentation can prove that the only remaining reasonable explanation is fraud or corruption.

Take a simple example to illustrate this important concept. An employee is found to have submitted the same business expenditure twice for reimbursement (paid for using a personal credit card). Further analysis shows that this is not an isolated incident. In fact, the rate at which the employee submitted duplicate expenditures has increased over time – a classic red flag commonly associated with perpetrators of fraud. Is this a sufficient basis to support an allegation of misconduct?

This would be premature. What if, on further analysis, the investigator also finds that the employee has been asked to work an increasing number of hours every week and to travel much more extensively over time? On further investigation, the investigator finds that this employee is particularly disorganised and has never been asked to do this much business travel before. These additional facts blur the distinction between an intentional act of fraud and an escalating series of honest mistakes.

Careful consideration of alternative theories for data and document anomalies is critical to protecting the organisation and the investigator from liability stemming from falsely accusing someone of wrongdoing.
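The duplicate-expense example above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the claim records, the field layout and the rule that a 'duplicate' means the same employee, expense date, merchant and amount are all assumptions made for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical claims: (employee, expense date, merchant, amount in cents, submission date)
claims = [
    ("E1", date(2023, 1, 10), "Hotel A", 21000, date(2023, 1, 15)),
    ("E1", date(2023, 1, 10), "Hotel A", 21000, date(2023, 2, 2)),
    ("E1", date(2023, 3, 5), "Taxi B", 4250, date(2023, 3, 9)),
    ("E1", date(2023, 3, 5), "Taxi B", 4250, date(2023, 3, 20)),
    ("E1", date(2023, 3, 18), "Hotel C", 18000, date(2023, 3, 21)),
    ("E1", date(2023, 3, 18), "Hotel C", 18000, date(2023, 3, 30)),
    ("E2", date(2023, 3, 7), "Airline D", 52000, date(2023, 3, 8)),
]

def duplicates_by_month(claims):
    """Return {(employee, year, month): (duplicate claims, total claims)}."""
    totals, dups = Counter(), Counter()
    seen = set()
    # Process claims in submission order so the first submission counts as the original.
    for emp, spent_on, merchant, cents, submitted in sorted(claims, key=lambda c: c[4]):
        period = (emp, submitted.year, submitted.month)
        totals[period] += 1
        key = (emp, spent_on, merchant, cents)
        if key in seen:  # same expense already submitted earlier
            dups[period] += 1
        seen.add(key)
    return {p: (dups[p], totals[p]) for p in sorted(totals)}

for period, (dup, total) in duplicates_by_month(claims).items():
    print(period, f"{dup} of {total} claims repeat an earlier claim")
```

A rising count for one employee is an anomaly to examine, not a finding. Following the reverse-proof concept, the output would need to be read alongside facts such as travel volume and workload before any conclusion is drawn.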

Data mining to detect fraud or non-compliance

Depending on which application or phase of the investigative process is involved, the nature of forensic data analysis can vary. For example, as an initial detector of fraud or non-compliance through ongoing monitoring, forensic data analytics usually takes one of two broad, but opposite, approaches: identification of any activity that deviates from expectations, or identification of activity that possesses specific characteristics associated with fraudulent or corrupt behaviour or other non-compliant conduct.

The former approach is taken when acceptable behaviour is narrowly defined, such that the slightest deviation warrants investigation. The latter approach is the more common. It is driven by a risk assessment and is based on what this type of fraud or non-compliance would look like in the data. For example, a shell company scheme might evidence itself by an address in the vendor master file matching an address in the employee master file. Any instances of such a match would be investigated.
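As a rough sketch of the address-matching test just described, the following Python compares a vendor master file with an employee master file. The records, field names and abbreviation table are hypothetical, and real address matching usually needs more robust normalisation than is shown here.

```python
import re

# Hypothetical abbreviation table used to make addresses comparable.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd",
                 "suite": "ste", "apartment": "apt"}

def normalise_address(address):
    """Lower-case, strip punctuation and apply common abbreviations."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

# Hypothetical extracts from the two master files.
vendors = [
    {"vendor_id": "V100", "address": "12 Main Street, Suite 4"},
    {"vendor_id": "V200", "address": "800 Harbour Road"},
]
employees = [
    {"employee_id": "E17", "address": "12 Main St. Ste 4"},
    {"employee_id": "E42", "address": "3 Elm Avenue"},
]

def match_addresses(vendors, employees):
    """Return (vendor_id, employee_id) pairs that share a normalised address."""
    by_address = {}
    for employee in employees:
        key = normalise_address(employee["address"])
        by_address.setdefault(key, []).append(employee["employee_id"])
    return [
        (vendor["vendor_id"], employee_id)
        for vendor in vendors
        for employee_id in by_address.get(normalise_address(vendor["address"]), [])
    ]

print(match_addresses(vendors, employees))
```

Normalising before comparing matters because the same address is rarely keyed identically in two systems. An exact string comparison would miss the match in this sample data.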

In some cases, basing the ‘investigate’ or ‘don’t investigate’ decision on a single characteristic in the data can result in numerous false positives. For this reason, more sophisticated data analytics often rely on the consideration of multiple characteristics in assessing the risk of activity being fraudulent or corrupt.
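One simple way to combine several characteristics is a weighted score, sketched below. The red flags, weights, approval limit and review threshold are all assumptions chosen for illustration; in practice they would come from the organisation's own risk assessment.

```python
from datetime import date

APPROVAL_LIMIT = 5_000   # hypothetical limit above which a second approval is required
REVIEW_THRESHOLD = 4     # hypothetical score at which a payment is referred for review

# Hypothetical weights for each red flag.
WEIGHTS = {
    "address_match": 3,      # vendor address matches an employee address
    "just_below_limit": 2,   # amount within 5 per cent below the approval limit
    "round_amount": 1,       # amount is a whole multiple of 1,000
    "weekend_payment": 1,    # payment processed on a Saturday or Sunday
}

def risk_flags(payment):
    """Return the names of the red flags that a payment exhibits."""
    flags = []
    if payment["vendor_address_matches_employee"]:
        flags.append("address_match")
    if 0.95 * APPROVAL_LIMIT <= payment["amount"] < APPROVAL_LIMIT:
        flags.append("just_below_limit")
    if payment["amount"] % 1_000 == 0:
        flags.append("round_amount")
    if payment["paid_on"].weekday() >= 5:
        flags.append("weekend_payment")
    return flags

def risk_score(payment):
    """Sum the weights of all flags raised by a payment."""
    return sum(WEIGHTS[flag] for flag in risk_flags(payment))

payments = [
    {"id": "P1", "amount": 4_900, "paid_on": date(2023, 3, 4),
     "vendor_address_matches_employee": True},
    {"id": "P2", "amount": 3_000, "paid_on": date(2023, 3, 1),
     "vendor_address_matches_employee": False},
]

for payment in payments:
    score = risk_score(payment)
    action = "review" if score >= REVIEW_THRESHOLD else "no action"
    print(payment["id"], risk_flags(payment), score, action)
```

In this sample, the round-amount payment raises one flag and falls below the threshold, while the payment exhibiting three flags is referred for review. This is the intended effect: fewer false positives than a rule based on any single characteristic.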

Regardless of which of these two approaches is taken, data analytics often represents an essential tool for gathering evidence to lay the foundation for substantive examination of books, records and other evidence. Following the reverse-proof concept described above is critical once anomalies indicative of possible wrongdoing are uncovered.

Corroborating allegations

As a method of corroborating an allegation that has been received, data analysis can be of great value. It is a significant advantage to the investigator because, more often than not, it can be performed on electronic data without alerting the subject of the allegation. In this application, the allegation is first assessed in terms of what impact the alleged fraudulent or corrupt act would have on financial or non-financial data. To illustrate, take the example of an allegation that workers in the shipping department of a warehouse are stealing inventory by short shipping orders to customers. There are numerous sources of data, both financial and non-financial, that could be analysed to assess the validity of this allegation:

  • gross profit margins – an unexplained decline in gross profit margins by product, or by location (as a result of having to re-ship additional items, with no associated revenue, to satisfy the customer);
  • inventory purchases – unexplained increases in purchases of certain inventory items without a corresponding increase in sales;
  • customer complaints – customer service data indicating complaints about incomplete shipments, especially if those complaints can be correlated back to specific orders; and
  • shipping records – using the customer complaint data, orders can be correlated to specific shipments and to the employees who filled and shipped those orders. Shipping records might also reveal more shipments to a customer than orders, indicating that a second shipment was needed to complete the order after the customer complained.

This is a simple example, but one that illustrates that for every allegation, there likely exists data associated with either the perpetration or concealment of the fraud or non-compliance. And this data normally exhibits one or more anomalies in comparison with data from similar transactions that do not involve fraud or non-compliance.
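The correlation of complaints, orders and shipping records in the short-shipping example might look like the sketch below. The data, identifiers and the assumption that shipment records are listed in chronological order are hypothetical.

```python
from collections import Counter

# Hypothetical shipping records in chronological order: (shipment id, order id, packer)
shipments = [
    ("S1", "O1", "W7"),
    ("S2", "O1", "W3"),
    ("S3", "O2", "W3"),
    ("S4", "O3", "W7"),
    ("S5", "O3", "W5"),
]

# Hypothetical order ids taken from customer complaints about incomplete shipments.
complaints = {"O1", "O3"}

def repeat_shipment_orders(shipments):
    """Return orders that needed more than one shipment."""
    counts = Counter(order_id for _, order_id, _ in shipments)
    return {order_id for order_id, n in counts.items() if n > 1}

def first_packers(shipments, complaints):
    """Count who packed the first shipment of each order that was both
    complained about and shipped more than once."""
    flagged = repeat_shipment_orders(shipments) & complaints
    first = {}
    for _, order_id, packer in shipments:
        if order_id in flagged and order_id not in first:
            first[order_id] = packer
    return Counter(first.values())

print(first_packers(shipments, complaints))
```

A concentration of flagged orders on one packer is a lead to be corroborated, for example against shift rosters or the mix of products handled, rather than evidence of theft in itself.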

Using data analysis in an investigation

The final application of forensic data analysis is performed during the investigation itself. Once an anomaly has been found to involve fraud or non-compliance, additional forensic data analysis, along with substantive forensic examination of the evidence, may be performed to:

  1. determine how long the activity has occurred;
  2. determine which employees (or third parties) have participated in the fraud (i.e., assessing whether collusion was involved);
  3. measure the financial damage resulting from the activity;
  4. identify other fraudulent or corrupt conduct by the same individuals; and
  5. determine how the fraudulent or corrupt act was concealed and how internal controls were circumvented.
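Points 1 to 3 lend themselves to a simple summary once the transactions confirmed as improper have been isolated. The sketch below assumes a hypothetical list of such transactions, each recording the date, the employee who approved it and the amount in minor currency units.

```python
from collections import defaultdict
from datetime import date

# Hypothetical confirmed transactions: (date, approving employee, amount in cents)
confirmed = [
    (date(2022, 11, 3), "E1", 120_000),
    (date(2023, 1, 17), "E1", 95_000),
    (date(2023, 2, 9), "E4", 40_000),
    (date(2023, 4, 21), "E1", 150_000),
]

def summarise(transactions):
    """Summarise duration, participants and total amount of confirmed activity."""
    dates = [when for when, _, _ in transactions]
    by_person = defaultdict(int)
    for _, person, amount in transactions:
        by_person[person] += amount
    return {
        "first": min(dates),
        "last": max(dates),
        "days": (max(dates) - min(dates)).days,
        "total": sum(by_person.values()),
        "by_person": dict(by_person),
    }

print(summarise(confirmed))
```

The earliest date found is only a lower bound on duration: it reflects the data extracted so far, and the activity may predate the period covered by the available records.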

Determining who is involved in the fraud as well as who possessed knowledge of it is critical to the mitigation and control enhancement objectives. According to a recent report by the Association of Certified Fraud Examiners (ACFE), nearly 45 per cent of all fraud and corruption schemes investigated involve multiple perpetrators. This figure has been steadily rising since the ACFE began studying fraud. The 45 per cent is split nearly evenly between cases involving multiple internal perpetrators and those involving collusion between insiders and outsiders, such as vendors or customers.

Point 4, above, may come as a surprise to some, but it is important. The ACFE report indicates that in 31.8 per cent of cases in which an individual engages in fraud (especially with respect to asset misappropriations), that individual employs multiple methods to commit the crimes. The allegation or investigation may have initially focused on only one specific method. Exploring what other activities the subject might have the capability of engaging in is an integral element of the investigation. Investigators and victims attempt to ‘put a fence around the fraud’ as early in the investigative process as possible. Understanding the responsibilities of the subject and the potential for unrelated schemes is essential for erecting the fence. Victims often desire a narrow investigative scope – a sort of wishful thinking. An investigator’s worst-case scenario is missing a scheme conducted by a subject despite investigating the subject.

The question of who knew what and when can be particularly important in satisfying auditors in the context of financial reporting fraud. In addition to quantifying the financial statement impact from fraud, auditors rely on representations from management. Knowledge of whether previous representations came from fraudsters and the auditor’s assessment of management’s integrity are often important aspects of financial reporting fraud investigations.

In the next sections, the distinction between financial and non-financial data will be explored, followed by a discussion of internal versus external data.

Analysis of financial data

Most analyses of internal data relevant to an investigation begin with financial data, much of which comes from the organisation’s accounting system. Accounting data can exist in several separate systems, such as:

  • general ledger, the master ledger that reflects all accounts and the sum of all accounting activity for the organisation;
  • general journal, where journal entries are initially recorded before being posted to the general ledger;
  • books of original entry, which contain details of certain types of financial transactions, summaries of which are posted to the general ledger. Examples of books of original entry include sales, cash receipts, cash disbursements and payroll; and
  • subsidiary ledgers, which contain additional details of transactions and activities that appear only in summary form in the general ledger. Examples of subsidiary ledgers are accounts receivable and accounts payable ledgers.

Performing an investigation often requires the extraction and analysis of data from all these systems to see the big picture or to properly trace the history of a transaction or series of activities. The days of manually maintained books of original entry are gone. The vast majority of organisations now use electronic accounting and financial software, and in larger organisations these systems are included as part of a broader ERP system.
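Because subsidiary ledgers hold the detail behind summary balances in the general ledger, one common first step after extraction is to check that the two agree. The sketch below assumes hypothetical account codes, balances in minor currency units and a simple extract layout; actual ERP exports will differ.

```python
from collections import defaultdict

# Hypothetical general ledger control account balances (in cents).
gl_balances = {"2000-AP": 150_000, "1200-AR": 98_000}

# Hypothetical subsidiary ledger detail: (control account, vendor or customer, balance in cents)
subledger = [
    ("2000-AP", "V100", 60_000),
    ("2000-AP", "V200", 75_000),
    ("1200-AR", "C1", 98_000),
]

def reconcile(gl_balances, subledger):
    """Return {account: difference} where the general ledger balance
    does not equal the sum of the subsidiary ledger detail."""
    totals = defaultdict(int)
    for account, _, amount in subledger:
        totals[account] += amount
    return {
        account: balance - totals[account]
        for account, balance in gl_balances.items()
        if balance != totals[account]
    }

print(reconcile(gl_balances, subledger))
```

An unreconciled difference may point to a manual journal entry posted directly to the control account, or simply to an incomplete extract, so it is a prompt for further tracing rather than a conclusion.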

Some systems are hybrids of financial and non-financial information. Examples of these systems include:

  • Inventory: in addition to cost information associated with purchases, the system may also provide data on quantities and dates of purchases, deliveries, shipments, inventory damaged or scrapped, and counts resulting from physical observation.
  • Payroll: in addition to data on net amounts paid to employees, the payroll system will usually include other relevant data needed to calculate an employee’s gross and net pay, including various worker classification codes, hours worked during a pay period, rates of pay, tax and withholding information, along with bank account information for the electronic transfer of funds to employees.
  • Human resources: in most large organisations a human resources system that is separate from payroll is maintained. Included in this system are data on rates of pay and past raises, incentive payments, and other financial data about each employee, as well as significant amounts of non-financial data, such as each employee’s home address. Human resource information systems may also include vital information associated with an employee’s initial hiring, such as background and reference checks, verification of information provided on an employment application, etc. This information can be important if the organisation anticipates filing an insurance claim to be indemnified for losses attributable to an employee.

Availability of and legal considerations associated with each of these sources of internal data vary from one jurisdiction to another, particularly with respect to payroll and personnel information. Privacy issues must be considered before embarking on any use of such data in an investigation.