This will not come as a surprise: an article published in GreenBook about how to improve data quality in market research highlighted that there is no single solution for detecting and preventing survey fraud. Of course not! It would never be that easy. Instead, we have to think of the problem in terms of layers of protection that we implement throughout the research process, from design, through execution, to analysis.
This is similar to the Swiss Cheese Model of Pandemic Defense, a metaphor in which slices of cheese represent the multiple layers of protection that can be employed to avoid contracting COVID-19. The same model can be applied to preventing survey fraud. While no individual layer of protection (fraud prevention and detection software, the screener, the main questionnaire, and post-fieldwork data cleaning) is perfect, we have the best chance of preventing survey fraud when we apply all the layers together.
While this is a good analogy, the issue with the Swiss Cheese metaphor is that each slice appears to contribute equally to preventing the virus when, in fact, some layers provide more protection than others. In our version of the metaphor, we’ve sized each slice of cheese to represent the (unequal) role it plays in preventing fraud.
How to use all the cheese slices to reduce survey fraud
What follows are some of the best practices we employ as we consider survey fraud prevention in each step of the research process.
1. Fraud prevention and detection software has limits
Most sample providers have software that uses passive machine data and behavioral data to combat fraud technologies. Some of these tools are sophisticated, some are basic, and some can even block good respondents. They flag and remove both duplicates and fraudsters before they enter your survey. While this first slice should be the thickest one, it does not always work that way in practice.
Over time, some of these tools have become better at blocking more sophisticated fraud, but bad actors are continually trying to figure out ways around this technology. Further, the threshold (i.e., the quality score) that determines what happens to suspicious respondents is set by the panel providers. Because the panel providers need to balance sample quality with survey traffic, their priorities regarding those thresholds might be different from yours.
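To make the threshold trade-off concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the 0 to 100 `quality_score` scale, the `Respondent` fields, and the toy data are hypothetical and do not reflect any panel provider's actual scoring. In real fieldwork you would not know `is_fraud` up front; it is included only to show what each cutoff lets through and what it blocks.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    respondent_id: str
    quality_score: float  # hypothetical 0-100 score; higher = more trustworthy
    is_fraud: bool        # known only in this toy example, never in real fieldwork


def summarize(respondents, threshold):
    """Count who is admitted and who is blocked at a given score cutoff."""
    admitted = [r for r in respondents if r.quality_score >= threshold]
    blocked = [r for r in respondents if r.quality_score < threshold]
    return {
        "threshold": threshold,
        "admitted": len(admitted),
        "fraud_admitted": sum(1 for r in admitted if r.is_fraud),
        "good_blocked": sum(1 for r in blocked if not r.is_fraud),
    }


# Toy data, invented for illustration only.
SAMPLE = [
    Respondent("r1", 92, False),
    Respondent("r2", 71, False),
    Respondent("r3", 64, True),
    Respondent("r4", 55, False),
    Respondent("r5", 38, True),
    Respondent("r6", 20, True),
]

if __name__ == "__main__":
    for cutoff in (30, 60, 70):
        print(summarize(SAMPLE, cutoff))
```

In this toy sample, a lenient cutoff keeps traffic high but admits fraud, while a strict cutoff removes the fraud at the cost of blocking a good respondent. That tension is why a provider's chosen threshold may not match the one you would pick.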
2. Designing tanks, not screeners
Considering the limitations of anti-fraud survey solutions, the thickest, most important layer of protection is your screener. The screener is also critically important for lower-incidence consumer and B2B work. This is where it is easiest to identify individuals who maliciously try to qualify for your survey. As such, screeners need to be designed like tanks: bulletproof and tough, but also smart and subtle, using quality assurance (QA) “traps” that are not blatantly obvious to fraudsters. A fake brand in a brand list, for example, is too obvious a trap; most fraudsters will not be fooled by it.
3. Fraudsters blend in
Once fraudsters have qualified to take your survey, they easily blend into your dataset. They know what researchers look for and they are able to skillfully avoid most traps. For this reason, the main questionnaire is another thin slice of cheese. Here, the common QA traps you’ve set up will mainly work to identify real humans who are not paying attention, not bots. If you have a large sample size, this is arguably a small number of people who have little impact on the overall story.
4. Post-fieldwork data cleaning: Time-consuming, but necessary
Aside from the screener, this is where researchers have the most control. When data collection is near-complete, researchers are able to see the full picture. Patterns in the data emerge. Outliers stand out. Data that earlier seemed sensible now raises questions when viewed in a broader context. As much as we would like to automate the entire data-cleaning process, this is where human intelligence and intuition are vital in recognizing fraud.
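Automation can still narrow down where a human should look first. The sketch below is one possible first-pass filter, not a complete cleaning routine. It flags two patterns chosen here purely for illustration: unusually fast completions ("speeders") and identical answers across a rating grid ("straight-liners"). The field names, the `speed_ratio` of 0.4, and the toy data are all assumptions; the output is a list of candidates for manual review, not a verdict of fraud.

```python
from statistics import median


def flag_for_review(rows, grid_keys, speed_ratio=0.4):
    """Return {respondent id: [reasons]} for rows that deserve a closer look.

    rows        -- list of dicts with 'id', 'seconds', and one key per grid item
    grid_keys   -- keys of the rating-grid questions to check for straight-lining
    speed_ratio -- hypothetical cutoff: flag anyone faster than this share of
                   the median completion time
    """
    cutoff = speed_ratio * median(r["seconds"] for r in rows)
    flagged = {}
    for r in rows:
        reasons = []
        if r["seconds"] < cutoff:
            reasons.append("speeder")
        # A single distinct value across the grid means every answer was identical.
        if len({r[k] for k in grid_keys}) == 1:
            reasons.append("straight-liner")
        if reasons:
            flagged[r["id"]] = reasons
    return flagged


# Toy data, invented for illustration only.
ROWS = [
    {"id": "a", "seconds": 600, "q1": 4, "q2": 2, "q3": 5},
    {"id": "b", "seconds": 540, "q1": 3, "q2": 3, "q3": 3},
    {"id": "c", "seconds": 150, "q1": 5, "q2": 1, "q3": 4},
    {"id": "d", "seconds": 660, "q1": 2, "q2": 4, "q3": 3},
    {"id": "e", "seconds": 120, "q1": 1, "q2": 1, "q3": 1},
]

if __name__ == "__main__":
    for respondent_id, reasons in flag_for_review(ROWS, ["q1", "q2", "q3"]).items():
        print(respondent_id, reasons)
```

A respondent flagged for both reasons is a stronger candidate for removal than one flagged for a single reason, but the final call still belongs to the researcher who can read the responses in context.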
At this stage, it is critical to have a solid relationship with a panel partner who shares your concerns about data quality and who trusts your expertise. Any bad respondents must be removed from your dataset and replaced, so a panel partner that will not turn this process into a long negotiation is worth its weight in gold. Find a partner you can rely on to keep an already time-consuming process short.
Always add extra cheese
Preventing and identifying survey fraud is time-consuming, but it is an absolute necessity for any organization that relies on good-quality data to make decisions. Understanding the role and effectiveness of each slice of cheese, or layer of protection, can help researchers spend less time cleaning data and more time doing what we love: discovering and sharing insights.
The take-away? Always ask for extra cheese, both metaphorically and literally!