Published Nov 28, 2022
Marie Hense, Vice President of Data Quality at Toluna
Originally published by Research Live.
Today’s world is increasingly multilingual. Countries’ populations include people from diverse backgrounds who speak a variety of languages, whether because of regional differences (in large countries such as India) or because of immigration and the subsequent formation of sub-populations (such as Spanish-speaking communities in the US).
To facilitate day-to-day living in these multilingual societies, technology has stepped up to the plate with translation tools: web-based translation services, voice-based translators, and add-ons that automatically translate everything shown in a device’s web browser in real time. While these tools are brilliant for everyday life and for enabling communication in diverse communities, the blurring of linguistic lines poses a challenge for research.
There are three main areas of consideration here: data accuracy, inclusivity and fraudulent activity. All three need to be weighed carefully to decide whether respondents who use an auto-translate function while taking a survey should be permitted to participate or excluded from data collection.
Data Accuracy: In research, we need every respondent to understand a question in exactly the same way, and the nuanced wording of questions makes a real difference here. If you’ve ever used an auto-translator, you know that the quality of translations is still far from perfect.
For researchers, the inaccuracies introduced by auto-translation can seriously undermine our ability to compare responses. Auto-translation can also introduce skew by substituting ‘loaded’ language for the original text’s carefully crafted, neutral wording. This can bias respondents into answering in ways they would not have in the survey’s original language.
Inclusivity: Some people keep an auto-translate function activated on their device because they need it to navigate day-to-day life. Excluding these respondents may put the representativity of our sample at risk: we would be screening out people from certain backgrounds, whose demographics, socioeconomic circumstances, behaviours and attitudes may differ from those of native speakers of the survey’s language, and we would fall short in representing key sub-populations in the market we’re researching.
Fraud: Some ill-intentioned respondents enter surveys fielded in foreign countries and use auto-translators to understand them, even though they don’t speak the language. In doing so, they can often fly under the radar by selecting semi-logical answers that pass basic in-survey quality checks, such as red-herring or trap questions.
Excluding respondents from a survey needs to be done with care and deliberation. In a world where we are competing for respondents’ attention and time, we cannot afford to exclude respondents unless their inclusion would put the quality or integrity of our data at risk. In this case, the risks of misunderstood questions and fraudulent respondents outweigh the risk of reduced representativity and inclusion.
Respondents who have an auto-translate tool activated on their device should therefore be prevented from entering surveys, both to protect data accuracy and to ensure that participants are genuine. Where full population representation is required, surveys should be scripted in several languages for the same market so that key sub-populations are included.
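Screening out auto-translated sessions is typically done client-side before a respondent enters the survey. The sketch below is illustrative only, not any panel vendor’s actual implementation: it assumes the survey page can inspect its own DOM, and it relies on the commonly observed behaviour that Chrome’s built-in Google Translate adds a “translated-ltr” or “translated-rtl” class to the `<html>` element. The function name, the sentinel-text heuristic, and the check order are all assumptions for the sake of the example.

```typescript
interface TranslationCheck {
  translated: boolean;
  reason: string | null;
}

// Pure helper so the logic can be unit-tested outside a browser.
// In a live survey, the arguments would come from
// document.documentElement.classList, document.documentElement.lang,
// and a visually hidden sentinel <span> scripted into the page.
function detectAutoTranslate(
  htmlClasses: string[],     // classes on <html> at check time
  declaredLang: string,      // lang attribute the survey was served with
  observedLang: string,      // lang attribute at check time
  sentinelOriginal: string,  // sentinel text as originally scripted
  sentinelObserved: string   // the same node's text at check time
): TranslationCheck {
  // Heuristic 1: Chrome's page translator is widely observed to tag <html>.
  if (htmlClasses.some(c => c === "translated-ltr" || c === "translated-rtl")) {
    return { translated: true, reason: "translator class on <html>" };
  }
  // Heuristic 2: the document's language attribute no longer matches
  // the language the survey was scripted in (compare primary subtags).
  if (
    declaredLang &&
    observedLang &&
    declaredLang.split("-")[0] !== observedLang.split("-")[0]
  ) {
    return { translated: true, reason: "document lang changed" };
  }
  // Heuristic 3: a hidden sentinel phrase that reads differently after
  // translation catches tools that touch neither classes nor attributes.
  if (sentinelOriginal !== sentinelObserved) {
    return { translated: true, reason: "sentinel text altered" };
  }
  return { translated: false, reason: null };
}
```

A positive result would route the respondent to a polite screen-out page rather than silently discarding their data, in keeping with the care around exclusion discussed above.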
Technology is a fantastic enabler that has transformed the speed, agility and scalability of insights, providing businesses with intelligence that helps inform and shape their strategies. But we must be conscious of how these advancements may affect the quality of our insights. Research should never lose its fundamental benefit to organisations: providing information for direction and decision-making. Only good-quality data can do that effectively.