Total Survey Error
01.02.2021

Referring to the Total Survey Error (TSE) framework when designing and monitoring translation workflows for questionnaires

by Pisana Ferrari – cApStAn Ambassador to the Global Village

Advances in survey methodology research have brought benefits to data reliability and validity, especially in 3MC (Multinational, Multiregional, and Multicultural Contexts) surveys. "Total Survey Error" (TSE) is an example of a paradigm that provides the conceptual framework for optimizing surveys by maximizing data quality within budgetary constraints. This framework links the different steps of a survey (design, collection, and estimation) to their potential sources of error (e.g., measurement, processing, coverage, sampling and non-response errors), helps organize and identify error sources, and estimates their relative magnitude (Biemer, 2010). It has mostly been used in the design and evaluation of one-country surveys, but a more recent, adapted TSE framework for survey research in cross-cultural contexts addresses challenges that are unique to, or more prominent in, 3MC surveys (Pennell et al., 2017).

Not all error sources are known, some defy quantification, and some become important because of the specific study aim, such as translation errors in cross-cultural studies. These studies require a design that ensures the multiple language versions meet the highest standards of linguistic, cultural and functional equivalence. As far back as 1989, authoritative literature (Groves, "Survey errors and survey costs") found that early identification of potential translation and adaptation (localisation) issues could contribute to minimising the TSE and thus avoid a disadvantageous "trade-off between error and costs".

At cApStAn, in our 20+ years of experience translating and adapting questionnaire items for some of the world's most prestigious organisations and companies (e.g. the OECD's PISA and PIAAC, the IEA's PIRLS and TIMSS, the European Social Survey), we have seen that it is best to address possible ambiguities in meaning, idioms and colloquialisms, and any other potential translation issues, including cultural ones, very early on. That is why we make a strong case for involving linguists in the initial stages of test or survey design: relatively low investments "upstream" can lead to very productive "downstream" outcomes.

Where do LQA and LQC fit into the TSE framework?

In the TSE framework, different languages and cultures are among the contextual factors that can lead to measurement errors in international surveys. Contextual factors include local survey expertise, survey infrastructure, available human resources, and survey environment and tradition, but also population mobility, the number of languages and dialects, and cultural aspects that can affect response and survey environment (collectivism, privacy concerns, masculinity). Including Linguistic Quality Assurance (LQA) and Linguistic Quality Control (LQC, or verification) methods in a 3MC survey design decreases the response bias and other measurement errors caused by poor questionnaire design.

Table 1 – from Groves et al., 2004

cApStAn’s modular approach to LQA and LQC

A. Pre-translation

Our "translatability assessment" (TA) is a tried and tested method to "optimize" the master version of an assessment or survey questionnaire before the actual translation and adaptation process begins. In cApStAn's TA process, each draft source item is checked against a set of 14 translatability categories to help identify potential translation, adaptation or cultural hurdles. This is one of our flagship services, and it is applied in the OECD's PISA.
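The per-item screening step can be pictured as a simple loop over draft items and categories. The category names and detection heuristics below are invented for illustration; cApStAn's actual 14 translatability categories are not reproduced here, and real assessments rely on linguists' judgment rather than pattern matching.

```python
import re

# Illustrative translatability categories with naive heuristic checks.
# Both the category names and the rules are assumptions for this sketch.
CHECKS = {
    "idiom": re.compile(r"\b(piece of cake|hit the books)\b", re.I),
    "double_negative": re.compile(r"\bnot\b.*\bno\b", re.I),
    "ambiguous_pronoun": re.compile(r"\bit\b.*\bit\b", re.I),
}

def screen_item(item_text):
    """Return the list of translatability categories flagged for one draft item."""
    return [cat for cat, pattern in CHECKS.items() if pattern.search(item_text)]

draft_items = [
    "Studying for this test was a piece of cake.",
    "How many hours per week do you read for enjoyment?",
]

for item in draft_items:
    flags = screen_item(item)
    print(item, "->", flags or "no issues flagged")
```

Flagged items would then go back to the item writers with suggestions for rewording before any translation starts.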

Pre-translation work can include organising workshops for item writers/question authors to raise awareness of translatability issues and to assist them in writing more "translatable" content (free of idioms, ambiguities and unnecessary complexities).

Focus groups and cognitive interviews can also be used to gain insights into the local community and experiences of the target population, which researchers alone may not be able to recognize.

Translation and adaptation notes, or item-by-item guidelines, can be prepared to help clarify the intended meaning of a source text concept, phrase, or term.

B. Translation

This is a vast module: cApStAn can help determine whether a complex double translation or a straightforward single translation process is required for a given project and what skills translators should have, or can propose hybrid approaches, including man-machine translation.

In the case of double translation, a reconciler merges translations 1 and 2 into a final version that retains the best elements of each. cApStAn has set up precise procedures for (team) adjudication, including preparation and follow-up. Senior cApStAn staff can assist in moderating adjudication meetings and in documenting outcomes. For specialized content, cApStAn may enlist bilingual subject matter experts (SMEs), coordinate their work with our linguists, and document it.

Advanced translation models include TRAPD (Translation, Review, Adjudication, Pre-testing, and Documentation; Harkness, 2003), which is similar to the process above: the two separate translations are both reviewed by a senior translator with domain expertise. During an adjudication meeting, the three linguists jointly produce a consensual version, discussing controversial points and documenting each decision. Cognitive debriefing (the P in TRAPD stands for Pre-testing) is organized with a partner organisation.

Hybrid man-machine workflows are those in which one or several neural machine translation (NMT) engines are called up to suggest a translation when the translation memory does not offer one. The translator accepts, edits or rejects the NMT suggestion, so translation quality assurance remains in human hands.
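The lookup order in such a hybrid workflow can be sketched as below. The translation-memory contents, the NMT stand-in, and the review callback are all placeholders for this sketch, not a real CAT-tool or engine API.

```python
# Sketch of a hybrid man-machine translation workflow (all components
# are stand-ins; a real workflow would call a CAT tool and NMT engine).

translation_memory = {
    "Strongly agree": "Tout à fait d'accord",  # previously approved segment
}

def nmt_suggest(segment):
    """Stand-in for a call to a neural machine translation engine."""
    return f"[NMT draft of: {segment}]"

def translate_segment(segment, human_review):
    # 1. An exact match in the translation memory wins outright.
    if segment in translation_memory:
        return translation_memory[segment], "TM"
    # 2. Otherwise an NMT engine proposes a draft...
    draft = nmt_suggest(segment)
    # 3. ...which the human translator accepts, edits or rejects.
    return human_review(draft), "NMT+human"

# A reviewer who accepts every draft unchanged, for demonstration:
text, origin = translate_segment("Strongly disagree", lambda draft: draft)
print(origin, "->", text)
```

The key design point is step 3: machine output never reaches the final questionnaire without passing through a human decision.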

C. Post-translation

Linguistic Quality Control (verification): cApStAn's systematic use of a list of categories that describe translation quality and equivalence issues helps report on translation quality in a standardized way and generate relevant statistics. A systematic review of verifier feedback always takes place. Verification deliverables combine the verifier's linguistic expertise and cultural sensitivity with the reviewer's thoroughness. Verifier training is an important part of our LQC process.
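Standardized category labels are what make verifier feedback countable across items and languages. A minimal sketch of how tagged interventions yield report statistics, with made-up category names and item IDs:

```python
from collections import Counter

# Hypothetical verifier interventions, each tagged with one category from
# a standardized list (category names and item IDs are examples only).
interventions = [
    {"item": "Q12", "category": "mistranslation"},
    {"item": "Q17", "category": "register/wording"},
    {"item": "Q22", "category": "mistranslation"},
    {"item": "Q30", "category": "adaptation"},
]

def category_statistics(records):
    """Count verifier interventions per category for a standardized report."""
    return Counter(record["category"] for record in records)

stats = category_statistics(interventions)
for category, count in stats.most_common():
    print(f"{category}: {count}")
```

Because every intervention carries a category, the same tally can be run per language version or per survey cycle to spot recurring problem areas.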

References

Groves, R. M. (1989). Survey errors and survey costs. New York: Wiley & Sons

Pennell, B.-E., Cibelli Hibben, K., Lyberg, L. E., Mohler, P. Ph., & Worku, F. (2017). A Total Survey Error Perspective on Surveys in Multinational, Multiregional, and Multicultural Contexts.

Biemer, P. P. (2010). Total survey error: Design, implementation, and evaluation. Public Opinion Quarterly, 74(5), 817–848.