When you design user studies, user experiments, and user surveys (a.k.a. user research), you need to sample from the user population, that is, to select real users.
Selecting real users is often considered vital for ensuring the validity, usefulness, and applicability of user research.
But what are real users?
Real users are participants drawn from the specific target population.
- Target Population: The target population is the entire group of people a researcher is interested in studying and analyzing, often related to a system or phenomenon.
- Sample: Real users are selected from the target population using one or more sampling techniques and become the sample used for the study. The sample is the set of people who participate in the investigation and represent the target population.
- Sampling: Sampling is the process of determining which participants are taken from the target population for the study. It often relies on a sampling frame (e.g., a list of individuals in the target population). The method used to sample a larger population depends on the analysis and may include, for example, random, stratified, or systematic sampling.
- Users: The people taking part in a user study are referred to as participants. So, with respect to the study, the terms ‘user’ and ‘participant’ refer to the same actual people: in the ‘user’ role, people use the technology, and in the ‘participant’ role, they engage in the study.
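The three sampling techniques named above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the cited work: the function names, the toy sampling frame, and the `role` attribute used for stratification are all hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=0):
    """Draw n people uniformly at random, without replacement."""
    return random.Random(seed).sample(frame, n)

def systematic_sample(frame, n, seed=0):
    """Take every k-th entry of the frame after a random start."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds the size of the frame")
    start = random.Random(seed).randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(frame, n, key, seed=0):
    """Sample each stratum in proportion to its share of the frame.

    Quotas are rounded, so the total can differ slightly from n
    when the proportions do not divide evenly.
    """
    rng = random.Random(seed)
    strata = {}
    for person in frame:
        strata.setdefault(key(person), []).append(person)
    sample = []
    for members in strata.values():
        quota = round(n * len(members) / len(frame))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample

# Hypothetical sampling frame: 100 users, of whom 20 are experts.
frame = [{"id": i, "role": "expert" if i % 5 == 0 else "novice"}
         for i in range(100)]

random_users = simple_random_sample(frame, 10)
systematic_users = systematic_sample(frame, 10)
stratified_users = stratified_sample(frame, 10, key=lambda p: p["role"])
```

With this toy frame, the stratified draw keeps the 80/20 split between novices and experts, whereas the simple random draw only matches it on average. Which technique fits depends on the analysis, as noted above.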
However, do you really need real users for user research?
While many researchers consider sampling real users essential for a study’s validity and practical impact, this position has long been challenged by the heavy use of, for example, students as participants. It is increasingly challenged by discussions of using AI as ‘simulated’ (a.k.a. fake) users, including AI-generated survey respondents.
Even given these issues, prior work discusses three foundational problems (FP) of not sampling real users in user research.
FP1 – Validity: Regarding how well findings can be generalized to other situations, user studies that do not employ real users face external validity issues. If the participants do not belong to the target population, they may lack the motivation, ability, or expertise to give valid responses. Although the employment of real users does not necessarily address the potential lack of motivation, it does ensure a sufficient degree of domain expertise for external and ecological validity (i.e., how well the results predict behavior in real life).
FP2 – Usefulness: While scientific validity is aimed at the accuracy and precision of results, there is a quintessential question underlying the employment of the research findings: Are the results useful for researchers and practitioners? Usefulness, or accuracy, is unlikely to be achieved when using surrogate users, as the needs and problems discovered may not match those faced by the intended users of the technology. Therefore, regardless of sampling validity, surrogate users are unlikely to provide feedback as helpful as input drawn from real users.
FP3 – Applicability: When real users are not included in the sample, researchers may miss valuable insights for improving the technology. This is related to the issue that the sample may not represent the underlying population that will employ the technology. Since surrogate users lack intimate knowledge of the domain, subject matter, or problem space, they are unlikely to provide feedback that would foreseeably lead to new features and functionalities addressing an impactful problem for the targeted population.
Are real users necessary for user research? Probably.
Will AI ‘simulated’ users play an increasingly important role? Probably.
For more on this topic, see:
Salminen, J., Jung, S.G., Kamel, A., Froneman, W., and Jansen, B. J. (2022). PeerJ Computer Science. 8:e1136 https://doi.org/10.7717/peerj-cs.1136
Jansen, B. J., Jung, S.G., and Salminen, J. (2023). Employing Large Language Models in Survey Research. Natural Language Processing Journal. 4, 100020.