Methods for lab studies

Lab (laboratory) studies are conducted in fixed locations, typically on the researchers’ premises, in contrast to authentic contexts of use. Examples include usability labs, meeting rooms, quiet rooms designed for controlled experiments, and simulators.


2DES

Self-report measurement of the emotion expressed by a stimulus. 2DES is a computer program used to collect continuous ratings from study participants.

AXE (Anticipated eXperience Evaluation)

AXE is a qualitative method that gives an initial perspective on the user experience of a product or service. It involves individual users in an interview setting. The method uses visual stimuli to make evaluation participants imagine a use situation and to reveal their attitudes, use practices and valuations. AXE is both an evaluative method and a method for collecting suggestions for improvement. The results connect perceived product attributes with different dimensions of user experience.

Aesthetics scale

Developed by Lavie and Tractinsky to measure perceived aesthetic quality, in particular of websites. They conducted four studies in order to develop a measurement instrument of perceived web site aesthetics. Using exploratory and confirmatory factor analyses they found that users' perceptions consist of two main dimensions, which were termed "classical aesthetics" and "expressive aesthetics".

Affect Grid

Affect Grid is a scale designed as a quick means of assessing affect along the dimensions of pleasure-displeasure and arousal-sleepiness.
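As a minimal sketch of how a single Affect Grid response could be turned into two scores, the Python snippet below assumes the commonly described 9 x 9 layout, with pleasure running left to right and arousal running bottom to top. The grid size, the orientation, and the function name `score_affect_grid` are assumptions made for illustration; they are not taken from this toolkit description.

```python
GRID_SIZE = 9  # assumption: a 9 x 9 grid; adjust if the instrument used differs


def score_affect_grid(column: int, row_from_top: int) -> tuple[int, int]:
    """Convert the marked cell into (pleasure, arousal) scores.

    column       -- 1 (far left, displeasure) to GRID_SIZE (far right, pleasure)
    row_from_top -- 1 (top row, high arousal) to GRID_SIZE (bottom row, sleepiness)
    """
    if not (1 <= column <= GRID_SIZE and 1 <= row_from_top <= GRID_SIZE):
        raise ValueError("cell lies outside the grid")
    pleasure = column
    # Rows are counted from the top, so invert them to make higher mean more aroused.
    arousal = GRID_SIZE + 1 - row_from_top
    return pleasure, arousal


# A mark in the centre cell yields the neutral midpoint on both dimensions.
print(score_affect_grid(5, 5))
```

Because the participant marks only one cell, both dimensions are captured in a single response, which is what makes the scale quick to administer.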


AttrakDiff

Assess the user's feelings about the system with a questionnaire. In the AttrakDiff questionnaire, both hedonic and pragmatic dimensions of UX are studied with semantic differentials.


Two friends explore a product or concept together and discuss it (with or without a moderator). Video recording is used especially when no moderator is present.

Contextual Laddering

A one-to-one interviewing technique (qualitative data gathering) combined with a quantitative data analysis technique. Preferably conducted in context.

Controlled observation

Individual participants are invited to a controlled environment (not the real context) to test, for example, the colors or audio of the system. The aim is to gain insight into design details that would be hard to test in real contexts (e.g. by controlling lighting conditions or background noise).

Differential Emotions Scale (DES)

The Differential Emotions Scale (DES) is a standardized instrument that reliably divides the individual's description of emotion experience into validated, discrete categories of emotion. The DES was formulated to gauge the emotional state of individuals at the specific point in time when they are responding to the instrument.

Theoretical background: Izard (1972, 1977).


Emo2

Emo2 is an instrument for the measurement of emotion during product use. Most standard tools for the measurement of emotion provide an overall rating along one or two dimensions or half a dozen basic emotions. Design-oriented tools (most notably PrEmo) overcome this limitation but focus on sensory experience after static exposure to a product. We do not know of any other tool designed to measure emotion over time, during interaction with a product, while providing rich feedback to designers. Self-confrontation allows the collection of extended data on the user experience without interfering with the interaction.


Emocards

Emocards provide a non-verbal method for users to self-report their emotions. They are presented as flash cards or on a single sheet of paper.


Emofaces

Emotional responses elicited by consumer products are difficult to verbalize because their nature is subtle (low intensity) and often mixed (i.e. more than one emotional response at the same time). As a result, these emotions are difficult to measure with verbal questionnaires. Instead of relying on the use of words, respondents can report their emotions with the use of cartoon drawings of facial expressions. The Emofaces can be used in internet surveys, formal interviews, and qualitative interviews.


Emoscope

This method focuses on the known divergence between what users do and what they say they do, between judgments and facts. The main problem is that we cannot draw conclusions about the use of a product from the opinions of its users. We are interested in the visceral and behavioural levels of human conduct, and we maintain that the reflective level, which is intellectually driven, must also be evaluated as another behavioural expression. It has to be interpreted through analysis, not taken as knowledge in its literal sense. The Emoscope groups together a set of techniques that aim to enrich the aspects of Emotional Usability when auditing the Experience of Use, from a double approach: intervention on the product (EmoTools) and on the design process (UseTherapist).

Emotion Cards

Emotion Cards provide a way for users to quickly document their emotions at a specific moment.

Emotion Sampling Device (ESD)

The Emotion Sampling Device is a series of questions leading to the emotion the user is experiencing as the result of an event. It is based on Cognitive Appraisal Theory (CAT) and the Emotional Appraisal System. It asks about the causes of the emotion, rather than about the emotion itself, to avoid the typical problems of verbal assessment of emotions.

Extended usability testing

Collects information about UX as a by-product of contextual inquiry or usability testing.


FaceReader

FaceReader is a tool for tracking the user's affective state during the use of products or software without resorting to self-report.

Facial EMG

Example study question: does natural motion behavior during gaming enhance positive UX, in terms of the intensity of the measured dimensions of emotional arousal and valence? The method was used here to compare two products.


Feeltrace

Feeltrace is a software tool for recording the perceived emotional content of a time-dependent stimulus (such as speech). It collects self-reports from observers as they watch or listen to the stimulus being evaluated.

Fun Toolkit

Fun Toolkit is a method for measuring fun with children aged 5 to 10. It comprises four special tools: a Smileyometer, a Funometer, an Again-Again Table, and a Fun Sorter. It also supports the idea of measuring remembrance and of using video footage to score engagement. Together the tools measure three dimensions of fun: expectations, engagement, and endurability.

Game experience questionnaire (GEQ)

The questionnaire consists of different modules: 1) core module - concerns actual experiences during game play; 2) social presence module - concerns gaming with others; 3) post-game module - concerns experiences once a player has stopped gaming.

Geneva Appraisal Questionnaire

The Geneva Appraisal Questionnaire (GAQ) can be used to assess, as much as is possible through recall and verbal report, the results of an individual's appraisal process in the case of a specific emotional episode (as based on Scherer's Component Process Model of Emotion). It is a tool for describing emotional experiences, not a description as such. The questionnaire is available in English, French, and German, together with information on its utilization.

Geneva Emotion Wheel

Based on Scherer's Component Process Model, the Geneva Emotion Research Group has developed this new instrument to obtain self-report of felt emotions elicited by events or objects.

Group-based expert walkthrough

This is a scenario-based usability inspection method that aims to identify usability problems, possible design improvements, and successful design solutions in a given user interface. The evaluations are conducted as group usability inspections and require no previous training of the evaluators, so the method supports evaluators who are not accustomed to usability inspections. The group-based expert walkthrough is particularly suited for early evaluations of applications specific to a particular work domain. The method is grounded on the assumption that usability problems and possible design improvements identified by work-domain experts acting as evaluators have a far higher impact on the subsequent development process than those identified by usability experts. Combined with other methods, such as probing material, it goes beyond the usability aspect and collects UX issues.

Hedonic Utility scale (HED/UT)

The HED/UT was developed to address both the hedonic aspects of product or website interaction and the utility and usability aspects. HED/UT is an attitude measure from the consumer behaviour literature and consists of 12 items measuring hedonic value and 12 items measuring utilitarian value of a service or concept.

Human Computer trust

The Human Computer Trust scale is a psychometric instrument specifically designed to measure human-computer trust. Both cognitive and affective components of trust are measured; the affective components are the strongest indicators of trust.

I.D. Tool

(From the ENGAGE Web site description:) I.D. Tool identifies the physical design attributes that a product needs in order to evoke the desired experience in its target customers. These are uncovered by mapping the user's mental reactions that create the immediate affective impressions of the product as well as the long-term opinions towards it.

Intrinsic motivation inventory (IMI)

The Intrinsic Motivation Inventory (IMI) is a multidimensional measurement device intended to assess participants' subjective experience related to a target activity in laboratory experiments. It has been used in several experiments related to intrinsic motivation and self-regulation.

Kansei Engineering Software

The software follows the Kansei Engineering procedure suggested by Schütte (2006). Kansei Engineering is a method for translating feelings and impressions into product parameters. The method was invented in the 1970s by Prof. Nagamachi at Kure University (now Hiroshima International University), who recognized that companies often want to quantify the customer's impression of their products. Kansei Engineering can "measure" these feelings and show how they correlate with certain product properties. Consequently, products can be designed to evoke the intended feeling.

It uses techniques such as the semantic differential technique (Osgood, 1957) and Quantification Theory Type I (Komazawa and Hayashi, 1976).

MAX – Method of Assessment of eXperience

MAX is a post-use method for evaluating the general experience through cards with an avatar and a board. MAX can be applied after the use of mockups, prototypes, interactive systems, or any artifact that a user can interact with. It has four categories, which are represented on the board by questions that guide the user through the evaluation: (a) Emotion: What did you feel when using it?; (b) Ease of Use: Was it easy to use?; (c) Usefulness: Was it useful?; and (d) Intention to Use: Would you wish to use it?

Mental effort

Zijlstra's mental effort scale is a quick and easy-to-use scale that helps to determine how much (perceived) mental effort was required to complete a task. Depending on the setting, product and task, and in combination with other measures, it helps to build a clearer picture of the overall quality of a product or service: too much mental effort is stressful, while too little is boring.

Mental mapping

Participants see or try out a design and then select a famous person or film that best describes the design. E.g. Sylvester Stallone or Fatal Attraction. Alternatively, participants may be asked to imagine the product as a person and make up some stories of its life.


Lab study: Combination of qualitative and quantitative measures for tasks.

Multiple Sorting Method

This method is a variation of the Repertory Grid Technique, reported elsewhere in this methods toolkit.


General tests of emotion or affect for evaluating consumer reactions to products and services, including user interfaces.

Paired comparison

A very easy technique for rank-ordering stimuli (products) with respect to some quality (e.g. enjoyment); it is also easy for children to do. It goes back to early (1920s) test and scale development techniques. Paired comparison data can be transformed into an ordering of the stimuli.
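One simple way to turn paired comparison data into an ordering is to count how often each stimulus was preferred. The Python sketch below illustrates that idea only; the function name `rank_from_pairs`, the data layout, and the tie-breaking rule are assumptions made for this example, and scaling models from the psychometric literature are not covered.

```python
from collections import Counter


def rank_from_pairs(stimuli, choices):
    """Order stimuli by the number of pairwise trials in which each was preferred.

    stimuli -- list of stimulus names, in presentation order
    choices -- dict mapping a (stimulus_a, stimulus_b) pair to the preferred one
    Ties are broken by presentation order (an arbitrary choice for this sketch).
    """
    wins = Counter({name: 0 for name in stimuli})
    for pair, winner in choices.items():
        if winner not in pair:
            raise ValueError(f"{winner!r} is not part of the pair {pair!r}")
        wins[winner] += 1
    return sorted(stimuli, key=lambda name: (-wins[name], stimuli.index(name)))


# Hypothetical data: one participant compares three game prototypes for enjoyment.
prototypes = ["A", "B", "C"]
preferred = {("A", "B"): "A", ("A", "C"): "C", ("B", "C"): "C"}
print(rank_from_pairs(prototypes, preferred))  # ['C', 'A', 'B']
```

With several participants, the same counting can be run over the pooled choices, as long as every pair is presented equally often.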

Perceived Comfort Assessment

A scale for assessing the comfort of car seats. The method description includes the steps for developing the scale, which are applicable to various other domains as well.

Perspective-Based Inspection

A team of people with different perspectives evaluates a product.
Perspectives can include aesthetics, fun, comfort, and other aspects of user experience.

Physiological arousal via electrodermal activity

The method utilises physiological arousal as an indicator of involvement and emotional state (arousal, not valence). It is also meant to capture unconscious processes. The measure is unobtrusive, non-distracting, continuous, and taken in situ.

Playability heuristics

Playability heuristics evaluate the playability aspect within games. Apart from usability problems, the heuristics can reveal the experiential aspects of game play.

Positive and Negative Affect Schedule (PANAS)

The Positive and Negative Affect Schedule (PANAS) is a psychometric scale developed to measure the largely independent constructs of positive and negative affect, both as states and as traits. Positive and negative affect have been shown to relate to other personality states and traits, such as anxiety. PANAS was originally developed for more clinical settings, but is now also used in evaluation studies in which the moods of users might be affected (e.g. studies of the effects of lighting or of entertainment content).


PrEmo

Emotional responses elicited by consumer products might be difficult to verbalize because their nature is subtle (low intensity) and often mixed (i.e. more than one emotional response at the same time). As a result, emotional responses to products might be difficult to measure with verbal questionnaires. Instead of relying on the use of words, respondents can report their emotions with the use of expressive cartoon animations. In PrEmo, 14 emotions are portrayed by an animation of dynamic facial, bodily, and vocal expressions.

Presence questionnaire

Presence is defined as the subjective experience of being in one place or environment, even when one is physically situated in another. The authors of the scale state that presence is a normal awareness phenomenon that requires directed attention and is based on the interaction between sensory stimulation, environmental factors that encourage involvement and enable immersion, and internal tendencies to become involved. The focus of the Presence Questionnaire (PQ) is to measure presence in virtual environments and games. In addition, the Immersive Tendencies Questionnaire (ITQ) was developed to measure differences in the tendencies of individuals to experience presence.

Private camera conversation

To avoid interviewer bias, the participant goes to a booth and talks to the camera about the topics given to her/him. Video recording may bring out more hedonic aspects than an interview, because participants want to appear rational in front of an interviewer.

Product Personality Assignment

Participants are given a selection of product designs and a questionnaire listing different personalities (from the Briggs-Myers list, e.g. "sensible", "friendly") that they assign to the designs. They are also asked to give the reasons for their selections.

Product Semantic Analysis (PSA)

A semantic scale that is built for each evaluation case separately via user interviews and by using product semantics as the theoretical basis.

Property checklists

A structured way to do expert evaluation: the expert goes through a checklist of design goals for different product properties (form, colour, materials, graphics, sounds, functionality, interaction design).

Psychophysiological measurements

Physiological signals such as heart rate, skin perspiration, and facial muscle activity indicate the emotional state of the user. The physiological reactions are recorded with sensors attached to the participant. This objective data can be used in combination with self-report data to find out what the user experienced.

QSA GQM questionnaires

Because motivation is acknowledged to be one of the several aspects of UX, the QSA-GQM technique measures people's intrinsic motivation for knowledge acquisition.

Reaction checklists

After using the evaluated system, the participant is given a list of possible reactions to it, e.g. “the phone feels good in the hand”, “I feel proud when others see me with the phone”.
The method is most suitable for collecting initial responses to a product.

Repertory Grid Technique (RGT)

RGT is a technique for eliciting and evaluating people's subjective experiences of interacting with technology, through the individual way they construe the meanings of the members of the set of artifacts under investigation. It thus attempts to capture how users experience things and what the experience means for them, and it covers both emotionally based constructs (warm-cold) and more “rational” ones (professional-popular). Kelly suggested the Repertory Grid Technique (RGT) as a methodological extension of his Personal Construct Theory (Kelly, 1955). Kelly argued that we make sense of our world through our own ‘construing' of it. That is, we tend to model what we find in the world according to a number of personal constructs, which are bipolar in nature. According to Kelly, a ‘construct' is a single dimension of meaning for a person allowing two phenomena to be seen as similar and thereby as different from a third (Bannister & Fransella, 1985).

Resonance testing

Resonance testing is a method to validate product concepts against a set of experiential goals. It is primarily intended for use with physical products.


SUMI (Software Usability Measurement Inventory)

Satisfaction has long been part of the classic definition of usability. SUMI (Software Usability Measurement Inventory) was developed to provide an authoritative, standardised measurement of user satisfaction with software. It can be used for the evaluation and comparison of products (or versions of a product) and to set and track verifiable targets regarding satisfaction. SUMI is a classical Likert-type measure of attitude toward a software package. The questionnaire comprises five subscales: efficiency, affect, helpfulness, control and learnability. SUMI analysis also provides a “global” satisfaction score.

Self Assessment Manikin (SAM)

SAM is an emotion assessment tool that uses graphic scales, depicting cartoon characters expressing three emotion elements: pleasure, arousal and dominance. SAM has been used often in evaluations of advertisements, and increasingly also in evaluations of products. SAM is based on the PAD emotion model of Mehrabian.

Sensual Evaluation Instrument

Users handle differently shaped objects during a usability test to express how they feel. After the test, an interview is conducted to interpret the results.

Sentence Completion

After using a system, a participant is handed a set of sentence beginnings, which she then completes. The beginnings prompt the user to think about the experiential aspects of product use, e.g. "When I use this product, I feel myself…", or "The appearance of this product is…".

TRUE (Tracking Realtime User Experience)

Game software is instrumented to track behavior.
Users report their reactions throughout the test.
The live video recording is indexed to events.
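The idea of indexing a video recording to logged events can be sketched as follows. This is a hypothetical illustration in Python, not the actual TRUE instrumentation: the `Event` record, its fields, and the `index_video` function are names invented for this example, and it assumes that behavior logs and self-reports share one session clock with the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp_s: float  # seconds on the shared session clock (assumption)
    kind: str           # "behavior" (logged by the game) or "self_report" (from the user)
    detail: str


def index_video(events, video_start_s):
    """Return (offset_into_video_s, event) pairs in chronological order.

    Events logged before the recording started are dropped, because they have
    no matching position in the video.
    """
    indexed = [
        (event.timestamp_s - video_start_s, event)
        for event in events
        if event.timestamp_s >= video_start_s
    ]
    return sorted(indexed, key=lambda pair: pair[0])


# Hypothetical session: the recording starts 2 seconds after the session clock.
log = [
    Event(12.0, "behavior", "player_died"),
    Event(5.0, "self_report", "frustrated"),
    Event(1.0, "behavior", "menu_opened"),
]
for offset, event in index_video(log, video_start_s=2.0):
    print(f"{offset:6.1f}s  {event.kind:12s} {event.detail}")
```

With such an index, an analyst can jump straight to the moment in the recording where a logged behavior or a reported reaction occurred.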


Pairwise comparison testing aimed at young children (preschoolers).


UTAUT (Unified Theory of Acceptance and Use of Technology)

UTAUT is based on the Technology Acceptance Model (TAM) but addresses some of TAM's shortcomings; it now also includes more or less affective aspects.

UX Curve

The UX Curve method aims to assist users in retrospectively reporting how and why their experience with a product has changed over time.

UX Expert evaluation

UX experts draw on their knowledge of users and UX theories to evaluate the UX of a system.

UX laddering

UX Laddering is an interview method and data analysis process for investigating the user experience, adapted from Laddering in consumer research and based upon Means-end Theory. The goal of laddering, as with all means-end approaches, is to identify and understand the linkages between key perceptual elements across the range of attributes, consequences and values. UX Laddering therefore helps researchers and designers understand how concrete product attributes benefit the personal values of end users.

Valence method

The valence method is based on the user experience model of Hassenzahl (2008), which defines user experience as a primary evaluative feeling during the usage of a product or service. A positive user experience is the consequence of fulfilling human needs. The valence method captures positive and negative feelings (valence) during the explorative usage of an interactive product. In a subsequent retrospective interview phase, users indicate for each instance of a positive or negative feeling (valence marker) the product design aspects inducing it. This phase further employs the laddering interview technique [Reynolds & Gutman, 1988] to reveal the personal meaning of product design aspects to the user and the underlying fulfilled or frustrated needs. The generated information helps designers to understand and optimize the user experience potential of a product.

WAMMI (Website Analysis and Measurement Inventory)

WAMMI (1) measures the user experience of a website based on visitors' reactions, using a 20-item questionnaire; (2) benchmarks the website against other websites in an international standardized database; (3) generates objective data for management in an easy-to-read hypertext report; (4) analyses qualitative comments and reactions to the website from visitors; and (5) interprets quantitative and qualitative data to determine what to improve and how much to invest.

Workshops + probe interviews

After conducting exploratory user research with design probes, invite the same participants first to 'validate' your analysis and then to experience and give feedback on early prototypes in a group session.


iScale

iScale is a survey tool for the retrospective elicitation of longitudinal user experience data.