The first step in the Evaluate phase is to agree the success criteria, which should be based on the wants & needs of users and other stakeholders that were identified within the Explore phase. These criteria are then used to plan and summarise the user testing and expert appraisals.
This page describes the activities within the Evaluate phase of the Inclusive Design Wheel, and explains how they apply to inclusive design of transport services.
Agree success criteria
It is important to consider the question ‘what does success look like?’ for an inclusive transport service. It may be helpful to use the framework shown opposite to ensure that a wide range of success criteria is considered, covering:
- People criteria: user experience and social impact.
- Profit criteria: costs and revenues, technical risk and commercial business risk.
- Planet criteria: depletion of scarce resources, energy use and waste impacts.
Furthermore, the success criteria should be holistic with respect to the whole life-cycle of the service, which can be described as follows:
- Define strategy
- Design & Trial
- Ramp-up & Promote
- Maintain & Improve
- Ramp-down & Decommission
The success criteria should be specific and measurable, and should represent the wants & needs that were identified in the Explore phase. For example, the proportion of passengers using wheelchairs relates to the need for people with wheelchairs to board the bus independently. This is an example of a user-experience-related criterion that could be measured throughout the life-cycle of a more accessible bus service.
Gather user feedback
User feedback can be obtained at different levels of formality and detail, using methods such as:
- Contextual enquiry
- Focus groups
- User trials
Within the Evaluate phase, these methods might be used to gain insights on the proposed design solutions, which may be embodied through storyboards or prototypes. Note that many of the same methods might have been used in the Explore phase to help understand how users behave with existing solutions.
Initial user feedback may be obtained within a co-creation workshop, as part of the co-design process. See the page about co-design for more detail. Whenever users are involved, the page about ethical considerations will be relevant (e.g. consent, privacy and GDPR).
Later in the project, formal user trials should be planned to objectively compare the performance of the proposed solution with an existing benchmark.
Whenever user feedback is gathered, it’s important to consider whether the stakeholders who are planning or conducting the activities have a vested interest in achieving a particular outcome, and whether this could skew the results. The output from Understanding user diversity (within the Explore phase) should be used to inform the sampling for the user evaluation. The design of the experiment should also be independently reviewed to identify and mitigate any potential bias.
Example of user feedback from one of the DIGNITY pilot projects (Ancona). More examples like this are available within the Inclusive design log for transport.
- James Hom’s Usability Methods Toolbox website has a section on Usability testing which provides more information on objective user tests.
- The Usability Net website has an entry on Performance testing which also gives more information about usability testing.
Gather expert feedback
In this activity, a range of relevant experts use their knowledge and judgement to systematically evaluate concepts. The purpose of formative evaluation is to identify potential issues and give recommendations for how the concept could be improved. Conversely, the purpose of summative evaluation is to objectively compare and rank different concepts.
The concepts may be evaluated from many different perspectives, which could include user experience, accessibility, operational feasibility, technical risk, and financial or environmental considerations. It’s important to consider whether the experts who are performing these judgements have a vested interest in achieving a particular outcome, and whether this could skew their objectivity.
From an accessibility perspective, expert appraisals might include an access audit and / or a website accessibility audit. The capability loss simulation tools available on this site can assist with expert appraisals, helping experts to communicate how capability loss may affect interactions with products or services. These include the Cambridge simulation gloves and glasses. Furthermore, the Clari-Fi tool provides a way of evaluating the visual accessibility of text and icons within apps and websites that are displayed on mobile phones.
This made-up example was produced by the University of Cambridge, Engineering Design Centre, to show visual accessibility issues that we commonly find during our consultancy services. This example is described in more detail within the Inclusive design log for transport.
- James Hom’s Usability Methods Toolbox website describes various ‘inspection’ methods that can be used by experts to help them inspect and evaluate a concept or product in a systematic way.
- The Usability Net website has a section on Heuristic evaluation. This is a particularly popular method of expert appraisal, in which the product is evaluated against established guidelines or principles.
It can be helpful to identify how many people would be excluded from using a product or service on the basis of:
- Their capabilities, such as: Vision, Hearing, Thinking, Reach & Dexterity, and Mobility.
- The technology access requirements, such as installing an app, or accessing a website.
- The technology competence required to successfully interact with the product or service.
- Other exclusionary factors, such as language, gender, and age.
At the time of writing in November 2022, the DIGNITY datasets were the most comprehensive available for covering all of these aspects. Of these datasets, the German dataset had the most robust sampling, and the biggest sample size (1010 participants). The questionnaire for the German survey, and an SPSS file of the corresponding participant responses are available from (UPCommons,
The exclusion estimates shown opposite were derived from the DIGNITY German dataset, while appropriately avoiding double counting of participants excluded. The authors of this toolkit can perform these kinds of calculations as part of our consultancy services.
Exclusion estimates for a service that requires the user to:
- Have a mobile phone - Exclusion = 2.1%
- Access a website that has been designed to work on both mobile and desktop - Exclusion = 11.3%
- Access a website while out and about - Exclusion = 18.3%
- Access a website on a desktop or laptop computer (i.e., via a website that has been designed for desktop, but has poor experience on mobile) - Exclusion = 17.5%
- Install a smart phone app that requires an internet connection - Exclusion = 47.4%
- Have previous experience with a mapping application - Exclusion = 58.7%
Exclusion values are expressed as a weighted percentage of valid responses for the DIGNITY German survey. These exclusion estimates are described in more detail within the Inclusive design log for transport.
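The following Python sketch illustrates the general idea behind a weighted exclusion estimate that avoids double counting. It is not the calculation used to produce the figures above. The field names (`weight`, `has_mobile_phone`, `can_access_website`) and the sample data are hypothetical and do not correspond to variables in the DIGNITY dataset. It also assumes that a respondent with a missing answer to any required question is left out of the valid responses.

```python
from typing import Mapping, Sequence


def weighted_exclusion(respondents: Sequence[Mapping], requirements: Sequence[str]) -> float:
    """Return the weighted percentage of valid respondents who fail at least one requirement."""
    valid_weight = 0.0
    excluded_weight = 0.0
    for person in respondents:
        answers = [person.get(name) for name in requirements]
        # Assumption: a missing answer (None) makes the response invalid for this estimate.
        if any(answer is None for answer in answers):
            continue
        valid_weight += person["weight"]
        # A respondent is counted once, however many requirements they fail.
        if not all(answers):
            excluded_weight += person["weight"]
    if valid_weight == 0:
        raise ValueError("no valid responses for these requirements")
    return 100.0 * excluded_weight / valid_weight


# Hypothetical toy data: survey weight plus one True/False/None answer per requirement.
sample = [
    {"weight": 1.0, "has_mobile_phone": True, "can_access_website": True},
    {"weight": 2.0, "has_mobile_phone": False, "can_access_website": False},
    {"weight": 1.0, "has_mobile_phone": True, "can_access_website": False},
    {"weight": 1.0, "has_mobile_phone": None, "can_access_website": True},
]

both = weighted_exclusion(sample, ["has_mobile_phone", "can_access_website"])
phone_only = weighted_exclusion(sample, ["has_mobile_phone"])
web_only = weighted_exclusion(sample, ["can_access_website"])
```

With this toy sample, requiring both capabilities gives 75.0, while the single-requirement estimates are 50.0 and 60.0. Adding the two single estimates would count the second respondent twice, which is why the combined estimate is computed per respondent and not by summing percentages.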
- Keates and Clarkson (2003)’s book ‘Countering design exclusion: An introduction to inclusive design’ provides more background on exclusion calculations. (Published by Springer)
- A paper by Goodman-Deane et al. (2011): ‘Estimating exclusion: a tool to help designers’ gives more recent information on exclusion calculations. (Published in ‘Proceedings of Include 2011’.)
Identify strengths & weaknesses
This activity involves reviewing, summarising and presenting the findings from the evaluation activities. It can be helpful to consider the strengths and weaknesses of the proposed solution, in comparison to a defined benchmark. The performance indicator framework that was introduced within Agree success criteria can be further summarised by considering the following ‘CXO’ perspectives:
We recommend choosing a benchmark to compare proposed solutions against. This is helpful because it is easier to judge whether something is ‘better’ or ‘worse’ than something else, instead of assessing whether something is ‘good’ or ‘bad’. The benchmark could be an existing transport service, website or app.
The relative importance of the different ‘CXO’ perspectives will depend on the objectives of the project. From the most important perspectives, the proposed solution ought to perform ‘a lot better’ than the chosen benchmark. From other perspectives, it may be acceptable for the performance of the proposed solution to be ‘about the same’ as the chosen benchmark. It is rarely acceptable for the proposed solution to perform worse than the benchmark, for any of the ‘CXO’ perspectives.
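As a rough illustration of the benchmark comparison described above, the Python sketch below turns a measured value for one success criterion into a rating relative to the benchmark. The thresholds (`tolerance`, `big_gain`), the intermediate ‘better’ rating and the function names are assumptions made for this example. A real project would need to agree its own thresholds for each criterion.

```python
def rate_against_benchmark(proposed, benchmark, higher_is_better=True,
                           tolerance=0.05, big_gain=0.25):
    """Rate a proposed solution against a benchmark for one measurable criterion."""
    if benchmark == 0:
        raise ValueError("benchmark must be non-zero for a relative comparison")
    # Relative change, flipped when a lower value is the better outcome (e.g. journey time).
    change = (proposed - benchmark) / abs(benchmark)
    if not higher_is_better:
        change = -change
    if change >= big_gain:
        return "a lot better"
    if change > tolerance:
        return "better"
    if change >= -tolerance:
        return "about the same"
    return "worse"


def is_acceptable(rating, is_priority):
    """Priority perspectives should be 'a lot better'; others should not be 'worse'."""
    if is_priority:
        return rating == "a lot better"
    return rating != "worse"


# Hypothetical example: share of wheelchair users boarding independently (higher is better).
boarding_rating = rate_against_benchmark(0.9, 0.6)
# Hypothetical example: cost per journey (lower is better).
cost_rating = rate_against_benchmark(100, 102, higher_is_better=False)
```

Here `boarding_rating` is ‘a lot better’ and `cost_rating` is ‘about the same’, so both would pass `is_acceptable` if boarding is treated as the priority perspective and cost is not.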
Summarising the strengths and weaknesses in this way may identify:
- Uncertainties regarding the understanding of users
- Opportunities for refining or improving the ideas
- Gaps in the evidence to support the solutions, from one or more different perspectives
Each of these may be resolved by planning further activities within Explore, Create and Evaluate.