
Original research
Measuring social norms related to handwashing: development and psychometric testing of measurement scales in a low-income urban setting in Abidjan, Côte d’Ivoire
  1. Maud Akissi Amon-Tanoh1,
  2. Maria Knight Lapinski2,
  3. Jim McCambridge3,
  4. Patrice Konan Blon4,
  5. Hermann Aka Kouamé4,
  6. George Ploubidis5,
  7. Patrick Nguipdop-Djomo1,
  8. Simon Cousens1
  1. 1Infectious Disease Epidemiology, Faculty of Epidemiology and Population Health, London School of Hygiene & Tropical Medicine, London, UK
  2. 2Department of Communication, Arts and Sciences, Michigan State University, East Lansing, Michigan, USA
  3. 3Department of Health Sciences, University of York, York, UK
  4. 4Koumassi, Abidjan, Côte d'Ivoire
  5. 5Social Research Institute, University College London, London, UK
  1. Correspondence to Dr Maud Akissi Amon-Tanoh; maud.amon{at}


Objectives To design and test the psychometric properties of four context-specific norm-related scales around handwashing with soap after toilet use: (1) perceived handwashing descriptive norms (HWDN); (2) perceived handwashing injunctive norms (HWIN); (3) perceived handwashing behaviour publicness (HWP); and (4) perceived handwashing outcome expectations (HWOE).

Design Scale items were developed based on previous work and pilot tested in an iterative process. Content experts and members of the study team assessed the face validity of the items. The psychometric properties of the scales were assessed in a cross-sectional study.

Setting The study was conducted in communal housing compounds in Abidjan, Côte d’Ivoire.

Participants A convenience sample of 201 adult residents (≥16 years old) from 60 housing compounds completed the final questionnaire.

Outcome measure Confirmatory factor analysis was used to assess the goodness of fit of the global model. We assessed the internal consistency of each scale using Cronbach’s alpha (α) and the Spearman-Brown coefficient (ρ).

Results The results of the psychometric tests supported the construct validity of three of the four scales, with no factor identified for the HWOE (α=0.15). The HWDN and HWP scales were internally consistent with correlations of ρ=0.74 and ρ=0.63, respectively. The HWIN scale appeared reliable (α=0.83).

Conclusion We were able to design three reliable context-specific handwashing norm-related scales, specific to economically disadvantaged community settings in Abidjan, Côte d’Ivoire, but failed to construct a reliable scale to measure outcome expectations around handwashing. The social desirability of handwashing and the narrow content area of social norms constructs relating to handwashing present significant challenges when designing items to measure such constructs. Future studies attempting to measure handwashing norm-related constructs will need to take this into account when developing such scales, and take care to adapt their scales to their study context.

  • community child health
  • preventive medicine
  • public health
  • epidemiology
  • Social norms
  • Hand disinfection
  • Psychometrics

Data availability statement

Data are available upon reasonable request. All data relevant to the study are included in the article or uploaded as supplementary information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:


Strengths and limitations of this study

  • We developed three context-specific scales, based on the theory of normative social behaviour, to measure handwashing-related norms in a low-income setting.

  • The scale response categories were designed to be context specific, with local expressions used to express varying degrees of endorsement.

  • When designing the scale items, we developed and used strategies, including negatively framing the items, to minimise the impact of social desirability bias attached to handwashing. Nevertheless, we cannot exclude the fact that such bias may have affected the results.

  • The scale items were extensively pilot tested to ensure the items were unambiguous, not redundant and that the overall length of the questionnaire was not burdensome to study participants.

  • The use of convenience sampling may limit the generalisability of our results.



Diarrhoeal diseases account for approximately 16% of deaths in children in the 1–59 months (ie, postneonatal) period worldwide.1 Handwashing with soap (HWWS) is considered one of the most cost-effective methods of preventing diarrhoeal diseases,2–4 and is also being promoted as a key intervention to prevent other infectious diseases such as COVID-19. However, the frequency of HWWS at key moments (eg, after faecal contacts and before food contacts) is low in many settings.5 Developing effective ways of improving HWWS practices is thus an important public health challenge.

Norms can be defined as the set of unwritten rules which govern behaviour and generate social expectations about the ‘proper’ way to behave in particular situations.6–8 Norms exist at both the collective and individual levels.9–12 Collective norms operate at the group, community or cultural level, and emerge through interactions among members within a social group or community.9–12 Individuals’ interpretations of collective norms are referred to as perceived norms.9 12–14 The latter exist at the individual, and thus psychological, level.9

Theory of normative social behaviour

The theory of normative social behaviour (TNSB) is concerned with the relationship between perceived descriptive norms (an individual’s perception of the prevalence of a given behaviour among their reference group)15 and behaviour.16 The TNSB hypothesises that this relationship is moderated by several norm-related constructs,9 14 17 including: perceptions of the injunctive norms (what people perceive ought to be done or how much they perceive others endorse or disapprove of a given behaviour),9 14 15 outcome expectations (the anticipation that engaging in a given behaviour will bring about benefits (or disadvantages) to oneself)9 18 and the perceived publicness of the behaviour (whether a behaviour is enacted in the private or public sphere and thus perceived to be open to scrutiny or not).9

Various studies have shown the potential for social norms to influence individuals’ decision to adopt handwashing (eg, ref 16 19–24). Settings in which such studies have been conducted include hospitals and public restrooms. Despite advice to design interventions to change social norms around handwashing practices,20 25 few studies report the development and testing of scales to measure such constructs. Only four studies were identified, with three studies conducted in high-income settings (the USA16 23 and Korea24). One study was conducted in two middle-income countries (Senegal and Peru).26 In the latter study, The World Bank (2012) designed a unidimensional handwashing social norms scale. The authors did not provide the results of the psychometric tests they conducted but reported that the scale was reliable and valid in Senegal but not in Peru.26 Handwashing social norm-related scales are needed to measure these norms to enable the evaluation of interventions aimed at improving handwashing by targeting these norms. Such scales need to be context specific to account for differences in cultural setting. In this paper, we report on the development and psychometric testing of four handwashing norm-related scales in Côte d’Ivoire.


The objectives of this study were to design and test the psychometric properties of four scales that could be used to measure handwashing social norm-related constructs in economically disadvantaged communities in compound settings in Abidjan. One scale aimed to measure the perceived handwashing descriptive norms (HWDN) around HWWS after toilet use. This represents an individual’s perception of the level of neighbours’ HWWS practices (eg, low, average, high) after toilet use. A second scale aimed to measure the perceived handwashing injunctive norms (HWIN) around HWWS after toilet use. This construct represents how much importance an individual perceives their neighbours attach to the practice of HWWS after toilet use. The third scale aimed to measure the perceived handwashing behaviour publicness (HWP) of HWWS after toilet use. This represents an individual’s perception of whether HWWS practices are visible to others. The fourth scale aimed to measure the perceived handwashing outcome expectation (HWOE) around HWWS after toilet use. This represents the benefits (or disadvantages) an individual perceives they would derive from HWWS after toilet use.


Study design

The study was a community-based cross-sectional study.


We conducted the study in housing compounds in Koumassi commune, Abidjan, Côte d’Ivoire, in January 2014. Communal housing compounds (locally known as ‘cours communes’ (‘communal courtyards’)) constitute over 50% of occupied living space in Abidjan.27 Compounds are typically built around a courtyard in which the majority of daily activities occur (online supplemental figure S1). Water and sanitation facilities (on average one water standpipe and two latrines) are usually shared by as many as 10 households. Houses are typically rented though sometimes landlords live within the compound. While the area of a typical compound can be 400 m2, population density is high with as many as 50 people living in a single compound.28–30 Although households within the same compounds are often not related there is a strong sense of community among residents.28 In 2008, UNICEF estimated the frequency of HWWS at key occasions to be less than 4% in Côte d’Ivoire.31


We used convenience sampling to select compounds with nine or 10 households, with shared water and sanitation facilities. We excluded compounds with layouts that did not support structured observations. This included compounds with walls built in front of individual households to give extra privacy to inhabitants. We also excluded compounds occupied predominantly by single males, who tend to spend most of their time outside of their compounds and are thus unlikely to be aware of their fellow residents’ handwashing practices. Compounds where households were predominantly from the same family were also excluded.

In eligible compounds, we approached adult residents who were heads of households (aged 16 years and above) and willing to engage with us. Verbal informed consent was obtained after an information sheet had been read to the participant.

Data collection procedures

Two local residents were hired to assist the corresponding author with data collection, and trained in all study procedures. We administered the data collection tools verbally in French or in Dioula, the two most widely spoken languages. All data were collected anonymously. Both participants and fieldwork assistants were masked to the study objectives. They were told that the study aimed to understand how housing compounds were organised, particularly with respect to gender roles and social cohesion among residents. There was no mention of handwashing. This masking theme was chosen based on an earlier pilot study.

We first piloted a context-specific Likert-type response scale to identify the terms most commonly used to express different levels of agreement/disagreement. We then piloted the questionnaire which contained the handwashing norm-related items. At the end of each piloting round, the lead author held a debriefing session with the fieldwork assistants and, if needed, the data collection tools were revised. Once the questionnaire had been finalised, we performed a cross-sectional study to assess the scales’ psychometric properties.

Development of a context-specific 5-point Likert-type response scale

Prior to testing the norm-related scales, and due to the low education level known to characterise our study population, we developed and tested a context-specific 5-point Likert-type response scale in compounds in Koumassi and Treichville communes, using a participatory approach, to ensure its suitability and acceptability in the target population.

For each conventional response category (eg, Definitely untrue, Untrue, Neither true nor untrue, True, Definitely true), we identified a comprehensive list of local expressions commonly used in everyday conversation to express agreement and disagreement (Box A in the online supplemental material). The methods used to develop and test the response scale are presented in the online supplemental material. Both French and Dioula versions of the response scale were developed (Box B in the online supplemental material). The pilot study indicated that when prompted with questionnaire items, interviewees would spontaneously use expressions from the developed response scale.

Development and pretesting of handwashing norm-related measures

Development of the norm-related scales was theory driven as opposed to data driven and based on previous work.16 23 32 The items were adapted to the study context, in accordance with good practice in scale development.33–39 For example, one item from Lapinski et al’s two-item HWDN scale (study 2) was ‘Most men at (University Name) wash their hands after using the bathroom’.16 In our study, this was adapted to ‘Most residents in your compound wash their hands with soap after using the toilet’. Initially, a total of 20 handwashing norm-related items were developed. The HWDN, HWIN, HWP and HWOE scales initially had five, four, five and six items, respectively. Content experts and members of the study team assessed the face validity of the items by examining the relationship between each item’s content and the conceptual definition of each study construct. To minimise the impact of social desirability bias attached to handwashing40 41 (itself an indication of strong perceived injunctive norms around this practice), 14 out of the 20 items were negatively framed (eg, ‘In your compound, few people wash their hands with soap after using the toilet’), but each scale included at least one positive item (eg, for the HWDN, ‘Many people, in your compound, wash their hands with soap after using the toilet’).

We conducted a pilot study to finalise the design of the scales, to identify whether any scale items were redundant, and whether there were any ambiguous items. To avoid focusing respondents’ attention on handwashing, thereby increasing the risk of response bias, we integrated the scale items within a 61-item interview schedule, including a sociodemographic section. The handwashing scale items were positioned between two masking sections. The masking items were Likert-type statements related to the theme used to mask participants. In the initial masking section, we asked participants their opinions on the organisation of compounds as it pertains to gender roles (eg, ‘Men assist women in their daily compound chores’). The second masking section assessed participants’ degree of identification with their fellow compound residents (eg, ‘You consider fellow compound residents as part of your family’). The questionnaire ended with one question assessing the effectiveness of the masking items. Respondents were asked how they would explain the aim of the study to fellow compound residents. The entire questionnaire was phrased using the local vernacular form of French and also translated into Dioula.

We piloted the items in Koumassi using an iterative process. A minimum of five participants were sampled using convenience sampling and verbal informed consent was obtained after reading an information sheet to eligible residents. Each item on the questionnaire was read to participants. Respondents’ level of agreement with each statement was recorded on the response scale. To minimise classification error, we noted the exact rating expressions used by interviewees to rate each item as well as circling the corresponding scale score.

To assess whether interviewees had understood each item as per its intended meaning, we asked them to explain why they had rated the statement as they did. We judged respondents’ understanding of each item on the questionnaire by assessing whether their explanation of the way they rated each item was coherent with its intended meaning.

Piloting and revision of the questionnaires continued until there were no ambiguous items, no identification of handwashing as the key theme of the questionnaire and until participants’ reaction to the questionnaire was positive (eg, no complaints about redundancy of items or length of the questionnaire).

The scale items were finalised over four piloting rounds, involving a total of 20 residents. In the first two piloting rounds, six out of 10 participants complained that the questionnaire was lengthy and burdensome, and two participants stopped their interviews before completion. The number of items per norm-related scale was therefore reduced by dropping those items participants commonly identified as being redundant.

Additionally, some of the items were ambiguous, particularly some of those belonging to the perceived HWP scale, which some participants misunderstood as assessing whether they monitored the handwashing practices of their fellow compound residents. The items were thus reformulated. Ambiguous masking items were also either dropped or reformulated. Most interviewees were able to identify handwashing as the questionnaire theme, indicating that masking was ineffective.

As some participants did not have any opinion on some of the items, we added the response ‘Don’t know’ off scale (figure 1). The edited questionnaire contained 15 handwashing norm-related items and was tested in a third piloting round.

Figure 1

The context-specific 5-point Likert-type rating scale developed with, in brackets, the response categories and corresponding scale scores and, circled, the added space to record the key rating expressions used by interviewees, in addition to circling the corresponding scale scores. NSP, which stands for ‘Ne savent pas’ (ie, ‘Don’t know’), is a response added off scale. See Box B in the online supplemental material for the context-specific scale’s corresponding conventional response categories in English and the response scale’s Dioula version.

The third piloting round indicated that there were no more ambiguous items. However, it was noted that participants tended to endorse positively phrased items, while also endorsing negatively phrased items that were discordant with the positively phrased items. As the number of items per scale was reduced, the decision was taken to drop positively phrased items from three of the scales, but not the HWOE scale. This was done to avoid inconsistent responses within the same scale due to social desirability bias.33 38 39 42 43 The total number of items was thus reduced to 10, with two, three, two and three items, respectively, for the HWDN, HWIN, HWP and HWOE scales (table 1). In addition, the negatively framed items that appeared to carry the highest risk of response bias were reformulated to explicitly exclude the interviewee from the statement (eg, ‘In your compound, except you, few people wash their hands with soap after using the toilet’). We did so to minimise the risk that interviewees would include themselves when rating other residents’ handwashing practices, and thereby over-report HWWS behaviours.

Table 1

Finalised handwashing norms scales

The final questionnaire, tested in a fourth piloting round, had a total of 43 questions, including 18 masking items, one question assessing masking, 12 sociodemographic questions, and 10 norm-related items. The remaining two questions gathered participants’ opinions of the questionnaire. It took approximately 20 min to administer the questionnaire. This version was well received by participants, with no further complaints of redundancy. Participants also expressed their appreciation of the questionnaire’s general theme and the different subjects it touched on. Handwashing was not one of the themes mentioned, which suggested that the masking items were effective.

Psychometric testing of handwashing norm-related measures

The finalised questionnaire was administered to a convenience sample of eligible compound residents in Koumassi, after obtaining verbal informed consent. Compounds which participated in the pilot rounds were excluded. When feasible, the interviews were conducted inside the respondents’ households or away from other residents. Each item was read out to respondents who were asked to indicate their level of agreement with the statement. Interviewees were not prompted with the different response options. Participants’ answers were recorded on the response scale as described previously (ie, by both circling the corresponding scale scores and recording the rating expressions used). Masking was assessed in a subset of respondents and in the fieldwork assistants. The latter were asked one open-ended question, at the end of the research, about what they thought the aim of the overall study was.

Sample size

For the psychometric testing phase of the study, we aimed to sample a minimum of 200 individuals, in line with sample size recommendations for scale development (eg, ref 33 44 45). For example, Tinsley and Tinsley recommended a 5:1 to 10:1 participants per item ratio.45 Comrey suggested that a sample size between 200 and 300 was suitable for factor analysis.44
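The participants-per-item rule of thumb can be checked with simple arithmetic. The sketch below is illustrative only: the helper name `participants_needed` is hypothetical, and applying the ratio to both the initial 20-item pool and the final 10 items is our assumption, as the text does not state which item count the target of 200 was set against.

```python
def participants_needed(n_items: int, ratio: int) -> int:
    """Minimum sample size implied by a participants-per-item ratio."""
    return n_items * ratio

TARGET_SAMPLE = 200  # minimum sample size aimed for in the study

# Item counts reported in the text: 20 in the initial pool, 10 in the final scales.
for label, n_items in (("initial pool", 20), ("final scales", 10)):
    low = participants_needed(n_items, 5)    # 5:1 ratio
    high = participants_needed(n_items, 10)  # 10:1 ratio
    meets_upper = TARGET_SAMPLE >= high
    print(f"{label}: {low} to {high} participants; target covers 10:1 -> {meets_upper}")
```

Under these assumptions, a target of 200 satisfies the more demanding 10:1 ratio even for the initial 20-item pool.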

Statistical methods

Data were analysed using STATA (V.13.1 and V.17). Descriptive statistics were computed for each item to assess the distribution of the responses and identify items with highly skewed responses (eg, items with an extreme response distribution which almost all participants rated similarly).34 44 Prior to analysis, scores on the response scales were reversed for items that were formulated positively. Complete case analysis was performed. Confirmatory factor analysis (CFA) was used to assess the goodness of fit of the global model.46 We chose CFA as opposed to exploratory factor analysis due to the theory-driven approach used to develop the scales.47
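The two data preparation steps described above (reversing scores for positively formulated items and restricting to complete cases) can be sketched as follows. This is a minimal illustration in Python rather than the STATA code used in the study; it assumes the response scale is coded 1 to 5 and that off-scale ‘Don’t know’ responses are stored as missing values, neither of which is spelt out in the text.

```python
from typing import List, Optional

SCALE_MIN, SCALE_MAX = 1, 5  # assumed numeric coding of the 5-point scale

def reverse_score(score: Optional[int]) -> Optional[int]:
    """Reverse a score on the assumed 1-5 scale; missing values stay missing."""
    if score is None:
        return None
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score {score} is outside the response scale")
    return SCALE_MIN + SCALE_MAX - score

def complete_cases(rows: List[List[Optional[int]]]) -> List[List[Optional[int]]]:
    """Keep only respondents with no missing item (complete case analysis)."""
    return [row for row in rows if all(value is not None for value in row)]

# Hypothetical responses: one row per respondent, one column per item;
# None stands for an off-scale 'Don't know'.
responses = [[1, 4, 5], [2, None, 3], [5, 5, 1]]
kept = complete_cases(responses)
# Suppose, for illustration, that the last column is a positively formulated item.
prepared = [row[:-1] + [reverse_score(row[-1])] for row in kept]
print(prepared)
```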

Given the item response categories were ordinal, generalised structural equation modelling (GSEM) was used to fit an ordered probit model to scales with a minimum of three items.48 The variances of the latent variables were constrained to equal one to obtain the loadings of each scale item. As GSEM does not produce goodness-of-fit statistic estimates in STATA, structural equation modelling (SEM) was also used to fit a linear model to the data to ascertain the three-factor structure data fit. The χ2 test, comparative fit index (CFI), the root mean square error of approximation (RMSEA) and standardised root mean square residual (SRMR) were used as indices to evaluate the goodness of fit. Satorra-Bentler adjustment was used to adjust the results for non-normality using the vce(sbentler) command. Hu and Bentler’s recommended threshold cut-off criteria were used to determine the acceptable model fit.49 The internal consistency of each scale was assessed by either computing the Cronbach’s alpha (α) or the Spearman-Brown coefficient (ρ), depending on the number of items in the scale.50 The Spearman-Brown interitem correlation coefficient was also computed to assess the strength of the relationship between pairs of items in each scale.
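For readers unfamiliar with the two internal consistency measures, the sketch below shows their standard textbook forms: Cronbach’s alpha, α = k/(k−1) × (1 − Σ item variances / variance of the total score), and the two-item Spearman-Brown coefficient, ρ = 2r/(1+r), where r is the Pearson correlation between the two items. It is an illustration in Python using hypothetical data, not the STATA routines used in the study, and the exact computations behind the reported values are not detailed in the text.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence

def cronbach_alpha(items: List[Sequence[float]]) -> float:
    """Cronbach's alpha; `items` holds one list of scores per item."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two items."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

def spearman_brown(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-item Spearman-Brown coefficient, 2r / (1 + r)."""
    r = pearson_r(x, y)
    return 2 * r / (1 + r)

# Hypothetical scores for a two-item scale (five respondents).
item_a = [1, 2, 3, 4, 5]
item_b = [2, 1, 4, 3, 5]
print(round(spearman_brown(item_a, item_b), 3), round(cronbach_alpha([item_a, item_b]), 3))
```

When the two items have equal variances, as in this toy example, the two coefficients coincide; they diverge as the item variances differ.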

Patient and public involvement

Participants were involved neither in the design nor in the implementation of the study. However, they were involved in the development of the context-specific Likert-type response scale, and in the finalisation of the scale items, as described previously.


Psychometric properties of the scales

We interviewed 201 residents from 60 compounds in Koumassi using the final version of the questionnaire. One hundred and forty-nine (74%) respondents were women, and 145 (72%) were aged between 16 and 34 years (table 2). The response rate exceeded 96% for all items (table 3). All scale items had a relatively balanced distribution of responses. Participants who responded ‘don’t know’ to some of the items presented to them stated that they were only concerned with what took place in their own household, and did not look at what others did in their compound. Others replied that they were not inside other residents’ heads to know what they thought. In such instances, we tried to explain to respondents that the questionnaire was seeking their perceptions only.

Table 2

Age and sex distribution of respondents to the handwashing norms scales

Table 3

Handwashing social norm-related scales items distribution

The Spearman-Brown coefficient indicated strong positive correlations between items designed to assess the same construct, with the exception of the handwashing outcome expectations (HWOE) construct (table 4). For the latter scale, the Spearman-Brown correlation coefficient showed poor interitem correlation (ρ negative or close to zero) for all pairwise combinations of items. The HWOE scale was thus dropped. CFA was used to assess the measurement properties of the HWIN, HWDN and HWP scales. The goodness-of-fit statistics indicated that the three-factor structure had an acceptable fit (ie, χ2(11)=17.12, p=0.105, RMSEA=0.06, CFI=0.99, SRMR=0.03). Figure 2 reports the HWIN scale properties from the ordered probit-based GSEM. The scale appeared reliable (α=0.83). Both the HWDN and HWP scales were internally consistent and reliable (respectively, ρ=0.74, α=0.88; and ρ=0.63, α=0.78). Table 5 summarises the psychometric properties of the retained scales.
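The reported fit indices can be compared against cut-off values in a few lines. The sketch below is illustrative only: the thresholds (CFI of at least 0.95, RMSEA of at most 0.06, SRMR of at most 0.08) are the values commonly attributed to Hu and Bentler and are an assumption here, since the text cites the criteria without quoting them; the names `CUTOFFS` and `meets_cutoff` are hypothetical.

```python
# Commonly cited cut-offs (assumed; not quoted in the text).
CUTOFFS = {
    "CFI": ("min", 0.95),    # larger values indicate better fit
    "RMSEA": ("max", 0.06),  # smaller values indicate better fit
    "SRMR": ("max", 0.08),   # smaller values indicate better fit
}

# Fit indices as reported for the three-factor model.
reported = {"CFI": 0.99, "RMSEA": 0.06, "SRMR": 0.03}

def meets_cutoff(index: str, value: float) -> bool:
    """Return True when the value satisfies the assumed cut-off for that index."""
    kind, threshold = CUTOFFS[index]
    return value >= threshold if kind == "min" else value <= threshold

results = {name: meets_cutoff(name, value) for name, value in reported.items()}

# A non-significant chi-square test (p > 0.05) is also consistent with acceptable fit.
chi2_p_value = 0.105
chi2_not_significant = chi2_p_value > 0.05

print(results, chi2_not_significant)
```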

Table 4

Matrix of interitem correlations for each scale (Spearman-Brown coefficient)

Figure 2

Handwashing injunctive norms (HWIN) scale properties (unstandardised estimates).

Table 5

Summary of the psychometric properties of the HWDN, HWP and HWIN scales

We assessed the effectiveness of the masking items in the last 52 participants to be interviewed. Fifty (96%) participants thought that the study aimed to understand life in their compounds, with the main themes being solidarity and good relations among compound residents. Among these participants, 11 (21%) mentioned ‘cleanliness’ in addition to mentioning the above themes. The remaining two (4%) respondents thought that the study’s main theme was handwashing. These two participants also complained about the redundancy of some of the items. The two fieldwork assistants believed the study objectives were along the same lines as those given by study participants. Neither of the fieldwork assistants mentioned handwashing among the themes cited.


We developed and assessed the psychometric properties of four scales to measure social norm-related constructs around HWWS after toilet use in housing compounds in Abidjan. Our results support the reliability of three out of the four scales: the HWDN, HWIN and HWP scales, but not that of the HWOE scale. More work is needed to design a reliable measurement tool for this construct in our study population. The scales were developed taking account of the local context. Thus, comparisons of our results with the small number of published studies (ie, ref 16 23 24 26) conducted in different settings must be done with caution.

We found that handwashing social norm-related constructs in this study were internally consistent with strong correlations. We did not identify any other studies conducted in low or middle-income country settings which designed and tested the four constructs of interest. The World Bank (2012), working in Senegal and Peru, used 12 items (eight generic items and four country-specific items) to measure social norms around handwashing in a study measuring the behavioural determinants of HWWS.26 However, while the items addressed at least two different norms (descriptive and injunctive norms), as per the definition of these constructs in social norms theories,9 14 15 the authors used them to create a unidimensional scale, raising questions about construct validity. In addition, the items referred to different key handwashing opportunities (eg, after toilet use, before eating, before preparing food). As noted by the authors, attitudes and norms around different handwashing occasions may vary.26 For some of the items in the scale, it is difficult to assess which norm theoretical domain the authors intended to measure. Additionally, the items as formulated appear susceptible to acquiescence bias (ie, respondents tending to agree with questionnaire items presented to them, irrespective of the item content). The authors did not provide the results of the psychometric tests they conducted, but reported that the scale was reliable and valid in Senegal, but not in Peru.

By contrast, we identified three studies conducted in high-income countries which reported the process of designing and testing the HWDN, HWIN, HWP and HWOE. Lapinski et al designed a reliable two-item HWDN scale (α=0.93) using the TNSB in a college campus study aimed at testing the effect of descriptive norms on behavioural privacy in the USA.16 Lapinski et al also designed a reliable four-item HWDN scale (α=0.93), a six-item HWOE scale (α=0.70) and a four-item HWIN scale (α=0.82) in a study aimed at testing the TNSB in a childcare centre in the same geographical area.23 However, all items were positively framed which, in our experience, may increase the risk of acquiescence bias, and a three-item HWP scale was not reliable (α=0.36). Chung and Lapinski designed a reliable four-item HWDN scale (α=0.87), including two negatively framed items, a four-item HWIN scale (α=0.83), a three-item HWOE scale (α=0.84) and a four-item HWP scale (α=0.80) in a study applying the TNSB to predict handwashing practices in Korea.24

In our study, although the HWP scale was internally consistent, the way the items were formulated appears conducive to acquiescence bias. The items referred to the absence of handwashing facilities or dedicated handwashing areas in compounds. Therefore, interviewees may have rated the items with the absence of handwashing facilities in mind, which is common in compound settings in Côte d’Ivoire, rather than the publicness of the practice. However, the relatively balanced distribution of responses suggests this may not have been a major problem.

The failure to design a reliable HWOE scale can be explained by the fact that each item addressed a different outcome expectation. With hindsight, this decision was problematic. Two out of the three items were selected based on the findings of a previous pilot regarding participants’ key HWWS motives. One item attempted to assess good health as an outcome expectation, as this was the motive participants had cited the most. Another item sought to assess dirt removal as an outcome expectation. Washing hands to remove the dirt on one’s hands, as the item was formulated, may be more of a comfort outcome expectation than a riddance of disgust expectation. The last item posited inconvenience as an outcome expectation, given the frequent absence of soap at the handwashing locations. This latter outcome expectation was chosen based on Curtis et al’s formative research findings in 11 countries.25 Given the low correlation between all three items, dropping one of the three items to increase the reliability of the HWOE scale was not an option.

It would have been more appropriate to design separate scales for each different outcome expectation. For instance, three out of the six HWOE scale items in Lapinski et al’s study were health related.23 In Chung and Lapinski’s study, all three scale items were health related.24 However, it is likely that the health items as formulated were conducive to acquiescence bias in both study settings. As handwashing campaigns in Côte d’Ivoire are usually based on such health messages, it is unlikely that our study population would disagree with items positing good health as their key expected outcome from handwashing. On the other hand, having a scale for each outcome expectation would make the questionnaire burdensome to participants. A possible solution would be to focus on identifying the key HWOE in the study population, and measuring changes in that specific HWOE. Given that our attempt to design an HWOE scale was unsuccessful, we decided to drop it from subsequent related research.

The use of Likert-type response scales can be difficult in low-income settings where education levels may be low. Moreover, the conventional precoded responses (eg, strongly agree, somewhat agree, etc) were judged inappropriate for our study population, as they do not match words or expressions which are commonly used in Côte d’Ivoire to express level of agreement in everyday conversation. The context-specific Likert-type response scale we created seemed to be effective in addressing the above issues. We would encourage other studies conducted in similar settings to consider using a similar approach.

Study limitations

Several limitations of our study can be identified. First, convenience sampling may have resulted in a sample which is not representative of the general population. This may limit the generalisability of our results.

We used various methods to minimise social desirability bias attached to handwashing, including negatively framing the scale items. Comparing the item response distributions of the HWDN and HWIN scales, most respondents tended to agree with the notion that few people washed their hands with soap after toilet use, whereas about half of the respondents disagreed or strongly disagreed with the notion that few people saw HWWS after toilet use as an important practice. Both response distributions are in line with the observed low handwashing practices in low and middle-income country settings,5 and strong injunctive norms around the practice, as illustrated by the tendency to over-report handwashing practices.40 41 This may indicate that the techniques used in this study were successful at minimising social desirability bias. Nevertheless, we cannot exclude the possibility that such bias affected the results.

We did not collect compound identification data, so clustering of respondents within compounds could not be accounted for. The hypothesis of a three-factor model fitting the data was tested using linear SEM methods with Satorra-Bentler adjustment. Although a linear model is not strictly appropriate for ordinal data, under some circumstances (eg, ref 51) it allows an approximate assessment of the goodness of fit of the three-factor model. Future studies should refine these approaches by collecting compound identifiers and accounting for compound-level clustering. Additionally, the overall model should be tested directly using methods suitable for assessment of model fit with ordinal data in medium to large samples (eg, weighted least squares).

We did not measure handwashing practices, and thus cannot assess whether the three norm-related constructs of interest correlate with handwashing practices in our study population. A few studies conducted in high-income countries have reported on the predictive validity of the norm constructs of interest against handwashing practices after toilet use.16 23 24 Lapinski et al found strong evidence that, as the perceived HWIN became stronger, self-reported handwashing increased (p<0.001).23 Similarly, Chung and Lapinski found strong evidence of a positive association of both the perceived HWDN and the HWIN with self-reported handwashing (p<0.001).24 The evidence on behaviour publicness is less consistent. Lapinski et al found no evidence of an association between behaviour publicness around handwashing and handwashing practices (p=0.43),16 while Chung and Lapinski found strong evidence that as the perceived HWP increased, self-reported handwashing practices also increased (p<0.001).24

In scale development, it is usually recommended to be inclusive and test a large pool of items to increase the probability that the items exhaust all the possible content of the construct of interest.33 34 36 52 Factor analysis can then be used to reduce the pool of items by identifying items that perform poorly and can be eliminated.33 34 52 This study initially attempted to measure the constructs of interest with between four and six items per scale. However, the narrow content area of the constructs resulted in many of the items being very similar, which irritated respondents to the point that some items had to be dropped.

The negative reaction of respondents to the perceived repetitiveness of the questionnaire led us to reduce the item pool to two or three items per scale, before subjecting the scales to psychometric testing. The risk with scales with such small pools of items is that the items are not representative of the construct of interest and thus do not load on their intended factor. However, if items in the same scale are highly redundant, one might argue that this indicates that the construct of interest could be measured with a single item. For constructs such as (perceived) descriptive norms, a scale with up to two items may be sufficient in some circumstances. Lapinski et al’s study demonstrated the predictive validity of a two-item HWDN scale against handwashing practices.16 The social desirability attached to handwashing makes the measurement of such a construct with one item prone to error and bias and this is likely to remain the case even when there is more than one item.
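For two-item scales such as those discussed here, reliability is commonly summarised with the Spearman-Brown coefficient, which depends only on the correlation between the two items. The sketch below assumes the standard two-item formula ρ = 2r/(1 + r); the helper names and the response values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r):
    """Two-item reliability from the inter-item correlation r."""
    return 2 * r / (1 + r)


def implied_correlation(rho):
    """Inverse of spearman_brown: the r implied by a two-item coefficient."""
    return rho / (2 - rho)


# Hypothetical 5-point responses to two items of the same scale (invented).
item_a = [1, 2, 2, 4, 5, 3]
item_b = [2, 1, 3, 4, 4, 3]

r = pearson_r(item_a, item_b)
print(round(r, 2), round(spearman_brown(r), 2))
```

If the reported coefficients were computed with this standard formula, ρ=0.74 and ρ=0.63 would correspond to inter-item correlations of roughly 0.59 and 0.46, respectively. This shows how a two-item scale can appear internally consistent while its items remain only moderately correlated.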

Despite all these limitations, to our knowledge, this study is the first of its kind in Côte d’Ivoire and other comparable settings, and we believe our findings provide valuable information for future studies in the area.

Conclusion


We successfully designed three handwashing norm-related scales specific to economically disadvantaged community settings in Abidjan, Côte d’Ivoire. The methods used to design the scales, including a context-specific Likert-type response scale, could be easily replicated in other similar settings. The social desirability of handwashing and the narrow content area of social norm-related constructs around this practice present significant challenges when designing items to measure such constructs. Future studies developing such scales should take these challenges into account, particularly those aiming to demonstrate the predictive validity of handwashing norm-related constructs against handwashing practices. We will test the ability of the three designed scales to predict handwashing practices in our study population, as part of a cluster randomised trial aimed at evaluating the effect on HWWS after toilet use of a TNSB-based behaviour change handwashing intervention.

Data availability statement

Data are available upon reasonable request. All data relevant to the study are included in the article or uploaded as supplementary information.

Ethics statements

Patient consent for publication

Ethics approval

This study involves human participants and ethical approval was obtained from Côte d’Ivoire’s Bioethics Committee (Comité Bioéthique de Côte d’Ivoire) (see online supplemental Ethics Approval), Côte d’Ivoire’s Ministry of Higher Education and Scientific Research (Ref: 0758/MESRS/CAB 1/gsy) and the London School of Hygiene & Tropical Medicine’s Research Ethics Committee (Ref: 7029). Participants gave informed consent to participate in the study before taking part.


We thank the commune of Koumassi and the participants for opening their doors to us, and for their kindness throughout the study.



  • Twitter @MaudAmon1

  • Contributors MAA-T is the overall content guarantor and designed the study with inputs from SC, JM and MKL. MAA-T developed the data collection tools with inputs from all authors including PN-D. MAA-T, PKB and HAK implemented the study with guidance from SC. MAA-T designed the data analysis plan with inputs from SC and GBP. MAA-T analysed the data with guidance from SC and GBP. MAA-T prepared the first draft of the manuscript. All coauthors contributed to the interpretation of the results and revisions of the manuscript.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.