Need to get to grips with some of those words that are being flung around – validity, reliability, generalisability, authenticity, rigour, replicability, triangulation. At the moment they seem a bit of a blur of contested concepts – it’s about time I pursue an inquiry to get to grips with these concepts, the debates around them and more importantly – their relevance to MY particular research project.
So first of all, what is/are the question(s) that all these concepts are seeking to answer? They seem to arise in relation to two particular discussions:
Is this research good?
Are the conclusions acceptable?
Obviously, these two are interlinked. Your research design and quality do need to ‘fit’ with the claims you make and vice versa. So for example on my course I need to be able to demonstrate that my research findings are valid/authentic/credible/reliable – as appropriate to my research paradigm. The research design and how it is realised in practice will affect what I can claim.
All of this is so contested because the hegemony over what makes good research has arisen through the long-standing and over-dominant positivist paradigm. This leads to three famous criteria, which I first came across in my academic work for D843 (Taylor, 2001).
They are:
- reliability: are the tools or instruments being used to measure reliable? – e.g. consistent over time; between subjects.
- validity: are the conclusions, propositions, inferences being made by the researcher TRUE or ACCURATE?
- replicability: would a future researcher replicate the project and produce the same or similar results?
Out of these, the concept I have ‘grappled’ with most during my course is the word VALIDITY – even in the first assignment we were asked to say whether our research would produce ‘enough data to be valid’. For me, not working in a positivist paradigm, this seems like an odd thing to have to cover.
Digging further into the concept of VALIDITY, I see there are a number of different inter-related concerns.
Trochim (2006) identifies four types of validity – but emphasises that it is only when studying causal research questions that all of them come into play…
1) external validity – I found this explanation helpful (Trochim, 2006) – it is the concern for your conclusions being relevant to other people or at other times. This is where the concern for SAMPLING comes in – if you do your research on an appropriately ‘representative’ sample, then you can generalise back to a wider population. Trochim also covers an alternative concept of proximal similarity – contexts can vary with time, place, people and settings – the more similar other contexts are to your own on these dimensions, the more likely your conclusions can be generalised to them.
2) construct validity – covered by Trochim here – it is the concern for the relationship between ideas/theories and what you have done in the real world – can you claim that what you have done in the ‘real world’ can be related back to ideas/theories?
3) internal validity – covered by Trochim here – is only relevant in studies that try to establish a causal relationship because it is concerned with inferences regarding cause-effect relationship.
4) conclusion validity – covered by Trochim here – is the degree to which conclusions about relationships are reasonable.
Victor Jupp writing in Sage Research Methods Online (sorry link will only work if you have access via OU) uses the following definition of validity “the extent to which conclusions drawn from research provide an accurate description of what happened or a correct explanation of what happens and why”. He uses three slightly different ‘types’ of validity – all three have to be addressed to ensure overall validity:
A) validity of measurement – does a research tool measure what it says it measures? Sapsford goes into this further (link for those with OU access) and describes three different ways of warranting validity of measurement. Firstly, face validity – it looks like it is. Secondly, concurrent validity – gives same answers as another measurement instrument. And thirdly, predictive validity – the measure predicts an outcome. He goes on to say that this is primarily a concern of quantitative research, but that doesn’t let qualitative research off the hook. Qualitative researchers must also work to demonstrate that patterns described are typical of people or setting NOT a product of the research situation (people involved; data collection methods etc). This is where the ‘notion’ of TRIANGULATION comes in – involving more than one researcher or more than one data collection method – but Sapsford emphasises:
the major tool is reflexivity – careful and practised sensitivity about the extent to which personal characteristics and behaviour, the accidents of the research setting, the relationships involved and the biases and preconceptions of the researcher may have an effect on the nature of the data produced. The aim is to produce a research report that is transparent about how data are collected and interpretations made, so that the reader can form his or her own judgement of their validity.
B) validity of explanation – are explanations and conclusions correct for this specific study (people, time, place)? (similar to internal validity). Jupp goes further into this (link for those with OU access) and provides a neat definition – “The extent to which an explanation of how and why some social phenomenon occurs is the correct one”. Again this is a highly developed and proceduralised concern in quantitative research – with experimental design and lab conditions being used to ‘rule out’ other possible explanations. Like Sapsford, Jupp makes a point of adapting this concern to qualitative research:
However, qualitative researchers are also concerned with establishing validity. This is not in terms of controlling or ruling out alternative variables but in terms of providing assurances that the account that has been put forward (say, of interactions in the classroom) is the correct one. This is done by analytic induction (searching for conclusions that do not fit the conclusions) and reflexivity (reflecting on the possible effects of the researcher on the conclusions put forward).
C) validity of generalisation – can the conclusions drawn from this study be generalised to other people (population validity) or other contexts (ecological validity)? (similar to external validity). This is developed further here (for those with OU access) by Sapsford who defines it as “The extent to which information from a sample gives us information about a population, or the extent to which information about one setting tells us about others (which may be of more interest to us)”.
Jupp’s and Sapsford’s explanations seem more ‘open’ than Trochim’s in that they are phrased in ways that can apply beyond studies into causal questions. And, Jupp points out that validity is associated with questions of truth and realism so sits uneasily with postmodernist, constructivist and interpretivist approaches.
When I studied TU870 Capacities for managing development, I came across the work of Hulme (1995) who discussed ‘orthodox’ project planning. He described three types of responses to dissatisfaction with the orthodox model:
- leaving things as they are (loyalty)
- modifying the way things are (voice)
- rejecting the orthodox approach and proposing an alternative (exit).
Reading Jupp and Sapsford makes me think of VOICE. They recognise a dissatisfaction with ‘orthodox’ notions of validity when doing qualitative research and so provide suggested modifications for how you can think of the issue of validity. I guess I am more on the EXIT line – I reject the orthodox but have yet to find my alternative. Following Taylor (2001) I understand that the knowledge I will produce will be situated (will only be relevant to specific circumstances, people, time) and contingent (the claims I make do not have any status of stable TRUTH – even for the specific circs, people).
Reading Bradbury and Reason (2006, 343) I see that I am not alone in the EXIT mode. They cite a number of authors who question the very question of validity and say that we should not seek to ‘fit’ action research into concerns that arose for a different paradigm. Reason and Bradbury themselves don’t reject the use of the concept of VALIDITY; rather, they are concerned with shifting the dialogue about validity “from a concern with idealist questions in search of ‘Truth’ to concern for engagement, dialogue, pragmatic outcomes and an emergent, reflexive sense of what is important” (ibid). They refer to this idea as ‘broadening the bandwidth of validity’. It seems they want to stick to the use of the VALIDITY concept because in informing a broader understanding, they can also add to conversations about validity in other types of research work. I kind of like that stance, but still feel uncomfortable about the use of the term because of the expectations it currently creates. However, for most of their chapter, Reason and Bradbury discuss the issue of ‘quality’ in action research (as a contribution to the wider validity debate). They end up (page 350) by proposing some issues to be considered by action researchers:
Is the action research
- explicit in developing a praxis of relational participation?
- guided by reflexive concern for practical outcomes?
- inclusive of a plurality of knowing?
- ensuring conceptual-theoretical integrity?
- embracing ways of knowing beyond the intellect?
- intentionally choosing appropriate research methods?
- worthy of the term significant?
- emerging towards a new and enduring infrastructure?
Before moving on, I want to go back to the issue of TRIANGULATION – mentioned above in relation to validity of measurement. The reason for this is it is specifically mentioned in my course materials and assignment requirements. Uwe Flick (link for those with OU access) uses the following definition: “the observation of the research issue from (at least) two different points. This understanding of the term is used in qualitative as well as quantitative research and in the context of combining both.” The article goes on to say that there are four particular aspects of triangulation:
- Triangulation of data – combining data from different sources – different times, places or people
- Investigator triangulation – use of different observers or interviewers to balance subjective influences.
- Triangulation of theories – approach data from different theoretical angles
- Methodological triangulation – most often used – to maximise validity by combining methods.
As mentioned above, triangulation stems from a concern with validity. I see each of my ‘talk samples’ for my research as exploratory case studies so in some ways each of these case studies will triangulate each other in that they come from different people (triangulation of data). I felt it was not feasible to do methodological triangulation in the timescales of my research – i.e. more than one type of ‘observation’ on each case study. As a number of interviewers are involved there is an element of investigator triangulation (but only one ‘researcher’ analysing and interpreting the data).
So, where has all this got me? I do want to ‘exit’ from the notion of validity because of its strong association with a positivist paradigm but in doing this reading, I also don’t want to throw the baby out with the bathwater. I do need to do good research (for me a question of quality) and I do want to draw acceptable conclusions (for me conclusions that are situated and contingent). So what seems particularly relevant to me:
– reflexivity – mentioned already. Victor Jupp (link for those with OU access) defines this as “The process of monitoring and reflecting on all aspects of a research project from the formulation of research ideas through to the publication of findings and, where this occurs, their utilization. Sometimes the product of such monitoring and reflection is a reflexive account which is published as part of the research report.” It seems that this is an integral part of research and the way I go about it – the crucial aspect will be how that comes over in the final report. Jupp does emphasise however that one aspect of the reflexive account is consideration of the threats to the validity of the conclusions you reach and how important it is to be clear about this so that the reader can make their own judgements (see, you can never escape the concept of validity!).
– authenticity – the T847 materials (link for those with module access) point out that authenticity is particularly important to those using critical theory to underpin their research. However, some of the issues listed seem to be of broader relevance… especially the need to demonstrate the lines of inquiry followed and the questions asked of the data. I’ve looked into the issue of authenticity before in this blog – drawing on the work of Coghlan and Brannick (2010). I really like their process imperatives.
And then Taylor (2001) who highlights a number of more general issues, useful for evaluating the quality of research:
- located in relation to previously published work – building on, or challenging, it
- coherent – depending on argument for its persuasiveness
- rigour – issue of sampling; also issue of seeking and highlighting negative instances (where feature clearly does not occur)
- richness of detail in the analysis presented to the reader (subject to word counts for me!)
- richness of detail in explaining process of analysis (ditto!)
- fruitfulness or pragmatic use – how provide a basis for future research
- quality of the interpretation
- relevance – to practice or to a social issue with consideration to how the findings from the research may be applied.
I feel entirely comfortable with the idea that my research has to be of good quality (in terms of my Masters that means ‘academically credible’). I am also coming to terms with the idea that my conclusions need to be judged as ‘valid’ – what I reject are the traditional positivist assumptions contained in that term (single truth; endurable truth; generalisable truth). If all I am intending on making are modest conclusions that in this set of data, I have seen xxx and therefore I would encourage others to look for xxx in similar data from similar people and work from a more appreciative stance etc – then why do I need to spend precious words on issues of sampling or triangulation? It feels as if I am having to spend more words exploring what my research isn’t than what it is.
References
All web-resources as per links in text – accessed as per date of this post.
Taylor, S., 2001. Chapter 1 Locating and Conducting Discourse Analytic Research and Chapter 8 Evaluating and applying discourse analytic research. In Wetherell, M., Taylor, S., and Yates, S.J. (Editors) Discourse as data: A guide for analysis. London: Sage Publications, pp. 5-48 and 311-330.
Reason, P. & Bradbury, H. eds., 2006. Handbook of Action Research: Concise Paperback Edition. London: Sage Publications.
Hulme, D., 1995. Projects, Politics and Professionals: Alternative approaches for Project Identification and Project Planning. Agricultural Systems, 47, pp. 211–233.
Just a couple of notes prompted by some thoughts a fellow student put in an email to me. Arwen has been reading Patton (2002) Qualitative Research and Evaluation Methods, specifically a chapter on ‘Enhancing quality and credibility’.
Patton talks about ‘reactions’ to a researcher’s work – any ‘reader’ will judge that work by their own set of criteria. This reminded Arwen and me of Ison’s discussions on explanations – an acceptable explanation depends on a social dynamic. I also see links with Ison’s discussions on evaluation – valuing is also a social dynamic. From this perspective, what counts as ‘quality’ and what counts as ‘credibility’ will depend as much on the ‘reader’ as it does on the ‘researcher’ and the ‘product’ they produce. This makes it as important to explain why you think it is ‘quality’ and why you think it is ‘credible’ as much as the findings and conclusions. The ‘reader’ may choose not to agree with your explanations re: quality and credibility.
The other thought is much shorter – I like the term ‘credible’ – are my conclusions ‘credible’? This gives me a language to replace the word ‘validity’ and a different way of framing discussions about quality.
And another note…Arwen also mentioned that Patton cites the work of Lincoln and Guba (1986) and their suggestion of criteria for evaluating post-positivist research. I can’t source the original reference, but this internet article by Wallendorf and Belk (1989) explains this work and builds on it.
I’ve now read another article which builds on this debate.
Learmonth, Mark, Lockett, Andy & Dowd, Kevin, 2012. Promoting Scholarship that Matters: The Uselessness of Useful Research and the Usefulness of Useless Research. British Journal of Management, 23(1), pp.35–44.
Abstract at: http://onlinelibrary.wiley.com/doi/10.1111/j.1467-8551.2011.00754.x/abstract
Apart from being impressed by the title of the article, I liked the way that it also reminds me of the contested nature of terms like ‘useful’ and ‘relevance’. Again, usefulness and relevance depend on questions like – to whom? when? They are not empirical statements. Yet another reminder of reflexivity!