Reliability and Validity
The terms reliability and validity are often paired together, and while both are important in assessing the effectiveness of a research design, the two terms actually assess very different ideas.
Reliability refers to a study’s consistency, or its ability to measure things in the same way each time a measurement is taken.
Validity, on the other hand, refers to a study’s ability to measure what we claim it is measuring. As an example to differentiate the two, consider the following:
A ruler is a very reliable instrument. We can be very certain that if we were to use a ruler to measure your foot (assuming you have stopped growing), we would get the same results each time we measured, even if we were to use different rulers.
But imagine I were to then take this reliable instrument, the ruler, and try to determine your personality by measuring your foot length. The ruler, still very reliable, would completely lack validity if we were to use it to assess personality via foot length. As you can probably infer, the theory behind what determines one’s personality would need to be captured in the measurement tool we use to assess personality. A valid measure of personality would need to measure the key theoretical aspects of what previous research has determined makes up one’s personality.
Important types of reliability to consider:
When assessing a study’s reliability we try to determine if the study shows evidence of:
Test-retest reliability: If we were to measure the same group of participants at a different point in time, would their scores be consistent on the key variable? (Note that the importance of test-retest reliability depends on the variable in question: intelligence is theorized to be a stable trait, making test-retest reliability crucial; mood, on the other hand, is not a stable trait, so we would expect to see differences if people were tested repeatedly.)
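In practice, test-retest reliability is often summarized as the correlation between scores from the two testing occasions. A minimal sketch in Python, using invented scores for five hypothetical participants measured at two time points:

```python
import numpy as np

# Hypothetical scores for the same five participants at two time points.
time1 = np.array([98, 105, 110, 92, 120])
time2 = np.array([100, 103, 112, 90, 118])

# Test-retest reliability is commonly reported as the Pearson
# correlation between the two measurement occasions; values near 1
# indicate that participants kept their relative standing over time.
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))
```

Here the two sets of scores track each other closely, so the correlation is high, which is what we would want for a stable trait like intelligence.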
Internal consistency: If we were to look at your scores on the different questions or items in our design, would your scores be consistent? (For example, if you got the first five questions wrong on the midterm but the last five questions correct, we might suspect the midterm was not internally consistent, in the sense that its items did not measure your ability in the same way throughout the exam.)
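One common statistic for internal consistency is Cronbach’s alpha, which compares how much the individual items vary on their own to how much total scores vary across people. A minimal sketch with invented data (rows are participants, columns are test items):

```python
import numpy as np

# Hypothetical item scores: rows are participants, columns are items.
scores = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 4, 5, 5],
    [3, 3, 4, 3],
    [1, 2, 1, 2],
])

k = scores.shape[1]                         # number of items
item_vars = scores.var(axis=0, ddof=1)      # sample variance of each item
total_var = scores.sum(axis=1).var(ddof=1)  # variance of participants' totals

# Cronbach's alpha: closer to 1 means the items vary together,
# i.e., participants answer consistently across the test.
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(round(alpha, 3))
```

In this invented example, participants who score high on one item score high on the others too, so alpha comes out high; a test where scores jumped around from item to item would yield a much lower value.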
Inter-rater reliability: If we were to both observe a behavior, would we rate it the same way? (For example, if we are both assessing childhood aggression by observing the same children at the same time, would we assign the same ratings, or might your interpretation of aggression differ from mine, resulting in inconsistent scoring between us?)
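A standard way to quantify inter-rater reliability for categorical judgments is Cohen’s kappa, which corrects the raw agreement rate for agreement we would expect by chance. A minimal sketch with invented ratings, where two observers code ten behaviors as aggressive (1) or not aggressive (0):

```python
# Hypothetical ratings from two observers coding the same ten
# behaviors as "aggressive" (1) or "not aggressive" (0).
rater_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: probability both raters say 1 plus both say 0,
# based on each rater's overall tendency to use each category.
p1_a = sum(rater_a) / n
p1_b = sum(rater_b) / n
chance = p1_a * p1_b + (1 - p1_a) * (1 - p1_b)

# Cohen's kappa corrects raw agreement for chance agreement:
# 1 = perfect agreement, 0 = no better than chance.
kappa = (observed - chance) / (1 - chance)
print(round(kappa, 3))
```

Here the raters disagree on only one of ten behaviors, so kappa is well above zero; frequent disagreements about what counts as aggression would drag it down toward chance levels.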
Important types of validity to consider:
When conducting an experiment we will look for three different types of validity:
Construct validity- does our method of measuring the key variables in our design accurately reflect the current theory of those variables?
Internal validity- can we conclude that our independent variable, and not some other factor, caused the observed change in our dependent variable?
External validity- do our study’s sample and findings generalize to the population that we are studying?