Chapter Five Experimental designs and intervention studies
Experiments are an important form of intervention study. A well-designed experiment enables researchers to demonstrate how manipulating one set of variables (independent variables) produces systematic changes in another set of variables (outcome or dependent variables). Experimental designs attempt to ensure that all variables other than the independent variable(s) are controlled; that is, that there are no uncontrolled extraneous variables that might systematically influence or bias changes in the outcome variable(s). Control is most readily exercised in the sheltered environment of the research laboratory, which is one reason why these settings are the preferred habitats of experimentalists. Much clinical experimentation, however, takes place in field settings (e.g. hospitals and clinics), where the phenomena of health, illness and treatment are naturally located. Even in these natural settings researchers can exercise control over extraneous variables. The randomized controlled trial (RCT) is an experimental research design aimed at assessing the effectiveness of a clinical intervention. The experimental approach has, of course, long been used more broadly in the 'hard' sciences.
Experimental studies and RCTs involve the following steps:

1. Formulating a hypothesis about the effects of the intervention.
2. Selecting a sample of participants from the population of interest.
3. Assigning the participants to intervention (experimental) and control groups.
4. Administering the intervention(s) to the experimental group(s).
5. Measuring the outcomes in all groups and comparing the results.
Thus, in an experimental study the researcher actively manipulates the independent variable(s) and monitors the effects through measurement of the outcome or dependent variable(s).
The simplest approach is to assign the participants randomly to independent groups. Each intervention group represents a ‘level’ of the independent variable. Say, for instance, we were interested in the effects of a new drug (we will call this drug A) in helping to relieve the symptoms of depression. We also decide to have a placebo control group, which involves giving patients a capsule identical to A, but not containing the active ingredient.
Given a sample size of 20, we would assign each participant randomly to either the experimental (drug A) or the control (placebo) group. Random assignment could involve entering all 20 names into a computer program and using random numbers to generate two groups of 10. In this case, we would end up with the following two groups:
Levels of the independent variable:

| | Control group (placebo) | Experimental group (drug A) |
|---|---|---|
| Number of participants | nc = 10 | ne = 10 |
Here nc and ne refer to the numbers of participants in each of the groups, given that the total sample size (n) was 20. We can have more than two groups if we want. For example, if we also included another drug, 'B', the independent variable would have three levels. We would require a total of 30 participants if the group sizes remained at 10.
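This random assignment procedure can be sketched in a few lines of Python; the participant codes and the fixed random seed are hypothetical and serve only to make the example reproducible:

```python
import random

# Hypothetical participant codes standing in for the 20 names.
participants = [f"P{i:02d}" for i in range(1, 21)]

random.seed(42)              # fixed seed so the example is reproducible
random.shuffle(participants)

control_group = participants[:10]       # placebo (nc = 10)
experimental_group = participants[10:]  # drug A (ne = 10)

print("Control (placebo):", control_group)
print("Experimental (drug A):", experimental_group)
```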
Random assignment does not guarantee that the two groups will be equivalent. Rather, the argument is that there is no systematic reason for the groups to differ. While this is true in the long run, with small sample sizes chance differences between the groups may distort the results of an experiment or RCT. Matched assignment of the participants to groups minimizes group differences caused by chance variation.
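Before turning to matched assignment, a small simulation makes the point about chance differences concrete. The sketch below, with entirely hypothetical pre-test scores, repeatedly splits 20 scores at random into two groups of 10 and records the difference between the group means:

```python
import random

random.seed(3)

# Hypothetical pre-test depression scores (mean 25, SD 5) for 20 participants.
scores = [random.gauss(25, 5) for _ in range(20)]

mean_diffs = []
for _ in range(1000):
    random.shuffle(scores)                       # one random split into two groups
    group_a, group_b = scores[:10], scores[10:]
    mean_diffs.append(sum(group_a) / 10 - sum(group_b) / 10)

# With n = 10 per group, chance differences of several points are common.
print(round(min(mean_diffs), 1), round(max(mean_diffs), 1))
```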
Using the hypothetical example discussed previously, say that the researcher required the two groups to be equivalent at the start of the study on the measure of depression used in the experiment. Using a matched-groups design, the participants would be assessed for level of depression before the treatment, ranked from highest to lowest score and paired off. Subsequently, the two participants in each pair would be randomly assigned to either the experimental group or the placebo control group. In this way, it is likely that the two groups would have similar average pre-test depression scores.
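A minimal sketch of this matched-groups procedure, again with hypothetical scores, ranks the participants by pre-test score, pairs adjacent ranks and then randomizes within each pair:

```python
import random

random.seed(1)

# Hypothetical pre-test depression scores for 20 participants.
scores = {f"P{i:02d}": random.randint(10, 40) for i in range(1, 21)}

# Rank from highest to lowest score and pair off adjacent participants.
ranked = sorted(scores, key=scores.get, reverse=True)
pairs = [ranked[i:i + 2] for i in range(0, len(ranked), 2)]

experimental_group, control_group = [], []
for pair in pairs:
    random.shuffle(pair)                 # random assignment within each pair
    experimental_group.append(pair[0])   # drug A
    control_group.append(pair[1])        # placebo
```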
Four types of experimental design will be discussed. These are the post-test only, pre-test/post-test, repeated measures and factorial designs.
In a post-test only design, the outcome is measured only after the intervention has been delivered. At first it may appear that this would make measurement of change impossible. At an individual level this is certainly true. However, if we assume that the control and experimental groups were initially identical and that no change occurred in the controls, direct comparison of the post-test scores will indicate the extent of the change. This type of design is fraught with danger in clinical research and should be used only in special circumstances, such as when pre-test measures are impossible or unethical to carry out. The assumptions of initial equivalence and of no change in the control group often may not be supported and, in such cases, interpretation of group differences is difficult and ambiguous.
In the pre-test/post-test design, measurements of the outcome or dependent variables are taken both before and after the intervention. This allows the measurement of actual changes for individual cases. However, the measurement process itself may produce change, thereby introducing difficulties in attributing change to the intervention alone. For example, in an experimental study of weight loss, simply administering a questionnaire concerning dietary habits may lead to changes in those habits by encouraging people to scrutinize, and hence modify, their own behaviour. Alternatively, in measures of skill, there may be a practice effect such that, without any intervention, performance on a second test will improve. In order to overcome these difficulties, many researchers turn to the post-test only design.
In order to economize on the number of participants required in an experimental design, researchers will sometimes re-use participants. Thus, at different times the participants may receive, say, drug A or drug B. If every participant receives more than one level of the drug variable or factor, then 'drug' is termed a repeated measures factor. An important consideration is the use of a 'counterbalanced' design to avoid order effects. For example, half the participants should receive drug A first, and half drug B first (see the sketch following this paragraph). If all the participants received drug A first and then drug B, the design would not be counterbalanced and we would not be able to separate the effects of the drugs from the effects of the order in which they were administered. Time is a common repeated measures factor in many studies: a pre-test/post-test design involves measuring the same participants twice, so if 'time' is included in the analysis, it is a repeated measures factor. In statistical analysis, repeated measures factors are treated differently from factors where each level is represented by a separate, independent group. This applies both to the matched groups discussed earlier and to the repeated measures discussed here.
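The counterbalancing itself is straightforward to implement. In this minimal Python sketch (participant codes hypothetical), a randomly chosen half of the sample receives drug A first and the other half receives drug B first:

```python
import random

random.seed(7)
participants = [f"P{i:02d}" for i in range(1, 21)]
random.shuffle(participants)   # random choice of who receives which order

half = len(participants) // 2
order = {p: ("drug A", "drug B") for p in participants[:half]}
order.update({p: ("drug B", "drug A") for p in participants[half:]})

# Each participant receives both drugs; the order of administration
# is balanced across the sample.
```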
A researcher will often not be content with the manipulation of one intervention factor or variable in isolation. For example, a clinical psychologist may wish to investigate the effectiveness of both a type of psychological therapy and a drug therapy for a group of patients. Let us assume that the psychologist was interested in the effects of Rogerian therapy versus no psychological therapy, and of drug A versus no drug. These two variables lead to four possible combinations of treatment (see Table 5.1).
Table 5.1 Example of a factorial design
| | Drug A | No drug |
|---|---|---|
| Rogerian therapy | 1 | 2 |
| No psychological therapy | 3 | 4 |
This design enables us to investigate the separate and combined effects of both independent variables upon the outcome measure(s). In other words, we can look for interactions between the two (or more) factors. If all possible combinations of the values or levels of the independent variables are included in the study, the design is said to be fully crossed. If one or more combinations are left out, it is said to be incomplete. With an incomplete factorial design, there are difficulties in determining the complete joint and separate effects of the independent variables.
To clarify the terminology used in experimental designs, it is instructive to consider how research methodologists would typically describe the example in Table 5.1. This is a study with two independent variables (sometimes called factors), namely type of psychological therapy and drug treatment. Each independent variable or factor has two levels or values; in the case of psychological therapy, the two levels are Rogerian therapy and no psychological therapy. This would commonly be described as a 2 by 2 design (each factor having two levels), giving four groups (2 × 2) in the design.
If a third level, drug B, were added to the drug factor, it would become a 2 by 3 design requiring six groups: drug B with Rogerian therapy and drug B with no psychological therapy would be added. It is possible to overdo the number of factors and levels in experimental studies: a 4 × 4 × 4 design would require 64 groups, which is a lot of research participants to find for a study. It is also worth noting that when we evaluate two or more groups over a period of time we are using a factorial design, with time as one of the factors.
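Because the number of groups in a fully crossed design is the product of the numbers of levels per factor, the cells are easy to enumerate and count. A brief sketch using the 2 by 3 example:

```python
from itertools import product

therapy = ["Rogerian therapy", "no psychological therapy"]   # 2 levels
drug = ["drug A", "drug B", "no drug"]                       # 3 levels

cells = list(product(therapy, drug))   # fully crossed 2 x 3 design
print(len(cells))                      # 6 groups
for cell in cells:
    print(cell)

# The count grows multiplicatively: a 4 x 4 x 4 design has 4 ** 3 = 64 cells.
```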
Just as it is possible to have multiple independent variables in an experimental study, it is also sometimes appropriate to have multiple outcome or dependent measures. For example, in assessing the effectiveness of an intervention such as the icing of an injury, the extent of oedema and the area of bruising are both important outcome measures. In this instance, there would be two outcome measures. The use of multiple dependent variables is very common in health research. The outcomes measured are usually evaluated individually, although there are more complex statistical techniques that enable the simultaneous analysis of multiple dependent variables.
We have already discussed external validity in Chapter 3. There are further issues affecting the generalizability of an investigation, arising from the procedures used and from the interaction between participants and researchers or therapists.
A series of classic experiments by Rosenthal (1976) and other researchers has shown the importance of expectancy effects, whereby the expectations of the experimenter are conveyed to the experimental subject. This type of expectancy effect has been termed the Rosenthal effect and is best explained by considering some of the original literature in this area.
Rosenthal and his colleagues performed an experiment involving the training of two groups of rats in a maze-learning task. A bright strain and a dull strain of rats, specially bred for the purpose, were trained by undergraduate student experimenters to negotiate the maze. After a suitable training interval, the relative performances of the two groups were compared. Not surprisingly, the ‘bright’ strain significantly outperformed the ‘dull’ strain.
What was surprising was that the two groups of rats were actually genetically identical: the researchers had deceived the student experimenters for the purposes of the study. The students' expectations had probably resulted in differences in handling and recording, which apparently affected the rats' measured learning outcomes. Rosenthal's results have been confirmed repeatedly, in a variety of experimental settings and with a variety of participants.
If the Rosenthal effect is so pervasive, how can we control for its effects? One method of control is to ensure that the ‘experimenters’ do not know the true purpose of the study; that is, the experimental hypothesis. This can be done by withholding information – just not telling people what you are doing – or by deception.
Deception is riskier and less ethically acceptable. Most organizations engaged in research activity have ethics committees that carefully monitor and limit the use of deception in research. If the people carrying out the data collection are unaware of the research aims or hypotheses being tested, we say that they are 'blind' to the research aims.
As well as the impact of expectations in experimental studies, there is also the issue of whether the attention paid to participants in the experimental setting alters the results.
In the late 1920s, a group of researchers at the Western Electric Hawthorne Works in Chicago investigated the effects of lighting, heating and other physical conditions upon the productivity of production workers. Much to the surprise of the researchers, the productivity of the workers kept improving independently of the actual physical conditions: even with dim lighting, productivity reached new highs. It was obvious that the improvements observed were not due to the manipulations of the independent variables themselves, but to some other aspect of the research situation. The researchers concluded that there was a facilitation effect caused by the presence of the researchers. This type of effect has been labelled the Hawthorne effect and has been found to be prevalent in many settings. Of particular interest to us is the Hawthorne effect in clinical research settings: it must be considered that even 'inert' or useless treatments might result in significant improvements in patients' conditions under certain circumstances. The existence of such effects reinforces the importance of having adequate controls in applied clinical research. Although we cannot eliminate the Hawthorne effect, we can at least measure its size through observation of the control group and evaluate the experimental results accordingly.
Investigations involving human participants require researchers to consider both psychological and ethical issues when designing experiments. Human beings respond actively to being studied: a person recruited as a research participant might formulate a set of beliefs about the aims of the study and will have expectations about the outcomes. In health research, placebo effects are positive changes in signs and symptoms in people who believe that they are being offered effective treatment by health professionals. These improvements are in fact elicited by inert treatments and are probably mediated by the patients' expectations. Health professionals who hold strong beliefs about the effectiveness of their treatments may communicate this attitude to their patients and thereby increase the placebo effects.
Placebo responses as such are not a problem in everyday health care. Skilled therapists utilize this effect to the patients’ benefit. Of course, charlatans exploit this phenomenon, masking the poor efficacy of their interventions. In health research, however, we need to demonstrate that an intervention has therapeutic effects greater than a placebo, hence the need for a placebo control group. The ideal standard for an experiment is a double-blind design, in which neither the research participants nor the health providers/experimenters know which participants are receiving the active form of the treatment.
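In drug trials, double-blinding is commonly implemented by having an independent third party (e.g. the hospital pharmacy) hold the key linking coded labels to treatment contents. The Python sketch below, with hypothetical participant and treatment codes, illustrates the idea:

```python
import random

random.seed(99)
participants = [f"P{i:02d}" for i in range(1, 21)]
random.shuffle(participants)

# Only the independent third party holds this key; participants and
# clinicians see nothing but the codes 'X' and 'Y'.
key = {"X": "drug A (active)", "Y": "placebo"}

allocation = dict(zip(participants, ["X"] * 10 + ["Y"] * 10))
```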
It is not always possible, however, to form placebo control groups. In drug research and some other physical treatments it is feasible, but with certain behaviourally based interventions, in areas such as psychotherapy, physiotherapy or occupational therapy, double-blind placebo interventions may be impossible to implement. For instance, how could a physiotherapist offering a complex and intensive exercise programme be 'blind' to the treatment programme being offered? In these situations researchers employ no-treatment or traditional-intervention control groups when evaluating the safety and effectiveness of a novel intervention.
There are also ethical issues to be taken into account when using placebo or non-intervention control groups. Where people are suffering from acute or life-threatening conditions, assignment to a placebo or no-treatment control group could have serious consequences. This is particularly true for illnesses where prolonged participation in a placebo control group could have irreversible consequences for the sufferers. Under such circumstances, placebo or no-treatment controls might well be unethical and a traditional treatment control group would be preferred.
The experimental (RCT) approach to research design involves the active manipulation of the independent variable(s) through the administration of an intervention, often with a non-intervention control group, and the measurement of outcomes through the dependent variable(s). Good experimental design requires careful sampling, assignment and measurement procedures to maximize both internal and external validity.
The Hawthorne and Rosenthal effects are important factors affecting the validity of experimental studies, and we attempt to control for these by ‘blinding’ when ethically possible.
Common experimental designs include the pre-test/post-test, post-test only, repeated measures and factorial approaches. When well implemented, these designs enable investigations to demonstrate causal effects. More recent approaches to evidence-based health care have emphasized the importance of employing experimental trials wherever possible.
Explain the meaning of the following terms: