CHAPTER 16
Outcome Measurement in Upper Extremity Practice

JOY C. MACDERMID, BScPT, PhD

WHAT IS A HEALTH OUTCOME MEASURE?

HOW CAN CLINICIANS USE OUTCOME MEASURES?

WHAT ARE THE IMPORTANT MEASUREMENT PROPERTIES OF OUTCOME MEASURES?

WHAT DO I NEED TO KNOW BEFORE USING OUTCOME TOOLS?

HOW DO I FIND OUTCOME MEASURES THAT ARE SUITABLE FOR ME?

HOW DO I ADMINISTER THE OUTCOME MEASURE TO MEASURE CHANGE IN MY PATIENTS?

HOW DO I ANALYZE OUTCOME SCORES?

HOW DO I INCORPORATE PREDICTORS OF OUTCOME?

HOW DO I SET UP A PROCESS TO MEASURE OUTCOMES IN MY PRACTICE?

SUMMARY


CRITICAL POINTS

Outcome measures need to be both reliable and valid.

Therapists should perform outcome evaluation in the clinical setting using currently accepted standardized methods, as described in manuals or research papers.

Outcome measures are typically used for evaluation over time, but they can also be used to discriminate between patient groups and predict future status such as return to work.

The implementation of outcome measures within clinical practice requires information that may require purchase or permissions from developers and always requires knowledge about proper scoring and interpretation of scores.

Selecting the appropriate outcome starts with determining the purpose and scope of measuring and then requires matching the patient’s problem, level of difficulty, and communication capacity with these properties of the tool.

What Is a Health Outcome Measure?

A health outcome measure is any measurement of a patient’s health status. That view of health status can be broad, such as when we measure overall health or quality of life. We can also focus on very specific aspects of health. Pain and function are specific aspects of health that are of particular interest to hand therapists. Health can change over time as a result of time, treatment, or disease. Patients’ perceptions of their health status can change because of anatomic and physiologic changes that alter body functions, psychological changes that affect perception or calibration of health, or social changes that alter the experience of living with a specific health status. For this reason, measuring outcomes can be complex and requires a theoretical foundation, as well as different instruments to account for different perspectives and purposes.

The most internationally accepted standard of health is proposed by the World Health Organization. This organization produces both The International Classification of Diseases (ICD) (www.who.int/classifications/icd/en/) and an International Classification of Functioning, Disability and Health (ICF) (www.who.int/classifications/icf/en/). The latter is increasingly being used as a framework by which outcome measures are classified1-6 (Fig. 16-1).


Figure 16-1 The conceptual model for the International Classification of Functioning. (Adapted from www.who.int/classifications/icf/en/.)

Body functions are physiologic or psychological functions of body systems.7 Body structures are anatomic parts of the body, such as organs, limbs, and their components. Impairment is the loss or abnormality of psychological, physiologic, or anatomic structure or function. Examples of impairments that hand therapists typically measure include hand size, appearance, strength, range of motion (ROM), volume, sensory threshold, and pain. Methods and interpretation for measuring impairments of the hand are the traditional focus of hand therapy and are detailed in many of the chapters in this book discussing evaluation.

Activity is the execution of a task or action by an individual. Participation is involvement in a life situation. Inability in these areas can be termed activity limitations or participation restrictions. Tests that measure performance of specific tasks, such as the TEMPA (Test d’Evaluation des Membres Supérieurs de Personnes Agées),8 Jebsen Test of Hand Function,9 Purdue Pegboard,10 Minnesota Rate of Manipulation Test,11 the RNK dexterity test,12,13 and other “hand function tests,” can measure activity limitations. Activity limitations can also be measured by self-report by asking individuals whether they can perform a specific activity like lifting a grocery bag. Indicators that focus on resuming roles, like returning to work, reflect participation and can be measured as actual status or by self-report. Many self-report functional scales contain both activity-type and participation-type items. In fact, a number of studies have now classified items of specific upper extremity scales14,15 or hand problems16-18 using the ICF. However, moving ICF into practice has been slow. Recently, core and brief core measures were developed for hand conditions using an international evidence-based consensus process. These codes are available for all to use (www.icf-research-branch.org/research/Hand.htm).

In this chapter, we discuss principles that apply to all outcome measures, but emphasize self-report because many of the chapters in this book have focused on impairment measures.

How Can Clinicians Use Outcome Measures?

The basic uses of a measure’s scores include evaluation of change over time, discrimination between groups of patients, and prediction of future status.19 Hand therapy is characterized by development of advanced evaluative measures of hand impairment. Publication of Clinical Assessment Recommendations was one of the first accomplishments of the American Society of Hand Therapists, and the second edition remained in print for 20 years.20 This guide has traditionally focused on measuring hand impairments. However, it is increasingly becoming standard practice to include functional measures, particularly those involving self-report. In fact, some payers now mandate this practice. A new expanded version of the Clinical Assessment Recommendations is expected in 2010 and will include self-report measures and information on ICF.

A standardized outcome measure is one that has specific properties: it is published; there are detailed instructions on how to administer, score, and interpret the test; it has a defined purpose; it was designed for a specific population; and there are published data indicating acceptable reliability and validity. Standardization in clinical measurement is essential to ensure that outcome measures are capable of providing valid information about a patient’s health status.

Evaluation over Time

The most common application in clinical practice is evaluation over time.19 Optimally, evaluation over time includes using outcome measures to set goals and then determining whether detectable important changes have occurred.

When designing a treatment plan, hand therapists typically determine which pathologic processes or physical impairments are contributing to the patient’s complaint or compromised health. Treatment programs are then designed to allow optimal recovery and to minimize any residual impairment or functional limitation. Before and after intervention, examination is required to determine the effectiveness of the selected intervention. For example, if a therapist evaluates a tendon repair and is concerned about tendon gliding, then active range of motion (AROM) is appropriate to measure. Treatment for this patient might include a variety of interventions that are expected to improve tendon glide. It is essential that hand therapists evaluate impairment measures that are expected to change, such as AROM in this case. However, AROM may not be relevant to function for all patients, problems, or stages of recovery. Therapists should avoid “rote” use of any measure without considering the context and whether the measurement is able to provide useful information. If we think about a flexor tendon patient who achieves improved glide as a result of hand therapy, the goal should be a concomitant improvement in the patient’s ability to perform activities such as gripping a handle and more successful participation in work. These effects can be measured using standardized outcome instruments. However, a review of outcome measures used when assessing patients with tendon and nerve repairs indicates a primary focus on impairments, particularly range of motion.21

The first step in using outcome measures to evaluate change over time begins at the initial assessment. The therapist must select an appropriate measure that is both relevant to the patient’s problem and also has capacity to detect change. (See Table 16-1 and companion Web site for examples.) A short-term goal for improvement is then set using evidence from the literature about the minimal detectable change (MDC). Short-term goals are set to exceed the minimal detectable change so that you can be confident that the patient has improved beyond the amount that might occur due to random fluctuation in patient status. Longer-term goals can be set using your clinical experience about reasonable targets for that patient population or surgical procedure, with assistance from published outcome studies that provide scores for different patient subgroups and stages of recovery. Using MDC helps us determine whether patients have changed in subsequent reassessments. Using published outcome data we can determine if patients have met acceptable targets.
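The goal-setting logic described above can be sketched in a few lines of Python. This is only an illustration, not a validated clinical tool: the function names, the `lower_is_better` flag, and the baseline score and MDC value are all hypothetical, and a real MDC should be taken from the published literature for the specific measure and patient population.

```python
def mdc_threshold(baseline: float, mdc: float, lower_is_better: bool = True) -> float:
    """Score that a short-term goal must surpass in order to exceed the MDC."""
    return baseline - mdc if lower_is_better else baseline + mdc


def exceeds_mdc(baseline: float, followup: float, mdc: float,
                lower_is_better: bool = True) -> bool:
    """True when the improvement since baseline is larger than the MDC."""
    improvement = baseline - followup if lower_is_better else followup - baseline
    return improvement > mdc


# Hypothetical example: a disability scale on which lower scores are better.
baseline_score = 60.0   # hypothetical score at the initial assessment
published_mdc = 12.0    # hypothetical MDC; use the published value in practice

print(mdc_threshold(baseline_score, published_mdc))      # 48.0
print(exceeds_mdc(baseline_score, 45.0, published_mdc))  # True  (improved by 15)
print(exceeds_mdc(baseline_score, 52.0, published_mdc))  # False (improved by 8)
```

With these hypothetical numbers, the short-term goal would be a score better than 48; a reassessment score of 45 would count as detectable change, whereas a score of 52 could still reflect random fluctuation.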

Table 16-1 Examples of Self-Report Scales Used on Outcome Evaluation in the Upper Extremity


Discrimination

Discrimination between groups is required when the purpose is to discern different subgroups within a population.19 For example, the Katz hand diagram22 discriminates between individuals having carpal tunnel syndrome (CTS) and those who do not. Others have developed a diagnostic scale to assess the probability of CTS.23 Diagnostic tests are not outcome measures, but rather are designed to differentiate groups (e.g., those having a pathology versus those who do not). In general, measures designed for diagnosis are not useful for evaluating change over time. For example, Phalen’s test is useful for diagnosing CTS but not for assessing treatment effectiveness or outcome. Discrimination can also be performed for purposes other than diagnosis. It can be important to differentiate among clinical subgroups that require different treatment approaches or have a different prognosis. For example, with constructs like readiness or capability to return to work, safety during mobility, or return to home, determining if illiteracy is a factor can help in deciding how best to optimize treatment planning and patient outcomes. Increasingly, we are seeing a move toward differentiating patient subgroups that require different treatment approaches. Scales (or clinical prediction rules) devised for this purpose are an example of discriminative measures.

Prediction

Finally, outcome measures can be used to predict future outcomes. What will be the final strength 1 year after a fracture? Who will return to work? Who will require surgery for median nerve compression? These are prediction questions that might interest clinicians. When we predict outcomes, we use scores on rating scales at some preliminary stage to predict future scores or outcomes. For example, we demonstrated that patients presenting for conservative management of carpal tunnel syndrome who subsequently proceeded to have a surgical release had higher initial symptom severity scores24 than those whose conditions were successfully managed conservatively (3.3 vs. 2.9). Similarly, high baseline scores after a distal radius fracture identified patients who were less likely to return to work at 6 months.25 In fact, baseline score is commonly a predictor of final status, suggesting that patients presenting with unusually poor scores are usually at higher risk of poor outcomes.

Outcome measures have specific measurement properties that determine how well they function in evaluating change, discriminating, or predicting. These measurement properties can be competing; therefore, an instrument designed for one purpose may not be suited to others.19 Generally, hand therapists are most interested in evaluative measures. Unfortunately, in some cases, clinicians use measures designed for discrimination as evaluative outcome measures without realizing they may not be appropriate for this purpose. For example, hand diagrams are useful for assessment, but not for evaluating treatment. Similarly, evaluative measures may not be predictive. For example, AROM may be used to evaluate a change in tendon glide over time, but does AROM predict the ability to return to work? We demonstrated that physical impairments were less predictive of time to return to work following distal radius fracture than were self-report measures.25 Although there is some evidence in the clinical literature on the evaluative aspects of AROM as an outcome measure, we really do not have much evidence on its predictive or discriminative properties. It is clear that it is important to “pick the right tool for the job.”

What Are the Important Measurement Properties of Outcome Measures?

The three measurement properties fundamental to how a tool can be used for clinical measurement are reliability, validity, and responsiveness (ability to detect clinical change over time).

Reliability

Reliability is the consistency or repeatability of a measurement. Reliability is fundamental to other measurement properties because without stability, the utility of any measure is compromised. However, high reliability, in itself, does not ensure that other measurement properties are also acceptable. Therefore, both the reliability and validity of outcome measures should be documented before clinicians use them to make decisions.

Measurements can be repeated by the same therapist (intrarater), by different therapists (inter-rater), or on different occasions (test–retest). Generally, intrarater reliability is higher than other forms of reliability analysis because the measurement error attributable to differences between testers and occasions is not considered. However, for evaluating patients over time, it is important to know that a measure remains constant over time if the patient remains stable (i.e., test–retest reliability). When we expect to share our measurements with others through clinical assessment notes, progress reports, or research studies, it is important to know that the status we report is consistent with what others would have determined (i.e., inter-rater reliability). Some clinicians mistakenly assume that impairment measures are more reliable than self-report measures because they consider the latter subjective. Certainly, patient perceptions of their status have some random fluctuation based on factors like recent functional demands and mood. However, in general, self-report measures have higher reliability coefficients than many impairment measures.26,27

Furthermore, a variety of factors affect measured impairments like grip strength. These include consistency of instructions and positioning, and interaction effects with the tester, time of day, fatigue, nutritional status, mood, and motivations. Hand therapists should be aware of methods to make their measurements more comparable to those described in the literature and consistent over time. Standardization, including elements like consistent technique, landmarking instructions, positioning, and instrument calibration, is used by hand therapists to reduce measurement error. It has been demonstrated that certain clinical measures such as ROM can be performed reliably by both novice and experienced therapists when a standardized method is used.28 By using methods described in reliability studies, therapists can be more confident of comparing their results with those of others.

Reliability can be assessed using different statistics. Basic understanding of these statistics is important for clinicians because it helps them comprehend how to use published reliability studies to improve clinical expertise. The intraclass correlation coefficient (ICC)29 is commonly used in the hand therapy literature to describe the relative reliability (the ratio between variability observed on repeated measurements within individuals compared with the variability between individuals). Reliability coefficients can be compared with benchmarks. Various benchmarks have been proposed. Fleiss suggested that less than 0.40 indicates poor reliability, that 0.40 to 0.75 is moderate, and that greater than 0.75 is excellent reliability.30 The problem with this approach is that it suggests that reliability is a pass/fail criterion that measures should achieve. However, a better way for hand therapists to think about measurement error is that it exists for all measures and it is more important to understand the extent to which it affects a given assessment. Statistics like the standard error of measurement (SEM), or mean error, allow therapists to view measurement error in more quantitative terms.31 For example, it has been shown that ROM measurements of elbow flexion and extension vary 3 to 5 degrees on average for the same tester and 5 to 8 degrees for different testers.28 SEM is important because it can be used to calculate the MDC. Minimal detectable change is a useful target for short-term improvement as it represents the amount of change that is likely to indicate a real change in status. MDC has been established for many self-report measures. Exemplars of how to apply these principles to setting goals and evaluating change are available in the hand therapy literature.31,32
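The relationship between reliability, SEM, and MDC can be illustrated with a short Python sketch. The formulas used here (SEM = SD × √(1 − ICC) and MDC = z × √2 × SEM, with z = 1.645 for 90% confidence) are the ones commonly cited in the measurement literature rather than ones stated in this chapter, so they should be treated as an assumption. The function names and the example standard deviation and ICC are hypothetical. The benchmark labels follow the Fleiss cut points quoted above.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM from the sample standard deviation and a reliability coefficient.

    Assumes the commonly cited formula SEM = SD * sqrt(1 - ICC).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem: float, z: float = 1.645) -> float:
    """MDC at the confidence level implied by z (1.645 for 90%, 1.96 for 95%).

    Assumes the commonly cited formula MDC = z * sqrt(2) * SEM.
    """
    return z * math.sqrt(2.0) * sem


def fleiss_benchmark(icc: float) -> str:
    """Label a reliability coefficient using the cut points quoted in the text."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "moderate"
    return "excellent"


# Hypothetical values for a self-report scale scored 0 to 100.
sem = standard_error_of_measurement(sd=10.0, icc=0.91)
print(round(sem, 2))                             # 3.0
print(round(minimal_detectable_change(sem), 2))  # 6.98
print(fleiss_benchmark(0.91))                    # excellent
```

Under these assumptions, a scale with a standard deviation of 10 points and an ICC of 0.91 has an SEM of about 3 points, so a change of roughly 7 points or more would be needed before it could be regarded as real change with 90% confidence.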

Validity

Validity is the extent to which the measure accurately portrays the aspect of health status that it was intended to describe. It can be thought of as the “trueness” of the measure.33 Validity is difficult to ascertain because in many concepts of interest to hand therapists, such as pain or disability, there is no single or measurable true answer. A measure may be valid for one purpose, but not for other purposes. For example, a general health instrument may be a valid indication of overall health but may not be valid when assessing change in upper extremity function after certain hand injuries. Therefore, validity needs to be assessed by a variety of methods and in various situations. Validity is the cumulative evidence provided to support the use of outcome instruments in specific situations to perform specific analytic functions. For this reason, various forms of validity are recognized.

Content validity is the extent to which a measure represents an adequate sampling of the concept being measured. This can be measured in the development of patient questionnaires by using focus groups or patient surveys to determine which items should contribute to the outcome scale. It can be determined through consensus reviews or expert panels that review existing items. For example, we expect a carpal tunnel instrument to include questions about classic symptoms of CTS, such as numbness, tingling, and waking at night. By looking at the items of the Symptom Severity Scale,34 we make a judgment that it has content validity.

Construct validity is the extent to which scores obtained agree with the theoretical underpinnings of that scale. Testing constructs derived from the theoretical underpinnings of an instrument requires that relationships be investigated. Does the instrument relate to other instruments the way one would expect? Scales measuring similar constructs should be correlated (convergent validity); whereas dissimilar constructs should not demonstrate a significant relationship (divergent validity). Another type of construct that is tested is evaluating whether subgroups expected to be different based on the theoretical construct or existing evidence demonstrate this difference on the outcome measure being evaluated (known groups validity). For example, do people with more severe fractures score more poorly? Do patients in a nursing home have lower scores than patients living independently?

The process of demonstrating whether an instrument is valid and reliable is ongoing and requires multiple studies to ensure that measurement scales can be applied to different clinical populations and examination needs. Clinicians are often tempted to devise their own instruments or modify existing instruments to make something that is directly applicable to their own clinical situation. This is not advisable as the new instrument is not validated, nor comparable to other scores.

Responsiveness

The ability to detect change over time is critical to determining whether patients improve with treatment or deteriorate over time. Responsiveness is the measurement property that reflects this. Numerous studies in the hand therapy literature compare the responsiveness of different self-report and impairment measures. It is important for hand therapists to know about the relative responsiveness of different tools they might use in their practice. If an instrument is not able to pick up change, then insurers, patients, or members of the health-care team may not believe that the treatment efforts are effective. As a first step, therapists should consult the literature to find out about the relative responsiveness of different tools. As a general rule, the more specific the measure is to the problem or condition that is being treated, the more responsive the tool usually is. As an example, the Short-Form 36 (SF-36) is a common and important indicator of general health status. However, it is generally not very responsive in hand conditions. This might be expected when looking at the items since few of them relate to upper extremity function. However, therapists must also consider whether an instrument will be responsive in specific patients. A common example is patients with higher levels of functioning (younger, healthier patients) or higher demands (athletes, musicians, workers) who are unable to perform their normal roles but generally do not have difficulty with the lower-level items of many common functional scales. As a general rule, if a patient scores near the upper or lower range of a score, therapists should think about the potential for “ceiling” or “floor” effects. If the score is in a range where an MDC could not be achieved, then the tool is not appropriate for that patient. A different tool or a patient-specific tool might be indicated in these cases.
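The check for ceiling and floor effects described above amounts to asking whether the patient's score leaves more than one MDC of room before the end of the scale. A minimal Python sketch follows; the function name, the `lower_is_better` flag, and the scores and MDC in the example are hypothetical.

```python
def room_for_change(score: float, scale_min: float, scale_max: float,
                    mdc: float, lower_is_better: bool = True) -> dict:
    """Report whether a change larger than the MDC is still possible.

    'improvement' means movement toward the better end of the scale,
    'deterioration' means movement toward the worse end.
    """
    if not scale_min <= score <= scale_max:
        raise ValueError("score lies outside the scale range")
    toward_min = score - scale_min
    toward_max = scale_max - score
    room_to_improve = toward_min if lower_is_better else toward_max
    room_to_worsen = toward_max if lower_is_better else toward_min
    return {
        "can_detect_improvement": room_to_improve > mdc,
        "can_detect_deterioration": room_to_worsen > mdc,
    }


# Hypothetical high-functioning patient: 5/100 on a disability scale
# where 0 is the best possible score, with a hypothetical MDC of 12 points.
print(room_for_change(score=5.0, scale_min=0.0, scale_max=100.0, mdc=12.0))
# {'can_detect_improvement': False, 'can_detect_deterioration': True}
```

In this hypothetical case the patient cannot improve by more than the MDC on this scale, which suggests that a different or patient-specific tool should be considered.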

What Do I Need to Know Before Using Outcome Tools?

It is important to select an outcome measure that is reliable and valid for the purpose and clinical examination. A thorough search of the literature for a measure that meets your clinical examination needs is advisable since many recently developed impairment and disability measures are available. Then literature on the measurement properties of potentially useful instruments should be reviewed to determine which ones have acceptable measurement properties.

Once a measure with acceptable measurement properties has been identified, the clinician needs to investigate issues of practicality. Some patient questionnaires require permission from the authors or payment (or both) before being used. The time required to complete the test, language requirements, cost, data analysis requirements, and training requirements must be determined. Issues around whether a measure is feasible for your clinical setting are important considerations. Standards for the use of measures were described in 1991 by the American Physical Therapy Association’s Task Force in their Standards for Tests and Measurements in Physical Therapy.35 Other sources of information on standardization are the Advisory Group on Measurement Standards of the American Congress of Rehabilitation Medicine and a variety of publications that focus on outcome measures.36 The general standards for use of measures discussed here are also addressed elsewhere.33,35,36

Users of measures should know the technical aspects of the instrument; that is, they should have documentation about the test’s scoring, interpretation, reliability, and validity. For impairment-based measures, details on calibration and test positioning and procedures are key. Self-report measures have specific scoring metrics. Some instruments, such as the Disabilities of the Arm, Shoulder, and Hand (DASH)37 or the Patient-Rated Wrist Evaluation (PRWE),38 can be easily scored by the therapist at the time of examination. Others, such as the SF-36, have more complicated scoring algorithms.39,40

Measures should be used by the individuals who have the necessary training, experience, and professional qualifications. Certain types of examinations tend to fall in the domain of specific professionals. For example, certain psychometric examinations are performed only by licensed psychologists. Some aspects of hand examination require the expertise of a trained hand therapist. However, a number of outcomes can be evaluated by other professionals or even support staff. One purpose of reliability studies is to define what level of training and experience is required to administer a measure. Some assessments may not be technically demanding, and, thus, both inexperienced and experienced testers may achieve highly reliable results. For example, elbow flexion and extension and forearm rotation have been shown to be reliably measured when the tester is an experienced orthopedic surgeon, an experienced hand therapist, or an inexperienced physical therapist.28 Some physical assessments may require greater technical skill. For example, passive movement characteristics of the shoulder have been measured reliably when the tester is an experienced manual therapist.41 Knowledge of the technical skills required to perform specific measures may help determine appropriate delegation.

Therapists should make themselves aware of the training procedures required to properly administer a test and ensure that they adhere to the procedures. Sometimes the detail on how to perform the test or train personnel to perform the test is not provided or is incomplete. Dexterity tests such as the Jebsen Hand Function Test are widely used to assess patient outcomes. However, the detail on how the test is performed is insufficient, resulting in variations in how different clinicians perform it. Therapists can find details about how to perform a test in a manual or in reliability studies. When the details of how to perform a test are absent or incomplete, it is the tester’s responsibility to locate detailed instructions and to use these in clinical practice.

Having normative data or comparative data is required to interpret outcome scores objectively. For example, when commenting on a patient’s strength, it is advisable to compare the patient’s injured hand with the best estimate of that particular patient’s normal strength (i.e., his or her uninjured side) and to also compare that patient’s strength against scores considered normal values for patients of a similar size, age, and sex. Conclusions about the extent of strength recovered during rehabilitation or the ability to perform strength-based tasks can be made using these comparative data. For health status questionnaires like the SF-36, normative data and data for pathologic conditions are available for comparison.40 Normative data for upper extremity self-report measures are less common, but comparative data for clinical populations are readily available from clinical research.
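The two comparisons described for strength can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the grip values are made up, and the normative mean and standard deviation are placeholders that would have to be replaced with published values for patients of similar size, age, and sex.

```python
def percent_of_uninjured(injured: float, uninjured: float) -> float:
    """Injured-side strength expressed as a percentage of the uninjured side."""
    if uninjured <= 0:
        raise ValueError("uninjured-side strength must be positive")
    return injured / uninjured * 100.0


def z_score(value: float, norm_mean: float, norm_sd: float) -> float:
    """Number of standard deviations a value lies from the normative mean."""
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    return (value - norm_mean) / norm_sd


# Hypothetical grip strength values in kilograms.
injured_grip, uninjured_grip = 30.0, 40.0
placeholder_norm_mean, placeholder_norm_sd = 40.0, 5.0  # not real normative data

print(percent_of_uninjured(injured_grip, uninjured_grip))               # 75.0
print(z_score(injured_grip, placeholder_norm_mean, placeholder_norm_sd))  # -2.0
```

Reporting both figures lets the therapist describe recovery relative to the patient's own uninjured side and relative to a comparable normative group.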

Therapists should know the proper environmental conditions and equipment requirements for completing the outcome evaluation. For example, examination of sensory thresholds requires specific equipment and environmental conditions. The patient must be positioned properly, and the surroundings must be quiet if an accurate measurement is to be achieved. Instruments must be sensitive to small increments in pressure and they must be calibrated to ensure that they remain consistent over time. Environment can also affect self-report measures. Patients may feel uneasy about expressing dissatisfaction to their doctor or therapist, but they can be more frank when an independent assessor administers the questions. Telephone administration or the use of interpreters may be necessary to obtain self-report information, but might influence how people respond. The optimal approach to administering self-report measures is to have them administered by an independent person, such as the clinic receptionist providing the forms.

How Do I Find Outcome Measures That Are Suitable for Me?

The first step in identifying an appropriate instrument is to decide which concepts are important to measure based on the problem being treated and the expected effects of the intervention. As previously stated, the ICF is increasingly being used as a conceptual framework. Outcome measures can be found by searching the literature or textbooks for information addressing their psychometric properties. For impairment measures, search (using Boolean operators) for the type of measure (e.g., grip, strength, motion) AND (reliability OR validity) in PubMed or the Cumulative Index to Nursing and Allied Health Literature (CINAHL). You should retrieve numerous articles addressing clinical measurements. For self-report measures, there is no standard term, so using synonyms for a self-report outcome measure like “self-report” OR “questionnaire” OR “outcome measure” can be a good search strategy, combined with the psychometric terms (reliability OR validity) to identify appropriate clinical studies. Sometimes you will be fortunate to find a systematic review of an outcome measure like ones published for the Neck Disability Index,42 shoulder outcome measures,43,44 hand measures,45 or measures specific to hand osteoarthritis.46 Often, the actual forms required must be obtained from the developers as they may not appear in studies. Other sources for locating outcome instruments are Web sites specific to the measures. Searching “outcome measure database” in a search engine takes you to a number of databases that contain information about different outcome measures. Many of these are maintained by professional groups affiliated with rehabilitation. Searching using the name of the scale you wish to find and PDF (or using an advanced search to limit your retrieval to PDF files) can help you locate downloadable forms available on the Internet.
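The Boolean search strategy described above can be assembled mechanically. The following Python sketch builds the search string only; it does not query PubMed or CINAHL, and the function name and the quoting rule (phrases containing a space or hyphen are placed in double quotes) are assumptions made for illustration.

```python
from typing import Sequence


def boolean_query(measure_terms: Sequence[str],
                  property_terms: Sequence[str] = ("reliability", "validity")) -> str:
    """Combine synonyms with OR, then join the two groups with AND."""

    def group(terms: Sequence[str]) -> str:
        # Quote multiword or hyphenated terms so they are searched as phrases.
        quoted = [f'"{t}"' if (" " in t or "-" in t) else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"

    return group(measure_terms) + " AND " + group(property_terms)


# Impairment measure search.
print(boolean_query(["grip"]))
# (grip) AND (reliability OR validity)

# Self-report measure search using the synonyms suggested in the text.
print(boolean_query(["self-report", "questionnaire", "outcome measure"]))
# ("self-report" OR questionnaire OR "outcome measure") AND (reliability OR validity)
```

The resulting string can be pasted into the search box of a bibliographic database; adding a condition or body region as a further AND group is a natural extension.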

Data extraction forms, critical appraisal forms, and guides are available to assist in critically appraising the outcome scales themselves.32,47,48 However, in most circumstances, hand therapists are interested in using a measure that has acceptable psychometric properties, is clinically feasible, and is used by others in the profession.

Two commonly used self-report measures are the DASH and the PRWE. The DASH is a 30-item scale that focuses on disability of the upper extremity but also contains items on symptoms.49-51 The items are summated using a simple equation. The PRWE has 15 items: 5 addressing pain and 10 addressing disability (6 standardized specific activities and 4 inquiring about the patient’s usual preinjury activity). Both have high reliability, have been validated for a variety of hand and wrist conditions, have been translated into multiple languages, and are used by a variety of hand therapy practices.31,38,52 The DASH has the advantage of being useful for conditions throughout the upper extremity and being widely recognized. It is slightly less responsive than the PRWE for hand and wrist conditions, but substantially more responsive than generic measures.53-55 Recently, the QuickDASH56 was introduced. It contains 11 items from the original DASH, and in early studies has shown equivalent psychometric properties.57-62 Any of these measures is appropriate for routine use in a hand therapy clinic as a supplement to relevant impairment measures.

However, in some patients, traditional standardized self-report forms may not be adequate. Examples include patients with high demands (e.g., workers or athletes), unique skills (e.g., musicians), unique conditions (e.g., congenital problems, instability), or a mild spectrum of disease in an otherwise healthy patient. In these cases, scales specifically designed for higher-level functioning may be an alternative. However, in many cases these are not available. Another alternative is a patient-specific scale. This option allows patients to select items that are important to them and with which they are currently experiencing difficulty. Therefore, the level of difficulty is determined by the patient. A brief form of this is the Patient-Specific Functional Scale,63,64 in which three to five items are selected by the patient and rated on subsequent examinations. A more detailed clinician-administered instrument is the Canadian Occupational Performance Measure,65-68 which identifies occupational performance issues in different domains and rates them according to performance and satisfaction. Both measures are suitable for hand therapy practice, although the latter is more time-consuming to use. Patient-specific scales can be very useful in clinical practice because they are by nature client-centered and tend to be most responsive in picking up clinical change. One drawback is that scores cannot be compared across patients because the items are different. But since most therapists are interested in change over time, this does not affect their usefulness in treating individual patients.
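The scoring arithmetic behind the DASH, QuickDASH, and PRWE can be sketched in a few lines of Python. The sketch assumes the scoring rules commonly given in the developers' instructions: DASH and QuickDASH items are rated 1 to 5, the mean item response is rescaled to 0 to 100, and no score is computed when more than 3 of 30 (DASH) or 1 of 11 (QuickDASH) items are missing; PRWE items are rated 0 to 10, with the pain subscale summed and the disability subscale summed and halved so that each contributes up to 50 points. The function names are illustrative only, and the rules should be checked against the current user manual for each instrument before clinical use.

```python
from typing import Optional, Sequence


def _dash_style_score(responses: Sequence[Optional[int]],
                      n_items: int, max_missing: int) -> Optional[float]:
    """Rescale the mean of 1-5 item responses to 0-100 (higher = more disability).

    None marks an unanswered item. Returns None when too many items are missing.
    """
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} items, got {len(responses)}")
    answered = [r for r in responses if r is not None]
    if any(not 1 <= r <= 5 for r in answered):
        raise ValueError("items must be rated 1 to 5")
    if not answered or n_items - len(answered) > max_missing:
        return None  # assumed rule: the score is not calculated
    mean_response = sum(answered) / len(answered)
    return (mean_response - 1) * 25


def dash_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """30-item DASH; assumed rule: at most 3 missing items."""
    return _dash_style_score(responses, n_items=30, max_missing=3)


def quickdash_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """11-item QuickDASH; assumed rule: at most 1 missing item."""
    return _dash_style_score(responses, n_items=11, max_missing=1)


def prwe_score(pain: Sequence[int], function: Sequence[int]) -> float:
    """PRWE total out of 100: pain sum (5 items) + half the function sum (10 items)."""
    if len(pain) != 5 or len(function) != 10:
        raise ValueError("PRWE needs 5 pain items and 10 function items")
    if any(not 0 <= r <= 10 for r in list(pain) + list(function)):
        raise ValueError("items must be rated 0 to 10")
    return sum(pain) + sum(function) / 2
```

For example, a patient who rates every DASH item 3 ("moderate difficulty") scores 50 under these assumptions, as does a patient who rates every PRWE pain item 4 and every function item 6.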

When a clinical condition has unique features and is common, a condition-specific tool may be developed. The most commonly used example in hand therapy is the Symptom Severity Scale.34 This tool goes by several names, but addresses symptoms specific to CTS, such as waking at night, numbness, and tingling. It has been shown to be more responsive than functional scales and generic scales in detecting recovery following treatment for CTS.69

A number of scales combine measures of impairments, such as ROM and strength, with measures of functional ability, which is usually evaluated by the clinician. Examples of such scales are the Mayo Elbow Performance Index (MEPI)70 and the Constant–Murley score.71 Scores from these scales are often rated as excellent, good, fair, or poor so that the number of patients falling into each category can be reported in case series or outcome studies. Clinicians should be aware that these scales have limitations. The subjective categories are not meaningful because they have not been validated, are not consistent between scales, and are not reliable.72 These scales tend to be developed by clinicians based on their personal opinion of which items to include and how to weight the subcomponents and, hence, have not been developed through the optimal clinimetric process. Of greater concern is that the reliability and validity of these scales tend to be poorly addressed. The most studied scale of this type is the Constant–Murley score, developed for the shoulder. A recent systematic review of this measure highlighted some of its strengths and limitations.73 Given that impairment and disability are separate constructs, it is advisable for therapists to track and interpret these separately in clinical practice. Where these clinician-based scales are used, the actual score, not subjective ratings such as “good” and “fair,” should be reported.

How Do I Administer the Outcome Measure to Measure Change in My Patients?

When using outcome tools to assess change in individual patients, a number of practical details must be considered. Who will administer the questionnaire? Some therapists like to introduce the questionnaire to their patients themselves as a way to facilitate the subjective evaluation. Other therapists prefer the questionnaire to be administered by an independent person to minimize the opportunity for bias. Therapists need to work collaboratively with others in their clinic to determine which process is optimal for their clinic.

The way in which the questionnaire is administered may depend on the patient’s capacity. The effect of illiteracy is grossly underestimated. When patients ask a spouse or family member to fill out forms, refuse to cooperate, or ask to take questionnaires home, illiteracy may be the underlying reason. Others may try to mask their illiteracy by completing the forms, but their answers will be nonsensical. By offering to read questions to patients in a circumspect way, or allowing them to complete the forms with a spouse, you can provide these patients the opportunity to convey their opinions on their status without compromising their dignity.

It is important to plan how often outcome instruments will be administered. A baseline and final status evaluation are the minimum requirements for determining the effect of treatment. However, when instruments are used throughout a treatment program, they can help assess the course of recovery. In this case, they must be applied at intervals over which an MDC is expected to occur. Some developers recommend 2 weeks, but clearly this depends on the rate of change of the problem. Instruments are not usually completed twice within the same week because patients may be relatively stable within this time frame.

Some clinical examinations are performed for research or program evaluation and require a postdischarge or long-term evaluation. This should be performed when surgical and rehabilitation efforts have been maximized. Long-term follow-up can also be performed to gauge deterioration in patients with chronic disease or to gauge the effects of late complications (e.g., joint arthroplasty failure). In these cases, it may be necessary to use multiple types or occasions of contact to get adequate responses. A combination of mail, phone, or Internet surveys of status can be useful for determining longer-term status with response rates that are adequate for the information to be considered valid.

How Do I Analyze Outcome Scores?

Treatment effects are assessed by evaluating the change in score from baseline to post-treatment. The MDC is the amount of change that exceeds measurement error. The clinically important difference (CID) is the amount of change that has been shown to make a difference to patients. These two concepts are determined differently, but a discussion of those methods is beyond the scope of this chapter. It is important to be aware of these two indicators because they help with interpretation of outcome measure scores. MDC and CID values can sometimes be found in clinical measurement studies. Clinicians can interpret individual scores using these two concepts in combination with their clinical knowledge of expected outcomes and normative data. The MDC is a useful benchmark for short-term goals since it is the amount of change that gives us confidence that the person’s ability has actually changed. Longer-term goals should exceed the CID since this amount of improvement should indicate a true benefit to the patient.
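The way these two benchmarks are applied to an individual patient's change score can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not a validated decision rule: the function name, the category labels, and the threshold values used in the usage example (an MDC of 10 points and a CID of 15 points) are hypothetical placeholders, and real values must come from clinical measurement studies of the specific instrument and population.

```python
def interpret_change(baseline: float, follow_up: float,
                     mdc: float, cid: float,
                     higher_is_worse: bool = True) -> str:
    """Classify a change score against the MDC and CID thresholds.

    higher_is_worse=True suits pain/disability scales where a falling
    score means improvement. The thresholds are supplied by the caller.
    """
    if mdc <= 0 or cid <= 0:
        raise ValueError("mdc and cid must be positive")
    # Express change so that a positive number always means improvement.
    improvement = (baseline - follow_up) if higher_is_worse else (follow_up - baseline)
    magnitude = abs(improvement)
    if magnitude < mdc:
        # Change this small cannot be distinguished from measurement error.
        return "no detectable change (within measurement error)"
    direction = "improvement" if improvement > 0 else "deterioration"
    if magnitude < cid:
        return f"detectable {direction}, below CID"
    return f"clinically important {direction}"


# Hypothetical thresholds for illustration only (not published values).
EXAMPLE_MDC = 10.0
EXAMPLE_CID = 15.0
short_term = interpret_change(60, 48, EXAMPLE_MDC, EXAMPLE_CID)
long_term = interpret_change(60, 40, EXAMPLE_MDC, EXAMPLE_CID)
```

With these placeholder thresholds, a drop from 60 to 48 points would meet a short-term goal framed around the MDC, whereas a drop from 60 to 40 points would also satisfy a longer-term goal framed around the CID.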

When introducing a new impairment or disability measure into your clinical practice, you must ensure that it is used as designed by the developers. Sometimes, because of space limitations in scientific journals, the description of the test methodology has been abbreviated and the detail is insufficient to allow replication. In these cases, the user is obligated to contact the authors of the new methodology to obtain more thorough instructions.

How Do I Incorporate Predictors of Outcome?

When measuring outcome and describing it to others, it is important to know and document important predictors of outcome. Unfortunately, we are at a preliminary stage in our understanding of how impairment translates into disability. Understanding this relationship is important in providing an accurate prognosis for patients, determining who is likely to benefit from treatment, and designing treatment programs that focus on the critical components of the disability and handicap. Certain factors, such as age, job demands, and comorbidity, may affect outcome in a predictable fashion; others, like gender, may have variable effects depending on the pathology or construct.

Severity of injury is an important consideration in evaluating outcome. Therapists need to understand how severity of injury affects prognosis and treatment response. For example, it has been suggested that conservative management is most effective for mild CTS. Our outcome data on distal radius fractures suggest that patients with more severe fractures (i.e., those with more radial shortening) experience more pain and disability in their long-term outcome.74-77 On the other hand, patients with severe limitation at a baseline assessment have the most “room for improvement” and may experience the greatest change in raw scores.

Patient characteristics can also be powerful predictors of outcome. Further research is required to fully understand which patient characteristics affect outcome in upper extremity conditions. When patient characteristics and injury characteristics present at the baseline examination of distal radius fractures were assessed, the most important predictor of the extent of pain and disability at 6 months after fracture was the presence of injury compensation (legal or workers’ compensation).75,77-79 Level of education was also shown to be predictive of outcome. It is important to recognize that these are associations and not necessarily causes. Injury compensation may be related to the level of difficulty of the job or to motivation. Educational level may relate to a number of factors, such as ability to modify job demands, ability to acquire less demanding employment, ability to understand home programs, and compliance.

How Do I Set Up a Process to Measure Outcomes in My Practice?

In studying the implementation of outcome measures into practice, key elements have been determined.31,32,80 These include spending some initial time evaluating why the measures might be useful and how they might be used. Therapists need to determine how “high-tech” or “low-tech” their clinic’s process will be, based on practical issues such as computer accessibility or funding. Hand therapists still predominantly use paper-and-pencil self-report measures in practice, although a customized computerized database system designed for hand therapists will soon be available. Regardless of the system, practical issues about who will administer the measures, and when and how, must be resolved; when several people work together in the clinic, this needs to be a group process. Some time should be spent reviewing research studies to identify different types of instruments. At this point clinicians may benefit from attending workshops about outcome measures. Time also needs to be allocated to finding outcome measures and identifying their correct scoring and standardized administration. It is important to establish a realistic and incremental process for building outcome measure administration into routine practice. Making forms easily accessible and staff responsibilities clear and agreed upon is essential. Frequently, a trial of two different outcome measures can help therapists evaluate which works better for their practice. Once familiar with the basics of how to administer outcome scales, therapists should deepen their knowledge about how to interpret scores, use them to set goals, and write more definitive medical records.

Summary

Hand therapists assess their patients to determine how pathology of the upper limb has affected impairment, activity, and participation. They formulate treatment plans to mitigate related problems and assess the treatment’s effectiveness by using standardized outcome evaluations. The foundations of hand therapy rest on standardized outcome measures used to improve the ability to diagnose impairment, assess change in patient status, predict future outcomes, conduct clinical research, and institute continuous quality improvement.

REFERENCES

1. Kirchberger I, Glaessel A, Stucki G, Cieza A. Validation of the comprehensive international classification of functioning, disability and health core set for rheumatoid arthritis: the perspective of physical therapists. Phys Ther. 2007;87:368–384.

2. Stucki G, Cieza A, Melvin J. The International Classification of Functioning, Disability and Health (ICF): a unifying model for the conceptual description of the rehabilitation strategy. J Rehabil Med. 2007;39:279–285.

3. Uhlig T, Lillemo S, Moe RH, et al. Reliability of the ICF Core Set for rheumatoid arthritis. Ann Rheum Dis. 2007;66:1078–1084.

4. Coenen M, Cieza A, Stamm TA, et al. Validation of the International Classification of Functioning, Disability and Health (ICF) Core Set for rheumatoid arthritis from the patient perspective using focus groups. Arthritis Res Ther. 2006;8:R84.

5. Stucki G, Cieza A, Ewert T, et al. Application of the International Classification of Functioning, Disability and Health (ICF) in clinical practice. Disabil Rehabil. 2002;24:281–282.

6. Grimby G, Smedby B. ICF approved as the successor of ICIDH. J Rehabil Med. 2001;33:193–194.

7. World Health Organization. International Classification of Impairments, Disabilities and Handicaps. A Manual of Classification Relating to the Consequences of Disease. Geneva: WHO; 1980.

8. Desrosiers J, Hebert R, Dutil E, Bravo G. Development and reliability of an upper extremity function test for the elderly: the TEMPA. Can J Occup Ther. 1993;60:9–16.

9. Jebsen RH, Taylor N, Trieschmann RB, et al. An objective and standardized test of hand function. Arch Phys Med Rehabil. 1969;50:311–319.

10. Tiffin J, Asher EJ. The Purdue Pegboard: norms and studies of reliability and validity. J Appl Psychol. 1948;32:234–247.

11. Jurgenson CE. Extension of the Minnesota Rate of Manipulation Test. J Appl Psychol. 1943;27:164–169.

12. MacDermid JC, Mule M. Concurrent validity of the NK hand dexterity test. Physiother Res Int. 2001;6:83–93.

13. Turgeon TR, MacDermid JC, Roth JH. Reliability of the NK dexterity board. J Hand Ther. 1999;12:7–15.

14. Dixon D, Johnston M, McQueen M, Court-Brown C. The Disabilities of the Arm, Shoulder and Hand Questionnaire (DASH) can measure the impairment, activity limitations and participation restriction constructs from the International Classification of Functioning, Disability and Health (ICF). BMC Musculoskelet Disord. 2008;9:114.

15. Silva DA, Ferreira SR, Cotta MM, et al. Linking the Disabilities of Arm, Shoulder, and Hand to the International Classification of Functioning, Disability, and Health. J Hand Ther. 2007;20:336–343.

16. Kjeken I, Dagfinrud H, Slatkowsky-Christensen B, et al. Activity limitations and participation restrictions in women with hand osteoarthritis: patients’ descriptions and associations between dimensions of functioning. Ann Rheum Dis. 2005;64:1633–1638.

17. Pap G, Angst F, Herren D, et al. Evaluation of wrist and hand handicap and postoperative outcome in rheumatoid arthritis. Hand Clin. 2003;19:471–481.

18. Harris JE, MacDermid JC, Roth J. The International Classification of Functioning as an explanatory model of health after distal radius fracture: a cohort study. Health Qual Life Outcomes. 2005;3:73.

19. Kirshner B, Guyatt G. A methodological framework for assessing health indices. J Chronic Dis. 1985;38:27–36.

20. American Society of Hand Therapists. Clinical Assessment Recommendations. 2nd ed. New Jersey: Mt. Laurel; 1992.

21. MacDermid JC. Measurement of health outcomes following tendon and nerve repair. J Hand Ther. 2005;18:297–312.

22. Katz JN, Stirrat CR. A self-administered hand diagram for the diagnosis of carpal tunnel syndrome. J Hand Surg Am. 1990;15:360–363.

23. Graham B, Regehr G, Naglie G, Wright JG. Development and validation of diagnostic criteria for carpal tunnel syndrome. J Hand Surg Am. 2006;31:919–924.

24. Boyd KU, Gan BS, Ross DC, et al. Outcomes in carpal tunnel syndrome: symptom severity, conservative management and progression to surgery. Clin Invest Med. 2005;28:254–260.

25. MacDermid JC, Richards RS, Roth JH, McMurtry R. Predictors of time lost from work following a distal radius fracture. J Occup Rehabil. 2007;17:47–62.

26. MacDermid JC, Ramos J, Drosdowech D, et al. The impact of rotator cuff pathology on isometric and isokinetic strength, function, and quality of life. J Shoulder Elbow Surg. 2004;13:593–598.

27. Traynor R, MacDermid JC. Immersion in Cold-Water Evaluation (ICE) and self-reported cold intolerance are reliable but unrelated measures. Hand (N Y). 2008;3:212–219.

28. Armstrong AD, MacDermid JC, Chinchalkar S, et al. Reliability of range-of-motion measurement in the elbow and forearm. J Shoulder Elbow Surg. 1998;7:573–580.

29. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420–428.

30. Fleiss JL. Reliability of Measurement. In: The Design and Analysis of Clinical Experiments. New York: Wiley; 1986:1–32.

31. MacDermid JC, Stratford P. Applying evidence on outcome measures to hand therapy practice. J Hand Ther. 2004;17:165–173.

32. MacDermid JC, Grewal R, Macintyre NJ. Using an evidence-based approach to measure outcomes in clinical practice. Hand Clin. 2009;25:97–111.

33. Cole B, Finch E, Gowland C, Mayo N. Physical Rehabilitation Outcome Measures. Toronto: Canadian Physiotherapy Association; 1994:1–217.

34. Levine DW, Simmons SP, Koris MJ, et al. A self-administered questionnaire for assessment of severity of symptoms and functional status in carpal tunnel syndrome. J Bone Joint Surg Am. 1993;75A:1585–1592.

35. American Physical Therapy Association’s Task Force on Standards for Measurements in Physical Therapy. Standards for tests and measurements in physical therapy. Phys Ther. 1991;71:589–622.

36. Johnston MV, Keith RA, Hinderer SR. Measurement standards for interdisciplinary medical rehabilitation. Arch Phys Med Rehabil. 1992;73:S3–S23.

37. Upper Extremity Collaborative Group. Measuring disability and symptoms of the upper limb: a validation study of the DASH questionnaire. Arth Rheum. 1996;39:S112.

38. MacDermid JC. Development of a scale for patient rating of wrist pain and disability. J Hand Ther. 1996;9:178–183.

39. Ware JE, Kosinski M, Keller SD. SF-36 Physical and Mental Health Summary Scales: A User’s Manual. Boston: The Health Institute, New England Medical Center; 1994.

40. Ware JE, Snow KK, Kosinski M, Gandek B. SF-36 Health Survey Manual and Interpretation Guide. Boston: New England Medical Center, The Health Institute; 1993.

41. Chesworth BM, MacDermid JC, Roth JH, Patterson SD. Movement diagram and “end-feel” reliability when measuring passive lateral rotation of the shoulder in patients with shoulder pathology. Phys Ther. 1998;78:593–601.

42. Pietrobon R, Coeytaux RR, Carey TS, et al. Standard scales for measurement of functional outcome for cervical pain or dysfunction: a systematic review. Spine. 2002;27:515–522.

43. Fayad F, Mace Y, Lefevre-Colau MM. [Shoulder disability questionnaires: a systematic review]. Ann Readapt Med Phys. 2005;48:298–306.

44. Bot SD, Terwee CB, van der Windt DA, et al. Clinimetric evaluation of shoulder disability questionnaires: a systematic review of the literature. Ann Rheum Dis. 2004;63:335–341.

45. van de Ven-Stevens LA, Munneke M, Terwee CB, et al. Clinimetric properties of instruments to assess activities in patients with hand injury: a systematic review of the literature. Arch Phys Med Rehabil. 2009;90:151–169.

46. Dziedzic KS, Thomas E, Hay EM. A systematic search and critical review of measures of disability for use in a population survey of hand osteoarthritis (OA). Osteoarth Cartilage. 2005;13:1–12.

47. MacDermid JC. Critical Appraisal of Study Quality for Psychometric Articles. Evaluation form. In: Evidence-Based Rehabilitation: A Guide to Practice. Thorofare, New Jersey: Slack, Inc.; 2008:387–388.

48. MacDermid JC. Critical Appraisal of Study Quality for Psychometric Articles. Interpretation guide. In: Evidence-Based Rehabilitation: A Guide to Practice. Thorofare, New Jersey: Slack, Inc.; 2008:389–392.

49. Beaton DE, Katz JN, Fossel AH, et al. Measuring the whole or the parts? Validity, reliability, and responsiveness of the Disabilities of the Arm, Shoulder and Hand outcome measure in different regions of the upper extremity. J Hand Ther. 2001;14:128–146.

50. Solway S, Beaton DE, McConnell S, Bombardier C. The DASH Outcome Measure User’s Manual. 2nd ed. Toronto, Ont: Institute for Work and Health; 2002.

51. MacDermid JC, Tottenham V. Responsiveness of the disability of the arm, shoulder, and hand (DASH) and patient-rated wrist/hand evaluation (PRWHE) in evaluating change after hand therapy. J Hand Ther. 2004;17:18–23.

52. MacDermid JC, Turgeon T, Richards RS, et al. Patient rating of wrist pain and disability: a reliable and valid measurement tool. J Orthop Trauma. 1998;12:577–586.

53. MacDermid JC, Richards RS, Donner A, et al. Responsiveness of the Short Form-36, disability of the arm, shoulder, and hand questionnaire, patient-rated wrist evaluation, and physical impairment measurements in evaluating recovery after a distal radius fracture. J Hand Surg Am. 2000;25:330–340.

54. Schmitt JS, Di Fabio RP. Reliable change and minimum important difference (MID) proportions facilitated group responsiveness comparisons using individual threshold criteria. J Clin Epidemiol. 2004;57:1008–1018.

55. Dowrick AS, Gabbe BJ, Williamson OD, Cameron PA. Outcome instruments for the assessment of the upper extremity following trauma: a review. Injury. 2005;36:468–476.

56. Beaton DE, Wright JG, Katz JN. Development of the QuickDASH: comparison of three item-reduction approaches. J Bone Joint Surg Am. 2005;87:1038–1046.

57. Mintken PE, Glynn P, Cleland JA. Psychometric properties of the shortened Disabilities of the Arm, Shoulder, and Hand Questionnaire (QuickDASH) and Numeric Pain Rating Scale in patients with shoulder pain. J Shoulder Elbow Surg. 2009;18:920–926.

58. Fayad F, Lefevre-Colau MM, Gautheron V, et al. Reliability, validity and responsiveness of the French version of the questionnaire Quick Disability of the Arm, Shoulder and Hand in shoulder disorders. Man Ther. 2009;14:206–212.

59. Wu A, Edgar DW, Wood FM. The QuickDASH is an appropriate tool for measuring the quality of recovery after upper limb burn injury. Burns. 2007;33:843–849.

60. Matheson LN, Melhorn JM, Mayer TG, et al. Reliability of a visual analog version of the QuickDASH. J Bone Joint Surg Am. 2006;88:1782–1787.

61. Imaeda T, Toh S, Wada T, et al. Validation of the Japanese Society for Surgery of the Hand Version of the Quick Disability of the Arm, Shoulder, and Hand (QuickDASH-JSSH) questionnaire. J Orthop Sci. 2006;11:248–253.

62. Gummesson C, Ward MM, Atroshi I. The shortened Disabilities of the Arm, Shoulder and Hand Questionnaire (QuickDASH): validity and reliability based on responses within the full-length DASH. BMC Musculoskelet Disord. 2006;7:44.

63. Westaway MD, Stratford PW, Binkley JM. The Patient-Specific Functional Scale: validation of its use in persons with neck dysfunction. J Orthop Sports Phys Ther. 1998;27:331–338.

64. Sterling M, Brentnall D. Patient Specific Functional Scale. Aust J Physiother. 2007;53:65.

65. Law M, Baptiste S, McColl M, et al. The Canadian Occupational Performance Measure: an outcome measure for occupational therapy. Can J Occup Ther. 1990;57:82–87.

66. Law M, Polatajko H, Pollock N, et al. Pilot testing of the Canadian Occupational Performance Measure: clinical and measurement issues. Can J Occup Ther. 1994;61:191–197.

67. Law M, Baptiste S, Carswell A, et al. Canadian Occupational Performance Measure. 3rd ed. Ottawa, Ontario: CAOT Publications ACE; 1998.

68. Carswell A, McColl MA, Baptiste S, et al. The Canadian Occupational Performance Measure: a research and clinical literature review. Can J Occup Ther. 2004;71:210–222.

69. Katz JN, Gelberman RH, Wright EA, et al. Responsiveness of self-reported and objective measures of disease severity in carpal tunnel syndrome. Med Care. 1994;32:1127–1133.

70. Morrey BF, An KN, Chao EYS. Functional evaluation of the elbow. In: Morrey BF, ed. The Elbow and its Disorders. 2nd ed. Philadelphia: WB Saunders; 1993:86–97.

71. Constant CR, Murley AHG. A clinical method of functional assessment of the shoulder. Clin Orthop Relat Res. 1987;214:160–164.

72. Turchin DC, Beaton DE, Richards RR. Validity of observer-based aggregate scoring systems as descriptors of elbow pain, function, and disability. J Bone Joint Surg Am. 1998;80:154–162.

73. Roy JS, MacDermid JC, Woodhouse LJ. A systematic review of the psychometric properties of the Constant-Murley score. J Shoulder Elbow Surg. 2010;19(1):157–164.

74. Grewal R, MacDermid JC. The risk of adverse outcomes in extra-articular distal radius fractures is increased with malalignment in patients of all ages but mitigated in older patients. J Hand Surg Am. 2007;32:962–970.

75. MacDermid JC, Roth JH, Richards RS. Pain and disability reported in the year following a distal radius fracture: A cohort study. BMC Musculoskelet Disord. 2003;4:24.

76. MacDermid JC, Richards RS, Roth JH. Distal radius fracture: a prospective outcome study of 275 patients. J Hand Ther. 2001;14:154–169.

77. MacDermid JC, Roth JH, McMurtry R. Predictors of time lost from work following a distal radius fracture. J Occup Rehabil. 2007;17:47–62.

78. Grewal R, MacDermid JC, Pope J, Chesworth BM. Baseline predictors of pain and disability one year following extra-articular distal radius fractures. Hand (N Y). 2007;2:104–111.

79. MacDermid JC, Donner A, Richards RS, Roth JH. Patient versus injury factors as predictors of pain and disability six months after a distal radius fracture. J Clin Epidemiol. 2002;55:849–854.

80. MacDermid JC, Solomon P, Law M, et al. Defining the effect and mediators of two knowledge translation strategies designed to alter knowledge, intent and clinical utilization of rehabilitation outcome measures: A study protocol [NCT00298727]. Implement Sci. 2006;1:14.

81. Carlsson AM. Assessment of chronic pain part 1: aspects of reliability and validity of the visual analogue scale. Pain. 1983;16:87–101.

82. Dixon JS, Bird HA. Reproducibility along a 10cm vertical visual analogue scale. Ann Rheum Dis. 1981;40:87–89.

83. Downie WW, Leatham PA, Rhind VM. Studies with pain rating scales. Ann Rheum Dis. 1978;37:378–381.

84. Jensen MP, Karoly P, Braver S. The measurement of clinical pain intensity: a comparison of six methods. Pain. 1986;27:117–126.

85. Langley GB, Sheppeard H. The visual analogue scale: Its use in pain measurement. Rheumatol Int. 1985;4:145–148.

86. Wilkie D, Lovejoy N, Dodd M, Tesler M. Cancer pain intensity measurement: concurrent validity of three tools--finger dynamometer, pain intensity number scale, visual analogue scale. Hosp J. 1990;6:1–13.

87. Scott J, Huskisson EC. Vertical or horizontal visual analogue scales. Ann Rheum Dis. 1979;38:560.

88. Bergner M, Bobbitt RA, Kressel S, et al. The Sickness Impact Profile: conceptual formulation and methodology for the development of a health status measure. Int J Health Serv. 1976;6:393–415.

89. Bergner M, Bobbitt RA, Pollard WE, et al. The Sickness Impact Profile: validation of a health status measure. Med Care. 1976;14:57–67.

90. Bergner M, Bobbitt RA, Carter WB, Gilson BS. The Sickness Impact Profile: development and final revision of a health status measure. Med Care. 1981;19:787–805.

91. Heald SL, Riddle DL, Lamb RL. The shoulder pain and disability index: the construct validity and responsiveness of a region-specific disability measure. Phys Ther. 1997;77:1079–1089.

92. Stucki G, Liang MH, Phillips C, Katz JN. The Short Form-36 is preferable to the SIP as a generic health status measure in patients undergoing elective total hip arthroplasty. Arth Care Res. 1995;8:174–181.

93. Beaton DE, Richards RR. Measuring function of the shoulder. A cross-sectional comparison of five questionnaires. J Bone Joint Surg Am. 1996;78:882–890.

94. Ware JE, Kosinski M, Keller SD. SF-12®: How to Score the SF-12® Physical and Mental Health Summary Scales. 2nd ed. Boston: The Health Institute, New England Medical Center; 1995.

95. Engelberg R, Martin DP, Agel J, et al. Musculoskeletal Function Assessment instrument: criterion and construct validity. J Orthop Res. 1996;14:182–192.

96. Engelberg R, Martin DP, Agel J, Swiontkowski MF. Musculoskeletal Function Assessment: reference values for patient and non-patient samples. J Orthop Res. 1999;17:101–109.

97. Martin DP, Engelberg R, Agel J, et al. Development of a musculoskeletal extremity health status instrument: the Musculoskeletal Function Assessment instrument. J Orthop Res. 1996;14:173–181.

98. Martin DP, Engelberg R, Agel J, Swiontkowski MF. Comparison of the Musculoskeletal Function Assessment questionnaire with the Short Form-36, the Western Ontario and McMaster Universities Osteoarthritis Index, and the Sickness Impact Profile health-status measures. J Bone Joint Surg Am. 1997;79:1323–1335.

99. Swiontkowski MF, Engelberg R, Martin DP, Agel J. Short musculoskeletal function assessment questionnaire: validity, reliability, and responsiveness. J Bone Joint Surg Am. 1999;81:1245–1260.

100. Hudak PL, Amadio PC, Bombardier C. Development of an upper extremity outcome measure: the DASH (Disabilities of the Arm, Shoulder and Hand) [corrected]. The Upper Extremity Collaborative Group (UECG). Am J Ind Med. 1996;29:602–608.

101. Jain R, Hudak PL, Bowen CV. Validity of health status measures in patients with ulnar wrist disorders. J Hand Ther. 2001;14:147–153.

102. Pransky G, Feuerstein M, Himmelstein J, et al. Measuring functional outcomes in work-related upper extremity disorders. Development and validation of the Upper Extremity Function Scale. J Occup Environ Med. 1997;39:1195–1202.

103. Roach KE, Budiman-Mak E, Songsiridej N, Lertratanakul Y. Development of a shoulder pain and disability index. Arth Care Res. 1991;4:143–149.

104. Williams JW Jr, Holleman DR Jr, Simel DL. Measuring shoulder function with the Shoulder Pain and Disability Index. J Rheumatol. 1995;22:727–732.

105. Matsen FA III, Lippitt SB, Sidles JA, Harryman DT. Practical Evaluation and Management of the Shoulder. Philadelphia: W.B. Saunders; 1994:1–17.

106. MacDermid JC, Ghobrial M, Quirion KB, et al. Validation of a new test that assesses functional performance of the upper extremity and neck (FIT-HaNSA) in patients with shoulder pathology. BMC Musculoskelet Disord. 2007;8:42.

107. MacDermid JC, Solomon P, Prkachin K. The Shoulder Pain and Disability Index demonstrates factor, construct and longitudinal validity. BMC Musculoskelet Disord. 2006;7:12.

108. Roddey TS, Cook KF, O’Malley KJ, et al. The relationship among strength and mobility measures and self-report outcome scores in persons after rotator cuff repair surgery: impairment measures are not enough. J Shoulder Elbow Surg. 2005;14:95S–98S.

109. Cloke DJ, Lynn SE, Watson H, et al. A comparison of functional, patient-based scores in subacromial impingement. J Shoulder Elbow Surg. 2005;14:380–384.

110. Ostor AJ, Richards CA, Prevost AT, et al. Diagnosis and relation to general health of shoulder disorders presenting to primary care. Rheumatology (Oxford). 2005;44:800–805.

111. Paul A, Lewis M, Shadforth MF, et al. A comparison of four shoulder-specific questionnaires in primary care. Ann Rheum Dis. 2004;63:1293–1299.

112. L’Insalata JC, Warren RF, Cohen SB, et al. A self-administered questionnaire for assessment of symptoms and function of the shoulder. J Bone Joint Surg Am. 1997;79:738–748.

113. Kirkley A, Alvarez C, Griffin S. The development and evaluation of a disease-specific quality-of-life questionnaire for disorders of the rotator cuff: The Western Ontario Rotator Cuff Index. Clin J Sport Med. 2003;13:84–92.

114. Getahun T, MacDermid JC, Patterson SD. Concurrent validity of patient rating scales in assessment of outcome after rotator cuff repair. J Musculoskelet Res. 2000;4:119–127.

115. Kirkley A, Griffin S, McLintock H, Ng L. The development and evaluation of a disease-specific quality of life measurement tool for shoulder instability. The Western Ontario Shoulder Instability Index (WOSI). Am J Sports Med. 1998;26:764–772.

116. Research Committee ASES, Richards RS, An K-N, et al. A standardized method for the assessment of shoulder function. J Shoulder Elbow Surg. 1994;3:347–352.

117. Michener LA, McClure PW, Sennett BJ. American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form, patient self-report section: reliability, validity, and responsiveness. J Shoulder Elbow Surg. 2002;11:587–594.

118. Winters JC, Sobel JS, Groenier MS, et al. A shoulder pain score: a comprehensive questionnaire for assessing pain in patients with shoulder complaints. Scand J Rehabil Med. 1996;28:163–167.

119. van der Windt DA, van der Heijden GJ, de Winter AF, et al. The responsiveness of the Shoulder Disability Questionnaire. Ann Rheum Dis. 1998;57:82–87.

120. Kohn D, Geyer M. The subjective shoulder rating system. Arch Orthop Trauma Surg. 1997;116:324–328.

121. King GJ, Richards RR, Zuckerman JD, et al. A standardized method for assessment of elbow function. Research Committee, American Shoulder and Elbow Surgeons. J Shoulder Elbow Surg. 1999;8:351–354.

122. MacDermid JC. Outcome evaluation in patients with elbow pathology: issues in instrument development and evaluation. J Hand Ther. 2001;14:105–114.

123. MacDermid J. Update: The Patient-rated Forearm Evaluation Questionnaire is now the Patient-rated Tennis Elbow Evaluation. J Hand Ther. 2005;18:407–410.

124. Rompe JD, Overend TJ, MacDermid JC. Validation of the Patient-rated Tennis Elbow Evaluation Questionnaire. J Hand Ther. 2007;20:3–10.

125. Goldhahn J, Angst F, Simmen BR. What counts: outcome assessment after distal radius fractures in aged patients. J Orthop Trauma. 2008;22:S126–S130.

126. Weixin X, Seow C. Chinese version of the Patient Rated Wrist Evaluation (PRWE): cross-cultural adaptation and reliability evaluation. J Hand Ther. 2004;17:84–85.

127. Angst F, John M, Goldhahn J, et al. Comprehensive assessment of clinical outcome and quality of life after resection interposition arthroplasty of the thumb saddle joint. Arthritis Rheum. 2005;53:205–213.

128. Atroshi I, Johnsson R, Sprinchorn A. Self-administered outcome instrument in carpal tunnel syndrome. Reliability, validity and responsiveness evaluated in 102 patients. Acta Orthop Scand. 1998;69:82–88.

129. Atroshi I, Breidenbach WC, McCabe SJ. Assessment of the carpal tunnel outcome instrument in patients with nerve-compression symptoms. J Hand Surg Am. 1997;22:222–227.

130. Amadio PC, Silverstein MD, Ilstrup DM, et al. Outcome assessment for carpal tunnel surgery: the relative responsiveness of generic, arthritis-specific, disease-specific, and physical examination measures. J Hand Surg Am. 1996;21:338–346.

131. Chatterjee JS, Price PE. Comparative responsiveness of the Michigan Hand Outcomes Questionnaire and the Carpal Tunnel Questionnaire after carpal tunnel release. J Hand Surg Am. 2009;34:273–280.

132. Ozyurekoglu T, McCabe SJ, Goldsmith LJ, LaJoie AS. The minimal clinically important difference of the Carpal Tunnel Syndrome Symptom Severity Scale. J Hand Surg Am. 2006;31:733–738.

133. Chung KC, Pillsbury MS, Walters MR, Hayward RA. Reliability and validity testing of the Michigan Hand Outcomes Questionnaire. J Hand Surg Am. 1998;23:575–587.

134. McMillan CR, Binhammer PA. Which outcome measure is the best? Evaluating responsiveness of the Disabilities of the Arm, Shoulder, and Hand Questionnaire, the Michigan Hand Questionnaire and the Patient-Specific Functional Scale following hand and wrist surgery. Hand (N Y). 2009;4:311–318.

135. Chung KC, Hamill JB, Walters MR, Hayward RA. The Michigan Hand Outcomes Questionnaire (MHQ): assessment of responsiveness to clinical change. Ann Plast Surg. 1999;42:619–622.

136. Sambandam SN, Priyanka P, Gul A, Ilango B. Critical analysis of outcome measures used in the assessment of carpal tunnel syndrome. Int Orthop. 2008;32:497–504.

137. Stratford P, Gill C, Westaway MD, Binkley J. Assessing disability and change on individual patients: a report of a patient specific measure. Physiother Can. 1995;47:258–263.

138. Gross DP, Battie MC, Asante AK. The Patient-Specific Functional Scale: validity in workers’ compensation claimants. Arch Phys Med Rehabil. 2008;89:1294–1299.

139. Jolles BM, Buchbinder R, Beaton DE. A study compared nine patient-specific indices for musculoskeletal disorders. J Clin Epidemiol. 2005;58:791–801.