J Manipulative Physiol Ther 2002 (Mar); 25 (3): 141-148
Jennifer E. Bolton and B. Kim Humphreys, DC, PhD
Anglo-European College of Chiropractic,
Bournemouth BH5 2DF, England.
OBJECTIVE: To modify an existing outcome measure (Bournemouth Questionnaire [BQ]) for use in patients with nonspecific neck pain and test its psychometric properties.
DESIGN: Prospective longitudinal study in which the questionnaire was administered on 3 occasions (pretreatment, retest, and posttreatment).
SETTING: Anglo-European College of Chiropractic outpatient clinic and 8 field chiropractic practices.
METHOD: Seven core items relating to the biopsychosocial model of pain were included in the original questionnaire (back BQ). The wording of one of these items (disability in activities of daily living) was modified to include activities likely to be affected by neck pain. Testing of the neck BQ was carried out in 102 patients with nonspecific neck pain.
RESULTS: The instrument demonstrated high internal consistency on 3 administrations (Cronbach's alpha = 0.87, 0.91, 0.92). All 7 items were retained on the basis that they each significantly contributed to the total score (item-corrected total score correlations >0.43) and to the instrument's responsiveness to clinical change (item change-corrected total change score correlations >0.42). The instrument was reliable in test-retest administrations in stable subjects (ICC = 0.65). The instrument demonstrated acceptable construct validity and longitudinal construct validity with established external measures. The treatment effect size of the instrument was found to be high (1.67).
CONCLUSION: The neck BQ covers the salient dimensions of the biopsychosocial model of pain, is quick and easy to complete, and has been shown to be reliable, valid, and responsive to clinically significant change in patients with nonspecific neck pain. Its use as an outcome measure in clinical trials and outcomes research is recommended.
From the Full-Text Article:
It is surprising, given the prevalence of neck pain and its impact on the individual and society, that there are not more instruments that have been developed and tested for use in trials evaluating treatment interventions in this condition. The few that do exist are all geared toward the pain severity and disability dimensions of the condition and as such do not take a wider view of neck pain based on a biopsychosocial model of pain. That neck pain, like back pain, is more likely explained by a biopsychosocial model than a medical one, is now entirely in keeping with our present understanding of musculoskeletal pain disorders and the current moves away from passive treatment to active rehabilitation in the management of nonspecific neck pain. [5, 6, 28]
As a consequence, there is a need for an outcome measure that comprehensively incorporates the salient dimensions of the biopsychosocial pain model for use in neck studies. At the same time, such an outcome measure must be practical for use, not only in the research setting, but also in the busy routine clinic setting if it is to be used to evaluate both the efficacy and the effectiveness of treatment interventions. These same considerations were behind the development and testing of a new, short-form comprehensive outcome measure for use in patients with back pain.
The back BQ contains 7 core items:
(1) pain intensity;
(2) disability in activities of daily living (ADL);
(3) disability in social activities;
(4) anxiety (emotional dimension);
(5) depression (emotional dimension);
(6) fear-avoidance behavior (cognitive dimension); and
(7) pain locus of control (cognitive dimension).
Mindful of the similarities between back and neck pain, and the need for a comprehensive yet short outcome measure for use in neck pain patients based on the biopsychosocial model, this study was formulated to modify the back BQ and then test its psychometric properties in patients with nonspecific neck pain.
Basing neck pain on a biopsychosocial model in the same way as back pain, and given the generic nature of the 7 core items in the back BQ, very little modification was made to the original questionnaire. Only 1 of the 7 items, namely disability in ADL, was changed to exclude those activities likely to be affected by back pain and replace them with activities likely to be affected by neck pain. Because of these small changes, we now advocate the use of a generic BQ that can be used in all painful musculoskeletal complaints, including shoulder and extremity pain. In this generic BQ, the wording of the item on disability in ADL is phrased “How has your complaint interfered with your daily activities (housework, washing, dressing, lifting, reading, driving, climbing stairs, getting in/out of bed/chair, sleeping)?” This encompasses those activities, 1 or more of which is likely to be affected by each of the painful nonspecific musculoskeletal conditions. Apart from the small change in the wording of the disability in ADL scale of the neck BQ, no change was made to the response scaling for the questionnaire items. The 11-point numerical rating scale (NRS) has previously been shown to be a responsive scale, as well as one that is relatively easy for patients to complete. [30, 31]
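As an illustration of how the 7 items combine into a single score, the following minimal Python sketch sums the item ratings. It assumes that each item is rated from 0 to 10 on the 11-point scale and that the total is the unweighted sum of the 7 items (consistent with a maximum of 70); the function name `bq_total_score` is hypothetical and is not part of the questionnaire.

```python
from typing import Sequence

BQ_ITEM_COUNT = 7          # seven core items, as described in the text
NRS_MIN, NRS_MAX = 0, 10   # assumed anchors of the 11-point rating scale


def bq_total_score(item_scores: Sequence[int]) -> int:
    """Return the total score, assuming an unweighted sum of 7 items (0-70)."""
    if len(item_scores) != BQ_ITEM_COUNT:
        raise ValueError(
            f"expected {BQ_ITEM_COUNT} item scores, got {len(item_scores)}"
        )
    for score in item_scores:
        # Reject ratings that fall outside the assumed 0-10 range.
        if not NRS_MIN <= score <= NRS_MAX:
            raise ValueError(f"item score {score} is outside {NRS_MIN}-{NRS_MAX}")
    return sum(item_scores)
```

A questionnaire with a missing or out-of-range rating raises an error here rather than being silently scored; how missing items should be handled in practice is not addressed in the text.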
To test for redundancy of items, the item-corrected total correlations and item change score-corrected total change score correlations were determined. The results of this study show that, in both cases, each of the 7 items contributes significantly to the total score of the BQ and that, as such, there are no redundant items in the neck BQ. Moreover, the results of the internal consistency tests (Cronbach's alpha) showed that the neck BQ is a homogeneous instrument tapping different aspects of the same attribute (ie, the neck pain experience). This is further evidence that neck pain is more likely explained by a biopsychosocial model than a medical one.
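The two statistics discussed above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas, not the analysis code used in the study; the function names and the input layout (one list of item scores per respondent) are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(rows):
    """Correlate each item with the total of the remaining items."""
    k = len(rows[0])
    result = []
    for i in range(k):
        item = [r[i] for r in rows]
        # "Corrected" total: the item itself is removed from the sum.
        rest = [sum(r) - r[i] for r in rows]
        result.append(pearson(item, rest))
    return result
```

Passing rows of change scores (pretreatment minus posttreatment, item by item) to `corrected_item_total` would give the item change score-corrected total change score correlations in the same way.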
The test-retest results showed that the neck BQ is a reliable instrument, and that in stable subjects there is moderate agreement in consecutive administrations of the questionnaire. From these data, it has been possible to show that a change score in excess of 12 points (out of a total of 70), or approximately 17%, is indicative of a real change (“signal”) over and above the variability (“noise”) of the measuring instrument. This is an important point that raises the matter of clinical change versus statistical change, and the fact that the two are not necessarily synonymous. 
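The threshold described above can be expressed as a simple check. The sketch below takes the 12-point figure and the 70-point maximum from the text; the function names are hypothetical, and treating change in either direction (improvement or deterioration) as exceeding the threshold is an assumption of this example.

```python
TOTAL_MAX = 70               # maximum total score, from the text
REAL_CHANGE_THRESHOLD = 12   # points; change must be in excess of this value


def exceeds_measurement_noise(pre_total, post_total,
                              threshold=REAL_CHANGE_THRESHOLD):
    """True if the absolute change in total score is greater than threshold."""
    return abs(pre_total - post_total) > threshold


def change_as_percent_of_scale(points, scale_max=TOTAL_MAX):
    """Express a change in points as a percentage of the full scale."""
    return 100.0 * points / scale_max
```

With these definitions, a change of 12 points corresponds to 12/70, or roughly 17% of the scale, and only changes larger than that are flagged as "signal" rather than "noise".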
Because no “gold standard” exists, testing the validity of instruments such as the BQ is difficult. The best that can be done is to use established measures that purport to measure constructs similar to those of the instrument under test. In the case of the neck BQ, this was even more difficult than usual because of the paucity of established measures specifically designed for use in neck patients. As a result, we chose to use the 2 most frequently used neck disability measures (the NDI and the NFDS), even though they only measure pain and disability. Both the NDI [12, 33] and the NFDS have undergone psychometric testing in neck patients. In addition, we used the generic health status measure, the SF36. This has been validated and widely used in different populations, including patients with back pain. Because the SF36 produces 8 separate scale scores rather than a single index, only the individual scales of the SF36 were used as external criteria for testing.
The total score of the neck BQ correlated significantly with the total scores of the NDI and the NFDS, both in terms of absolute scores (external construct validity) and the change scores over time (external longitudinal construct validity). When testing the external validity of the individual items of the neck BQ, we trawled the established measures and selected those that appeared to most closely match the attribute under test in each item. In one case, item 6 (fear-avoidance behavior), we were unable to find an appropriate scale or measure to use as an external criterion and we therefore used the scores from an individual question of the SF36 (question 8). Apart from item 7 (pain locus of control), there was moderate to strong (and in all cases statistically significant) correlation with the chosen external measures, supporting the external construct validity and external longitudinal construct validity of individual items of the neck BQ. The poor (and statistically nonsignificant) correlation between item 7 and the general health scale of the SF36 was most likely due to the fact that the external scale does not adequately reflect the pain locus of control construct. It was, however, the best fitting scale we could find. Moreover, the correlation between item 3 and the social functioning scale of the SF36 was low and not statistically significant when testing external longitudinal construct validity, but not when testing external construct validity. We are unable, at this time, to offer an explanation for this seemingly spurious finding.
In contrast to reliability and validity, responsiveness (the ability to detect clinically significant change) is an often-neglected psychometric property of a measure. Considering that the ability to detect clinical change is an essential property of an evaluative measure, this is a serious shortcoming. Although there are several ways of estimating the responsiveness of an instrument, arguably the most common approach is determination of the instrument's treatment effect size. As far as we are aware, there are no published data on the effect size of either the NDI or the NFDS. The data from this study demonstrate that the effect size of the neck BQ is large, and considerably greater than that of the NDI and the NFDS. This result has important implications for those conducting clinical trials and outcomes research in neck patients. One of the problems in clinical trials, even multicenter trials, is recruitment of patients. Use of an outcome measure with a large effect size substantially reduces the sample size needed to establish a clinically significant difference as statistically significant. We suggest that further work be done in this area to investigate whether or not the large effect size difference between the BQ and neck disability measures reported in this study is replicated in other patient populations. The data also reveal that the primary reason for this difference in effect size is almost certainly the higher (percentage) values of the pretreatment BQ scores of these patients when compared with those measured with other instruments.
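The link between effect size and sample size can be illustrated with a short Python sketch. Two assumptions are made that are not taken from the text: the effect size is computed as mean change divided by the standard deviation of pretreatment scores (one common formulation; the text does not state which was used), and the sample size comes from the usual normal-approximation formula for comparing two group means, treating the effect size as a between-group standardized difference. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist, mean, stdev


def standardized_effect_size(pre, post):
    """Mean change (pre minus post) divided by the SD of pretreatment scores."""
    changes = [a - b for a, b in zip(pre, post)]
    return mean(changes) / stdev(pre)


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate patients per group to detect standardized difference d.

    Uses n = 2 * ((z_(1-alpha/2) + z_power) / d) ** 2 for a two-sided test,
    rounded up to the next whole patient.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)             # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)
```

Because the required number of patients falls with the square of the effect size, doubling the effect size cuts the approximate sample size to about a quarter, which is the practical point made above about recruitment.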
This study has limitations. The patients recruited to the study were convenience samples from a chiropractic college teaching clinic and chiropractors' field practices. As such, these patients may not be representative of all patients with neck pain who present to chiropractors, nor may they be representative of other ambulatory patients with neck pain. The psychometric properties of the neck BQ should therefore be tested in other populations, including patients who suffer neck pain as a result of a traumatic injury. It is important to remember that an outcome measure validated in one patient group is not necessarily valid in another in which the patient characteristics, particularly levels of disability and chronicity of the complaint, may be different. No attempt was made in this study to distinguish between patients with acute and chronic neck pain. Finally, for the purposes of testing the responsiveness of the instrument, distinction was necessary between patients who had undergone clinically significant change and those who had not. Clinically significant change remains a debatable issue, which in the absence of consensus, we have defined as the self-report of patients' perceptions of improvement in their condition.
As a result of a lack of outcome measures specifically designed and developed for use in neck patients, and in particular ones based on neck pain as an illness, we have developed a short-form, comprehensive neck outcome measure. The neck BQ covers the salient dimensions of the biopsychosocial model of pain, is quick and easy to complete, and has been shown to be reliable, valid and responsive to clinically significant change in nonspecific neck pain patients. We therefore recommend the neck BQ as an outcome measure in clinical trials and in outcomes research for evaluating the efficacy and effectiveness of treatment interventions for nonspecific neck pain.