J Manipulative Physiol Ther 2007 (May); 30 (4): 259–262
Ralph E. Gay, MD, DC, Timothy J. Madson, PT, MS, Kathryn R. Cieslak, PT, OCS
Department of Physical Medicine and Rehabilitation,
Mayo Clinic, Rochester, Minn, USA.
OBJECTIVE: This study compares the sensitivity to change of the Neck Disability Index (NDI) and the Neck Bournemouth Questionnaire (NBQ) in patients with chronic uncomplicated neck pain.
METHODS: This prospective longitudinal study was completed in an outpatient physical therapy clinic. Subjects with uncomplicated neck pain (no concurrent shoulder pain or nerve root symptoms) of more than 3 months' duration participated in a 4-week course of therapy that included moist heat, neck exercises, and either mobilization or massage. Outcome measures included standardized response means (sensitivity to change), Cronbach alpha (internal consistency), and 2-way Spearman correlations between the 2 questionnaires and between a pain Visual Analog Scale and each questionnaire (convergent validity).
RESULTS: Mean (SD) score change of the NDI was 6.22 (5.12), and of the NBQ, 14.00 (11.99). Standardized response means were 1.21 and 1.17, respectively. Both questionnaires were more sensitive to change than the pain Visual Analog Scale (0.68). There was moderate correlation between the change scores of all 3 outcome tools (Spearman 0.46-0.57). The NBQ had higher internal consistency than the NDI.
CONCLUSIONS: The NDI and the NBQ performed comparably in this group of patients with chronic uncomplicated neck pain. Both are sensitive to change and would be efficient outcome tools in studies of chronic neck pain. Both had acceptable internal consistency and are appropriate for use as single-outcome scales.
From the Full-Text Article:
We compared the performance of the NDI and the NBQ in a sample of patients with chronic uncomplicated neck pain who were treated with physical therapy and massage or mobilization. We found the questionnaires to have similar sensitivity to change. Both had reasonable internal consistency based on Cronbach α. The NBQ had good convergent validity with the NDI (which is a more widely used and validated questionnaire). Both questionnaires appeared to have low respondent burden. To our knowledge, the performance of these outcome measures has not been previously studied in a sample with chronic uncomplicated neck pain treated with conservative measures.
Sensitivity to change can be measured by several change coefficients depending on the experimental methods and patient characteristics. The SRM is appropriate to use when the population being examined is reasonably homogeneous, and therefore, all subjects are expected to improve to approximately the same extent. We prospectively chose to combine the two treatment groups to make a single group for this study. Although the pilot study was not powered to determine a difference between the groups and no formal comparison was made, there were no differences in the clinical characteristics of the groups, and the treatments (physical therapy with either massage or mobilization) were expected to have similar ESs based on systematic reviews. [16, 17] Although unlikely, it is possible that these treatments have significantly different ESs, in which case our evaluation of responsiveness with the SRM would be inappropriate.
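As a check on the arithmetic, the SRM is conventionally computed as the mean change score divided by the SD of the change scores. The minimal sketch below applies that definition to the summary values reported in the abstract; the function name is illustrative and is not taken from the study.

```python
def standardized_response_mean(mean_change: float, sd_change: float) -> float:
    """SRM = mean change score / SD of the change scores."""
    if sd_change <= 0:
        raise ValueError("sd_change must be positive")
    return mean_change / sd_change


# Mean (SD) change scores as reported in the abstract.
srm_ndi = standardized_response_mean(6.22, 5.12)
srm_nbq = standardized_response_mean(14.00, 11.99)

print(f"NDI SRM: {srm_ndi:.2f}")  # 1.21
print(f"NBQ SRM: {srm_nbq:.2f}")  # 1.17
```

Both values agree with the SRMs reported in the abstract (1.21 and 1.17).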
Because the NDI and NBQ had similar sensitivity to change (SRM 1.21 and 1.17, respectively), a manual therapy trial in this chronic neck pain population would require a similar sample size for each. However, sensitivity to change does not guarantee that the observed change is clinically meaningful. The “responsiveness” of an outcome measure refers both to its sensitivity to change and to its ability to reflect clinically meaningful change. [18] Meaningful improvement is more complex than relief of pain and is harder to demonstrate than sensitivity. Some investigators have used qualitative approaches to determine what aspects of health outcome are meaningful. [6, 19] We did not attempt to quantify meaningful improvement in this study; we chose simply to measure pain as an admittedly poor surrogate for overall improvement. The lack of strong correlation between pain VAS change and NDI or NBQ change suggests that clinical improvement is more complex than a patient's rating of pain severity.
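To put a rough number on "similar sample size", the sketch below assumes the common approximation that required sample size scales with the inverse square of the standardized effect. That scaling rule and the function name are assumptions for illustration, not part of the study, and a real trial would be powered on a between-group difference rather than on the within-group SRM.

```python
def relative_sample_size(effect_a: float, effect_b: float) -> float:
    """Ratio n_b / n_a, assuming required n is proportional to 1 / effect**2."""
    if effect_a <= 0 or effect_b <= 0:
        raise ValueError("effects must be positive")
    return (effect_a / effect_b) ** 2


# SRMs reported in this study: NDI 1.21, NBQ 1.17.
ratio = relative_sample_size(1.21, 1.17)
print(f"NBQ / NDI sample-size ratio: {ratio:.2f}")
```

Under this assumption the two questionnaires differ by only a few percent in required sample size, which is consistent with the statement above.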
A high degree of internal consistency (homogeneity) is desirable in outcome scales because it indicates that the individual items provide information about the concept being measured. Because the reliability of a questionnaire depends as much on the sample being tested as on the questionnaire, it is important to demonstrate internal consistency in the target population. Both the NDI and NBQ appear to have acceptable internal consistency for clinical research use in the group studied. Streiner suggested that a Cronbach α of .8 to .9 was desirable for scales used in research and noted that a higher value might indicate unnecessary redundancy as opposed to increased internal consistency.
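For readers unfamiliar with the statistic, the following minimal sketch computes Cronbach α from item-level responses using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The response matrix is hypothetical and is not data from this study.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """Cronbach alpha; rows are respondents, columns are questionnaire items."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 4 items (illustration only).
scores = [
    [2, 3, 3, 2],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(scores)
print(f"Cronbach alpha: {alpha:.2f}")  # 0.96
```

The made-up items here track each other very closely, so α comes out above the .8 to .9 range, the situation in which a high value may reflect redundancy rather than better internal consistency.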
We did not attempt to determine a minimal clinically important difference for the NDI or NBQ. A previous study found that a 34% change in the NBQ (raw change score/baseline score × 100) had the best sensitivity and specificity to distinguish meaningful improvement. In addition, NBQ change scores of 12 or more have been considered beyond that expected due to variability of the instrument alone. The mean change score in the current study (14.0) was much lower than in the Bolton study (22.8), with SDs of the baseline scores being similar (11.99 and 13.66, respectively). Only about half of our subjects had NBQ score changes of 12 or more (median, 13). Accordingly, the ES of the NBQ reported by Bolton and Humphreys (1.67) was larger than we observed (1.28). This may be due to the treatments employed or characteristics of the subjects we studied.
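The two thresholds cited above can be illustrated with a short sketch. It assumes that improvement appears as a decrease in NBQ score and that each cutoff is applied as "greater than or equal to"; the function names and the example patient are hypothetical.

```python
def nbq_percent_change(baseline: float, follow_up: float) -> float:
    """Percent change = raw change score / baseline score x 100."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    # Assumption: lower NBQ scores are better, so improvement is positive.
    return (baseline - follow_up) / baseline * 100


def classify_nbq_change(baseline: float, follow_up: float,
                        pct_cutoff: float = 34.0,
                        raw_cutoff: float = 12.0) -> dict:
    """Compare one patient's change against the two cited thresholds."""
    raw_change = baseline - follow_up
    pct = nbq_percent_change(baseline, follow_up)
    return {
        "raw_change": raw_change,
        "percent_change": pct,
        "meets_percent_cutoff": pct >= pct_cutoff,
        "beyond_instrument_variability": raw_change >= raw_cutoff,
    }


# Hypothetical patient: NBQ 35 at baseline, 21 after 4 weeks.
result = classify_nbq_change(35, 21)
print(result)
```

For this hypothetical patient the raw change is 14 points and the percent change is 40%, so both thresholds are met.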
There are several weaknesses in this study. First, the sample size is smaller than we would have liked. We chose to exclude patients with conditions that could mimic or complicate neck pain so that the ES would reflect treatment of neck pain only. Yet, our sample size is not unlike that of similar studies. [22, 23] The pretreatment scores were well distributed, and variances were similar to those reported by other investigators using the NDI and NBQ. Second, we did not require a minimum pain level for study entry. Our approach was pragmatic, recruiting patients who had been referred by their physician for treatment. The absence of a floor or ceiling effect in our data suggests that our approach was appropriate. Third, we did not randomize the order in which the questionnaires were completed. This may have introduced an order effect. Lastly, the follow-up was only 4 weeks. We cannot comment on the stability of questionnaire properties over a longer time.
We found the NBQ and NDI to have similar ESs. Consequently, each would require about the same number of subjects if used as the primary outcome measure in a trial of manual therapy for chronic uncomplicated neck pain. Both were more sensitive to change than the pain VAS. The NBQ had good convergent validity with the NDI, with strong correlation between them in regard to pretreatment and posttreatment scores. Although the NBQ had slightly higher internal consistency than the NDI, their overall performance was similar in this sample.