Health status
The categories of health measured in
the SF-36 are Physical Functioning (PF), Social Functioning (SF), Role-limitation Physical
(RP), Role-limitation Emotional (RE), Bodily Pain (BP), Mental Health (MH), Vitality (V)
and General Health (GH). Table 2 shows the SF-36 average scores for this group of patients
and compares them with samples from a recent UK population survey (Brazier et al. 1992): a
group of patients who had consulted their GP in the previous 2 weeks, a group who had not
and a group who had been diagnosed by their GP as having one or more chronic physical
problems. Scores range from 0 to 100 on each sub-scale, lower scores indicating poorer
health. It is evident that the sample visiting an acupuncturist
reports poorer health than those who have recently visited a GP (and those who have not).
The sample is closer to the patients who were diagnosed by their GP as having a chronic
physical problem, though scoring lower on six of the eight sub-scales.

Table 2
| SF-36 subscale | This study | GP visit | Not GP | Chronic |
| Physical Functioning | 67 | 81 | 88 | 66 |
| Social Functioning | 67 | 76 | 89 | 74 |
| Role limitation (Physical) | 48 | 67 | 86 | 58 |
| Role limitation (Emotional) | 63 | 73 | 84 | 74 |
| Bodily Pain | 53 | 68 | 82 | 59 |
| Mental Health | 65 | 66 | 74 | 69 |
| Vitality | 44 | 52 | 63 | 50 |
| General Health | 62 | 63 | 73 | 53 |
| (n) | (58) | (290) | (1208) | (77) |
Population data from Brazier et al. (1992)

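As a rough check on the comparison drawn from Table 2, the sketch below counts, for each comparison group, the sub-scales on which this study's sample scores lower (poorer health). The figures are those of Table 2. Assigning the two "Role limitation" rows to Physical and then Emotional follows the order in which the sub-scales are listed in the text and is an assumption, as are all the names used in the code.

```python
# SF-36 mean scores from Table 2 (0-100; lower = poorer health).
# Column order: this study, GP visit, not GP, chronic.
# Assumption: the two "Role limitation" rows are Physical, then Emotional.
TABLE_2 = {
    "Physical Functioning": (67, 81, 88, 66),
    "Social Functioning": (67, 76, 89, 74),
    "Role limitation (Physical)": (48, 67, 86, 58),
    "Role limitation (Emotional)": (63, 73, 84, 74),
    "Bodily Pain": (53, 68, 82, 59),
    "Mental Health": (65, 66, 74, 69),
    "Vitality": (44, 52, 63, 50),
    "General Health": (62, 63, 73, 53),
}
COMPARISON_GROUPS = ("GP visit", "Not GP", "Chronic")


def count_lower_subscales(table):
    """For each comparison group, count sub-scales where this study scores lower."""
    counts = {}
    for col, group in enumerate(COMPARISON_GROUPS, start=1):
        counts[group] = sum(1 for row in table.values() if row[0] < row[col])
    return counts


lower_counts = count_lower_subscales(TABLE_2)
for group, count in lower_counts.items():
    print(f"{group}: lower on {count} of {len(TABLE_2)} sub-scales")
```

This compares group means only. It takes no account of sample size or variability, so it describes the pattern rather than testing it.
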
Using the International Classification of Primary Care (ICPC; Lamberts and Woods
1987), the primary reason for the encounter in nearly half the patients was
musculo-skeletal (45%), while the remainder of the presenting conditions (55%) were spread
across the following: neurological, skin, psychological, female-genital, male-genital,
women's, urological, ears, eyes and digestive.
The majority of patients (34 out of 58, i.e. 59%) had
had their primary condition for more than two years prior to consulting an acupuncturist.
However, patients with musculo-skeletal conditions tended to consult sooner than patients
with other conditions (Table 3).
| Duration | Musculo-skeletal | Other | Total |
| <1 month | 5 | 2 | 7 |

Outcome measures
The two patient measures that were
used to assess outcome were repeated at the beginning of the first (T1), fourth (T2),
seventh (T3) and tenth (T4) sessions. These measures were used to compare changes in
health status over time. Not all patients completed ten treatments (either they were not
necessary or patients terminated treatment for other reasons). Therefore the results were
analysed as repeated measures for the sample who completed all ten treatments (and for
whom data at four time periods is available: n=18) and for patients completing at least
seven treatments (three time periods: n=27) and four treatments (two time periods: n=51).

The results of the patients' visual assessments are
presented in Figure 1. Higher scores indicate worse health. Each data line represents the
scores repeated over time for a group of patients who provided data at two (n=50), three
(n=26) or four (n=18) time periods respectively. It is apparent that patients report an
improvement between the first and second time periods and between the second and third,
but then the graph levels off. The trend to improve over time is highly significant
statistically for each of the three data lines (ANOVA: to T4, F=10.04, p<0.001; to T3,
F=27.94, p<0.001; to T2, F=55.84, p<0.001).

The results of the patients' verbal assessments are
presented in Figure 2. The results are very similar to those indicated by the visual
assessments, indicating congruence between the two measures. Again, the scores are very
similar whether analysing data from patients who completed up to T4 (n=18) or up to at
least T3 (n=27) or up to at least T2 (n=51). The trend to improve over time is highly
significant statistically for each of the three data lines (ANOVA: to T4, F=8.87,
p<0.001; to T3, F=22.48, p<0.001; to T2, F=41.07, p<0.001).

Comparing Figures 1 and 2, it should be noted that
because the visual scale ranges from 1 to 7 (7 points) while the verbal ranges from 0 to 4
(5 points), the visual scale offers the patient a wider range of options and may therefore
be more discriminatory between different levels of low morbidity. However, the verbal
assessment score does indicate more clearly the differences in scores between each of the
three graphs. The observed steeper graph for the sample that includes patients who
discontinue treatment after T2 (the fourth treatment) indicates that patients who dropped
out prematurely started off with poorer health and terminated with better health than
those who continued. This suggests that patients were discontinuing treatment because of
improvement in their health, rather than because of no change or deterioration.

The visual and verbal assessments gave very similar results. The correlations between
visual and verbal scores are highly significant statistically: at the first session (T1)
the correlation was 0.62 (n=57; p<0.001), at the fourth session (T2) it was 0.82 (n=50;
p<0.001), at the seventh session (T3) it was 0.74 (n=26; p<0.001) and at the tenth
session (T4) it was 0.85 (n=17; p<0.001).
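
For readers who want to see how such a correlation is obtained, the following is a minimal sketch of the Pearson product moment correlation in plain Python, together with the usual t statistic for testing whether a correlation differs from zero. The paired scores are hypothetical and are not data from the study; the function names are illustrative. Only the r and n values in the final loop are taken from the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing whether r differs from 0."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical paired scores for six patients (not data from the study).
visual = [6, 5, 5, 3, 2, 4]  # visual scale, 1-7
verbal = [4, 3, 2, 1, 0, 2]  # verbal scale, 0-4
print(f"r = {pearson_r(visual, verbal):.2f}")

# t statistics implied by the correlations and pair counts reported in the text.
reported = [("T1", 0.62, 57), ("T2", 0.82, 50), ("T3", 0.74, 26), ("T4", 0.85, 17)]
for label, r, n in reported:
    print(f"{label}: r = {r}, n = {n}, t = {t_from_r(r, n):.2f}")
```

Converting t to a p value requires the t distribution, which the Python standard library does not provide; a statistics package would normally be used for that step.
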

Figure 1
Visual Assessment Scores at each time period (possible range 1-7)

Figure 2
Verbal Assessment Scores at each time period (possible range 1-4)

Note that a correlation of 1.0 would mean that the
results were identical, and 0.0 would mean no correlation. The statistical test is the
Pearson product moment correlation; n is the number of pairs of patient scores
contributing to the correlation; p is the statistical probability that the correlation
could be as a result of chance rather than a genuine association between the two measures
(p<0.001 is regarded as highly significant, indicating that there is a probability of
less than 1 in 1000 that the result could arise by chance; p<0.05 is regarded as
significant, indicating that there is a probability of less than 1 in 20 that the result
could arise by chance).

Severity of condition
The results are reported below
comparing outcomes with severity of condition for the patients completing all sessions.
However, the data has been analysed also when patients who drop out earlier are included.
Where the pattern differs when these patients are included, these results are also
reported.

The SF-36 scores measured at T1 were used to define a
baseline measure of severity. Since the SF-36 provides 8 sub-scale scores as reported in
Table 2, and no single composite measure exists, two were selected as potential baseline
measures: General Health (GH) and Bodily Pain (BP). Each of these has face validity as a
measure of initial condition severity and each provided a spread of scores suitable for
dividing into two sub-samples. Each potential baseline measure was used to divide the
patient sample into two approximately equal sized groups, representing more severe
and less severe initial conditions.
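
One simple way to form two groups of roughly equal size is a split at the median baseline score. The sketch below assumes that approach; the text does not state the exact cut-off rule that was used, and the patient identifiers and scores are hypothetical.

```python
from statistics import median


def split_by_severity(baseline_scores):
    """Split patients at the median baseline score.

    SF-36 scores run from 0 to 100 with lower scores meaning poorer health,
    so patients below the median form the "more severe" group. Patients
    scoring exactly at the median are placed in the "less severe" group.
    """
    cut = median(baseline_scores.values())
    more_severe = sorted(p for p, s in baseline_scores.items() if s < cut)
    less_severe = sorted(p for p, s in baseline_scores.items() if s >= cut)
    return more_severe, less_severe


# Hypothetical Bodily Pain scores at T1 (not data from the study).
bodily_pain_t1 = {"p1": 22, "p2": 41, "p3": 52, "p4": 62, "p5": 74, "p6": 84}
more_severe, less_severe = split_by_severity(bodily_pain_t1)
print("more severe:", more_severe)
print("less severe:", less_severe)
```

Because tied scores at the median all fall on one side, the two groups are only approximately equal in size when several patients share the median score.
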

Figure 3a
Visual assessment scores at each time period.

Figure 3b
Visual assessment scores at each time period.

A multivariate analysis of variance was carried out
with the visual assessment score repeated over the four time periods as one variable and
the two level severity measure as the other variable. The analysis was performed twice,
using each of the potential baseline severity measures. The mean scores for the analyses
are shown in Figures 3a and 3b. The analysis is shown for the sample that completed all
sessions and therefore provided data at all four time periods. However, the analysis was
carried out also for patients who completed only three time periods. The results are not
illustrated in Figures 3a and 3b because the mean scores were almost identical and
therefore the graphs would be superimposed on those shown.

Figure 3a shows that the bodily pain scale of the SF-36
does discriminate the visual assessment scores at T1 and that the group with the more
severe initial condition makes a greater improvement over time (up to T3, the seventh
session). This is confirmed by the statistical analysis (MANOVA) with a highly significant
improvement over time for this group (F=12.76; df=3,45; p<0.001) and a not quite
significant improvement for the less severe group (F=2.74; df=3,45; p=0.054). This
difference between the two severity groups is also revealed statistically by a severity
main effect (F=5.42; df=1,15; p=0.03) and by a severity by time interaction effect
(F=3.27; df=3,45; p=0.03).

Figure 3b, using the general health sub-scale of the
SF-36 as a baseline measure of severity of the initial condition, shows a very similar
pattern, though the discrimination between the two groups is not quite as great. This is
confirmed by the statistical analysis (MANOVA) with a highly significant improvement over
time for the more severe group (F=9.72; df=3,45; p<0.001) and a smaller but also
significant improvement for the less severe group (F=3.90; df=3,45; p=0.014). Because the
difference between the two severity groups is smaller, neither the severity main effect
(F=3.78; df=1,16; p=0.07) nor the severity by time interaction effect (F=2.10; df=3,45;
p=0.11) is statistically significant. However, when all patients
who completed up to the T3 period are included, the severity main effect (F=5.61; df=1,24;
p=0.03) and the severity by time interaction effect (F=3.54; df=2,48; p=0.04) are
statistically significant.

These findings suggest that treatment results in a
greater improvement for patients with high initial bodily pain than for patients
presenting with poorer general health, perhaps adding support to the popularly held view
that acupuncture is particularly useful for the treatment of bodily pain.

Figure 4
Visual assessment scores at each time period related to duration of condition.

Figure 5
Visual assessment scores at each time period related to type of condition.

Comparing outcome with duration
Figure 4 shows the graphs of response
following treatment for two sub-groups of patients defined by the duration of their
condition prior to the first consultation (split into two groups divided by the 2 year
boundary). Again, although the data is illustrated for patients who completed the four
assessment periods, the pattern is similar for those who discontinued earlier.

Both groups show a statistically significant
improvement over time (F=10.04; df=3,48; p<0.001). The graphs suggest that patients who
had had their condition for longer had poorer health at all four assessment periods.
Moreover, although both groups made similar improvements between T1 and T2, the longer
duration group showed very little further improvement after the fourth session (T2), while
the shorter duration group continued to improve up to, but not beyond, the seventh session
(T3). However, it must be pointed out that these apparent
differences between the two duration groups are not statistically significant. There is no
significant duration main effect (F=1.61; df=1,16; p=0.22) and there is no significant
duration by time interaction effect (F=0.64; df=3,48; p=0.59).

Comparing outcome with condition
The data comparing outcome with
condition is sufficient only to justify a broad comparison of musculo-skeletal with non
musculo-skeletal conditions. The comparison of progress over the four time periods is
shown in Fig 5. It appears that the musculo-skeletal conditions were slightly less severe
at the outset and responded to treatment more steadily over the whole course of ten
treatments (for patients who continued for ten sessions, though the same pattern exists
when the data is analysed for all patients who completed up to T3). However, there are no
statistically significant differences between the two condition groups, only the main time
effect. Therefore, one should be careful not to read too much into any apparent
differences in the graphs of Fig 5. Moreover, if there were any difference between groups,
there would be a possible confounding, since the patients with musculo-skeletal
conditions tended to present earlier (Table 3) and shorter duration conditions appear to
have responded more to treatment (Fig 4).

Comparing outcome with age and gender
The data were analysed in a similar
way to assess whether grouping of patients by age or gender resulted in different outcomes
at the four time periods, as measured by the visual assessment score. No differences were
found. For age there is no main effect (F=0.26; df=2,15; p=0.78) nor age by time
interaction effect (F=0.55; df=6,45; p=0.77). For gender there is no main effect (F=0.57;
df=1,16; p=0.46) nor gender by time interaction effect (F=0.63; df=3,48; p=0.60). Thus
neither the age nor gender of the patient appears to be related to the initial ill health
or the rate of recovery.
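
The degrees of freedom quoted with the F statistics above can be checked against the design. The sketch below assumes a standard mixed design (one between-patient grouping factor, one repeated time factor) with complete data for every patient; the function name is illustrative. The figure of three age groups is inferred from the reported df=2,15 and is not stated in the text.

```python
def mixed_design_df(n_patients, n_groups, n_times):
    """Degrees of freedom for one between-patient factor and one repeated factor."""
    error_between = n_patients - n_groups
    error_within = error_between * (n_times - 1)
    return {
        "group": (n_groups - 1, error_between),
        "time": (n_times - 1, error_within),
        "group x time": ((n_groups - 1) * (n_times - 1), error_within),
    }


# 18 patients completed all four assessments (T1 to T4).
gender_df = mixed_design_df(n_patients=18, n_groups=2, n_times=4)
age_df = mixed_design_df(n_patients=18, n_groups=3, n_times=4)
print("two groups:", gender_df)
print("three groups:", age_df)
```

With 18 patients and two groups this gives the df=1,16 and df=3,48 reported for gender and duration, and with three groups the df=2,15 and df=6,45 reported for age.
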

Discussion
This study illustrates that
simple outcome measurements can reveal a considerable range of interesting observations
about the circumstances of people requesting acupuncture, and the outcomes of a course of
treatment.

The findings are from the practices of seven trained
acupuncture practitioners spread throughout England. The size of the sample of patients is
small and the results will require replication before any strong conclusions can be drawn.
However, they do give clear guidance on specific studies that could now be carried out on
a larger scale. They also provided useful feedback to the group of practitioners on their
clinical work as part of the process of reflective practice.

In the remainder of this discussion, by posing a series
of questions, we summarise the main findings that invite replication and further study and
also identify some issues and concerns that need to be addressed in further research.

What sorts of people consult an acupuncturist?
Nearly half the patients (45%) were attending with a
musculo-skeletal condition; the remainder were spread across neurological, skin,
psychological, female-genital, male-genital, women's, urological, ears, eyes and digestive
conditions (Lamberts and Woods 1987). The majority of patients (59%) had had their main
condition for more than two years prior to visiting the acupuncturist.

What are the outcomes of treatment and what appears to influence the outcome?
Patients with more severe initial conditions, as
measured by the SF-36 (bodily pain and general health sub-scales) tend to make more rapid
improvement (Figs 3a and 3b). There are indications (not statistically significant) that
patients who have had their condition for less than two years gain more benefit from
treatment (Fig 4), and that musculo-skeletal conditions are initially less severe and
respond well to acupuncture over a course of ten treatments (Fig 5). Neither the age nor
the gender of the patient appears to influence the outcome of treatment.

What is the appropriate number of treatments?

What further research is required?
Firstly, there is a need to involve the wider
professional acupuncture practitioner community in studying outcomes as part of a
reflective approach to practice. In this way the profession could be better prepared to
develop its own clinical practice and, with knowledge and confidence, to collaborate with
external expert researchers and evaluators. Secondly, the findings reported here could have
important consequences for the way acupuncture is practised and promoted, although before
advising that they should be acted on, the study needs to be repeated with larger numbers
and with certain improvements:
Conclusion
We are aware that a key advantage of
this study was that it was designed to be simple and easy to carry out. In this it was
successful. A larger replication of this study would need to balance the needs for
additional data with the benefits of simplicity that have characterised this study.

Future studies using this methodology could be designed
at two levels. Firstly, basic outcome studies replicating the methods outlined here could
be undertaken by a wider range and larger number of practitioners. And secondly, studies
to address specific issues, such as those outlined above, could be developed with a cohort
of more experienced practitioner-researchers. Acupuncturists who have the commitment, time
and experience can develop new research methods and pass on the learning to their
colleagues so that, over time, acupuncture practitioners as a whole have a stronger
research base.

Acknowledgements
This study arose out of the enthusiasm
and commitment of a group of acupuncture practitioners as part of a project organised by
the Foundation for Traditional Chinese Medicine. In addition to the authors, the following
practitioners were actively involved in the study: Richard Blackwell, Sigyta Hart, Val
Humphrey, Peter Luty, Jackie Shaw, David Smyth and Dr Frederick Staebler. We thank them
all and the course tutors who created the right conditions for the study to take place:
Richard Blackwell, Professor Roy Carr Hill, Dr Peter Davies, Francesca Diebschlag, Dr
David Reilly, Dr Mike Robinson and Kate Thomas. Finally, we thank the Research Council for
Complementary Medicine for an award under their First Rung scheme to support the
implementation of the study.

Hugh MacPherson BSc PhD

References