This question addresses the possible connection between surgical success and
patient characteristics. Will patients with less severe symptoms or extent of
stenosis benefit to a greater or lesser degree from surgery compared to patients
with more severe symptoms or extent of stenosis?
Implicit in this question is whether some patients might benefit more from
surgery than from medical management. We turn to this implicit question after
first addressing the relationship between patient characteristics and surgical
outcomes.
Relationships Between Patient Characteristics and Surgical Outcomes
We addressed this question by performing a systematic narrative review of
each study. In this review, we calculated, wherever possible, each study's
effect size. The following analysis is subgrouped according to the type of
lumbar spinal stenosis and the patient characteristics that were correlated
with the outcomes of surgical treatment. Only studies that stratified
surgical treatment outcomes by patient characteristics or used a regression
analysis to compare patient characteristics to outcomes were considered to
have evidence that could be used to answer question 6.
Table 25. Controlled Trials of Patients with Central Lumbar
Spinal Stenosis
| Study | Patient Group | Treatment | Group Number | Number of Patients | Outcomes | Followup |
|---|
| Herno, Saari, Suomalainen et al., 1999a | Central lumbar stenosis no post-surgery stenosis | Mixed decompression techniques | 1 | 35 | Back pain, leg pain, walking, global, disability | Mean of 47 months |
| | Central lumbar stenosis post surgery stenosis | Mixed decompression techniques | 2 | 57 | | |
| Herno, Partanen, Talaslahti et al., 1999b | Central lumbar stenosis post surgery stenosis | Standard wide decompressive laminectomy | 1 | 41 | Walking, global, disability | Between 113 and 157 months |
| | Central lumbar stenosis no post surgery stenosis | Standard wide decompressive laminectomy | 2 | 15 | | |
| Hanakita, Suwa, and Mizuno, 1999 | Central lumbar stenosis younger than 64 years | Standard wide decompressive laminectomy | 1 | 59 | Back pain, leg pain, walking, global | Between 12 and 96 months |
| | Central lumbar stenosis older than 64 years | Standard wide decompressive laminectomy | 2 | 61 | | |
| | Central lumbar stenosis | Partial laminectomy or hemilaminectomy | 3 | 16 | | |
| | Central lumbar stenosis | SWDL with fusion (arthrodesis) | 4 | 20 | | |
| Thomas, Rea, Pikul et al., 1997 | Central lumbar stenosis | Standard wide decompressive laminectomy | 1 | 12 | Walking | Pretreatment and 24 to 62 months |
| | Central lumbar stenosis | Laminotomy | 2 | 14 | | |
| Yone, Sakou, Kawauchi et al., 1996 | Central lumbar stenosis | SWDL with fusion and instrumentation | 1 | 10 | Back pain, leg pain, walking, global | Between 24 and 68 months |
| | Central lumbar stenosis | Laminotomy | 2 | 17 | | |
| Grob, Humke, and Dvorak, 1995 | Central lumbar stenosis | Partial laminectomy or hemilaminectomy | 1 | 15 | Back pain, back pain relief, leg pain relief, walking, global | Pretreatment and 24 to 32 months |
| | Central lumbar stenosis single level | Partial laminectomy with fusion and instrumentation | 2 | 15 | | |
| | Central lumbar stenosis multiple segments | Partial laminectomy with fusion and instrumentation | 3 | 15 | | |
| Johnsson, Uden, and Rosen, 1991 | Central lumbar stenosis | Conservative-not described | 1 | 20 | Back pain, walking, global, work | Between 7 and 51 months |
| | Central lumbar stenosis moderate stenosis | Standard wide decompressive laminectomy | 2 | 30 | | |
| | Central lumbar stenosis severe stenosis | Standard wide decompressive laminectomy | 3 | 14 | | |
| Ray, 1982 | Central lumbar stenosis | Standard wide decompressive laminectomy | 1 | 48 | Global | Mean of 10 and 13 months |
| | Central lumbar stenosis | Partial laminectomy or hemilaminectomy | 2 | 17 | | |
| Surin, Hedelin, and Smith, 1982 | Central lumbar stenosis marked stenosis | Standard wide decompressive laminectomy | 1 | 15 | Global | Between 14 and 70 months |
| | Central lumbar stenosis moderate stenosis | Standard wide decompressive laminectomy | 2 | 7 | | |
| Tajima, Fukazawa, and Ishio, 1980 | Central lumbar stenosis | Mixed decompression techniques | 1 | 13 | Global | Unknown postoperative time |
| | Central lumbar stenosis and disk lesion | Mixed decompression techniques | 2 | 14 | | |
Herno et al. (Herno, Saari, Suomalainen et al., 1999a), Herno et al.
(Herno, Partanen, Talaslahti et al., 1999b), Hanakita et al.
(Hanakita, Suwa, and Mizuno, 1999), Johnsson et al. (Johnsson, Uden,
and Rosen, 1991), Surin et al. (Surin, Hedelin, and Smith, 1982), and
Tajima et al. (Tajima, Fukazawa, and Ishio, 1980) stratified the
reporting of surgical treatment outcomes by patient characteristics
(see Table 25).
In addition to these studies, the prospective single arm surgical trials of
Katz et al. (Katz, Stucki, Lipson et al.,
1999) and Jonsson et al. (Jonsson, Annertz, Sjoberg et al.,
1997), and the retrospective single arm surgical trial of Thomas et al.
(Thomas, Rea, Pikul et al.,
1997) also examined the relationship between patient characteristics
and surgical outcome.
Effect of Extent of Stenosis on Surgical Outcomes Among Patients With
Central Stenosis
Table 26. Analysis of Walking Capacity in Meters from
Johnsson, Uden, and Rosen, 1991
| Patient Group | Baseline (m, mean ± SD) | Followup (m, mean ± SD) | Mean Difference (m) | Hedges' d | 95% Confidence Limits |
|---|
| Severe stenosis; n = 14 | 120 ± 114 | 2,373 ± 4,142 | 2,253 | 0.747 | −0.020 to 1.513 |
| Moderate stenosis; n = 30 | 186 ± 203 | 1,453 ± 2,935 | 1,267 | 0.601 | 0.080 to 1.120 |
| Mean Difference between Patient Groups | 66 | 920 | | | |
| Hedges' d between Patient Groups | −0.360 | 0.269 | | | |
| 95% Confidence Limits between Patient Groups | −0.998 to 0.279 | −0.368 to 0.906 | | | |
Table 27. Analysis of Walking Capacity According to a
Three-Level Scale from Johnsson, Uden, and Rosen, 1991
| Patient Group | Walking Capacity: Worse | Walking Capacity: Unchanged | Walking Capacity: Improved | Effect Size at Followup | Confidence Limits at Followup |
|---|
| Severe stenosis; n = 14 | 3 | 0 | 11 | 0.636 | 0.406 to 0.998 |
| Moderate stenosis; n = 30 | 7 | 8 | 15 | | |
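The effect size and confidence limits in Table 27 can be reproduced by collapsing the three-level scale to improved versus not improved and computing a relative risk of improvement (moderate versus severe stenosis) with a log-scale 95 percent confidence interval. The Python sketch below shows that calculation. Reading the tabulated value as a relative risk is an assumption, since the formula behind it is not stated; the function name `relative_risk` and the 1.96 multiplier are likewise illustrative choices.

```python
import math


def relative_risk(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk of an event in group A versus group B.

    Returns (rr, lower, upper), with confidence limits computed on the
    log scale using the usual large-sample standard error.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Counts of patients rated "improved" at followup, taken from Table 27:
# moderate stenosis 15 of 30, severe stenosis 11 of 14.
rr, lower, upper = relative_risk(15, 30, 11, 14)
print(f"RR = {rr:.3f}, 95% limits {lower:.3f} to {upper:.3f}")
```

A value below 1 under this reading means that a smaller proportion of the moderate stenosis group improved than of the severe stenosis group.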
Johnsson et al. (1991)
stratified patients receiving standard wide laminectomy according to the
extent of pretreatment stenosis. A partial block at myelography
indicated moderate stenosis (30 patients, mean and standard deviation of
the AP diameter was 7.9 ±2.3 mm), and a total block indicated severe
stenosis (14 patients, diameter = 0). The mean ages and their standard
deviations were 62 ±9 years in the moderate group and 69 ±8 years in the
severe group. Walking capacity was evaluated at a mean followup time of
50 ±32 months in the moderate group and 58 ±32 months in the severe
group. The original walking data and our calculations of effect size are
presented in
Table 26. At baseline, the
walking capacity was similar between groups, and both groups showed
large increases in walking capacity at followup (>1,200 m).
However, only the moderate stenosis group showed a statistically
significant effect size when pre- and posttreatment values were
compared. The large standard deviations at followup in both patient
groups, approximately twice the mean, indicate a large variation in
individual improvement in walking capacity. Our analysis of walking
capacity based on the three-level scale data reported by
Johnsson et al. (1991) (worse,
unchanged, improved) indicated only a small effect in favor of the
severe stenosis group (effect size = 0.636, confidence limits
of 0.406 and 0.998; see
Table 27). The data
from this trial seem to suggest that a select group of patients
benefited from surgery more than others. However, the degree of stenosis
did not seem to determine who these patients would be, and insufficient
information is presented in the trial to determine which patients might
make up the group that benefits.
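The effect sizes in Table 26 follow the usual form of Hedges' d: the difference in means divided by the pooled standard deviation, multiplied by a small-sample correction factor. The following Python sketch is one way to carry out that calculation. It assumes that pre- and posttreatment values are treated as two independent samples and that a large-sample variance approximation with a 1.96 multiplier gives the 95 percent confidence limits. These assumptions were chosen because they come close to the tabulated values, not because the source states them, and the function name `hedges_d` is hypothetical.

```python
import math


def hedges_d(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Bias-corrected standardized mean difference (mean1 - mean2).

    Returns (d, lower, upper). The variance formula is a common
    large-sample approximation and is an assumption of this sketch.
    """
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    d = correction * (mean1 - mean2) / pooled_sd
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d, d - z * se, d + z * se


# Walking capacity in meters (mean, SD, n) from Table 26.
# Followup versus baseline within each surgical group:
severe_change = hedges_d(2373, 4142, 14, 120, 114, 14)
moderate_change = hedges_d(1453, 2935, 30, 186, 203, 30)

# Severe versus moderate stenosis group at baseline:
between_baseline = hedges_d(120, 114, 14, 186, 203, 30)

for label, result in [
    ("severe, followup vs. baseline", severe_change),
    ("moderate, followup vs. baseline", moderate_change),
    ("severe vs. moderate at baseline", between_baseline),
]:
    d, lower, upper = result
    print(f"{label}: d = {d:.3f} ({lower:.3f} to {upper:.3f})")
```

Confidence limits that include zero, as for the severe stenosis group, correspond to an effect size that is not statistically significant at the 0.05 level.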
Surin et al. (1982) also
stratified outcomes by degree of stenosis, but the moderate stenosis
group (AP diameter between 11 and 14 mm) contained only seven patients,
while the marked stenosis group (<10 mm) contained 15 patients.
With fewer than 10 patients in the moderate stenosis group, a
reliable comparison of the data between patient groups is not possible. The average
followup examination after surgery was 29 months, but the range was 14
to 70 months. The small sample size and the large range in followup
times prevent the use of data from this study in our analysis.
Herno et al. (1999a) and Herno et al. (1999b)
retrospectively examined surgical outcomes based on the presence of
stenosis at postsurgical followup. Herno et al. (1999a) reported on patients treated surgically
for the first time between 1982 and 1984. Herno et al. (1999b) reported on patients who
underwent surgery between 1985 and 1987. The stenosis was found at the
original surgical site or an adjacent vertebral level. The mean patient
ages were between 50 and 55 years at the time of surgery. Since these
are retrospective trials, no pretreatment scores were reported for the
outcome measures. Without baseline scores, the extent of patient
disability before treatment cannot be judged or compared to patients in
other studies. In addition, the degree of improvement over baseline
conditions cannot be determined. This makes the results of these studies
difficult to interpret. Baseline clinical features were reported. In
Herno et al. (1999b) the
nonstenosis group (15 patients) had a significantly greater baseline
mean cross-sectional area of the dural sac compared to the stenosis
group (41 patients) (145 mm² vs. 95 mm², p
<0.0001, Mann-Whitney test). Laminectomy, extended laterally to
decompress the nerve roots, was performed on all patients. After a mean
followup time of 10.2 and 11.1 years for the nonstenosis and stenosis
groups, respectively, no statistically significant differences were
found in the mean Oswestry score (a measure of back pain disability)
(28.7 vs. 31.2, Kruskal-Wallis test) or walking capacity (15 minutes on
a treadmill) (515 m vs. 470 m, Kruskal-Wallis test).
Herno et al. (1999a) reported
similar baseline clinical characteristics among the 35 patients with no
stenosis postsurgery and the 57 patients with postsurgery stenosis.
After a mean followup time of 3.5 years, no statistically significant
differences were found in the mean Oswestry score (28.4 vs. 26.4, p =
0.755, Kruskal-Wallis test) and walking capacity (706 m vs. 602 m, p =
0.178, Kruskal-Wallis test). In these two studies, the presence of
postsurgery stenosis was not correlated with surgical outcomes as
measured by disability from back pain and walking capacity at the time
of followup. The lack of pretreatment scores prevents any assessment of
the actual effect of surgery on these patient groups, and these studies do
not provide evidence for a correlation between patient characteristics
and the success or failure of surgery.
Jonsson et al. (1997) used a logistic regression analysis to correlate
patient characteristics with favorable outcomes. In 105 patients with an
average age of 65 years (range of 37-83 years), patients with an AP
diameter of <6 mm had the most favorable outcomes at 5 years
after surgery. No statistical information (p values, correlation
coefficients) was reported to support these conclusions.
Effect of Age on Surgical Outcomes Among Patients With Central Lumbar
Stenosis
Hanakita et al. (1999)
retrospectively stratified by age (<64 years and >65 years
at the time of surgery) patients who received laminectomy only, partial
hemilaminectomy, or laminectomy plus fusion. A sufficient number of
patients were available in the laminectomy-only group to evaluate the
effect of age on surgical outcome (59 patients younger than 65 and 61
patients equal to or older than 65). However, the followup time was not
specifically reported and could have been between one and eight years
for any patient. During the followup period, 12 patients in the younger
group were lost to followup (20 percent) and 17 patients in the older
group were lost to followup (28 percent). Based on patient evaluation of
surgical outcome, 77 percent of the younger group and 66 percent of the
older group were considered "cured" or "better" as compared to
"unchanged" or "worse" according to the rating scale used by the
authors. Our computed effect size for this difference was a Hedges' d
of 0.054 with confidence limits of −0.155 and 0.263, which is not
statistically significant. If dropouts are considered failures and put
in the "worse" category, the analysis still showed no difference between
the groups but shifted somewhat in favor of the younger group (Hedges' d
= 0.349 with confidence limits of −0.011 and 0.710). This study suggests
that surgical success based on patient evaluation of outcome ("I'm
better than I was") is not affected by age. This is not the same as
saying that surgery benefited both age groups to the same extent. The
lack of pretreatment assessment scores describing patient condition
before surgery prevents the assessment of actual surgical benefit. The
older age group may regard "cured" or "better" as a small improvement,
while the younger age group may require a large improvement before they
consider themselves "cured" or "better." Assessments of changes
(differences in pre- and posttreatment scores) in walking capacity,
pain, disability, and activities of daily living are needed for an
actual determination of surgical benefit in the two age groups.
Effect of Herniated Disks on Surgical Outcomes Among Patients With
Central Lumbar Stenosis
Tajima et al. (1980) divided
lumbar spinal stenosis patients into two groups: those with and without
accompanying herniated disks. However, the two groups are not readily
comparable because of differences in surgical methods between the groups.
The stenosis-only group contained 11 patients who received standard wide
laminectomy and two patients who received laminectomy plus fusion. The
stenosis-and-disk group contained six patients who received laminectomy
only, five patients who received laminectomy plus fusion, and three
patients in whom "Love's method" was used but not described. The data in
this study cannot be used to address the effect of concurrent herniated
disk on surgical outcome in lumbar spinal stenosis patients because
differences in outcomes between the groups may be due to differences in
surgical methods as well as the presence of a herniated disk.
Effect of Patient Health and Comorbidity on Surgical Outcomes Among
Patients With Central Lumbar Stenosis
Katz et al. (1999) looked for
predictors of surgical outcome in 199 patients two years after surgery.
These patients received either standard decompression (138),
decompression with fusion (31), or decompression with fusion and
instrumentation (30). The physical examination and radiographic
variables were not associated with outcomes. The best predictors of
symptom severity and walking capacity after surgery were the patients'
ratings of their own health and the presence of cardiovascular
comorbidity. Better walking capacity, better mental health,
decompression with fusion, and higher income had associations of
borderline significance with symptom severity and walking capacity outcomes. However, these
variables combined accounted for only 33 percent of the variation in
walking capacity outcome and 27 percent of symptom severity outcome.
Jonsson et al. (1997) used a logistic regression analysis to correlate
patient characteristics with favorable outcomes. At 5 years after
surgery, patients with comorbid disorders affecting walking ability
fared significantly worse than those without these comorbid conditions.
No statistical information (p values, correlation coefficients) was
reported to support these conclusions.
Thomas et al. (1997)
retrospectively examined 26 patients who received either laminectomy or
laminotomy in order to determine which patient characteristics were
related to patient outcomes. The average age was 68 years (range: 40 to
86). The authors noted that among the patients with poor outcomes there
was a multiplicity of other diseases compared to those patients with
good outcomes. No statistical analysis was presented. With only 26
patients, this study may have too few patients from which to predict
outcomes based on patient characteristics.
Yone et al. (1996) and Ray (1982) both retrospectively
examined patients who received different surgical procedures, but did
not stratify outcomes by patient characteristics. Grob et al. (1995) randomized patients to
different treatments, but did not report outcomes based on patient
characteristics. Therefore, these studies cannot be used to judge if
patients with less severe symptoms benefit from surgery as well as
patients with more severe symptoms.
Effect of Herniated Disks on Surgical Outcomes Among Patients With
Lateral Lumbar Stenosis
Table 29. Analysis of Global Success Data from
Kirkaldy-Willis, Wedge, Yong-Hing et al., 1982
| Patient Group | Global Success at Followup | Hedges' d Effect Size | Confidence Intervals |
|---|
| Lateral stenosis and lateral and central stenosis patients; n = 33 | 14 | 9 | 10 | 0 | −0.600 (p = 0.006) | −1.028 to −0.171 |
| Lateral stenosis plus herniated disk and lateral and central stenosis plus herniated disk patients; n = 44 | 34 | 7 | 2 | 1 | | |
The study by
Kirkaldy-Willis et al.
(1982) was a retrospective case series in which patients were
selected if they had one of the following four conditions: lateral
stenosis, lateral stenosis with disk herniation, lateral and central
stenosis, and lateral and central stenosis with disk herniation. The
average age for the entire group of patients was 46 years (range of 41
to 52 years). Between 12 and 120 months after partial laminectomy, these
patients were asked to assess the degree of improvement in their
condition. We analyzed these categorical data to determine if disk
herniation in combination with lateral stenosis was detrimental to
patient recovery compared to lateral stenosis alone (see
Table 29). Based on patient
reporting of general improvement after surgery, the patients with
herniated disks in addition to stenosis experienced greater improvement
(Hedges' d = −0.600, confidence intervals of −1.028 to −0.171).
Improvement after surgery is based on the patient's perception of their
pain and disability before treatment. The patients with herniated disks
may have experienced more pain or disability before surgery, but in this
retrospective study, no pretreatment measures of pain or disability are
provided to judge each group's relative condition before surgery.
Therefore, this study does not help us in determining if the
pretreatment condition of patients with lateral lumbar stenosis
influences their response to surgery.
Effect of Patient Characteristics on Surgical Outcomes Among Patients
With Degenerative Spondylolisthesis
Table 30. Controlled Trials of Patients with Degenerative
Spondylolisthesis
| Study | Patient Group | Treatment | Group Number | Number of Patients | Outcomes | Followup |
|---|
| Plotz and Benini, 1998 | Degenerative spondylolisthesis | Decompressive surgery without fusion | 1 | 17 | Back pain relief, leg pain relief, global success | 9 to 120 months |
| | Degenerative spondylolisthesis | Decompressive surgery with fusion and translaminar screw fixation | 2 | 18 | | |
| | Degenerative spondylolisthesis | Decompressive surgery with fusion and AO internal fixator | 3 | 71 | | |
| Fischgrund, Mackay, Herkowitz et al., 1997 | Degenerative spondylolisthesis | SWDL with fusion and instrumentation | 1 | 40 | Back pain, leg pain, global success | 24 to 36 months |
| | Degenerative spondylolisthesis | SWDL with fusion (arthrodesis) | 2 | 35 | | |
| Thomsen, Christensen, Eiskjaer et al., 1997 | Degenerative spondylolisthesis | Partial laminectomy with fusion and instrumentation | 1 | 20 | Quality of life, mental status, physical function and activities of daily living | 12 and 24 months |
| | Degenerative spondylolisthesis | Partial laminectomy and fusion | 2 | 21 | | |
| Yuan, Garfin, Dickman et al., 1994 | Degenerative spondylolisthesis | Fusion and pedicle screw fixation | 1 | 2177 | Back pain relief, leg pain relief, physical function and activities of daily living | 23 to 51 months |
| | Degenerative spondylolisthesis | Fusion | 2 | 456 | | |
| Bridwell, Sedgewick, O'Brien et al., 1993 | Degenerative spondylolisthesis | Partial laminectomy or hemilaminectomy | 1 | 9 | Walking | 34 to 45 months |
| | Degenerative spondylolisthesis | Partial laminectomy and fusion | 2 | 10 | | |
| | Degenerative spondylolisthesis | Partial laminectomy with fusion and instrumentation | 3 | 24 | | |
| Satomi, Hirabayashi, Toyama et al., 1992 | Degenerative spondylolisthesis | Fusion and instrumentation | 1 | 27 | Back pain, leg pain, walking, global success | Approximately 36 months |
| | Degenerative spondylolisthesis | Mixed decompression techniques | 2 | 14 | | |
| Herkowitz and Kurz, 1991 | Degenerative spondylolisthesis | Standard wide decompressive laminectomy | 1 | 25 | Back pain, leg pain, global success | 29 to 48 months |
| | Degenerative spondylolisthesis | SWDL with fusion (arthrodesis) | 2 | 25 | | |
| Lombardi, Wiltse, Reynolds et al., 1985 | Degenerative spondylolisthesis | Standard wide decompressive laminectomy | 1 | 20 | Global success | 24 to 84 months |
| | Degenerative spondylolisthesis | SWDL with fusion (arthrodesis) | 2 | 21 | | |
| Fitzgerald and Newman, 1976 | Degenerative spondylolisthesis | Rigid brace | 1 | 29 | Global success | 6 to 216 months |
| | Degenerative spondylolisthesis | Mixed decompression techniques | 2 | 14 | | |
| Rosenberg, 1976 | Degenerative spondylolisthesis | Partial laminectomy or hemilaminectomy | 1 | 11 | Global success | Soon after treatment to 120 months |
| | Degenerative spondylolisthesis | Standard wide decompressive laminectomy | 2 | 15 | | |
| | Degenerative spondylolisthesis | Conservative-not described | 3 | 170 | | |
Four randomized controlled trials and six controlled trials examined
surgical treatment of degenerative spondylolisthesis (see
Table 30). The randomized
controlled trial of
Thomsen et al.
(1997) included patients with isthmic spondylolisthesis,
secondary degenerative spondylolisthesis, and primary degenerative
spondylolisthesis, and reported surgical outcomes separately for each of
these groups. However, when reporting on surgical outcomes according to
patient characteristics, such as duration of symptoms and number of
levels fused, all three types of spondylolisthesis were combined.
Therefore, this study does not report evidence specific to primary
degenerative spondylolisthesis to address the question of patient
characteristics and benefit from surgery. The randomized controlled
trials of
Fischgrund et al.
(1997),
Bridwell et al.
(1993), and
Herkowitz and Kurz
(1991) did not stratify or analyze data based on patient
characteristics. In the controlled trials of Plotz and Benini (1998),
Yuan et al. (1994), Satomi et al. (1992), Lombardi et al. (1985),
Fitzgerald and Newman (1976), and Rosenberg (1976), data within the
surgical treatment groups were not analyzed or reported in relation to
patient characteristics. Therefore, these studies cannot be used to
address the connection between patient characteristics and surgical
outcome.
Comparison of Surgical Versus Conservative Treatment
Table 31. Trials Comparing Surgery and Conservative
Therapy
| Amundsen T, Weber H,
Nordal HJ et al., 2000 | Patients were diagnosed as having severe,
moderate, or mild symptoms; severe patients received
surgery, mild patients received conservative treatment, and
moderate patients were randomized to conservative or
surgical treatment | Sciatic pain in the leg (s), with or without
back pain; radiologic signs of stenosis and compression of
the clinically afflicted nerve roots; no bulging or
herniated disk or previous surgery of the back | | Age: | | 21-40: | 2 | | 41-60: | 9 | | 61-70: | 6 | | >71: | 1 | | Sex: | | 39% M | | Severity: | | 100% | | moderate | | Pain: | | light: | 0 | | moderate: | 5 | | severe: | 13 |
| | Age: | | 21-40: | 2 | | 41-60: | 7 | | 61-70: | 4 | | >71: | 0 | | Sex: | | 69% M | | Severity: | | 100% | | moderate | | Pain: | | light: | 0 | | moderate: | 5 | | severe: | 13 |
| Nonsurgery brace, rehabilitation, physiotherapy, | 18 | Only study to randomly assign patients to either
conservative or surgical treatment |
| Laminectomy: no fusions | 13 |
| Mariconda, Zanforlino,
Celestino et al., 2000 | Unmatched observational comparison within
cohort: all indicated for surgery; nonsurgery, 3 had
contraindicating comorbidity, 14 refused surgery (appears to
correlate with milder disease) | age >40, radicular pain, central
stenosis (degenerative or combined), dural area
<130mm2, indicated for surgery, no previous back
surgery | Straight leg-raise: | | Nonsurgery | 17 | Evaluator blinded for initial exam,
questionnaire for followup exams; no dropouts reported
Beaujon score (0 = worst, 20 = normal): neurogenic
claudication, leg pain at rest, leg pain at exertion, low
back pain, neurological deficit (motor or sensory),
medications, QOL |
| 47% | 45% | Laminectomy; diskectomy if necessary; no fusions
or instruments | 20 |
| Neurological deficits: | |
| 47% | 70% |
| Stenosis: | |
| 78 mm2 | 68 mm2 |
| Multiple levels: | |
| 65% | 70% |
| Initial Beaujon score: | |
| 11 (±2.4 SD) | 8.1 (±2.7) |
| Hurri, Slatis, Soini et
al., 1998 | (1) Unmatched comparison of 57 patients who
had surgery and 18 treated conservatively for unreported
reasons
(2) comparison of Oswestry means
adjusted (general linear model) for age, sex, body mass,
severity of stenosis (radiographic) | Myelography diagnosis of LSS (<11 mm
diameter) in years 1978-82 (134 patients) and able to be
traced and interviewed by phone (86 still alive, 75
interviewed) | Age: | | Conservative | 18 | Oswestry Disability Index (0 = normal, 50 =
worst): pain, personal care, lifting, walking, sitting,
standing, sleeping, sex life, social life, traveling |
| Mean 47 | Mean 50 | Majority got laminectomy, some disk surgery,
some foraminotomy, no fusions | 57 |
| Sex: | |
| 66% M | 56% M |
| Severe (<7.0 mm): | |
| 33% | 46% |
| Sciatic (knee or below): | 82% |
| 72% | |
| Neurological deficit: | 67% |
| 44% | 21% |
| Previous surgery: | |
| 22% | 26% |
| Early retirement: | |
| 44% | |
| Swezey, 1996 | (1) Unmatched comparison; only the 23% (11/47)
not improved by conservative treatment were treated
surgically
(2) Crossover with repeated measures: 11
patients who had unsatisfactory improvement with
conservative therapy during first period were given surgery
during second period | 47 consecutive patients in 1986-7, 43 with
>mild LSS by CT or MRI; 18 also had
spondylolisthesis; 4 had no CT or MRI but had
spondylolisthesis by x-ray; all available for phone
interview, and 31 examined | Conservative treatment improved 56%, left 39%
unchanged, 5% worse | None improved by conservative treatment;
"greater preponderance" of moderate (7/11) or severe (4/11)
neurogenic claudication | Conservative: instruction in ergonomics and flexion
exercises, analgesics, NSAIDs; 23% had traction, 28% had
epidural corticosteroid injections | 36 | |
| Conservative then surgery: all had wide laminectomies, 1
also had foraminectomy, 1 also had fusion | 11 |
| Herno, Airaksinen, Saari
et al., 1996 | Retrospective matched-pair | 57 LSS patients from 1980-87 who had myelography
and no surgery; from 310 not previously operated on patients
with LSS and myelography, 54 were manually matched | Matched on age, sex, myelographic findings
(block, subtotal block, AP diameter <10 mm, AP
diameter 10-12 mm), major symptoms (claudication, leg pain,
mixed), and duration of symptoms | Matched on age, sex, myelographic findings
(block, subtotal block, AP diameter <10 mm, AP
diameter 10-12 mm), major symptoms (claudication, leg pain,
mixed), and duration of symptoms | Conservative (not systematic) | 54 | Patients not operated on for these reasons: 69%
pain was bearable, 19% refused surgery, 5.6% were retired,
9% other reasons, 4% insufficient data |
| Surgery: standard wide laminectomy with partial or whole
facetectomy if needed | 54 |
| Atlas, Deyo, Keller et
al., 1996 | Prospective unmatched observational comparison
within cohort | LSS by neurogenic claudication and radiographic
findings of stenosis, including herniated disks; at least 2
weeks of unsatisfactory improvement with conservative
treatment; no prior lumbar spinal surgery, cauda equina
syndrome, developmental deformity, fractures, infection,
tumor, inflammation, pregnancy; age >17 | Straight leg positive: | | Conservative: various types | 58 | Commonly known as the Maine Lumbar Spine
Study
4-year followup outcomes were recently
published in Atlas,
Keller, Robson et al., 2000 |
| 12% | 28% | Surgery: 88% laminectomy, 9% diskectomy only,
4% fusion and laminectomy | 72 |
| Radiographic normal or mild: | |
| 33% | 18% |
| Radiographic severe: | 28% |
| 10% | |
| Extreme low back pain: | 59% |
| 39% | 79% |
| Extreme leg pain: | |
| 32% | 20% |
| Disabled, in bed: | |
| 11% | |
| Nagler and Bodack,
1993 | Retrospective unmatched observational comparison
within cohort; patients receiving surgery were a "select
group with fewer concomitant medical problems," (i.e., a de
facto pseudo-randomization to the extent comorbidities are
independent) | 100 patients with LSS symptoms (pain, numbness,
weakness in back and legs that was worse with standing or
walking and was relieved by sitting or lying) and positive
CT and/or myelogram; not predominantly disk caused; 80 able
to be interviewed by phone at 1 year | Age: | | Conservative: combinations of analgesics, electrical
stimulation, ice, heat, hydrotherapy, ultrasound, muscle
relaxation techniques, stretching and strengthening
exercises, treadmill training | 41 | |
| 56.4 mean (range 26 to 82) | 57.4 mean (range 28 to 78) | Surgery: laminectomy, with or without
foraminectomy | 39 |
| More comorbidities | Fewer comorbidities |
| (Johnsson,
Uden, and Rosen, 1991) | Unmatched (retrospective?) observational
comparison within cohort; 2 untreated patients had
contraindicating cardiovascular disease; 17 refused surgery | Patients with <12 mm diameter on
myelography but not total block, no previous spinal surgery,
no impaired circulation in legs (i.e., any claudication was
neurogenic) | Age: | | Untreated (details not reported) | 19 | |
| 60±9 (range 42 to 80) | 62±9 (range 45 to 78) | Surgery: laminectomy and facetectomy; no fusion | 30 | |
| Symptom duration: | |
| 22±24 months (range 4 to 96 months) | 3±35 months (range 4 to 96 months) |
| Observation after myelography: | |
| 31±12 months (range 7 to 51 months) | 50±32 months (range 5 to 109 months) |
| Neurogenic Claudication: | |
| 84%
(16/19) | 77%
(23/30) |
| Radicular pain: | |
| 11%
(2/19) | 17%
(5/30) |
| Mixed symptoms: | |
| 5%
(1/19) | 7%
(2/30) |
| AP diameter (myelography): | |
| 8.6±1.7 SD | 7.9 ±2.3 SD |
| Walking capacity: | |
| 1,355
(±3,081 SD) | 186
(±203 SD) |
In this part of our analysis, we turn to the question that is implicit in our
primary question, "Do some patients benefit more from surgery than from
medical management?" We located eight studies that examined patients with
central and/or lateral stenosis who received nonsurgical or surgical
treatment (see
Table 31). For each of these studies,
we reviewed the design and results to determine if these studies provided
reliable evidence to answer this question. In this process, we calculated,
wherever possible, each study's effect size. A critical requirement in any
study trying to address this question is having comparable patients in both
the nonsurgery and surgery groups. Only one of the studies is a randomized
controlled trial, so potential bias may exist in how each study assigned
patients to treatment groups. Therefore, within each study we will determine
if the patient groups are comparable in terms of pretreatment signs,
symptoms, and measurements used to assess posttreatment success. These
trials do not provide evidence for the effectiveness of any one conservative
treatment because multiple types of conservative treatments were used in
each nonsurgery patient group.
Johnsson et al. (1991) compared 44
patients treated surgically for lumbar spinal stenosis with 20 patients not
surgically treated. The nonsurgery patients refused surgery (19 patients) or
the anesthesiologist refused to administer anesthesia because of advanced
cardiovascular disease (2 patients). To obtain comparable groups for
analysis, one patient in the nonsurgery group with severe stenosis was
removed from the analysis, and the surgery patients were divided into severe
stenosis (30 patients) and moderate stenosis (14 patients). Thus, we are
concerned with the comparison of the conservatively treated
moderate-stenosis group with the surgically treated moderate-stenosis group.
The smallest AP diameter had a mean and standard deviation of 7.9 ±2.3 mm in
the moderate-stenosis surgery group and 8.6 ±1.7 mm in the nonsurgery group.
The mean ages were 60, 62, and 69 years for the nonsurgery,
moderate-stenosis, and severe-stenosis groups, respectively. Neurogenic
claudication was diagnosed before treatment in 84 percent, 77 percent, and
86 percent of patients in the nonsurgery, moderate-stenosis, and
severe-stenosis groups, respectively. The signs and symptoms of lumbar
spinal stenosis and the extent of stenosis seen by myelography are the only
evidence reported that indicate that the nonsurgery and moderate-stenosis
groups are comparable. Followup exams were performed, on average, 31 months
after treatment in the nonsurgery group and 50 months after treatment in the
moderate-stenosis surgery group.
Table 32. Analysis of Neurogenic Claudication Comparing
Nonsurgical Patients to Surgical Patients from Johnsson, Uden, and
Rosen, 1991
| Moderate stenosis; n = 30 | 23 | 11 | 1.76 (0.50 to 6.22), p = 0.858 | 0.323 (−0.395 to 1.040), p = 0.378 |
| No surgery; n = 19 | 16 | 10 |
Table 33. Analysis of Walking Capacity in Meters Comparing
Nonsurgical Patients to Surgical Patients from Johnsson, Uden, and
Rosen, 1991
| Patient Group | Before Treatment | After Treatment | Change | Hedges' d (before vs. after) | 95% Confidence Limits |
|---|---|---|---|---|---|
| Moderate stenosis; n = 30 | 186 ± 203 | 1,453 ± 2,935 | 1,267 | −0.601, p = 0.023 | −1.119 to −0.084 |
| No surgery; n = 19 | 1,355 ± 3,081 | 2,342 ± 4,069 | 987 | −0.268, p = 0.411 | −0.906 to 0.371 |
| Mean Difference between Patient Groups | 1,169 | 889 | | | |
| Hedges' d (between groups) | −0.599, p = 0.045 | −0.256, p = 0.384 | | | |
| 95% Confidence Limits | −1.186 to −0.012 | −0.833 to 0.321 | | | |
The original data from Johnsson et al. (1991) on the occurrence of
neurogenic claudication and on walking capacity, plus our analysis of these
data, are contained in Table 32 and Table 33. The data on neurogenic
claudication indicate that the frequency of this symptom declined to a
similar extent in both groups. The chances of recovery from neurogenic
claudication did not differ significantly between groups (see Table 32). The
analysis of the walking data is complicated by the significantly better
capacity of the nonsurgery group before treatment (see Table 33). The
nonsurgery group could, on average, walk 1,169 m farther before treatment
than the moderate-stenosis group. After treatment, the two groups had
comparable walking capacity. This study seems to indicate that patients with
moderate spinal stenosis can recover from some of the symptoms associated
with lumbar spinal stenosis. However, this evidence is weakened by the
apparent differences between patients in the nonsurgery and surgery groups in
initial walking capacity. The authors have pointed out that this difference
in walking capacity may indicate that symptoms were more severe in the
patients treated surgically.
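The between-group effect sizes for walking capacity can be approximately reproduced from the reported means, standard deviations, and group sizes. The sketch below is illustrative only. It assumes the standard bias-corrected standardized mean difference with a large-sample variance approximation and a sign convention of surgery minus no surgery; the report does not state its exact formulas, and the function name `hedges_d` is hypothetical.

```python
import math

def hedges_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Bias-corrected standardized mean difference (group A minus group B)
    with an approximate 95% confidence interval.
    The formulas are an assumption; the report does not state its method."""
    df = n_a + n_b - 2
    pooled_sd = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df)
    correction = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    d = correction * (mean_a - mean_b) / pooled_sd
    # Large-sample approximation to the variance of d
    variance = (n_a + n_b) / (n_a * n_b) + d**2 / (2 * (n_a + n_b))
    half_width = 1.96 * math.sqrt(variance)
    return d, d - half_width, d + half_width

# Walking capacity in meters (mean, SD, n): moderate-stenosis surgery
# group versus nonsurgery group, before and after treatment.
before = hedges_d(186, 203, 30, 1355, 3081, 19)
after = hedges_d(1453, 2935, 30, 2342, 4069, 19)

for label, (d, lower, upper) in (("before", before), ("after", after)):
    print(f"{label}: d = {d:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

Under these assumptions the results agree with the tabulated values (−0.599 before and −0.256 after treatment) to within rounding.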
Nagler and Bodak (1993) retrospectively analyzed the recovery from symptoms
of 41 conservatively treated patients and 39 patients who received
laminectomies. The degree of improvement, based on a ratio of the original
symptoms to those at one year, was similar between groups (54 percent of
conservative-treatment patients and 56 percent of surgery patients had
>50 percent improvement; χ² = 0.06114, p = 0.804689, according to our
calculations). The actual numbers of patients with each symptom before and
after treatment were not reported. Without these data, we cannot determine
the comparability of the patient groups before treatment. The authors suggest
that the surgery patients represent a selected group with fewer concomitant
medical problems. Therefore, the data in this study cannot be used to
conclusively determine if patients with comparable levels of lumbar spinal
stenosis do better with conservative treatment or surgery.
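The chi-square comparison for Nagler and Bodak (1993) can be sketched as follows. The counts are an assumption: the study reports only percentages, and 22 of 41 conservative patients (54 percent) and 22 of 39 surgery patients (56 percent) are the whole numbers consistent with them. The function name `chi_square_2x2` is hypothetical, and the test is computed without a continuity correction.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]], with the p value for 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Assumed counts back-calculated from the reported percentages:
# rows are treatment groups; columns are >50 percent improvement (yes, no).
chi2, p = chi_square_2x2(22, 19, 22, 17)
print(f"chi-square = {chi2:.5f}, p = {p:.4f}")
```

With these assumed counts the statistic is close to the value quoted in the text, which suggests the original calculation also used an uncorrected chi-square test.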
A recently published randomized controlled trial does provide a comparison of
a nonsurgical control group with a surgery group with the same baseline
patient characteristics (Amundsen T, Weber H, Nordal HJ et al., 2000). In
this study, randomization was considered ethical only among patients whose
severity of pain did not clearly indicate to the physician that one
treatment, surgery or conservative therapy, was preferable. The physicians
selected surgery for 19 patients with severe symptoms and conservative
treatment for 50 patients with mild symptoms. The remaining 31 patients, who
had moderate symptoms, were randomized: 13 patients received surgery
(laminectomy without fusion) and 18 patients received conservative therapy (a
brace and rehabilitation for one month, followed by an additional 2 months
with the brace, and physiotherapy).
Table 35. Examiner Assessment Data from Surgically and
Conservatively Treated Patients with Lumbar Spinal Stenosis from
Amundsen T, Weber H, Nordal HJ et al., 2000
| Patient Group | Time Post Treatment | Examiner Assessment: Excellent | Examiner Assessment: Fair | Examiner Assessment: Unchanged | Examiner Assessment: Worse |
|---|---|---|---|---|---|
| Surgical treatment, nonrandomized; n = 19, 2 dropouts by the 10th year | 6 months | 7 | 8 | 2 | 2 |
| | 1 year | 9 | 8 | 2 | 0 |
| | 4 years | 11 | 5 | 2 | 1 |
| | 10 years | 7 | 5 | 4 | 1 |
| Conservative treatment, nonrandomized; n = 50, 9 dropouts and 10 surgery by the 10th year | 6 months | 18 | 17 | 5 | 10 |
| | 1 year | 18 | 14 | 6 | 12 |
| | 4 years | 17 | 11 | 8 | 14 |
| | 10 years | 15 | 8 | 6 | 21 |
| Surgical treatment, randomized; n = 13, 2 dropouts by the 10th year | 6 months | 4 | 8 | 1 | 0 |
| | 1 year | 5 | 4 | 1 | 3 |
| | 4 years | 8 | 3 | 1 | 1 |
| | 10 years | 5 | 5 | 0 | 1 |
| Conservative treatment, randomized; n = 18, 1 dropout and 9 surgery by the 10th year | 6 months | 3 | 4 | 1 | 10 |
| | 1 year | 3 | 3 | 2 | 10 |
| | 4 years | 2 | 6 | 0 | 10 |
| | 10 years | 3 | 5 | 0 | 10 |
At followup times of 6 months, 1 year, 4 years, and 10 years, patients were
evaluated for pain, working ability, assessment of own condition, walking
ability, and physical activity. The examining physician used these findings
to determine if a patient's clinical status was "excellent," "fair,"
"unchanged," or "worse" than at the start of the trial. These categorical
data are presented in Table 35. Within 3 to 27 months of entering the study
(median 3.5 months), 10 of the 18 conservatively treated moderate-symptom
patients (56 percent) were crossed over to surgery. Among the patients with
moderate symptoms who were randomized to treatment, a higher percentage of
those who received surgery were rated good (excellent and fair categories
combined) at each of the followup periods than of those who received
conservative therapy (6 months: 92 percent vs. 39 percent; 1 year: 69
percent vs. 33 percent; 4 years: 85 percent vs. 44 percent; 10 years: 77
percent vs. 44 percent). This finding may be an artifact of the study
design.
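The percentages quoted above follow from the randomized rows of Table 35. The sketch below assumes that the first two count columns of that table are the "excellent" and "fair" ratings and that the denominators are the numbers originally randomized (13 and 18), so dropouts and crossovers stay in the denominator. The names used are illustrative.

```python
# Examiner ratings for the randomized patients in Table 35.
# Each tuple is (excellent, fair); the column order is an assumption
# based on the categories named in the text.
N_SURGERY, N_CONSERVATIVE = 13, 18

surgery = {"6 months": (4, 8), "1 year": (5, 4),
           "4 years": (8, 3), "10 years": (5, 5)}
conservative = {"6 months": (3, 4), "1 year": (3, 3),
                "4 years": (2, 6), "10 years": (3, 5)}

def percent_good(ratings, n_randomized):
    """Percentage rated excellent or fair, using the number randomized
    as the denominator at every followup time."""
    return {time: round(100 * (excellent + fair) / n_randomized)
            for time, (excellent, fair) in ratings.items()}

surgery_good = percent_good(surgery, N_SURGERY)
conservative_good = percent_good(conservative, N_CONSERVATIVE)

for time in surgery:
    print(f"{time}: surgery {surgery_good[time]}% "
          f"vs. conservative {conservative_good[time]}%")
```

Because the denominators are fixed at the numbers randomized, every crossover from conservative treatment counts against the conservative arm, which is the mechanism behind the artifact discussed in the text.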
The trial by Amundsen et al. (2000)
represents an attempt to randomize patient treatment between conservative
and surgical approaches and thereby resolve an important clinical question.
Will lumbar spinal stenosis patients with moderate symptoms benefit more
from surgery or from conservative treatment? The data as presented by the
authors suggest that surgery may be more beneficial. However, several design
and reporting problems reduce the strength of this conclusion. Even the
authors acknowledged that "because the situation was observational, the
existence of hidden confounders and selection bias made it mandatory for the
authors to be descriptive and noninferential." Our assessment of these
potential hidden confounders and selection bias follows.
The criteria for assignment to the mild-, moderate-, or severe-symptom groups
are not clearly stated. The authors assert that intensity of patient pain
was the most important reason for selection to the severe-symptom group
which received immediate surgery. However, several pieces of evidence
indicate that physicians may have underrated the pain and severity of
condition in some patients, resulting in these patients' inclusion in the
moderate group as opposed to the severe group. First, the median time lag to
crossover to surgery was 3.5 months, with a range of 3 to 27 months. This
means that about half of the conservatively treated moderate-symptom patients
who crossed over to surgery did so within 3.5 months of entering the trial.
If the original classifications had been correct, one would not have expected
half of the crossovers to occur so soon. Individual or subgroup data on when
crossovers occurred were not reported.
Second, while 14 of 19 severe-symptom patients (74 percent) reported having
severe pain at the start of the trial, 20 of 31 moderate-symptom patients
(65 percent) and 24 of 50 mild-symptom patients (48 percent) also reported
severe pain at the start of the trial. This indicates that severity of pain
was not the only determining factor in how the physicians allocated patients
to the treatment groups. Supporting this notion, a larger proportion of
older patients appears in the mild-symptom group than in the other two
groups. Nine of the 12 patients (75 percent) older than 71 years appear in
the mild group (moderate: 8 percent, severe: 17 percent). Older patients may
have been selected for conservative treatment because they would be expected
to have a poorer prognosis after surgery. Although this is appropriate
clinical judgment, this allocation introduces a bias into the trial that
could favor surgery.
Third, there appears to be a difference between how physicians and patients
rated symptoms. This difference manifests itself as an underrating by the
physician. The authors' report of the agreement between physician and patient
yields a kappa statistic of 0.59 at 6 months and smaller kappas at later
followup periods. This degree of agreement between patients and examiner is
moderate to poor, with patients reporting more in the worse category than
examiners (6 vs. 3). Such underrating of patient pain by medical personnel
has also been observed in other studies (Choiniere, Melzack, Girard et al.,
1990; Daniel, Long, Murphy/Kores et al., 1983). Because of physician
underrating, patients may
have been assigned to the randomized (moderate) group when they should have
been assigned to the surgical (severe) group. If randomized to conservative
treatment, these patients would be more likely to have unsuccessful results
and be crossed over to surgery. The effect of misclassification is to
artifactually reduce the reported effectiveness rate of conservative
treatment, because patients who are crossed over are considered failures by
intent-to-treat analysis. Misclassification increases the apparent
difference in effectiveness between surgery and conservative therapy.
Further, the size of this apparent difference increases as more patients are
misclassified into the moderate group.
We can draw one of two conclusions from our analysis of the patients in the
Amundsen et al. (2000) trial. The first is that surgery is superior to
conservative treatment for patients with moderate symptoms. As discussed
above, the data are confounded and this conclusion lacks support. The second
possible conclusion is that the apparent superiority of surgery in moderate
patients is an artifact caused by inclusion of severe patients in the
moderate-symptom group who then fail conservative therapy. In that case, the
trial provides evidence that surgery is superior to conservative treatment
among patients with severe symptoms. Another trial, with more carefully
designed patient selection and characterization into mild, moderate, and
severe groups, would be needed to determine the actual extent to which
surgery or conservative treatment benefits these groups.
Amundsen et al. (2000) provide
three measures by which the efficacy of surgery compared to conservative
therapy can be judged. The first outcome measure is the number of patients
needing surgery after first receiving conservative treatment. In the
mild-symptom group, only 20 percent of patients eventually needed surgery,
indicating that conservative treatment may be justified in this group. In
the moderate-symptom group, 56 percent of patients needed surgery. However,
as discussed above, this number may be inflated by physician underrating of
patient condition and judging severe patients as moderate. Therefore,
conservative treatment is more likely to fail in these patients.
The second outcome measure is the examiner assessment of "overall treatment
result." This was described as a subjective global assessment based on the
following components: (1) a patient-reported scale of "better," "unchanged,"
or "worse" compared to their condition at the start of the trial; (2) pain;
(3) working ability; (4) walking ability; (5) level of physical activity;
and (6) the opinion of the examining physician. We would expect pain to be
the major component of this scale because pain influences patient working,
walking, and physical activity. Examiners were not blinded to treatment. As
mentioned above, the agreement between examiner assessment and patient
assessment is moderate to poor.
Table 36. Patient Reported Pain Data Compared to Examiner
Assessment Data from Surgically and Conservatively Treated Patients
with Lumbar Spinal Stenosis from Amundsen T, Weber H, Nordal HJ et
al., 2000
| Original Diagnosis of Patient Condition | Treatment | Crossover | 4-year: n | 4-year: No Pain | 4-year: Rated "Excellent" | 10-year: n | 10-year: No Pain | 10-year: Rated "Excellent" |
|---|---|---|---|---|---|---|---|---|
| Mild symptoms n = 50 | Conservative | Surgery n = 10 | 10 | 3 | 4 | 10 | 3 | 4 |
| | | No crossover n = 40 | 40 | 13 | 17 | 32 | 12 | 15 |
| Moderate symptoms n = 31 | Randomized to conservative n = 18 | Surgery n = 10 | 9 | 1 | 2 | 9 | 2 | 2 |
| | | No crossover n = 8 | 8 | 1 | 2 | 8 | 2 | 3 |
| | Randomized to surgery n = 13 | --- | 13 | 5 | 8 | 11 | 5 | 5 |
| Severe symptoms n = 19 | Surgery | --- | 19 | 6 | 11 | 17 | 8 | 7 |
| Wilcoxon Signed Ranks Test results | | | Z = −2.23, p = 0.026 | | | Z = −1.13, p = 0.257 | | |
Table 36 shows a comparison of the third outcome
measure, patient-reported pain, and overall treatment result at 4 and 10
years for each of the four original treatment groups and the two crossover
groups (mild conservative to surgery and moderate conservative to surgery).
We used a Wilcoxon Signed Ranks Test to test the hypothesis that the number
of patients reporting no pain is equal to the number of patients rated
"excellent." At four years, the hypothesis is rejected because, in each
treatment group, more patients were rated as "excellent" than were reported
as having no pain. This suggests that pain was not a primary determinant of
the physician's assessment of treatment results. At 10 years, no significant
difference was found in the number of patients rated as "excellent" and the
number of patients with no pain. The trend agreed with the 4-year data, but
was not statistically significant, possibly because of low power. Given the
examiner tendency toward underrating pain and overrating a patient's overall
condition, the most reliable measure of treatment outcome presented by
Amundsen et al. (2000) is
patient-reported pain.
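The Wilcoxon Signed Ranks Test results in Table 36 can be reproduced with a short sketch. It assumes that, within each time block of the table, the second count is the number of patients reporting no pain and the third is the number rated "excellent," and that the test uses the normal approximation with a tie correction and drops zero differences. The function name `wilcoxon_signed_rank_z` is hypothetical.

```python
import math

def wilcoxon_signed_rank_z(pairs):
    """Normal-approximation Wilcoxon signed-rank test with tie correction.
    Zero differences are dropped. Returns (z, two-sided p)."""
    diffs = [second - first for first, second in pairs if second != first]
    n = len(diffs)
    ordered = sorted(abs(d) for d in diffs)
    rank_of, tie_term = {}, 0.0
    i = 0
    while i < n:
        j = i
        while j < n and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # average rank for tied values
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w - mean) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
    return z, p

# (no pain, rated excellent) for the six patient groups in Table 36;
# the column assignment is an assumption inferred from the text.
four_year = [(3, 4), (13, 17), (1, 2), (1, 2), (5, 8), (6, 11)]
ten_year = [(3, 4), (12, 15), (2, 2), (2, 3), (5, 5), (8, 7)]

z4, p4 = wilcoxon_signed_rank_z(four_year)
z10, p10 = wilcoxon_signed_rank_z(ten_year)
print(f"4 years: Z = {z4:.2f}, p = {p4:.3f}")
print(f"10 years: Z = {z10:.2f}, p = {p10:.3f}")
```

With only six groups (four after zero differences are dropped at 10 years), the normal approximation is rough, which is consistent with the text's caution about low power.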
If complete relief from pain is a goal of treatment for lumbar spinal
stenosis, then few patients obtained that goal at 4 or 10 years with either
conservative or surgical treatment. Patient-reported pain data are presented
by Amundsen et al. (2000) for the start of the trial, at 3 months, 4 years,
and 10 years. Patients were
categorized as having no pain or light pain, moderate pain, or severe pain.
At the start of the trial, the only patients with light pain were found in
the mild-symptom, conservatively treated group. The data in
Table 36 indicate that only one-third
of the severe/surgery patients were pain-free at 4 years, and only half were
pain-free at 10 years. In the moderate group randomized to surgery, 42
percent of patients were pain-free at 4 years, and 45 percent of patients
were pain-free at 10 years. Of patients with mild symptoms who remained with
conservative treatment, 34 percent and 44 percent were pain-free at 4 and 10
years, respectively. Of patients with moderate symptoms who remained with
conservative treatment, 13 percent and 25 percent were pain-free at 4 and 10
years, respectively. Among the patients who were crossed over from
conservative to surgical treatment, 30 percent and 43 percent of
mild-symptom patients and 13 percent and 33 percent of moderate-symptom
patients were pain-free at 4 and 10 years, respectively. As reported, these
data are difficult to interpret. A measure of the magnitude of improvement
in pain after treatment would provide a better gauge of treatment success.
Although the authors measured pain on a visual analog scale, these data are
not reported. Other measures of treatment success such as physical activity
and walking ability were assessed but not reported.
Atlas et al. (1996) reported on 1-year outcomes of patients in the Maine
Lumbar Spine Study, a prospective, observational cohort study of patients
with spinal stenosis treated surgically (81 patients) or nonsurgically (67
patients). This study has three design features that make its results more
reliable than those of the Amundsen et al. (2000) trial. First, Atlas et al.
(1996) used an
objective scoring system for classifying patients' severity of disease.
Therefore, misclassification of patients with severe disease into the
moderate category was less likely. Second, there were no crossovers from
conservative to surgical treatment. Therefore, the effect of surgery was not
exaggerated. Third, study outcomes were based on patient ratings and not on
physician ratings. Therefore, results were not skewed by the observer.
In this trial, laminectomy was performed in 88 percent of the surgery
patients. Nonsurgery patients received mostly bed rest (29 percent), back
exercises (39 percent), physical therapy (23 percent), spinal manipulation
(23 percent), narcotic analgesics (21 percent), and epidural steroids (18
percent). Extensive patient information reported in the article showed that
the severity of symptoms and the degree of disability because of pain were
significantly greater in the surgically treated patients. These patients
reported more frequent and severe leg and back pain and poorer functional
status, but had greater improvement than patients treated nonsurgically.
Because patient preference was the most common reason for not choosing
surgery, the nonsurgery group may have selected conservative treatment
because their symptoms were less severe than those patients who chose
surgery. Therefore, the entire pool of data in this study cannot be used to
determine if patients with comparable levels of lumbar spinal stenosis do
better with conservative treatment or surgery.
However, a subgroup of 54 patients (31 surgery and 23 nonsurgery patients)
reported moderate symptoms before treatment. Results from this group are
useful for comparing conservative and surgical treatments. Patients
receiving surgery were significantly improved compared to patients who did
not receive surgery.
Table 34. Effect Sizes for One and Four Year Outcomes of Patients
with Moderate Symptoms at Baseline from Atlas, Deyo, Keller et al.,
1996 and Atlas, Keller, Robson et al., 2000
| Outcome | Symptom improvement | | Overall results of treatment | | Patient satisfaction | |
| 1-Year Outcome Data | Actual Case | Worst Case | Actual Case | Worst Case | Actual Case | Worst Case |
| Surgery; n = 31 | 19 Yes 12 No | 19 Yes 23 No | 20 Yes 11 No | 20 Yes 22 No | 21 Yes 10 No | 21 Yes 21 No |
| No surgery; n = 23 | 3 Yes 20 No | 3 Yes 20 No | 7 Yes 16 No | 7 Yes 16 No | 5 Yes 18 No | 5 Yes 18 No |
| Effect size: Hedges' d | 1.55 | 0.93 | 0.75 | 0.40 | 1.14 | 0.70 |
| Lower limit of 95% confidence intervals | 0.46 | 0.18 | 0.08 | −0.19 | 0.32 | 0.06 |
| Upper limit of 95% confidence intervals | 2.64 | 1.68 | 1.42 | 0.99 | 1.95 | 1.34 |
| p value for effect size | 0.005 | 0.015 | 0.028 | 0.19 | 0.006 | 0.03 |
| 4-Year Outcome Data | Actual Case | Worst Case | | | Actual Case | Worst Case |
| Surgery; n = 29 | 24 Yes 5 No | 24 Yes 18 No | | | 21 Yes 8 No | 21 Yes 21 No |
| No surgery; n = 22 | 12 Yes 10 No | 13 Yes 10 No | | | 6 Yes 16 No | 7 Yes 16 No |
| Effect size: Hedges' d | 0.75 | 0.01 | | | 1.06 | 0.45 |
| Lower limit of 95% confidence intervals | 0.07 | −0.55 | | | 0.37 | −0.14 |
| Upper limit of 95% confidence intervals | 1.46 | 0.58 | | | 1.74 | 1.04 |
| p value for effect size | 0.04 | 0.96 | | | 0.003 | 0.14 |
Our analysis of percentage data from this group for symptom improvement,
overall results of treatment, and patient satisfaction with treatment is
presented in Table 34. Because these patients with moderate symptoms were not
randomized to treatment, the results may be biased by unknown differences
between the two groups. However, these results do provide some evidence that
among patients with lumbar spinal stenosis who have moderate pain, surgery
may be more beneficial than conservative treatment. Four-year outcomes for
this same group of patients were recently published (Atlas, Keller, Robson
et al., 2000). The data from this report continue to show better outcomes
among patients who initially had moderate pain and received surgery. Of the
68 patients originally treated nonsurgically, 15 (22 percent) underwent
surgery after 3 to 48 months (median 17 months).
The validity of the results of this study is threatened by a high dropout
rate and the authors' failure to report the characteristics of those
patients who dropped out. The article reports that 148 patients were
enrolled in the study, and those in the 25th to 75th percentiles for
severity were considered "moderate." While the percentile calculation would
suggest that 74 patients should be in this category, only 54 patients were
reported (31 surgical, 23 nonsurgical). We presume that nine patients in the
moderate-severity group had not reached the one-year followup, because the
authors report a total of 130 patients who had reached followup (half of
which is 65). This leaves 11 patients unaccounted for. To test whether these
dropouts could have threatened the validity of the observed effect, we
repeated the effect size calculation under a worst-case scenario, in which
all 11 patients were arbitrarily counted as surgical failures. When the
calculations were repeated in this manner, effect sizes remained significant
for two of the three outcomes of interest at the 1-year followup: "major
symptom much better" and patient satisfaction. In the worst-case scenario,
the effect size for overall treatment results decreased to a statistically
nonsignificant value. The calculations and effect sizes are shown in
Table 34.
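The dropout accounting behind the worst-case scenario is simple arithmetic and can be laid out explicitly. The sketch below uses only figures quoted in the text and in Table 34; the variable names are illustrative.

```python
# Figures reported for the 1-year followup
enrolled = 148
reached_followup = 130
moderate_reported = 31 + 23  # surgical + nonsurgical patients analyzed

# The 25th to 75th percentiles for severity cover half of the patients
moderate_expected = enrolled // 2             # expected moderate patients
moderate_at_followup = reached_followup // 2  # expected among those followed
not_yet_followed = moderate_expected - moderate_at_followup
unaccounted = moderate_at_followup - moderate_reported

# Worst case: every unaccounted patient is counted as a surgical failure.
# Actual case for "major symptom much better": 19 Yes, 12 No.
surgery_yes, surgery_no = 19, 12
worst_case = (surgery_yes, surgery_no + unaccounted)

print(moderate_expected, moderate_at_followup, not_yet_followed, unaccounted)
print("worst case (Yes, No):", worst_case)
```

The worst-case counts match the 1-year "Worst Case" column for surgery in Table 34; the same adjustment applies to the other two outcomes.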
Atlas et al. (2000) reported on changes in patients' predominant symptom and
satisfaction with treatment at 4 years. Using the data from 29 surgical
patients and 22 nonsurgical patients, the authors showed that surgical
patients had significantly better outcomes. Our analysis of the reported data
and the results of our worst-case scenario for the selected four-year
outcomes are shown in Table 34. Overall results of treatment at four years
were not reported by Atlas et al. In the worst-case scenario, the greater
number of dropouts at four years (14 instead of 11) decreased the effect
sizes for both symptomatic improvement and patient satisfaction to
statistically nonsignificant levels, although the effect sizes remained
positive. Therefore, the statistical significance of the observed long-term
effects is not robust to this worst-case sensitivity analysis of dropouts.
Swezey (1996) retrospectively
evaluated the progress of 47 patients diagnosed with neurogenic claudication
5 years earlier. At the time of diagnosis, the patients' average age was 76
years. Forty-three of these patients had moderate to marked lumbar spinal
stenosis. The author does not report how patients were judged to be mild,
moderate, or severe with regard to stenosis. No spinal canal measurements
were reported. All patients were started on conservative treatment
(exercise, use of a cane, analgesic and nonsteroidal anti-inflammatory
drugs). Thirteen patients received epidural steroid injections when other
measures did not provide relief from neurogenic claudication. During the
5-year period, 11 patients were given laminectomy to relieve symptoms. This
group was considered to have a greater proportion of moderate to severe
neurogenic claudication, and all were considered improved after surgery.
However, the lack of data on spinal canal diameters reduces the usefulness
of these data in predicting who will benefit from conservative treatment and
who will need surgery. Of the other 36 patients, 20 reported improvement in
symptoms, 14 reported no change, and two reported a worsening of symptoms.
This is one of the few studies that follow patients from diagnosis through
conservative treatment and then to surgery. Although the numbers may be too
small to provide a reliable estimate of the success of conservative
treatment, this study indicates that 72 percent of patients (34 of 47) who
begin conservative treatment will improve or remain the same, while 23
percent of patients (11 of 47) will eventually receive surgery for relief of
symptoms. Those patients who receive surgery will tend to have greater
stenosis.
Herno et al. (1996) attempted to
generate comparable treatment groups by retrospectively matching surgical
patients to a group of 57 nonsurgical patients. Patients were matched
according to sex, age, myelographic findings, major symptoms, and duration
of symptoms. Fifty-four matched pairs were constructed. Total spinal block
and subtotal block occurred in only one and three matches, respectively. The
remainder of the matches had AP diameters of <12 mm (31 patients) or
had lateral stenosis (19 patients). The followup periods were 4.3 and 4.1
years for the nonsurgery and surgery groups, respectively. At followup,
measures of disability and functional status were similar between groups.
The authors caution that an important shortcoming of this retrospective
study is the lack of knowledge about starting pain level and disability in
either group and that patients with more pain are likely to have selected
surgery over conservative treatment. This precludes any reliable comparison
of surgery and medical treatment.
Hurri et al. (1998)
retrospectively assessed the outcomes of surgery and conservative treatment
12 years after treatment began. Among the 57 patients in the surgery group,
26 (46 percent) were considered to have severe stenosis (<7.0 mm
sagittal diameter of the canal), while in the nonsurgery group, 6 of 18
patients had severe stenosis. The authors report that there were no
statistically significant differences in the percentage of patients improved
(63 percent surgery vs. 44 percent conservative) or worse (18 percent vs. 11
percent) after 12 years. Although long followup periods are, in general,
desirable, the authors suggest that after such a long period, any current
patient problems may be caused by factors other than the original stenosis
and that this could obscure the efficacy of treatment. This, however, is a
difficult hypothesis to test. A study with multiple followup times would
seem to be needed, as well as an accounting of the length of time that
symptoms are relieved and the number of patients who received relief but
later had a return of symptoms. Therefore, this study was not used in our
analysis.
The study by Mariconda et al.
(2000) consisted of 20 patients who received surgical treatment
(standard wide decompressive laminectomy) and 17 patients who did not
receive surgery. This latter group consisted of 14 patients who refused
surgery and three patients who were not considered for surgery due to
advanced cardiopulmonary disease. The treatment given to the patients who
did not receive surgery was not described. All patients were older than 40
years, and the mean for the most stenotic dural sac cross-sectional area was
68 mm² in surgical patients and 78 mm² in
conservatively treated patients. Baseline characteristics were different
between the groups. The nonsurgery group had a better functional status as
determined by the overall Beaujon scoring system (mean and standard
deviation of 11 ±2.4 v. 8.1 ±2.7, p <0.05, t-test), the Beaujon score
for leg pain at exertion (0.8 ±0.7 v. 0.25 ±0.6, p <0.05, t-test), and
the Beaujon score for neurological deficits (3.3 ±1.0 v. 2.3 ±1.0, p
<0.05, t-test). In the Beaujon scoring system, the overall score goes
from 0 (worst) to 20 (normal), the leg pain at exertion score goes from 0
(severe) to 2 (none), and the neurological deficit score goes from 0 (major
or sphincter dysfunction) to 4 (none).
Figure 21. Beaujon Score from Mariconda, Zanforlino, Celestino et al., 2000
The Beaujon Scoring System goes from a worst possible functional
score of 0 to a normal functional score of 20.
Figure 22. Leg Pain at Exertion Score from Mariconda, Zanforlino, Celestino
et al., 2000
The Beaujon score for leg pain at exertion goes from 0 (severe) to 2 (no
pain).
Figure 23. Beaujon Score for Neurological Deficit from Mariconda,
Zanforlino, Celestino et al., 2000
The Beaujon score for neurological deficit goes from 0 (major or
sphincter dysfunction) to 4 (none).
Our calculations of the effect sizes (Hedges' d) for these baseline
characteristics were −1.1 (95% confidence interval −1.80 to −0.41) for the
overall Beaujon score; −0.83 (95% confidence interval −1.50 to −0.16) for
leg pain at exertion; and −0.98 (95% confidence interval −1.66 to −0.29) for
neurological deficits. A comparison of outcomes and effect sizes at
pretreatment and at 1 and 2 years posttreatment is presented in Figures 21,
22, and 23. While baseline scores showed differences, the scores at one and
two years posttreatment were nearly identical, implying that the surgical
group had benefited more than the conservative group. However, this may be a
case of each group reaching the maximum score possible for these patients (a
ceiling effect). In Mariconda et al. (2000), patients with less severe
symptoms of lumbar spinal stenosis appeared to improve with conservative
treatment alone. However, the discussion we present in the section entitled
"Factors That Could Potentially Predict the Success of Conservative
Treatment" suggests that this conclusion is far from certain. The lack of a
nonsurgical control group with baseline scores similar to the surgical group
prevents a reliable evaluation of the benefits of surgery. Although scores
improved after surgery, we cannot know the extent to which these scores
would have improved or become worse if surgery had not been performed.
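The baseline effect size for the overall Beaujon score can be worked through from the reported means and standard deviations. The derivation below is a sketch under stated assumptions: a bias-corrected standardized mean difference, a large-sample standard error, and a sign convention of surgery (subscript S, n = 20) minus nonsurgery (subscript N, n = 17). The report does not state which formulas it used.

```latex
% Assumed formulas; subscript S = surgery (n = 20), N = nonsurgery (n = 17)
\[
s_p = \sqrt{\frac{(n_S-1)s_S^2 + (n_N-1)s_N^2}{n_S+n_N-2}}
    = \sqrt{\frac{19(2.7)^2 + 16(2.4)^2}{35}} \approx 2.57
\]
\[
d = \left(1-\frac{3}{4(n_S+n_N-2)-1}\right)\frac{\bar{x}_S-\bar{x}_N}{s_p}
  = \left(1-\frac{3}{139}\right)\frac{8.1-11}{2.57} \approx -1.105
\]
\[
\mathrm{SE}(d) \approx \sqrt{\frac{n_S+n_N}{n_S n_N}+\frac{d^2}{2(n_S+n_N)}}
  = \sqrt{\frac{37}{340}+\frac{(1.105)^2}{74}} \approx 0.354
\]
\[
d \pm 1.96\,\mathrm{SE}(d) \approx (-1.80,\ -0.41)
\]
```

The same steps applied to the leg pain at exertion and neurological deficit scores give approximately −0.83 and −0.98, in line with the values quoted in the text.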
Some information on the efficacy of conservative vs. surgical treatment for
patients with severe symptoms may be obtained from prospective surgical
trials provided pretreatment patient characteristics are clearly stratified
and reported, and patients have previously failed conservative treatment. As
discussed later in Chapter 5, few
surgical studies report the use of and failure of prior conservative
therapy. Two single arm surgical trials (no nonsurgical patients were
included in the studies) have reported patient characteristics and the
failure of prior conservative treatment (Weiner, Walker, Brower et al., 1999; Kleeman, Hiscoe, and Berg, 2000). These studies
provide some evidence that patients with severe symptoms will not benefit
from conservative treatment but will benefit from surgical treatment.
Independent evaluation of patient outcomes in both trials increases the
validity of this evidence.
Weiner et al. (1999) demonstrated
that partial laminectomy improved walking ability in 30 patients with severe
neurogenic claudication (average age of 68 years). Nine months after
surgery, the average walking ability increased from approximately 100 m to
between 300 m and 600 m. Other measures of patient outcomes also improved;
13 patients had almost complete pain relief and 13 patients had a good deal
of pain relief. In a similar study of 54 patients with neurogenic
claudication (average age of 71 years), Kleeman et al. (2000) used partial
laminectomy to relieve leg
pain. Two and a half years after surgery, 65 percent of patients reported
complete relief from their leg pain and 33 percent reported that their pain
was better.
Single-arm (uncontrolled) trials have a large potential for bias in favor of
the therapy under investigation. Therefore, the studies by Weiner et al.
(1999) and Kleeman et al. (2000) may be biased in favor of reporting
successful results for surgical treatment. Uncontrolled trials,
nonrandomized controlled trials, and historically controlled trials have
been shown to favor new therapies over the control therapy more often than
RCTs do (Sacks, Chalmers, and Smith, 1982; Colditz, Miller, and Mosteller,
1989). Uncontrolled single-treatment trials may therefore be expected to
overestimate the benefits of the therapy.
Katz et al. (1999) prospectively
evaluated the pain and walking ability of 263 preoperative patients and 199
patients at 2 years post surgery. Patients had a mean age of 69 years at
enrollment (range of 50 to 92 years). The use of prior conservative therapy
is not reported. Katz et al.
(1999) reported that there were no significant differences between
patients who dropped out (27 percent of patients by 2 years) and those who
remained in the study. At enrollment, 81 percent of patients reported they
were in severe pain. By 2 years after surgery, this proportion had fallen to 31 percent. Also at
2 years, the percentage of patients who could walk two blocks had increased
from 39 percent to 67 percent, and the percentage of patients who could walk
1 mile increased from 13 percent to 42 percent. Eighty percent or more of
the patients in this study may be in the severe category of symptoms, but
patient outcomes were not stratified according to preoperative conditions.
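The before-and-after percentages reported by Katz et al. (1999) can be restated as changes in percentage points. The following is a minimal sketch of that arithmetic only; the outcome labels and the function name are chosen for illustration and do not come from the study. Because the 2-year percentages describe the patients remaining at followup rather than the full enrolled group, the results are descriptive and are not an estimate of treatment effect.

```python
# Percentages reported for Katz et al. (1999): (at enrollment, at 2 years).
# The labels below are illustrative names, not the study's own terms.
katz_outcomes = {
    "severe pain": (81, 31),
    "can walk two blocks": (39, 67),
    "can walk 1 mile": (13, 42),
}


def point_change(before, after):
    """Absolute change in percentage points (after minus before)."""
    return after - before


changes = {
    name: point_change(before, after)
    for name, (before, after) in katz_outcomes.items()
}

for name, delta in changes.items():
    print(f"{name}: {delta:+d} percentage points")
```

A negative value indicates that fewer patients reported the condition at 2 years; a positive value indicates that more patients reported the ability.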
Other single-arm prospective trials of surgical treatment also suggest that
patients improve after surgery (Jonsson, Annertz, Sjoberg et al., 1997;
Javid and Hadar, 1998).
However, these trials include patients with a wide range of ages and
symptoms, which limits the use of their data for determining whether
patients with mild, moderate, or severe symptoms benefit from surgery.
Jonsson et al. (1997) examined 105 patients with an average age of 65 years
(range of 37 to 83 years). All patients reported preoperative leg pain and
dysfunction and were given partial laminectomies. Eighty-six patients were
available through the 5-year
followup period. At 2 years, leg pain was relieved in 67 percent of
patients, and the percentage of patients with poor walking ability
(<0.5 km) declined from 70 percent to 30 percent. By 5 years, only 52
percent of patients were free of leg pain and 35 percent of patients could
not walk more than 0.5 km. These results for walking capacity should be
contrasted with those of Weiner et al.
(1999). In that study, patients improved from less than 100
meters to an average of 500 meters. Under the Jonsson et al. (1997) rating
of walking ability, the Weiner et al.
(1999) patients would still be considered to have poor walking
ability. These differences in approach to rating walking ability and the
wide ranges in walking ability complicate the comparison of study results
and reduce the value of the data presented by Jonsson et al. (1997). Average
walking distances are not reported for each of the categories of walking
ability in Jonsson et al. (1997). Therefore, we are not able to judge the
degree of improvement associated with each initial category of walking
ability, nor compare patients from Jonsson et al. (1997) with patients of
the same condition in Weiner et al.
(1999).
Javid and Hadar (1998) examined 86
patients with central stenosis (average age of 65 years, range of 27 to 89)
and 23 patients with lateral recess stenosis (average age of 54 years, range
of 25 to 79), all of whom received a standard wide laminectomy. Preoperative
leg pain and walking difficulty were reported by 98 percent and 76 percent
of patients
with central stenosis, respectively, and 96 percent and 57 percent of
patients with lateral recess stenosis, respectively. Data on walking ability
are only reported at the last followup (range 1 to 11 years). At the last
followup, patients with central stenosis judged their walking ability as
much better or somewhat better (69 percent), no different (10 percent), and
somewhat or much worse (21 percent). Patients with lateral recess stenosis
judged their walking ability as much better or somewhat better (68 percent),
no different (9 percent), and somewhat or much worse (23 percent). These
walking data are of little value because 24 percent of the central stenosis
patients and 43 percent of the lateral recess stenosis patients originally
reported no difficulty in walking, yet their evaluation of walking ability
is included in the last followup data. As in Jonsson et al. (1997), no
average walking distances are reported for each of the categories of walking
ability, and therefore, no comparisons to other studies are possible.
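The proportions of patients without preoperative walking difficulty cited above follow directly from the reported proportions with walking difficulty. The sketch below shows only that complement calculation; the variable and function names are illustrative assumptions, and the calculation presumes that every patient was classified as either having or not having walking difficulty.

```python
# Percent of patients reporting preoperative walking difficulty,
# as given in the text for Javid and Hadar (1998).
walking_difficulty_pct = {"central": 76, "lateral recess": 57}


def pct_without(pct_with):
    """Complement of a percentage, assuming only two categories exist."""
    return 100 - pct_with


no_difficulty_pct = {
    group: pct_without(pct) for group, pct in walking_difficulty_pct.items()
}

for group, pct in no_difficulty_pct.items():
    print(f"{group} stenosis: {pct} percent reported no walking difficulty")
```

These complements match the 24 percent and 43 percent figures used in the critique of the last-followup walking data.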