
CC BY-NC Open access
Research

Surgery with disc prosthesis versus rehabilitation in patients with low back pain and degenerative disc: two year follow-up of randomised study

BMJ 2011; 342 doi: https://doi.org/10.1136/bmj.d2786 (Published 19 May 2011) Cite this as: BMJ 2011;342:d2786
  1. Christian Hellum, orthopaedic surgeon (1),
  2. Lars Gunnar Johnsen, orthopaedic surgeon (2, 3),
  3. Kjersti Storheim, physiotherapist (4, 5, 6),
  4. Øystein P Nygaard, neurosurgeon (2),
  5. Jens Ivar Brox, consultant (1),
  6. Ivar Rossvoll, orthopaedic surgeon (2, 3),
  7. Magne Rø, consultant (7),
  8. Leiv Sandvik, professor (8),
  9. Oliver Grundnes, orthopaedic surgeon (5),
  10. and the Norwegian Spine Study Group

Affiliations
  1. Department of Orthopaedics, Oslo University Hospital and University of Oslo, Kirkevn 166, 0407 Oslo, Norway
  2. National Centre for Diseases of the Spine, University Hospital of Trondheim, 7030 Trondheim
  3. Orthopaedic Department, University Hospital of Trondheim, 7030 Trondheim
  4. Norwegian Research Centre for Active Rehabilitation (NAR), Department of Orthopaedics, Oslo University Hospital, Kirkevn 166, 0407 Oslo
  5. Hjelp24, Nimi, Oslo Sognsveien 75 D, 0855 Oslo
  6. Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Høgskoleringen 1, 7491 Trondheim (MR), FORMI, Oslo University Hospital, Kirkevn 166, 0407 Oslo
  7. Multidiscipline Spinal Unit, Department of Physical Medicine and Rehabilitation, University Hospital of Trondheim, 7030 Trondheim
  8. Section for Biostatistics and Epidemiology, Oslo University Hospital, Kirkevn 166, 0407 Oslo

Correspondence to: C Hellum christian.hellum{at}medisin.uio.no
  • Accepted 25 March 2011

Abstract

Objective To compare the efficacy of surgery with disc prosthesis versus non-surgical treatment for patients with chronic low back pain.

Design A prospective randomised multicentre study.

Setting Five university hospitals in Norway.

Participants 173 patients with a history of low back pain for at least one year, Oswestry disability index of at least 30 points, and degenerative changes in one or two lower lumbar spine levels (86 patients randomised to surgery). Patients were treated from April 2004 to September 2007.

Interventions Surgery with disc prosthesis or outpatient multidisciplinary rehabilitation for 12-15 days.

Main outcome measures The primary outcome measure was the score on the Oswestry disability index after two years. Secondary outcome measures were low back pain, satisfaction with life (SF-36 and EuroQol EQ-5D), Hopkins symptom check list (HSCL-25), fear avoidance beliefs (FABQ), self efficacy beliefs for pain, work status, and patients’ satisfaction and drug use. A blinded independent observer evaluated scores on the back performance scale and Prolo scale at two year follow-up.

Results The study was powered to detect a difference of 10 points on the Oswestry disability index between the groups at two years. At two years there was a mean difference of −8.4 points (95% confidence interval −13.2 to −3.6) in favour of surgery. In the analysis of prespecified secondary outcomes, there were significant differences in favour of surgery for low back pain (mean difference −12.2, −21.3 to −3.1), patients’ satisfaction (63% (n=46) v 39% (n=26)), SF-36 physical component score (mean difference 5.8, 2.5 to 9.1), self efficacy for pain (mean difference 1.0, 0.2 to 1.9), and the Prolo scale (mean difference 0.9, 0.1 to 1.6). There were no significant differences in return to work, SF-36 mental component score, EQ-5D, fear avoidance beliefs, Hopkins symptom check list, drug use, and the back performance scale. One serious complication of leg amputation occurred during surgical revision of a polyethylene dislodgement. The drop-out rate was 20% (34) and the crossover rate was 6% (5).

Conclusions Surgical intervention with disc prosthesis for chronic low back pain resulted in a significantly greater improvement in the Oswestry score compared with rehabilitation, but this improvement did not clearly exceed the prespecified minimally important clinical difference between groups of 10 points, and the data are consistent with a wide range of differences between the groups, including values well below 10 points. The potential risks of surgery and the substantial amount of improvement experienced by a sizeable proportion of the rehabilitation group also have to be incorporated into overall decision making.

Trial registration NCT 00394732.

Introduction

Low back pain is common with a lifetime prevalence of about 59-84%.1 Although relatively few patients develop chronic low back pain with disability, it represents extensive individual, societal, and financial problems. In patients who have had longstanding or serious disabling low back pain in the previous 12 months, a third will improve and have less serious problems during the following year.2 Most patients who develop chronic low back pain, however, stay in this condition for years.

Fusion of assumed symptomatic segments in patients with chronic low back pain has been used widely, but randomised studies comparing fusion with non-surgical treatment indicate that a rehabilitation programme can be as effective as surgery. Four randomised studies have compared lumbar fusion with non-operative treatment.3 4 5 6 7 Fritzell et al found that fusion significantly reduced pain and disability compared with usual care.3 Brox et al and Fairbank et al compared fusion with a multidisciplinary rehabilitation programme focusing on cognitive intervention and supervised exercise.4 5 6 7 They found similar improvement in pain and disability in the two intervention groups.

During the past 25 years, insertion of a disc prosthesis has become an option. In the four published randomised studies comparing disc prosthesis with fusion, the clinical outcome of disc prosthesis was at least equivalent to that of fusion.8 9 10 11 As surgical procedures should be evaluated against non-surgical methods,12 13 we compared the efficacy of disc prosthesis and a multidisciplinary rehabilitation programme.

Methods

Study design

A multicentre study conducted at five university hospitals in Norway included patients with low back pain and degenerative discs. Patients were included in the period between April 2004 and May 2007 and were treated within three months after randomisation. They were randomised in blocks through a website hosted by the medical faculty. Allocation was concealed from all people involved in the trial. A coordinating secretary not involved in the treatment could access randomisation details on the internet. The patient and the treating unit were informed about the allocation shortly after randomisation. Randomisation was stratified by centre (the five university hospitals) and by whether or not the patient had had previous surgery (microsurgical decompression). Independent observers collected and entered data. Storage of data was allowed by the Norwegian data inspectorate.
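
Block randomisation stratified by centre and previous surgery can be sketched as follows. This is only an illustration of the general technique: the block size of 4, the seed, the centre labels, and all function and class names are assumptions, as the paper does not report the block size or how the website generated the sequence.

```python
import random

ARMS = ("surgery", "rehabilitation")


def make_block(rng, block_size=4):
    """Return one shuffled block with equal numbers of each arm.

    block_size is an assumed value and must be a multiple of 2.
    """
    block = list(ARMS) * (block_size // len(ARMS))
    rng.shuffle(block)
    return block


class StratifiedBlockRandomiser:
    """Keeps a separate sequence of permuted blocks for every stratum."""

    def __init__(self, seed=0, block_size=4):
        self.rng = random.Random(seed)
        self.block_size = block_size
        self.pending = {}  # stratum -> allocations left in the current block

    def allocate(self, centre, previous_surgery):
        """Return the next allocation for the stratum (centre, previous surgery)."""
        stratum = (centre, previous_surgery)
        if not self.pending.get(stratum):
            # Start a new block when the stratum is new or its block is used up.
            self.pending[stratum] = make_block(self.rng, self.block_size)
        return self.pending[stratum].pop()
```

Within each stratum, every completed block contains the same number of patients in each arm, which keeps the groups balanced by centre and by history of previous surgery.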

Participants

Patients were referred from all health regions in Norway. They were recruited from local hospitals or primary care to their nearest university hospital as usual without any supplemental recruitment attempt. An orthopaedic surgeon and a specialist in physical medicine and rehabilitation examined the patients before enrolment. All patients were informed about the procedures and told that neither of the treatment methods was documented as superior to the other. Eligible patients were aged 25-55 and had low back pain as the main symptom for at least a year, structured physiotherapy or chiropractic treatment for at least six months without sufficient effect, a score of at least 30 on the Oswestry disability index, and degenerative intervertebral disc changes in L4/L5 or L5/S1, or both. Degeneration had to be restricted to the two lower levels. We evaluated the following degenerative changes: at least 40% reduction of disc height,14 Modic changes type I or II, or both,15 high intensity zone in the disc,16 and morphological changes classified as changes in signal intensity in the disc of grade 3 or 4.17 The disc was classified as degenerative if the first criterion alone or at least two changes were found on magnetic resonance imaging. The discs were independently classified by two observers (orthopaedic surgeon/radiologist). When there was disagreement, a third observer classified the images and the outcome was decided by simple majority.
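
The imaging rule and the majority decision described above can be expressed as a short sketch. The function and parameter names are hypothetical, and the encoding of each finding (a percentage, two yes/no flags, and a signal intensity grade) is an assumption made for illustration.

```python
from collections import Counter


def disc_is_degenerative(height_reduction_pct, modic_type_1_or_2,
                         high_intensity_zone, signal_grade):
    """Apply the stated rule to one observer's reading of one disc."""
    height_criterion = height_reduction_pct >= 40
    changes = [
        height_criterion,
        modic_type_1_or_2,
        high_intensity_zone,
        signal_grade in (3, 4),
    ]
    # Degenerative if the first criterion is met alone,
    # or if at least two of the four changes are present.
    return height_criterion or sum(changes) >= 2


def consensus(first, second, third=None):
    """Two observers decide; a third breaks a disagreement by simple majority."""
    if first == second:
        return first
    if third is None:
        raise ValueError("observers disagree: a third reading is needed")
    return Counter([first, second, third]).most_common(1)[0][0]
```

For example, a disc with 45% height reduction and no other finding is classified as degenerative, whereas a disc with Modic changes alone is not.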

Degeneration of the facet joints was not an exclusion criterion, but symptoms of nerve root involvement were. Details of further inclusion and exclusion criteria, compliance with randomisation, and drop-outs are listed in appendix 1 on bmj.com.

Study interventions

Rehabilitation—The rehabilitation was based on the treatment model described by Brox et al4 and consisted of a cognitive approach and supervised physical exercise. A team of physiotherapists and specialists in physical medicine and rehabilitation directed the multidisciplinary treatment. Other specialists, such as psychologists, nurses, social workers, etc, could complete the team. The intervention was standardised through three seminars and videos and lecture sessions for the treatment providers before the study. The intervention was organised as an outpatient treatment in groups at the involved university hospitals and lasted for about 60 hours over three to five weeks. The treatment consisted of lectures and individual discussions focusing on relevant topics (such as anatomy and the physiological aspects of the back, diagnostics, imaging, pain medicine, normal reactions, coping strategies, family and social life, and working conditions), daily workouts for increased physical capacity (endurance, strength, coordination, and specific training of the abdominal muscles and the lumbar multifidus muscles), and challenging patients’ thoughts about, and participation in, physical activities previously labelled as not recommended (such as lifting, jumping, vacuum cleaning, dancing, and ball games). Follow-up consultations were conducted at six weeks, three months, six months, and one year after the intervention. See appendix 2 on bmj.com for detailed description of the rehabilitation intervention.

Surgery—The surgical intervention consisted of replacement of the degenerative intervertebral lumbar disc with an artificial lumbar disc (ProDisc II, Synthes Spine). The ProDisc consists of three pieces: two metal endplates of cobalt chromium molybdenum alloy and a core (made from ultrahigh molecular weight polyethylene) fixed to the inferior endplate after insertion. Surgeons used a Pfannenstiel or a para-median incision with a retroperitoneal approach. A nearly complete discectomy was performed with removal of the cartilaginous endplates and a sufficient release of the posterior longitudinal ligament to ensure disc space mobilisation. A fluoroscope was used to ensure that the prosthesis was placed in the midline and sufficiently towards the posterior edge of the vertebrae. All hospitals participating in the study used the same artificial lumbar disc device. One surgeon at each centre had main responsibility for the operation (five centres and five surgeons). Surgeons were required to have inserted at least six disc prostheses before performing surgery in the study. There were no major postoperative restrictions. Patients were not referred for postoperative physiotherapy, but at six weeks’ follow-up they could be referred for physiotherapy if required, emphasising general mobilisation and non-specific exercises.

Outcome measures

The primary outcome measure was pain and disability measured with version 2.0 of the Oswestry disability index,18 translated into Norwegian and tested for psychometric properties by Grotle et al.19 (Scores range from 0 to 100, with lower scores indicating less severe pain and disability.) Secondary outcomes included low back pain (measured with a visual analogue scale, ranging from 0 (no pain) to 100 (worst pain imaginable)) and general health status assessed with SF-36 (scores range from 0 to 100, higher scores correspond to better health status)20 21 and EQ-5D (scores range from −0.59 to 1 (1 equals perfect health)).22 For psychological variables we included emotional distress (Hopkins symptom check list (HSCL-25), scores range from 1 to 4, with lower scores indicating less severe symptoms) and the fear avoidance belief questionnaire (FABQ) for work and physical activity (scores range from 0 to 42 (work) and from 0 to 24 (physical), with lower scores indicating less severe symptoms).23 24 Self efficacy beliefs for pain were registered by a subscale of the arthritis self efficacy scale (scores range from 1 to 10 and are summarised and divided by 5; lower scores indicate uncertainty in managing the pain).25 Work status was evaluated as suggested by Fritzell et al.3 (See table A in appendix 3 on bmj.com.) We calculated a net back to work rate by subtracting the patients who stopped working from the patients who went back to work. We also recorded satisfaction with the result of the treatment on a seven point Likert scale and satisfaction with care on a five point Likert scale.26 In addition, daily consumption of drugs was registered. Patients attended for follow-up visits at six weeks, three and six months, and one and two years (the main end point of follow-up was at two years). At two years we sent a questionnaire including the most important outcome measures to 29 of the 34 patients who were lost to follow-up (see table B in appendix 3 on bmj.com).

At the two year follow-up, two independent observers blinded to treatment evaluated patients using the back performance scale (consists of five tests with a score ranging from 0 to 15, worst possible)27 and the Prolo scale (consists of functional and economic parts, which are summed to a worst score of 2 and a best score of 10).28 Patients were informed before this session not to reveal the treatment received, and had tape placed on their abdominal wall to hide the scarring from the operation. We also carried out a full health economic analysis, which will be reported elsewhere.

Statistical considerations

The trial was designed to have 80% power to detect a significant difference of at least 10 points in change in the mean Oswestry disability index score between the intervention groups at two year follow-up.5 Baseline standard deviation was estimated at 18.18 Considering these assumptions and adding 25% for a multicentre study design and 30% for possible drop-outs, we estimated we required 180 patients.
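
As a rough check on the scale of these figures, the unadjusted sample size for comparing two means can be sketched with the usual normal approximation. The choice of this formula is an assumption, as the paper does not state which one was used, and the sketch does not try to reproduce the final figure of 180 because the way the 25% and 30% allowances were combined is not described.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per arm for a two sided comparison of two means.

    Normal approximation: n = 2 * ((z_alpha/2 + z_beta) * sd / delta) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Difference of 10 points and standard deviation of 18, as stated in the text.
base = n_per_group(delta=10, sd=18)
print(base, 2 * base)  # 51 per arm, 102 in total before any allowances
```

Under these assumptions the unadjusted requirement is 51 patients per arm, so the reported target of 180 reflects the additional allowances for the multicentre design and for drop-outs.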

Planned analyses

The main statistical analysis was in the intention to treat population at one and two year follow-up. According to our protocol the analysis was performed with the assumption that patients who dropped out had no improvement after drop-out (last value carried forward). We also determined if different centres had different outcomes. We used χ2 test or Fisher’s exact test to analyse categorical variables and independent two sided t test or analysis of variance to analyse continuous variables. A significance level of 5% was used throughout. All statistical analyses were performed with SPSS version 16.0. We did not adjust for significantly different baseline scores.
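
The last value carried forward rule can be illustrated with a minimal sketch. The function name and the example scores are hypothetical; visits are assumed to be listed in time order, with `None` marking a missed visit.

```python
def last_value_carried_forward(scores):
    """Fill missing visits (None) with the most recent observed score."""
    filled = []
    last = None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


# Hypothetical patient who dropped out after the second visit.
print(last_value_carried_forward([42, 30, None, None]))  # [42, 30, 30, 30]
```

A patient who drops out straight after baseline therefore keeps the baseline score at every later visit, which corresponds to the assumption of no improvement after drop-out.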

Unplanned analyses (analyses not recorded in the original protocol)

We conducted a per protocol analysis for the primary outcome variable (score on Oswestry disability index). Consistent with criteria from the Food and Drug Administration,8 we considered an individual change in score of at least 15 points from baseline to two year follow-up as a minimal important change. A deterioration of 6 points in the score was considered a “change for the worse.”29 We calculated the number needed to treat with confidence intervals.30 A mixed model analysis was used to evaluate the effect of each efficacy variable over time and between groups. In the mixed model patients were not excluded from the analysis of an efficacy variable if the variable was missing at some, but not all, time points after baseline. In the additional analysis (categorical or ordinal data at two year follow-up), missing data were not replaced. Significantly different baseline scores were not adjusted for in the longitudinal model. Each outcome variable was adjusted for the baseline values of the variable.
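
The two individual thresholds can be written as a small classification sketch. The function name and labels are hypothetical, and reading "a deterioration of 6 points" as an increase of 6 points or more is an assumption.

```python
def classify_change(baseline, follow_up):
    """Label an individual change in Oswestry disability index score."""
    improvement = baseline - follow_up  # lower scores mean less disability
    if improvement >= 15:
        return "minimal important change"
    if improvement <= -6:
        return "change for the worse"
    return "neither threshold reached"


print(classify_change(42, 20))  # minimal important change
print(classify_change(40, 47))  # change for the worse
```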

Results

Of the 605 patients screened for eligibility, 173 were included in the study and treated between April 2004 and September 2007 (86 with surgery and 87 with rehabilitation) (fig 1). The drop-out rate from inclusion to two year follow-up was 20% (n=34) (15% (n=13) in the surgical arm and 24% (n=21) in the rehabilitation arm). Five patients (6%) crossed over from rehabilitation to surgery, but none crossed from surgery to rehabilitation. Of the 34 patients lost to follow-up, 26 answered a questionnaire two and a half to five years after treatment (see table B in appendix 3 on bmj.com).
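
The percentages in this paragraph follow directly from the counts. The check below assumes that the denominators are the numbers randomised to each arm (86 and 87) and that the crossover rate is expressed against the rehabilitation arm; the helper name is hypothetical.

```python
def pct(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


randomised = {"surgery": 86, "rehabilitation": 87}
dropped_out = {"surgery": 13, "rehabilitation": 21}

overall = pct(sum(dropped_out.values()), sum(randomised.values()))
by_arm = {arm: pct(dropped_out[arm], randomised[arm]) for arm in randomised}
crossover = pct(5, randomised["rehabilitation"])

print(overall, by_arm, crossover)
```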


Fig 1 Enrolment, randomisation, and follow-up of study patients, showing cumulative values at two years. *Not enough degenerative change to satisfy inclusion criteria (n=29), degenerative changes in more than two lower lumbar discs (n=80), Oswestry disability index score too low (n=88), did not want to undergo surgery (n=28), did not want to participate in rehabilitation (n=20), too much general pain (n=20), had previously been through similar training programme (n=26), and other reasons (n=135; deformity, psoriatic arthritis, language problems, coccygodynia, age, fracture, previous operation, tumour, spondylodiscitis, hip arthrosis). †Coronary heart disease and heart attack some days after randomisation (n=1); obvious exclusion criterion discovered some days after randomisation (n=5; earlier large abdominal operation (n=1), not enough degenerative change to satisfy inclusion criteria (n=2), degenerative changes in more than two lower lumbar discs (n=2)). ‡One patient received one of two disc prostheses because of bleeding. §One patient with serious vascular complication underwent secondary leg amputation and was lost to follow-up. ¶One patient crossed over between 6 months and 1 year and five patients between 1 year and 2 years. Five patients underwent surgery with disc prosthesis and one patient with fusion. **Two patients underwent surgery with instrumented fusion before two year follow-up. ††One patient excluded because of missing baseline values and follow-up values.

Patients’ characteristics

Most baseline characteristics were similar in the two treatment groups (table 1). Low back pain score and SF-36 mental health subscores, however, were significantly worse in the rehabilitation group than in the surgery group.

Table 1

 Baseline characteristics in patients with low back pain and degenerative disc randomised to disc prosthesis surgery or rehabilitation. Figures are numbers (percentage) unless stated otherwise


Surgical treatment and complications

Of the patients randomised to surgery, 25 (33%) underwent two level surgery. Median surgical time was 165 minutes (range 72-570 minutes) and median blood loss was 310 ml (range 50-6000 ml) (table 2). Four patients had bleeding of more than 1500 ml.

Table 2

 Treatment and complications in 77 patients with low back pain and degenerative disc randomised to disc prosthesis surgery


Six patients (8%) had complications resulting in impairment at two year follow-up, and the reoperation rate was 6.5% (n=5) (table 2). One patient had a serious complication: at the three month follow-up, the polyethylene inlay was found to be dislodged. During revision surgery, injury to the left common iliac artery led to compartment syndrome resulting in a lower leg amputation. One patient reported retrograde ejaculation at one year follow-up. At two year follow-up, two patients reported sensory loss in the thigh and two patients reported new radicular pain. In addition, one patient had an arterial thrombosis of the dorsalis pedis artery, which temporarily resulted in a slightly colder foot. Table 2 presents further complications. Two patients had an additional fusion and two patients had partial resection of the spinous processes because of persistent back pain.

Primary outcome

Planned analyses according to protocol

The mean change in Oswestry disability index score from baseline to two year follow-up was 20.8 (95% confidence interval 16.4 to 25.2) in the surgery group and 12.4 (8.5 to 16.3) in the rehabilitation group (table 3). The mean treatment effect (difference between groups) at two year follow-up was −8.4 (−13.2 to −3.6) in the intention to treat analysis (last value carried forward). Subgroup analysis showed no differences in the main outcome variable between centres and level(s) operated on.

Table 3

 Planned analysis of primary outcome in patients with low back pain and degenerative disc randomised to disc prosthesis surgery or rehabilitation. Mean (SD) outcome values on Oswestry disability index (ODI) at 12 and 24 months and treatment effect


Unplanned analyses

In the mixed model analysis, the Oswestry score improved significantly more in the surgical group than in the rehabilitation group at all time points, in both the intention to treat (fig 2) and per protocol analyses (table 4). The mean change from baseline to two year follow-up was 22.5 (intention to treat) (95% confidence interval 18.5 to 26.4) in the surgery group and 15.6 (intention to treat) (11.7 to 19.5) in the rehabilitation group. The mean treatment effect (difference between groups) at two year follow-up was 6.9 (2.1 to 11.7) in the intention to treat analysis. In an analysis in which the patient with lower leg amputation was given worst score in the group, the difference between the groups remained significant (P<0.001). Some 70% (n=51) of the patients in the surgery group and 47% (n=31) of the patients in the rehabilitation group had an improvement in Oswestry score of at least 15 points (P<0.006) (intention to treat). The number needed to treat was 4.4 (2.6 to 14.5). Worsening of low back pain was experienced by 11% (n=8) of the surgical group and 9% (n=6) of the rehabilitation group. Subgroup analysis showed no differences in the main outcome variable between centres and level(s) operated on.
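
The number needed to treat is the reciprocal of the difference in the proportions of patients who improved. The sketch below uses a Wald interval for the difference and inverts its limits. The denominators of 73 and 66 are assumptions inferred from the reported percentages (51 is about 70% of 73 and 31 is about 47% of 66), and the Wald method is also an assumption, as the paper cites a reference rather than stating the formula.

```python
from math import sqrt
from statistics import NormalDist


def nnt_with_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Number needed to treat from a difference in proportions (Wald interval)."""
    p_a, p_b = events_a / n_a, events_b / n_b
    diff = p_a - p_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower, upper = diff - z * se, diff + z * se
    # Inverting the limits is only meaningful when the interval excludes 0.
    return 1 / diff, 1 / upper, 1 / lower


# Assumed denominators: 73 (surgery) and 66 (rehabilitation).
nnt, low, high = nnt_with_ci(51, 73, 31, 66)
print(f"NNT {nnt:.1f} ({low:.1f} to {high:.1f})")
```

With these assumed denominators the result is close to the reported 4.4 (2.6 to 14.5).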

Table 4

 Unplanned analysis of primary outcome in patients with low back pain and degenerative disc randomised to disc prosthesis surgery or rehabilitation. Mean (SD) outcome values on Oswestry disability index (ODI) at follow-up and treatment effect (difference (95% confidence interval)), minus values indicating larger improvement in outcome with surgery


Fig 2 Primary outcome variable within intention to treat mixed model analysis. Mean difference in Oswestry disability index (ODI) was 6.9 points at two year follow-up, P<0.001 (adjusted for baseline index)

Secondary outcomes

Planned analyses according to protocol

Low back pain, SF-36 physical summary, and patients’ satisfaction improved significantly more in the surgical group than the rehabilitation group at two year follow-up (table 5). The mean difference between the groups in change from baseline to two year follow-up was −12.2 (95% confidence interval −21.3 to −3.1) for low back pain and 5.8 (2.5 to 9.1) for SF-36 physical summary. On the seven point global rating scale at two years, 63% (46) of patients in the surgery group and 39% (26) in the rehabilitation group (P=0.005 for difference between treatment groups) considered themselves completely recovered or much improved. Self efficacy for pain favoured the surgical group. SF-36 mental summary, EQ-5D, FABQ work and physical, HSCL-25, return to work, and drug consumption did not differ at two year follow-up. At the start of the study, 28% (46) of patients were at work full or part time; at two year follow-up, this had increased to 56% (n=74). There was a “net back to work” rate of 31% (n=21) in the surgical group and 23% (n=15) in the rehabilitation group (P=0.31) (table 5). Scores on the back performance scale did not differ significantly between the groups (−0.8, −1.8 to 0.2; P=0.10). The Prolo sum score favoured the surgical group, with a mean difference of 0.9 (0.1 to 1.6; P=0.019).
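
The comparison of patients' satisfaction can be checked with a Pearson χ2 test on a 2×2 table. The sketch below is an approximation under stated assumptions: the denominators of 73 and 66 are inferred from the reported percentages, no continuity correction is applied, and the function name is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for a 2x2 table (1 df, no continuity correction).

    Rows are the two groups; columns are satisfied and not satisfied.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value


# 46 of an assumed 73 surgical patients v 26 of an assumed 66 rehabilitation patients.
stat, p = chi_square_2x2(46, 73 - 46, 26, 66 - 26)
print(f"chi-square {stat:.2f}, P={p:.3f}")
```

Under these assumptions the P value is of the same order as the reported P=0.005.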

Table 5

 Planned analysis of secondary outcomes in patients with low back pain and degenerative disc randomised to disc prosthesis surgery or rehabilitation. Mean (SD) values at 12 and 24 months (unless stated otherwise) and treatment effect


Unplanned analyses

In the mixed model analysis, low back pain (table 6), SF-36 physical summary (table 8), and EQ-5D, HSCL-25, and self efficacy for pain (table 9) improved significantly more in the surgical group than the rehabilitation group at all time points. The mean difference between the groups in change from baseline to two year follow-up for low back pain was −12.7 (95% confidence interval −21.1 to −4.2, table 6) and SF-36 physical summary 4.3 (0.8 to 7.9, table 8). Further analyses are shown in tables 7, 8, and 9.

Table 6

 Unplanned analysis in secondary outcome in patients with low back pain and degenerative disc randomised to disc prosthesis surgery or rehabilitation. Mean (SD) outcome values for back pain* at follow-up and treatment effect (difference (95% confidence interval))

Table 7

  Unplanned analysis in secondary outcomes in patients with low back pain and degenerative disc randomised to disc prosthesis surgery or rehabilitation. Mean (SD) outcome values for SF-36*

Table 8

  Unplanned analysis in secondary outcome in patients with low back pain and degenerative disc randomised to disc prosthesis surgery or rehabilitation. Mean (SD) outcome values for physical and mental component summary scores on SF-36* at follow-up and treatment effect (difference (95% confidence interval))

Table 9

 Secondary outcomes in patients with low back pain and degenerative disc randomised to disc prosthesis surgery or rehabilitation. Mean (SD) outcome values on EQ-5D, HSCL-25, FABQ, and self efficacy at follow-up and treatment effect (difference (95% confidence interval))


Discussion

This randomised trial comparing disc prosthesis with multidisciplinary rehabilitation showed a significant difference in the primary outcome variable (Oswestry disability index after two years) in favour of surgery. The difference between groups of 8.4 points on the index (with intention to treat analysis) at two year follow-up, however, was smaller than the difference of 10 points that the study was designed to detect. As evident in the confidence intervals, the data are consistent with a wide range of differences between the groups, including values well below 10 points. There is, as far as we know, no agreement on the size of the clinically important difference between two treatment groups. As an alternative we can assess the proportion of patients achieving a clinically meaningful improvement.31 By using a clinically meaningful improvement for an individual patient of 15 points on the Oswestry disability index,8 70% (n=51) of patients in the surgical group and 47% (n=31) of those in the rehabilitation group achieved at least this improvement (intention to treat). We will publish data on the estimated minimal clinically important change elsewhere, but the changes are in agreement with recommendations from FDA studies. As there is no consensus on how large a difference between groups must be to be of clinical importance, it is impossible to conclude whether the effect found in our study is of clinical importance. As such a decision must be made before a new treatment can be recommended in clinical practice, our study underlines the need for such a consensus agreement.

The change in the Oswestry disability index score in our study is comparable with those seen in previous studies. In our study, the mean score was reduced by 29% (12.4 points) in the rehabilitation group (intention to treat analysis). Brox et al4 found a similar reduction of 29% (12.0 points) at one year follow-up, while Fairbank et al6 and Fritzell et al3 observed a smaller reduction at two year follow-up (8.7 and 5.5 points, respectively). In our study, there was a mean reduction in score of 50% (20.8 points) in the surgical arm (intention to treat analysis). Similar reductions have been reported in other studies,8 9 11 though Zigler et al used the “chiropractor version” of the Oswestry index.32 This questionnaire has not been sufficiently validated and consequently it is difficult to compare the outcome.18

It could be argued that patients who withdrew after randomisation or dropped out during or after treatment had a superior or inferior outcome. We therefore sent a questionnaire to such patients. The nine patients who withdrew after surgery experienced a reduction in Oswestry score of 30.2 (SD 4.5) points. The six who withdrew after rehabilitation had a reduction of 11.8 (SD 3.0), and the 11 patients who withdrew without treatment had no change (1.0 (SD 4.5) points) (see table B in appendix 3 on bmj.com). This might support the assumption of no improvement in outcome after drop-out, justifying use of the last value carried forward analysis.

Most changes in secondary variables measuring disability and pain favoured surgical treatment, though there were no significant differences between groups in FABQ work, FABQ physical, SF-36 mental health, EQ-5D, HSCL-25, drug consumption, return to work, and the back performance scale in the main analysis. In the surgical group we found a similar “net back to work” rate as reported by Fritzell et al.3 Nevertheless, it has been argued that sick leave, to a large extent, is influenced by factors outside the domain of medical and therapeutic interventions.33 The somewhat smaller difference between groups in the back performance scale than in the Oswestry disability index might be explained by differences in psychometric properties between the outcome measurements or by patients overstating the effect in a subjective questionnaire.

Strengths and limitations

Our study has several strengths. It was randomised and had few patients who crossed over to the other treatment regimen. In addition, an independent research assistant collected the data, the observers at the two year evaluation were blinded, the interventions were standardised, and the financing of the study was public. Choosing magnetic resonance imaging criteria for inclusion could be a strength or a limitation. To our knowledge, there are no specific criteria to determine which degenerative changes should be operated on. When designing the study we wanted the inclusion of patients across centres to be as uniform as possible, so that the same population was treated, although this could reduce the external validity of the study. It could also have led to inclusion of more severely degenerated discs in our study than in other studies.8 9

One limitation of our study is the lack of a placebo or sham group. The regression to the mean and the natural resolution of chronic low back pain must also be considered in both groups. When balancing a non-operative regimen with an operative treatment, there is probably a difference in placebo effect that is difficult to untangle from the treatment effect.34 35 36 37 The placebo effect might be higher in the surgical group, although the possible placebo effect of rehabilitation over several weeks with personal contact with a therapist should not be underestimated. Furthermore, it could be argued that the patients included in the study wanted surgery, but the number of patients not wanting the rehabilitation programme was similar to the number of patients not wanting surgery (see fig 1 and appendix 1 on bmj.com). Brox et al found no difference in treatment effect between patients who did and did not “believe” in surgery,4 5 and a recent study found no significant relation between baseline expectations and follow-up scores.38 On the other hand, “expectation being fulfilled” might be a predictor of global outcome.38 During the inclusion process, we emphasised the advantages and disadvantages of the two treatment options and that neither treatment is documented as superior to the other. It is still possible, however, that patients in the rehabilitation group found themselves faced with “more of the same.” The lack of routine rehabilitation in the surgical arm could be another limitation in the study. We wanted to avoid the postoperative treatment containing elements from the rehabilitation programme. Hence, patients received only general advice when they were discharged from the hospital and received no rehabilitation in the first weeks after surgery. At six weeks, however, patients could be referred if required to a physiotherapist at their home for functional mobilisation and general muscle training.

Furthermore, some surgical patients underwent a second operation but repeat rehabilitation was not considered. Patients did not request a second chance for rehabilitation, though they were advised during follow-up consultations. Another weakness in our study is the difference in compliance between groups and the high drop-out rate. This difference in adherence to the protocol probably leads to an underestimate of the true effect of surgery, especially in the intention to treat analysis. In similar studies comparing surgery with rehabilitation, the drop-out rates were similar to ours.6 39 40 41 The patients we included in our study were highly selected, with one or two level degenerative changes and good general health. Thus, our results are valid only in similar patients. Furthermore, we examined several secondary outcome variables, which could lead to the detection of differences by chance. Although we conducted several unplanned analyses (not recorded in the original protocol), in common with similar studies, we consider them an important asset to our data. Lately, similar studies have applied repeated measurements by using mixed models.40 Using unplanned analyses could be considered a weakness, but our findings in these analyses support our main analyses and strengthen our conclusion. Nevertheless, caution should be used in interpreting the results of non-prespecified analyses.

Potential harms of disc prosthesis surgery

Surgery carries a risk of serious complications, as seen in one of our patients. In a review by Inamasu et al, the perioperative vascular injury rate for anterior lumbar interbody fusion was 0-18% (mean 3%).42 This is an important drawback of surgery. No major differences in complication rates between insertion of a disc prosthesis and fusion have been found in a randomised setting.8 9 11 The short term reoperation rate in our study was 6.5% (n=5) and the vascular injury rate was 6.5% (n=5) (table 2). Although vascular complications are reported, serious consequences like amputation and mortality are rare.43 Recently Kurtz et al looked at the rates of short term revision and mortality after total disc replacement.43 They found reoperation rates similar to those for anterior fusion surgery and hip arthroplasty. Four retrospective studies have reported long term reoperation rates of up to 13%.44 45 46 47 Data on the anterior revision rate of the prosthesis are difficult to extract from these studies, but the rate seems considerably lower. The potential long term revision rate, with a higher complication rate on revisions, needs to be considered.48

Earlier addressed but unresolved questions are the incidence of adjacent level degeneration after total disc replacement and distinct characteristics of patients associated with good outcome. Some studies have examined these issues but more information is needed.49 50 51 In a univariate analysis we found indications that patients with Modic I or II changes have a superior result in the surgical arm and that patients with high Oswestry scores seem to be more suitable for rehabilitation. A full multivariate analysis of good outcomes will be published soon to answer these questions. Another important issue is the incidence of degeneration in the facet joints of the operated level. An analysis of adjacent level degeneration and degeneration of the operated level in addition to a full health economic analysis will be published later.

The total blood loss and operation time were higher in our study than in similar studies. The learning curve might be quite flat, and perhaps the participating surgeons should have carried out disc prosthesis surgery in more patients before the start of the study. Using a separate surgeon to expose the disc (an access surgeon) might also have reduced the blood loss and operation time. Blumenthal et al and Zigler et al performed one level surgery, while a third of our patients underwent two level surgery.8 9 This could explain some of the increased blood loss and operation time in our study. Because of the complexity of the surgery and the risk of serious complications, we think this kind of surgery should be confined to a few specialist centres with experienced spine surgeons and available vascular surgeons. A high quality rehabilitation programme should also be available.

Our study was not designed to evaluate specific mechanisms of reduction of pain and disability. Possible explanations for the pain reduction are removal of the disc in the surgical group and better coping in the rehabilitation group, but the patients were heterogeneous and probably had a mixed aetiology difficult to separate. Even though we did not have a control group, the mixed causes of chronic low back pain, the association of surgery with potentially serious complications, and the considerable improvement in the rehabilitation group suggest that it is reasonable to consider a rehabilitation programme before surgery.

What is already known on this topic

  • In patients with chronic low back pain, compared with fusion, the clinical outcome with disc prosthesis has been at least equivalent

  • Compared with multidisciplinary rehabilitation, improvements in disability and pain are similar

What this study adds

  • Surgery with disc prosthesis resulted in a significantly greater improvement in scores on the Oswestry disability index and variables measuring disability and pain, although the difference in Oswestry score between groups was lower than the study was designed to detect

  • There were no differences in return to work or in several outcomes measuring mental health

Notes

Cite this as: BMJ 2011;342:d2786

Footnotes

  • We thank the patients participating in the study; Coast Hospital for Physical Medicine and Rehabilitation, Stavern, for videos and material for lectures for the rehabilitation intervention; Hege Andresen at St Olavs Hospital, Trondheim, for data coordination; Per Farup at St Olavs Hospital, Trondheim, for organising the web randomisation system; Astrid Woodhouse and Kirsti Vanvik from St Olavs Hospital for performing the two year control; and Lucy Hyatt for paid editorial assistance.

  • The Norwegian Spine Study Group

  • University Hospital North Norway, Tromso (eight patients): Odd-Inge Solem (department of orthopaedic surgery), Jens Munch-Ellingsen (department of neurosurgery), and Franz Hintringer, Anita Dimmen Johansen, Guro Kjos (department of physical medicine and rehabilitation).

  • Trondheim University Hospital, Trondheim (21 patients): Hege Andresen, Helge Rønningen, Kjell Arne Kvistad (national centre for spinal disorders, department of neurosurgery), Bjørn Skogstad, Janne Birgitte Børke, Erik Nordtvedt, Gunnar Leivseth (multidiscipline spinal unit, department of physical medicine and rehabilitation).

  • Haukeland University Hospital, Bergen (64 patients): Sjur Braaten, Turid Rognsvåg, Gunn Odil Hirth Moberg (Kysthospitalet in Hagevik, department of orthopaedic surgery), Jan Sture Skouen, Lars Geir Larsen, Vibeche Iversen, Ellen H Haldorsen, Elin Karin Johnsen, Kristin Hannestad (Outpatient Spine Clinic, department of physical medicine and rehabilitation).

  • Stavanger University Hospital, Stavanger (27 patients): Endre Refsdal (department of orthopaedic surgery).

  • Oslo University Hospital, Oslo (53 patients): Vegard Slettemoen, Kenneth Nilsen, Kjersti Sunde, Helenè E Skaara (department of orthopaedics), Anne Keller, Berit Johannessen, Anna Maria Eriksdotter (department of physical medicine and rehabilitation).

  • Contributors: All authors had full access to the data, were responsible for study concept and design, and critically revised the manuscript for important intellectual content. CH, LGJ, KS, OPN, MR, and OG acquired the data, which were analysed and interpreted by CH, LGJ, KS, OPN, JIB, LS, IR, and OG. CH drafted the manuscript. CH and LS did the statistical analysis. CH, LGJ, KS, OPN, JIB, MR, and OG provided administrative, technical, or material support. CH, KS, OPN, JIB, IR, LS, and OG supervised the study. CH is guarantor.

  • Funding: The study was funded by the South Eastern Norway Regional Health Authority and EXTRA funds from the Norwegian Foundation for Health and Rehabilitation, through the Norwegian Back Pain Association.

  • Competing interests: All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

  • Ethical approval: The study was evaluated and approved by the regional committees for medical research ethics in east Norway and all participants gave written informed consent. We did not obtain participants’ informed consent for data sharing, but the presented data are anonymised and the risk of identification is low.

  • Data sharing: Dataset available from the corresponding author at christian.hellum{at}medisin.uio.no.

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non commercial and is otherwise in compliance with the license. See: http://creativecommons.org/licenses/by-nc/2.0/ and http://creativecommons.org/licenses/by-nc/2.0/legalcode.

