Introduction

Multicenter gathering of data in quality registers offers unique possibilities to obtain information from large groups of patients [1]. Data collection in registers is usually less stringent than in research projects. For example, data are not collected by the researchers, diagnoses may be uncertain, the quality of data is unknown, data may be missing, and the study design is often retrospective, even if data are collected prospectively [2].

Another major limitation of register studies involving questionnaires is the often relatively high proportion of non-responders. In two recent studies based on the Swedish National Spine register ‘Swespine’, the non-responders included a higher proportion of men, were slightly younger, smoked more often, and had a longer preoperative duration of leg and back pain [3, 4]. Therefore, the responders may not be completely representative of the surgically treated population.

In individuals treated for lumbar degenerative disorders, others have compared follow-up data from the initial questionnaire responders to non-responders and did not find any major discrepancies between the two groups [5, 6]. However, the proportions of initial non-responders were low, 22% and 12%, respectively, and these studies were done at single hospitals. Thus, these results still leave uncertainty regarding the interpretation of data from a nation-wide register like Swespine.

We chose another way to elucidate the validity of Swespine: comparing register data with a locally performed prospective observational study, the lumbar disc herniation study ‘LDHS’, which has an unusually high follow-up rate.

We hypothesized that patients treated with surgery for lumbar disc herniation in Swespine and LDHS have (1) similar preoperative characteristics, (2) similar outcome at 1 and 2 years, and (3) similar relation between patient satisfaction and ODI/VAS leg pain, despite a large variation in data collection between the two cohorts.

Materials and methods

Swespine

Swespine [1] was introduced in 1993 to obtain prospective data on outcome following spine surgery in Sweden and has been developed continually since its start. The coverage (the proportion of clinics performing spine surgery that use the register) is approximately 90%. The completeness (the proportion of patients reported to Swespine at the time of surgery) is approximately 75%. ‘Opt-out’ is used, which means that unless the patient actively declines participation, the surgeon registers diagnosis, type of surgical procedure, length of hospitalization, and any complications occurring during the in-patient stay.

The patient is asked to fill out a questionnaire without the assistance of health care personnel, including data on anthropometrics, co-morbidities, smoking status, medication, work, sick leave, and patient-reported outcome measurements (PROMs), before surgery and 1, 2, 5, and 10 years postoperatively. At the 1-year follow-up, the patients are also asked whether any complications occurred during the first 3 months after surgery. All follow-up questionnaires are mailed to the patient. One reminder is sent.

Lumbar disc herniation study

The prospective observational study, LDHS, included patients admitted for lumbar disc herniation surgery at Södersjukhuset, Stockholm, Sweden. Details of the study have been presented earlier [7]. Inclusion criteria were: one-level paramedian or central lumbar disc herniation, radiculopathy with corresponding MRI finding, a duration of more than 2 months, or earlier if requiring hospitalization, a need for 25% or more of sick leave or similar disability, and an age of 18 years or older. Exclusion criteria were ‘cauda equina impaction’, previous lumbar spine surgery, intra- or extraforaminal localization of the disc herniation, and conditions that could affect follow-up or outcome interpretation, such as difficulty understanding Swedish, severe psychiatric illness, drug abuse, or co-morbidity. The LDHS used the same questionnaires as Swespine and handled them in the same way, but with additional follow-ups at 6 weeks and 6 months. Research nurses ensured the data quality, and frequent reminders for follow-ups were sent to reach a high response rate.

We applied the same inclusion and exclusion criteria to Swespine as used in the LDHS, except for those not available in Swespine: ‘corresponding MRI finding’, ‘duration of more than 2 months, or earlier if requiring hospitalization’, a ‘need for 25% or more of sick leave or similar disability’, and information about ‘conditions that could affect follow-up’. At least 1 of the 7 outcome variables at 1 and 2 years postoperatively was required for inclusion.

Flowcharts of the patients in the two cohorts are shown in Figs. 1 and 2.

Fig. 1

Flow charts of patients in Swespine

Fig. 2

Flow charts of patients in the Lumbar Disc Herniation Study

Patient-reported outcome measures

The following PROMs were used: back and leg pain, measured from 0 (no pain) to 100 (maximum pain) on a ‘Visual Analog Scale’ (VAS) [8]; the ‘Oswestry Disability Index’ (ODI), a questionnaire for rating disability and function related to back problems [9], giving a score from 0 (best) to 100 (worst); ‘EuroQol 5-dimensions’ (EQ-5D), a form measuring quality of life, translated to an index between −0.59 (worst) and 1.00 (best) [10]; and ‘Short Form 36’ (SF-36), in which each of the eight sections ranges from 0 (worst) to 100 (best) [11]. In this study, SF-36 was used only at baseline.

At the 1- and 2-year follow-ups, identical questionnaires were mailed to the patient, also including questions on ‘Satisfaction’ and ‘Global Assessment’ (GA) of back- and leg pain [12]. The Satisfaction question is formulated: ‘Are you satisfied with the result of the surgery?’ The three alternative answers were dichotomized into ‘Satisfied’, vs ‘Uncertain’ and ‘Dissatisfied’. The GA of back-pain question was formulated ‘How is your back pain today, when compared to before surgery?’ The five alternative choices were dichotomized into ‘Pain free’ and ‘Much better’ vs ‘Somewhat better’, ‘Unchanged’, or ‘Worse’. The GA question for leg pain was treated in the same way.

Reoperations and complications

Reoperations on the same spinal segment and side were reported by the surgeon performing the reoperation and counted if within 2 years. Complications occurring during the in-patient stay were reported by the surgeon and complications occurring during the first 3 months were reported by the patient at the 1-year follow-up.

Statistics

Descriptive data are presented as mean (SD) or number (%). The study groups were compared using Student’s t test (equal variances) or the Welch–Satterthwaite t test (unequal variances) for continuous variables, and the chi-square test for categorical variables after dichotomization.
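As an illustration of the group comparisons named above, the following sketch computes the test statistics by hand. The analyses in this study were run in SPSS; the function names here are hypothetical, the chi-square statistic is shown for a 2×2 table without continuity correction, and P values (which require the t and chi-square distributions) are left out.

```python
import math


def mean_var(xs):
    """Sample size, mean, and unbiased sample variance."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return n, m, v


def student_t(a, b):
    """Student's t statistic with pooled variance (assumes equal variances)."""
    na, ma, va = mean_var(a)
    nb, mb, vb = mean_var(b)
    df = na + nb - 2
    pooled = ((na - 1) * va + (nb - 1) * vb) / df
    return (ma - mb) / math.sqrt(pooled * (1 / na + 1 / nb)), df


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    na, ma, va = mean_var(a)
    nb, mb, vb = mean_var(b)
    sa, sb = va / na, vb / nb
    t = (ma - mb) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal variances, as when a large register cohort is compared with a small study cohort, the Welch version is the more cautious choice.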

Outcome analysis was conducted with analysis of covariance for continuous variables and with logistic regression for categorical variables. Adjustments were made for variables with significant differences between the groups at baseline.
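One way to write down the adjusted models described above is given here. The notation is an assumption introduced for illustration: $Y_i$ is a continuous outcome for patient $i$, $S_i$ a dichotomized outcome, $G_i$ the cohort indicator (for example 0 = Swespine, 1 = LDHS), and $X_{ik}$ the baseline variables that differed significantly between the groups.

```latex
% Analysis of covariance for a continuous outcome
Y_i = \beta_0 + \beta_1 G_i + \sum_{k} \gamma_k X_{ik} + \varepsilon_i

% Logistic regression for a dichotomized outcome
\operatorname{logit} \Pr(S_i = 1) = \alpha_0 + \alpha_1 G_i + \sum_{k} \delta_k X_{ik}
```

In this formulation, $\beta_1$ is the adjusted mean difference between the cohorts and $\exp(\alpha_1)$ the adjusted odds ratio.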

VAS leg pain and ODI may be considered the most important outcome variables after lumbar disc herniation surgery. Satisfaction with the surgical result and its relation to these two variables were compared between the cohorts. We used ‘Receiver Operating Characteristics’ (ROC) curves to define the optimal cut-off value for VAS leg pain and ODI at 1 year, taken as the point that maximizes Youden’s index (sensitivity + specificity − 1) [13]. The ‘Area Under the Curve’ (AUC) is the area under the ROC curve and represents the accuracy of the test variable in correctly classifying the external criterion. We used the DeLong test for group comparisons of AUCs [14].
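The cut-off selection can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: it assumes that a patient is classified as satisfied when the score is at or below the candidate cut-off, the function names are hypothetical, and the DeLong comparison of AUCs is not included.

```python
def roc_points(scores, satisfied):
    """(cut-off, sensitivity, specificity) for every observed score.

    A score at or below the cut-off is classified as 'satisfied'.
    """
    pos = sum(satisfied)
    neg = len(satisfied) - pos
    points = []
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, satisfied) if y == 1 and s <= c)
        fp = sum(1 for s, y in zip(scores, satisfied) if y == 0 and s <= c)
        points.append((c, tp / pos, 1 - fp / neg))
    return points


def youden_cutoff(scores, satisfied):
    """Cut-off that maximizes Youden's index J = sensitivity + specificity - 1."""
    return max(roc_points(scores, satisfied), key=lambda p: p[1] + p[2] - 1)


def auc(scores, satisfied):
    """AUC as the probability that a satisfied patient scores lower than a
    non-satisfied patient; ties count one half."""
    pos = [s for s, y in zip(scores, satisfied) if y == 1]
    neg = [s for s, y in zip(scores, satisfied) if y == 0]
    wins = sum(1.0 if p < n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

When several cut-offs share the same maximal index, this sketch returns the lowest one; other tie-breaking rules are equally defensible.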

SPSS version 23 was used for the statistical analysis. Patients with missing data were excluded separately for each analysis. The level of significance was set to P < 0.05.
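Excluding missing data analysis by analysis means that each analysis uses every patient who has the variables that particular analysis needs. A minimal sketch, with hypothetical field names and made-up records:

```python
def available_cases(records, variables):
    """Keep the records in which every variable needed for this analysis is present."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]


# Made-up example records; the field names are illustrative only.
records = [
    {"id": 1, "odi_1y": 18, "vas_leg_1y": 20},
    {"id": 2, "odi_1y": None, "vas_leg_1y": 35},
    {"id": 3, "odi_1y": 25, "vas_leg_1y": None},
]

odi_sample = available_cases(records, ["odi_1y"])       # patients 1 and 3
vas_sample = available_cases(records, ["vas_leg_1y"])   # patients 1 and 2
```

The sample size therefore varies between analyses, which is why the number of patients can differ from one outcome variable to the next.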

Ethical approval

The study was approved by the Ethics Committee at the Karolinska University Hospital Huddinge, Stockholm (number 310/98) and the Stockholm Regional Ethical Review Board (number 2012/206-31/1).

Results

When comparing the Swespine and the LDHS cohorts at baseline, there were statistically significant differences in age, co-morbidities, belief in return to work, retirement and disability pension, duration of back pain > 3 months, preoperative VAS back pain, SF-36 Role Physical and SF-36 Role Emotional (Table 1).

Table 1 Baseline characteristics in the Swespine and the LDHS groups

After surgery, the mean outcome improvement, satisfaction, and global assessment of leg and back pain were similar in both groups, except for VAS back pain (unadjusted analysis) at 1 and 2 years (Table 2). After adjustment for baseline differences in analyses of covariance, VAS back pain lost significance, whereas ODI and EQ-5D reached significance, although the differences in absolute values were very small (data not shown). For the absolute outcome values (Table 3), there were no statistically significant differences at 1 year; at 2 years, ODI and EQ-5D were more favourable in Swespine. Adjustment for baseline differences in analyses of covariance increased the significance, but the absolute differences were still small (data not shown). There were no evident trends of outcome change between 1 and 2 years within the two cohorts, despite a larger loss to follow-up in the Swespine cohort (Table 3).

Table 2 Postoperative results at 1 and 2 years, shown as the mean change (Δ) from pre- to postoperative values (SD) for VAS leg and back pain, ODI, and EQ-5D. For the questions on ‘Satisfaction’ and ‘Global assessment’ (GA), the proportion of ‘satisfied’ and ‘pain free’ or ‘much better’, respectively, is shown
Table 3 Absolute outcome values at 1 and 2 years for the continuous variables in the two groups

Reoperations and other complications did not differ between the groups, except for surgical site infection, being more common in the LDHS cohort (Table 4).

Table 4 Number (%) of patients reporting complications and reoperations in the Swespine and the LDHS cohorts

The ROC curves for LDHS and Swespine are shown in Fig. 3. The cut-off for being ‘Satisfied’ with the surgical result at the 1-year follow-up was 24 for VAS leg pain and 22 for ODI in Swespine, and 26 and 27, respectively, in the LDHS cohort (Table 5). The ‘Area Under the Curve’ (AUC) shows ‘Good’ and ‘Excellent’ accuracy for Swespine and LDHS, respectively, at 1 year, with no significant differences between the groups (Table 5) [15].

Fig. 3

‘ROC’ curves show the sensitivity and 1 − specificity at all values of VAS leg pain and ODI at 1 year, corresponding to the variable Satisfaction dichotomized into satisfied vs uncertain or dissatisfied

Table 5 Values of the optimal cut-off points, calculated with Youden’s Index and the sensitivity and specificity at these points for each variable, i.e., below the cut-off points the patients regard themselves as satisfied in the Swespine and the LDHS cohort

Non-response analysis

The baseline variables of the patients missing at 1 and 2 years in Swespine were compared to those of the responders at the same time points. At 1 year, the largest differences were that non-responders, compared to responders, were more often male (60 vs 54%), younger (41.6 vs 44.8 years), smokers (28 vs 19%), and unemployed (16 vs 11%), more often had heavy physical work strain (32 vs 24%) and non-elective admission (20 vs 17%), and less often received retirement pension (5 vs 9%). At 2 years, similar differences between non-responders and responders were seen.

Discussion

Results from studies with a high follow-up rate, for example, randomized controlled trials or cohort studies, take a long time and much effort to achieve. Gathering data from national quality registers or other multicenter registers often results in vast amounts of data within relatively few years. As discussed, the problem at hand is frequently lower data quality and non-response. If these shortcomings could be handled properly, much would be gained in terms of earlier knowledge and clinical improvements, at lower cost and with less use of scientific resources.

A direct comparison between Swespine register data and a much smaller cohort study with a high degree of follow-up after surgery for lumbar disc herniation showed some differences in baseline variables, but similar clinical outcome.

Any differences in VAS leg pain, ODI, VAS back pain, and EQ-5D at baseline, as well as in the change from baseline to 1 and 2 years postoperatively, were non-significant or small and below the reported minimal clinically important differences (MCID) [16,17,18]. Despite a loss to follow-up in the Swespine cohort between 1 and 2 years, no change in the relation to the data in the LDHS cohort was found.

The LDHS group was on average 3 years younger than the Swespine group. The reason for this is unknown, but it may explain some of the difference in the variables concerning co-morbidity, retirement, and belief in working ability. More co-morbidity, a higher proportion of retired patients, and a higher age could lead to a less favorable outcome in Swespine, but this was not reflected in the outcome data comparisons at 1 and 2 years [19, 20].

The PASS (Patient Acceptable Symptom State), i.e., the level of pain or discomfort a patient regards as acceptable, has recently been suggested to be 22 for ODI, based on the ‘Core Outcome Measures Index score’ (COMI) as external criterion [21]. The absolute values of the outcome in our study are almost identical in the two cohorts at 1 and 2 years. In both cohorts, the mean ODI values (18–21) are lower than the proposed PASS value of 22. As these values are means of the whole study populations, they also include the patients who were not satisfied, implying that the means of the satisfied patients are even lower. Patients in our two cohorts had similar relationships between the external criterion, ‘Satisfaction’, and the level of ODI and VAS leg pain. The LDHS group had slightly higher cut-off points than the Swespine group, possibly as a result of its slightly higher initial back-pain value, enabling a somewhat higher pain tolerance. The differences were, however, clearly lower than the suggested MCID values [16,17,18].

The outcomes of ‘Satisfaction’ and ‘GA back- and leg pain’ were also very similar in the two groups, showing around 75% of acceptable results, a figure that seems to be rather reproducible in degenerative low back surgery through the decades [19, 22, 23]. As the aim of this study is to show whether there is any real difference between the groups, the unadjusted outcome values are presented; adjustment for variables with differences at baseline did not alter the results essentially.

Regarding complications and reoperations, the only significant difference is an approximately 50% higher relative infection rate postoperatively in the LDHS cohort. This finding is probably an effect of more efficient reporting in the LDHS group after discharge, as there are both 6-week and 6-month follow-ups. Thromboembolic events may also appear after discharge from hospital, but are fewer and often more serious, thus probably better reported. Dural or neural damage is reported during hospitalization time irrespective of study group.

The study design has some limitations. The study cohorts are prospectively collected, but probably with less stringent inclusion criteria in Swespine. Attempts were made to apply similar inclusion and exclusion criteria retrospectively in the two cohorts. As some information is not included in Swespine, the cohorts can never be perfectly matched. Even though we had no MRI information in Swespine, we assume that MRI findings corresponded to symptoms also in the Swespine cohort. In addition, baseline data on duration of leg pain and sick leave were without significant differences between the groups, indicating that the cohorts were comparable.

In the non-response analysis, the missing groups at 1 and 2 years were compared to the responders at the same time points. The absolute differences between these groups were mainly small, although the relative difference in smoking and unemployment, for instance, was about 50%. Altogether, this could result in a less favorable outcome for the non-responders, leaving the responders with a possibly better outcome. However, the results of this study do not show any difference of clinical importance in any outcome variable. Previous studies in various areas have usually found that responders seem to be healthier, with a better outcome than non-responders [24,25,26]. However, this does not seem to be the case after surgical treatment for lumbar degenerative disorders [5, 6]. Our data further support this, even though we cannot be certain that our findings are valid in other lumbar disorders. In addition, we note that the lack of a trend of change in outcome between the 1- and 2-year follow-ups in both cohorts indicates that a 1-year follow-up may be sufficient when studying outcome after lumbar discectomy.

Conclusion

Higher response rates than seen in Swespine are not needed to achieve reasonably representative data on patient-reported outcome for large cohorts. Two-year patient-reported outcome data do not seem to add additional information in the follow-up of lumbar disc hernia surgery.