Original Article
Reliability of the Goutallier Classification in Quantifying Muscle Fatty Degeneration in the Lumbar Multifidus Using Magnetic Resonance Imaging

https://doi.org/10.1016/j.jmpt.2013.12.010

Abstract

Objective

The purpose of this study was to investigate the reliability of the Goutallier classification system (GCS) for grading muscle fatty degeneration in the lumbar multifidus (LM) using magnetic resonance imaging (MRI) examinations.

Methods

Lumbar spine MRI scans were obtained retrospectively from the radiology department imaging system. Two examiners (a chiropractic diagnostic imaging resident and a board-certified chiropractic radiologist with 30 years of experience) independently graded each LM at the L4/5 and L5/S1 intervertebral levels. ImageJ pixel analysis software (version 1.47; National Institutes of Health, Bethesda, MD) was used independently by 2 observers to quantify the percent fat of the LM and to allow correlation between LM percent fat and GCS grade. Twenty-five subject MRIs were randomly selected. Magnetic resonance imaging scans were included if they were obtained using a 1.5 T imaging system and were excluded if there was evidence of spinal infection, tumor, fracture, or postoperative changes. For all tests, P < .05 was defined as significant.
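The abstract does not state how percent fat was derived from the pixel data. As a minimal sketch of one common approach, assuming a region of interest (ROI) drawn around the LM and an intensity cutoff above which a pixel is counted as fat, percent fat can be computed as the share of ROI pixels at or above that cutoff. The function name `percent_fat`, the mask representation, and the threshold value are hypothetical and are not taken from the study or from ImageJ.

```python
def percent_fat(image, roi_mask, fat_threshold):
    """Return the percentage of ROI pixels at or above fat_threshold.

    image: 2D list of pixel intensities.
    roi_mask: 2D list of booleans, True where the pixel lies inside the ROI.
    fat_threshold: hypothetical intensity cutoff separating fat from muscle.
    """
    # Collect the intensities of pixels that fall inside the ROI.
    roi_pixels = [
        value
        for image_row, mask_row in zip(image, roi_mask)
        for value, inside in zip(image_row, mask_row)
        if inside
    ]
    if not roi_pixels:
        raise ValueError("ROI mask selects no pixels")
    fat_pixels = sum(1 for value in roi_pixels if value >= fat_threshold)
    return 100.0 * fat_pixels / len(roi_pixels)


# Toy 3x3 image; the top-left pixel is excluded from the ROI.
toy_image = [
    [10, 200, 30],
    [220, 40, 250],
    [15, 20, 240],
]
toy_mask = [
    [False, True, True],
    [True, True, True],
    [True, True, True],
]
toy_percent = percent_fat(toy_image, toy_mask, fat_threshold=150)
```

Because pixel intensities depend on the scanner and acquisition parameters, any fixed cutoff in a sketch like this would need to be chosen per image or per protocol.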

Results

Intraobserver reliability in grading LM fat ranged from a weighted κ (κw) of 0.71 to 0.93. Mean interobserver reliability in grading LM fat ranged from a κw of 0.76 to 0.85. There was a significant (P < .001) correlation between LM percent fat and GCS grade. Furthermore, interobserver reliability in determining percent fat ranged from an intraclass correlation coefficient of 0.73 to 0.90.
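For readers unfamiliar with the statistic, the following is a minimal sketch of how a weighted κ can be computed for two sets of ordinal grades on the 0 to 4 GCS scale. The abstract does not say whether linear or quadratic weights were used, so the weighting scheme here is an assumption exposed as a parameter, and the function name `weighted_kappa` and the sample ratings are illustrative only.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories=5, weighting="linear"):
    """Cohen's weighted kappa for two equal-length lists of integer grades.

    Grades must be integers in range(n_categories). The weighting scheme
    ("linear" or "quadratic") is an assumption; the study does not state it.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")

    n = len(ratings_a)
    k = n_categories
    power = 1 if weighting == "linear" else 2

    # Observed joint proportions of (grade from A, grade from B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the largest gap.
        return (abs(i - j) / (k - 1)) ** power

    observed_dis = sum(
        disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k)
    )
    expected_dis = sum(
        disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k)
    )
    if expected_dis == 0:
        raise ValueError("kappa is undefined when both raters use one grade only")
    return 1.0 - observed_dis / expected_dis


# Illustrative grades only; these are not data from the study.
session_1 = [0, 1, 2, 3]
session_2 = [0, 1, 2, 4]
kappa_linear = weighted_kappa(session_1, session_2)
```

The same function applies to intraobserver agreement (one examiner, two sessions) and interobserver agreement (two examiners, one session); only the pairing of the input lists changes.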

Conclusions

In this study, the GCS was reliable in grading LM fatty degeneration and correlated positively with a quantified percent fat value. In addition, ImageJ software (National Institutes of Health) was reliable between raters when quantifying LM percent fat.

Section snippets

Selection of Subject Images

After approval from the Logan University Institutional Review Board, a sample of lumbar spine MRI scans was obtained retrospectively through the department of radiology picture archiving and communication system. The picture archiving and communication system data set consists of images obtained at both hospitals and imaging centers. In an effort to homogenize the sample, MRI examinations were only included if they were performed using a 1.5T GE system (GE Healthcare, Milwaukee, WI) and were

Results

The mean age of the scan subjects was 55.84 years (range, 27-82). The mean GCS grade was 1.90 (standard deviation, 0.966; minimum, 0; maximum, 4). Of the 400 assigned GCS grades, there were 34 grade 0, 82 grade 1, 198 grade 2, 62 grade 3, and 24 grade 4. Intra- and interobserver reliability statistics for assigning a GCS grade are displayed in Tables 1 and 2. Table 3 summarizes the correlation data between respective GCS grades and LM percent fat. Interobserver reliability in quantifying LM percent fat
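The reported mean and standard deviation of the GCS grades can be checked directly against the grade counts given in this snippet. The short sketch below uses only those counts; the variable names are illustrative, and the use of the sample (n − 1) standard deviation is an assumption about how the reported value was calculated.

```python
import math

# Grade counts as reported in the Results snippet.
grade_counts = {0: 34, 1: 82, 2: 198, 3: 62, 4: 24}

total_grades = sum(grade_counts.values())
mean_grade = sum(grade * count for grade, count in grade_counts.items()) / total_grades

# Sum of squared deviations from the mean, weighted by how often each grade occurs.
squared_deviations = sum(
    count * (grade - mean_grade) ** 2 for grade, count in grade_counts.items()
)

# Assumption: the reported value is the sample standard deviation (n - 1 denominator).
sample_sd = math.sqrt(squared_deviations / (total_grades - 1))

print(total_grades, round(mean_grade, 2), round(sample_sd, 3))
```

Under that assumption the counts reproduce the reported total of 400 grades, the mean of 1.90, and the standard deviation of 0.966.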

Discussion

To the best of the authors' knowledge, this is the first work to study the reliability of this grading system in grading LM fatty degeneration.

The primary aim of this study was to determine the intra- and interobserver reliability of the GCS for grading LM fatty degeneration using MRI scans. In support of our hypothesis, the data indicated that there was substantial to almost perfect intra- and interobserver reliability when assigning a GCS grade. Our secondary aim was to correlate the GCS grade

Limitations

The sample is small, and a larger study might shift the results in either direction. Because the MRI scans were obtained from multiple imaging centers, they inherently have different pixel intensity profiles. In an effort to mitigate this, only MRI scans obtained using a GE 1.5T scanner were included. However, multiple 1.5T GE scanners were used to generate the patient images in this study, and differences between individual scanners and their acquisition parameters may influence the quantitative measures.

Conclusion

The data from this study show substantial to almost perfect reliability of the GCS for grading fatty degeneration of the LM. The establishment of a reliable and convenient scale for grading LM fatty degeneration using MRI will facilitate future clinical research evaluating the relationship between LM fatty degeneration and the segmental instability of LBP, along with its response to stabilization interventions.

Funding Sources and Potential Conflicts of Interest

No funding sources or conflicts of interest were reported for this study.

Contributorship Information

  • Concept development (provided idea for the research): PB, YM, AW, BH, and NK.

  • Design (planned the methods to generate the results): PB, YM, AW, BH, and NK.

  • Supervision (provided oversight, responsible for organization and implementation, and writing of the manuscript): PB, YM, AW, BH, and NK.

  • Data collection/processing (responsible for experiments, patient management, organization, or reporting data): PB, YM, AW, BH, and NK.

  • Analysis/interpretation (responsible for statistical analysis, evaluation,

References (29)

  • MA Fischer et al. Quantification of muscle fat in patients with low back pain: comparison of multi-echo MR imaging with single-voxel MR spectroscopy. Radiology (2013)

  • L Kalichman et al. Indices of paraspinal muscles degeneration: reliability and association with facet joint osteoarthritis: feasibility study. J Spinal Disord Tech (2013)

  • JH Min et al. Association between radiculopathy and lumbar multifidus atrophy in magnetic resonance imaging. J Back Musculoskelet Rehabil (2013)

  • D Goutallier et al. Fatty muscle degeneration in cuff ruptures. Pre- and postoperative evaluation by CT scan. Clin Orthop Relat Res (1994)

Cited by (87)

  • Fat content in lumbar paravertebral muscles: Quantitative and qualitative analysis using dual-energy CT in correlation to MR imaging

    2022, European Journal of Radiology

    Citation excerpt: The fat content was evaluated with CT or MRI using either semiquantitative grading methods or histograms with the application of a threshold value. Goutallier classification system (GCS) has been established as a reliable indicator for assessing PMFI on CT and MRI [15-19]. However, this semiquantitative visual grading technique is unlikely to accurately identify PMFI [20].
