Research Article, Special Issue

Reliability Analysis of Deep Learning Algorithms for Reporting of Routine Lumbar MRI Scans

Kai-Uwe Lewandrowski, Narendran Muraleedharan, Steven Allen Eddy, Vikram Sobti, Brian D. Reece, Jorge Felipe Ramírez León and Sandeep Shah
International Journal of Spine Surgery December 2020, 14 (s3) S98-S107; DOI: https://doi.org/10.14444/7132
Kai-Uwe Lewandrowski, MD
1 Staff Orthopaedic Spine Surgeon, Center for Advanced Spine Care of Southern Arizona and Surgical Institute of Tucson, Tucson, Arizona

Narendran Muraleedharan, BASME
2 Aptus Engineering, Inc, Scottsdale, Arizona, and Multus Medical, LLC, Phoenix, Arizona

Steven Allen Eddy, MD
3 Multus Medical, LLC, Phoenix, Arizona

Vikram Sobti, MD, MBA
4 Innovative Radiology, PC, River Forest, Illinois

Brian D. Reece, MD
5 The Spine and Orthopedic Academic Research Institute, Lewisville, Texas

Jorge Felipe Ramírez León, MD
6 Fundación Universitaria Sanitas, Bogotá, Colombia; Research Team, Centro de Columna, Bogotá, Colombia; Centro de Cirugía de Mínima Invasión (CECIMIN), Clínica Reina Sofía, Bogotá, Colombia

Sandeep Shah, MSEE, MBA
7 Multus Medical, LLC, Phoenix, Arizona

ABSTRACT

Background: Artificial intelligence could provide more accurate magnetic resonance imaging (MRI) predictors of successful clinical outcomes in targeted spine care.

Objective: To analyze the level of agreement between lumbar MRI reports created by a deep learning neural network (RadBot) and the radiologists' MRI reading.

Methods: The compressive pathology definitions were extracted from the radiologist's lumbar MRI reports of 65 patients with a total of 383 levels. For the central canal, the classes were (0) no disc bulge/protrusion/canal stenosis, (1) disc bulge without canal stenosis, (2) disc bulge resulting in canal stenosis, and (3) disc herniation/protrusion/extrusion resulting in canal stenosis. Both neural foramina were assessed as either (0) neural foraminal stenosis absent or (1) neural foraminal stenosis present. Reporting criteria for the pathologies at each disc level and, when available, the grading of severity were extracted, and the Natural Language Processing model was used to generate a verbal and written report. The RadBot report was analyzed in the same way as the MRI report by the radiologist. MRI reports were investigated by dichotomizing the data into 2 categories: normal and stenosis. The quality of the RadBot test was assessed by determining its sensitivity, specificity, and positive and negative predictive value as well as its reliability with the calculation of the Cronbach alpha and Cohen kappa using the radiologist MRI report as a gold standard.

Results: The authors found a RadBot sensitivity of 73.3%, a specificity of 88.4%, a positive predictive value of 80.3%, and a negative predictive value of 83.7%. The reliability analysis revealed the Cronbach alpha as 0.772. The highest individual values of the Cronbach alpha were 0.629 and 0.681 when compared to the MRI report by the radiologist, rendering values of 0.566 and 0.688, respectively. Analysis of interobserver reliability rendered an overall kappa for the RadBot of 0.627. Analysis of receiver operating characteristics (ROC) showed a value of 0.808 for the area under the ROC curve.

Conclusions: Deep learning algorithms, when used for routine reporting in lumbar spine MRI, showed excellent quality as a diagnostic test that can distinguish the presence of neural element compression (stenosis) at a statistically significant level (P < .0001) from a random event distribution. This research should be extended to validated and directly visualized pain generators to improve the accuracy and prognostic value of the routine lumbar MRI scan for favorable clinical outcomes with intervention and surgery.

Level of Evidence: 3.

Clinical Relevance: Validity, clinical teaching, and evaluation study.

  • artificial intelligence
  • deep neural network learning
  • magnetic resonance imaging
  • spinal pathologies
  • reliability analysis

INTRODUCTION

Minimally invasive and endoscopic transforaminal decompression techniques have become popular in spinal surgery due to technological advances.1–6 There has been a substantial increase in the number of these types of procedures being carried out in ambulatory surgery centers.7 The advantages of endoscopic transforaminal decompression are fewer postoperative complications, a shorter interval for return to work and social reintegration,2,8–11 faster postoperative narcotic independence, and an overall reduced utilization of painkillers.2,12 The latter problem is of significance in light of the opiate abuse epidemic in the United States,13 more rigorous medical necessity assessment,14 and a demand for value-based health care measures to serve the aging baby-boomer population.13–15 In this context, a conclusive preoperative diagnostic work-up of lumbar radiculopathy is crucial, as decompression is often limited to a small area of 1 affected neuroforamen and lateral recess.16–18

In this article, the authors report on the feasibility of using a deep learning algorithm for routine reporting in spine magnetic resonance imaging (MRI). The ultimate objective of this research is to improve the accuracy and predictive value of the MRI scan when applied to the preoperative planning of targeted minimally invasive and endoscopic spinal surgeries. These targeted procedures often ignore the majority of pathologies reported on routine lumbar MRI scans of patients with injuries or degenerative conditions of the spine and focus treatment only on the validated painful pathologies. The preoperative MRI scan is an integral part of the diagnostic work-up besides history, physical examination, electrodiagnostic studies, and confirmative diagnostic spinal injections.15–19 The need to improve the diagnostic accuracy of the routine MRI scan has been well recognized by surgeons who reported on the correlation between intraoperatively observed findings as gold standard references and reflected on the use of the MRI scan as a predictor of the need for appropriate treatment and its clinical outcomes.20–24 The MRI scan, in many respects, has become the ultimate gatekeeping test in the medical necessity determination of many spinal surgeries. Diagnostic inaccuracies related to false-negative diagnoses, therefore, have a significant impact on patient care and often lead to overutilization in other subspecialties of spine care, such as pain management. From a cost-benefit point of view, these inappropriate points-of-care interactions often translate into wasted treatments if considered ineffective by patients who continue to look for care but should be treated definitively by addressing the structural problems associated with their primary spinal pain generator. Therefore, improving the value of the MRI scan as a predictor of clinical outcomes with appropriate surgical treatments is not only central but also critical to applying the value-based approach to spine care. 
In this study, the authors report on the results of the sensitivity, specificity, and positive and negative predictive value; Cronbach alpha reliability; and interobserver Cohen kappa analysis of MRI reports produced by deep learning neural network algorithms when compared to routine reporting provided by the radiologist.

MATERIALS AND METHODS

The premise of this research and development is based on the ability of deep learning neural network models to identify features in MRI data that represent varying intensities or severities of degenerative pathologies or injuries in patients. The feasibility of this artificial intelligence (AI) approach was demonstrated in another study included in this journal's special focus issue. In this investigation, the same team of authors reports the statistics of the accuracy and reliability analysis of the AI approach to lumbar MRI reporting, with the radiologist's MRI report considered the gold standard for the comparison analysis. All patients in this consecutive case series provided informed consent, and institutional review board approval was obtained (CEIFUS 106-19). Written informed consent was obtained from the patients for publication of this report and any accompanying images.

Patients and Training Data

The deep learning neural network models analyzed 65 lumbar MRI scans from the same number of patients, comprising a total of 383 levels. The DICOM data were ordered by the first author and were obtained from 1 MRI imaging center in patients with painful lumbar degenerative spine disease or injuries. The data set included the disc levels T12–L1, L1–L2, L2–L3, L3–L4, L4–L5, and L5–S1 for each patient. The average age of the 65 patients was 42.2 years with a standard deviation of 11.8 years. There were 51.5% male and 48.5% female patients. The MRI imaging centers provided radiology reports prepared and approved by board-certified radiologists. Each radiologist was required to present a reading for the presence or absence of annular bulging25 (circumferential, paracentral, posterior), disc herniation26 (extrusion, protrusion, sequestration, fragmentation), central canal stenosis27–29 (compromise of the thecal sac with presence or absence of ventral epidural fat), and foraminal stenosis30 (compromise of the left, right, or both neural foramina and nerve roots) for each intervertebral level.

Extraction of MRI Data

For each disc location, the following classes were extracted from the radiologist report for the central canal: (0) no disc bulge/protrusion/canal stenosis, (1) disc bulge without canal stenosis, (2) disc bulge resulting in canal stenosis, and (3) disc herniation/protrusion/extrusion resulting in canal stenosis. One of the following classes was also extracted for each of the left and right neural foramina: (0) neural foraminal stenosis absent or (1) neural foraminal stenosis present. An example is shown in Table 1, where at the L3–L4 location in the side-by-side comparison, the radiologist read was converted to class (3) for the central canal, (0) for the left neural foramen, and (1) for the right neural foramen, and was matched by the algorithm model. For the purpose of the reliability analysis, these findings were dichotomized into 2 simple categories: normal and stenosis.
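The class scheme and the dichotomization step can be sketched in a few lines of Python. This is only an illustration: the text does not spell out which classes map to "stenosis", so the rule below (central canal class 2 or 3, or either foramen graded 1) is an assumption, and the names `LevelReading` and `dichotomize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelReading:
    """One intervertebral level as coded from a report (hypothetical container)."""
    level: str           # e.g. "L3-L4"
    central_canal: int   # 0, 1, 2, or 3 as defined in the text
    left_foramen: int    # 0 = stenosis absent, 1 = stenosis present
    right_foramen: int   # 0 = stenosis absent, 1 = stenosis present


# ASSUMPTION: only canal classes that mention "resulting in canal stenosis"
# count as stenotic; class 1 (bulge without canal stenosis) counts as normal.
STENOTIC_CANAL_CLASSES = frozenset({2, 3})


def dichotomize(reading: LevelReading) -> str:
    """Collapse the coded classes of one level into 'normal' or 'stenosis'."""
    if reading.central_canal not in (0, 1, 2, 3):
        raise ValueError("central_canal must be 0, 1, 2, or 3")
    for grade in (reading.left_foramen, reading.right_foramen):
        if grade not in (0, 1):
            raise ValueError("foraminal grades must be 0 or 1")
    stenotic = (
        reading.central_canal in STENOTIC_CANAL_CLASSES
        or reading.left_foramen == 1
        or reading.right_foramen == 1
    )
    return "stenosis" if stenotic else "normal"


# The L3-L4 example from the text: canal class 3, left foramen 0, right foramen 1.
example = LevelReading("L3-L4", central_canal=3, left_foramen=0, right_foramen=1)
print(example.level, dichotomize(example))
```

Applying the same function to both the radiologist-derived and the RadBot-derived codes would yield the paired normal/stenosis labels that the cross tabulations compare.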

Table 1

Level distribution of spinal disc spaces read by the radiologist and AI deep network learning.

Statistical Analysis

For the clinical outcome analysis, descriptive statistics (mean and standard deviation), cross-tabulation statistics of sensitivity, specificity, positive and negative predictive value, and measures of association were computed for 2-way tables using IBM SPSS Statistics software (version 27.0). The Pearson χ² and the likelihood-ratio χ² tests were used as statistical measures of association. The Multus RadBot MRI sensitivity (true positive [TP] rate) of accurately grading and detecting symptomatic nerve root compression was calculated on the basis of the grading by the board-certified radiologist as the percentage of patients with stenosis confirmed by the radiologist who were correctly identified by the Multus RadBot as having symptomatic neural compression (MRI positives). False negatives (FN) were patients with neural compression identified by the radiologist whose Multus RadBot MRI grading was negative for stenosis (MRI negatives). Therefore, the diagnostic Multus RadBot MRI sensitivity for detecting neural element compression was calculated as follows:

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

The Multus RadBot MRI specificity (true negative [TN] rate) of accurately detecting the absence of symptomatic nerve root compression as demonstrated by the radiologist's MRI reading was calculated as the percentage of patients correctly identified as not having symptomatic neural compression. False positives (FP) were defined as Multus RadBot MRI positives without the radiologist having identified the neural compression. Therefore, diagnostic Multus RadBot MRI specificity of predicting a neural element compression was calculated as follows:

$$\text{Specificity} = \frac{TN}{TN + FP}$$

The positive and negative predictive values of the Multus RadBot reading of the lumbar MRI scan for agreeing with the reading of the board-certified radiologist with the presence or absence of compressive pathology (normal or stenosis) were calculated as follows:

$$\text{Positive predictive value} = \frac{TP}{TP + FP} \qquad \text{Negative predictive value} = \frac{TN}{TN + FN}$$
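The four cross-tabulation measures defined above follow directly from the cells of a 2-way table. The sketch below is a plain-Python illustration rather than the SPSS procedure used in the study; the function name `diagnostic_metrics` and the counts in the usage lines are hypothetical and are not the study data.

```python
from typing import NamedTuple


class DiagnosticMetrics(NamedTuple):
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> DiagnosticMetrics:
    """Compute the four measures from a 2x2 table against a reference read.

    tp: test positive, reference positive    fp: test positive, reference negative
    fn: test negative, reference positive    tn: test negative, reference negative
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("cell counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("a row or column total is zero; a measure is undefined")
    return DiagnosticMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


# Hypothetical counts for illustration only (NOT taken from the article's tables).
metrics = diagnostic_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

Note that sensitivity and specificity are conditioned on the reference read, whereas the predictive values are conditioned on the test read, so the latter also depend on how common stenosis is in the sample.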

Interobserver reliability between the reading provided by the radiologist and the neural network deep learning algorithm (Multus RadBot) was assessed by Cronbach alpha computation and Cohen kappa analysis as a measure of agreement between the radiologist's grading of the lumbar MRI scan and the Multus RadBot's assessment of foraminal and central stenosis. The Cohen kappa was calculated from the observed and expected frequencies on the diagonal of a square contingency table. The overall quality of the Multus RadBot algorithm as a diagnostic test was assessed with the receiver operating characteristics (ROC) with determination of the area under the curve employing the left-upper-corner method using a dichotomization protocol of classifying MRI scan readings per intervertebral disc level as either normal or stenotic.31–34 The confidence intervals for the likelihood ratios were calculated using the “log method.”35,36
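The kappa calculation described above, from observed and expected frequencies on the diagonal of a square contingency table, can be sketched as follows. This is a generic textbook formulation under the assumption of an unweighted kappa; the function name `cohen_kappa` and the example table are hypothetical and do not reproduce the study's counts.

```python
from typing import Sequence


def cohen_kappa(table: Sequence[Sequence[int]]) -> float:
    """Unweighted Cohen kappa from a square contingency table.

    table[i][j] = number of levels rated category i by rater A
    and category j by rater B.
    """
    k = len(table)
    if k == 0 or any(len(row) != k for row in table):
        raise ValueError("table must be square and non-empty")
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("table must contain at least one observation")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    # Observed agreement: proportion of observations on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Expected agreement: diagonal frequencies expected by chance alone.
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical normal/stenosis table for illustration only (NOT the study data).
hypothetical_table = [
    [40, 10],  # rater A "normal":   rater B normal, rater B stenosis
    [10, 40],  # rater A "stenosis": rater B normal, rater B stenosis
]
print(round(cohen_kappa(hypothetical_table), 3))
```

In this hypothetical table the raters agree on 80% of levels while 50% agreement would be expected by chance, which is why kappa comes out lower than the raw agreement.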

RESULTS

The level frequency distribution observed in the 65 patients is summarized in Table 1. The radiologist detected the presence of neural element compressive pathology (stenosis) in 60.6% of scanned levels, whereas the Multus RadBot AI algorithm determined the presence of stenosis in 64.2% of scanned levels (Tables 2 and 3). As listed in Table 4, the most common levels reported as stenotic by the radiologist were L2–L3 (79.7%), L3–L4 (79.7%), and L4–L5 (77.8%). The frequency distribution read out by the Multus RadBot (Table 5) was similar, with some variation at L2–L3 (59.4%), L3–L4 (87.5%), and L4–L5 (93.8%), suggesting that, relative to the radiologist's reading, pathology was underdiagnosed at the L2–L3 level and overdiagnosed at the L4–L5 level. These differences were statistically significant (P < .0001). The ROC analysis showed a value of 0.808 for the area under the ROC curve (AUC), indicating that the Multus RadBot is an excellent diagnostic test that can detect the presence of neural element compression (stenosis) at a statistically significant level (P < .0001) from a random event distribution (Figure).

Table 2

Frequency distribution of stenosis as read by the radiologist.

Table 3

Frequency distribution of stenosis as read by AI deep network learning.

Table 4

Frequency distribution of normal versus stenosis diagnosis as read by the radiologist.

Table 5

Chi-square tests for frequency distribution of normal versus stenosis diagnosis as read by the radiologist.

Figure

Area under the curve data = 0.808; SE = 0.025; asymptotic significance < 0.0001; asymptotic 95% confidence interval, lower bound = 0.760; upper bound = 0.857. Coordinates of the curve for artificial intelligence sensitivity = 0.884; 1 − specificity = 0.267.
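For a dichotomous test there is only one operating point, so the ROC curve reduces to two straight segments joining (0, 0), (1 − specificity, sensitivity), and (1, 1). Under that assumption, which is a simplification and not a description of the SPSS routine, the trapezoidal area equals the mean of sensitivity and specificity. The sketch below applies this to the sensitivity (73.3%) and specificity (88.4%) reported in this article; the function name `single_point_auc` is hypothetical.

```python
def single_point_auc(sensitivity: float, specificity: float) -> float:
    """Trapezoidal AUC of a ROC curve with a single operating point.

    The curve is the polyline (0, 0) -> (1 - specificity, sensitivity) -> (1, 1),
    whose area simplifies to (sensitivity + specificity) / 2.
    """
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return (sensitivity + specificity) / 2.0


# Values reported in the article for the RadBot versus the radiologist read.
auc = single_point_auc(sensitivity=0.733, specificity=0.884)
print(f"{auc:.3f}")
```

The result, roughly 0.81, agrees with the reported area of 0.808 to within rounding, which suggests the reported AUC reflects the single normal/stenosis cut point rather than a graded score.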

The cross tabulation between the Multus RadBot and radiologist's readings of the lumbar MRI scan using the radiologist's report as a gold standard revealed a Multus RadBot sensitivity of 73.3%, a specificity of 88.4% (Table 6), a positive predictive value of 80.3%, and a negative predictive value of 83.7% (Table 7), with all the differences in these 2 cross tabulations being statistically significant. The reliability analysis revealed the Cronbach alpha as 0.772. When cross tabulated by intervertebral disc level, differences in reliability by level were found (Table 8). Through a process of elimination, it was determined that Multus RadBot's performance was most reliable at the L2–L3 and L3–L4 levels with the highest individual values for the Cronbach alpha of 0.629 and 0.681 when compared to the MRI report by the radiologist, rendering values of 0.566 and 0.688, respectively (Table 8). Kappa analysis of interobserver reliability rendered an overall kappa for the Multus RadBot of 0.627, suggesting that the Multus RadBot AI algorithm performed at a high reliability level (Table 9). Again, the diagnostic recognition of the Multus RadBot was the most reliable at the L2–L3 and L3–L4 levels on kappa analysis, showing kappa values of 0.738 and 0.606, respectively.
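As a rough cross-check of the overall reliability figure, the standardized-item form of the Cronbach alpha can be inverted to give the mean inter-item correlation it implies. The worked step below assumes k = 2 items (the radiologist read and the RadBot read); that is an assumption about how the analysis was set up, not something the text states, and the symbols k and \bar{r} are introduced here for illustration only.

```latex
\alpha_{\text{std}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
\quad\Longrightarrow\quad
\bar{r} = \frac{\alpha_{\text{std}}}{k - (k-1)\,\alpha_{\text{std}}}

% With k = 2 and the reported \alpha_{\text{std}} = 0.772:
\bar{r} = \frac{0.772}{2 - 0.772} = \frac{0.772}{1.228} \approx 0.629
```

Under that assumption, an alpha of 0.772 corresponds to a correlation of about 0.63 between the two dichotomized reads.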

Table 6

Frequency distribution of normal versus stenosis diagnosis as read by RadBot.

Table 7

Chi-square tests for frequency distribution of normal versus stenosis diagnosis as read by RadBot.

Table 8

Sensitivity and specificity of RadBot AI read versus MRI read by radiologist.

Table 9

Chi-square tests for sensitivity and specificity of RadBot AI read versus MRI read by radiologist.a

DISCUSSION

The results of this study highlighted a small “difference in opinion” in the interpretation of routine lumbar MRI scans between the radiologist's report and the AI deep neural network learning algorithm. While it is unclear whether the observed discrepancies arose out of the AI or the radiologist's reporting, it is easy to see how such reporting discrepancies may impact patient selection for targeted spinal procedures, such as endoscopic transforaminal surgery. As the determination of medical necessity in injured patients and in patients with painful degenerative conditions of the spine today frequently hinges on the exact verbatim reading in the MRI report, revisiting the accuracy of the MRI scan is of high relevance to patients and their physicians alike. False-positive readings may subject the patient to unwanted or unneeded treatments at high expense, and false-negative interpretations may deny justified care. The consequences of this diagnostic dilemma play out every day, affecting individualized spine care of those patients with an estimated 2.06 million episodes of low back injury per year in the United States.37

The authors purposely chose a simplified way of analyzing the level of agreement between our AI and the radiologist's MRI reading by applying the following assumptions: (1) the MRI report by the radiologist was employed as the gold standard in this reliability and accuracy analysis, and (2) the authors categorized the MRI findings in a straightforward 2-category manner (normal anatomy or stenosis present) to facilitate the study of the AI algorithm's performance as a diagnostic test by employing accepted statistical methods of chi-square testing to determine the sensitivity, specificity, positive and negative predictive value, and the overall test reliability with the ROC and AUC method or the calculations of the Cronbach alpha and Cohen kappa. The numbers obtained with these methods suggest that our AI deep learning network as a diagnostic tool has excellent performance characteristics. Typically, Cohen kappa values of 0.6 and alpha over 0.7 and ROC values higher than 0.8 are considered the hallmarks of a highly useful diagnostic test.38,39 It is not entirely clear to the authors why our AI deep learning neural network was most accurate at the L2–L3 and L3–L4 levels. The most reasonable explanation is that pathologies at the other levels, but particularly at the L4–L5 level, are much more common, thus contributing to more significant variability in how these pathologies are read by the radiologist or interpreted by the Multus RadBot.

The authors are entirely aware of the limitation of their simplified statistical analysis by assuming that the MRI report provided by the reading radiologist was flawless. The authors could have chosen to have the radiologist's report reread by another 1 or 2 radiologists to incorporate that in the reliability discussion. However, the authors purposely decided against it so as not to create an artificial scenario that does not exist in the “real world,” where routine lumbar MRI scans are read by 1 board-certified radiologist with little additional scrutiny. Clinical decisions affecting individual patients' lives are made like that every day. Therefore, the authors did not want to deviate from their simple side-by-side, Multus RadBot versus radiologist analysis approach. It goes without saying, though, that MRI raters on all sides of the medical necessity equation may use different radiological classification systems during the preoperative and diagnostic decision algorithms.15,29,40,41 The first author has demonstrated this clinical dilemma affecting hundreds of his patients who were classified by the radiologist as false negatives but ultimately underwent successful transforaminal endoscopic decompression with excellent and good Macnab outcomes in over 88.3% of patients.23 In his study of 1839 patients, the first author found a diagnostic gap of approximately 18% (330 patients),24 which initially led to the denial of appropriate spine care by the patients' medical insurance. 
However, patients who persevered eventually underwent seemingly inappropriate endoscopic surgical decompression for their sciatica, back, and leg pain with a 94.6% success rate.23 This type of spine care, deemed as medically not necessary based on traditional image-based clinical decision criteria done in patients responsive to successful endoscopic decompression, stimulated the authors of this study to look further into improving the preoperative diagnostic process in patients with sciatica due to herniated disc or stenosis leading up to targeted surgical decompression. Interestingly, this 18% diagnostic gap is commensurate with the Multus RadBot's percentage gain in reporting consistency in terms of sensitivity, specificity, and positive predictive value of the lumbar MRI scan with intervention reported by clinical studies where numbers are in the 60%–70% range.18

While the authors are encouraged by the excellent diagnostic performance parameters of the Multus RadBot's self-learning deep neural network models, they are also keenly aware of the limitation of their study that arises from the reporting bias inherent to the MRI reporting provided by the radiologists. Affective (unconscious emotional reaction) and cognitive (distortions of thinking) biases in the clinical diagnostic decision-making process may have impacted the radiologist's choice of words when dictating the findings he saw on the individual axial and sagittal MRI scan images.42 Cognitive biases, such as hindsight or outcome bias, are virtually unavoidable in a retrospective reclassification of clinical parameters, as knowledge of the outcome by the stakeholders in the patient care equation has been recognized to inflate the predictability of an event after it happened.43,44 Hindsight cognitive biases may also have impacted the extent of disagreement in preoperative lumbar MRI grading by the radiologist.45 Intuition bias may have played a role in the radiologist's wording of the MRI report while loosely adhering to radiographic stenosis classification systems.45 The Multus RadBot is not subject to these biases, for which reason the authors expect higher reliability numbers in correctly identifying painful spinal pathology with further refinements of the technology when directly visualized intraoperative observations of painful spinal pathologies are used as a gold standard rather than a radiologist report of another imaging modality. The first author has successfully used this approach in a prior study of the positive predictive value of the routine lumbar MRI scan.

CONCLUSIONS

This study set out to better understand how to utilize the lumbar MRI scan as a prognosticator of favorable clinical outcomes when selecting patients for targeted spine care, such as the endoscopic transforaminal decompression procedure, aiming to cure patients of the predominant pain generator causing pain and disability in the functional context at the time when the spine care is delivered. To employ the routine lumbar MRI scan as a more accurate prognosticator for successful spine care with high patient satisfaction, this AI deep learning neural network, in the authors' opinion, needs to be further refined by focusing the segmentation models on MRI image findings of intraoperatively verified and validated pain generators responsive to treatment. The authors are in the process of completing a pilot study on this very problem. Surgical translational research on intraoperatively visualized spinal pathology should focus on analyzing the effectiveness of MRI prognosticators with spine surgical interventions, such as endoscopy, using state-of-the-art measures of central, lateral recess, and neural foraminal stenosis on MRI to further determine how they impact the prognosis of surgical treatment for neurogenic claudication and lumbar radiculopathy.

Table 10

Positive and negative predictive value of RadBot AI read versus MRI read by radiologist.

Table 11

Chi-square tests for positive and negative predictive value of RadBot AI read versus MRI read by radiologist.a

Table 12

Interitem correlation matrix for reliability statistics of RadBot AI read versus MRI read by radiologist. The Cronbach alpha based on standardized items = .772 for normal and stenosis.

Table 13

Total reliability statistics of RadBot AI read versus MRI read by radiologist. The Cronbach alpha based on standardized items = .772 for normal and stenosis.

Table 14

Total statistics for highest reliability levels of RadBot AI read versus MRI read by radiologist. The Cronbach alpha based on standardized items = .772 for normal and stenosis.

Table 15

Kappa analysis of interobserver reliability RadBot versus MRI-reading by intervertebral disc level.

Footnotes

  • Disclosures and COI: The views expressed in this article represent those of the authors and no other entity or organization. The first author has no direct (employment, stock ownership, grants, patents) or indirect conflicts of interest (honoraria, consultancies to sponsoring organizations, mutual fund ownership, paid expert testimony). He is not currently affiliated with or under any consulting agreement with any MRI vendor that the clinical research data conclusion could directly enrich. This manuscript is not meant for or intended to push any agenda other than reporting the research data related to automated recognition of common painful spine pathologies by deep neural network learning. The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

  • This manuscript is generously published free of charge by ISASS, the International Society for the Advancement of Spine Surgery. Copyright © 2020 ISASS

REFERENCES

  1. 1 .↵
    1. Kambin P,
    2. Gennarelli T,
    3. Hermantin F.
    Minimally invasive techniques in spinal surgery: current practice. Neurosurg Focus. 1998;4(2):e8.
    OpenUrlPubMed
  2. 2 .↵
    1. Adogwa O,
    2. Parker SL,
    3. Bydon A,
    4. Cheng J,
    5. McGirt MJ.
    Comparative effectiveness of minimally invasive versus open transforaminal lumbar interbody fusion: 2-year assessment of narcotic use, return to work, disability, and quality of life. J Spinal Disord Tech. 2011;24(8):479–484.
    OpenUrlPubMed
  3. 3 .
    1. Bini W,
    2. Miller LE,
    3. Block JE.
    Minimally invasive treatment of moderate lumbar spinal stenosis with the superion interspinous spacer. Open Orthop J. 2011;5:361–367.
    OpenUrlPubMed
  4. 4 .
    1. Al-Khouja LT,
    2. Baron EM,
    3. Johnson JP,
    4. Kim TT,
    5. Drazin D.
    Cost-effectiveness analysis in minimally invasive spine surgery. Neurosurg Focus. 2014;36(6):E4.
    OpenUrl
  5. 5 .
    1. Liu C,
    2. Zhou Y.
    Percutaneous endoscopic lumbar diskectomy and minimally invasive transforaminal lumbar interbody fusion for recurrent lumbar disk herniation. World Neurosurg. 2017;98:14–20.
    OpenUrl
  6. 6 .↵
    1. Yuan C,
    2. Wang J,
    3. Zhou Y,
    4. Pan Y.
    Endoscopic lumbar discectomy and minimally invasive lumbar interbody fusion: a contrastive review. Wideochir Inne Tech Maloinwazyjne. 2018;13(4):429–434.
    OpenUrl
  7. 7 .↵
    1. Lewandrowski KU.
    Incidence, management, and cost of complications after transforaminal endoscopic decompression surgery for lumbar foraminal and lateral recess stenosis: a value proposition for outpatient ambulatory surgery. Int J Spine Surg. 2019;13(1):53–67.
    OpenUrlAbstract/FREE Full Text
  8. 8 .↵
    1. Tye EY,
    2. Anderson JT,
    3. Haas AR, et al.
    The timing of surgery affects return to work rates in patients with degenerative lumbar stenosis in a workers' compensation setting. Clin Spine Surg. 2017;30(10):E1444–E1449.
    OpenUrl
  9. 9 .
    1. Wang X,
    2. Borgman B,
    3. Vertuani S,
    4. Nilsson J.
    A systematic literature review of time to return to work and narcotic use after lumbar spinal fusion using minimal invasive and open surgery techniques. BMC Health Serv Res. 2017;17(1):446.
    OpenUrl
  10. Khan I, Bydon M, Archer KR, et al. Impact of occupational characteristics on return to work for employed patients after elective lumbar spine surgery. Spine J. 2019;19(12):1969–1976.
  11. Lewandrowski KU, Ransom NA, Yeung A. Return to work and recovery time analysis after outpatient endoscopic lumbar transforaminal decompression surgery. J Spine Surg. 2020;6(Suppl 1):S100–S115.
  12. Nicholson T, Maltenfort M, Getz C, Lazarus M, Williams G, Namdari S. Multimodal pain management protocol versus patient controlled narcotic analgesia for postoperative pain control after shoulder arthroplasty. Arch Bone Jt Surg. 2018;6(3):196–202.
  13. Drahos GL, Williams L. Addressing the emerging public health crisis of narcotic overdose. Gen Dent. 2017;65(5):7–9.
  14. Guyer R, Musacchio M, Cammisa FP Jr, Lorio MP. ISASS recommendations/coverage criteria for decompression with interlaminar stabilization - coverage indications, limitations, and/or medical necessity. Int J Spine Surg. 2016;10:41.
  15. Milette PC. Classification, diagnostic imaging, and imaging characterization of a lumbar herniated disk. Radiol Clin North Am. 2000;38(6):1267–1292.
  16. Geurts JW, Kallewaard JW, Richardson J, Groen GJ. Targeted methylprednisolone acetate/hyaluronidase/clonidine injection after diagnostic epiduroscopy for chronic sciatica: a prospective, 1-year follow-up study. Reg Anesth Pain Med. 2002;27(4):343–352.
  17. Lee IS, Kim SH, Lee JW, et al. Comparison of the temporary diagnostic relief of transforaminal epidural steroid injection approaches: conventional versus posterolateral technique. AJNR Am J Neuroradiol. 2007;28(2):204–208.
  18. Kreiner DS, Baisden J, Gilbert T, Shaffer WO, Summers JT. Re: Diagnostic tests the NASS stenosis guidelines. Spine J. 2014;14(1):201–202.
  19. Lewandrowski KU. Successful outcome after outpatient transforaminal decompression for lumbar foraminal and lateral recess stenosis: the positive predictive value of diagnostic epidural steroid injection. Clin Neurol Neurosurg. 2018;173:38–45.
  20. Ghosh S, Chaudhary V. Supervised methods for detection and segmentation of tissues in clinical lumbar MRI. Comput Med Imaging Graph. 2014;38(7):639–649.
  21. Costa DN, Passoni NM, Leyendecker JR, et al. Diagnostic utility of a Likert scale versus qualitative descriptors and length of capsular contact for determining extraprostatic tumor extension at multiparametric prostate MRI. AJR Am J Roentgenol. 2018;210(5):1066–1072.
  22. Lee SH, Yun SJ, Jo HH, Kim DH, Song JG, Park YS. Diagnostic accuracy of low-dose versus ultra-low-dose CT for lumbar disc disease and facet joint osteoarthritis in patients with low back pain with MRI correlation. Skeletal Radiol. 2018;47(4):491–504.
  23. Lewandrowski KU. Retrospective analysis of accuracy and positive predictive value of preoperative lumbar MRI grading after successful outcome following outpatient endoscopic decompression for lumbar foraminal and lateral recess stenosis. Clin Neurol Neurosurg. 2019;179:74–80.
  24. Yeung AT, Lewandrowski KU. Retrospective analysis of accuracy and positive predictive value of preoperative lumbar MRI grading after successful outcome following outpatient endoscopic decompression for lumbar foraminal and lateral recess stenosis. Clin Neurol Neurosurg. 2019;181:52.
  25. Stokes IA. Surface strain on human intervertebral discs. J Orthop Res. 1987;5(3):348–355.
  26. Fenyo A, Shinis D, Shelef I, et al. [Lumbar disc herniation: protrusion, extrusion or bulge? The proper use of the terms - how and when will it be defined as a disease?]. Harefuah. 2019;158(12):807–811.
  27. Yuan S, Zou Y, Li Y, Chen M, Yue Y. A clinically relevant MRI grading system for lumbar central canal stenosis. Clin Imaging. 2016;40(6):1140–1145.
  28. Lee GY, Lee JW, Choi HS, Oh KJ, Kang HS. A new grading system of lumbar central canal stenosis on MRI: an easy and reliable method. Skeletal Radiol. 2011;40(8):1033–1039.
  29. Lee CK, Rauschning W, Glenn W. Lateral lumbar spinal canal stenosis: classification, pathologic anatomy and surgical decompression. Spine (Phila Pa 1976). 1988;13(3):313–320.
  30. Lee S, Lee JW, Yeom JS, et al. A practical MRI grading system for lumbar foraminal stenosis. AJR Am J Roentgenol. 2010;194(4):1095–1098.
  31. Metz CE. Basic principles of ROC analysis. Semin Nucl Med. 1978;8(4):283–298.
  32. Lauridsen HH, Hartvigsen J, Manniche C, Korsholm L, Grunnet-Nilsson N. Responsiveness and minimal clinically important difference for pain and disability instruments in low back pain patients. BMC Musculoskelet Disord. 2006;7:82.
  33. Parker SL, Godil SS, Shau DN, Mendenhall SK, McGirt MJ. Assessment of the minimum clinically important difference in pain, disability, and quality of life after anterior cervical discectomy and fusion: clinical article. J Neurosurg Spine. 2013;18(2):154–160.
  34. Azimi P, Yazdanian T, Benzel EC. Determination of minimally clinically important differences for JOABPEQ measure after discectomy in patients with lumbar disc herniation. J Spine Surg. 2018;4(1):102–108.
  35. Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327(7414):557–560.
  36. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ. 2009;339:b2700.
  37. Mehling WE, Gopisetty V, Bartmess E, et al. The prognosis of acute low back pain in primary care in the United States: a 2-year prospective cohort study. Spine (Phila Pa 1976). 2012;37(8):678–684.
  38. Roine J, Uusitalo L, Hielm-Bjorkman A. Validating and reliability testing the descriptive data and three different disease diagnoses of the internet-based DOGRISK questionnaire. BMC Vet Res. 2016;12:30.
  39. Spade CM, Fitzsimmons K, Houser J. Reliability testing of the psychosocial vital signs assessment tool. J Psychosoc Nurs Ment Health Serv. 2015;53(11):39–45.
  40. Lewandrowski KU. “Outside-in” technique, clinical results, and indications with transforaminal lumbar endoscopic surgery: a retrospective study on 220 patients on applied radiographic classification of foraminal spinal stenosis. Int J Spine Surg. 2014;8:26.
  41. Lee S, Kim SK, Lee SH, et al. Percutaneous endoscopic lumbar discectomy for migrated disc herniation: classification of disc migration and surgical approaches. Eur Spine J. 2007;16(3):431–437.
  42. Valat JP. Epidural corticosteroid injections for sciatica: placebo effect, injection effect or anti-inflammatory effect? Nat Clin Pract Rheumatol. 2006;2(10):518–519.
  43. Chang MC, Lee DG. Outcome of transforaminal epidural steroid injection according to the severity of lumbar foraminal spinal stenosis. Pain Physician. 2018;21(1):67–72.
  44. Zwaan L, Monteiro S, Sherbino J, Ilgen J, Howey B, Norman G. Is bias in the eye of the beholder? A vignette study to assess recognition of cognitive biases in clinical case workups. BMJ Qual Saf. 2017;26(2):104–110.
  45. Henriksen K, Kaplan H. Hindsight bias, outcome knowledge and adaptive learning. Qual Saf Health Care. 2003;12 Suppl 2:ii46–50.

Keywords

  • artificial intelligence
  • deep neural network learning
  • magnetic resonance imaging
  • spinal pathologies
  • reliability analysis

© 2025 International Journal of Spine Surgery

International Journal of Spine Surgery Online ISSN: 2211-4599
