Artificial intelligence for the detection of glaucoma with SD-OCT images: a systematic review and Meta-analysis

2024-03-20 06:33

International Journal of Ophthalmology, 2024, Issue 3

Nan-Nan Shi, Jing Li, Guang-Hui Liu, Ming-Fang Cao

1 Department of Ophthalmology, Affiliated People’s Hospital (Fujian Provincial People’s Hospital), Fujian University of Traditional Chinese Medicine, Fuzhou 350004, Fujian Province, China

2 Eye Institute of Integrated Chinese and Western Medicine, Fujian University of Traditional Chinese Medicine, Fuzhou 350004, Fujian Province, China

Abstract

● KEYWORDS: artificial intelligence; spectral-domain optical coherence tomography; glaucoma; Meta-analysis

INTRODUCTION

Glaucoma is a group of eye disorders characterized by chronic, progressive optic neuropathy[1]. It is the second most common cause of irreversible vision loss worldwide[2-3]. By 2040, the number of people with glaucoma is projected to increase to 111.8 million, making it a major public health problem that threatens national eye health[4]. Glaucoma is classified into open-angle glaucoma and angle-closure glaucoma according to the state of the anterior chamber angle when intraocular pressure (IOP) increases. Primary open-angle glaucoma (POAG), the most common form of glaucoma in Western countries, is typically asymptomatic in its early stages and is often diagnosed after irreversible visual damage has occurred[5]. Primary angle-closure glaucoma (PACG), which occurs mostly in East Asian countries, is also mainly asymptomatic; fewer than one third of patients present with acute primary angle closure[3]. Given the high incidence and risk, improving the efficiency of early glaucoma diagnosis is imperative.

Optical coherence tomography (OCT) is a common imaging technology for evaluating glaucomatous structural damage[6]. Spectral-domain OCT (SD-OCT) has advanced rapidly in recent years and can measure the retinal nerve fiber layer (RNFL) and ganglion cell-inner plexiform layer (GCIPL) with 3D image acquisition modes, repeatable registration, and advanced segmentation algorithms[7]. SD-OCT, which provides higher axial resolution and faster scan speed, has theoretical advantages in glaucoma assessment over the earlier generation of time-domain OCT (TD-OCT)[8-9]. Although glaucomatous damage is irreversible, early diagnosis with OCT images followed by treatment can prevent further structural and functional visual loss. With the rapid development of computer and information technology, digital data infrastructure has gradually been integrated into every field of society. Therefore, how to use artificial intelligence (AI) to make better use of the massive amounts of data in hospital management, and to optimize and guide disease diagnosis, is attracting increasing attention in ophthalmology.

The development of AI shows no sign of slowing. AI is a branch of computer science that aims to create intelligent machines[10]. Machine learning (ML) uses data or previous experience to optimize the performance criteria of computer programs. Deep learning (DL) is a newer research direction within ML; its motivation is to build neural networks that simulate the human brain in order to analyze, learn from, and interpret data. AI technologies are becoming an alternative to conventional approaches. AI has been used in various medical fields, such as radiology[11], pathology[12], dermatology[13], cardiology[14], gastroenterology[15], and ophthalmology[10]. In ophthalmology, AI has been applied to diabetic retinopathy, age-related macular degeneration, glaucoma, cataract, keratoconus, and other conditions, using multimodal images including fundus photographs, OCT, fundus fluorescein angiography (FFA), and anterior segment photography[16].

AI has fostered breakthroughs in the screening, diagnosis, and detection of progression of glaucoma[17]. There are four main approaches to screening patients for glaucoma: measuring IOP, examining the angle anatomy, evaluating the visual field (VF), and assessing the optic nerve head (ONH) and RNFL[18-19]. Currently, AI is commonly applied in glaucoma management to IOP, VF, fundus photograph, and OCT data[20]. A prospective cross-sectional study demonstrated that automated IOP measurement using DL on Goldmann applanation tonometry (GAT) videos is comparable to standard GAT[21]. Another study merged longitudinal VF and clinical datasets to assess the performance of ML; the model was able to extract spatio-temporal features that other algorithms could not, with better diagnostic capability [area under the receiver operating characteristic curve (AUROC): 0.89 to 0.93][22]. Li et al[23] developed a clinical DL system for predicting and stratifying the risk of glaucoma onset and progression based on color fundus photographs; the results demonstrated the feasibility of DL algorithms for the early detection of glaucoma and the prediction of its progression. AI, ML, and DL will play a crucial role in glaucoma, with implications for the early diagnosis of vision impairment in globally aging populations[24].

Taken together, AI is expected to provide ophthalmologists with automated tools for the early diagnosis and timely treatment of ocular disorders in the near future[25]. Therefore, we performed this systematic review and Meta-analysis to quantify the performance of AI for the detection of glaucoma in SD-OCT images.

MATERIALS AND METHODS

Protocol and Registration

We registered our protocol on PROSPERO (https://www.crd.york.ac.uk/PROSPERO/) under registration number CRD42023431060. This systematic review and Meta-analysis adheres to the 2018 PRISMA extension for Diagnostic Test Accuracy (PRISMA-DTA)[26].

Eligibility Criteria

All papers that reported AI algorithms applied to SD-OCT images for glaucoma diagnosis were considered. The inclusion criteria were as follows: 1) Glaucoma was detected by AI (DL or ML) from single-modality SD-OCT images or from multimodal images combining SD-OCT with VF or fundus photography. 2) Glaucoma was clearly defined as POAG, PACG, or both. 3) The outcomes included sensitivity, specificity, and other accuracy measures. 4) The data were generally divided into a training set and a test set: the training set is used to train the AI model for diagnosing glaucoma, and the test set is used to evaluate the performance of the selected optimal model. Only the test set data were used for this Meta-analysis; if a study did not report the specific split between the training set and the test set, the data of the entire sample were recorded. 5) The language was limited to English. The exclusion criteria were as follows: 1) ongoing or unpublished studies; 2) studies using other imaging modalities such as fundus photographs, anterior segment OCT, and FFA; 3) conference papers, reviews, Meta-analyses, and case reports; 4) studies from which the specific outcomes could not be extracted.

Information Sources, Search Strategies and Study Selection

We searched six databases (PubMed, Web of Science, Cochrane Library, ScienceDirect, ProQuest, and Scopus) up to May 31, 2023, using the following search terms: “artificial intelligence”, “deep learning”, “machine learning”, “Computational Intelligence”, “Machine Intelligence”, “Computer Reasoning”, “Computer Vision System”, “Knowledge Acquisition”, “Knowledge Representation”, “glaucoma”, and “optical coherence tomography”.

Data Collection and Definitions for Data Extraction

Two investigators (Shi NN, Li J) independently screened the literature; discrepancies were resolved through discussion with a third investigator (Liu GH). The data from the included studies were then extracted by one researcher (Shi NN) and rechecked by another (Cao MF). The extracted baseline data consisted of study, year, study characteristics, datasets, device, total number of images, image quality, outcome, method, methodology, sensitivity, and specificity.

Risk of Bias and Applicability

The Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool has been widely used to assess the risk of bias in diagnostic test accuracy studies. It comprises four domains: patient selection, index test, reference standard, and flow and timing. The first three domains are also assessed for concerns regarding clinical applicability. Two researchers (Shi NN, Li J) independently applied the QUADAS-2 tool to evaluate the quality of the included studies. Disagreements were resolved through discussion with a third researcher (Liu GH).

Diagnostic Accuracy Measures and Synthesis of Results

The diagnostic accuracy indicators (sensitivity and specificity) of the included studies were reported in the baseline data table. The numbers of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) were calculated with Review Manager 5.4 from the sample size, sensitivity, and specificity of each study. When a study reported multiple groups of data, we regarded each subgroup as an independent study in this Meta-analysis. The sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), summary receiver operating characteristic (SROC) curve, and area under the curve (AUC) were pooled to quantify the performance of AI for the detection of glaucoma in SD-OCT images.
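The reconstruction of a 2×2 table and the ratios derived from it can be sketched as follows. This is a minimal Python illustration, not the Review Manager implementation: the function names, the rounding rule, the 0.5 continuity correction, and the example counts (100 glaucoma and 200 control samples) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def reconstruct_counts(n_glaucoma, n_control, sensitivity, specificity):
    """Rebuild a 2x2 table from group sizes and reported accuracy.

    Assumption: counts are rounded to the nearest integer.
    """
    tp = round(n_glaucoma * sensitivity)
    tn = round(n_control * specificity)
    return TwoByTwo(tp=tp, fp=n_control - tn, fn=n_glaucoma - tp, tn=tn)


def accuracy_measures(table):
    """Return sensitivity, specificity, PLR, NLR, and DOR for one table.

    Assumption: 0.5 is added to every cell when any cell is zero,
    so that the ratios stay finite.
    """
    cells = [table.tp, table.fp, table.fn, table.tn]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    tp, fp, fn, tn = cells
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    plr = sens / (1 - spec)   # positive likelihood ratio
    nlr = (1 - sens) / spec   # negative likelihood ratio
    return {
        "sensitivity": sens,
        "specificity": spec,
        "plr": plr,
        "nlr": nlr,
        "dor": plr / nlr,     # diagnostic odds ratio
    }


# Hypothetical study: 100 glaucoma and 200 control samples.
table = reconstruct_counts(100, 200, sensitivity=0.90, specificity=0.95)
measures = accuracy_measures(table)
print(table)
print({name: round(value, 3) for name, value in measures.items()})
```

Note that the group sizes (diseased and non-diseased) are needed separately; a single total sample size is not enough to recover all four cells.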

Meta-analysis and Additional Analyses

The risk of bias of the included studies was assessed using RevMan 5.4. All Meta-analyses and additional analyses were performed with Stata 16.0. The Spearman correlation coefficient was first calculated to determine whether there was a threshold effect. When there was no threshold effect among the included studies, the Chi-square test was used to analyze the statistical heterogeneity among their results, and I² was used to quantify the degree of heterogeneity. If I² < 50%, the fixed-effect model was applied for the pooled analysis; otherwise the random-effects model was used. Meta-regression was used to explore the sources of heterogeneity. Subgroup analyses were then conducted according to four factors (regions, methods, outcomes, and devices). Deeks’ funnel plot and sensitivity analysis were applied to assess publication bias and to evaluate the stability of the results.

RESULTS

Study Selection

The initial search retrieved 1373 records: 394 from PubMed, 324 from Web of Science, 17 from Cochrane Library, 79 from ScienceDirect, 275 from ProQuest, and 284 from Scopus. After 405 duplicates were removed, we excluded 903 records by reading the titles and abstracts. The full texts of the remaining 65 papers were downloaded and screened carefully, resulting in the inclusion of 20 studies that met the eligibility criteria. Figure 1 shows the study selection process.

Figure 1 Study selection flow diagram.

Study Characteristics

The detailed study characteristics are presented in Table 1. A total of 20 studies[27-46], comprising 51 models, were included. The studies were published between 2014 and 2023 and involved 13 countries/regions [Australia, Brazil, China, India, Japan, Korea, Pakistan, Singapore, Romania, Spain, Taiwan (China), United States, and Nepal]; two were multicenter studies. The study designs covered prospective, retrospective, cross-sectional, and cohort observational studies. Three types of devices were used: Topcon 3D OCT, Heidelberg Spectralis OCT, and Cirrus Zeiss OCT. Fourteen studies reported POAG as the outcome indicator, and the remaining studies covered glaucoma (POAG and PACG) as the outcome. DL was the most commonly used method, adopted in 12 studies; ML was used in 7 studies, and only 1 study used both methods.

Risk of Bias and Applicability

We applied the QUADAS-2 tool to assess the quality of the included studies (Figure 2). Thirteen of the included studies were of high quality, with low risk of bias and low applicability concerns. Three studies[27,30,41] did not report whether consecutive or random cases were included, so patient selection and the corresponding applicability concerns were rated as “unclear risk”. For another 3 studies[39,43-44], it could not be judged whether the gold standard correctly distinguished the target disease states, so the reference standard and the corresponding applicability concerns were rated as “unclear risk”. One study[43] was graded as “high risk” for flow and timing because the inclusion criteria were modified after a software upgrade, so that not all participants were included in the analysis. Another study[42] extracted possible diagnostic features of glaucoma but did not indicate whether all patients received the same gold standard, so flow and timing was rated as “unclear risk”.

Table 1 Basic characteristics of included studies

Figure 2 Risk of bias assessment of included studies via QUADAS-2 tool.

Performance of AI in Glaucoma Detection and Synthesis of Results

We first tested whether there was a threshold effect. The Spearman correlation coefficient was 0.22, indicating little heterogeneity attributable to a threshold effect. Figure 3 shows the paired forest plot of sensitivity and specificity with 95% confidence intervals (CIs) for each study. The pooled sensitivity and specificity were 0.91 (95%CI: 0.86-0.94, I²=94.67%) and 0.90 (95%CI: 0.87-0.92, I²=89.24%), respectively. Figure 4 displays the paired forest plot of PLR and NLR with 95%CIs for each study. The pooled PLR and NLR were 8.79 (95%CI: 6.93-11.15, I²=89.31%) and 0.11 (95%CI: 0.07-0.16, I²=95.25%), respectively. Figure 5 shows the forest plot of DOR and the SROC curve with 95%CIs. The pooled DOR and AUC were 83.58 (95%CI: 47.15-148.15, I²=100%) and 0.95 (95%CI: 0.93-0.97), respectively.
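As a rough plausibility check, the likelihood ratios and DOR implied by the pooled sensitivity and specificity can be computed directly from their definitions. The pooled values reported above come from the meta-analytic model, so exact agreement is not expected; the arithmetic below is only an approximation, and the symbols Se and Sp (pooled sensitivity and specificity) are notation introduced here.

```latex
\begin{align*}
\mathrm{PLR} &\approx \frac{\mathrm{Se}}{1-\mathrm{Sp}} = \frac{0.91}{1-0.90} = 9.1
  && (\text{reported: } 8.79)\\
\mathrm{NLR} &\approx \frac{1-\mathrm{Se}}{\mathrm{Sp}} = \frac{1-0.91}{0.90} = 0.10
  && (\text{reported: } 0.11)\\
\mathrm{DOR} &= \frac{\mathrm{PLR}}{\mathrm{NLR}} \approx \frac{9.1}{0.10} = 91
  && (\text{reported: } 83.58)
\end{align*}
```

The implied and reported values are of the same order, which is what one would expect if the pooled estimates are mutually consistent.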

Additional Analyses

Given the high heterogeneity in this Meta-analysis, Meta-regression was performed to explore its sources. We carried out the analysis along four dimensions: regions, methods, outcomes, and devices. Subgroup analyses were then conducted according to these factors. The detailed results are shown in Table 2. Deeks’ funnel plot of the models (Figure 6) was used to evaluate publication bias (P=0.32) and indicated no clear bias in this Meta-analysis. The result of the sensitivity analysis is presented in Figure 7 and shows that the Meta-analysis has good stability.
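Deeks’ funnel plot asymmetry test is usually described as a regression of the log diagnostic odds ratio on the inverse square root of the effective sample size, weighted by the effective sample size. The sketch below follows that description in plain Python. It is not the Stata routine used in the study: the function names, the 0.5 continuity correction, and the example tables are assumptions, and it stops at the slope and its standard error (a P value such as the reported 0.32 would additionally require a t distribution with n − 2 degrees of freedom).

```python
import math


def effective_sample_size(n_diseased, n_healthy):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_healthy / (n_diseased + n_healthy)


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio; 0.5 is added to each cell (assumption)."""
    a, b, c, d = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c))


def weighted_linear_fit(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x.

    Needs at least three points and non-constant x.
    Returns (intercept, slope, standard error of the slope).
    """
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    slope_se = math.sqrt(rss / (len(x) - 2) / sxx)
    return intercept, slope, slope_se


def deeks_regression(tables):
    """tables: iterable of (tp, fp, fn, tn) tuples, one per model."""
    x, y, w = [], [], []
    for tp, fp, fn, tn in tables:
        ess = effective_sample_size(tp + fn, fp + tn)
        x.append(1.0 / math.sqrt(ess))
        y.append(log_dor(tp, fp, fn, tn))
        w.append(ess)
    return weighted_linear_fit(x, y, w)


# Hypothetical 2x2 tables (tp, fp, fn, tn); not data from the included studies.
example_tables = [(45, 5, 5, 45), (90, 10, 10, 190), (180, 30, 20, 270), (28, 4, 2, 36)]
intercept, slope, slope_se = deeks_regression(example_tables)
print(intercept, slope, slope_se)
```

A slope that is small relative to its standard error corresponds to a roughly symmetric funnel plot, which is how a non-significant result is usually read.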

DISCUSSION

The diagnosis of glaucoma in its early stages is challenging. This Meta-analysis included 20 studies and 51 models to investigate the performance of AI in detecting glaucoma. Our results indicate that AI achieves high accuracy in detecting glaucoma from SD-OCT images. Thus, the application of AI-based tools for detecting glaucoma may provide substantial benefits for early detection, prevention, and treatment of the disease.

Glaucoma is an eye disease that causes optic nerve damage and progressive VF loss due to increased IOP[47]. The VF loss usually starts in the mid-periphery and progresses centripetally until eventually only central or peripheral vision remains[48]. The early stage of glaucoma is not easy to detect, which results in delayed treatment and irreversible visual impairment. Therefore, early detection is essential to glaucoma treatment because it can prevent further vision loss[3].

Common to all glaucomatous eyes is the loss of retinal ganglion cells and thinning of the RNFL, together with cupping of the optic disc[49]. Rapid advances in ophthalmic imaging in recent years have presented both opportunities and challenges. Assessment of the optic disc and VF using OCT imaging, fundus photography, and standard automated perimetry helps in the clinical diagnosis of glaucomatous optic nerve damage[50]. Detection of structural changes in glaucoma has traditionally relied on the evaluation of fundus photographs. However, photographs cannot be quantified, and experts show little consistency in judging optic disc photographs. OCT overcomes these limitations by allowing objective quantitative measurements of the RNFL, optic disc, and macula, which can aid in the diagnosis and progression analysis of glaucoma[51]. In contrast to OCT, the ability of VF examinations to detect disease progression is influenced by the stage of the disease. In the natural course of glaucoma, structural and functional damage may not occur at the same time. In the early stages, the likelihood of detecting disease using OCT is higher because structural changes such as ganglion cell loss and thinning of the RNFL usually occur before the loss of function detected by conventional VF testing[52]. In advanced stages of the disease, loss of function, and thus VF defects, is more appropriately detected using VF testing[53-54]. Reliable computer-assisted diagnosis of glaucoma has continued to expand in recent years and follows two approaches. One is the single-path method, which takes a single type of data as input. The other is multimodal fusion, which combines two or more types of data[55]. A number of studies have shown that DL-based multimodal imaging can detect glaucoma with higher accuracy, further improving diagnostic performance[55-57].

Figure 3 The forest plot of the pooled sensitivity and specificity.

Figure 4 The forest plot of the PLR and NLR. PLR: Positive likelihood ratio; NLR: Negative likelihood ratio.

Figure 5 The forest plot of the DOR and SROC curve. DOR: Diagnostic odds ratio; SROC: Summary receiver operating characteristic.

Figure 6 Deeks’ funnel plot of each model.

OCT is a non-invasive imaging technique[58]. In recent years, OCT technology has evolved continuously, from the earliest TD-OCT to the current SD-OCT and swept-source OCT (SS-OCT). The latter achieves faster scanning and higher axial resolution and incorporates innovations such as real-time eye tracking to compensate for eye movements during data acquisition and minimize motion artifacts[59]. SD-OCT is currently one of the most commonly used auxiliary tests for the diagnosis of glaucoma[60]. However, its diagnostic accuracy may be challenged by the enormous workload of manual image processing, by inter-observer variability, and by interfering factors (e.g., extreme refractive errors). Currently, AI is generating global interest. The development of AI algorithms that analyze images and reach a diagnosis has had a huge impact on the medical field[61]. Hence, improving the diagnostic efficacy of glaucoma based on AI algorithms combined with SD-OCT images can help ophthalmologists make quick clinical decisions and further facilitate glaucoma screening.

Figure 7 Sensitivity analysis of each model.

As shown in Figures 3-5, our results yielded robust findings that support the high diagnostic accuracy of AI for the detection of glaucoma in SD-OCT images. However, the heterogeneity was high. We performed Meta-regression, which indicated that the heterogeneity originated from regions, methods, outcomes, and devices. The results indicate better diagnostic efficacy in detecting glaucoma in Asia than in Western countries, which may be related to the high prevalence of glaucoma in Asia[4]. In recent years, genetic and genomic studies have identified important genes associated with glaucoma that influence biological pathways and processes[62]. In the future, it may become possible to determine the genetic architecture of glaucoma in one step, enabling comprehensive genetic testing and gene-targeted therapy[63]. In its traditional forms, ML still requires human-designed code to convert raw data into input features[64]. DL is a class of state-of-the-art ML techniques. DL models are a type of artificial neural network composed of several layers of artificial “neurons”[65]. Several studies have confirmed that DL systems have great potential to improve glaucoma diagnosis[66-68]. However, DL programs are not standardized, and they make clinicians heavily dependent on the final provider and on cost. SD-OCT, which can measure the ONH, RNFL, and macular parameters, has become a vital imaging modality in glaucoma practice[69]. Pierro et al[70] evaluated the reproducibility of RNFL measurements across various SD-OCT devices and showed that the Heidelberg device demonstrated high inter-operator agreement. However, as digital imaging in glaucoma continues to develop, different devices show high diagnostic capability and are complementary to each other[71]. This paper reviews studies from around the world that demonstrate the ability of AI algorithms to diagnose glaucoma using OCT images. As ophthalmic imaging technology continues to evolve, AI may play an important role in healthcare in the near future[72].

Table 2 Subgroup analyses and Meta-regression results

This Meta-analysis has several limitations. First, the datasets differ across studies, and each study used its own algorithm. Second, the diagnosis of glaucoma was not the result of a single test but rather an integrated interpretation of risk factors; therefore, misclassification due to this subjective assessment cannot be completely ruled out. Future observations will be needed to see how AI algorithms, when integrated into clinical practice, affect clinical diagnosis and to assess changes over time. Third, many glaucoma patients have cataracts or corneal opacities, which reduce image quality. The performance of DL and ML algorithms depends on image quality, and the exclusion of low-quality images from the studies may limit the effectiveness of the algorithms in real clinical applications. Fourth, since the structural data were both trained and validated by the DL classifier, the diagnostic ability might have been biased by overestimating the sensitivity-specificity balance. This may also explain the large number of studies with sensitivity/specificity close to 1. Fifth, this Meta-analysis did not incorporate the training sets. In the future, larger datasets will be needed to validate AI algorithms and improve their diagnostic accuracy for glaucoma. Sixth, we did not compare the performance of AI with that of human experts because limited data were available. Seventh, owing to the widespread use of SD-OCT, only studies using SD-OCT were included. With the continuous development of ophthalmic imaging technology, future research should be extended to swept-source OCT. Additionally, some of the included studies were reported without sufficient specification. The quality and reliability of clinical ophthalmic AI research should be enhanced by following the reporting guidelines[73]. Finally, many AI programs operate as a black box, and the internal algorithm-specific features extracted by DL are especially difficult to understand. It is imperative to develop explainable AI (XAI) to interpret trained deep networks and open the black box[74].

In conclusion, our study found that AI is promising for detecting glaucoma from SD-OCT images. AI-based algorithms enable a “doctor + artificial intelligence” approach that can improve the diagnosis of glaucoma. Improving the diagnostic efficacy of glaucoma based on AI algorithms combined with SD-OCT images can help ophthalmologists make quick clinical decisions and further facilitate glaucoma screening. More datasets built with new diagnostic methods will become available in the future, which will help in fundus screening applications and reduce the workload of physicians.

ACKNOWLEDGEMENTS

Conflicts of Interest: Shi NN, None; Li J, None; Liu GH, None; Cao MF, None.
