Original Article
Preliminary experiments on interpretable ChatGPT-assisted diagnosis for breast ultrasound radiologists
Pengfei Sun1, Linxue Qian1, Zhixiang Wang1,2
Contributions: (I) Conception and design: All authors; (II) Administrative support: L Qian; (III) Provision of study materials or patients: P Sun; (IV) Collection and assembly of data: Z Wang, P Sun; (V) Data analysis and interpretation: Z Wang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.
Background: Ultrasound is essential for detecting breast lesions. The American College of Radiology’s Breast Imaging Reporting and Data System (BI-RADS) classification system is widely used, but its subjectivity can lead to inconsistency in diagnostic outcomes. Artificial intelligence (AI) models, such as ChatGPT-3.5, may potentially enhance diagnostic accuracy and efficiency in medical settings. This study aimed to assess the utility of the ChatGPT-3.5 model in generating BI-RADS classifications for breast ultrasound reports and its ability to replicate the “chain of thought” (CoT) in clinical decision-making to improve model interpretability.
Methods: Breast ultrasound reports were collected, and ChatGPT-3.5 was used to generate diagnoses and treatment plans. We evaluated the model’s performance by comparing its generated reports to those from doctors with various levels of experience. We also conducted a Turing test and a consistency analysis. To enhance the interpretability of the model, we applied the CoT method to deconstruct its decision-making chain.
Results: A total of 131 patients were evaluated, with 57 doctors participating in the experiment. ChatGPT-3.5 showed promising performance in structure and organization (S&O), professional terminology and expression (PTE), treatment recommendations (TR), and clarity and comprehensibility (C&C). However, improvements are needed in BI-RADS classification, malignancy diagnosis (MD), likelihood of being written by a physician (LWBP), and ultrasound doctor artificial intelligence acceptance (UDAIA). Turing test results indicated that AI-generated reports convincingly resembled human-authored reports. Reproducibility experiments displayed consistent performance. Erroneous report analysis revealed issues related to incorrect diagnosis, inconsistencies, and overdiagnosis. The CoT investigation supports the potential of ChatGPT to replicate the clinical decision-making process and offers insights into AI interpretability.
Conclusions: The ChatGPT-3.5 model holds potential as a valuable tool for assisting in the efficient determination of BI-RADS classifications and enhancing diagnostic performance.
Keywords: ChatGPT; breast; artificial intelligence (AI); diagnosis
Submitted Jan 23, 2024. Accepted for publication Jul 31, 2024. Published online Aug 28, 2024.
doi: 10.21037/qims-24-141
Introduction
Breast cancer is one of the most common malignancies among women, with its incidence and mortality rates on the rise globally (1). Ultrasound plays a vital role in detecting breast lesions, serving as a first-line screening tool. In China, where many women have dense breast tissue, ultrasound is often the preferred imaging modality (2). The American College of Radiology’s Breast Imaging Reporting and Data System (BI-RADS) (3) provides a clear classification of breast tumor malignancy, which is essential for devising treatment plans and assessing prognosis, and is widely applied in clinical practice. However, due to its subjective nature, diagnostic results may vary across physicians with different levels of experience and from different regions (4).
In recent years, artificial intelligence (AI) has demonstrated outstanding performance in cognitive tasks (5-10). The introduction of the large language model (LLM) ChatGPT by OpenAI represents a significant advancement in natural language processing (6), offering substantial potential for improving diagnostic accuracy and efficiency (9) while reducing human errors in the medical field (11).
Despite the potential benefits, AI models face limitations in specialized domains such as medical diagnosis, including the scarcity of training data, which can impair the model’s capacity for generalization and precise prediction (12). Furthermore, AI models may not be sufficiently effective or dependable for use in difficult medical diagnostic tasks (13). Moreover, the “black box” nature of AI models, particularly in the context of medicine, can present significant challenges due to the lack of transparency in decision-making (14). This opacity can lead to mistrust and hinder the wider adoption of AI technologies in critical areas such as healthcare. Consequently, research into model interpretability is not only essential, but also timely (15). The “chain of thought” (CoT) methodology we employed in our previous study represents an attempt at improving model interpretability (16). This method provides a visual breakdown of the AI’s decision-making process, thereby enhancing our understanding of how the AI model arrives at a given conclusion. Illuminating the AI decision-making process can improve AI performance, foster trust, and facilitate the smoother integration of AI into healthcare by addressing one of the major concerns of healthcare professionals: the unpredictability and opacity of AI decision-making.
This study aimed to clarify the potential of the ChatGPT-3.5 model to help ultrasound doctors effectively determine BI-RADS classification, improve diagnostic performance in clinical settings, and analyze the causes of misdiagnosis, to better understand the limitations of LLM in this context. We present this article in accordance with the STROBE reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-24-141/rc).
Methods
Data collection
From March 2023 to April 2023, we retrospectively collected data from patients with breast cancer treated at Beijing Friendship Hospital, Capital Medical University. All patients with breast masses classified as BI-RADS 4a or higher underwent either core needle biopsy or surgical pathology to confirm their diagnosis. Patients with BI-RADS 2 and 3 lesions were followed up for 3–5 years as typical benign cases. In total, 131 ultrasound reports from 131 patients were included, all of whom were female, with an average age of 43 (range, 21–78) years. Benign cases included breast cysts, fibroadenomas, and mammary gland diseases, while malignant cases were all invasive breast cancers. This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013) and received ethical approval from the Medical Ethics Committee of Beijing Friendship Hospital, Capital Medical University (No. 2022-P2-060-01). The requirement for individual consent was waived due to the retrospective nature of the analysis.
A total of 57 evaluating doctors participated in the study, including 20 junior doctors (1–5 years of experience), 18 intermediate doctors (6–10 years of experience), and 19 senior doctors (>10 years of experience). They were from 57 hospitals, including Binzhou Central Hospital in Shandong Province, Beijing Children’s Hospital Affiliated with Capital Medical University, and Beijing Friendship Hospital Affiliated with Capital Medical University.
Diagnostic results generated by ChatGPT
The ChatGPT (17) series, created by OpenAI, is a cutting-edge pretrained language model that is capable of performing intricate natural language processing tasks, including generating articles, answering questions (18), translating languages, and producing code. The workflow of this study is illustrated in Figure 1. In our analysis, we input breast ultrasound medical reports into ChatGPT-3.5 and prompted it to generate diagnoses and treatment recommendations (TR). We subsequently collected the output reports for evaluation. Figure S1 shows the process of question and answer collection. The prompt provided was as follows: “Based on the following breast ultrasound description, please provide a comprehensive diagnosis (BI-RADS classification) and corresponding treatment recommendations”.
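The reports in this study were entered into the ChatGPT-3.5 interface with the prompt above. For readers who wish to reproduce this step programmatically, the sketch below shows one way the same prompt could be issued through the OpenAI Python client; the client version, the "gpt-3.5-turbo" model name, and the generate_report helper are assumptions for illustration and are not part of the authors’ workflow.

```python
# Illustrative sketch only (assumes the OpenAI Python client, openai>=1.0, and an
# OPENAI_API_KEY environment variable); the authors used the ChatGPT-3.5 interface.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Based on the following breast ultrasound description, please provide a "
    "comprehensive diagnosis (BI-RADS classification) and corresponding "
    "treatment recommendations.\n\n"
)

def generate_report(ultrasound_description: str) -> str:
    """Send one ultrasound description to the model and return its diagnosis and TR."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for ChatGPT-3.5; exact model/version is an assumption
        messages=[{"role": "user", "content": PROMPT + ultrasound_description}],
    )
    return response.choices[0].message.content

# Hypothetical usage:
# print(generate_report("Hypoechoic nodule in the right breast, 12 mm, irregular margin ..."))
```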
Figure 1 Overview of the experimental workflow. The procedure begins with data collection and the acquisition of ultrasound reports and progresses to the generation of diagnoses and treatment outcomes using ChatGPT-3.5. The results underwent four experimental evaluations: (I) physician assessment of AI-generated reports; (II) a Turing test to evaluate reports created by doctors versus those produced by AI; (III) a reproducibility experiment involving the generation of reports twice and a comparison of differences; and (IV) an analysis of erroneous reports. AI, artificial intelligence.
Evaluation of report performance
In order to gain deeper insights into ChatGPT’s effectiveness in producing diagnostic reports for breast cancer, we gathered and assessed the ratings of these reports based on a specific set of evaluation criteria (see Table S1). These criteria included structure and organization (S&O), professional terminology and expression (PTE), BI-RADS classification, malignancy diagnosis (MD), TR, clarity and comprehensibility (C&C), likelihood of being written by a physician (LWBP), ultrasound doctor AI acceptance (UDAIA), and overall evaluation (OE). Each criterion was rated on a scale of 1 to 5, with 1 indicating completely incorrect or unsatisfactory and 5 indicating completely correct or satisfactory. The details of the scoring table can be found in the supplementary materials (Table S2). Furthermore, we assessed the quality of the AI-generated reports by comparing their ratings with those of reports written by physicians with varying levels of clinical experience.
Turing test and reproducibility experiment
To evaluate doctors’ ability to distinguish between human-written and AI-generated reports (19), we incorporated 50% of the ChatGPT-generated reports into the evaluation set. Doctors assessed the likelihood that each report was authored by a physician, and we calculated the rate of accurate identifications. If this accuracy did not meaningfully exceed random guessing (50%), it would indicate that ChatGPT had successfully passed the Turing test.
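As one way to formalize the comparison against chance, the sketch below tests whether a set of correct/incorrect identification outcomes exceeds the 50% guessing rate; the paper does not specify this particular test, and the tally data shown are hypothetical.

```python
# Illustrative sketch (not the authors' analysis): binomial test of doctors'
# identification accuracy against the 50% chance level.
from scipy.stats import binomtest  # requires SciPy >= 1.7

# Hypothetical outcomes: True if a doctor correctly identified a report's origin
# (AI-generated vs. physician-written).
outcomes = [True, False, True, False, False, True, False, False, True, False]

n_correct, n_total = sum(outcomes), len(outcomes)
test = binomtest(n_correct, n_total, p=0.5, alternative="greater")
print(f"identification accuracy = {n_correct / n_total:.0%}, P = {test.pvalue:.3f}")
# An accuracy not significantly above 50% means the doctors cannot reliably tell
# the AI-generated reports apart from physician-written ones.
```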
To evaluate the consistency of ChatGPT’s responses, we compared two outputs generated in separate model sessions for the same input (20). For each inquiry, we analyzed the scores allocated to both responses and conducted a statistical assessment to identify significant disparities (21), thereby offering insights into the reliability of ChatGPT’s performance.
Erroneous report analysis
We collected and examined misdiagnosed reports, defined as those with low scores (1–2 points). Two doctors with 12 years of experience analyzed and categorized the reasons for these errors (22). This analysis aimed to identify patterns and potential weaknesses in the ChatGPT model in order to guide future improvements and training strategies (23).
GPT CoT visualization
The CoT (16) method breaks down the decision-making process of the GPT model into several stages, depicting it as a flowchart. This method provides a clear and insightful means of scrutinizing the model’s decision-making patterns, enhancing our understanding of its diagnostic process. The visual representation elucidates the decisions made by ChatGPT-3.5 in assigning a BI-RADS score, assessing malignancy, suggesting a treatment plan, and generating a diagnostic report.
Statistical analyses
The statistical analysis was carried out using the Mann-Whitney test (24), as provided by the SciPy package (25), with all code written in Python 3.8 (Python Software Foundation, Wilmington, DE, USA). A P value lower than 0.05 was deemed statistically significant.
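As a minimal illustration of the reported analysis, the snippet below applies the SciPy Mann-Whitney U test to two groups of ratings; the score arrays are hypothetical placeholders, not study data.

```python
# Mann-Whitney U test as used in the statistical analysis (SciPy); data are placeholders.
from scipy.stats import mannwhitneyu

doctor_scores = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]  # hypothetical ratings of doctor-written reports
ai_scores     = [4, 3, 4, 4, 3, 4, 5, 3, 3, 4]  # hypothetical ratings of AI-generated reports

statistic, p_value = mannwhitneyu(doctor_scores, ai_scores, alternative="two-sided")
print(f"U = {statistic:.1f}, P = {p_value:.3f}")  # P < 0.05 is treated as statistically significant
```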
Results
Report generation performance evaluation result
The mean values of the ChatGPT-3.5 performance in medical reports for different metrics were as follows: S&O, 4.08 [95% confidence interval (CI): 3.99–4.17]; PTE, 4.08 (95% CI: 3.99–4.18); BI-RADS classification, 3.77 (95% CI: 3.64–3.90); MD, 3.86 (95% CI: 3.74–3.98); TR, 4.03 (95% CI: 3.93–4.14); C&C, 4.00 (95% CI: 3.89–4.10); UDAIA, 3.92 (95% CI: 3.81–4.03); and OE, 3.89 (95% CI: 3.77–4.00). The results can be found in Figure 2A. ChatGPT-3.5 exhibited remarkable performance in S&O, PTE, TR, and C&C, with scores approaching or surpassing 4. However, the scores for BI-RADS classification, MD, LWBP, and UDAIA were slightly lower, indicating areas in need of improvement. In summary, ChatGPT-3.5 achieved an OE score of 3.89, indicating that its performance was deemed generally acceptable. We employed a radar chart to exhibit the performance of various types of physicians and the AI system (see Figure 2B), which indicated that ChatGPT-3.5 has comparable performance to doctors in multiple aspects, particularly excelling in S&O, PTE, TR, and C&C. We conducted statistical analyses to compare the performance of ChatGPT-3.5 with that of doctors. The Mann-Whitney test indicated that the differences between the AI and doctors were statistically significant for BI-RADS classification (P=0.028) and MD (P=0.033). These findings suggest that while ChatGPT-3.5 performs well, there are certain areas in which expertise still outperforms AI.
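The paper does not state how the 95% CIs were derived; the sketch below shows one conventional approach under that assumption (a t-based interval around the mean rating), using hypothetical ratings.

```python
# Assumed method for illustration only: mean rating with a t-based 95% confidence interval.
import numpy as np
from scipy import stats

ratings = np.array([4, 5, 4, 3, 4, 5, 4, 4, 3, 5])  # hypothetical ratings for one criterion

mean = ratings.mean()
ci_low, ci_high = stats.t.interval(0.95, df=len(ratings) - 1, loc=mean, scale=stats.sem(ratings))
print(f"mean = {mean:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```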
Figure 2 Assessment of report quality and accuracy by physicians at various experience levels and by AI. (A) Distribution of ratings for accuracy and additional evaluation criteria among reports from JD, ID, SD, the collective Dr group, and ChatGPT (AI). (B) Radar chart showing average ratings for evaluation metrics based on varying experience levels of doctor- and AI-generated reports. JD, junior doctor; AI, artificial intelligence; ID, intermediate doctor; SD, senior doctor; Dr, doctor; CI, confidence interval; BI-RADS, Breast Imaging Reporting and Data System; PTE, professional terminology and expression; S&O, structure and organization; OE, overall evaluation; UDAIA, ultrasound doctor AI acceptance; LWBP, likelihood of being written by physician; C&C, clarity and comprehensibility; TR, treatment recommendations; MD, malignancy diagnosis.
Agreement analysis
To further evaluate the agreement between doctors and ChatGPT, we performed a Cohen kappa analysis. The Cohen kappa coefficient for BI-RADS classification was 0.68, indicating substantial agreement between the AI and physicians. This suggests that while there are discrepancies, the AI-generated reports are generally in alignment with those written by doctors.
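For readers who want to reproduce an agreement analysis of this kind, the sketch below computes Cohen's kappa between two sets of BI-RADS categories using scikit-learn; the labels are hypothetical and scikit-learn is assumed here for convenience, as the paper does not name the software used for this step.

```python
# Illustrative agreement analysis: Cohen's kappa on paired BI-RADS categories (hypothetical labels).
from sklearn.metrics import cohen_kappa_score

doctor_birads = ["3", "4a", "4b", "2", "4a", "5", "3", "4b", "4a", "2"]
ai_birads     = ["3", "4a", "4a", "2", "4b", "5", "3", "4b", "4a", "3"]

kappa = cohen_kappa_score(doctor_birads, ai_birads)
print(f"Cohen's kappa = {kappa:.2f}")  # roughly 0.61-0.80 is conventionally read as substantial agreement
```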
Turing test results
We used comparative bar charts and pie charts to evaluate the distinctions between AI-generated reports and those authored by human doctors (with a score of 5 representing a high likelihood of being human written and a score of 1 denoting a very low likelihood). The proportion of doctor-written reports that garnered a score of 5 was 33.70%, whereas AI-generated reports exhibited a marginally higher proportion in this category at 35.34%. This observation suggests that AI-generated reports convincingly approximate the characteristics of reports composed by medical professionals.
Reproducibility analysis
The results (Figure 3) indicated consistent performance across the various evaluation criteria. The key mean scores for both experiments included those for S&O (4.12 and 4.07; P=0.59), PTE (4.18 and 4.00; P=0.19), and C&C (4.09 and 3.63; P=0.048). The AI-generated medical reports showed consistent performance throughout the experiments, with high mean scores being maintained for most criteria. Although some variations were observed in specific areas, such as in BI-RADS classification (3.88 and 3.37; P=0.06) and MD (3.91 and 3.40; P=0.047), the overall performance of the AI in generating medical reports remains promising. The consistency in scores across most of the evaluation criteria warrants further investigation into the potential applications and development of AI-generated medical reports.
Figure 3 Boxplot illustrating the score distribution of ChatGPT-generated reports for the same patient across various time intervals. The differences in P values are as follows: S&O, 0.586; PTE, 0.195; BI-RADS classification, 0.058; MD, 0.047; TR, 0.067; C&C, 0.049; LWBP, 0.093; UDAIA, 0.044; and OE, 0.016. S&O, structure and organization; PTE, professional terminology and expression; BI-RADS, Breast Imaging Reporting and Data System; MD, malignancy diagnosis; TR, treatment recommendation; C&C, clarity and comprehensibility; LWBP, likelihood of being written by physician; UDAIA, ultrasound doctor artificial intelligence acceptance; OE, overall evaluation.
Erroneous report analysis
The following is a summary of results for the erroneous reports generated by ChatGPT and reviewed by clinical doctors. For incorrect diagnoses, cases with low scores (score 1–2) indicated errors in distinguishing between benign and malignant diagnoses. For example, a pathologically benign case was classified as BI-RADS 4b, even though we defined category 4a and above as indicating malignancy.
In terms of inconsistencies, the BI-RADS classification did not always correspond with the appropriate clinical recommendations, leading to inconsistencies in the generated report’s content. For instance, a report indicated a benign diagnosis but suggested a biopsy. This inconsistency may be due to the model’s inability to fully understand the context and relationships between different sections of the report. Regarding overdiagnosis, some benign lesions were assigned overly suspicious classifications.
CoT visualization
The visualization results in Figure 4 depict the key steps and considerations in the decision-making process of the ChatGPT model. First, the model extracts crucial information from the patient’s ultrasound reports, such as breast echogenicity, presence or absence of masses and abnormal blood flow in the breasts, characteristics of any nodules found, and axillary lymph node status. Next, with these data, the model calculates the BI-RADS score, a crucial metric in assessing breast cancer. The calculation involves an evaluation of the breast echogenicity, structural disorder, presence or absence of masses and abnormal blood flow, characteristics of nodules, and lymph node status. The model then integrates the calculated BI-RADS score into an overall risk assessment; this step is not merely an evaluation of individual parameters but a combined estimate of the likelihood of cancer. Finally, based on the above information and the resulting diagnosis, the model synthesizes all of this information to suggest a suitable treatment. This implies that the final suggestion is not solely dependent on a single parameter or result but is a comprehensive consideration of the risk level of breast cancer. Our visualization chart provides a clear and explicit representation of this process, enabling us to better understand the decision-making logic of the model in the diagnostic and treatment suggestion process. Key nodes in the model’s thought chain, such as the BI-RADS score and nodule characteristics, are clearly highlighted. This research offers insight into the cognitive processes underlying the decision-making framework of the ChatGPT model in the diagnosis and recommendation of therapeutic interventions for breast cancer.
Figure 4 Visualization of the CoT for breast cancer diagnosis and treatment suggestions. This CoT consists of several steps. The “Extract Data ()” function extracts essential patient information from ultrasound reports. The “BI-RADS score calculation” operation evaluates the breast lesions according to the BI-RADS based on the extracted information. Finally, the “Treatment recommendations” function suggests what treatment might be advisable based on the matched results. BI-RADS, Breast Imaging Reporting and Data System; CoT, chain of thought.
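To make the chain in Figure 4 concrete, the sketch below mirrors its three steps with toy Python functions; the descriptor names, keyword checks, and BI-RADS mapping rules are illustrative placeholders and do not represent the model's internal logic or clinical guidance.

```python
# Toy decomposition of the CoT in Figure 4: extract data -> BI-RADS category -> treatment suggestion.
from dataclasses import dataclass

@dataclass
class LesionFindings:
    irregular_margin: bool
    abnormal_blood_flow: bool
    suspicious_lymph_node: bool

def extract_data(report_text: str) -> LesionFindings:
    """Step 1: pull key descriptors from the report (naive keyword checks for illustration)."""
    text = report_text.lower()
    return LesionFindings(
        irregular_margin="irregular margin" in text,
        abnormal_blood_flow="blood flow" in text,
        suspicious_lymph_node="lymph node" in text,
    )

def birads_category(findings: LesionFindings) -> str:
    """Step 2: map descriptors to a BI-RADS category (toy rule, not clinical guidance)."""
    n_suspicious = sum([findings.irregular_margin,
                        findings.abnormal_blood_flow,
                        findings.suspicious_lymph_node])
    return {0: "3", 1: "4a", 2: "4b"}.get(n_suspicious, "4c")

def treatment_recommendation(category: str) -> str:
    """Step 3: suggest follow-up or biopsy based on the category (toy rule)."""
    return "Routine follow-up" if category in ("2", "3") else "Core needle biopsy recommended"

findings = extract_data("Hypoechoic nodule with irregular margin and internal blood flow signal.")
category = birads_category(findings)
print(f"BI-RADS {category}: {treatment_recommendation(category)}")
```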
Discussion
In this study, we assessed ChatGPT’s performance in generating breast cancer diagnosis reports, concentrating on report scoring, quality comparisons among doctors with varying experience levels, Turing test outcomes, reproducibility analysis, and erroneous report examination. Our findings offer valuable insights into ChatGPT’s present capabilities, highlighting potential areas for improvement and practical applications within the medical domain.
Report performance
The evaluation of ChatGPT-3.5’s performance in generating medical reports, based on metrics such as S&O, PTE, TR, and C&C, yielded promising results. With mean scores around or above 4, the AI demonstrated potential for producing high-quality reports comparable to those written by radiologists. Literature also supports the promise of automated systems in medical documentation (26,27).
However, the AI’s performance in BI-RADS classification, MD, and UDAIA needs improvement. This aligns with the findings of Pang et al. (28), who noted issues in AI’s accuracy in specific medical classification tasks, and of Zhou et al. (29), who identified challenges in complex decision support and multidimensional data analysis. Thus, while ChatGPT-3.5 excels in various areas, further development is needed for comprehensive and accurate performance.
The radar chart comparison (Figure 2B) between different doctors and the AI highlights its potential in medical report generation. The literature suggests AI’s significant promise in assisting healthcare professionals (30), supporting these findings.
Turing test and reproducibility
The Turing test results provide valuable insights into the AI’s ability to emulate reports written by human physicians. Figure 5 shows that 35.34% of AI-generated reports achieved a score of 5, indicating a high likelihood of being perceived as human written. This slightly surpassed the 33.70% for doctor-authored reports, suggesting that AI-generated reports can closely resemble doctor-authored reports and sometimes even surpass them in perceived authenticity. Thus, AI could streamline the medical reporting process, reduce healthcare professionals’ workload, and allow more time for patient care (30-32).
Figure 5 Evaluation of the perceived human authorship of reports created by physicians at various experience levels and those generated by AI. [1, extremely unlikely; 2, somewhat unlikely; 3, moderately likely; 4, likely; 5, extremely likely (to be human written)]. (A) A histogram showing the probability distribution of reports evaluated by the Dr group. (B) A pie chart showing the distribution of Turing test scores for reports authored by AI and doctors. AI, artificial intelligence; Dr, doctor.
The reproducibility experiment results (Figure 3) further demonstrated the consistency and reliability of AI-generated reports across multiple evaluation criteria. The AI’s performance showed high consistency in S&O, PTE, and C&C, underscoring its potential for maintaining high quality in medical reporting (31,32).
However, the inconsistency in BI-RADS classification and MD still needs to be addressed. Previous studies have noted similar challenges, with AI systems exhibiting variability in accuracy for certain medical tasks (32,33). Resolving these issues will improve AI-generated report quality and bolster healthcare professionals’ confidence in them for decision-making.
Erroneous reports
The results of erroneous reports indicated several shortcomings in ChatGPT’s handling of medical texts. First, the inconsistencies suggest that the model struggles with long-text comprehension, leading to context and relationship discrepancies within reports (34,35). Improving attention mechanisms could mitigate these issues. Second, ChatGPT’s reliance on physician descriptions without independent image analysis resulted in overdiagnosis in benign cases. Integrating computer vision techniques could improve diagnostic accuracy (36).
Finally, the model often overlooked details in cases with multiple lesions, focusing on high-malignancy descriptions and missing others. Enhancing multisource information processing could address this flaw (37).
CoT
The interpretability of AI models is crucial in healthcare, as it allows doctors and patients to understand and trust AI decisions, significantly improving patient outcomes. The CoT concept helps trace the AI’s thought process, identifying potential weaknesses and biases, thereby enhancing performance and building user trust. This is vital for the integration of AI into healthcare settings (38,39).
Explainability involves understanding why the model makes a given classification, such as assigning a BI-RADS score. This requires evaluating features such as nodule size, shape, margins, and microcalcifications to provide a clear rationale behind recommendations. Previous studies emphasize the importance of explainability in AI for healthcare, highlighting its role in improving trust and acceptance among users (40,41).
Limitations and future work
ChatGPT still has several limitations in the medical context. First, the model’s inability to analyze images directly, relying solely on physician-provided text, suggests there is a need to integrate computer vision techniques (42-44). Second, longer texts can lead to inconsistencies, and addressing this requires improving the comprehension and generation of longer texts. Third, specialized fields such as ultrasound report analysis require more domain-specific knowledge. Future research should focus on incorporating expert knowledge and clinical guidelines (36,45). Fourth, potential biases should be considered, as physicians’ familiarity with AI-generated reports might influence their assessments. Ensuring trust and transparency involves robust validation processes, clear documentation, and human oversight (46,47). Finally, the generalizability of this study may be limited to breast cancer diagnosis. Further research should explore AI-generated reports in other medical domains (23,48).
Conclusions
The findings of this study support ChatGPT’s potential in analyzing breast ultrasound reports and providing diagnoses and TR. It exhibited strong performance across various evaluation criteria and convincingly emulated reports written by physicians. Moreover, the reproducibility results indicate a high level of consistency in essential aspects of medical reporting. However, the analysis of erroneous reports suggests that there are several areas where improvements are needed, including model understanding of context, image analysis, and the handling of multiple lesions. Furthermore, the visual dissection of the AI’s CoT provides invaluable insights into the decision-making process, highlighting the importance of model interpretability for enhancing performance, building user trust, and effectively integrating AI into healthcare environments.
Acknowledgments
We express our sincere gratitude to the 57 medical professionals from various institutions who participated in this study.
Funding: This study was supported by
Footnote
Reporting Checklist: The authors have completed the STROBE reporting checklist. Available at https://qims.amegroups.com/article/view/10.21037/qims-24-141/rc
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://qims.amegroups.com/article/view/10.21037/qims-24-141/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013) and ethical approval for this study was obtained from the Medical Ethics Committee of Beijing Friendship Hospital, Capital Medical University (No. 2022-P2-060-01). The requirement for individual consent was waived due to the retrospective nature of the analysis.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2018;68:394-424. [Crossref] [PubMed]
Madjar H. Role of Breast Ultrasound for the Detection and Differentiation of Breast Lesions. Breast Care (Basel) 2010;5:109-14. [Crossref] [PubMed]
Amuasi AA, Acheampong AO, Anarfi E, Sagoe ES, Poku RD, Abu-Sakyi J. Effect of malocclusion on quality of life among persons aged 7-25 years: A cross-sectional study. J Biosci Med (Irvine) 2020;8:26-35.
Elmore JG, Longton GM, Carney PA, Geller BM, Onega T, Tosteson AN, Nelson HD, Pepe MS, Allison KH, Schnitt SJ, O'Malley FP, Weaver DL. Diagnostic concordance among pathologists interpreting breast biopsy specimens. JAMA 2015;313:1122-32. [Crossref] [PubMed]
Silver D, Huang A, Maddison CJ, Guez A, Sifre L, van den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, Dieleman S, Grewe D, Nham J, Kalchbrenner N, Sutskever I, Lillicrap T, Leach M, Kavukcuoglu K, Graepel T, Hassabis D. Mastering the game of Go with deep neural networks and tree search. Nature 2016;529:484-9. [Crossref] [PubMed]
Brown TB, Mann B, Ryder N, Subbiah M, Amodei D. Language Models are Few-Shot Learners. Part of Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020.
Mallio CA, Bernetti C, Sertorio AC, Zobel BB. ChatGPT in radiology structured reporting: analysis of ChatGPT-3.5 Turbo and GPT-4 in reducing word count and recalling findings. Quant Imaging Med Surg 2024;14:2096-102. [Crossref] [PubMed]
Wang Z, Zhang Z, Traverso A, Dekker A, Qian L, Sun P. Assessing the role of GPT-4 in thyroid ultrasound diagnosis and treatment recommendations: enhancing interpretability with a chain of thought approach. Quant Imaging Med Surg 2024;14:1602-15. [Crossref] [PubMed]
Ferrucci D, Brown E, Chu-Carroll J, Fan J, Gondek D, Kalyanpur AA, Lally A, Murdock JW, Nyberg E, Prager J, Schlaefer N, Welty C. Building Watson: An Overview of the DeepQA Project. AI Magazine 2010;31:59-79.
Schrittwieser J, Antonoglou I, Hubert T, Simonyan K, Sifre L, Schmitt S, Guez A, Lockhart E, Hassabis D, Graepel T, Lillicrap T, Silver D. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 2020;588:604-9. [Crossref] [PubMed]
Wartman SA, Combs CD. Reimagining Medical Education in the Age of AI. AMA J Ethics 2019;21:E146-52. [Crossref] [PubMed]
Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, Venugopalan S, Widner K, Madams T, Cuadros J, Kim R, Raman R, Nelson PC, Mega JL, Webster DR. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA 2016;316:2402-10. [Crossref] [PubMed]
Finlayson SG, Bowers JD, Ito J, Zittrain JL, Beam AL, Kohane IS. Adversarial attacks on medical machine learning. Science 2019;363:1287-9. [Crossref] [PubMed]
Castelvecchi D. Can we open the black box of AI? Nature 2016;538:20-3. [Crossref] [PubMed]
Lipton ZC. The Mythos of Model Interpretability: In machine learning, the concept of interpretability is both important and slippery. Queue 2018;16:31-57.
Wei J, Wang X, Schuurmans D, Bosma M, Ichter B, Xia F, Chi E, Le QV, Zhou D. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. Part of Advances in Neural Information Processing Systems 35 (NeurIPS 2022) Main Conference Track, 2022.
Radford A, Narasimhan K, Salimans T, Sutskever I. Improving language understanding by generative pre-training. 2018.
Jiang F, Jiang Y, Zhi H, Dong Y, Li H, Ma S, Wang Y, Dong Q, Shen H, Wang Y. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol 2017;2:230-43. [Crossref] [PubMed]
Turing AM. Computing Machinery and Intelligence. In: Epstein R, Roberts G, Beber G. editors. Parsing the Turing Test. Springer, Dordrecht, 2009:23-65.
Bengio Y, Léonard N, Courville A. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation. arXiv:1308.3432, 2013.
Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-10.
Graber ML, Franklin N, Gordon R. Diagnostic error in internal medicine. Arch Intern Med 2005;165:1493-9. [Crossref] [PubMed]
Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med 2019;25:44-56. [Crossref] [PubMed]
McKnight PE, Najab J. Mann-Whitney U Test. The Corsini Encyclopedia of Psychology, 2010.
Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, et al. Author Correction: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods 2020;17:352. [Crossref] [PubMed]
Najjar R. Redefining Radiology: A Review of Artificial Intelligence Integration in Medical Imaging. Diagnostics (Basel) 2023;13:2760. [Crossref] [PubMed]
Ahmed SB, Solis-Oba R, Ilie L. Explainable-AI in Automated Medical Report Generation Using Chest X-ray Images. Appl Sci 2022;12:11750.
Pang T, Li P, Zhao L. A survey on automatic generation of medical imaging reports based on deep learning. Biomed Eng Online 2023;22:48. [Crossref] [PubMed]
Zhou Y, Wang B, He X, Cui S, Shao L. DR-GAN: Conditional Generative Adversarial Network for Fine-Grained Lesion Synthesis on Diabetic Retinopathy Images. IEEE J Biomed Health Inform 2022;26:56-66. [Crossref] [PubMed]
Acosta JN, Falcone GJ, Rajpurkar P, Topol EJ. Multimodal biomedical AI. Nat Med 2022;28:1773-84. [Crossref] [PubMed]
Lundervold AS, Lundervold A. An overview of deep learning in medical imaging focusing on MRI. Z Med Phys 2019;29:102-27. [Crossref] [PubMed]
Shen D, Wu G, Suk HI. Deep Learning in Medical Image Analysis. Annu Rev Biomed Eng 2017;19:221-48. [Crossref] [PubMed]
Jha S, Topol EJ. Adapting to Artificial Intelligence: Radiologists and Pathologists as Information Specialists. JAMA 2016;316:2353-4. [Crossref] [PubMed]
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. Attention is all you need. Advances in Neural Information Processing Systems 2017;30:5998-6008.
Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805, 2019.
Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Langlotz C, Shpanskaya K. CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning. arXiv:1711.05225, 2017.
Ardila D, Kiraly AP, Bharadwaj S, Choi B, Reicher JJ, Peng L, Tse D, Etemadi M, Ye W, Corrado G, Naidich DP, Shetty S. Author Correction: End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat Med 2019;25:1319. [Crossref] [PubMed]
Frasca M, Torre DL, Pravettoni G, Cutica I. Explainable and interpretable artificial intelligence in medicine: a systematic bibliometric review. Discov Artif Intell 2024;4:15.
Amann J, Blasimme A, Vayena E, Frey D, Madai VI. Precise4Q consortium. Explainability for artificial intelligence in healthcare: a multidisciplinary perspective. BMC Med Inform Decis Mak 2020;20:310. [Crossref] [PubMed]
Chaddad A, Peng J, Xu J, Bouridane A. Survey of Explainable AI Techniques in Healthcare. Sensors (Basel) 2023;23:634. [Crossref] [PubMed]
Gerdes A. The role of explainability in AI-supported medical decision-making. Discov Artif Intell 2024;4:29.
Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, Cui C, Corrado G, Thrun S, Dean J. A guide to deep learning in healthcare. Nat Med 2019;25:24-9. [Crossref] [PubMed]
Sakamoto SI, Hutabarat Y, Owaki D, Hayashibe M. Ground Reaction Force and Moment Estimation through EMG Sensing Using Long Short-Term Memory Network during Posture Coordination. Cyborg Bionic Syst 2023;4:0016.
Zhan G, Wang W, Sun H, Hou Y, Feng L. Auto-CSC: A Transfer Learning Based Automatic Cell Segmentation and Count Framework. Cyborg Bionic Syst 2022;2022:9842349. [Crossref] [PubMed]
Tschandl P, Rinner C, Apalla Z, Argenziano G, Codella N, Halpern A, Janda M, Lallas A, Longo C, Malvehy J, Paoli J, Puig S, Rosendahl C, Soyer HP, Zalaudek I, Kittler H. Human-computer collaboration for skin cancer recognition. Nat Med 2020;26:1229-34. [Crossref] [PubMed]
Samek W, Wiegand T, Müller KR. Explainable Artificial Intelligence: Understanding, Visualizing and Interpreting Deep Learning Models. arXiv:1708.08296, 2017.
Holzinger A, Langs G, Denk H, Zatloukal K, Müller H. Causability and explainability of artificial intelligence in medicine. Wiley Interdiscip Rev Data Min Knowl Discov 2019;9:e1312. [Crossref] [PubMed]
Sun G, Zhou YH. AI in healthcare: navigating opportunities and challenges in digital communication. Front Digit Health 2023;5:1291132. [Crossref] [PubMed]
Cite this article as: Sun P, Qian L, Wang Z. Preliminary experiments on interpretable ChatGPT-assisted diagnosis for breast ultrasound radiologists. Quant Imaging Med Surg 2024;14(9):6601-6612. doi: 10.21037/qims-24-141
We thank QIMS for permission to reprint this article.
Original article:
https://qims.amegroups.org/article/view/128118/html
About the Journal
Quantitative Imaging in Medicine and Surgery
Aims and Scope
Quantitative Imaging in Medicine and Surgery (QIMS, Quant Imaging Med Surg, Print ISSN 2223-4292; Online ISSN 2223-4306) publishes peer-reviewed original reports and reviews in medical imaging, including X-ray, ultrasound, computed tomography, magnetic resonance imaging and spectroscopy, nuclear medicine and related modalities, and their application in medicine and surgery. While the focus is on clinical investigations, papers on medical physics, image processing, or biological studies which have apparent clinical relevance are also published. This journal encourages authors to look at medical images from a quantitative angle. This journal also publishes important topics on imaging-based epidemiology, and debates on research methodology, medical ethics, and medical training. Descriptive radiological studies of high clinical importance are published as well.
QIMS is an open-access, international, peer-reviewed journal published by AME Publishing Company. It was published quarterly from Dec. 2011 to Dec. 2012 and bimonthly from Feb. 2013 to Feb. 2018, has been published monthly since Mar. 2018, and is openly distributed worldwide.
QIMS is indexed in PubMed/PubMed Central, Scopus, Web of Science [Science Citation Index Expanded (SCIE)]. The latest impact factor is: 2.9.
Indexing
Quantitative Imaging in Medicine and Surgery is indexed and covered by
Web of Science [Science Citation Index Expanded (SCIE)]
PubMed
PubMed Central (PMC)
Google Scholar
Scopus
Information for Authors
QIMS is a member of the Committee on Publication Ethics (COPE) and follows the COPE guidelines and the ICMJE Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals.
Manuscripts submitted must be the original work of the author(s) and must not have been published previously or be under consideration for publication elsewhere.
Manuscripts Turnaround Time
First Editorial Decision: 3-5 days
Peer review: 1-2 months
Revision time: 2-4 weeks
Publication Ahead of Print: within 1 month after being accepted
Formal publication: within 1-3 months after being accepted. Original Articles are listed as priority.
QIMS’s position on case reports and review articles.
QIMS welcomes case reports in which quantitative imaging played a role in diagnosis and/or treatment; we also welcome first-time realizations (in animals or in human subjects) of a new imaging technique. These case reports are usually written in a concise, short letter format (see https://qims.amegroups.com/ for examples). Case reports of particular clinical importance are also published in a longer format; for these cases we expect important pathophysiological, diagnostic, or therapeutic implications. We do not publish case reports solely because of the rarity of the cases. Note that although we believe reporting case materials is important for the advancement of medicine, the space reserved for case reports in each issue remains limited. The decision to publish or not publish a case material submission can sometimes depend on the available space of the journal.
QIMS welcomes reviews and comments on published papers. Review papers should contain authors’ analytical appraisal of published papers and personal viewpoints, instead of a mere aggregation of published abstracts.
We expect review papers to take one of three forms: 1) expert reviews, usually published in editorial format, which provide the authors’ own insights and perspectives; 2) systematic reviews; and 3) educational reviews, including pictorial reviews. Systematic reviews (which may be narrative in writing but critical in nature) are particularly welcome.
Publication Schedule
Published quarterly from Dec. 2011 to Dec. 2012 and bimonthly from Feb. 2013 to Feb 2018, QIMS now follows a monthly publication model.
Open Access Statement
This journal is a peer reviewed, open access journal. All content of the journal is published under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0). All articles published open access will be immediately and permanently free for all to read, download, copy and distribute as defined by the applied license.
Free access and usage
Permitted third party reuse is defined by the CC BY-NC-ND 4.0 license. This license allows users to copy and distribute the article, provided:
the use is not for commercial purposes, and the article is not distributed if it has been changed or edited in any way;
the user gives appropriate credit (with a link to the formal publication through the relevant DOI) and provides a link to the license, but not in any way that implies the licensor endorses the user or the use of the work;
no derivatives (remixing, transforming, or building upon the material) are permitted for distribution.
The full details of the license are available at https://creativecommons.org/licenses/by-nc-nd/4.0/
Copyright
For open access publishing, this journal uses an exclusive licensing agreement. Authors will transfer copyright to QIMS, but will have the right to share their article in the same way permitted to third parties under the relevant user license, as well as certain scholarly usage rights.
For any inquiry/special circumstance on the copyright, commercial usage or adaptation of QIMS articles, please contact: permissions@amegroups.com
For reprint order, please contact: sales@amegroups.com
Support for Authors to Comply with Funding Body Mandates
We work with authors of research articles supported by funding bodies with open access mandates to ensure that authors can meet their funders’ requirements for public access to research results.
In addition, we offer further support for authors who are required to comply with funding body mandates, including but not limited to:
All articles published open access will be immediately and permanently free for everyone to read, download, copy and distribute. If a specific open access license is needed, please contact the editorial office for confirmation before submission. Example of statements in a published article: https://jgo.amegroups.org/article/view/76355/html.
No copyright is claimed for any work of the U.S. government. Example of statements in published articles: https://actr.amegroups.org/article/view/8875; https://med.amegroups.org/article/view/8358/html.
Editorial Office
Email: qims@amepc.org
Publisher Information
QIMS is published by AME Publishing Company.
Addresses:
Hong Kong branch office: Flat/RM C, 16F, Kings Wing Plaza 1, No. 3 On Kwan Street, Shatin, NT, Hong Kong, China.
Singapore branch office: Pico Creative Centre, 20 Kallang Ave #03-08, 339411 Singapore, Singapore.
Updated on July 19, 2024