JASBO: Jaya Average Subtraction Based Optimization with Deep Learning Model for Multi-Classification of Infectious Disease from Unstructured Data

Vian Sabeeh; Ahmed Bahaaulddin A. Alwahhab; Ali Abdulmunim Ibrahim Al-kharaz

doi:10.21123/bsj.2024.9184

Authors

Vian Sabeeh, Department of Information Technology, Technical College of Management/Baghdad, Middle Technical University, Baghdad, Iraq. https://orcid.org/0000-0002-0860-2335
Ahmed Bahaaulddin A. Alwahhab, Department of Information Technology, Technical College of Management/Baghdad, Middle Technical University, Baghdad, Iraq. https://orcid.org/0000-0003-0965-4812
Ali Abdulmunim Ibrahim Al-kharaz, Department of Information Technology, Technical College of Management/Baghdad, Middle Technical University, Baghdad, Iraq. https://orcid.org/0000-0002-7321-2296

DOI:

https://doi.org/10.21123/bsj.2024.9184

Keywords:

Average subtraction-based optimization algorithm, bidirectional memory network, convolutional neural network, disease symptom diagnosis network, Jaya algorithm.

Abstract

Epidemic diseases have become an unavoidable problem in our current environment, and because they share the same symptoms, diagnosing them and detecting them early is difficult. It has therefore become necessary to find a technique that classifies patients' diseases from their various symptoms. Medical documents are an important source that still needs innovative and reliable analysis methods to support the diagnosis of many diseases, so further effort is needed to enrich medical text processing for use in health informatics. For this reason, this research proposes JASBO, the Jaya Average Subtraction Based Optimization algorithm, which relies on deep learning and classifies infectious diseases into their categories from unstructured text data. The proposed disease diagnosis network, ID-NET, consists of a convolutional neural network (CNN), which identifies unusual words or words useful for diagnosis, together with a bidirectional long short-term memory network (BI-LSTM). JASBO is used to determine the filter size in the final classification network so that the parts of the text most indicative of the disease can be identified. The network first tokenizes the input text into words, which are then passed to a convolutional deep learning network. In addition, pattern features (the character-sequence order of the words) are extracted with the bidirectional memory network and converted into a feature matrix, which is passed to an attention layer that finds the character-sequence similarity between words using the Kumar-Hassebrook similarity measure. Based on the output of the JASBO network, the class of each word is predicted, that is, whether it indicates symptoms of a disease. The proposed disease diagnosis network achieved a precision of 91%, a recall of 88%, and an F-score of 90%.
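The abstract names the Kumar-Hassebrook similarity measure without stating it. As a minimal sketch, assuming the commonly documented form KH(p, q) = Σ pᵢqᵢ / (Σ pᵢ² + Σ qᵢ² − Σ pᵢqᵢ), the measure can be computed on two feature vectors as below. The function name `kumar_hassebrook` and the zero-vector convention are illustrative assumptions, not taken from the paper, and the sketch does not reproduce how ID-NET applies the measure inside its attention layer.

```python
import numpy as np


def kumar_hassebrook(p, q):
    """Kumar-Hassebrook similarity between two equal-length vectors.

    Assumed form: sum(p*q) / (sum(p^2) + sum(q^2) - sum(p*q)).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("p and q must have the same shape")

    dot = float(np.dot(p, q))
    denom = float(np.dot(p, p) + np.dot(q, q) - dot)
    if denom == 0.0:
        # Both vectors are all zeros; returning 0.0 is an assumed convention.
        return 0.0
    return dot / denom


if __name__ == "__main__":
    # Identical vectors give 1.0; orthogonal vectors give 0.0.
    print(kumar_hassebrook([1, 2, 3], [1, 2, 3]))
    print(kumar_hassebrook([1, 0], [0, 1]))
```

For real-valued vectors the denominator equals ‖p − q‖² + p·q, so two identical non-zero vectors score exactly 1 and the score drops as the vectors diverge.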

Received 05/06/2023,

Revised 12/09/2023,

Accepted 14/09/2023,

Published Online First 20/03/2024

References

Assale M, Dui LG, Cina A, Seveso A, Cabitza F. The revival of the notes field: leveraging the unstructured content in electronic health records. Front. Med. 2019 17;6:66. https://doi.org/10.3389/fmed.2019.00066.

Li I, Pan J, Goldwasser J, Verma N, Wong WP, Nuzumlalı MY, Rosand B, Li Y, Zhang M, Chang D, Taylor RA. Neural Natural Language Processing for unstructured data in electronic health records: A review. Comput. Sci. Rev. 2022; 46:100511. https://doi.org/10.1016/j.cosrev.2022.100511.

Ali F, El-Sappagh S, Islam SR, Kwak D, Ali A, Imran M, Kwak KS. A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Inf. Fusion. 2020; 63:208-22. https://doi.org/10.1016/j.inffus.2020.06.008

Yuan Q, Cai T, Hong C, Du M, Johnson BE, Lanuti M, Cai T, Christiani DC. Performance of a machine learning algorithm using electronic health record data to identify and estimate survival in a longitudinal cohort of patients with lung cancer. JAMA Netw. Open. 2021; 4(7):e2114723. https://doi.org/10.1001/jamanetworkopen.2021.14723

Arji G, Ahmadi H, Nilashi M, Rashid TA, Ahmed OH, Aljojo N, Zainol A. Fuzzy logic approach for infectious disease diagnosis: A methodical evaluation, literature and classification. BBE. 2019;39(4):937-55. https://doi.org/10.1016/j.bbe.2019.09.004

Moyo E, Mhango M, Moyo P, Dzinamarira T, Chitungo I, Murewanhema G. Emerging infectious disease outbreaks in Sub-Saharan Africa: Learning from the past and present to be better prepared for future outbreaks. Front. Public Health. 2023; 11:1049986. https://doi.org/10.3389/fpubh.2023.1049986

Bashir MF, Ma B, Shahzad L. A brief review of socio-economic and environmental impact of Covid-19. Air Qual. Atmos. Health. 2020; 13:1403-9. https://doi.org/10.1007/s11869-020-00894-8

Naz R, Gul A, Javed U, Urooj A, Amin S, Fatima Z. Etiology of acute viral respiratory infections common in Pakistan: A review. Rev. Med. Virol. 2019; 29(2):e2024. https://doi.org/10.1002/rmv.2024

Wang M, Wei Z, Jia M, Chen L, Ji H. Deep learning model for multi-classification of infectious diseases from unstructured electronic medical records. BMC Med. Inform. Decis. Mak. 2022; 22(1):1. https://doi.org/10.1186/s12911-022-01776-y

Luo X, Gandhi P, Zhang Z, Shao W, Han Z, Chandrasekaran V, Turzhitsky V, Bali V, Roberts AR, Metzger M, Baker J. Applying interpretable deep learning models to identify chronic cough patients using EHR data. Comput. Methods Programs Biomed. 2021; 210:106395. https://doi.org/10.1016/j.cmpb.2021.106395.

Vidhya K, Shanmugalakshmi R. Deep learning based big medical data analytic model for diabetes complication prediction. JAIHC. 2020; 11:5691-702. https://doi.org/10.1007/s12652-020-01930-2.

Wang SM, Chang YH, Kuo LC, Lai F, Chen YN, Yu FY, Chen CW, Li ZW, Chung Y. Using deep learning for automatic ICD-10 classification from free-text data. EJBI. 2020;16(1).

Zhao J, Yu L, Liu Z. Research based on multimodal deep feature fusion for the Auxiliary diagnosis model of Infectious Respiratory diseases. Sci. Program. 2021; 2021:1-6. https://doi.org/10.1155/2021/5576978.

Maheshwari V, Mahmood MR, Sravanthi S, Arivazhagan N, ParimalaGandhi A, Srihari K, Sagayaraj R, Udayakumar E, Natarajan Y, Bachanna P, Sundramurthy VP. Nanotechnology-based sensitive biosensors for COVID-19 prediction using fuzzy logic control. J. Nanomater. 2021; 2021:1-8. https://doi.org/10.1155/2021/3383146.

Venkataraman GR, Pineda AL, Bear Don’t Walk IV OJ, Zehnder AM, Ayyar S, Page RL, Bustamante CD, Rivas MA. FasTag: Automatic text classification of unstructured medical narratives. PLoS one. 2020; 15(6):e0234647. https://doi.org/10.1371/journal.pone.0234647.

Nagamine T, Gillette B, Pakhomov A, Kahoun J, Mayer H, Burghaus R, Lippert J, Saxena M. Multiscale classification of heart failure phenotypes by unsupervised clustering of unstructured electronic medical record data. Sci. Rep. 2020; 10(1):1-3. https://doi.org/10.1038/s41598-020-77286-6.

Ahmad A, Ullah A, Feng C, Khan M, Ashraf S, Adnan M, Nazir S, Khan HU. Towards an improved energy efficient and end-to-end secure protocol for IoT healthcare applications. Secur. Commun. Netw. 2020; 2020:1-0. https://doi.org/10.1155/2020/8867792.

Ashraf S, Ahmed T, Aslam Z, Muhammad D, Yahya A, Shuaeeb M. Depuration based Efficient Coverage Mechanism for Wireless Sensor Network. J. Electr. Comput. Eng. Innovations. 2020; 8(2):145-60. https://doi.org/10.22061/jecei.2020.6874.344.

Ashraf S, Saleem S, Chohan AH, Aslam Z, Raza A. Challenging strategic trends in green supply chain management. Int. J. Res. Eng. Appl. Sci. JREAS. 2020; 5(2):71-4. https://doi.org/10.46565/jreas.2020.v05i02.006

Dehghani M, Hubálovský Š, Trojovský P. A new optimization algorithm based on average and subtraction of the best and worst members of the population for solving various optimization problems. PeerJ Comput. Sci. 2022; 8:e910. https://doi.org/10.7717/peerj-cs.910.

Venkata Rao R. Jaya optimization algorithm and its variants. Jaya: An advanced optimization algorithm and its engineering applications. 2019:9-58. https://doi.org/10.1007/978-3-319-78922-4_2

MeDAL dataset. https://www.kaggle.com/datasets/xhlulu/medal-emnlp. Accessed January 2023.

Sugave S, Jagdale B. Monarch-EWA: Monarch-earthworm-based secure routing protocol in IoT. Comput J. 2020; 63(6):817-31. https://doi.org/10.1093/comjnl/bxz135.

Hasan AM, Qasim AF, Jalab HA, Ibrahim RW. Breast Cancer MRI Classification Based on Fractional Entropy Image Enhancement and Deep Feature Extraction. Baghdad Sci. J. 2022; 20(1):0221-234. https://doi.org/10.21123/bsj.2022.6782

Li W, Qi F, Tang M, Yu Z. Bidirectional LSTM with self-attention mechanism and multi-channel features for sentiment classification. Neurocomputing. 2020; 387:63-77. https://doi.org/10.1016/j.neucom.2020.01.006

Kumar-Hassebrook similarity measure. https://drostlab.github.io/philentropy/reference/distance.html.

Wang SH, Muhammad K, Hong J, Sangaiah AK, Zhang YD. Alcoholism identification via convolutional neural network based on parametric ReLU, dropout, and batch normalization. Neural Comput. Appl. 2020; 32:665-80. https://doi.org/10.1007/s00521-018-3924-0.

Cho M, Ha J, Park C, Park S. Combinatorial feature embedding based on CNN and LSTM for biomedical named entity recognition. J. Biomed. Inform. 2020; 103:103381. https://doi.org/10.1016/j.jbi.2020.103381.

Wotaifi TA, Dhannoon BN. An Effective Hybrid Deep Neural Network for Arabic Fake News Detection. Baghdad Sci. J. 2023;20(4): https://doi.org/10.21123/bsj.2023.7427

Harris CR, Millman KJ, Van Der Walt SJ, Gommers R, Virtanen P, Cournapeau D, Wieser E, Taylor J, Berg S, Smith NJ, Kern R. Array programming with NumPy. Nature. 2020; 585(7825):357-62. https://doi.org/10.1038/s41586-020-2649-2

Keras: Deep Learning for humans. [cited 2023 Aug 11]. https://keras.io/

Scikit-Learn: machine learning in Python — scikit-learn 1.3.0 documentation [Internet]. [cited 2023 Aug 11]. Available from: https://scikit-learn.org/stable/

Wen Z, Lu XH, Reddy S. MeDAL: medical abbreviation disambiguation dataset for natural language understanding pretraining. arXiv preprint arXiv:2012.13978. 2020. https://doi.org/10.48550/arXiv.2012.13978

Arabic title (translated): Diagnosing Multiple Epidemic Diseases from Unstructured Data Using the Jaya Algorithm to Optimize a Deep Learning Algorithm

Journal Info
Journal: Baghdad Science Journal
Publisher: College of Science for Women/ University of Baghdad
Baghdad Sci. J. is peer-reviewed and open access
Print ISSN: 2078-8665
Electronic ISSN: 2411-7986
Publishing Frequency: Quarterly (2004-2021), bi-monthly (from 2022), monthly (from 2024)
Launched Date: 2004
Abbreviation: Baghdad Sci.J.
Each published paper in Baghdad Sci. J. has a digital object identifier (DOI) number