Abstract 3
Introduction 4
Background 4
Aims and objectives of work 8
Structure 11
1 ANN Models and Cancer Detection 12
1.1 Artificial Neurons 12
1.2 Basic Elements of Artificial Neural Networks 14
1.3 Mathematical principles for training phase 16
1.4 Anomaly Detection with ANN model 20
2 Interpretive artificial intelligence for AI models with high-dimensional input 23
2.1 Shapley Value in Game Theory 24
2.2 Sampling Shapley Approach 26
2.3 Sampling based on Weighted Graphs 30
2.3.1 Pearson correlation coefficient 31
2.3.2 Biased Random Path Searching Method 33
2.3.3 Convergence Measurements 35
2.4 Summary of Sampling Method based on Weighted Graphs 37
3 Results and Analysis 39
3.1 Dataset description 39
3.2 Simulation Results for Sampling method based on Graph 40
3.3 Comparison with Original sampling method 43
3.4 Validation of Interpretation Results 50
4 Conclusion and Future works 51
4.1 Conclusion 51
4.2 Acknowledgment 53
References 55
With the increase in computing power, artificial intelligence has been widely applied in many areas of life, but at the same time more and more cases expose the uninterpretability of AI algorithms, which makes the interpretability of AI particularly important. In recent years, a growing number of scholars have proposed to study the interpretability of machine learning models through Shapley values from cooperative game theory, but when a dataset contains many features, computing Shapley values becomes a challenge. Several authors have introduced approximate Shapley calculation techniques; however, as the number of players increases, it remains difficult to strike a balance between sample size and time cost. The sampling method for calculating Shapley values draws random permutations of the features and estimates the contribution of each feature to the prediction from each sampled permutation, but in practice the random permutation process remains slow when the number of participants is large. We therefore propose a new approach based on a coalition of "high-impact" participants: Shapley values are calculated in less time, and a more meaningful way to measure the plausibility of the interpretation results is proposed.
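The permutation-sampling estimator referred to above can be summarised compactly. The following is a minimal sketch, assuming a scalar-valued model and a reference (baseline) vector used for "absent" features; the names model_predict, baseline, and n_samples are illustrative and are not taken from the thesis, and the code follows the general permutation-sampling scheme rather than the exact implementation developed in later chapters.

```python
import numpy as np

def sample_shapley(model_predict, x, baseline, n_samples=1000, seed=None):
    """Monte Carlo estimate of per-feature Shapley values by sampling
    random feature permutations and accumulating marginal contributions."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    phi = np.zeros(d)
    for _ in range(n_samples):
        perm = rng.permutation(d)
        z = baseline.copy()              # start with all features "absent"
        prev = model_predict(z)
        for j in perm:
            z[j] = x[j]                  # switch feature j to its true value
            curr = model_predict(z)
            phi[j] += curr - prev        # marginal contribution of feature j
            prev = curr
    return phi / n_samples

# Toy check: for a linear model with a zero baseline the estimates
# converge to the model coefficients.
# f = lambda v: float(v @ np.array([1.0, 2.0, 3.0]))
# sample_shapley(f, np.ones(3), np.zeros(3), n_samples=500)
```

With n features this requires on the order of n times n_samples model evaluations, which is precisely the trade-off between sample size and time cost mentioned above; restricting the permutations to a smaller coalition of high-impact participants is what allows the proposed approach to reduce this cost.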
At the conclusion of this thesis, I would like to express my gratitude to those who have supported and inspired me throughout this endeavour.
Professor Pertrosyan Ovanes, my supervisor, is the first person I wish to thank. Throughout the course of my research, he has assisted and guided me with patience and consideration. Not only did he provide me with valuable advice and direction, but he also deepened my understanding of the relevant knowledge and skills. His knowledge and experience have inspired my future studies and research and had a profound impact on me.
Second, I’d like to thank my mentor, Zou Jinying, whose guidance and direction in academic research, as well as his care and assistance in daily life, have made me feel like a member of a large family. With his assistance, I have gained many research skills and had many experiences I never anticipated I would have.
In addition, I would like to thank my family. I've always been determined to advance because of their consistent encouragement and support. Their selfless care and unwavering support give me the strength to move forward. I would like to express my heartfelt appreciation to my girlfriend, who has been an unwavering source of support and encouragement throughout my academic journey. Her selfless love and understanding have helped me overcome writing slumps and navigate the challenges of pursuing higher education. Without her constant support, I would not have had the motivation to keep moving forward. I am grateful for her presence in my life and will always cherish her unwavering support.
Finally, I would like to express my gratitude to everyone who assisted and supported me during my research for this paper.