Annotated Bibliography – 388

Pitch #1

Style transfer and Image manipulation

Given any photo, this project should be able to take any art movement, such as Picasso’s Cubism, and apply its style to the photo. The neural network first needs to detect all the subjects, separate them from the background, and learn the color scheme of the photo. Then the art movement datasets need to be analyzed to find patterns in the style, and that style needs to be applied to the initial photo. The final rendering of the photo could get computationally expensive; if that is the case, GPU hardware will be needed. Imaging libraries such as Pillow and scikit-image would be needed. It might be hard to find proper datasets, since only a limited number are available for each art movement. Alternatively, I could rid myself of the need for ready-made datasets by training the network to detect style patterns from unlabeled painting scans.

Source #1:

Chen, D. Yuan, L. Liao, J. Yu, N. Hua, G. 2017. StyleBank: An Explicit Representation for Neural Image Style Transfer. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1897-1906.


This paper uses a network that successfully differentiates between the content and the style of each image. I’ve had doubts about this project being two separate tasks: 1- understanding what subjects are in the photo (content), and 2- understanding the color scheme and style used for these subjects (style). This paper’s approach manages to tackle both. The conclusion mentions what wasn’t explored, which could be an interesting new direction and a good starting point for my work. Many successful and reputable works are used for comparison in this paper (including source #2 below), and overall this paper is a gold mine of datasets (including the Microsoft COCO dataset it mentions). It might be hard to find a way to evaluate my final results for this pitch; however, this paper does an excellent job of evaluation simply by comparing its final images with the final images of previous work.

Source #2:

Gatys, L.A. Ecker, A.S. Bethge, M. 2016. Image Style Transfer Using Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2414-2423.


Gatys et al. start with an excellent summary and comparison of previous work in the introduction. They propose a new algorithm aimed at overcoming the challenges of separating the content and style of an image. This paper is one of the most famous and popular approaches to style transfer and has been cited many times. The filters used in this paper to detect content can be very useful in my project, since the approach focuses on the higher layers of the convolutional network for content detection, which are not exact pixel recreations of the original picture. Eliminating the exact pixel representation of the image could also reduce the need for computationally expensive code.
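To make the content/style split concrete for myself, here is a minimal sketch of the two losses as I understand the general idea: content compares feature maps position by position, while style compares Gram matrices (correlations between filter responses). The function names, the normalization by the number of positions, and the random arrays standing in for real CNN activations are all my own assumptions, not the paper's code.

```python
import numpy as np

def gram_matrix(features):
    """features: array of shape (channels, height, width) from one conv layer."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)      # one row per filter response
    return flat @ flat.T / (h * w)         # channel-by-channel correlations

def content_loss(gen_features, content_features):
    # Compares feature maps directly, so spatial layout matters.
    return 0.5 * np.sum((gen_features - content_features) ** 2)

def style_loss(gen_features, style_features):
    # Compares Gram matrices, so only filter co-occurrence statistics matter.
    diff = gram_matrix(gen_features) - gram_matrix(style_features)
    return np.sum(diff ** 2)

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 8))     # stand-in for real CNN activations
# Shuffle spatial positions identically across channels: layout changes,
# but the channel correlations stay the same.
shuffled = feats.reshape(4, -1)[:, rng.permutation(64)].reshape(4, 8, 8)

print(content_loss(feats, shuffled))       # large: the layout is different
print(style_loss(feats, shuffled))         # about zero: same filter statistics
```

The shuffled example is only there to show why the Gram matrix ignores where things are in the image, which is what lets style be treated separately from content.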

Source #3:

Arpa, S. Bulbul, A. Çapin, T.K. Özgüç, B. 2012. Perceptual 3D rendering based on principles of analytical cubism. Computers & Graphics, 6. 991-1004.


This paper gives us a different perspective than the other resources, since it analyzes what exactly constitutes the cubism model by analyzing cubism in a 3D rendered environment. The approach uses three different sets of rules and patterns to create cubist results: 1- Faceting, 2- Ambiguity and 3- Discontinuity. If I can successfully apply these three patterns in a 2D environment, then I will have decreased the number of filters that need to be applied to each photo down to only three. Faceting seems to be the most important part of cubism, as different facet sizes can actually create both ambiguity and discontinuity. The main problem with this paper is that there is not much other work to compare it to, and it seems the only way to evaluate the final work is by observation and our own judgement, even though the paper includes some art critics’ input.

Source #4:

Lian, G. Zhang, K. 2020. Transformation of portraits to Picasso’s cubism style. The Visual Computer, 36. 799–807.


This paper starts off by mentioning that the Gatys et al. (source #2) approach, however successful, is slow and memory consuming, and it claims to improve the method to be more efficient. The problem with this paper is that even though the algorithm is a lot more efficient and quicker than Gatys et al., it only applies a single style and it only applies that style to portraits. Gatys et al., however memory consuming, does a great job of working with any image with any content. So my goal would be to expand on this paper’s efficient approach so that it works with a wider range of image content and styles while still being less computationally expensive than the famous Gatys et al. approach.

Pitch #2

Image manipulation detection

A neural network would be trained to detect image manipulation in a given photo. There are many ways to achieve this, including but not limited to image noise analysis. Different algorithms can be compared to see which does the best manipulation detection or which is better automated by the training process.

Python libraries such as Keras and sklearn will be used for the neural network and the deep learning. Many previous research papers and datasets are available for this project or similar ones.

Source #1:

Zhou, P. Han, X. Morariu, V.I. Davis, L.S. 2018. Learning Rich Features for Image Manipulation Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1053-1061.


Strong paper with a strong list of citations. Separate convolutions for RGB and noise level seem to work together very well. The authors describe how they had trouble finding good enough datasets and how they overcame this problem by creating a pre-trained model from the Microsoft COCO dataset. The noise stream approach is especially interesting. The filters used in the convolutions are given, so they can be used directly or slightly modified for comparison and perhaps enhancement of the method. The results section of this paper, unlike those in the previous pitch, is very thorough and shows how the RGB and noise level approaches complement each other. My goal would be to enhance the noise model until it can identify manipulation independently of the RGB stream. The noise level detection seems to be almost complete on its own, and stripping out the RGB stream and the comparison between the two would greatly decrease the computational resources required by the method.
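As a note to self on what a noise stream input could look like, here is a small sketch that high-pass filters a grayscale image so smooth content cancels out and only local inconsistencies remain. The 3x3 kernel is a generic second-order high-pass filter I picked as a stand-in; it is an assumption and not necessarily one of the filters given in the paper, and the toy image values are made up.

```python
import numpy as np

# Illustrative second-order high-pass kernel (weights sum to zero).
# Assumption: a stand-in, not necessarily a filter from the paper.
HIGH_PASS = np.array([[-1,  2, -1],
                      [ 2, -4,  2],
                      [-1,  2, -1]], dtype=float) / 4.0

def noise_residual(gray):
    """Filter a 2D grayscale array with HIGH_PASS ('valid' region only).

    The kernel is symmetric, so correlation and convolution coincide.
    """
    kh, kw = HIGH_PASS.shape
    out_h = gray.shape[0] - kh + 1
    out_w = gray.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for dy in range(kh):
        for dx in range(kw):
            out += HIGH_PASS[dy, dx] * gray[dy:dy + out_h, dx:dx + out_w]
    return out

flat = np.full((6, 6), 100.0)      # smooth region: content cancels out
spliced = flat.copy()
spliced[3, 3] += 8.0               # hypothetical local inconsistency

print(np.abs(noise_residual(flat)).max())     # no residual on smooth content
print(np.abs(noise_residual(spliced)).max())  # the inconsistency stands out
```

In a real pipeline the residual map, rather than the raw pixels, would be what the noise stream's convolutions see.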

Source #2:

Bayram, S. Avcibas, I. Sankur, B. Memon, N. 2006. Image manipulation detection. J. Electronic Imaging, 15.


This paper is a lot more technical than Zhou et al. (source #1). Source #1 seems to be more focused on whether content was added to or removed from an image, while this paper also considers cases where the original photo keeps the same content but is slightly manipulated. Very strong results along with thorough graphs are included for comparison between methods. Five different forensic approaches are compared in this paper, and all five need to be researched separately before the actual method can be applied (or only the best performing one in the results section of the paper). My goal would be to find what makes each approach do better than the others, take only the best parts of each, and mix them together.

Source #3:

Bayar, B. Stamm, M.C. 2016. A Deep Learning Approach to Universal Image Manipulation Detection Using a New Convolutional Layer. In Proceedings of the 4th ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec ’16). Association for Computing Machinery, New York, NY, USA, 5–10.

This paper does a great analysis of what the different convolutional neural network methods are and which ones could be used for maximum optimization in image manipulation and forgery detection. The dataset used is a massive collection of more than 80000 images (all publicly available on their GitHub page), which would be very helpful for my project if I were to choose this pitch. Both the binary classification and the multi-class classification approaches get very impressive results, with binary classification reaching a minimum of 99.31% accuracy. This paper’s unique approach is to remove the need for detecting the content of the image and to attempt to detect manipulation directly, with no pre-processing.
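From my reading, the "new convolutional layer" in the title works by constraining each first-layer filter so that the center weight is -1 and the remaining weights sum to 1, which makes the filter output a prediction error rather than image content. The sketch below shows only that projection step; the function name, the 5x5 size and the positive random initialization are my own assumptions, and it assumes the off-center weights do not sum to zero.

```python
import numpy as np

def project_constrained(kernel):
    """Force the center weight to -1 and the remaining weights to sum to 1.

    Assumes the off-center weights have a nonzero sum.
    """
    k = kernel.astype(float).copy()
    cy, cx = k.shape[0] // 2, k.shape[1] // 2
    k[cy, cx] = 0.0
    k /= k.sum()          # off-center weights now sum to 1
    k[cy, cx] = -1.0      # center weight subtracts the actual pixel value
    return k

rng = np.random.default_rng(1)
raw = rng.uniform(0.1, 1.0, size=(5, 5))   # hypothetical filter after a training step
constrained = project_constrained(raw)

print(constrained[2, 2])      # center weight
print(constrained.sum())      # whole filter sums to about zero
```

Because the whole filter sums to zero, flat image content is suppressed, which fits the paper's goal of skipping content detection and pre-processing.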

Pitch #3

Radiology disease detection

Trained neural networks for detecting radiology abnormalities and diseases have reached a level that can compete with human radiologists. For this project I will be using neural network libraries to detect different abnormalities. There are many different fields this can apply to, such as brain tumor detection, breast cancer detection, endoscopy, colonoscopy, CT/MRI, oncology, etc. There are countless datasets and approaches available for a project such as this one, which gives me the opportunity to compare and contrast different variations of them. This is a very rich field with a lot of previous work and published papers to explore.

Source #1:

Liu, Y. Kohlberger, T. Norouzi, M. Dahl, G.E. Smith, J.L. Mohtashamian, A. Olson, N. Peng, L.H. Hipp, J.D. Stumpe, M.C. 2019. Artificial Intelligence–Based Breast Cancer Nodal Metastasis Detection: Insights Into the Black Box for Pathologists. Arch Pathol Lab Med, 143 (7). 859–868.


A lot of medical terms are used in this paper, and looking them up helped me understand the field better. Examples are: LYNA, sentinel lymph node, metastatic, etc. This paper is different from the other sources, since the sources below (especially the brain tumor one, source #3) are all focused on finding a tumor in a specific place, while this paper is focused on finding out whether cancer has spread to other places such as lymph nodes (that’s what metastatic means). Detecting whether cancer has spread seems to me a better real-life application of deep learning in radiology and might affect the final version of this pitch. More than one good dataset is mentioned in this paper that can be used in this pitch, including the Camelyon16 dataset from 399 patients. Additionally, there is a newer Camelyon17 dataset available. This paper comes incredibly close to the Camelyon16 winner, which could also be a good source to check out. It has a strong list of references, which includes papers about using deep networks to detect skin cancer. Some of the authors of this paper have other papers that contribute to the field even further. (Interesting fact: somehow a lot of the authors of this paper and of the ones in the references list are from my country.)

Source #2:

Manogaran, G. Shakeel, P. M. Hassanein, A. S. Kumar, P.M. Babu G.C. 2019. Machine Learning Approach-Based Gamma Distribution for Brain Tumor Detection and Data Sample Imbalance Analysis. IEEE Access, 7. 12-19.


This paper uses a dataset from OpenfMRI, which has since turned into a newer repository. Both the old and the new datasets are very large and would be very useful for this pitch. This paper also includes two different algorithms that help find the location of the tumor in each radiology image. It is important to remember that ROI in this paper doesn’t mean Return On Investment; it means Region of Interest, which is the region where the neural network has detected the tumor to be located. The orthogonal gamma distribution model seems to play a big role in the ROI detection in this paper (it is in the title), which makes this approach unique, as it gives the algorithm the capability to self-identify the ROI. This automation is the winning achievement of this paper.

Source #3:

Logeswari, T. Karnan, M. 2010. An improved implementation of brain tumor detection using segmentation based on soft computing. Journal of Cancer Research and Experimental Oncology, 2 (1). 6-14.


This paper proposes a hierarchical self-organizing map (HSOM) for MRI segmentation and claims that this approach improves on the traditional self-organizing map (SOM) method. The winning neuron method is a simple yet elegant method that I can utilize in my application of MRI reading and detection. This paper is relatively old compared to the other sources and doesn’t include any significant dataset; however, the simple winning neuron algorithm is a good starting point that can be expanded on with better segmentation methods, better model training, etc.
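For reference, this is how I understand the basic winning neuron step in a plain SOM: the winner is the neuron whose weight vector is closest to the input, and it is then nudged toward that input. This is only a sketch of the generic idea, not the paper's HSOM; the two-feature inputs, the learning rate and the omission of the neighborhood update are my simplifications.

```python
import numpy as np

def winning_neuron(weights, x):
    """Index of the neuron whose weight vector is closest to input x."""
    distances = np.linalg.norm(weights - x, axis=1)
    return int(np.argmin(distances))

def update_winner(weights, x, learning_rate=0.5):
    """Move only the winner toward x (neighborhood update omitted)."""
    winner = winning_neuron(weights, x)
    updated = weights.copy()
    updated[winner] += learning_rate * (x - updated[winner])
    return winner, updated

# Three neurons with 2-feature weight vectors (made-up values).
weights = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
pixel = np.array([0.9, 0.8])               # hypothetical feature vector for one pixel
winner, new_weights = update_winner(weights, pixel)

print(winner)                # neuron 1 is closest to the input
print(new_weights[winner])   # moved halfway toward the input
```

For segmentation, each pixel (or patch) would be assigned the label of its winning neuron once training has settled.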

Source #4:

Bi, W.L. Hosny, A. Schabath, M.B. Giger, M.L. Birkbak, N.J. Mehrtash, A. Allison, T. Arnaout, O. Abbosh, C. Dunn, I.F. Mak, R.H. Tamimi, R.M. Tempany, C.M. Swanton, C. Hoffmann, U. Schwartz, L.H. Gillies, R.J. Huang, R.Y. Aerts, H.J.W.L. 2019. Artificial intelligence in cancer imaging: Clinical challenges and applications. CA A Cancer J Clin, 69. 127-157. 


This paper is a little different from the other sources, as it discusses Artificial Intelligence in cancer detection as a general concept while also going in depth on various fields of cancer detection such as lung cancer detection, breast mammography, and prostate and brain cancer detection. The paper can be very helpful since it describes different methods of cancer detection in a machine learning environment and talks about different kinds of tumors and abnormalities at length. The authors even do a thorough comparison of previous work for each field, which includes what tumor was studied, how many patients were in the dataset, what algorithm was used, how accurate the detection was and more. This paper gives me the option of choosing any of these fields for my final proposal.

Source #5:

Watanabe, A.T. Lim, V. Vu, H.X. et al. 2019. Improved Cancer Detection Using Artificial Intelligence: a Retrospective Evaluation of Missed Cancers on Mammography. In Journal of Digital Imaging 32. 625–637.


This paper focuses generally on human error in cancer detection and how Artificial Intelligence can minimize it. Unfortunately, Artificial Intelligence cancer detection is not as mainstream as it should be, even though it has been shown to assist and outperform radiologists. That is what this paper is trying to address. The paper claims that computer-aided detection (CAD) before deep learning was not very helpful and tries to prove that since the addition of AI to CAD (AI-CAD), false negative and false positive detections have diminished. The paper then studies the sensitivity, number of false positives, etc. in a group of radiologists with various backgrounds and experience levels to prove that the AI-CAD system in question is helpful. Perhaps the final results of my work could be compared against a radiologist’s detection in a small part of my paper’s results section, using the same comparison methods this paper used.
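If I do compare my results against a radiologist this way, the metrics themselves are simple to compute from counts of true/false positives and negatives. The sketch below uses the standard definitions of sensitivity and specificity; the function name and all the counts are made up for illustration and are not taken from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of actual cancers that were caught
    specificity = tn / (tn + fp)   # share of healthy cases correctly cleared
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positives": fp,
    }

# Hypothetical counts for one reader without and with AI assistance.
unaided = detection_metrics(tp=40, fp=12, tn=88, fn=10)
aided = detection_metrics(tp=45, fp=9, tn=91, fn=5)

print(unaided)
print(aided)
```

With these made-up numbers the aided reader catches more cancers (fewer false negatives) and raises fewer false alarms, which is the kind of change the paper looks for.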

Source #6:

Rodriguez-Ruiz, A. Lång, K. Gubern-Merida, A. Broeders, M. Gennaro, G. Clauser, P. Helbich, T.H. Chevalier, M. Tan, T. Mertelmeier, T. Wallis, M.G. Andersson, I. Zackrisson, S. Mann, R.M. Sechopoulos, I. 2019. Stand-Alone Artificial Intelligence for Breast Cancer Detection in Mammography: Comparison With 101 Radiologists, In JNCI: Journal of the National Cancer Institute, Volume 111, Issue 9. 916–922.

Similarly to Watanabe et al. (source #5 above), this paper compares AI-powered cancer detection to radiologists’ diagnoses made without any computer-assisted results. They use results from nine different studies with a total of 101 radiologists and compare the collective results to their AI-powered detection system. This paper does a more thorough comparison with a larger set of datasets and has more promising results than Watanabe et al. (source #5). Their supplementary tables show in-depth results for each individual dataset against the AI system, which does slightly better than the average of the radiologists in most cases.

Source #7:

Pianykh, O.S. Langs, G. Dewey, M. Enzmann, D.R. Herold, C.J. Schoenberg, S.O. Brink, J.A. 2020. Continuous Learning AI in Radiology: Implementation Principles and Early Applications.

The unique problem under focus in this paper is how even the best AI algorithms in radiology reading and detection can become outdated, given that the data and the environment in the field are always evolving. The paper shows that models trained to work in the radiology field lose their quality over even short periods of time, such as a few months, and it proposes a continuous learning method to solve this problem. Continuous learning and its advantages are discussed and explained at length in the paper. According to the data in the paper, the AI model was able to keep its quality if feedback from the radiologists and patients was given and the model was retrained once a week using this feedback data from the two groups. The problem with this approach is that including radiologists and the less knowledgeable patients in the model training process increases the risk of mistraining due to human error and to input from someone without enough experience or knowledge of the field. In my work I could explore ways of ridding this method of human input and potentially make different AI models work with each other to provide feedback for continuous learning.

Source #8:

Tang, A. Tam, R. Cadrin-Chênevert, A. Guest, W. Chong, J. Barfett, J. Chepelev, L. Cairns, R. Mitchell, J. R. Cicero, M. D. Poudrette, M. G. Jaremko, J. L. Reinhold, C. Gallix, B. Gray, B. Geis, R. 2018. Canadian Association of Radiologists White Paper on Artificial Intelligence in Radiology. In Canadian Association of Radiologists Journal, 69 (2). 120–135.

This paper doesn’t actually build any Artificial Intelligence program, and therefore there aren’t any datasets or results that can be directly helpful. However, the paper offers broad insight into Artificial Intelligence and its role in modern radiology. Many helpful topics are included, such as different Artificial Intelligence learning methods, their applications in the radiology world, different imaging techniques, the role of radiologists in the AI field and even the terminology needed to understand the concepts and communicate them to others. Even though no software is implemented in this paper and no datasets are used as a result, datasets are discussed at length. What a dataset requires to be useful and how large a dataset should be are covered with actual examples (such as ChestX-ray8, with more than 100 thousand x-ray images).

Source #9:

Ha, R. Chang, P. Karcich, J. et al. 2018. Axillary Lymph Node Evaluation Utilizing Convolutional Neural Networks Using MRI Dataset. In Journal of Digital Imaging 31. 851–856.

Axillary lymph nodes are a set of lymph nodes in the armpit area, and due to their proximity to the breasts they are the first set of lymph nodes that breast cancer spreads to. Detecting even the smallest amount of microscopic cancer cells in the axillary lymph nodes is critical to determining the stage of breast cancer and the proper treatment methods. For that reason this paper focuses on using MRI scans to detect the spread of cancer to these nodes, as opposed to the normal tumor detection in breast mammograms in sources #1, #5 and #6. The other major difference between this source and the others is the pre-processing of the 3D datasets. Interestingly, this paper also leaves pooling out and downsizes the data with a different method. I believe I can use and even optimize many of the methods in this paper to get better results in my work. Metastasis detection is a very interesting field which hasn’t had the same amount of AI work done as tumor detection, and for that reason I believe there is more room for improvement in it.

Source #10:

Rodríguez-Ruiz, A. Krupinski, E. Mordang, J.J. Schilling, K. Heywang-Köbrunner, S.H. Sechopoulos, I. Mann, R.M. 2018. Detection of Breast Cancer with Mammography: Effect of an Artificial Intelligence Support System.

This paper is likely what encouraged the creation of Rodríguez-Ruiz et al. (source #6). The difference between the two papers is small yet substantial. In this paper Rodríguez-Ruiz et al. compare how radiologists perform with and without the help of AI-powered systems, as opposed to their newer paper (source #6), which takes the AI system out and makes the radiologists compete against what was used to help them in this paper. Interestingly enough, the results of the paper in which radiologists compete against AI are better than when AI is used as an aid to the radiologists. The data in this paper is not as extensive as in the newer paper (source #6); however, as long as good data is available (such as in source #6), there is a lot for me to explore in the world of comparisons between different methods of AI and different kinds of radiologists (perhaps in a different field than mammography).

Source #11:

Hirasawa, T., Aoyama, K., Tanimoto, T. et al. 2018. Application of artificial intelligence using a convolutional neural network for detecting gastric cancer in endoscopic images. In Gastric Cancer 21. 653–660.

With most of my sources focused on either breast or brain tumor detection, I thought it would be helpful to expand the range of possibilities by including more applications of AI in radiology fields. This paper uses similar CNN approaches on endoscopy images for gastric cancer detection. Going into fields such as gastric cancer detection, where there is less previous work than in breast and brain tumor detection, has advantages and disadvantages. The main advantage is that I am more likely to be able to improve on the most recent work, and the main disadvantage is the smaller number of datasets available in such a field. This paper, however, uses good datasets acquired from different hospitals and clinics. Perhaps acquiring data from local clinics and hospitals is something I can look into for my work as well.
