Weekly update: Annotated Bibliography (First versions)

Idea 1: Masked Face Detection:

Citation 1: Detecting Masked Faces in the Wild with LLE-CNNs

  • S. Ge, J. Li, Q. Ye and Z. Luo, “Detecting Masked Faces in the Wild with LLE-CNNs,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017, pp. 426-434, doi: 10.1109/CVPR.2017.53.
  • Link: https://openaccess.thecvf.com/content_cvpr_2017/papers/Ge_Detecting_Masked_Faces_CVPR_2017_paper.pdf?fbclid=IwAR2UcTzeJsOAI6wPzdlkuMG4NaHMc-b1Gwmf-zl5hD3ueIEfBH-3HOgpMIE
    • Includes the MAFA dataset with 30,811 Internet images and 35,806 masked faces. We can use this dataset to train or test our deep learning model.
    • Proposes LLE-CNNs for masked face detection, which we can use as a starting point and as a baseline to reach or beat.
    • To look up: Convolutional Neural Network (CNN)
    • The authors show that, on the MAFA dataset, the proposed approach outperforms six state-of-the-art methods by at least 15.6%.
    • Check whether the authors have published code to reproduce the experimental results.

The paper introduces a new dataset for masked face detection, along with a model named LLE-CNNs that the authors claim outperforms six state-of-the-art methods by at least 15.6%. Fortunately, the dataset is publicly available and is exactly what we need for the problem we are proposing.

Idea 2: Speaker Recognition:

Citation 1: Deep Speaker: an End-to-End Neural Speaker Embedding System

  • Li, Chao & Ma, Xiaokong & Jiang, Bing & Li, Xiangang & Zhang, Xuewei & Liu, Xiao & Cao, Ying & Kannan, Ajay & Zhu, Zhenyao. (2017). Deep Speaker: an End-to-End Neural Speaker Embedding System.
  • https://arxiv.org/pdf/1705.02304.pdf
    • The authors propose Deep Speaker, a neural embedding system that maps speakers' utterances to a hypersphere where speaker similarity is measured by cosine similarity.
    • To look up: i-vector paper, equal error rate (EER)
    • Through experiments on three distinct datasets, the authors show that Deep Speaker outperforms a DNN-based i-vector baseline. They claim that Deep Speaker reduces the verification EER by 50% (relative) and improves the identification accuracy by 60% (relative).
    • Make sure that the datasets that the authors used are publicly available.
    • Fortunately, the authors have published their code, so we can train and test on the BookTubeSpeech dataset.

The paper presents a novel end-to-end speaker embedding model named Deep Speaker. Although the paper is not new, we can definitely use it for our problem, since the authors have published their code, which is readable and runnable.
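
As a note to ourselves on two terms from this entry, here is a minimal sketch of cosine similarity between embeddings and of what a "relative" reduction means. It is not the Deep Speaker implementation. The function names, the tiny 3-dimensional vectors, and the error rates are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def relative_reduction(baseline, new):
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


# Hypothetical embeddings; real speaker embeddings have many more dimensions.
utterance_a = [1.0, 0.0, 0.0]
utterance_b = [1.0, 1.0, 0.0]
utterance_c = [0.0, 0.0, 2.0]

same = cosine_similarity(utterance_a, utterance_a)        # identical direction
partial = cosine_similarity(utterance_a, utterance_b)     # 45 degrees apart
unrelated = cosine_similarity(utterance_a, utterance_c)   # orthogonal vectors

# Hypothetical error rates: going from 4% to 2% is a 50% relative reduction,
# even though the absolute change is only 2 percentage points.
rel = relative_reduction(0.04, 0.02)

print(same, partial, unrelated, rel)
```

Because cosine similarity ignores vector length, `utterance_c` having length 2 does not affect its score. That is the same reason embeddings can be compared on a hypersphere.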

Citation 2: FDDB: A Benchmark for Face Detection in Unconstrained Settings

The linked GitHub repository contains the MAFA dataset, whose images fall into three main categories: faces with masks, faces without masks, and unmasked faces that are blocked by a phone, a hand, or other people. This dataset fits the goal of our research exactly.

Idea 3: Sport player performance prediction using machine learning:

Citation 1: A machine learning framework for sport result prediction

  • Bunker, Rory & Thabtah, Fadi. (2017). A Machine Learning Framework for Sport Result Prediction. Applied Computing and Informatics. 15. 10.1016/j.aci.2017.09.005. 
  • Link: https://www.sciencedirect.com/science/article/pii/S2210832717301485
    • Even though the paper is about sport result prediction, not player performance prediction, it provides good insight into how to tackle our problem. In particular, the authors provide a framework that we can apply to it.
    • Moreover, each step of the framework is clearly explained with detailed examples. The framework can be used for both traditional ML models as well as for artificial neural networks (ANN).

The paper provides not only a critical survey of the literature on machine learning for sport result prediction but also a framework that we can apply to our problem. The survey can help us get a sense of which method works best, and the framework will tell us what to do next once we have picked our model.

We need to finish and upload 18 annotated entries; the entries above show how I am writing my annotated bibliography. I am still working to finish all 18 as soon as I can.
