CS388-Week3-Updates

I am still exploring my new ideas.

Idea 1 Title: AI Assistant for Safety Driving

Description: My idea is to use facial recognition to detect fatigue driving or dangerous driving. A small camera will be placed in front of the driver and the AI Assistant will be a mobile phone APP. For fatigue driving, the camera detects behaviors like closing eyes for long time, yarning, frequently rubbing eyes with hand, etc. For other dangerous driving behaviors, the camera can detect behaviors like playing mobile phone, turning back to chat, smoking, etc. After detecting dangerous driving behaviors, the APP will do a series of operations according to the dangerous behavior: giving an alarm, play refresh songs, report the location of the nearest rest stop. Actually there are already research and real products available now, but most of them are commercialized. They are expensive and big-sized. What makes my idea different is that I am going to develop it as a cheap and handy daily life tool. And I probably can enhance the accuracy of detection under dark light or special environment. The main tech is facial recognition and there are many open sources online.

Idea 2 Title: Voice Print Recognition Identifier

Description: Facial recognition is very popular recently, but many people do not notice voice print recognition. VPR is cheaper and more convenient because it only needs one microphone. Voice can tell many information like age, gender, emotion, environment, and etc. Wearing glasses or makeup might affect the performance of facial recognition, but a mature human being’s voice is stable. There are already research and real application on voice print recognition. For example, WeChat user uses voice recognition to log in. But I can innovate my idea on the application, for example, combine it with bank customer service. When a user steals other’s password and calls Chase to do money transaction, the VPR can detect if the voiceprint matches the bank account owner. Or use it to fight crime, home robots, etc. The VPR tech is already exist. I need to innovate from other aspects. But the two primary usages are: speaker identification or speaker verification. Voice print is a spectrogram of voice, forming from wavelength, frequency, intensity and other hundreds of characteristics. I can differentiate people from differences in purity of voice, resonance ways, average pitch characteristic, voice range.

The popular used spectrogram characteristics are: MFCC, PLP, FBank, D-vector by Google, Deep feature, Bottleneck feature, Tandem feature. Existing models: GMM-UBM, JFA, GMM-UBM i-vector, DNN i-vector, Supervised-UBM i-vector. Scoring algorithm: SVM, PLDA, LDA, Cosine Distance.

(But i don’t know if the scope is too big)

Idea 3 Title: Using OCR to grade hand-written homework

Description: Take a picture of the hand-written homework and use OCR tech to extract the content. (OCR is a technology converting printed text into editable text.) Then check if the algebra calculation is correct. Mark the correctness in the picture near the problem. There are already APP like that available online. But it only checks for algebra calculation, and it is not that accurate. It requires good hand-writing. I can expand my project to enhance the OCR tech, or make it check for simple English homework as well. Especially some parents in non english speaking countries cannot speak english to help their children. And probably I can expand the idea to a mobile phone APP that can summarize and analyze the grades data: collect incorrect problems, analyze which category did incorrectly the most, and etc.

Yuqian Zhang

Leave a Reply Cancel reply