Senior Capstone

An Integrated Model for Offline Handwritten Chinese Character Recognition Based on Convolutional Neural Networks

Abstract

Optical Character Recognition (OCR) is an important technology in computer vision and pattern recognition that recognizes text embedded in images. Although OCR has achieved high accuracy for languages with alphabet-based writing systems, its performance on handwritten Chinese text remains poor because of the complexity of the Chinese writing system. To improve accuracy, this paper proposes an integrated OCR model for Chinese handwriting that combines existing methods in the pre-processing, recognition, and post-processing phases.

Paper

Software Demo Video

Software Architecture Diagram

CS388 – Week 5 – Update

I read six papers covering my three ideas. It was interesting to see that many research projects share the same goal but approach the problem in very different ways. For example, for fall detection, some researchers apply deep learning to images and videos, while others work with radio-frequency signals instead of images. Another example is improving the performance of Optical Character Recognition for Chinese text. My initial thought was to apply image processing techniques to improve the image quality and then use deep neural networks. However, one paper approached the problem from a different angle: the authors apply statistical natural language processing models such as N-grams to improve OCR accuracy. These ideas might help me come up with an approach different from the one I had in mind.

CS388 – Week 4 – Update

I am reading papers for my first idea, “Detect and Translate Chinese text in images”. One paper I read was about improving the performance of Optical Character Recognition for Chinese books in precarious condition. Instead of trying to enhance the image quality, the authors apply N-grams, long short-term memory, and backward and forward N-gram statistical text models to build a more accurate OCR model.
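To make the N-gram idea concrete, here is a minimal sketch of how a character-level bigram model could pick between OCR candidates. It is not the method from the paper: the toy corpus, the candidate strings, the add-one smoothing, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def train_bigram_counts(corpus):
    """Count single characters and adjacent character pairs in a list of strings."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    return unigrams, bigrams


def bigram_log_prob(text, unigrams, bigrams):
    """Add-one smoothed log-probability of a character sequence."""
    vocab_size = len(unigrams)
    score = 0.0
    for prev, cur in zip(text, text[1:]):
        # Counter returns 0 for unseen characters and pairs.
        score += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return score


def pick_best(candidates, unigrams, bigrams):
    """Return the candidate string the bigram model finds most likely."""
    return max(candidates, key=lambda c: bigram_log_prob(c, unigrams, bigrams))


# Hypothetical toy corpus; a real model would need far more text.
corpus = ["我们学习中文", "他们学习英文", "我们喜欢中文"]
unigrams, bigrams = train_bigram_counts(corpus)

# Two hypothetical OCR outputs that differ in one visually similar character.
candidates = ["我们学习中文", "我们字习中文"]
best = pick_best(candidates, unigrams, bigrams)
print(best)
```

The second candidate contains character pairs that never occur in the corpus, so the model scores it lower and the first candidate is chosen.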

CS388 – Week 3 – Third Idea

  • Name of Your Project

A Real Time Fall Detection System to Assist the Elderly Using Deep Neural Networks

  • What research topic/question is your project going to address?

Elderly people have a high risk of falling and then being injured or fainting, which can put them in danger if they are alone. One way to help is a system that monitors their actions, detects a fall and their behavior afterward, classifies the level of severity, and sends an alert to their emergency contacts or the emergency room if the level is serious.

  • What technology will be used in your project?

Deep learning, pattern recognition, image processing

  • What software and hardware will be needed for your project?

Python, PyTorch (or Keras)

I might also need a CCTV camera if I decide to build the actual device.

  • How are you planning to implement it?

First, I will apply image processing techniques to enhance the quality of the images and videos. If the dataset is small, I will use image data augmentation techniques to produce more data. Next, I will train a model that uses deep neural networks to detect a person falling in the frame, and then use the photos and videos of people falling to train a second model that classifies the level of severity. When the severity index passes a threshold, the system sends out the alert.
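The final alerting step could look something like the sketch below. It assumes the severity model outputs one score between 0 and 1 per frame; the threshold of 0.8, the three-frame window, and the function name are hypothetical. Averaging over a short window is an addition of this sketch, meant to keep a single noisy frame from triggering an alert.

```python
from collections import deque


def first_alert_frame(severity_scores, threshold=0.8, window=3):
    """Return the index of the first frame where the average of the last
    `window` scores reaches `threshold`, or None if that never happens."""
    recent = deque(maxlen=window)  # holds only the most recent scores
    for i, score in enumerate(severity_scores):
        recent.append(score)
        if len(recent) == window and sum(recent) / window >= threshold:
            return i
    return None


# Hypothetical per-frame severity scores from the classifier.
scores = [0.1, 0.2, 0.9, 0.95, 0.9, 0.3]
alert_frame = first_alert_frame(scores)
print(alert_frame)
```

In a full system, the returned frame index would be the point at which the alert is sent to the emergency contacts.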

  • How is your project different from others? What’s new in your project?

Several projects work on a similar problem, but most of them only detect the falling action. In this project, I hope to build a more detailed system that can also decide whether the fall is an emergency.

  • What are the difficulties of your project? What problems might you encounter during your project?

I might not be able to find a big enough dataset to train the model.

CS388 – Week 2 – Second Idea

  • Name of Your Project

Driver Drowsiness Detection Using Deep Neural Networks

  • What research topic/question is your project going to address?

Driving while sleepy or tired is one of the main causes of traffic accidents. One solution might be a device in the car that monitors the driver’s behavior and facial expressions and sounds an alarm if the driver seems about to fall asleep.

  • What technology will be used in your project?

Dataset of facial expressions (images and videos)

  • What software and hardware will be needed for your project?

Python, PyTorch (or Keras)

  • How are you planning to implement it?

Build a pipeline that first applies image processing techniques to improve the quality of the images, then trains a model (using neural networks) to detect and locate the face in the images, and finally builds a model (also using deep neural networks) to classify the behaviors and facial expressions.

  • How is your project different from others? What’s new in your project?

Most related projects track the driver’s eyes to see whether they are closed. I am considering checking eye movements as well as other behaviors, such as yawning or nodding off, to improve the classification performance.

  • What are the difficulties of your project? What problems might you encounter during your project?

There might not be a big dataset for me to use.

CS388 – Week 1 – First Idea

  • Name of Your Project

Detect and Translate Chinese text in images

  • What research topic/question is your project going to address?

Lately, many translation applications have introduced a feature that scans a document or takes a picture containing text, then detects the text and translates it into another language.

Many of these applications perform well on neat, clear handwriting or high-quality images, but not as well on cursive handwriting or low-quality images. My research goal is to improve the detection performance in these cases.

  • What technology will be used in your project?

Chinese – English Dictionary API

  • What software and hardware will be needed for your project?

Python, PyTorch, MATLAB

  • How are you planning to implement it?

Build a pipeline that first enhances the quality of the image data using image processing techniques, then feeds the data to a deep neural network model (possibly a CNN) to detect the Chinese characters, and finally connects to a dictionary API to translate the text into English.
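A rough sketch of how the three stages might fit together is shown below. Only the first stage does real work: it applies min-max contrast stretching, one simple enhancement technique among many. The recognizer is a placeholder standing in for the CNN, the dictionary is a toy in-memory table rather than a real API, and every name here is hypothetical.

```python
def stretch_contrast(image):
    """Min-max contrast stretching for a grayscale image stored as rows of
    integer pixel values in the range 0-255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in image]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in image]


def recognize(image):
    """Placeholder for the CNN recognizer. It ignores its input and returns
    a fixed string purely to show where the model would sit."""
    return "中文"


# Toy stand-in for the Chinese-English dictionary API.
TOY_DICTIONARY = {"中文": "Chinese (language)"}


def translate(text, dictionary):
    """Look the recognized text up in the dictionary."""
    return dictionary.get(text, "<unknown>")


def run_pipeline(image):
    """Enhance the image, recognize the characters, then translate them."""
    enhanced_image = stretch_contrast(image)
    text = recognize(enhanced_image)
    return translate(text, TOY_DICTIONARY)


# A tiny low-contrast "image": all pixel values sit between 100 and 150.
low_contrast = [[100, 110], [120, 150]]
enhanced = stretch_contrast(low_contrast)
result = run_pipeline(low_contrast)
print(enhanced, result)
```

Keeping each stage as its own function means the enhancement step or the recognizer can be swapped out later without touching the rest of the pipeline.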

  • How is your project different from others? What’s new in your project?

Current applications do not perform very well on low-quality images, so my goal is to find solutions to this limitation of translation apps.

  • What are the difficulties of your project? What problems might you encounter during your project?

I ran some experiments and found that major apps such as Google Translate still have trouble detecting handwriting that is not very neat. Achieving my research goal could therefore be very challenging.