I would like to create a generic software pipeline based on recent breakthroughs in compressing deep neural networks. The pipeline could then be used with frameworks such as TensorFlow to compress neural networks so that they can be deployed on resource-limited mobile devices. My initial idea is to chain four stages: network trimming/pruning, quantization, weight sharing, and Huffman coding.
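
To make the idea concrete, here is a minimal sketch of how those stages could be chained for a single weight matrix. It assumes NumPy only and is not tied to any TensorFlow API. All function names (`prune_by_magnitude`, `share_weights`, `huffman_code_lengths`, `compress`, `decompress`) are hypothetical, and so are the default sparsity and cluster count. Quantization and weight sharing are merged into one step here (1-D k-means to a small codebook), and the Huffman stage only computes code lengths to estimate the encoded size rather than writing an actual bitstream.

```python
import heapq
from collections import Counter

import numpy as np


def prune_by_magnitude(weights, sparsity):
    """Zero out roughly the `sparsity` fraction of smallest-magnitude weights."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask, mask


def share_weights(weights, mask, n_clusters, n_iters=20):
    """Cluster the surviving weights with 1-D k-means (codebook + indices)."""
    values = weights[mask]
    # Assumption: linear initialisation over the range of surviving weights.
    codebook = np.linspace(values.min(), values.max(), n_clusters)
    for _ in range(n_iters):
        idx = np.argmin(np.abs(values[:, None] - codebook[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(idx == k):  # empty clusters keep their old centroid
                codebook[k] = values[idx == k].mean()
    idx = np.argmin(np.abs(values[:, None] - codebook[None, :]), axis=1)
    return codebook, idx


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (frequency, unique tie breaker, {symbol: depth so far}).
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


def compress(weights, sparsity=0.8, n_clusters=16):
    """Run prune -> share/quantize -> Huffman size estimate on one matrix."""
    pruned, mask = prune_by_magnitude(weights, sparsity)
    codebook, idx = share_weights(pruned, mask, n_clusters)
    symbols = idx.tolist()
    lengths = huffman_code_lengths(symbols)
    counts = Counter(symbols)
    index_bits = sum(counts[s] * lengths[s] for s in counts)
    return {
        "mask": mask,
        "codebook": codebook,
        "indices": idx,
        "code_lengths": lengths,
        "index_bits": index_bits,  # estimated size of the encoded indices
    }


def decompress(packed):
    """Rebuild a dense matrix from the mask, codebook, and cluster indices."""
    out = np.zeros(packed["mask"].shape, dtype=packed["codebook"].dtype)
    out[packed["mask"]] = packed["codebook"][packed["indices"]]
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 64))  # stand-in for one layer's weights
    packed = compress(w)
    restored = decompress(packed)
    print("kept weights:", int(packed["mask"].sum()), "of", w.size)
    print("estimated index bits:", packed["index_bits"])
    print("mean abs error:", float(np.abs(w - restored).mean()))
```

In a real pipeline each stage would normally be followed by retraining or fine-tuning to recover accuracy, and the sparse positions would need their own compact encoding; both are left out of this sketch.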

