Over Thanksgiving break I learned how to compile and modify the ffmpeg source code in Cygwin, a Unix-like environment for Windows. This makes the process of modifying and compiling the code easier. Furthermore, I have been developing and implementing algorithms that use the conclusions from my experiments to better choose how many keyframes to allocate for different types of videos. I have also made the full results of my experiments available here.
Update 11/8/16
I am continuing the tests that I mentioned in the previous update and adding them to the graph.
Update 11/7/16
I spent Monday running more tests with ffmpeg to get better data. I am now forcing a specific number of prediction frames over a regular interval.
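As a rough illustration of this kind of test, the sketch below builds an ffmpeg command line that requests a fixed keyframe interval. It is not the exact command I ran. The function name and the file names are hypothetical; the only ffmpeg options used are -y, -i, and -g (the group-of-pictures size, which sets the maximum number of frames between keyframes).

```python
def build_ffmpeg_command(src, dst, keyframe_interval):
    """Return an ffmpeg argument list with a fixed keyframe interval.

    Hypothetical helper for illustration: -g sets the GOP size, i.e. the
    maximum number of frames between two keyframes. Some encoders may
    still insert extra keyframes, for example at scene changes.
    """
    if keyframe_interval < 1:
        raise ValueError("keyframe_interval must be at least 1")
    return [
        "ffmpeg",
        "-y",                           # overwrite the output without asking
        "-i", src,                      # input video
        "-g", str(keyframe_interval),   # at most this many frames per GOP
        dst,                            # output video
    ]


# Example: one command per interval under test (file names are made up).
for interval in (10, 30, 250):
    print(" ".join(build_ffmpeg_command("input.mp4", f"out_g{interval}.mp4", interval)))
```

Each argument list could then be passed to subprocess.run to perform the actual encode.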
Update 11/3/16
I spent today reading through more of the documentation for ffmpeg to learn more about its structure and the commands it supports.
Update 11/2/16
I have placed some of my early data in various spreadsheets. I am continuing to collect data, and I am ready to use the information and observations I have so far to start the first draft of my paper.
Update 10/30/16
I have been gathering more data to find the optimal number of keyframes for various types of videos. The videos with larger file sizes take a long time to compress.
Update 10/29/16
Continuing to gather data to evaluate ffmpeg.
Update 10/28/16
Continuing to work on gathering data for the speeds and compression ratios of ffmpeg.
Update 10/27/16
I have begun testing how long it takes ffmpeg to compress certain files, and how effectively it compresses them with different numbers of keyframes.
I have also been working on compiling the program’s source code so I can work on modifications, but I haven’t yet succeeded at that.
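The two measurements described above, time taken and how much smaller the file becomes, could be collected with something like the following sketch. The helper name measure_compression is my own, and the compress argument stands in for whatever actually does the work (for example, a function that runs ffmpeg); nothing here is specific to ffmpeg itself.

```python
import os
import time


def measure_compression(compress, src_path, dst_path):
    """Run compress(src_path, dst_path) and return (seconds, ratio).

    ratio is original size divided by compressed size, so a value above 1
    means the output is smaller than the input.
    """
    start = time.perf_counter()
    compress(src_path, dst_path)
    elapsed = time.perf_counter() - start

    original_size = os.path.getsize(src_path)
    compressed_size = os.path.getsize(dst_path)
    if compressed_size == 0:
        raise ValueError("compressed file is empty")
    return elapsed, original_size / compressed_size
```

Keeping the compressor as a parameter means the same timing code can be reused for every keyframe setting being compared.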
Update 10/26/16
After further researching the open-source projects and tools that are available to me, I have decided that I will instead focus on ffmpeg. It is similar to Xvid in the sense that it is an open-source project that provides codecs for compressing and decompressing data, but it has better documentation and seems easier to work with.
I have also obtained several sample files and have begun experimenting with how well ffmpeg compresses them. In order to test its compression algorithms as thoroughly as possible, I have gathered many different types of videos to test on. One video is a black screen, and it compresses quite nicely, which makes sense given that there is little randomness in the video. Another video, which involves confetti falling, compresses poorly, since the video is much less predictable. I plan to continue to experiment to see what ffmpeg excels at and struggles with, and I will study and evaluate its source code.
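The difference between the black screen and the confetti can be imitated on a small scale without any video at all. The sketch below is only an analogy: it uses Python's general-purpose zlib compressor rather than ffmpeg or a video codec, and the 64x64 "frames" are made-up byte arrays, one all black and one filled with pseudo-random values.

```python
import random
import zlib

FRAME_BYTES = 64 * 64  # one hypothetical 64x64 grayscale frame

# A "black screen": every pixel has the value 0.
black_frame = bytes(FRAME_BYTES)

# A "confetti" stand-in: every pixel is an unpredictable value.
rng = random.Random(0)  # fixed seed so the run is repeatable
noisy_frame = bytes(rng.randrange(256) for _ in range(FRAME_BYTES))

black_size = len(zlib.compress(black_frame))
noisy_size = len(zlib.compress(noisy_frame))

print("black frame:", FRAME_BYTES, "->", black_size, "bytes")
print("noisy frame:", FRAME_BYTES, "->", noisy_size, "bytes")
```

The predictable frame shrinks to a small fraction of its size, while the random one barely shrinks at all, which matches what I saw with the real videos.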
Project Proposal
I have completed my project proposal and PowerPoint presentation. Below is the timeline I have constructed for my project.
- October 21: Be familiarized with the Xvid codec, how it works, and how to make simple modifications to it to change compression.
- November 6: Have a unique, reasonably well-working personal compression algorithm. At this point I will have explored Xvid and experimented with ideas for some time, so I hope to have added some of my own ideas to the codec.
- November 8: Complete a general outline of the paper to serve as a guide.
- November 16: Complete the first draft of the paper.
- November 30/December 4: Be prepared for project presentation.
- December 12: Finish second draft of paper.
- December 16: Finish final draft of paper and software.
Project Topic
I have chosen to do my senior project on data compression and my adviser for the project will be Xunfei Jiang.
Data compression is the process of encoding data so that it fits into a smaller space. Lossless compression is when some form of data, like a video file, is compressed into less space with no loss in quality. In lossy compression, a file can be compressed even more, but at the expense of the quality of the data.
There are many different types of data one may want to compress. For instance, we can reduce the amount of space it takes to store text using a technique like run-length encoding. In run-length encoding, we count the consecutive repetitions of a character and store that count alongside the character. If the sequence EEEEE appears in a string, we could instead store it as 5E so that it takes up less space. This would be an example of lossless compression, since the original string can still be perfectly reproduced despite requiring less storage space.
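Here is a minimal sketch of run-length encoding as described above. The function names are my own. One assumption to note: runs are stored as (count, character) pairs rather than as a plain string like 5E, because a plain string becomes ambiguous when the original text itself contains digits.

```python
from itertools import groupby


def rle_encode(text):
    """Return a list of (count, char) pairs, one per run of equal characters."""
    return [(len(list(group)), char) for char, group in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original string from (count, char) pairs."""
    return "".join(char * count for count, char in pairs)


def rle_to_string(pairs):
    """Render pairs in the compact '5E' style (only safe for digit-free text)."""
    return "".join(f"{count}{char}" for count, char in pairs)


encoded = rle_encode("EEEEE")
print(rle_to_string(encoded))             # 5E
print(rle_decode(encoded) == "EEEEE")     # True, so nothing was lost
```

Text without repeated characters actually grows under this scheme (ABC becomes 1A1B1C), so run-length encoding only pays off when long runs are common.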
If one were compressing a video file, one might reduce its bit rate by lowering the color depth. In this approach, the number of bits used to represent the color of each pixel is reduced. This causes the video to require far less storage space, but at the cost of quality, since not as many color options are available. Thus, this would be an example of lossy compression.
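The idea above, using fewer bits per color, can be sketched for a single pixel. This is only an illustration and not how any particular codec works. It assumes each color channel is an 8-bit value from 0 to 255, and the function name is hypothetical.

```python
def reduce_bit_depth(value, bits):
    """Keep only the top `bits` bits of an 8-bit channel value (0 to 255)."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    shift = 8 - bits
    return (value >> shift) << shift  # the low bits are discarded


pixel = (200, 120, 33)  # a made-up red, green, blue pixel
reduced = tuple(reduce_bit_depth(channel, 4) for channel in pixel)
print(reduced)  # (192, 112, 32)

# Two different original values collapse to the same reduced value,
# so the original can no longer be recovered: the compression is lossy.
print(reduce_bit_depth(200, 4) == reduce_bit_depth(207, 4))  # True
```

With 4 bits per channel instead of 8, each pixel needs half the storage, but only 16 levels per channel remain instead of 256.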
For my personal project, I will read papers published on various data compression techniques. I will write my paper describing various compression techniques used in computer science. I will also come up with my own method of compressing data, probably for video files. I will write code to demonstrate this compression technique, and I will explain the method and how it works in my paper.
Project Ideas
1.) Data Compression
I am interested in how data is represented as MPEG, JPEG, and other file formats, and how this data can be used to display an image or video. In particular, I am interested in the compression algorithms used to store this data in a smaller space, with little or no loss in the quality of the information. I would explore various lossless and lossy compression algorithms in the paper, and explain their strengths and weaknesses. I could then create some code to illustrate some compression algorithms and how they work.
2.) 3-D Passwords
While passwords are crucial to how we protect our information, they are also tedious to remember. One interesting alternative is 3-D passwords. The idea is that the user is placed in some sort of 3-D environment with various objects that can be interacted with. The user could enter a password by interacting with various objects in the environment in a specific sequence. For example, a user might move a chair, head to a thermostat, and then change it to a specific temperature as a way of entering a password. This would be an appealing idea to explore in a project as well.
3.) Soft Computing
I was reading about soft computing and the idea seemed interesting and different from other ideas I have encountered so far in computer science. I would be interested in exploring it further, but don’t have a specific idea yet.