Annotated Bibliography

with No Comments

Profiler:

1/ Real Time Face Detection

M. Sharif, K. Ayub, D. Sattar, M. Raza

This paper presents a real-time face detection method that converts the image from the RGB color space to the HSB color space and attempts to detect skin regions, then locates the eyes within each candidate region. The pipeline has two steps: face detection followed by face verification. Because the system runs in real time, speed is crucial. Published in 2012.
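The color-space step can be sketched in a few lines. This is an illustrative approximation, not the paper's actual algorithm: the HSV thresholds below are assumptions chosen for the example, not values from the paper.

```python
import colorsys

def is_skin(r, g, b):
    """Rough per-pixel skin test in HSV space. The threshold values are
    illustrative assumptions, not the ones used in the paper."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Skin tones tend to cluster at low hues with moderate saturation.
    return (h < 0.14 or h > 0.95) and 0.15 < s < 0.75 and v > 0.35

# A light skin tone passes, a pure blue pixel does not.
print(is_skin(224, 172, 105))  # → True
print(is_skin(0, 0, 255))      # → False
```

In a full detector this test would run over every pixel to produce a binary skin mask, with the eye search restricted to the resulting regions.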

http://sujo.usindh.edu.pk/index.php/SURJ/article/view/1553

https://www.researchgate.net/publication/257022696_Real_Time_Face_Detection

2/ FaceNet: A Unified Embedding for Face Recognition and Clustering

https://arxiv.org/abs/1503.03832

A paper describing FaceNet, Google's face recognition system, which maps each face image to a compact embedding and achieves high accuracy and speed. FaceNet reaches 99.63% accuracy on the widely used Labeled Faces in the Wild (LFW) dataset.

Two Implementations of FaceNet for Face Recognition:

https://github.com/cmusatyalab/openface

https://github.com/davidsandberg/facenet
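The idea behind FaceNet's embedding can be shown with a toy sketch: two faces are declared the same person when their embedding vectors are close in Euclidean distance. The vectors and the threshold below are made up for illustration; real FaceNet embeddings are 128-dimensional and the threshold is tuned on data.

```python
import math

# Hypothetical 4-D embeddings; real FaceNet maps faces to 128-D vectors.
anchor   = [0.51, 0.20, 0.64, 0.53]
same_id  = [0.49, 0.24, 0.61, 0.57]
other_id = [0.10, 0.88, 0.21, 0.40]

def same_person(a, b, threshold=0.5):
    """Two faces match when their embeddings are close in Euclidean
    distance. The threshold here is an illustrative assumption."""
    return math.dist(a, b) < threshold

print(same_person(anchor, same_id))   # → True
print(same_person(anchor, other_id))  # → False
```

This is what makes the embedding "unified": the same distance comparison supports verification, recognition, and clustering.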

3/ http://www.cmlab.csie.ntu.edu.tw/~cyy/learning/papers/SVM_FaceCVPR1997.pdf


Virtual Space:

1/ Towards Massively Multi-User Augmented Reality on Handheld Devices

The authors develop a framework for implementing augmented reality interfaces on handheld devices. It includes PocketKnife, a tool for building graphical interfaces; Klimt, a software renderer; and a wrapper library that provides access to network sockets, threads, and shared memory. They demonstrate the framework with several AR games, such as the Invisible Train game.

2/ http://www.mitpressjournals.org/doi/abs/10.1162/pres.1997.6.4.355


Investor:

1/ Financial time series forecasting using support vector machines

The authors use support vector machines for financial time series forecasting and compare the results with other forecasting methods. The upper bound C and the kernel parameter gamma play an important role in the performance of SVMs, and prediction performance can be improved by selecting optimal values for them.

The C parameter controls the trade-off between maximizing the margin of the separating hyperplane and minimizing the training error: a small C favors a wider margin, while a large C penalizes misclassifications more heavily.
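The role of gamma can be seen directly from the RBF kernel the paper's SVMs use. A minimal sketch (the input points and gamma values are arbitrary, chosen only to show the effect):

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2): the similarity a pair of
    points contributes to the SVM decision function."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = [1.0, 2.0], [2.0, 3.0]   # squared distance is 2.0
# Small gamma keeps distant points similar (smooth decision surface);
# large gamma makes similarity fall off fast (risk of overfitting).
print(rbf_kernel(x, z, gamma=0.1))   # ≈ 0.819
print(rbf_kernel(x, z, gamma=10.0))  # ≈ 2e-9
```

This is why the paper reports that performance hinges on tuning C and gamma together: one controls the error/margin trade-off, the other the locality of the kernel.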

2/ https://papers.nips.cc/paper/1238-support-vector-regression-machines.pdf

3/ http://link.springer.com/article/10.1023/A:1018628609742

Annotated Bibliographies


 

T1 – Data Mining, Analysis and Prediction

Topp, N., & Pawloski, B. (2002). Online Data Collection. Journal of Science Education and Technology, 11(2), 173-178.

This paper touches on the history of online data collection, briefly reviews more recent progress and work, and explains how a database connected to the Internet collects data. It also offers insight into where these methods might head in the future. Overall, this short seven-page article gives a good overview, a starting point, and good references.

 

Hand, D., Blunt, G., Kelly, M., & Adams, N. (2000). Data Mining for Fun and Profit. Statistical Science, 15(2), 111-126.

This is a more detailed paper on the tools, models, patterns, and data-quality issues of data mining. Even though it was written in 2000, it is very useful for getting a broader idea of model building and pattern detection. It looks at statistical tools and their implementation, as well as the challenges of data mining, through well-explained examples and graphs.

 

Edelman, B. (2012). Using Internet Data for Economic Research. The Journal of Economic Perspectives, 26(2), 189-206.

Economists have always been keen to collect and analyze data for their research and experimentation. This paper introduces how data scraping has been employed by companies and businesses to extract data for their use. It is an excellent paper that combines data scraping with data analysis and shows where and how they have been used. It sets the foundation for data analysis and lists several other good papers in the field.

 

 

Buhrmester, M., Kwang, T., & Gosling, S. (2011). Amazon’s Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data? Perspectives on Psychological Science, 6(1), 3-5.

Amazon’s Mechanical Turk brings together a statistician’s dream of data collection and an economist’s love for data analysis. It has proved to be an excellent platform for conducting research not only in economics but also in psychology and other social sciences. This very short four-page paper looks at Mechanical Turk, what research it has enabled, and how it has been used to obtain high-quality, inexpensive data. The paper is significant because it is an application of the above-mentioned tools of collection, analysis, and possibly prediction.

 

T2 – A More Informed Earlham: Interactive Technology for Social Change

1/ Vellido Alcacena, Alfredo et al. “Seeing Is Believing: The Importance of Visualization in Real-World Machine Learning Applications.” N.p., 2011. 219–226. upcommons.upc.edu. Web. 20 Feb. 2017.

2/ “And What Do I Do Now? Using Data Visualization for Social Change.” Center for Artistic Activism. N.p., 23 Jan. 2016. Web. 20 Feb. 2017.

3/ Valkanova, Nina et al. “Reveal-It!: The Impact of a Social Visualization Projection on Public Awareness and Discourse.” Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. New York, NY, USA: ACM, 2013. 3461–3470. ACM Digital Library. Web. 20 Feb. 2017. CHI ’13.

 

T3 – CS for All: Learning Made Easy

1/  Muller, Catherine L., and Chris Kidd. “Debugging Geographers: Teaching Programming To Non-Computer Scientists.” Journal Of Geography In Higher Education 38.2 (2014): 175-192. Academic Search Premier. Web. 20 Feb. 2017

2/ Rowe, Glenn, and Gareth Thorburn. “VINCE–An On-Line Tutorial Tool For Teaching Introductory Programming.” British Journal Of Educational Technology 31.4 (2000): 359. Academic Search Premier. Web. 20 Feb. 2017.

3/  Cavus, Nadire. “Assessing The Success Rate Of Students Using A Learning Management System Together With A Collaborative Tool In Web-Based Teaching Of Programming Languages.” Journal Of Educational Computing Research 36.3 (2007): 301-321. Professional Development Collection. Web. 20 Feb. 2017.

Annotated Bibliography

  1. Fake news:
    1. Shao, Chengcheng, et al. “Hoaxy: A platform for tracking online misinformation.” Proceedings of the 25th International Conference Companion on World Wide Web. International World Wide Web Conferences Steering Committee, 2016.
    2. Castillo, Carlos, et al. “Know your neighbors: Web spam detection using the web topology.” Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 2007.
    3. Boididou, Christina, et al. “Challenges of computational verification in social multimedia.” Proceedings of the 23rd International Conference on World Wide Web. ACM, 2014
  2. Peer-to-peer platforms:
    1. Daswani, Neil, Hector Garcia-Molina, and Beverly Yang. “Open problems in data-sharing peer-to-peer systems.” International conference on database theory. Springer Berlin Heidelberg, 2003.
    2. Ripeanu, Matei. “Peer-to-peer architecture case study: Gnutella network.” Peer-to-Peer Computing, 2001. Proceedings. First International Conference on. IEEE, 2001.
    3. Hinz, Lucas. “Peer-to-peer support in a personal service environment.” Master of Science Thesis, Uppsala University, Uppsala, Sweden (2002).
  3. Browser Fingerprinting (possibility of going into cyber-security and related branches):
    1. Eckersley, Peter. “How unique is your web browser?.” International Symposium on Privacy Enhancing Technologies Symposium. Springer Berlin Heidelberg, 2010.
    2. Nikiforakis, Nick, et al. “Cookieless monster: Exploring the ecosystem of web-based device fingerprinting.” Security and privacy (SP), 2013 IEEE symposium on. IEEE, 2013.
    3. Acar, Gunes, et al. “FPDetective: dusting the web for fingerprinters.” Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. ACM, 2013.

Final Abstracts list


T1 – Data Mining, Analysis and Prediction

This survey paper will first look at the tools used to gather and store data from users and other domains. It will then look at how others have worked with data in the past to draw correlations and make predictions. Finally, it will examine publicly available data and try to find correlations with other market data. Our focus here is to see the extent to which one dataset can be abstractly analyzed and linked to others, and with what degree of certainty. It will involve working with a lot of data and analyzing it to find trends and patterns, and possibly to make predictions.
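At its core, linking one dataset to another comes down to computing correlation coefficients between series. A minimal Pearson-correlation sketch, with entirely made-up data (hypothetical search-interest and stock-price series):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical series: weekly search interest for a product vs. its stock price.
searches = [10, 12, 15, 21, 24, 30]
price    = [100, 103, 101, 110, 112, 118]
print(round(pearson(searches, price), 3))  # → 0.975
```

A high coefficient on toy data like this says nothing by itself; the survey's real question is how much certainty such a number carries once confounders and multiple comparisons are accounted for.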

 

Topic 2 – CS for Social Change and Sustainability

 

Every year, branches of campus such as Health Services, Facilities, Public Safety, ITS, and the Registrar’s Office send students lengthy email reports that no one ever reads. Earlham Facilities keeps records on energy consumption that students seldom look at, and every now and then issues arise around campus that divide the student body but that students rarely get to vote on.

To address these problems, I suggest a mobile survey app that allows students to vote on issues as well as view data from departments around the campus. The survey results and data would also be dynamically displayed on screens around campus. The project would involve learning and implementing graphical interface tools as well as visualization programs. If we link this through quadratics (as is done for student government voting), we can make sure that only Earlham students get to vote and that each student votes only once.

The ability to view data and trends on key statistics from these departments would put students in a better-informed position and in a place to bring change.

 

T3 – CS for all

As I watch my Econ professors struggle with STATA (a simple tool for working with data through commands), I cannot help but draw parallels to how it first felt to learn programming. The reality is that most people without a CS background have difficulty learning these new tools and software packages, much of which is outdated but still taught to students, who usually resort to memorizing commands to pass midterms. I think it would be very helpful if we as CS students could help discover, learn, teach, and document this software for other departments. I propose an interactive interface like Codecademy, where students are given tutorials that grow progressively more complex. Coordination with these departments would be essential to understand their needs and create an interface that helps their students learn from scratch.

 

{A possible addition could be a log-in mechanism via Moodle to ensure students are spending the expected amount of time on these interactive courses.}

 

 

Annotated Bibliographies

I/ Econ Simulation Game
1) “Educational Video Game Design: A Review of the Literature – Semantic Scholar.” N.p., n.d. Web. 16 Feb. 2017.
2) Squire, Kurt. “Changing the Game: What Happens When Video Games Enter the Classroom.” Innovate: journal of online education 1.6 (2005): n. pag. www.bibsonomy.org. Web. 16 Feb. 2017.
3) —. “Video Games in Education.” International Journal of Intelligent Simulations and Gaming 2 (2003): 49–62. Print.
II/ Social Network Data Mining
1) Cheong, France, and Christopher Cheong. “Social Media Data Mining: A Social Network Analysis Of Tweets During The 2010-2011 Australian Floods.” (2011): n. pag. works.bepress.com. Web. 16 Feb. 2017.
2) Kempe, David, Jon Kleinberg, and Éva Tardos. “Maximizing the Spread of Influence Through a Social Network.” Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2003. 137–146. ACM Digital Library. Web. 16 Feb. 2017. KDD ’03.
3) Wu, X. et al. “Data Mining with Big Data.” IEEE Transactions on Knowledge and Data Engineering 26.1 (2014): 97–107. IEEE Xplore. Web.
III/ Connection between stocks
Zager, Laura A., and George C. Verghese. “Graph Similarity Scoring and Matching.” Applied Mathematics Letters 21.1 (2008): 86–94. ScienceDirect. Web.

Top 3s


Project Idea:

Real-time visualisation of point clouds on an Android device

http://ieeexplore.ieee.org/abstract/document/5980567/

http://ieeexplore.ieee.org/abstract/document/6224647/

http://ieeexplore.ieee.org/abstract/document/6477040/

Project Idea:

P2P Git: a decentralised version of git version control

https://github.com/git/git

https://pdfs.semanticscholar.org/f385/29a1983e66491085d91364f30daf15ccb55f.pdf

http://www.bittorrent.org/beps/bep_0003.html

http://ieeexplore.ieee.org/abstract/document/4724403/

Project Idea:

Automatic exposure, shutter speed, ISO, and aperture algorithm implementation for digital cameras.

resources:

https://mat.ucsb.edu/Publications/wakefield_smith_roberts_LAC2010.pdf

http://www.sciencedirect.com/science/article/pii/S0096055102000115

http://ieeexplore.ieee.org/abstract/document/6339326/

http://proceedings.spiedigitallibrary.org/proceeding.aspx?articleid=979346

Capstone Abstracts – v1


Abstract 1

Recently I became interested in P2P messaging and protocols. While these protocols can offer security and prevent wiretapping (for example, bitmessaging), they have some serious drawbacks. For one, decentralization is difficult to achieve while maintaining the advantages of a centralized server, which provides a major share of the benefits of the client-server model. Even when decentralization is achieved, the resulting architectures tend to scale poorly. I haven’t identified exactly what I am going to work on, but my motivation behind the project is to focus on an aspect that makes P2P protocols more robust.

 

Abstract 2

It is a widespread belief that fake news played a noteworthy role in shaping voters’ picks in the 2016 US presidential election cycle. Fact checking, and thus weeding out fake news, is one of the most difficult challenges technology can take on; as of today, it is unlikely that a set of algorithms can match the accuracy of a human fact checker. In this paper, we examine how natural language processing can help find patterns in dubious claims as opposed to stories that are factually consistent. Employing an artificial intelligence agent, we are able to show that a “true story” is supported by several sources that report the same event or fact, while a fake news story is likely reported by a single source and then circulated. In addition, we will examine how AI can be used to detect the extent to which a story is verifiable, which is a key characteristic of a credible story.

Abstract 3

When a device is connected to the internet, a combination of several data points uniquely identifies the machine; this is known as browser fingerprinting. Advertisers and marketers use cookies to target potential customers, but these tools are easy to abuse, leaving any device connected to the internet vulnerable to attacks. We will briefly investigate the uniqueness of browser fingerprints and examine the impact of a single data point on that uniqueness. In doing so, we will analyse the privacy of a user, ways to achieve security and anonymity, and how anonymity impacts the connectivity of a device.
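The basic mechanism can be sketched as hashing a canonical concatenation of the data points a site can read. The attributes below are a small hypothetical subset; real fingerprinters also use canvas rendering, installed fonts, plugins, and more.

```python
import hashlib

# Hypothetical data points a site can read from a visiting browser.
attributes = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/52.0",
    "screen": "1920x1080x24",
    "timezone": "UTC-5",
    "language": "en-US",
}

def fingerprint(attrs):
    """Concatenate the attributes in a fixed order and hash the result;
    the digest serves as a probabilistically unique device identifier."""
    canonical = "|".join(f"{k}={attrs[k]}" for k in sorted(attrs))
    return hashlib.sha256(canonical.encode()).hexdigest()

print(fingerprint(attributes)[:16])
# Changing any single data point yields a completely different digest.
```

This also shows why measuring the impact of a single data point matters: each attribute multiplies the number of distinguishable configurations, so even a few of them can make a browser nearly unique.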

Capstone Abstracts- V1


IDEAS: 

T1 – One data predicts another

This survey paper will look at publicly available data and try to find correlations with other market data. For example, it would study how weather patterns or viral news stories correlate with the prices of certain stocks. It will try to see to what extent one dataset can be abstractly analyzed and linked to others, and with what degree of certainty. It will involve working with a lot of data and analyzing it to find trends and patterns.

Possible Ideas to build on: 

—> Will be looking into behavioral economics and how certain events follow one another. Using this, I will look for places to extract correlated data.

—> Will involve a fair bit of learning STATA to work on data and derive correlations. Some statistical modeling would be helpful.

—> Stock market data is usually well kept, but similar day-to-day data is rarely found elsewhere. One possible way to find correlations is to look in unusual places within the stock markets. For example: Boeing’s stock might be brought down by President Trump’s tweets, but what other markets have shown unusual reactions to his tweets? Perhaps a comparison of market changes with keywords in the tweets of the most popular people on Twitter in that area.

/———————————————————————————————————————————-/

T2- Computers, data and everything else.

This survey paper will look at how the trends and tools of data analysis have changed within the stock markets and particularly within the field of economics. Infamously labelled “the dismal science,” economics is only now gaining the ability to collect and manipulate data to test its theories. The paper will also look at how data analysis, enabled by modern computing, is affecting other fields.

Possible Ideas to build on: 

—> Databases used in the stock markets and how they have eased day to day operations. 

—> Other popular mass-scale data collection tools and how developments in computing have changed their workings. {This would be more of a historical dig: I would look up how and why the successors were picked over their predecessors.}

—> Some bits of this project could be reused for the first idea.

/———————————————————————————————————————————-/

T3 – Data Mining

This survey paper looks at what data is being extracted from users, how, and in what ways companies store and profit from it. It looks at targeted advertisements, cyber security, the algorithms working in the background, and the databases that sell our data.

Possible Ideas to build on: 

—> Look into the tools of data mining: the use of cookies, pop-up ads, and data extraction from search bars. How are these companies getting smarter every day, and what loopholes are they exploiting? How do they create a virtual persona of you based on what they know about you so far?

—> Learn how the government has used data from social security and taxes to analyze various sociological questions. Where else has such data analysis existed within computer science, and how can the two be related?

/——————————————————————————————————————————-/

Capstone Abstracts – v1


I/ Sometimes lectures and textbooks can be too “dry” for students to get excited about a subject, particularly economics. At the same time, researchers have found that games hold potential in education, especially when used as an introduction to new concepts. EconBuild is a game that simulates different aspects of economics normally encountered in introductory classes, providing students a platform to practice what they learn in class. The game can help students reinforce the most fundamental elements of economics, such as supply and demand and the stock market.

II/ In this day and age, more and more businesses choose to expand their brand using social networks, so social media users continuously generate advertisement, both positive and negative. To stay competitive, a company needs to establish its online presence as well as analyze its competitors’ dominance. Using a Hadoop-based approach to reduce the size of the database, we can gather and analyze information about a company on social media and predict trends to help with its growth.

III/ The stock market is usually unpredictable; there is no particular rule that it obeys, which is why investing in stocks is considered risky. Many people have tried to analyze particular trends in order to guess whether a stock price will rise. However, there has not been much software that analyzes the relationships between different related stocks. Using a support vector machine approach combined with a graph similarity scoring and matching algorithm, we can establish relationships between different stocks, opening the possibility of predicting particular stock trends.
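One simple way to score how alike two stock-relationship graphs are is the Jaccard index on their edge sets. This is only a toy stand-in for Zager and Verghese's iterative similarity method, and the tickers and edges below are made up:

```python
# Toy graph similarity: each graph is a set of edges between ticker
# symbols ("these stocks moved together"); the Jaccard index on the
# edge sets scores how similar two snapshots of the market are.
def jaccard_similarity(edges_a, edges_b):
    inter = len(edges_a & edges_b)
    union = len(edges_a | edges_b)
    return inter / union if union else 1.0

# Hypothetical "moves together" graphs from two different months.
january  = {("AAPL", "MSFT"), ("AAPL", "GOOG"), ("XOM", "CVX")}
february = {("AAPL", "MSFT"), ("XOM", "CVX"), ("MSFT", "GOOG")}

print(jaccard_similarity(january, february))  # → 0.5
```

A full system would derive the edges from price-correlation data and use a proper graph-matching score, but the same idea applies: stable relationships across snapshots are candidates for prediction.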

Capstone Abstracts – v1


This paper will describe a project that uses support vector machines (SVMs) to predict stock prices. Because the method is supervised, the data must be labeled, which fits stock evaluation: a stock’s information comes from its financial statements, which are all labeled. This particular project uses a variant called least squares support vector machines (LS-SVM), which is suited to regression analysis. The language is Python with scikit-learn, which has SVMs implemented in the library.
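The distinctive feature of LS-SVM is that it replaces the standard SVM's quadratic program with a single linear solve. A minimal sketch of that idea, under stated assumptions: this uses the simplified bias-free formulation (equivalent to kernel ridge regression), numpy instead of scikit-learn, and toy data, so it is not the project's actual implementation.

```python
import numpy as np

def lssvm_fit_predict(X, y, X_test, gamma=1.0, C=10.0):
    """Bias-free LS-SVM regression sketch: solve (K + I/(2C)) a = y,
    then predict with the kernel against the training points."""
    def rbf(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    K = rbf(X, X)
    # One linear system instead of the standard SVM's quadratic program.
    alpha = np.linalg.solve(K + np.eye(len(X)) / (2 * C), y)
    return rbf(X_test, X) @ alpha

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])          # toy targets (y = x^2)
print(lssvm_fit_predict(X, y, np.array([[1.5]])))
```

With real financial-statement features the same fit/predict shape applies; the linear solve is what makes LS-SVM cheap enough to retrain frequently.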

This paper will describe a project using augmented reality (AR). AR is a live direct or indirect view of a physical, real-world environment augmented (or supplemented) by computer-generated sensory input such as sound, video, graphics, or GPS data. For this project, I will use Swift to implement an iOS app that gives users an augmented-reality graphical view with supplemental GPS information. The application will take the user’s location and overlay additional information about nearby points of interest (POIs) as they come into view.

This paper will describe a project that uses machine learning for real-time face detection and recognition through a phone’s camera, comparing the results against the college’s student database. The application will allow people to connect easily by showing a person’s name, location, and mobile number with just a glance through the phone. The program will run on iOS and Android using Cordova as a base.