Topic Statement


Deeksha Srinath

Senior Seminar Topic Statement

Advisor: Charlie Peck


My topic grew out of my interest in how social media influences our lives today. I will be working to design a unified data model for Facebook and Twitter data so that I can query a pool of data that spans multiple social media platforms. This is useful to the scientific process because people interact with different social media sites differently. With a unified data model, I will be able to analyse trends across platforms.
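As a rough illustration of what a unified data model could look like, the sketch below maps posts from both platforms onto one shared record type. Everything in it is an assumption made for the example: the `UnifiedPost` fields, the helper names, and especially the keys of the raw dictionaries, which are simplified stand-ins and not the real Twitter or Facebook API payloads.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

HASHTAG = re.compile(r"#(\w+)")


@dataclass(frozen=True)
class UnifiedPost:
    """One post in a platform-neutral shape (hypothetical field set)."""
    platform: str
    post_id: str
    author_id: str
    text: str
    created_at: datetime
    hashtags: tuple


def _hashtags(text):
    # Lower-case hashtags so queries match regardless of capitalisation.
    return tuple(tag.lower() for tag in HASHTAG.findall(text))


def from_tweet(raw):
    # Assumed raw keys: id, user, text, timestamp (seconds since the epoch).
    return UnifiedPost(
        "twitter", str(raw["id"]), str(raw["user"]), raw["text"],
        datetime.fromtimestamp(raw["timestamp"], tz=timezone.utc),
        _hashtags(raw["text"]),
    )


def from_facebook_post(raw):
    # Assumed raw keys: post_id, from, message, created (ISO 8601 string).
    return UnifiedPost(
        "facebook", str(raw["post_id"]), str(raw["from"]), raw["message"],
        datetime.fromisoformat(raw["created"]),
        _hashtags(raw["message"]),
    )


def posts_with_hashtag(posts, tag):
    """Query the pooled data without caring which platform a post came from."""
    return [post for post in posts if tag.lower() in post.hashtags]
```

Once every source is converted at the boundary, a single query such as `posts_with_hashtag(pool, "bodypositivity")` runs over the whole pool.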


Once my data model is established and I have moved my data into it, I am interested in exploring the different scenarios around disordered eating on social media. In a day and age when everyone has access to everyone else’s pictures at the touch of a button, I am curious about what this is doing to body image and body positivity among young women in the US, particularly women of color. Eating disorders in the US are steadily climbing, with thousands of young women losing their lives to disordered eating. Body positivity is also on the rise, with more and more people speaking out about loving their body as is and embracing the beauty in difference.
I am interested in exploring how to mine trends in the data across platforms. I do not have enough background in psychology to understand every facet of this part of my project, so I will be working with the Psychology department to better understand what to look for and how to query my data usefully once it is in a unified format.

Project Topic


For my Senior Research, my topic will be a data mining project using data collected from Twitter. Twitter’s API lets users collect up to 1% of the tweets within a chosen spatial region (in my case, the continental U.S.A.). This data has been collected for over 3 years, and represents well over one billion tweets. Of these, a significant percentage contain at least one hashtag, which is one kind of data I will be looking at. The other kind of data I am interested in is geotags, which are optional GPS coordinates that users may choose to include. Using machine learning algorithms, I hope to identify regular hashtags in order to classify different kinds of signals based on hashtag frequency. The purpose of this is to see whether I can predict hashtag occurrence, or whether hashtags are too noisy to classify or group into reliable frequencies.

My goal is to then study the noise, and to give that noise a geo-spatial context in which to understand the events which contributed to that noise.

Here’s a simple example:
Given that the State of Indiana tests tornado sirens on the first Tuesday of each month, it is likely that hashtags similar to #tornado or #siren appear in greater numbers on the same days as the tests. This is a regular signal which could be reduced to a variability of ±6 hours. This signal can be ignored. However, should a tornado strike on a different day, the sirens will go off, and #tornado or #siren might appear on an irregular day. The siren creates a spatial event which only affects the region which hears it, which might distinguish it from the more regular signals.
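The siren example can be sketched in a few lines. This is only an illustration under simplifying assumptions: counts are aggregated per day (so the 6-hour variability is ignored), the spike threshold is a hand-picked number, and the function names are hypothetical.

```python
from datetime import date, timedelta


def first_tuesday(year, month):
    """Return the date of the first Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Tuesday is 1.
    return first + timedelta(days=(1 - first.weekday()) % 7)


def irregular_spikes(daily_counts, threshold):
    """Days on which the hashtag spikes outside the regular test schedule.

    daily_counts maps a date to the number of tweets with the hashtag.
    """
    return sorted(
        day
        for day, count in daily_counts.items()
        if count >= threshold and day != first_tuesday(day.year, day.month)
    )
```

A spike on the first Tuesday is treated as the regular signal and dropped; a spike on any other day is kept as a candidate real event.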

At a larger scale, looking at the noisy hashtags might give insights into real time, less predictable events. This can help de-obfuscate growing stories or events in real time, allowing us to find the meaningful information which hides under layers of signals.

I will be doing this research with David Barbella (Dave). Dave and I will be working with resources hosted by NCSA, including the CyberGIS Supercomputer ROGER (an XSEDE resource, for others that are interested).

Revised Project Proposal


. . ., but it may be revised again soon.

Our goal is to make a 3D rhythm game that would, among other possible applications, teach players the gestural motions of an orchestral conductor. The basic gameplay is that at certain points in the music, the game will show where the wand needs to be placed in 3D space and calculate a score based on the distance between the intended position and the actual position of the wand. The gameplay would be comparable to the free game osu!, except that no other inputs (e.g. mouse buttons) are required to play the game.
The project will consist of both software and hardware components, namely the game itself and a controller made specifically for said game, respectively. Currently, the game is planned to be built from scratch while including libraries such as OpenGL for a graphical interface and PortAudio for interacting with audio. Meanwhile, the required hardware may include just two infrared cameras/sensors as well as one infrared emitter on the tip of a wand for the cameras to detect. The reason for using infrared is to minimize any background interference that may occur when tracking a specific object as opposed to tracking by color.
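The distance-based scoring described above could look something like the following sketch. The linear falloff, the `tolerance` radius, and the 100-point maximum are assumptions chosen for illustration; the proposal does not fix a scoring formula.

```python
import math


def hit_score(target, actual, max_score=100, tolerance=0.5):
    """Score one beat from the 3D distance between target and wand position.

    Full marks at zero distance, falling linearly to zero at `tolerance`
    (same units as the coordinates). Both values are hypothetical.
    """
    distance = math.dist(target, actual)  # Euclidean distance, Python 3.8+
    if distance >= tolerance:
        return 0
    return round(max_score * (1 - distance / tolerance))
```

For example, `hit_score((0, 0, 0), (0.25, 0, 0))` is half the maximum because the wand is halfway to the edge of the tolerance radius.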

Senior Project in HCI


Big Picture

Topic: Software Interfaces and Human Behavior

Adviser: Charlie Peck


While this requires some additional refinement, I’ve settled on the general topic and hope to incorporate some of my interests from the other topics along the way.

I will study how interfaces affect interactions between humans and computers. There is a rich history in this area, both in academia and in history/current events. Charlie recommended the Apple design guidelines, an outstanding trove of insights about why components of a software or OS interface should be designed in a particular way. From my own research I see that human-computer interaction (HCI) contributes to choices about everything from Facebook privacy to nuclear meltdowns.

In a potential paper, I would introduce the history of some high-profile HCI choices before zooming in on a few particular factors (to be determined) for more careful analysis and software design. This is a new area of study to me, so which factors I choose in particular will be determined upon completion of further research.

For the software project, I will approach this from the perspective of optimizing the response time for a given interaction. I intend to create a simple application, likely web-based to make scale feasible, with two simple interfaces and a series of prescribed interactions to be done in a given order. I have considered using a few of our local datasets: Iceland data, 911 emergency call data, transportation data, and a few public datasets on key topics. My intention with the software is to focus on the HCI components, so my preference is to use the data environment I am already familiar with as the backend for the project.

Since the major concern with my project, as with most, is getting directly to the CS, I intend to focus in particular on these subdomains:

  • Human-Computer Interaction: Trivial from the description.
  • Software Engineering: Trivial from the need to design and implement the application.
  • Algorithms: Choosing the correct algorithm to optimize the interaction and evaluating the timing data.
  • Relational Databases: The backend data will be stored in a PostgreSQL database.

In addition, this project draws on insights from several topics in the social sciences – behavioral economics, psychology, business – but I consider these topics as launch sites rather than journeys or landing sites.

Updated Project Idea


Intelligent Personal Assistant for Medicine

Research Supervisor: Dave Barbella

I want to build software (potentially a mobile application) that acts as an intelligent personal assistant for medical purposes. The inspiration comes from modern programs like Siri, but instead of being general-purpose, I want it to have a narrower focus (i.e. medicine). While I am still working on the details, I envision that you can talk to the app about various things such as diseases, medicines, hospitals and so on. I want the communication between the user and the app to be as human-like as possible. The app will also do other things like remind you to take your medicine, tell you if your symptoms match those of some disease, tell you when it’s time to go for a regular check-up and so on. I anticipate integrating third-party web services to make some of this functionality possible. I also expect to draw heavily on the work on CALO (Cognitive Assistant that Learns and Organizes), among other resources.

There will be various aspects of computer science (or Artificial Intelligence specifically) that will be at the heart of this project such as:

  • Natural language processing
  • Question Analysis
  • Data collection/mashup
  • Reasoning/Pattern detection

While these are all new fields of study for me, I am excited to learn more about them and apply them while conducting my research/project.

Potential Project Ideas

1. Connecting a seemingly similar history to a surprisingly variable present

With this project, I would examine how a set of nations (a subset of Scandinavian nations) that are today relatively homogeneous in terms of race and economic capacity have vastly differing attitudes and policies around immigration and integration of immigrants. This interest developed as I was reading about Iceland’s immigration policies before visiting there and was struck by its very open immigration policy. Part of the reason this was so striking to me was its proximity to nations that, in comparison, are very closed to immigration. I have yet to find the serious Computer Science in this project, but I am hoping that in learning more about the question I am trying to ask, I will find the Computer Science tools I could use to answer it.


2. Analyzing Twitter data to study emotional health as tied to disordered eating

Social media is in our homes, and in our kitchens. This project would be a foray into studying Twitter data about eating preferences. With information about healthy eating at everyone’s fingertips, it’s easy to get pulled into the many fad diets that are popular on the interwebs on any given day. Through this project, I would study how patterns in the popularity of fad diets affect dietary preferences as projected on Twitter. Disordered eating is on the rise in the US, as is veganism. The question I will be trying to ask in this project is whether so-called lifestyle changes (such as switching to a vegan lifestyle) have become the ad-hoc way of normalising disordered eating, and whether this phenomenon is discoverable through Twitter data.

Potential Projects (UPDATED: SEP21)


UPDATED IDEA: Object Recognition and Tracking for Augmented Reality

While exploring Augmented Reality and the AR-based applications currently circulating on the internet, I have seen the limitations of Augmented Reality, especially in object recognition and tracking. I would like to survey the current capabilities of the available object recognition and tracking technologies and how we can improve them. If possible, I want to push further so that markerless augmented reality can be less complex and frustrating and we no longer have to rely heavily on markers.



Idea 1: Educational/Fun Augmented Reality Application

Seeing and interacting with digital creations of your favorite characters in reality may sound like an unrealistic fantasy but, thanks to the rapidly advancing realm of technology, we can now bring our imaginations into reality. I would like to make an educational yet fun application targeted at kids, but the idea is not limited to only kids or education. The application allows the user to explore his/her surroundings and interact with objects using any AR-capable device. The application should recognize an object or a part of an object and create an overlay which the user can interact with.

Idea 2: Facial/Image Recognition (Computer Vision) and Algorithms Behind it

While neurologists and other scientists are debating whether the ability to recognize is innate, facial/image recognition has been an easy task for humans. It is so easy that we are not even aware that we are able to function because we can recognize things. However, it is still difficult for computers to perform this task. I think this would be the challenging part of perfecting my Idea 1, and I would like to spend time researching how we make computers recognize faces or objects under different circumstances.

Idea 3: Schedule Planner

At the beginning of every semester, the supervisor of libraries has to make a work schedule that works around the student workers’ timetables, and it is a very tedious process. I want to come up with software, or at least an algorithm, that would take in students’ varied timetables and build a schedule that makes everyone happy.
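One possible starting point for such an algorithm is a small backtracking search. This is a sketch under assumptions not stated in the idea itself: each shift needs exactly one worker, availability is given as a set of shift names per student, and "making everyone happy" is reduced to a cap on shifts per student. All names are hypothetical.

```python
def build_schedule(shifts, availability, max_shifts_per_worker):
    """Assign one available worker to every shift, or return None.

    shifts: list of shift names.
    availability: dict mapping worker name to the set of shifts they can work.
    """
    assignment = {}
    load = {worker: 0 for worker in availability}

    def assign(index):
        if index == len(shifts):
            return True  # every shift is covered
        shift = shifts[index]
        for worker, free in availability.items():
            if shift in free and load[worker] < max_shifts_per_worker:
                assignment[shift] = worker
                load[worker] += 1
                if assign(index + 1):
                    return True
                # Undo the choice and try the next worker.
                del assignment[shift]
                load[worker] -= 1
        return False

    return assignment if assign(0) else None
```

Backtracking is exponential in the worst case, which is acceptable for a single library's staff; a larger problem would likely call for a constraint solver instead.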

Project Ideas


1.)  Data Compression

I am interested in how data is represented as MPEG, JPEG, and other file formats, and how this data can be used to display an image or video.  In particular, I am interested in the compression algorithms used to store this data in a smaller space, with little or no loss in the quality of the information.  I would explore various lossless and lossy compression algorithms in the paper, and explain their strengths and weaknesses.  I could then create some code to illustrate some compression algorithms and how they work.
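As a small taste of what illustrative code for a lossless algorithm might look like, here is a sketch of run-length encoding. It is chosen only because it is short and easy to follow; it is not presented as the algorithm JPEG or MPEG uses, and the function names are hypothetical.

```python
def rle_encode(data: bytes) -> list:
    """Collapse each run of identical bytes into a (byte value, run length) pair."""
    runs = []
    for byte in data:
        if runs and runs[-1][0] == byte:
            runs[-1] = (byte, runs[-1][1] + 1)
        else:
            runs.append((byte, 1))
    return runs


def rle_decode(runs) -> bytes:
    """Rebuild the original bytes exactly, which is what makes this lossless."""
    return b"".join(bytes([byte]) * count for byte, count in runs)
```

The weakness is easy to see: data with few repeated runs produces more pairs than it had bytes, so the "compressed" form is larger than the input.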


2.)  3-D Passwords

While passwords are crucial to how we protect our information, they are also tedious to remember.  One interesting alternative is 3-D passwords.  The idea is that the user is placed in some sort of 3-D environment with various objects that can be interacted with.  The user could enter a password by interacting with various objects in the environment in a specific sequence.  For example, a user might move a chair, head to a thermostat, and then change it to a specific temperature as a way of entering a password.  This would be an appealing idea to explore in a project as well.
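One way to think about checking such a password is to treat the ordered sequence of interactions as the secret and store only a salted hash of it. The sketch below assumes each interaction can be reduced to an (object, action) pair of strings; that encoding, the function names, and the iteration count are all assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def encode_actions(actions):
    """Serialise an ordered list of (object, action) pairs to bytes."""
    return "|".join(f"{obj}:{act}" for obj, act in actions).encode("utf-8")


def enroll(actions, salt=None):
    """Return (salt, digest) to store in place of the raw sequence."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", encode_actions(actions), salt, ITERATIONS)
    return salt, digest


def verify(actions, salt, digest):
    """Check an attempted sequence against the stored digest."""
    candidate = hashlib.pbkdf2_hmac("sha256", encode_actions(actions), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

Because the pairs are joined in order, the same interactions performed in a different sequence produce a different digest and are rejected.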


3.)  Soft Computing

I was reading about soft computing and the idea seemed interesting and different from other ideas I have encountered so far in computer science.  I would be interested in exploring it further, but don’t have a specific idea yet.

Project Ideas


Topic: Bioinformatics to track one’s health

The goal of this research is to use one’s personal health data to track and display a timeline of one’s health progress. First, by gathering relevant data from various inputs, this software will be able to organize and store all the data. Second is the display of health records for easy access, as well as reminders for prescription refills, appointments, and when to take medicine. The last part is to incorporate an algorithm that tracks one’s health record to create a timeline or data sheet of one’s health for personal and medical use. The Computer Science aspect of this research will include a lot of machine learning, such as input organization and variance tracking.

Project Topic


Facial Recognition

Facial recognition is something that we as human beings have been doing since the beginning of time. We have also become masters at identifying a person’s mood or emotion simply by looking at their facial expression. Today, we have harnessed the powers of artificial intelligence and are now able to apply it to facial recognition software. This software usually relies on “faceprints”, which are collections of data that contain certain features and dimensions of the face (length/width of nose, depth of eye socket, etc.). With this software we are able not only to scan an image for a face, but also to determine the expression or emotion that the face is portraying.  This is where I want to focus my research. I want to study facial recognition and how the software is able to detect faces and their expressions to determine human emotions.


Project Ideas


Idea 1 (Intelligent Personal Assistant for Medicine):

I want to build software (potentially a mobile application) that acts as an intelligent personal assistant for medical purposes. The inspiration comes from modern programs like Siri, but instead of being general-purpose, I want it to have a narrower focus (i.e. medicine). While I am still working on the details, I envision that you can talk to the app about various things such as diseases, medicines, hospitals and so on. I want the communication style to be as human-like as possible. The app will also do other things like remind you to take your medicine, tell you when it’s time to go for a regular check-up and so on. I anticipate integrating third-party web services to make some of this functionality possible. I also expect to go through the work on CALO (Cognitive Assistant that Learns and Organizes), among other resources.

Idea 2 (Optical Character Recognition):

OCR, the process of converting images of typed, handwritten or printed text into machine-encoded text, has always been something I have been curious about. I want to research how this process is done and hopefully recreate the technology. For a more personalized experience, I will try to have the app learn the particular user’s handwriting style and then hopefully achieve a higher degree of recognition accuracy.

Idea 3 (Dissecting/Adding functionality to a machine):

While this idea seems increasingly less likely, I thought I would make a note of this regardless. Having had some interest in working with hardware/circuits, I wanted to open up a machine, learn more about the internal components/circuits. Along with that, I also wanted to add some other piece of hardware and add functionality to the machine.

Project Ideas


Some of this is copied and pasted from the email I sent last spring, but anyway, here are my two project ideas:

  • Expanding on the research that I have been doing with Forrest, I hope to make a 3D rhythm game that just uses two Raspberry Pi infrared (NoIR) cameras as well as one infrared emitter on the tip of a wand for the cameras to detect. The basic gameplay is that at certain points in the music, the game will indicate where the wand needs to be placed in 3D space and calculate a score based on the distance between the intended position and the actual position of the wand. If I have to compare this to any other rhythm game out there, it would be osu!, since the wand would effectively be treated as a mouse cursor for the game to interact with, and the gameplay would look similar, but without the mouse clicking. In terms of development, I may start with one camera and a 2D game for ease of programming and a simpler user interface, and then transition to 3D afterwards using algorithms already discussed with Forrest and Jim Rogers to get coordinates in 3D space using the two cameras. If all of this goes well, I may add a second emitter and a second wand, so that either one person can control both wands for added difficulty, or two people can control one wand each for competitive or cooperative play. My goal for this is to make a game that is relatively cheap on hardware compared to most other rhythm games out there, and potentially teach players the gestural motions of a conductor or any other possible motions. For now, the player won’t be in direct control of the music, but that may be a feature that can be implemented further in the future, as part of the game or in a separate DAW application such as Ableton Live.
  • I would like to learn more about data compression and the algorithms that go into it, especially in the context of compressing media files such as music, images, and video, and the difference in algorithms when using lossy or lossless compression. If there is an algorithm that can perform better (in terms of either running time or resulting file size) than what existing file formats currently provide, I can write a program that compresses files using said algorithm, either into a new file format or as an improvement to an existing one. If the compression is lossless, I can also decompress the compressed files back to their original state.


Research project ideas


Possible research: Spatial computational resource allocation

see also: CyberGIS’16 panel

Data structures are fundamental to the efficiency of algorithms for transfer and storage, computation, and visualization. Parallel and distributed computing comes in many implementations whose purposes vary greatly. Centralized computing networks make new resources available to more institutions, but the bridge between onsite spatial data collection and offsite computing is uncertain, even in terms of data structuring. Changes in resolution and computational needs have brought bitmap and vector closer than ever, but the software relies on centralized resources, few of which are designed for LiDAR terrain mapping.

Research topics:

1: Study data structures to store spatial information. Do aspects of existing structures resolve any problems faced by users?

2: Study whether spatial data compression could be implemented to improve computability and

3: Study methods for data browsing and distributed storage solutions. Big data systems may limit the file sizes remote end users can personally compute with, but some data must still be represented by the remote end user.

Project Ideas


Update: my capstone project will be a survey of the techniques and methodologies used to make deep learning models smaller and more efficient so that they can be run on mobile platforms. If time permits, I would also like to research voice conversion using deep learning, especially the recently published paper on the topic, WaveNet.

Advisor: David Barbella


Update: Annotated Bibliography


1. Deep learning on mobile

With the recent advances in deep learning and the increase in the amount of data, we are now able to create smarter applications with more accurate recognition engines. Most mobile applications using deep learning only work online, with the main processing being done on cluster servers. But this introduces unnecessary delays and network bandwidth usage, with the additional disadvantage of not working at all when the device is offline. Scaling down deep learning models to fit on a mobile device is an interesting area of research that could have an impact on future mobile applications. The power of mobile hardware will surely keep increasing, but there are quite a few software techniques we can use to reduce the model size so that it can fit on an average mobile computing device.

More concretely, this research will analyze the current techniques used to reduce model size and possibly suggest future optimizations.

Current Techniques


Low precision arithmetic

Related Research

Compressing Deep Learning Models with Pruning, Quantization and Huffman Coding
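To make the idea of low precision arithmetic concrete, the sketch below shows one simple form of it: uniform (linear) quantization of floating-point weights to 8-bit integer codes. This is an illustrative assumption about what such a technique can look like, not the method of the paper listed above, and the function names are hypothetical.

```python
def quantize(weights, bits=8):
    """Map float weights onto integer codes in [0, 2**bits - 1].

    Returns (codes, scale, lo) so the weights can be approximately restored.
    """
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    # Guard against a zero range when all weights are equal.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return [lo + code * scale for code in codes]
```

Each weight then needs one byte instead of four (for 32-bit floats), at the cost of a rounding error of at most half a quantization step per weight.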


2. Voice conversion/morphing

Voice conversion is about converting voice input from one person into a target’s voice signature. It used to be very hard to replicate and convert to another person’s voice due to the sheer complexity of the task: accounting for different accents and specific individual quirks. But it has theoretically become possible to achieve a relatively effective conversion using neural networks. The practical applications are ample, ranging from entertainment to giving unique voices to the disabled. Some security systems that use voice recognition could even become obsolete.

Related Research

Voice Morphing

High Quality Voice conversion using Deep Neural Networks


3. Shopping Experience Enriched by Machine Learning

Recommendation engines are already a popular machine learning application used in e-commerce. In this research, I would like to experiment with further applications of machine learning to enrich the user’s shopping experience. For example, we could train a model to understand the reviews and summarize them instead of having the user read over the most recent reviews, which may or may not be related to what they’re looking for. Also, the user could specify a problem statement (e.g. “I want to buy a present for my dad”), and the system could suggest possible gifts based on some prior training dataset.


Panel overview


Panel: Future Directions of CyberGIS and Geospatial Data Science (Chair: Shaowen Wang)
Panelists: Budhendra Bhaduri, Mike Goodchild, Daniel S. Katz, Mansour Raad, Tapani Sarjakoski, and Judy —

Selected topics by Ben Liebersohn


  • 3D domains are limited; more GIS integration with 3D rendition and simulation would be well received.
  • Support for different types of data, which are sometimes proprietary or otherwise have limited longevity.
  • Can we analyze data that needs a 3D representation in order to compute simulations with it? Not everything is just landscapes (possibly meaning >3 dimensions? -B).
  • Decision support systems need more types of data. We need the integration with the applications as well.
  • Real time data streams and distributed loads which serve local decisions on broader, better networked scales.


  • Integration needs quantification of size and needs. What do we envision as the problem, and the scope? What technology (hardware, network) is needed?
  • What does all this data mean? What do we do about it? This gets you closer to the science policy area.



“As an outsider, when I see what’s going on in this community I ask: what unique problems is this community facing versus common problems? I presented networking and cloud stuff you may not have seen before. The application can drive the network and the compute resources. Flexible and scalable networks. Maybe both sides can help one another.”

Project Ideas


Idea 1:

Developing some sort of hardware/software combination that would allow for monitoring of washers and dryers on Earlham’s campus. I would then create an app of some sort so that students could 1) get notifications when a machine is done and 2) see which machines are available, so that they do not have to make the trek to their closest washing machine only to find out that the machines are all taken.  Right now, my idea for the hardware is just a device that is plugged into the outlet at the same spot as the machine, kind of like an adaptor, and will broadcast a signal telling whether the machine is running or not.  The software will then simply read the broadcast to determine if a machine is running or not.
Idea 2:


Developing a piece of software that is able to perform population estimation by scraping information from popular sites. I would most likely scrape Instagram posts with tags, Twitter posts that mention a location, Facebook photos with location tags, and also, if possible, recent Google searches regarding the location.  For instance, if 1000 people in the last day had searched Google for food locations on Miami Beach, it is a good predictor that a high proportion of those 1000 people will be visiting Miami Beach in the near future.  A person would then use my piece of software to, say, look up how busy Miami Beach is on a given day and get a prediction of how busy it will be in the near future. This approach would require a lot of probability in the calculations.


Idea 3:


This idea would be the same concept as Idea 2, but instead of scraping information, I would obtain the location information (or create fake data, as Charlie suggested) and process that data to provide a more accurate depiction of the population at a location at any given moment.  The predictive capability would then be based on past data that I had collected.
