2015 Computer Science Senior Design Projects

Student Report Title Student Report Title

Some links below may require MS Powerpoint 2007 or Adobe Reader
Click on a student picture for a larger view.

Alvin Andino

Advisor: Prof. John Rieffel

Evolving Behaviorally Diverse Soft Robots

Soft Robotics is an emerging field with many promising applications including the next generation of disaster first responders. One outstanding challenge in the field is finding ways to make soft robots move. A novel solution has been vibration, rather than linear or rotary actuation, as a means of locomotion. Soft Robots are complex enough systems where its unknown which specific motor frequencies produces spe- cific behaviors. One could have trails of a soft robot until its full behavior is found. This can be inefficient and time consuming to do with every new design. Our research is to develop a more efficient method of evaluating behavioral diversity for soft robots with unknown behavior and incorporating it in an existing genetic algorithm used to evolve soft robots. We breed robots based on how differently they behave to various motor frequencies. The aim is figure out the frequencies to navigate a space with the minimum amount of trails.

Poster Link    Report Link    Presentation Link   

Benjamin Berger

Advisor: Prof. John Rieffel

Evolving Scalable Soft Robots

Designing soft robots is difficult, time-consuming, and non-intuitive. Instead of requiring humans to engineer robots, this research uses genetic algorithms to evolve designs for robots that move when vibrated. Generative encodings are used to represent designs and are modified during the evolutionary process. A generative encoding is a set of rules that describe how to construct a 3D object. If the rules are applied over and over, it will create a larger and more complex robot. Current methods of evolving generative encodings fix a number of times to apply the rules a priori. This restricts the usefulness of the resulting encoding to one size. Instead, this research aims to create scalable solutions, with generative encodings capable of producing fit robots of various sizes. This is done by implementing the notion of a pareto frontier into the genetic algorithm. Each design produced is assigned to a category (small, medium, large, etc) and is judged on how well it can create robots that move when vibrated in each category. One design is deemed better than another if it can dominate across all categories. The resulting generative encodings should be able to produce soft robots of various sizes that can move when vibrated.

Poster Link    Report Link    Presentation Link   

Conor Carey

Advisor: Prof. Cass & Prof. Yaisawarng

Predicting Stock Price Direction Through Data Mining and Machine Learning Techniques

The stock market attracts risk takers who are willing to bet their money on whether a stock price will increase or decrease. What if you had an edge in this gambling mecca? Machine learning techniques are among the newest methods of stock market price trend prediction. Using a support vector machine (SVM), as implemented by WEKA, and eleven attributes in each data point, the data was classified as either BUY or DNB (Do Not Buy), depending on whether the price would increase or decrease. Many previous experiments’ goals were to find attributes that correlated with future stock price trends. These attributes were used to create the dataset used in this study, along with a nominal sector attribute, to determine if the addition of this variable improves the accuracy of the stock price trend prediction model.

Poster Link    Report Link    Presentation Link   

Hyung Yul Choi

Advisor: Prof. Kristina Striegnitz

Automatically Determining Review Helpfulness

Customer reviews from commerce websites have valuable information for online shoppers. They help shoppers gauge whether or not a product is worth the purchase. However, reviews vary in their quality and helpfulness. Most commerce websites have voting systems where shoppers can vote on whether a review was helpful to them or not. For popular products however, the number of reviews can be in the thousands. As a result, not all reviews will get enough attention to receive helpfulness votes even though some may contain helpful information for other shoppers. In these scenarios, it would be desirable to be able to automatically collect the most helpful reviews. This research aims to do this by finding features in the review text that are indicative of its helpfulness and training a learning algorithm that can determine review helpfulness.

Poster Link    Report Link    Presentation Link   

(Peter) Zachary Davis

Advisor: Prof. Kristina Striegnitz & Prof. David Barnett

What It Is To Be Conscious: Exploring the Plausibility of Consciousness in Deep Learning Computers

As artificial intelligence and robotics progress further and faster every day, designing and building a conscious computer appears to be on the horizon. Recent technological advances have allowed engineers and computer scientists to create robots and computer programs that were previously impossible. The development of these highly sophisticated robots and AI programs has thus prompted the age-old question: can a computer be conscious? The answer relies on addressing two key sub-problems. The first is the nature of consciousness: what constitutes a system as conscious, or what properties does consciousness have? Secondly, does the physical make-up of the robot or computer matter? Is there a particular composition of the robot or computer that is necessary for consciousness, or is consciousness unaffected by differences in physical properties? My aim is to explore these issues with respect to deep- learning computer programs. These programs use artificial neural networks and learning algorithms to create highly sophisticated, seemingly intelligent computers that are comparable to, yet fundamentally different from, a human brain. Additionally, I will discuss the required actions we must take in order to come to a consensus on the consciousness of deep learning computers.

Poster Link    Report Link    Presentation Link   

Stephen DiIorio

Advisor: Prof. Matthew Anderson

Optimizations for Rendering Realistic Lens Flares in Polynomial Optics

Lens flare is a common optical phenomenon exhibited by lens systems, like those used for professional photography or film. As light travels from the front of a lens system towards the sensor at the back, it can either refract through or reflect off of the surfaces of the lenses. A lens flare is the result of (typically) unwanted reflections and scattering caused by imperfections in the lenses. This reflected or scattered light then travels to the sensor following unexpected paths. While considered by some to be degrading artifacts, lens flares have grown to become an essential ingredient for realistic imagery and are often used as artistic effects. As such, adding lens flares to realistic image rendering has become a desired feature. However, while it is possible to render realistic lens flares using ray tracing, current techniques are often too slow to use in real-time. On the other hand, approximations can be made to vastly improve the frame rate of rendering, but this comes at the cost of visual fidelity. We present several methods of approximation in an effort to render a more realistic lens flare in real-time.

Poster Link    Report Link    Presentation Link   

William J. Doyle III

Advisor: Prof. Aaron Cass

Classification of System Call Sequences using Anomalous Detection

We used data mining techniques to detect intrusions among system call traces and have outlined our results. Recent work at the intersection of security and machine learning has lead to better understanding of anomalous intrusion detection. There is a need to more thoroughly understand how anomaly detection can be used because of its potential applications and advantages over current standard methods. In this thesis, we report on a new approach of anomalous detection using system call traces. Our goal is to be able to create a system that can accurately detect hacking attacks by analyzing the sequences of system calls the operating system is performing. We will look at how this data can be processed to achieve correct detection of intrusions on a system. In the end, we will outline ways in which system call traces can be leveraged as well as what we can do and learn from these results.

Poster Link    Report Link    Presentation Link   

C. Lee Fanzilli

Advisor: Prof. Dvorak & Prof. Webb

Stock Price Prediction Using Sentiment Detection of Twitter

If Amazon can predict what books we want to read, Netflix can predict what movies we want to watch, and Google, if you are feeling lucky, can predict what we are looking for, then it wouldnt be farfetched to say that Twitter can predict what stocks we should buy. The prediction of stock trends based on this kind of data analysis have been a hot topic for some time now, because of the growth of social media and our technological advances in analyzing large amounts of data. With the development of various computing methods there have been studies [demonstrating that] computing techniques outperform conventional models in most cases. The conventional investor, a long-term, rational person who picks stocks based on a companys history, board team, and projected performance, is now a bit of an anachronism. It is still true that past performance does not indicate future gains but we do know that individual stocks, and the market as a whole, are not 100% unpredictable. Cliff Asness provides some insight that the market is generally efficient, but not entirely so and that both market efficiency and human behavior move markets. In particular, it is possible that new information is not incorporated into the stock price instantaneously. One source of such information may be tweets. We ask whether tweets about companies and their stock tickers, contain information that is not yet calculated into the price of the stock.

Poster Link    Report Link    Presentation Link   

Brian Hazzard

Advisor: Prof. Chris Fernandes & Prof. Aaron Cass

Examining the Strength of the diff Command for Patching in a Command Line Interface

Frequently used in a command line interface, the diff command has been used for nearly 40 years as a file comparison tool. In this study, we test the effectiveness of the diff command line tool as a way to preview changes to different actions, or as a patch. We specifically look at the effectiveness in regards to selective undo: a manner of undoing previous actions in a non-linear way. Results found that users who were simply shown what the outcome of their patch would be were able to navigate the selective undo visualizations faster than than those who were shown diff statements.

Poster Link    Report Link    Presentation Link   

Andrew Ivarson

Advisor: Prof. Matthew Anderson & Prof. John Spinelli

Examining Self-Modifying Code

Self-modifying code is used for both good and bad. Software companies use it as a means of hiding a program's internals. This can allow them to prevent hackers from “cracking” the authentication on their software. Self-modifying code is also used in malware to avoid being detected by malware analysts and anti-virus software. It is created when programs write to their own instruction memory space. The power of self-modifying code comes from being difficult to analyze, because traditional analysis tools don’t account for changing instruction space. In this project, we implement an algorithm proposed by Anckaert et al.[1] to track self-modification. This algorithm can be used to generate a new type of control flow graph which represents a conservative estimate of all possible execution paths of a program. We implement this algorithm with the goal of identifying its strengths and weaknesses. We also produce visualizations of the resulting control flow graphs that show both positive and negative aspects of this algorithm. Using our visualizations, we have identified levels of sophistication the code in a program can possess for a useful visualization to be generated. We have defined sophistication in code to be related to the amount of jump and write instructions, and programs that have more complex sequences of interrelated jumps and writes yield far less useful graphs. On the other hand, programs that contain simple sequences of jumps and writes, or none at all, are easily visualized to the full understanding of the program’s internals.

Poster Link    Report Link    Presentation Link   

Nahian Jahangir

Advisor: Prof. Striegnitz & Prof. Ueno

Extraction of Example Sentences for Improved Reading and Understanding of Japanese Texts

Properly learning and understanding foreign languages can be a difficult task. This is true for native English speakers who try to improve their Japanese reading comprehension. The difficulties of Japanese list as follows, from having three different writing systems to being a heavily-context based language. In this paper, I focus on solving the reading comprehension problem through the use of example sentences. Using a target word and its source sentence, I try to create a program that returns a proper example sentence that retains the context and contains easier grammar and terms to understand. There are three different approaches to this: simple overlap, collocations overlap, and weighting words. I evaluate each of the approaches based on their precision, recall, and mean average precision. The collocations approach does the best amongst the three, with its own advantages and disadvantages. I then discuss what the next steps of the project can be in creating proper example sentences.

Poster Link    Report Link    Presentation Link   

Julian Jocque

Advisor: Prof. Rieffel

Opponent Modeling in Board Games Using the Estimation Exploration Algorithm

In this paper I will detail my senior thesis work done on modeling opponents in board games using the estimation exploration algorithm. The paper first explains background knowledge to explain concepts relevant to the project. From there I build an outline of a system that could be used to model opponents in konane using the estimation exploration algorithm. This system is implemented and then finally tested to get data. The results found from this system were quite promising, showing that in some cases this system is capable of producing the same moves as an opponent in over 80% of scenarios. Furthermore, it was found that for every opponent that I tested against, the resultant models were significantly better than an untrained random player. These results indicate that the estimation exploration algorithm can be used to effectively model an opponent in a board game, albeit with varying degrees of success.

Poster Link    Report Link    Presentation Link   

Joshua Kline

Advisor: Prof. Matthew Anderson & Prof. William Zwicker

Analysis of the PeerRank Method for Peer Grading

Peer grading can have many benefits in education, including a reduction in the time instructors spend grading and an opportunity for students to learn through their analysis of others’ work. However, when not handled properly, peer grading can be unreliable and may result in grades that are vastly different from those which a student truly deserves. Therefore, any peer grading system used in a classroom must consider the potential for graders to generate inaccurate grades. One such system is the PeerRank rule proposed by Toby Walsh, which uses an iterative, linear algebra based process reminiscent of the Google PageRank algorithm in order to produce grades by weighting the peer grades with the graders’ accuracies. However, this system has certain properties which make it less than ideal for peer grading in the classroom. We propose a modification of PeerRank that attempts to make the system more applicable in a classroom environment by incorporating the concept of “ground truth” to provide a basis for accuracy. We then perform a series of experiments to compare the accuracy of our method to that of PeerRank. We conclude that, in cases where a grader's accuracy in grading others is a reflection of their own grade, our method produces grades with a similar accuracy to PeerRank. However, in cases where a grader's accuracy and grade are unrelated, our method performs more accurately than PeerRank.

Poster Link    Report Link    Presentation Link   

S. Grant Lowe

Advisor: Prof. Nick Webb

Machine Learning Techniques for Music Prediction

The question of whether one can predict the success of a musical composition has been asked many times over the years. Researchers looking into hit song science, as it is called, have been able to predict whether a song will chart with varying degrees of success. However, not many have tried to look at how successful pop songs have changed over time and whether these changes can be tracked, to see if we are able to predict what popular songs in the future will be like. In my project, I will be looking at some of the attributes of popular music from the early 1900s to today, and seeing how those attributes change or become more important in determining a song’s release year and genre. Using the popular data mining and machine learning framework WEKA, I hope to be able to track what attributes of a song give insight into the genre that that song falls into and the decade in which the song was released.

Poster Link    Report Link    Presentation Link   

William Cody Menge

Advisor: Prof. Aaron Cass

Engaging Players by Dynamically Adjusting the Difficulty

Video game designers constantly explore ways to enhance a video game’s experience by keeping a player engaged. Dynamic Difficulty Adjustment (DDA) and other similar systems attempt to create a unique experience by adapting the game based on player behavior. The approach was focused on changing the rate at which the game dynamically adjusts it's difficulty level. The effectiveness of this study was determined by the player’s engagement level. By keeping the game at the optimal difficulty, based on the player’s performance, the game will keep the player in the Flow.

Poster Link    Report Link    Presentation Link   

Peter Miralles

Advisor: Prof. Fernandes

Effect of Display Size on a Player’s Situational Awareness in Real-Time Strategy Games

When it comes to computers and the display people use they typically want it to be as big as possible. This may not be the best thing to do when it comes to playing certain types of video games. For simple everyday use a bigger display is just fine, there is more screen real-estate to take advantage of allowing many windows to be open at once and they can be reorganized depending on what the user wants to focus on. This is not an option when it comes to video games specifically Real-Time Strategy (RTS) games. The purpose of this is to explore whether a bigger screen improves a player’s situational awareness when playing RTS games. This will be determined by a user study in which participants will be placed in two groups to play an RTS game. Their ability to retrieve information and how well they play will be measured.

Poster Link    Report Link    Presentation Link   

Liana Nunziato

Advisor: Prof. Nick Webb

Identifying Character Personas Using Natural Language Processing

Understanding the persona of a character can help readers better understand the storyline of a novel, which can be especially important when studying novels from different time periods. Generally, a reader generates a persona by looking at a character’s actions, dialog, behavior, opinions, and relationships. This project produced a program that generates character personas in a novel by displaying the most frequent verbs and adjectives that are in the same sentence as a character’s name. Using those words, the program also generates a positive or negative rating for each chapter, which is displayed in a graph of the entire book.

Poster Link    Report Link    Presentation Link   

Derek Ohanesian

Advisor: Prof. Chris Fernandes

Automated Evaluation of WordPress Theme Design

A major obstacle in publishing a website is developing a well-designed user interface for the content being published. The World Wide Web democratizes publishing by providing an open platform on which even amateur web publishers can communicate, but a welldesigned website remains less accessible to amateur publishers. Past research has focused on finding quantitative heuristics that could be used to provide useful design feedback to nonprofessional web designers. By analyzing the well-defined functions for displaying content in WordPress themes, as well as the CSS style sheets from those themes, a WordPress theme’s design can be heuristically evaluated. The result of this evaluation can provide the theme’s designers with feedback and suggestions for design improvement. For this research project, the following heuristics measured: color quantity, color contrast, balance, density of content, font size, and font quantity. Data was collected for over 200 WordPress theme designs and analyzed for these heuristics. The feedback that this automated system provides was then compared to a human heuristic evaluation, to assess the value of the feedback that the automated system provides.

Poster Link    Report Link    Presentation Link   

Computer Science Senior Design Projects from previous years:
2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003

All rights reserved - Union College, Schenectady, NY 12308
Last edit 11May2015
Address questions or comments to tay@union.edu