RIT CIS Internship: Machine Learning

Day 16: Building my own sentiment analysis program

July 27, 2018

I spent this morning creating my own sentiment analysis program using some example code. By going through the program line by line, I was able to determine, for the most part, what was going on. After that, Kushal walked me through some VQA code. VQA stands for visual question answering, and it combines natural language processing (NLP) and computer vision. Using VQA, you can have a model answer questions about images. I had to load in some .json files, which was slightly difficult, but I managed to figure it out. These files contained over 250,000 answers to questions about images that are also found in the dataset. Tomorrow I plan to try running VQA code on the questions without their images to see what kind of accuracy I can achieve.
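To make the "questions without their images" idea concrete, here is a minimal sketch of loading question and answer records from JSON and guessing an answer from the question text alone. The field names `question` and `answer` and the sample records are hypothetical (the real dataset files may be laid out differently), and the first-word baseline is a simple stand-in chosen for illustration, not the VQA code Kushal showed me.

```python
import json
from collections import Counter

# Hypothetical records; the real files' layout and field names may differ.
raw = """
[
  {"question": "What color is the sky?", "answer": "blue"},
  {"question": "Is the dog sleeping?", "answer": "yes"},
  {"question": "Is it raining?", "answer": "no"},
  {"question": "Is the man smiling?", "answer": "yes"}
]
"""

# For a file on disk, json.load(open(path)) would be used instead.
records = json.loads(raw)


def first_word(question):
    """Return the lowercased opening word of a question."""
    return question.lower().split()[0]


# Count which answers follow each opening word ("is", "what", ...).
answers_by_first_word = {}
for rec in records:
    key = first_word(rec["question"])
    answers_by_first_word.setdefault(key, Counter())[rec["answer"]] += 1


def predict(question):
    """Guess the most common answer seen for this question's opening word."""
    counts = answers_by_first_word.get(first_word(question))
    if counts is None:
        return None  # opening word never seen in the loaded records
    return counts.most_common(1)[0][0]


print(predict("Is the cat black?"))
print(predict("What color is the car?"))
```

A baseline like this never looks at an image, so whatever accuracy it reaches comes purely from patterns in how questions are worded.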

Day 15: Continuing sentiment analysis

July 26, 2018

Today I continued to look into how a sentiment analysis algorithm works. I found that the algorithm used a Gated Recurrent Unit (GRU) to classify the data it was given. A GRU has an update gate and a reset gate. The reset gate chooses how much of the old data to forget, and the update gate decides how much of the new data to add in. This is important in sentiment analysis because a sentence must be analyzed as a whole, not piece by piece: it is hard to tell whether a sentence is positive or negative if you forget the previous words you fed in. By the end of the day I had gained a decent understanding of the VQA basics.
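The two gates can be sketched in a few lines. This is a simplified single-number version (one input value, one hidden value) with made-up parameter names, not the code I was reading. Libraries differ on which side of the blend the update gate weights, so the convention below is an assumption.

```python
import math


def sigmoid(v):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One GRU step for a scalar input x and scalar hidden state h_prev.

    p is a dict of hypothetical scalar parameters (weights and biases).
    """
    # Update gate: how much of the new candidate to mix in.
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])
    # Reset gate: how much of the old state the candidate may look at.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])
    # Candidate state, built from the input and the reset-scaled old state.
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # Blend old state and candidate (assumed convention: z weights the candidate).
    return (1.0 - z) * h_prev + z * h_cand


# Illustrative parameters: a very negative update bias keeps the gate closed.
params = {
    "w_z": 0.0, "u_z": 0.0, "b_z": -100.0,
    "w_r": 0.0, "u_r": 0.0, "b_r": 0.0,
    "w_h": 0.0, "u_h": 0.0, "b_h": 0.0,
}
h = gru_step(x=1.0, h_prev=0.5, p=params)
print(h)  # stays very close to 0.5 because the update gate is nearly shut
```

With the update gate nearly closed, the old state passes through almost unchanged, which is how a GRU can carry information about earlier words across a long sentence.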

Day 14: Sentiment Analysis

July 25, 2018

Sentiment analysis is determining the emotion of the writer using only a string of characters. For humans this is very easy, but it is a daunting task for a computer. One thing a computer does to deconstruct a sentence is turn it into a vector. Based on weights given to each specific word, a sentence can be represented by a list of numbers. These numbers can more easily be analyzed by a computer. Surprisingly, interpreting sentences as vectors is an effective tool for computers. Today I looked over code that implemented a VQA algorithm that performed sentiment analysis. The program looked at a dataset of tweets that were labeled as positive or negative. After training on the data, the program was able to identify new sentences as positive or negative with surprisingly good accuracy. In the future I plan to implement my own VQA algorithm.
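As a rough illustration of turning a sentence into a list of numbers, here is a minimal word-count sketch. The tiny vocabulary is made up for the example, and plain counts stand in for the learned word weights a real program would use.

```python
def sentence_to_vector(sentence, vocabulary):
    """Count how often each vocabulary word appears in the sentence."""
    words = sentence.lower().split()
    return [words.count(word) for word in vocabulary]


# Hypothetical vocabulary; a real one would be built from the training tweets.
vocabulary = ["i", "love", "hate", "this", "movie"]

vec = sentence_to_vector("I love this movie", vocabulary)
print(vec)  # [1, 1, 0, 1, 1]
```

Each position in the list corresponds to one vocabulary word, so two sentences can be compared number by number even though they started out as text.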

Day 13: Continuing VQA

July 24, 2018

Today I continued what I had started yesterday. I ran into some trouble installing spaCy, a natural language processing library. Natural language processing is a term used to describe computer analysis of human languages. After finally figuring out how to install all of the packages I needed, I was ready to start experimenting with some VQA code. Using some sample code from an online blog, I was able to explore how VQA uses recurrent neural networks (RNNs) to analyze sentences. The rest of my day was spent going line by line through the code and googling anything I did not understand. I gained a very basic understanding of how an RNN is able to retain data it has already processed and why this is useful in VQA.

Day 12: Learning VQA basics

July 23, 2018

I spent most of today learning about VQA and sentiment evaluation. VQA stands for Visual Question Answering, and in practice it can provide simple answers based on an image. A VQA algorithm uses an RNN to evaluate the data it is given. An RNN is a recurrent neural network, which feeds the output it produced at the previous step back in as input to the next step. This allows a sentence to be represented as a one-dimensional vector. Once the sentence is changed into a vector, it can be analyzed using sentiment evaluation. Sentiment evaluation takes the most important parts of the sentence and tries to determine whether the sentence has a negative or positive view. Source: http://visualqa.org
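Here is a minimal sketch of that recurrence, assuming a plain tanh RNN with small random weights and a made-up four-word vocabulary. None of these names or sizes come from the code I studied. The point is only that each step reuses the previous step's output, so the final vector depends on the whole sentence.

```python
import math
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible

HIDDEN = 3  # size of the hidden vector (arbitrary choice for the sketch)
EMBED = 2   # size of each word vector (arbitrary choice for the sketch)

# Hypothetical vocabulary with a random vector per word.
vocab = ["this", "is", "not", "good"]
embed = {w: [random.uniform(-0.5, 0.5) for _ in range(EMBED)] for w in vocab}

# Randomly initialised weights: input-to-hidden and hidden-to-hidden.
W_x = [[random.uniform(-0.5, 0.5) for _ in range(EMBED)] for _ in range(HIDDEN)]
W_h = [[random.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(HIDDEN)]


def rnn_encode(words):
    """Fold a list of words into one fixed-size hidden vector."""
    h = [0.0] * HIDDEN
    for word in words:
        x = embed[word]
        # The new state mixes the current word with the previous state.
        h = [
            math.tanh(
                sum(W_x[i][j] * x[j] for j in range(EMBED))
                + sum(W_h[i][k] * h[k] for k in range(HIDDEN))
            )
            for i in range(HIDDEN)
        ]
    return h


sentence_vector = rnn_encode(["this", "is", "not", "good"])
print(sentence_vector)
```

However long the sentence is, the result is always a list of `HIDDEN` numbers, which is what lets a later step classify it.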