Retraining the VGG16 Neural Network for Meme Classification

16 May


The purpose of this project was to create and train a neural network model, based on the VGG16 model developed by the Visual Geometry Group at the University of Oxford, to recognize and label images according to the "meme" category they belong to. The two memes we chose were "Pacha" and "Doge," though the code could be adapted to classify any type of meme the user wishes. The project was accomplished by adapting code provided by Pleuni Pennings, Ilmi Yoon, and Ana Caballero H. All code was written in Python and executed in Google Colaboratory.

Image Preparation

To begin, we collected 80 sample images from a simple Google Images search, 40 of each meme type, saving them in .pdf format to Google Drive. The images were then resized and separated into training (n=40), validation (n=20), and testing (n=20) data sets by copying the appropriate subset of images into each folder. Once this image preparation step was complete, our data were structured appropriately for input into the VGG16 model.
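The split described above can be sketched with a small helper. This is a minimal illustration, not our actual preparation code: the filenames are placeholders standing in for the 80 collected images, and the actual project copied the files into Google Drive folders rather than returning lists.

```python
import random

def split_dataset(filenames, n_train=40, n_valid=20, n_test=20, seed=0):
    """Shuffle image filenames and partition them into train/validation/test lists."""
    assert len(filenames) == n_train + n_valid + n_test
    rng = random.Random(seed)        # fixed seed so the split is reproducible
    shuffled = list(filenames)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

# Placeholder filenames standing in for the 80 collected meme images.
names = ([f"pacha_{i:02d}.jpg" for i in range(40)]
         + [f"doge_{i:02d}.jpg" for i in range(40)])
train, valid, test = split_dataset(names)
```

Shuffling before splitting matters here: without it, the alphabetical ordering would put all of one meme class into the training set and the other into validation/testing.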

Creating the Model

Our image data sets were imported from Google Drive into the model notebook provided by Dr. Ilmi Yoon, and all images were plotted with their labels using the NumPy and Matplotlib pyplot libraries. We began with the pretrained VGG16 (Oxford) model, downloaded through Keras. This model was copied directly (excluding the final output layer) to create a new model for our purposes, in which all copied layers were set to be non-trainable. The final output layer of the neural network was then replaced with a new output layer specific to our meme classification task, containing two outputs corresponding to the two possible classes. For each image, the model outputs a probability distribution over these two classes, representing the probability that the image belongs to each class.
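The transfer-learning setup described above can be sketched in Keras as follows. This is an illustrative reconstruction under stated assumptions, not the project's exact notebook code: the optimizer and loss choices are assumptions, and `weights="imagenet"` triggers a sizable one-time download of the pretrained weights.

```python
from tensorflow import keras

# Download the pretrained VGG16 model (ImageNet weights, 1000-class output).
vgg16 = keras.applications.VGG16(weights="imagenet")

# Copy every layer except the final 1000-way output layer into a new model,
# freezing each copied layer so only the new output head will be trained.
model = keras.Sequential()
for layer in vgg16.layers[:-1]:
    layer.trainable = False
    model.add(layer)

# New two-way softmax output: a probability for each meme class per image.
model.add(keras.layers.Dense(2, activation="softmax"))

# Assumed training configuration (optimizer/loss not specified in the post).
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Freezing the copied layers means the convolutional features learned on ImageNet are reused as-is; only the weights feeding the new two-unit softmax layer are updated during training.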

Training the Model

The images within the training data set were passed individually through the model, the cost function was evaluated for each output, and the back-propagation algorithm was then applied to minimize this cost. This was done by adjusting the weights of the connections between the final hidden layer and the output layer, the only weights we had left trainable when creating the model. The process was repeated by cycling through the entire training data set ten times, i.e., for ten training epochs. To assess whether this number of epochs struck the right balance between specificity and generality, neither underfitting nor overfitting the model to the data, the cost on the training data set was compared to that on the validation data set.
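The underfitting/overfitting check described above amounts to comparing the two loss curves epoch by epoch. A minimal sketch of that comparison is shown below; the per-epoch loss values are made-up numbers for illustration, not results from our actual run.

```python
def pick_epoch_count(train_losses, val_losses):
    """Return the 1-based epoch with the lowest validation loss.

    Training loss keeps falling as the model fits the training set ever
    more closely, but once validation loss turns upward the model has
    begun to overfit; the minimum of the validation curve marks the
    best trade-off between underfitting and overfitting.
    """
    best = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best + 1

# Hypothetical per-epoch losses over ten epochs (illustrative only):
train_losses = [0.90, 0.55, 0.35, 0.22, 0.15, 0.10, 0.07, 0.05, 0.04, 0.03]
val_losses   = [0.95, 0.60, 0.42, 0.30, 0.24, 0.21, 0.20, 0.21, 0.23, 0.26]

best = pick_epoch_count(train_losses, val_losses)
```

In this made-up example the training loss decreases monotonically while the validation loss bottoms out at epoch 7, so training beyond that point would only memorize the training images.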

Testing the Model

Finally, our model was tested by running the "testing" data set, whose images were excluded from both the training and validation data sets, through the model. Prediction accuracy was calculated as the percentage of testing images assigned to the meme category they truly belong to. Plotting these results as a confusion matrix revealed that our model was able to distinguish between the two meme categories with 100% accuracy.
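The accuracy and confusion-matrix calculations can be sketched in plain Python. The label lists below are hypothetical stand-ins (assuming the 20 test images split evenly between the two memes, with every prediction correct, mirroring the 100% result); the project itself plotted the matrix in the Colab notebook.

```python
def confusion_matrix(y_true, y_pred, labels=("pacha", "doge")):
    """Count (true label, predicted label) pairs into a nested dict."""
    counts = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts

def accuracy(y_true, y_pred):
    """Fraction of test images whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical test-set labels: 10 of each meme, all classified correctly.
y_true = ["pacha"] * 10 + ["doge"] * 10
y_pred = list(y_true)

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(y_true, y_pred)
```

A perfect run puts all counts on the matrix diagonal; any off-diagonal entry would show exactly which meme was mistaken for the other.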

This project was completed as our final for the Exploratory Data Science for Scientists (EDSS) course, a component of the SFSU Graduate Opportunities to Learn Data Science (GOLD) program.

EDSS Team #1 – Joaquín Magaña, Michael Ward, Teagan Bullock, Austin Sanchez
