Convolutional Neural Networks with TensorFlow

Tanmay Choudhary

In this blog I am going to talk about Convolutional Neural Networks, or CNNs for short. CNNs are especially useful for tasks like image processing. We will start by understanding how a CNN works.

An image can be thought of as a large matrix. In a black-and-white (grayscale) image, every pixel has a single value; let us say 1 is white and 0 is black. So we have an image like this:

[Image: example grayscale pixel matrix]
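To make this concrete (this small example is my own illustration, not from the original dataset), a tiny grayscale image can be written as a NumPy matrix where 1 is white and 0 is black:

import numpy as np

# A tiny 5x5 grayscale "image": 1.0 is white, 0.0 is black.
# This illustrative example is a white vertical line on a black background.
image = np.array([
    [0., 0., 1., 0., 0.],
    [0., 0., 1., 0., 0.],
    [0., 0., 1., 0., 0.],
    [0., 0., 1., 0., 0.],
    [0., 0., 1., 0., 0.],
])
print(image.shape)  # (5, 5) -> height x width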

A CNN works by identifying features in an image, such as straight lines, diagonal lines, and curves. It identifies these features using filters. A filter is a small matrix that makes a particular feature of the image more visible. For example, here is a filter:

[Image: a 3×3 filter]

This filter may be used to identify straight edges in an image.

The filter is placed over the first 3×3 patch of the image. The numbers in the filter are then multiplied element-wise with the pixels underneath them, and all the products are summed to produce one pixel of the new image; the filter then slides along and the process repeats. This shrinks the image. Sometimes that is not desirable, so we add padding to the image: a black border that enlarges the original image so that, after convolution, the output is the same size as the input.
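To see the mechanics of this sliding multiply-and-sum, here is a minimal NumPy sketch of a plain convolution with stride 1 and no padding. The kernel values are just an illustrative vertical-edge filter of my own choosing, not the exact filters the model below will learn:

import numpy as np

# An illustrative 3x3 filter that responds strongly to vertical edges.
kernel = np.array([
    [-1., 0., 1.],
    [-1., 0., 1.],
    [-1., 0., 1.],
])

def convolve2d(image, kernel):
    # Slide the kernel over the image (no padding, stride 1).
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # Multiply element-wise, then sum to get one output pixel.
            out[i, j] = np.sum(patch * kernel)
    return out

# Using the 5x5 example image from above: the output is only 3x3,
# which is why padding is needed if we want to keep the original size.
print(convolve2d(image, kernel))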

Let's see how this can be used to train a neural network. I am going to build a cats-vs-dogs model, using the data from this Kaggle competition. After extracting the zip you will have a train folder with all the cat and dog images named in this format: “cat.index_number.jpg”.

Let us first see all those images:

import numpy as np
import cv2
import matplotlib.pyplot as plt
from matplotlib.image import imread

folder = 'train/'
# Plot the first nine cat images in a 3x3 grid.
for i in range(9):
    plt.subplot(330 + 1 + i)
    filename = folder + 'cat.' + str(i) + '.jpg'
    image = imread(filename)
    plt.imshow(image)
plt.show()

As we can see, the images are all of different sizes. This will not work, so we will resize them and sort them into separate folders using Keras's preprocessing utilities and the os module. Let's start by sorting all the images into folders.

import os
import shutil

# Create one folder per class.
os.mkdir('data')
os.mkdir('data/cats')
os.mkdir('data/dogs')

# Copy each image into its class folder based on its filename prefix.
for file in os.listdir('train/'):
    if file.startswith('cat'):
        shutil.copyfile('train/{}'.format(file),
                        'data/cats/{}'.format(file))
    if file.startswith('dog'):
        shutil.copyfile('train/{}'.format(file),
                        'data/dogs/{}'.format(file))

After this, all the cat images should have been copied into the cats folder and the dog images into the dogs folder. This might take a little while.

Next we can use TensorFlow to resize and transform the images. Before we do that, though, we need to define our model structure.

import tensorflow as tf
import tensorflow.keras as keras

model = keras.Sequential([
    # 64 filters of size 3x3; 'same' padding keeps the spatial size at 200x200.
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu',
                           padding='same', input_shape=(200, 200, 3)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    # Flatten the 2D feature maps into one long vector for the dense layers.
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=128, activation='relu'),
    tf.keras.layers.Dense(units=1, activation='sigmoid'),
])

Here, the first layer applies 64 filters, each of size 3×3, with the activation set to relu. We also set the padding to “same”, which adds the border we talked about so the output stays the same size as the input. The input shape, which will be the shape of the images after resizing, is 200×200; the extra 3 means there is one matrix each for the red, green, and blue components of the image (since we are using colour images). Flatten just spreads the square feature maps out into one long line, like the nodes of a DNN. This is done so that we can transition from the CNN structure to a DNN structure.
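If you want to see how the shapes change from layer to layer (and, in particular, how Flatten turns the 2D feature maps into one long vector), you can print a summary of the model defined above:

model.summary()
# Conv2D keeps the 200x200 size thanks to 'same' padding, MaxPooling2D halves
# it to 100x100, and Flatten produces a vector of 100*100*64 = 640,000 values.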

After defining the model structure we need a pipeline to move the data from the folders into model training. We do this using the .flow_from_directory function. Using ImageDataGenerator, we also rescale the pixel values of the images so that they lie between 0 and 1. This is because, as I said in my previous blog, neural networks work best with numbers between 0 and 1.

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1/255.0)
train = datagen.flow_from_directory('data/', class_mode='binary',
                                    batch_size=64, target_size=(200, 200))

The output should show: Found 25000 images belonging to 2 classes.

After this we can start the training by compiling our model and then using model.fit():

opt = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
model.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(train, epochs=20, verbose=1)

You can then wait for the training to finish. If you are not happy with the accuracy afterwards, you can tweak the values you set (filter counts, layer sizes, learning rate, epochs) and train again.

[Image: the training of the model]

Ideally, you should split your dataset into training and testing data in an 80–20 or perhaps a 70–30 ratio. You can then use the same code to create a validation data generator and pass it to model.fit as the validation data. This is common practice, but since this model is just for demonstration purposes I haven't shown it here.
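For reference, ImageDataGenerator can carve out a validation subset for you. A minimal sketch of that, assuming the same 'data/' folder layout as above and an 80–20 split:

datagen = ImageDataGenerator(rescale=1/255.0, validation_split=0.2)
train = datagen.flow_from_directory('data/', class_mode='binary', batch_size=64,
                                    target_size=(200, 200), subset='training')
val = datagen.flow_from_directory('data/', class_mode='binary', batch_size=64,
                                  target_size=(200, 200), subset='validation')
history = model.fit(train, validation_data=val, epochs=20, verbose=1)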

After that you can load an image from the web or from your folder using imread('filename.jpg'), resize it to 200×200 with a function like OpenCV's cv2.resize, and feed it to model.predict to see your prediction (a sketch of this follows the plot below). Since we also saved the training history, we can plot it with this code:

# Plot the training accuracy recorded for each epoch.
acc = history.history['accuracy']
plt.plot(acc)
plt.show()
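For the prediction step described above, here is a rough sketch, assuming a hypothetical file 'my_cat.jpg' in the working directory:

img = cv2.imread('my_cat.jpg')                     # hypothetical file name
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)         # OpenCV loads BGR; training images were RGB
img = cv2.resize(img, (200, 200)) / 255.0          # match the training size and rescaling
pred = model.predict(np.expand_dims(img, axis=0))  # add a batch dimension
print(pred)  # near 0 -> cats, near 1 -> dogs (classes are ordered alphabetically)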

So that’s it for this blog. I hope you enjoyed it and learned the theory behind Convolutional Neural Networks and how to use them with TensorFlow. If you liked it, follow me on Medium and share the article!

Thanks for reading!
