Regression, DeepNN & ConvNN on TensorFlow Keras Basic Overview

overview Dec 18, 2020

Recently I have been learning TensorFlow and Keras. These are some of the basic models that you can implement without a lot of ML knowledge and still get amazing results.

I will try to expand and update this post with more entry-level material. I gather information from all sorts of sites, but shoutout to:

Tw: @ramgendeploy

Basic Regression Model:

First, a basic regression model with a single neuron (one weight and one bias). It models a simple linear function:
y = ax+b

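import numpy as np
from tensorflow import keras

# A single Dense unit learns y = ax + b: one weight (a) and one bias (b)
model = keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss='mean_squared_error')

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

# Training
model.fit(xs, ys, epochs=50)
# The data follows y = 2x - 1, so a fully trained model would predict about 19 here
print(model.predict(np.array([[10.0]])))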

This model illustrates what the basic setup of a neural network program looks like.

First comes the declaration of the model using Keras's Sequential class, which takes a list of layers, in this case a single Dense layer with one neuron. A Dense layer is fully connected: every neuron receives every input, which is where the name comes from once there is more than one neuron. We compile the model with stochastic gradient descent as the optimizer and mean squared error as the loss.

Then there is the data: the xs are the inputs and the ys are the targets. This is supervised learning, where each training example comes with the result expected for it.

Then we fit the model to the data, which is also called training the model.
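
To make the training step less of a black box, here is a minimal NumPy sketch of what one Dense unit trained with gradient descent on mean squared error is doing. This is not how Keras implements it internally. The learning rate, the number of steps and the zero starting values are assumptions picked so the toy example converges quickly.

```python
import numpy as np

# Same data as above: the points lie on y = 2x - 1
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

a, b = 0.0, 0.0          # weight and bias (assumed to start at zero)
learning_rate = 0.05     # assumed value, not the Keras default

for step in range(500):
    error = (a * xs + b) - ys
    # Gradients of mean((a*x + b - y)^2) with respect to a and b
    grad_a = 2.0 * np.mean(error * xs)
    grad_b = 2.0 * np.mean(error)
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b

prediction = a * 10.0 + b
print(a, b, prediction)  # a approaches 2, b approaches -1, prediction approaches 19
```

The Keras version above takes far fewer update steps with a smaller learning rate, so its prediction for 10 will usually land further from 19 than this sketch does.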

Deep Neural Network (DNN)

Now a basic DNN, nothing fancy, plus an example of how to add a callback that runs at the end of every epoch.

import tensorflow as tf
print(tf.__version__)

class StopTraining(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # 'accuracy' is only in logs because metrics=['accuracy'] is passed to compile() below
    accuracy = (logs or {}).get('accuracy')
    if accuracy is not None and accuracy > 0.95:
      print("\nReached 95% accuracy, stopping training!")
      self.model.stop_training = True
callbacks = StopTraining()

mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

# Scale pixel values from 0-255 down to 0-1
training_images = training_images/255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=5, callbacks=[callbacks])

We load the Fashion MNIST dataset, retrieving the training images and the training labels. Each label is a number from 0 to 9 that corresponds to an entry in the list of class names.
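
The dataset only ships the integer labels, so the list of names has to be written out by hand. A small sketch of the lookup, where the label value 9 is just a made-up example:

```python
# Class names for Fashion MNIST, indexed by label
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

label = 9  # hypothetical label value, as it would appear in training_labels
print(class_names[label])
```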

Then we rescale the input data. Pixel values range from 0 to 255, so dividing by 255 puts them on a 0-1 scale. Each input is a 28x28 image.

The first layer of the model is a Flatten layer, which flattens each image into a single vector, ending up with 784 values for a 28x28 input. We can set the input shape to 28x28 on this layer, but this is optional. Then we have our only hidden layer, a Dense layer with 512 units, meaning it is fully connected, with ReLU as its activation function. Finally there is our output layer of 10 units.
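
As a quick sanity check on those sizes, here is a sketch in plain NumPy (no Keras involved) of what flattening does to one image and how many trainable parameters the two Dense layers end up with. The helper name dense_params is made up for this example.

```python
import numpy as np

image = np.zeros((28, 28))       # stand-in for one grayscale image
flat = image.reshape(-1)         # what Flatten does to each example
print(flat.shape)                # (784,)

def dense_params(inputs, units):
    # One weight per input-unit pair, plus one bias per unit
    return inputs * units + units

hidden = dense_params(784, 512)  # 401920
output = dense_params(512, 10)   # 5130
print(hidden, output, hidden + output)
```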

The output layer uses the softmax activation, which gives us a probability distribution over the categories, so the highest value marks the most likely category for the input image.
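
For reference, softmax itself is only a few lines. This is a NumPy sketch with made-up scores for three classes, not the Keras implementation:

```python
import numpy as np

def softmax(logits):
    # Subtracting the max keeps exp() from overflowing; the result is unchanged
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

logits = np.array([1.0, 3.0, 0.5])   # made-up raw scores for three classes
probs = softmax(logits)
print(probs, probs.sum(), probs.argmax())
```

The values sum to 1, and argmax picks the index of the most likely class, which here is index 1.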

We train the model using Adam and sparse categorical cross-entropy. This loss is used when there are more than two classes and each label is a single integer rather than a one-hot vector. We use a callback that executes at the end of every epoch. This is useful, for example, to stop training at x% accuracy, or to stop when the training accuracy keeps going up but the validation accuracy is going down, which means we are likely overfitting.
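
The "sparse" part only changes how the label is passed in. A small NumPy sketch with made-up numbers shows that the integer-label form and the one-hot form give the same loss for a single example:

```python
import numpy as np

probs = np.array([0.1, 0.7, 0.2])   # made-up softmax output for three classes
label = 1                           # integer label, as the sparse loss expects

# Sparse form: just pick the predicted probability of the true class
sparse_loss = -np.log(probs[label])

# One-hot form: same value, written as a sum over all classes
one_hot = np.zeros_like(probs)
one_hot[label] = 1.0
categorical_loss = -np.sum(one_hot * np.log(probs))

print(sparse_loss, categorical_loss)
```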

To evaluate the model on the test set and get a prediction, we do this:

# Evaluating the test set
test_images = test_images/255.0
model.evaluate(test_images, test_labels)

# Predict one image (a batch of one 28x28 image, matching the training input shape)
class_image = model.predict(test_images[0].reshape(1,28,28))
print(class_image[0])

Convolutional Neural Network (CNN)

Now let's see a basic CNN with max pooling. Let's use the CIFAR-10 dataset, which has 60k images: 50k to train and 10k for the test set. This is like a baby VGG model.

import tensorflow as tf

tf.keras.backend.image_data_format()
#> channels_last

# Data
cifar10 = tf.keras.datasets.cifar10
(train_imgs, train_labels), (test_images, test_labels) = cifar10.load_data()

train_imgs=train_imgs/255.0
test_images=test_images/255.0

# Model declaration
model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(128, (3,3), activation='relu', input_shape=(32, 32, 3)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(512, (3,3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2,2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Checking the model
model.summary()

# Training
class CustomCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    print(logs)
callbacks = [
             tf.keras.callbacks.EarlyStopping(monitor='accuracy', patience=2), # Stops if the accuracy has not improved for 2 epochs
             tf.keras.callbacks.ModelCheckpoint(filepath='model.{epoch:02d}-{accuracy:.2f}.h5'), # Saves the model after each epoch
             CustomCallback()
]

model.fit(train_imgs, train_labels, epochs=20, callbacks=callbacks)
# Test set evaluation: evaluate() returns [loss, accuracy] because of metrics=['accuracy']
test_loss, test_accuracy = model.evaluate(test_images, test_labels)

class_image = model.predict(test_images[0].reshape(1,32,32,3))
print(class_image[0].argmax())

We use CIFAR-10 because it's relatively small and has color images. We are going to use two blocks of Conv+MaxPool. The convolutions have 128 filters in the first block and 512 in the second. A filter is a small matrix of the kernel size that encodes a feature of the image, for example horizontal lines (the feature becomes visible when the filter is applied). Both convolutions have a 3x3 kernel size (the size of the filter) and the default stride of 1 (how far the filter moves at each step). The MaxPool layer downsamples its input by taking the maximum value in each window of the specified size.
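
To see a filter and a pooling step in action, here is a NumPy sketch on a tiny made-up 6x6 image. The kernel is a hand-written horizontal edge detector, whereas a trained network learns its own filter values. The helper names conv2d_valid and max_pool_2x2 are made up for this example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over the image with stride 1 and no padding
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    # Keep the maximum of each non-overlapping 2x2 window, dropping any odd edge
    h, w = feature_map.shape[0] // 2, feature_map.shape[1] // 2
    trimmed = feature_map[:h * 2, :w * 2]
    return trimmed.reshape(h, 2, w, 2).max(axis=(1, 3))

image = np.zeros((6, 6))
image[3:, :] = 1.0                      # dark top half, bright bottom half

kernel = np.array([[-1.0, -1.0, -1.0],  # responds where brightness increases downwards
                   [ 0.0,  0.0,  0.0],
                   [ 1.0,  1.0,  1.0]])

features = conv2d_valid(image, kernel)  # shape (4, 4), non-zero only around the edge
pooled = max_pool_2x2(features)         # shape (2, 2)
print(features)
print(pooled)
```

The feature map is zero wherever the image is flat and lights up only on the rows that straddle the dark-to-bright boundary. Pooling then halves each dimension while keeping the strongest response.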

Then comes a Flatten layer followed by a fully connected layer with 128 units, and a final output layer of 10 units for the 10 classes.

It's important to specify the input size of the model for training, in this case 32x32x3.
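
Knowing the input size lets you work out every later shape by hand. The following is a sketch of that arithmetic in plain Python, assuming the Conv2D and MaxPooling2D defaults ('valid' padding, stride 1 for the convolutions). The numbers should line up with what model.summary() prints. The helper names conv_out and pool_out are made up for this example.

```python
def conv_out(size, kernel=3, stride=1):
    # 'valid' padding: no zeros are added around the border
    return (size - kernel) // stride + 1

def pool_out(size, window=2):
    # A 2x2 pooling window floors odd sizes
    return size // window

size = 32
size = conv_out(size)   # 30
size = pool_out(size)   # 15
size = conv_out(size)   # 13
size = pool_out(size)   # 6

flattened = size * size * 512                # values going into the Dense layer
conv1_params = 3 * 3 * 3 * 128 + 128         # kernel h * w * input channels * filters + biases
conv2_params = 3 * 3 * 128 * 512 + 512
dense1_params = flattened * 128 + 128
dense2_params = 128 * 10 + 10
total = conv1_params + conv2_params + dense1_params + dense2_params
print(size, flattened, total)
```

Most of the parameters sit in the first Dense layer, not in the convolutions.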

One cool thing to do with this basic model is to use it for a feature loss, meaning you train it on a relatively large dataset and then use the model for style transfer or image super-resolution.

Lastly we train the model, this time using built-in callbacks. EarlyStopping stops the training when no improvement has been made for a couple of epochs, and ModelCheckpoint saves the model after each epoch. We can also mix in our own callbacks.
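
The patience idea behind EarlyStopping can be sketched in a few lines of plain Python. This is a simplified illustration for a metric where higher is better, not the Keras source. The function name stopping_epoch and the accuracy values are made up.

```python
def stopping_epoch(history, patience=2):
    # Returns the index of the epoch at which training would stop, or None if it never does
    best = float('-inf')
    wait = 0
    for epoch, value in enumerate(history):
        if value > best:
            best = value
            wait = 0          # improvement: reset the counter
        else:
            wait += 1         # no improvement this epoch
            if wait >= patience:
                return epoch
    return None

accuracies = [0.41, 0.52, 0.58, 0.57, 0.58, 0.60]  # made-up per-epoch accuracies
print(stopping_epoch(accuracies))
```

With a patience of 2, training stops at epoch index 4. Matching the previous best of 0.58 does not count as an improvement, so the last epoch is never reached.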

We can then evaluate on the test set. I got 60% accuracy after training for 20 epochs.

Got any questions? Did I get anything wrong? Let's chat!
Tw: @ramgendeploy

Ramiro

Hey hi! My name is Ramiro, this is my blog about artificial intelligence, coding and things that are in my head. Follow me on twitter @ramgendeploy so we can chat!