Tuesday, April 13, 2021

Understand Responsible AI


Artificial Intelligence is a powerful tool that can be used to greatly benefit the world. However, like any tool, it must be used responsibly.

Fairness

AI systems should treat all people fairly. For example, suppose you create a machine learning model to support a loan approval application for a bank. The model should make predictions of whether or not the loan should be approved without incorporating any bias based on gender, ethnicity, or other factors that might result in an unfair advantage or disadvantage to specific groups of applicants.

Machine learning tooling includes capabilities to interpret models and quantify the extent to which each feature of the data influences a model's predictions. This helps data scientists and developers identify and mitigate bias in their models, as the sketch below illustrates.
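As an illustration, here is a minimal sketch (not tied to any particular platform, and using hypothetical feature names and synthetic data) of quantifying feature influence with scikit-learn's permutation_importance:

from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
import numpy as np

# Hypothetical loan data: columns stand for income, credit_score, gender_encoded
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # approvals depend only on income and credit score

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure the drop in accuracy it causes
result = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=0)
for name, score in zip(["income", "credit_score", "gender_encoded"], result.importances_mean):
    print(name, round(score, 3))  # a high score for gender_encoded would be a red flag

In this synthetic setup the sensitive feature should come out with an importance near zero; in a real audit, a large value would prompt a closer look at the model and the data.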

Reliability and safety

AI systems should perform reliably and safely. For example, consider an AI-based software system for an autonomous vehicle, or a machine learning model that diagnoses patient symptoms and recommends prescriptions. Unreliability in these kinds of systems can result in substantial risk to human life.

AI-based software applications must be subjected to rigorous testing and deployment management processes to ensure that they work as expected before release.

Privacy and security

AI systems should be secure and respect privacy. The machine learning models on which AI systems are based rely on large volumes of data, which may contain personal details that must be kept private. Even after the models are trained and the system is in production, it uses new data to make predictions or take action that may be subject to privacy or security concerns.

Inclusiveness

AI systems should empower everyone and engage people. AI should bring benefits to all parts of society, regardless of physical ability, gender, sexual orientation, ethnicity, or other factors.

Transparency

AI systems should be understandable. Users should be made fully aware of the purpose of the system, how it works, and what limitations may be expected.

Accountability

People should be accountable for AI systems. Designers and developers of AI-based solutions should work within a framework of governance and organizational principles that ensures the solution meets clearly defined ethical and legal standards.

Saturday, April 3, 2021

Digits Classification using Deep Learning

Hello everyone! These days we hear a lot about machine learning and deep learning, and many of us have a rough idea of what they are. In this article, I show how to implement a simple neural network for image classification. Neural networks are very good at extracting features from unstructured data such as images and audio, which makes them well suited to classifying images or detecting objects within them. Here we take a dataset of digit images and build an artificial neural network (ANN) to predict which digit each image contains, checking the predictions against the labeled data. Most deep learning problems are supervised, meaning that for the given data we know exactly what the output should be.

An image is represented as an m x n matrix of pixels. Here we consider 64-pixel images, i.e. 8 x 8 pixels each, in grayscale (black and white).
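As a quick check of both representations (a minimal sketch, assuming scikit-learn is installed), you can look at the same digit as an 8 x 8 pixel matrix and as the flattened 64-value vector the network will consume:

from sklearn import datasets

digits = datasets.load_digits()
print(digits.images[0].shape)  # (8, 8)  -- the digit as a pixel matrix
print(digits.data[0].shape)    # (64,)   -- the same digit flattened to a vector
print(digits.images[0])        # pixel intensities ranging from 0 to 16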


Steps in implementing an ANN for digits classification:

1. Importing Libraries

2. Importing the Dataset

3. Splitting the Data into Train and Test Sets

4. Building the Model

5. Evaluation


Install the libraries if they are not already installed

To install the TensorFlow and Keras libraries, run the commands below in a Jupyter or Google Colab cell

!pip install tensorflow

!pip install keras


Importing Libraries

import tensorflow as tf

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import keras

from tensorflow.keras import models,layers

from sklearn import datasets

from sklearn.model_selection import train_test_split


Importing the dataset; let's have a look at the shape of the data

digits = datasets.load_digits()

X= digits.data

y= digits.target

print(X.shape, y.shape)

Output :

(1797, 64) (1797,)


Let's have a look at what the data looks like

plt.imshow(digits.images[3], cmap=plt.cm.gray_r, interpolation='nearest')

Output :

(A grayscale rendering of the 8 x 8 digit image at index 3.)
Splitting the data into train and test sets

X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=42,stratify=y)
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)

y_train = y_train.reshape(-1,)   # already 1-D here, so this is a harmless safety reshape
y_train.shape

Output :

(1437, 64) (360, 64) (1437,) (360,)


Building a basic ANN model for classification

ann = models.Sequential([
    layers.Flatten(input_shape=(64,)),       # each sample is a flat 64-value vector
    layers.Dense(3000, activation='relu'),
    layers.Dense(1000, activation='relu'),
    layers.Dense(10, activation='softmax')   # one probability per digit class 0-9
])
ann.compile(optimizer='SGD', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
ann.fit(X_train,y_train,epochs=5)

Output :

Epoch 1/5
45/45 [==============================] - 2s 16ms/step - loss: 2.3419 - accuracy: 0.5718
Epoch 2/5
45/45 [==============================] - 1s 16ms/step - loss: 0.1600 - accuracy: 0.9650
Epoch 3/5
45/45 [==============================] - 1s 15ms/step - loss: 0.1017 - accuracy: 0.9796
Epoch 4/5
45/45 [==============================] - 1s 16ms/step - loss: 0.0752 - accuracy: 0.9947
Epoch 5/5
45/45 [==============================] - 1s 14ms/step - loss: 0.0551 - accuracy: 0.9951

Evaluate the model :

ann.evaluate(X_test,y_test)

Output :

[0.08141227066516876, 0.980555534362793] --> loss, accuracy


Let's compare predictions on the test data with the actual target data

We have the actual labeled data in y_test. We now use our ANN model to predict on X_test, and call the result y_pred. If y_pred closely matches y_test, we can say our model is doing very well.

y_pred = ann.predict(X_test)   # one probability vector per test image

np.argmax(y_pred[3])           # the class with the highest probability for sample 3

y_classes = [np.argmax(element) for element in y_pred]

print(y_test[:10])

print(y_classes[:10])

Output :

[5, 2, 8, 1, 7, 2, 6, 2, 6, 5] --> y_test first 10 values
[5, 2, 8, 1, 7, 2, 6, 2, 6, 5] --> y_classes (predicted) first 10 values
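Beyond eyeballing the first ten values, we can compute the overall test accuracy (a quick sanity check, assuming scikit-learn is installed):

from sklearn.metrics import accuracy_score

print(accuracy_score(y_test, y_classes))  # should be close to the 0.98 reported by evaluate()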

Conclusion

We observed at the end that the y_test and y_pred values look similar, which means the ANN model we built is doing very well. We actually gained confidence in the model back at the evaluation stage: there we achieved a loss of 0.08, which is close to zero, and an accuracy of 0.98, i.e. 98%. This shows how well our model can predict on a new dataset.


That's all for this article. I hope that by the end of it you have a basic idea of how to implement an ANN on the digits dataset.

Thursday, December 17, 2020

The Evolution and Definition of AI

Artificial Intelligence is the science and engineering of making intelligent machines, as defined by computer scientist John McCarthy in 1956. But what makes a machine intelligent? And does what is considered intelligent in a machine change with new technology?

What is AI?

Is it the computer, or the code that defines it? AI is sometimes called Machine Intelligence, referring to the fact that the capacity for reasoning and understanding is bound by the limits of a machine and its contents. Natural Intelligence is what is exhibited by humans, bound by the limits of our cognitive functions. AI in its current state could be said to be computers or programs that attempt to mimic human intelligence through displays of reasoning, learning, analyzing, and problem-solving.

Applied Artificial Intelligence is what we use to study datasets, solve problems, test probability, and perform tasks. The essential strength of AI over human effort is that AI can perform calculations at a far larger scale, with greater accuracy, and in a far shorter amount of time than humans can. The role of humans is to assign numerical meaning to the data or problems at hand so that AI programs can compute this information and render an answer that has relevance, value, and accuracy in relation to the boundaries of real life.

Wednesday, July 15, 2020

Magic of Vision



VISION

Have we ever thought about how we recognize the things we see every day? Have you ever wondered what is happening inside our human senses? Have you ever seen anything when you close your eyes with a peaceful mind, or while concentrating on a particular act?


If you have convincing answers for yourself, then you are blessed with the magic of vision. If not, I would like to share the experiences I have had so far. For a few days I tried simply closing my eyes and visualizing. On the very first day it was difficult: a lot was going on, a few relevant thoughts, a few relevant images, and many others I don't know where they came from. As I kept practicing, things improved and I gained control over the flow of images and thoughts, but interestingly I found that the colors changed in different scenarios. If you don't believe it, try it!

Eventually, for different emotions we get different colors when we close our eyes. There is some communication happening between the brain and vision that leads us to a conclusion. Sometimes you feel relaxed, sometimes you feel more irritated; it depends on how the thoughts are being processed in the brain. It is one of the magical things about human creation.

Then I think about how this relates to computer vision and image processing, which most of us use in our daily lives. How can technology reach a conclusion, as humans do, and decide that something belongs to a particular category? I drew a parallel with how colors change for different emotions in our eyes. In computer vision we have different kernels for extracting different pieces of information. Just as the brain combines different emotions to reach a conclusion, computer vision combines the information from different kernels to reach one.
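To make the kernel idea concrete, here is a minimal sketch (assuming NumPy and SciPy are installed) that slides a simple 3 x 3 edge-detection kernel over a tiny image; different kernels would pull out different kinds of information from the same pixels:

import numpy as np
from scipy.signal import convolve2d

# A tiny 5x5 "image": a bright square on a dark background
image = np.zeros((5, 5))
image[1:4, 1:4] = 1.0

# A Laplacian-style kernel that responds strongly at edges
edge_kernel = np.array([[ 0, -1,  0],
                        [-1,  4, -1],
                        [ 0, -1,  0]])

edges = convolve2d(image, edge_kernel, mode='same')
print(edges)  # large values mark where the square's edges are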

In the next article I will go into detail about kernels and the information they carry!

Monday, March 9, 2020

Expert Systems & 6 jars: Say Hi to ML(Machine Learning)

We are hearing a lot about ML and AI; the buzz is all around us. Many of us have questions: What is AI? What is ML? How do I deal with it? Where do I start if I would like to learn about ML? Yes, you are in the right place to get some idea about these questions.

What is AI? --- It is inspired by human science, our brain. Just think about what our brain is capable of and you get the answer. The brain's ability to take decisions is called intelligence. In simple terms, AI is our attempt to create machines that have the ability to take decisions.

Then the question arises: how is this possible?
Again, just analyze our human brain: initially we try to understand, and later we train our brain to take decisions. We try to implement the same in machines through techniques called Machine Learning and Deep Learning.

In this article we focus on ML. What is ML? In simple terms, a machine becomes capable of understanding data and applying it to a task whose performance we can check. If we look at ML broadly, it can be divided into 6 jars.




Wednesday, February 26, 2020

Nature Has a Solution for Improving the Speed of Computation in ML & AI #BabyBrain #ML #AI



When I asked myself how I could improve the speed of computation, I started thinking about the different ways we might do it. When we look at nature, we sometimes find approaches and even solutions.

I shifted my thinking from technical matters to what exists in nature. All of a sudden I realized that yes, it is possible to make our computation much faster than it currently is. It's quantum theory.

I related this to human science; many answers can be found by analyzing our own bodies. Since we are dealing with AI, our focus is entirely on the brain. We all know that the learning rate of small kids is much higher than that of young and old adults. Just think of newborn babies and the state of their brains. Medical experts say it takes 8 years for the brain to fully develop and function. But there is something we often call the subconscious brain. This part of the brain is very active when a baby is in the womb: a baby can listen to and understand the words spoken by its mother and father, and remembers what it heard inside the womb. If we train a baby who can use this conscious mind, he can do wonders. If you don't believe it, there is something for you: it is explained in the Mahabharata.

We all know the part of the Mahabharata where Arjuna explains to Abhimanyu the art of breaking the chakravyuha while he is still in Subhadra's womb. Later, Lord Krishna trains him in Dwaraka in such a way that he can use his conscious mind. The rest we all know: Abhimanyu and his bravery at a very young age.

The reason for telling this story is that the learning rate of newborn babies is much higher than that of any other age group. If we could capture a mathematical representation of that learning rate in ML and AI, then surely computation would improve greatly compared to today's advanced methods.


Wednesday, February 19, 2020

Learn Image Classification on 3 Datasets using Convolutional Neural Networks (CNN)

Convolutional neural networks (CNN) – the concept behind recent breakthroughs and developments in deep learning.
CNNs have broken the mold and ascended the throne to become the state-of-the-art computer vision technique. Among the different types of neural networks (others include recurrent neural networks (RNN), long short-term memory networks (LSTM), artificial neural networks (ANN), etc.), CNNs are easily the most popular.
These convolutional neural network models are ubiquitous in the image data space. They work phenomenally well on computer vision tasks like image classification, object detection, image recognition, etc.
So – where can you practice your CNN skills? Well, you’ve come to the right place!
There are various datasets that you can leverage for applying convolutional neural networks. Here are three popular datasets:
  • MNIST
  • CIFAR-10
  • ImageNet
In this article, we will be building image classification models using CNN on each of these datasets. That's right! We will explore MNIST, CIFAR-10, and ImageNet to understand, in a practical manner, how CNNs work for the image classification task.
My inspiration for writing this article is to help the community apply theoretical knowledge in a practical manner. This is a very important exercise as it not only helps you build a deeper understanding of the underlying concept but will also teach you practical details that can only be learned through implementing the concept.

Table of Contents

  1. Using CNNs to Classify Hand-written Digits on MNIST Dataset
  2. Identifying Images from CIFAR-10 Dataset using CNNs
  3. Categorizing Images of ImageNet Dataset using CNNs
  4. Where to go from here?
Note: I will be using Keras to demonstrate image classification using CNNs in this article. Keras is an excellent framework to learn when you’re starting out in deep learning.

Using CNNs to Classify Hand-written Digits on MNIST Dataset

MNIST (Modified National Institute of Standards and Technology) is a well-known dataset used in computer vision that was built by Yann LeCun et al. It is composed of images of handwritten digits (0-9), split into a training set of 60,000 images and a test set of 10,000, where each image is 28 x 28 pixels in width and height.
This dataset is often used for practicing any algorithm made for image classification, as it is fairly easy to conquer. Hence, I recommend it as your first dataset if you are just foraying into the field.
MNIST comes with Keras by default and you can simply load the train and test files using a few lines of code:
from keras.datasets import mnist

# loading the dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# let's print the shape of the dataset
print("X_train shape", X_train.shape)
print("y_train shape", y_train.shape)
print("X_test shape", X_test.shape)
print("y_test shape", y_test.shape)
Here is the shape of X (features) and y (target) for the training and validation data:
X_train shape (60000, 28, 28) 
y_train shape (60000,) 
X_test shape (10000, 28, 28) 
y_test shape (10000,)
Before we train a CNN model, let’s build a basic Fully Connected Neural Network for the dataset. The basic steps to build an image classification model using a neural network are:
  1. Flatten the input image dimensions to 1D (width pixels x height pixels)
  2. Normalize the image pixel values (divide by 255)
  3. One-Hot Encode the categorical column
  4. Build a model architecture (Sequential) with Dense layers
  5. Train the model and make predictions
Here’s how you can build a neural network model for MNIST. I have commented on the relevant parts of the code for better understanding:
# keras imports for the dataset and building our neural network
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.utils import np_utils
# loading the dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# Flattening the images from the 28x28 pixels to 1D 784 pixels
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalizing the data to help with the training
X_train /= 255
X_test /= 255
# one-hot encoding using keras' numpy-related utilities
n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
Y_train = np_utils.to_categorical(y_train, n_classes)
Y_test = np_utils.to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", Y_train.shape)
# building a linear stack of layers with the sequential model
model = Sequential()
# hidden layer
model.add(Dense(100, input_shape=(784,), activation='relu'))
# output layer
model.add(Dense(10, activation='softmax'))
# looking at the model summary
model.summary()
# compiling the sequential model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# training the model for 10 epochs
model.fit(X_train, Y_train, batch_size=128, epochs=10, validation_data=(X_test, Y_test))
After running the above code, you'll see that we easily get a good validation accuracy of around 97%.
Let’s modify the above code to build a CNN model.
One major advantage of using CNNs over NNs is that you do not need to flatten the input images to 1D as they are capable of working with image data in 2D. This helps in retaining the “spatial” properties of images.
Here’s the full code for the CNN model:
# keras imports for the dataset and building our neural network
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, MaxPool2D, Flatten
from keras.utils import np_utils
# to calculate accuracy
from sklearn.metrics import accuracy_score
# loading the dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# building the input vector from the 28x28 pixels
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalizing the data to help with the training
X_train /= 255
X_test /= 255
# one-hot encoding using keras' numpy-related utilities
n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
Y_train = np_utils.to_categorical(y_train, n_classes)
Y_test = np_utils.to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", Y_train.shape)
# building a linear stack of layers with the sequential model
model = Sequential()
# convolutional layer
model.add(Conv2D(25, kernel_size=(3,3), strides=(1,1), padding='valid', activation='relu', input_shape=(28,28,1)))
model.add(MaxPool2D(pool_size=(1,1)))  # note: 1x1 pooling is effectively a no-op; try (2,2) to downsample
# flatten output of conv
model.add(Flatten())
# hidden layer
model.add(Dense(100, activation='relu'))
# output layer
model.add(Dense(10, activation='softmax'))
# compiling the sequential model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# training the model for 10 epochs
model.fit(X_train, Y_train, batch_size=128, epochs=10, validation_data=(X_test, Y_test))
Even though our max validation accuracy by using a simple neural network model was around 97%, the CNN model is able to get 98%+ with just a single convolution layer!
You can go ahead and add more Conv2D layers, and also play around with the hyperparameters of the CNN model.
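To report the final test performance explicitly (a small addition, not part of the original snippet), you can evaluate the trained CNN on the hold-out set:

# returns [test loss, test accuracy]
loss, acc = model.evaluate(X_test, Y_test, verbose=0)
print("Test accuracy:", round(acc, 4))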

Identifying Images from the CIFAR-10 Dataset using CNNs

MNIST is a beginner-friendly dataset in computer vision. It’s easy to score 90%+ on validation by using a CNN model. But what if you are beyond beginner and need something challenging to put your concepts to use?
That’s where the CIFAR-10 dataset comes into the picture!
Here’s how the developers behind CIFAR (Canadian Institute For Advanced Research) describe the dataset:
The CIFAR-10 dataset consists of 60,000 32 x 32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
The important points that distinguish this dataset from MNIST are:
  • Images are colored in CIFAR-10 as compared to the black and white texture of MNIST
  • Each image is 32 x 32 pixels
  • 50,000 training images and 10,000 testing images
Now, these images are taken in varying lighting conditions and at different angles, and since these are colored images, you will see many variations in the color of similar objects (for example, the color of ocean water). If you use the simple CNN architecture we saw in the MNIST example above, you will get a low validation accuracy of around 60%.
That’s a key reason why I recommend CIFAR-10 as a good dataset to practice your hyperparameter tuning skills for CNNs. The good thing is that just like MNIST, CIFAR-10 is also easily available in Keras.
You can simply load the dataset using the following code:
from keras.datasets import cifar10
# loading the dataset 
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
Here’s how you can build a decent (around 78-80% on validation) CNN model for CIFAR-10. Notice how the shape values have been updated from (28, 28, 1) to (32, 32, 3) according to the size of the images:
# keras imports for the dataset and building our neural network
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, MaxPool2D, Flatten
from keras.utils import np_utils
# loading the dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
# building the input vector from the 32x32 pixels
X_train = X_train.reshape(X_train.shape[0], 32, 32, 3)
X_test = X_test.reshape(X_test.shape[0], 32, 32, 3)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalizing the data to help with the training
X_train /= 255
X_test /= 255
# one-hot encoding using keras' numpy-related utilities
n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
Y_train = np_utils.to_categorical(y_train, n_classes)
Y_test = np_utils.to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", Y_train.shape)
# building a linear stack of layers with the sequential model
model = Sequential()
# convolutional layer
model.add(Conv2D(50, kernel_size=(3,3), strides=(1,1), padding='same', activation='relu', input_shape=(32, 32, 3)))
# convolutional layer
model.add(Conv2D(75, kernel_size=(3,3), strides=(1,1), padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Conv2D(125, kernel_size=(3,3), strides=(1,1), padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Dropout(0.25))
# flatten output of conv
model.add(Flatten())
# hidden layer
model.add(Dense(500, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(250, activation='relu'))
model.add(Dropout(0.3))
# output layer
model.add(Dense(10, activation='softmax'))
# compiling the sequential model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# training the model for 10 epochs
model.fit(X_train, Y_train, batch_size=128, epochs=10, validation_data=(X_test, Y_test))

Here’s what I changed in the model:

  • Increased the number of Conv2D layers to build a deeper model
  • Increased number of filters to learn more features
  • Added Dropout for regularization
  • Added more Dense layers
You can easily eclipse this performance by tuning the above model. Once you have mastered CIFAR-10, there's also CIFAR-100 available in Keras that you can use for further practice; loading it is shown below. Since it has 100 classes, it won't be an easy task!
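Here's how to load it (a minimal sketch; label_mode='fine' selects the 100 fine-grained classes, while 'coarse' gives 20 superclasses):

from keras.datasets import cifar100
# loading the dataset with the 100 fine-grained labels
(X_train, y_train), (X_test, y_test) = cifar100.load_data(label_mode='fine')
print(X_train.shape, y_train.shape)  # (50000, 32, 32, 3) (50000, 1)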

Categorizing the Images of ImageNet using CNNs

Now that you have mastered MNIST and CIFAR-10, let’s take this problem a notch higher. Here, we will take a look at the famous ImageNet dataset.
ImageNet is the main database behind the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). This is like the Olympics of computer vision. It is the competition that first made CNNs popular, and every year the best research teams across industry and academia compete with their best algorithms on computer vision tasks.

About the ImageNet Dataset

The ImageNet dataset has more than 14 million images, hand-labeled across 20,000 categories.
Also, unlike the MNIST and CIFAR-10 datasets that we have already discussed, the images in ImageNet are of decent resolution (224 x 224) and that’s what poses a challenge for us: 14 million images, each 224 by 224 pixels. Processing a dataset of this size requires a great amount of computing power in terms of CPU, GPU, and RAM.
The downside – that might be too much for an everyday laptop. So what’s the alternative solution? How can an enthusiast work with the ImageNet dataset?

That’s where Fast.ai’s Imagenette dataset comes in

Imagenette is a dataset that's extracted from the large ImageNet collection of images. The reason behind releasing Imagenette is that researchers and students can practice on ImageNet-level images without needing that much compute.
In the words of Jeremy Howard himself:
“I (Jeremy Howard, that is) mainly made Imagenette because I wanted a small vision dataset I could use to quickly see if my algorithm ideas might have a chance of working. They normally don’t, but testing them on Imagenet takes a really long time for me to find that out, especially because I’m interested in algorithms that perform particularly well at the end of training.
But I think this can be a useful dataset for others as well.”
And that’s what we will also use for practicing!

1. Download the Imagenette Dataset

Here’s how you can fetch the dataset (commands for your terminal):
$ wget https://s3.amazonaws.com/fast-ai-imageclas/imagenette2.tgz
$ tar -xf imagenette2.tgz
Once you have downloaded the dataset, you will notice that it has two folders – “train” and “val”. These contain the training and validation set respectively. Inside each folder, there are separate folders for each class. Here’s the mapping of the classes:
imagenette_map = {
"n01440764" : "tench",
"n02102040" : "springer",
"n02979186" : "casette_player",
"n03000684" : "chain_saw",
"n03028079" : "church",
"n03394916" : "French_horn",
"n03417042" : "garbage_truck",
"n03425413" : "gas_pump",
"n03445777" : "golf_ball",
"n03888257" : "parachute"
}
view rawimagenette_map.py hosted with ❤ by GitHub
These classes have the same IDs in the original ImageNet dataset. Each of the classes has approximately 1,000 images, so overall it's a balanced dataset; you can verify this with the quick check below.
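You can count the files in each class folder to see the balance for yourself (a small sketch, assuming the archive was extracted to ./imagenette2 as above and that imagenette_map from the previous snippet is in scope):

import os

train_dir = "imagenette2/train"
for cls in sorted(os.listdir(train_dir)):
    n = len(os.listdir(os.path.join(train_dir, cls)))
    print(imagenette_map.get(cls, cls), n)  # roughly 1000 images per class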

2. Loading Images using ImageDataGenerator

Keras has this useful functionality for loading large images (like we have here) without maxing out the RAM, by doing it in small batches. ImageDataGenerator in combination with fit_generator provides this functionality:
from keras.preprocessing.image import ImageDataGenerator
# create a new generator
imagegen = ImageDataGenerator()
# load train data
train = imagegen.flow_from_directory("imagenette2/train/", class_mode="categorical", shuffle=False, batch_size=128, target_size=(224, 224))
# load val data
val = imagegen.flow_from_directory("imagenette2/val/", class_mode="categorical", shuffle=False, batch_size=128, target_size=(224, 224))
ImageDataGenerator infers the class labels and the number of classes from the folder names.
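You can inspect the inferred mapping directly; class_indices is a standard attribute of the iterator returned by flow_from_directory:

# folder name -> integer class index, as inferred from the directory structure
print(train.class_indices)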

3. Building a Basic CNN model for Image Classification

Let’s build a basic CNN model for our Imagenette dataset (for the purpose of image classification):
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Flatten, Dense, InputLayer, BatchNormalization, Dropout
# build a sequential model
model = Sequential()
model.add(InputLayer(input_shape=(224, 224, 3)))
# 1st conv block
model.add(Conv2D(25, (5, 5), activation='relu', strides=(1, 1), padding='same'))
model.add(MaxPool2D(pool_size=(2, 2), padding='same'))
# 2nd conv block
model.add(Conv2D(50, (5, 5), activation='relu', strides=(2, 2), padding='same'))
model.add(MaxPool2D(pool_size=(2, 2), padding='same'))
model.add(BatchNormalization())
# 3rd conv block
model.add(Conv2D(70, (3, 3), activation='relu', strides=(2, 2), padding='same'))
model.add(MaxPool2D(pool_size=(2, 2), padding='valid'))
model.add(BatchNormalization())
# ANN block
model.add(Flatten())
model.add(Dense(units=100, activation='relu'))
model.add(Dense(units=100, activation='relu'))
model.add(Dropout(0.25))
# output layer
model.add(Dense(units=10, activation='softmax'))
# compile model
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
# fit on data for 30 epochs
model.fit_generator(train, epochs=30, validation_data=val)
When you look at the validation accuracy of the above model, you'll realize that even though it is a deeper architecture than what we have utilized so far, we are only able to get a validation accuracy of around 40-50%.
There can be many reasons for this, such as the model not being complex enough to learn the underlying patterns of the images, or the training data being too small to generalize accurately across classes.
Step up – transfer learning.

4. Using Transfer Learning (VGG16) to improve accuracy

VGG16 is a CNN architecture that was the first runner-up in the 2014 ImageNet Challenge. It was designed by the Visual Geometry Group at Oxford and has 16 weight layers in total, 13 of which are convolutional. We will load the pre-trained weights of this model so that we can utilize the useful features it has learned for our task.

Downloading weights of VGG16

from keras.applications import VGG16

# include top should be False to remove the softmax layer
pretrained_model = VGG16(include_top=False, weights='imagenet')
pretrained_model.summary()
Here’s the architecture of the model:

Generate features from VGG16

Let’s extract useful features that VGG16 already knows from our dataset’s images:
from keras.utils import to_categorical
# extract train and val features (shuffle=False above keeps the order aligned with .labels)
vgg_features_train = pretrained_model.predict(train)
vgg_features_val = pretrained_model.predict(val)
# OHE target column
train_target = to_categorical(train.labels)
val_target = to_categorical(val.labels)
Once the above features are ready, we can just use them to train a basic Fully Connected Neural Network in Keras:
model2 = Sequential()
model2.add(Flatten(input_shape=(7,7,512)))
model2.add(Dense(100, activation='relu'))
model2.add(Dropout(0.5))
model2.add(BatchNormalization())
model2.add(Dense(10, activation='softmax'))
# compile the model
model2.compile(optimizer='adam', metrics=['accuracy'], loss='categorical_crossentropy')
model2.summary()
# train model using features generated from VGG16 model
model2.fit(vgg_features_train, train_target, epochs=50, batch_size=128, validation_data=(vgg_features_val, val_target))

In case you have mastered the Imagenette dataset, fastai has also released two variants which include classes you’ll find difficult to classify:
  • Imagewoof: 10 classes of dog breeds, a more difficult problem to classify
  • Image网 (“wang”): A combination of Imagenette and Imagewoof and a couple of tricks that make it a harder problem

Where to go from here?

Apart from the datasets we've covered above, you can also use the datasets below for building computer vision algorithms. In fact, consider this a challenge. Can you apply your CNN knowledge to beat the benchmark score on these datasets?
  • Fashion MNIST – MNIST-like dataset of clothes and apparel. Instead of digits, the images show a type of apparel (T-shirt, trousers, bag, etc.)
  • Caltech 101 – Another challenging dataset that I found for image classification
I also suggest that before going for transfer learning, you try improving your base CNN models. You can learn from the architectures of VGG16, ZFNet, etc. for clues on hyperparameter tuning, and you can use the same ImageDataGenerator to augment your images and increase the effective size of the dataset, as sketched below.
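As a parting example, here is a minimal sketch of augmentation with the same ImageDataGenerator API used earlier (the parameter values are illustrative, not tuned):

from keras.preprocessing.image import ImageDataGenerator

# random rotations, shifts, and flips produce new variants of each image on the fly
augmented_gen = ImageDataGenerator(
    rotation_range=15,        # rotate up to 15 degrees
    width_shift_range=0.1,    # shift horizontally by up to 10% of the width
    height_shift_range=0.1,   # shift vertically by up to 10% of the height
    horizontal_flip=True)     # randomly mirror images left-right

train_aug = augmented_gen.flow_from_directory(
    "imagenette2/train/", class_mode="categorical",
    batch_size=128, target_size=(224, 224))

Training on train_aug instead of train exposes the model to a different random variant of each image every epoch, which usually reduces overfitting on small datasets.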