Getting started with Tensorflow

In modern software application development there is an increasing common requirement for developers too integrate aspects of Machine Learning and Artificial Intelligence into software application stacks. Its often hard to differentiate between the role of Software Engineer and Data Scientist because in reality they often draw from the same key skills.

The problem becomes especially difficult for most people to differentiate between if you happen to list Python as one of your key skills because Python is so pervasive throughout the Data Science community.

In my personal opinion I think having an understanding and being able to develop machine learning models and implementing them is going to become a key skill requirement for software developers in the future. In much the same way, that having an understanding and being able to develop applications using HTML, CSS & Web Server administration became a somewhat mandatory skills for developers, regardless of whether they were web developers or not because web development and the web became a primary deployment model for most applications.

Developers don't always need to necessarily need to understand all the ins and outs of a specific technology or framework and will often need to only know just enough to solve a particular problem. In this post, I hope to provide a general introduction into Machine Learning and Neural Networks and Tensorflow in order to provide developers with a broader understanding of the subject and how to implement a basic machine learning algorithm using Python, Anaconda and Tensorflow

What is Machine Learning?

One of the first things software developers need to understand about machine learning is how and why it varies greatly from the classical software development approach to solving business problems. This can quite simply be explained without going into the complex jargon that so often accompanies Machine Learning explanations.

In short, in the classic approach to software development we use Rules and Data as Input to generate Answers as the Output. Using Machine Learning, we use Data and Answers as the Input and generate Rules as the Output.

Classical Programming VS Machine Learning

The primary objective of Artificial Intelligence is maximise a concept which we call Expected Utility or the probability of satisfaction of doing an action. An easy example of this is Education, your Expected Utility of going to school and progressing onto University it is hoped you increase your chances of obtaining higher earning capacity later in life.

The expected Utility of Artificial Intelligence (AI) is to replace error-prone human intelligence in completing tedious repetitive tasks. AI aims to mimic Human Intelligence in the following areas :

Natural Language Processing (NLP) : Ability to understand spoken or written human language and provide natural response to questions i.e. Automated Narration, Machine Translation, Text Summarization
Knowledge and Reasoning : Provide human like reasoning and decision-making to solve specific problems and react to changes in an environment
Planning and Problem Solving : Make predictions regarding possible actions and choosing and action that maximizes expected utility
Perception : Agent to provide information about the world it lives in
Learning : Develop knowledge of an environment using perception to learn through observation. This is the subfield of AI that deals with algorithms that learn from data without some explicit programming - Machine Learning

Machine Learning uses tools such as Statistical Analysis, Probabilistic Modelling, Decision Trees and Neural Networks to efficiently process large amounts of data in order to derive predictions.

Types of Machine Learning

Machine Learning can be separated into three major groups which are dependent on the type of data available and the desired objective

Supervised Learning
Unsupervised Learning
Reinforcement Learning

What is Tensorflow

TensorFlow is an open source software library for numerical computation using dataflow graphs. Nodes in the graph represents mathematical operations, while graph edges represent multi-dimensional data arrays communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.

This all sounds really technical and daunting even for Software Developers, I have to be honest the first time I read this I still had no idea what it was trying to tell me! I could identify some words, but really struggled with putting it all together. So the only way I could make sense of it in my mind was to think of Tensorflow as a Neural Network library, all be it a very powerful neural network library, which enables developers to Machine Learning algorithms such as Decision Trees or K-Nearest Neighbours.

The advantages of using TensorFlow are:

Intuitive Construct - a flow of tensors help to easily visualise every part of a graph
Easy to Train - using eitehr GPU or CPU computing resources
Platform Agnostic - Run models whereever you want i.e. Mobile, PC or Server

Getting started with Tensorflow

There are a number of ways to install TensorFlow on your laptop, for me personally I use Ubuntu Linux and by far the easiest way of getting started with learning configuring your Laptop is to get started with Python and Artificial Intelligence on Ubuntu which will guide you through the process of installing Anaconda on your Ubuntu Laptop .

The next step is to use Anaconda to install and Configure Tensorflow in order to start developing Deep learning applications making use of Jupyter Notebooks.

Once all these aspects have been covered your will be good to go to start creating your first TensorFlow based neural network.

Why Neural Networks

There are 3 big reasons why neural networks have become popular machine learning solutions:

Computing resources have become cheaper therefore easy and affordable to use resources at scale
There is a lot of data that is publicly available
Advanced algorithms are allowed to make and train more complex neural networks

SImple TensorFlow Neural Network Example

In this example we will develop a simple deep learning solution using the Modified National Institute of Standards and Technology (MINST) dataset of black and white handwritten digits. Each image has been lableled based on the digit that has been written on the Image. We will develop a solution that will predict the label based on the image.

This type of solution is quite difficult to implement using a classical programming approach, because as you can see in the preceeding image there are a number of different ways of writing the same number. However, using Machine Learning and TensorFlow we'll implement a solution is just a few lines of code.

Training a model for MNIST

To implement this solution we'll open a new Jupyter Lab or Notebook, which we'll use to develop and document our solution.

Our first step will be import TensorFlow

import tensorflow as tf

The next step is to download and MINST dataset and present it in a binary format. We will also divide the dataset into a training and test sets.

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

In just a few lines of code we've downloaded our data and configured our training and test datasets. We can now proceed to setting up the layers for our neural network. Each layer will consist of a number of neurons and activation function. The first layer tried to get more useful data of the raw data, The sedond layer tries to use the data to assign the probability of the image being one of 10 digits.

We will choose three parameters for the training process:

Loss Function : Used to optimize performance. The training process consists of decreasing the value of the loss function and trying to find weights for the neural network.
OPtimizer : Iterates towards the most optimal solution and how it will change weights after each iteration
Metrics : Measure the neural network performance

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])		

model.compile(optimizer='adam',
             loss='sparse_categorical_crossentropy',
             metrics=['accuracy'])

We can now run the training on the training part of dataset. This could take up several minutes based on the configuration of the machine you are using.

#Run training
model.fit(x_train, y_train, epochs=2)


#Print out results
print(model.evaluate(x_test, y_test))

After our neural network has been trained we can save it so we can use it later. The model file will contain the model architecture.

About
Latest Posts

Gary Woodfine

Technical Director at threenine.co.uk

Gary is Technical Director at threenine.co.uk, an independent software vendor specialising in IoT, Field Service and associated managed services,enabling customers to be efficient, productive, secure and scale-able.