## Implementation for MNIST (TensorFlow + Python Tutorial)

First off, we definitely need NumPy, since it lets us run expensive operations such as matrix multiplication outside of Python; the drawback is that it still switches back to Python for every single operation, which adds overhead. So we also use TensorFlow. Instead of running one expensive operation at a time the way NumPy does, TensorFlow describes a graph of interacting operations that runs entirely outside of Python. This approach is similar to Theano or Torch.
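To make the comparison concrete, here is a minimal NumPy sketch (the array shapes are illustrative, matching the 784-dimensional flattened MNIST images used below): each call like `@` runs in optimized native code, but control returns to Python between every such call, which is the overhead the paragraph above refers to.

```python
import numpy as np

# NumPy pushes heavy operations like matrix multiplication into
# native code; Python is only re-entered between calls.
x = np.random.rand(50, 784)   # a batch of 50 flattened 28x28 images
W = np.random.rand(784, 10)   # weights for a 10-class output layer
logits = x @ W                # one native-code matmul call
print(logits.shape)           # (50, 10)
```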

The rest of the explanation is in the code comments.
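One detail worth spelling out before the listing: the data loader is called with `one_hot=True`, which means a label such as 3 is represented as a length-10 vector with a 1 in position 3 and 0 elsewhere. A quick NumPy illustration (the `np.eye` trick here is just one way to build such vectors, not part of the loader):

```python
import numpy as np

# one_hot encoding: label 3 -> [0 0 0 1 0 0 0 0 0 0]
labels = np.array([3, 7])
one_hot = np.eye(10)[labels]  # row i of the identity = one-hot for i
print(one_hot[0])  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```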

"""READ/DL data. Creates directory MNIST_Data""" from tensorflow.examples.tutorials.mnist import input_data import tensorflow as tf mnist = input_data.read_data_sets("MNIST_data/", one_hot=True) """Since we use Rectifier linear unit (ReLU) neurons, initialize with slight positive to avoid dead neurons for variables. convulations uses stride of one and are zeropadded so the output size = input size.""" def weight_variable(shape): initial = tf.truncated_normal(shape, stddev=0.1) return tf.Variable(initial) def bias_variable(shape): initial = tf.constant(0.1, shape=shape) return tf.Variable(initial) def conv2d(x, W): return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME') def max_pool_2x2(x): return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') """Set up the softmax regression model by creating nodes for the input images and target out classes, 784 is the dimensionality of a single flat MNIST image. 10 for each digit class """ # placeholers x = tf.placeholder(tf.float32, [None, 784]) y_ = tf.placeholder(tf.float32, [None, 10]) """First Convolutional Layer.""" """32 features, 5x5 patch. weight_variable[patchwidth patchheight inputchannel outputchannel]""" W_conv1 = weight_variable([5, 5, 1, 32]) b_conv1 = bias_variable([32]) """Apply layer. reshape x to 4d tensor, 2nd & 3rd dim = width and height 4th dim = color channels """ x_image = tf.reshape(x, [-1, 28, 28, 1]) """"convulve x_image with weight tensor, add bias, apply ReLU function, max pool """ h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1) h_pool1 = max_pool_2x2(h_conv1) """Second Convolutional Layer. stack with 64 features, 5x5 patch""" W_conv2 = weight_variable([5, 5, 32, 64]) b_conv2 = bias_variable([64]) h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2) h_pool2 = max_pool_2x2(h_conv2) """Densely Connected Layer.image reduced to 7x7. add fully Connected layer with 1024 neurons to process the entire image. reshape from poolng layer multiply weight matrix. 
add bias. apply ReLU""" W_fc1 = weight_variable([7 * 7 * 64, 1024]) b_fc1 = bias_variable([1024]) h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64]) h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1) # dropout """reduce overfitting. create placeholer for probability that a neruon's output is kept during dropout, which allows us to turn dropout on during training. Yay! and off during testing. Double Yay! TensorFlow auto handles scaling neuron outputs. Triple Yay!""" keep_prob = tf.placeholder("float") h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob) """Readout layer. Add softmax layer again. keep_prob in feed_dict will control dropout rate and add logging to every 100 iteration in training. For this small convolutional network, performance is actually nearly identical with and without dropout. Dropout is often very effective at reducing overfitting, but it is most useful when training very large neural networks.""" W_fc2 = weight_variable([1024, 10]) b_fc2 = bias_variable([10]) y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2) # Cross entropy cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv)) """Train Model with build in optimization algorithms Note: GradientDescentOptimizer(0.5) would give 92% accuracy""" train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy) correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) # initialling variables with Session after adding optimizer w/ AdamOptimizer init = tf.initialize_all_variables() NUM_CORES = 4# Choose how many cores to use. 
sess = tf.Session( config=tf.ConfigProto( inter_op_parallelism_threads=NUM_CORES, intra_op_parallelism_threads=NUM_CORES)) sess.run(init) # run 20,000 times for i in range(20000): batch = mnist.train.next_batch(50) if i % 100 == 0: train_accuracy = accuracy.eval( session=sess, feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0}) print("step %d, training accuracy %g" % (i, train_accuracy)) train_step.run( session=sess, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5}) """Evaluate the model.""" print("test accuracy %g" % accuracy.eval(feed_dict={ x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
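The fully connected layer's input size of 7 * 7 * 64 follows directly from the shapes above: each SAME-padded convolution preserves the spatial size, and each 2x2 max pool halves it. A quick arithmetic sanity check:

```python
# Why h_pool2 flattens to 7*7*64 features per image:
size = 28          # input images are 28x28
size = size // 2   # conv1 (SAME padding keeps 28x28) + 2x2 pool -> 14
size = size // 2   # conv2 (SAME padding keeps 14x14) + 2x2 pool -> 7
flat = size * size * 64  # 64 feature maps from the second conv layer
print(flat)        # 3136 == 7 * 7 * 64
```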