Deep Learning TensorFlow Basic MNIST Tutorial

MNIST is the "Hello World" of deep learning. This article describes in detail how to start deep learning from scratch, with complete code and a thorough explanation.

1. Software Architecture.

  1. Two very small networks are used to classify the MNIST dataset. The first is the simplest fully connected network; the second is a convolutional network.
  2. MNIST is an entry-level dataset, so there is no need to augment the images or stream them from disk with a generator: the whole set fits in memory and can be trained with a single fit() call, as the sketch below illustrates.
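    A minimal, self-contained sketch of this idea (assumed for illustration, not from the repository): the stand-in random arrays mimic MNIST's shapes, and one fit() call trains on the whole in-memory dataset.
    
    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense
    
    # Stand-in arrays with MNIST's shapes: 100 flattened 28x28 images, 10 one-hot classes
    X = np.random.rand(100, 784).astype('float32')
    y = np.eye(10)[np.random.randint(0, 10, 100)].astype('float32')
    
    model = Sequential([Dense(10, input_dim=784, activation='softmax')])
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    model.fit(X, y, epochs=1, batch_size=32)  # whole dataset in memory; no generator needed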

2. How To Install.

  1. Note that the TensorFlow version cannot be 2.x: the code relies on the tensorflow.examples.tutorials API, which was removed in TensorFlow 2.
  2. The main third-party libraries are TensorFlow 1.x and Keras running on the TensorFlow backend; the supporting libraries are NumPy and Matplotlib.
  3. The installation method is very simple, for example pip install numpy.
  4. Run the commands below to install TensorFlow and the related Python libraries (the version pins keep TensorFlow at 1.x and Keras at a multi-backend release compatible with it).
    pip install numpy
    
    pip install matplotlib
    
    pip install "keras<2.4"
    
    pip install "tensorflow<2"
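  5. After installing, you can verify that a 1.x version of TensorFlow is active (a quick sanity check):
    python -c "import tensorflow as tf; print(tf.__version__)"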

3. How To Use.

  1. First, preview the dataset: run mnistplt.py to plot four of the training images (a minimal sketch of this script is given after this list).
  2. To train the fully connected network, run Densemnist.py to produce the weights Dense.h5; to load the model and predict, run Denseload.py.
  3. To train the convolutional network, run CNNmnist.py to produce the weights CNN.h5; to load the model and predict, run CNNload.py (a minimal loading sketch is given at the end of section 4).
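  4. The repository's mnistplt.py is not reproduced in this article; below is a plausible minimal version (an assumption, not the original script) that loads MNIST and plots four training images.
    """Sketch of mnistplt.py (assumed): preview four MNIST training images."""
    import matplotlib.pyplot as plt
    from tensorflow.examples.tutorials.mnist import input_data
    
    mnist = input_data.read_data_sets("./data", one_hot=True)
    images = mnist.train.images.reshape(-1, 28, 28)
    
    # Draw the first four training digits in a 2x2 grid
    for i in range(4):
        plt.subplot(2, 2, i + 1)
        plt.imshow(images[i], cmap='gray')
        plt.axis('off')
    plt.show()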

4. Training Process Python Code.

  1. Fully connected network training.
    """Multilayer perceptron training"""
    from tensorflow.examples.tutorials.mnist import input_data
    from keras.models import Sequential
    from keras.layers import Dense
    # Simulate reading raw grayscale data
    img_size=28
    num=10
    mnist=input_data.read_data_sets("./data",one_hot=True)
    X_train,y_train,X_test,y_test=mnist.train.images,mnist.train.labels,mnist.test.images,mnist.test.labels
    X_train=X_train.reshape(-1,img_size,img_size)
    X_test=X_test.reshape(-1,img_size,img_size)
    X_train=X_train*255
    X_test=X_test*255
    y_train=y_train.reshape(-1,num)
    y_test=y_test.reshape(-1,num)
    print(X_train.shape)
    print(y_train.shape)
    # A fully connected layer requires flat, one-dimensional input
    num_pixels = X_train.shape[1] * X_train.shape[2]
    X_train = X_train.reshape(X_train.shape[0],num_pixels).astype('float32')
    X_test = X_test.reshape(X_test.shape[0],num_pixels).astype('float32')
    # normalization
    X_train=X_train/255
    X_test=X_test/255
    # One-hot encoding: read_data_sets(one_hot=True) already did this, so it is omitted here
    #y_train = np_utils.to_categorical(y_train)
    #y_test = np_utils.to_categorical(y_test)
    
    # Build a network
    def baseline():
        '''
        optimizer: the optimizer, e.g. Adam.
        loss: the loss function. With categorical_crossentropy the labels must be one-hot encoded: for 10 classes, each sample's label is a 10-dimensional vector that is 1 at the index of the true class and 0 elsewhere.
        metrics: a list of metrics used to evaluate the model's performance during training and testing.
        '''
        model=Sequential()
        
        '''
        The input size is fixed with the input_dim argument when the first layer is created: with 784 input variables, set it to num_pixels.
        A fully connected layer is defined by the Dense class: the first argument is the number of neurons in the layer, followed by the weight initializer and the activation function. 'uniform' draws initial weights from a uniform distribution on [0, 0.05] (Keras's default); 'normal' uses a Gaussian distribution instead. Initialization sets the layer's initial connection weights and biases.
        '''
        model.add(Dense(num_pixels,input_dim=num_pixels,kernel_initializer='normal',activation='relu'))
        
        # softmax is applied across all neurons of this layer, producing one probability per class
        model.add(Dense(num,kernel_initializer='normal',activation='softmax'))
        
        # categorical_crossentropy is suitable for multi-classification problems and uses softmax as the activation function of the output layer
        model.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
        return model
    
    # Training model
    model = baseline()
    
    """
    batch_size: Integer, the number of samples for each gradient update, if not specified then the default value is 32.
    epochs: Integer, the number of iterations of the training model.
    verbose: how to display logs, integer value.
    0: Not output log information to the standard output stream.
    1: Show progress bar.
    2: Each epoch output one line of records.
    
    For a data set with 2000 training samples, divide the 2000 samples into a batch of 500, so 4 iterations are required to complete an epoch.
    """
    
    model.fit(X_train,y_train,validation_data=(X_test,y_test),epochs=10,batch_size=200,verbose=2)
    # Model summary printing
    model.summary()
    # model.evaluate() returns the loss value and the metrics you selected (e.g. accuracy)
    """
    verbose: how to display logs, an integer value.
    verbose = 0: no log output to the standard output stream.
    verbose = 1: show a progress bar.
    """
    scores = model.evaluate(X_test,y_test,verbose=0)
    print(scores)
    # Save model.
    model_dir="./Dense.h5"
    model.save(model_dir)
  2. CNN training.
    """
    Model construction and training
    Sequential model structure: A linear stack of layers. It is a simple linear structure with no redundant branches and a stack of multiple network layers.
    Output feature maps according to the filter number, that is, the depth of the convolution kernel (filter).
    3-channel RGB image, one filter has a small convolution kernel with 3 channels, but it still counts as 1 filter.
    """
    from tensorflow.examples.tutorials.mnist import input_data
    from keras.models import Sequential
    from keras.layers import Dense
    
    # The Flatten layer "flattens" the input, i.e. turns multi-dimensional input into one dimension;
    # it is commonly used at the transition from the convolutional layers to the fully connected layers.
    from keras.layers import Flatten
    from keras.layers.convolutional import Conv2D
    from keras.layers.convolutional import MaxPooling2D
    
    # Simulate reading raw grayscale data
    img_size=28
    num=10
    mnist=input_data.read_data_sets("./data",one_hot=True)
    X_train,y_train,X_test,y_test=mnist.train.images,mnist.train.labels,mnist.test.images,mnist.test.labels
    X_train=X_train.reshape(-1,img_size,img_size)
    X_test=X_test.reshape(-1,img_size,img_size)
    X_train=X_train*255
    X_test=X_test*255
    y_train=y_train.reshape(-1,num)
    y_test=y_test.reshape(-1,num)
    print(X_train.shape) #(55000, 28, 28)
    print(y_train.shape) #(55000, 10)
    
    # The shape of the convolution input here should match the input_shape in the model
    X_train = X_train.reshape(X_train.shape[0],28,28,1).astype('float32')
    X_test = X_test.reshape(X_test.shape[0],28,28,1).astype('float32')
    print(X_train.shape)#(55000,28,28,1)
    
    # normalization
    X_train=X_train/255
    X_test=X_test/255
    
    # One-hot encoding: read_data_sets(one_hot=True) already did this, so it is omitted here
    #y_train = np_utils.to_categorical(y_train)
    #y_test = np_utils.to_categorical(y_test)
    
    # Build a CNN network
    def CNN():
        """
        The first layer is the convolutional layer. This layer has 32 feature maps, as the input layer of the model, accepting input data of [pixels][width][height] size. The size of the feature map is 1*5*5, and its output is connected to a ‘relu’ activation function
       The next layer is the pooling layer, using MaxPooling, the size is 2*2
        Flatten compresses one dimension as the input layer of the fully connected layer
        Next is the fully connected layer, with 128 neurons, and the activation function uses ‘relu’
        The last layer is the output layer, there are 10 neurons, each neuron corresponds to a category, the output value represents the probability that the sample belongs to that category
        """
        model = Sequential()
        model.add(Conv2D(32, (5, 5), input_shape=(img_size,img_size,1), activation='relu'))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Flatten())
        model.add(Dense(128, activation='relu'))
        model.add(Dense(num, activation='softmax'))
        # Compile
        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model
    
    # Model training
    model=CNN()
    model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=5, batch_size=200, verbose=1)
    model.summary()
    scores = model.evaluate(X_test,y_test,verbose=1)
    print(scores)
    
    # Save model
    model_dir="./CNN.h5"
    model.save(model_dir)
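  3. Loading and prediction (sketch). Denseload.py and CNNload.py are not reproduced here; the minimal sketch below (an assumption, not the original scripts) shows how either saved model can be loaded and used to predict a single test image.
    """Sketch of CNNload.py / Denseload.py (assumed): load a saved model and predict."""
    import numpy as np
    from keras.models import load_model
    from tensorflow.examples.tutorials.mnist import input_data
    
    mnist = input_data.read_data_sets("./data", one_hot=True)
    model = load_model("./CNN.h5")  # or "./Dense.h5" for the fully connected network
    
    # Shape the sample to match the model's input:
    # (1, 28, 28, 1) for CNN.h5, (1, 784) for Dense.h5
    x = mnist.test.images[0].reshape(1, 28, 28, 1)
    
    pred = model.predict(x)
    print("predicted digit:", np.argmax(pred))
    print("true digit:", np.argmax(mnist.test.labels[0]))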
    
