A Boilerplate Recipe for Building Neural Networks

Forward Propagation

Build the complete network structure from input to output.
Describing the forward pass requires defining three functions:

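def forward(x, regularizer):
    w = ...  # weights, e.g. obtained from get_weight
    b = ...  # biases, e.g. obtained from get_bias
    y = ...  # network output computed from x, w and b
    return y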

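The first function, forward, designs the network structure: it builds the complete network from input to output and implements the forward pass. Its parameter x is the input and regularizer is the regularization weight; it returns y, the prediction or classification result.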

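def get_weight(shape, regularizer):
    w = tf.Variable(...)  # fill in an initial value with the given shape
    # Add the L2 regularization loss of w to the 'losses' collection
    tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer)(w))
    return w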

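The second function, get_weight, sets up the parameter w. Its parameter shape is the shape of w and regularizer is the regularization weight; it returns w. Inside it, tf.Variable gives w its initial value, and tf.add_to_collection adds the regularization loss of w to the collection named losses, which later contributes to the total loss.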
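To make the role of the losses collection concrete, here is a minimal pure-Python sketch that does not use TensorFlow. It assumes the L2 penalty has the form scale * sum(w**2) / 2; the helper l2_penalty, the list losses, and the weight values are hypothetical and exist only for illustration.

```python
# Pure-Python sketch of what the 'losses' collection accumulates.
# Assumption: the L2 penalty is scale * sum(w**2) / 2.

def l2_penalty(weights, scale):
    """Return scale * (sum of squared entries) / 2 for a 2-D list of weights."""
    return scale * sum(v * v for row in weights for v in row) / 2.0

losses = []                      # stands in for the 'losses' collection
w1 = [[1.0, -2.0], [0.5, 0.0]]   # hypothetical weight matrices
w2 = [[3.0], [-1.0]]
for w in (w1, w2):
    # Each call to get_weight would add one such penalty to the collection.
    losses.append(l2_penalty(w, scale=0.01))

# Analogous to tf.add_n(tf.get_collection('losses'))
regularization_term = sum(losses)
print(regularization_term)
```

Every weight matrix contributes one entry, so summing the collection gives a single regularization term covering all layers.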

def get_bias(shape):
    b = tf.Variable(...)  # fill in an initial value with the given shape
    return b

The third function, get_bias, sets up the parameter b. Its parameter shape is the shape of b; it returns b. Inside it, tf.Variable gives b its initial value.

Backpropagation

Train the network and optimize its parameters to improve the model's accuracy.

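def backward():
    x = tf.placeholder(...)   # placeholder for the input data
    y_ = tf.placeholder(...)  # placeholder for the ground-truth labels
    y = forward.forward(x, REGULARIZER)
    global_step = tf.Variable(0, trainable=False)
    loss = ...  # total loss, defined in the steps that follow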

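In backward, placeholder reserves slots for the dataset x and the ground-truth labels y_, forward.forward builds the forward-pass network, and global_step counts the training steps and is marked as non-trainable. When training a network, three techniques are commonly used together to optimize the model: regularization, an exponentially decaying learning rate, and moving averages.
In TensorFlow, regularization is expressed as follows: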

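First, compute the loss between the prediction and the ground truth.

MSE: loss_mse = tf.reduce_mean(tf.square(y - y_))
Cross-entropy: ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1)); cem = tf.reduce_mean(ce)
Custom: any other measure of the gap between y and y_
Next, the total loss is that loss plus the regularization term.
loss = (gap between y and y_) + tf.add_n(tf.get_collection('losses'))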
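The two standard losses and the final sum can be checked by hand with a small pure-Python sketch. The helper names mse and sparse_softmax_cross_entropy and all numeric values are hypothetical. The cross-entropy is written for a single example as -log(softmax(logits)[label]), which is the textbook definition and not a copy of TensorFlow's implementation.

```python
import math

def mse(y, y_):
    """Mean squared error over two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(y, y_)) / len(y)

def sparse_softmax_cross_entropy(logits, label):
    """Cross-entropy of one example: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# Data loss: gap between prediction y and ground truth y_
data_loss = mse([0.5, 1.5], [1.0, 1.0])

# Hypothetical per-weight penalties, standing in for the 'losses' collection
reg_losses = [0.02, 0.03]

# Total loss = data loss + sum of all regularization terms
total_loss = data_loss + sum(reg_losses)

# Two equal logits give probability 0.5 per class, so the loss is log(2)
ce = sparse_softmax_cross_entropy([0.0, 0.0], 1)
print(data_loss, total_loss, ce)
```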

In TensorFlow, the exponentially decaying learning rate is expressed as:

learning_rate = tf.train.exponential_decay(
    LEARNING_RATE_BASE,
    global_step,
    DATASET_SIZE / BATCH_SIZE,  # decay steps: total number of samples / batch size (DATASET_SIZE is a placeholder name)
    LEARNING_RATE_DECAY,
    staircase=True
)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
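The schedule itself is a one-line formula: learning_rate = base * decay_rate ^ (global_step / decay_steps), where the exponent is rounded down to an integer when staircase is true. The sketch below reproduces that formula in plain Python. The function exponential_decay here is a hypothetical stand-in written for illustration, and the numeric arguments are made up.

```python
import math

def exponential_decay(base_lr, global_step, decay_steps, decay_rate, staircase=False):
    """Return base_lr * decay_rate ** (global_step / decay_steps)."""
    exponent = global_step / decay_steps
    if staircase:
        # Staircase mode: the rate drops only once per full decay_steps
        exponent = math.floor(exponent)
    return base_lr * decay_rate ** exponent

# With decay_steps = 100, i.e. one decay per pass over a hypothetical dataset
lr_0 = exponential_decay(0.1, 0, 100, 0.5, staircase=True)
lr_150 = exponential_decay(0.1, 150, 100, 0.5, staircase=True)
lr_200 = exponential_decay(0.1, 200, 100, 0.5, staircase=True)
lr_smooth = exponential_decay(0.1, 150, 100, 0.5)  # continuous decay
print(lr_0, lr_150, lr_200, lr_smooth)
```

Setting decay_steps to the number of batches per epoch, as in the template, means the staircase version lowers the learning rate once per full pass over the data.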

In TensorFlow, the moving average is expressed as:

ema = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
ema_op = ema.apply(tf.trainable_variables())
# Bundle the training step and the moving-average update into one op
with tf.control_dependencies([train_step, ema_op]):
    train_op = tf.no_op(name='train')

Note that the moving average and the exponentially decaying learning rate share the same global_step.
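What the moving average computes for each variable can be sketched in plain Python. Each trainable variable gets a shadow copy, updated as shadow = decay * shadow + (1 - decay) * variable. When a step counter is supplied, the effective decay is min(decay, (1 + step) / (10 + step)), so the average tracks the variable more closely early in training. The function ema_update and the values below are hypothetical illustrations.

```python
def ema_update(shadow, value, decay, num_updates=None):
    """One moving-average step for a single scalar variable."""
    if num_updates is not None:
        # A smaller effective decay early on lets the shadow value catch up
        decay = min(decay, (1.0 + num_updates) / (10.0 + num_updates))
    return decay * shadow + (1.0 - decay) * value

shadow0 = 0.0  # shadow starts at the variable's initial value (0.0 here)

# Step 0: the variable has moved to 1.0; effective decay = min(0.99, 1/10) = 0.1
shadow1 = ema_update(shadow0, 1.0, decay=0.99, num_updates=0)

# Step 100: the variable has moved to 10.0; effective decay = min(0.99, 101/110)
shadow2 = ema_update(shadow1, 10.0, decay=0.99, num_updates=100)

# Without a step counter the full decay of 0.99 applies
shadow_fixed = ema_update(0.0, 1.0, decay=0.99)
print(shadow1, shadow2, shadow_fixed)
```

Because the step counter drives the effective decay, passing the same global_step that the optimizer increments keeps the average in sync with training progress.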

Use a with block to initialize all parameters and run the training loop:

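with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    for i in range(STEPS):
        # Feed one batch of inputs and labels; run train_op instead if the moving average is used
        sess.run(train_step, feed_dict={x: ..., y_: ...})
        if i % N == 0:  # N: number of steps between progress reports (placeholder)
            print(...)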

A complete neural network implementation:

import tensorflow as tf
import numpy as np
# Size of each training batch
batch_size = 8
seed = 1
# Parameters of the neural network
w1 = tf.Variable(tf.random_normal([2, 3], stddev=1, seed=1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev=1, seed=1))
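'''
Using None for one dimension of the shape makes it easy to use different batch sizes.
Training requires splitting the data into fairly small batches, but testing can use all
the data at once. This is convenient when the dataset is small; with a large dataset,
putting a large amount of data into one batch may cause an out-of-memory error.
'''
# Inputs and outputs of the neural network
x = tf.placeholder(tf.float32, shape=(None, 2), name='x-input')
y_ = tf.placeholder(tf.float32, shape=(None, 1), name='y-input')  # ground-truth labels
# Forward pass of the neural network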
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)
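# Define the loss function and the backpropagation algorithm
y = tf.sigmoid(y)
# Binary cross-entropy: the second term is weighted by the label (1 - y_), not the prediction
cross_entropy = -tf.reduce_mean(
    y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0)) +
    (1 - y_) * tf.log(tf.clip_by_value(1 - y, 1e-10, 1.0)))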
learning_rate = 0.001
train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)
# Generate a simulated dataset from random numbers
rdm = np.random.RandomState(seed=1)
dataset_size = 128
X = rdm.rand(dataset_size, 2)
Y = [[int(x1 + x2 < 1)] for (x1, x2) in X]
# Create a session to run the TensorFlow program
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    # Initialize the variables
    sess.run(init_op)
    print("Network parameters before training")
    print(sess.run(w1))
    print(sess.run(w2))

    # Number of training steps
    STEPS = 5000
    for i in range(STEPS):
        # Select batch_size samples for each training step
        start = (i * batch_size) % dataset_size
        end = min(start + batch_size, dataset_size)

        # Train the network on the selected samples and update the parameters
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
        if i % 1000 == 0:
            # Periodically compute and print the cross-entropy on the whole dataset
            total_cross_entropy = sess.run(
                cross_entropy, feed_dict={
                    x: X,
                    y_: Y
                })
            print("After %d training steps, cross entropy on all data is %g" %
                  (i, total_cross_entropy))
    print("Network parameters after training")
    print(sess.run(w1))
    print(sess.run(w2))
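'''
1. Define the network structure and the output of the forward pass
2. Define the loss function and choose the backpropagation optimization algorithm
3. Create a session and repeatedly run the backpropagation optimization algorithm on the training data
'''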
