diff --git a/documents/技术文档.md b/documents/技术文档.md
new file mode 100644
index 0000000..ce27c6b
--- /dev/null
+++ b/documents/技术文档.md
@@ -0,0 +1,1407 @@
# Entry Category: Other

# Densely Connected Networks (DenseNet)

ResNet significantly changed the view of how to parametrize the functions in deep networks.
*DenseNet* (densely connected network) `[Huang.Liu.Van-Der-Maaten.ea.2017]` is to some extent the logical extension of ResNet. Let us first look at it from a mathematical perspective.


## From ResNet to DenseNet

Recall the Taylor expansion of an arbitrary function, which decomposes the function into terms of increasingly higher order. Near $x = 0$,

$$f(x) = f(0) + f'(0) x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \ldots.$$

Similarly, ResNet decomposes a function into

$$f(\mathbf{x}) = \mathbf{x} + g(\mathbf{x}).$$

That is, ResNet splits $f$ into a simple linear term and a more complex nonlinear term.
What if we want to go one step further and capture information in $f$ beyond two terms?
One solution is DenseNet.

![Key difference between ResNet (left) and DenseNet (right) in cross-layer connections: addition versus concatenation.](https://zh-v2.d2l.ai/_images/densenet-block.svg)

As shown above, the key difference between ResNet and DenseNet is that DenseNet outputs are *concatenated* (denoted by $[,]$ in the figure) rather than added as in ResNet.
As a result, after applying an increasingly complex sequence of functions, we map $\mathbf{x}$ to its expansion:

$$\mathbf{x} \to \left[
\mathbf{x},
f_1(\mathbf{x}),
f_2([\mathbf{x}, f_1(\mathbf{x})]), f_3([\mathbf{x}, f_1(\mathbf{x}), f_2([\mathbf{x}, f_1(\mathbf{x})])]), \ldots\right].$$

In the end, all these features are combined in a multilayer perceptron to reduce the number of features again.
The implementation is quite simple: instead of adding the terms, we concatenate them.
The name DenseNet comes from the "dense connections" between variables: the last layer is densely connected to all of the layers that precede it, as shown below.

![Dense connections](https://zh-v2.d2l.ai/_images/densenet.svg)

A dense network consists of two main components: *dense blocks* and *transition layers*.
The former define how inputs and outputs are concatenated, while the latter control the number of channels so that it does not grow too large.


## Dense Blocks

DenseNet uses the modified "batch normalization, activation, and convolution" structure of ResNet.
We first implement this structure.


```python
import tensorflow as tf
import tensorlayer as tl

# Allow GPU memory to grow on demand instead of pre-allocating all of it.
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)
```

    Using TensorFlow backend.



```python
class BottleNeck(tl.layers.Module):
    def __init__(self, growth_rate, drop_rate):
        super(BottleNeck, self).__init__()
        self.bn1 = tl.layers.BatchNorm()
        self.conv1 = tl.layers.Conv2d(n_filter=4 * growth_rate,
                                      filter_size=(1, 1),
                                      strides=(1, 1),
                                      padding="SAME")
        self.bn2 = tl.layers.BatchNorm()
        self.conv2 = tl.layers.Conv2d(n_filter=growth_rate,
                                      filter_size=(3, 3),
                                      strides=(1, 1),
                                      padding="SAME")
        self.dropout = tl.layers.Dropout(keep=drop_rate)

        self.listLayers = [self.bn1,
                           tl.layers.PRelu(channel_shared=True),
                           self.conv1,
                           self.bn2,
                           tl.layers.PRelu(channel_shared=True),
                           self.conv2,
                           self.dropout]

    def forward(self, x):
        y = x
        for layer in self.listLayers:
            y = layer(y)
        # Dense connection: concatenate the input with the new features
        # along the channel axis instead of adding them.
        y = tf.keras.layers.concatenate([x, y], axis=-1)
        return y
```

A *dense block* consists of multiple convolution blocks, each using the same number of output channels.
In the forward pass, however, we concatenate the input and output of each convolution block along the channel dimension.



```python
class DenseBlock(tl.layers.Module):
    def __init__(self, num_layers, growth_rate, drop_rate=0.5):
        super(DenseBlock, self).__init__()
        self.num_layers = num_layers
        self.growth_rate = growth_rate
        self.drop_rate = drop_rate
        self.listLayers = []
        for _ in range(num_layers):
            self.listLayers.append(BottleNeck(growth_rate=self.growth_rate, drop_rate=self.drop_rate))

    def forward(self, x):
        for layer in self.listLayers:
            x = layer(x)
        return x
```

In the following example, we define a `DenseBlock` with 2 convolution blocks of 10 output channels each.
Given an input with 3 channels, we obtain an output with $3 + 2 \times 10 = 23$ channels.
The number of channels of each convolution block controls how fast the number of output channels grows relative to the number of input channels, which is why it is also called the *growth rate*.



```python
blk = DenseBlock(2, 10)
X = tf.random.uniform((4, 8, 8, 3))
Y = blk(X)
Y.shape
```

    [TL] BatchNorm batchnorm_9: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_9: n_filter: 40 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation
    [TL] BatchNorm batchnorm_10: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_10: n_filter: 10 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation
    [TL] Dropout dropout_3: keep: 0.500000
    [TL] PRelu prelu_9: channel_shared: True
    [TL] PRelu prelu_10: channel_shared: True
    [TL] BatchNorm batchnorm_11: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_11: n_filter: 40 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation
    [TL] BatchNorm batchnorm_12: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_12: n_filter: 10 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation
    [TL] Dropout dropout_4: keep: 0.500000
    [TL] PRelu prelu_11: channel_shared: True
    [TL] PRelu prelu_12: channel_shared: True

    TensorShape([4, 8, 8, 23])
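To make the channel bookkeeping explicit, here is a minimal sketch of the dense connectivity using plain `tf.concat`. The helper `fake_conv_block` is only a stand-in for the BN-PReLU-Conv stack above (it just produces `growth_rate` new feature channels) and is not part of the model; the point is that each block appends its `growth_rate` channels to everything that came before, so 3 input channels and 2 blocks of growth rate 10 give 23 output channels.

```python
import tensorflow as tf

growth_rate = 10

def fake_conv_block(x, growth_rate):
    # Stand-in for BottleNeck: any op that keeps the spatial size and
    # emits `growth_rate` feature channels would do here.
    return tf.zeros(tf.concat([tf.shape(x)[:-1], [growth_rate]], axis=0))

x = tf.random.uniform((4, 8, 8, 3))
for _ in range(2):                       # two conv blocks, as in DenseBlock(2, 10)
    y = fake_conv_block(x, growth_rate)  # new features with growth_rate channels
    x = tf.concat([x, y], axis=-1)       # dense connection: concatenate, don't add
print(x.shape)                           # (4, 8, 8, 23) = 3 + 2 * 10
```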
## Transition Layers

Since every dense block increases the number of channels, stacking too many of them would make the model overly complex.
A *transition layer* is used to control the complexity of the model.
It reduces the number of channels with a $1\times 1$ convolution and halves the height and width with a stride-2 pooling layer (the original DenseNet uses average pooling here; this implementation uses max pooling), further reducing the complexity of the model.



```python
class TransitionLayer(tl.layers.Module):
    def __init__(self, out_channels):
        super(TransitionLayer, self).__init__()
        self.bn = tl.layers.BatchNorm()
        # 1x1 convolution shrinks the number of channels.
        self.conv = tl.layers.Conv2d(n_filter=out_channels,
                                     filter_size=(1, 1),
                                     strides=(1, 1),
                                     padding="same")
        # Stride-2 pooling halves the height and width.
        self.pool = tl.layers.MaxPool2d(filter_size=(2, 2),
                                        strides=(2, 2),
                                        padding="SAME")

    def forward(self, inputs):
        x = self.bn(inputs)
        x = tl.relu(x)
        x = self.conv(x)
        x = self.pool(x)
        return x
```

We now apply a transition layer with 10 channels to the output of the dense block from the previous example.
This reduces the number of output channels to 10 and halves the height and width.



```python
blk = TransitionLayer(10)
blk(Y).shape
```

    [TL] BatchNorm batchnorm_13: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_13: n_filter: 10 filter_size: (1, 1) strides: (1, 1) pad: same act: No Activation
    [TL] MaxPool2d maxpool2d_1: filter_size: (2, 2) strides: (2, 2) padding: SAME

    TensorShape([4, 4, 4, 10])
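The same shape transformation can be checked without TensorLayer. The sketch below uses plain `tf.keras` layers purely for illustration (it is not the implementation above): a $1\times 1$ convolution shrinks 23 channels to 10, and a stride-2 pooling halves the 8×8 spatial size, reproducing the `TensorShape([4, 4, 4, 10])` seen above.

```python
import tensorflow as tf

x = tf.random.uniform((4, 8, 8, 23))  # output shape of the dense block above
conv_1x1 = tf.keras.layers.Conv2D(filters=10, kernel_size=1)               # 23 -> 10 channels
pool = tf.keras.layers.MaxPool2D(pool_size=2, strides=2, padding="same")   # 8x8 -> 4x4

y = pool(conv_1x1(x))
print(y.shape)  # (4, 4, 4, 10)
```

Swapping `MaxPool2D` for `AveragePooling2D` would match the pooling used in the original DenseNet paper.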
## The DenseNet Model

We now construct the DenseNet model. DenseNet first uses the same single convolutional layer and max pooling layer as ResNet.
Then, analogous to the 4 residual stages of ResNet, DenseNet uses 4 dense blocks.
As with ResNet, we can choose how many convolution blocks each dense block uses; setting this to 4 with a growth rate of 32 (so that each dense block adds 128 channels) keeps the model comparable to the ResNet-18 of Section 7.6.
The implementation below keeps the number of layers per block (`block_layers`), the growth rate, and the compression rate of the transition layers as constructor arguments, so that the standard DenseNet variants can be built from the same class.

Between the blocks, where ResNet reduces the height and width with residual blocks of stride 2, DenseNet uses transition layers to halve the height and width and compress the number of channels.
Finally, a global average pooling layer and a fully connected layer produce the output.


```python
class DenseNet(tl.layers.Module):
    def __init__(self, num_init_features, growth_rate, block_layers, compression_rate, drop_rate):
        super(DenseNet, self).__init__()
        self.conv = tl.layers.Conv2d(n_filter=num_init_features,
                                     filter_size=(7, 7),
                                     strides=(2, 2),
                                     padding="SAME")
        self.bn = tl.layers.BatchNorm()
        self.pool = tl.layers.MaxPool2d(filter_size=(3, 3),
                                        strides=(2, 2),
                                        padding="SAME")
        self.num_channels = num_init_features
        self.dense_block_1 = DenseBlock(num_layers=block_layers[0], growth_rate=growth_rate, drop_rate=drop_rate)
        self.num_channels += growth_rate * block_layers[0]          # channels after the dense block
        self.num_channels = compression_rate * self.num_channels    # compression in the transition layer
        self.transition_1 = TransitionLayer(out_channels=int(self.num_channels))
        self.dense_block_2 = DenseBlock(num_layers=block_layers[1], growth_rate=growth_rate, drop_rate=drop_rate)
        self.num_channels += growth_rate * block_layers[1]
        self.num_channels = compression_rate * self.num_channels
        self.transition_2 = TransitionLayer(out_channels=int(self.num_channels))
        self.dense_block_3 = DenseBlock(num_layers=block_layers[2], growth_rate=growth_rate, drop_rate=drop_rate)
        self.num_channels += growth_rate * block_layers[2]
        self.num_channels = compression_rate * self.num_channels
        self.transition_3 = TransitionLayer(out_channels=int(self.num_channels))
        self.dense_block_4 = DenseBlock(num_layers=block_layers[3], growth_rate=growth_rate, drop_rate=drop_rate)

        self.avgpool = tl.layers.GlobalMeanPool2d()
        # The head outputs logits; softmax is applied inside the cross-entropy loss during training.
        self.fc = tl.layers.Dense(n_units=10)

    def forward(self, inputs):
        x = self.conv(inputs)
        x = self.bn(x)
        x = tl.relu(x)
        x = self.pool(x)

        x = self.dense_block_1(x)
        x = self.transition_1(x)
        x = self.dense_block_2(x)
        x = self.transition_2(x)
        x = self.dense_block_3(x)
        x = self.transition_3(x)
        x = self.dense_block_4(x)

        x = self.avgpool(x)
        x = self.fc(x)

        return x
```

# The DenseNet-100 Model
Built on 3 densely connected blocks.


```python
class DenseNet_100(tl.layers.Module):
    def __init__(self, num_init_features, growth_rate, block_layers, compression_rate, drop_rate):
        super(DenseNet_100, self).__init__()
        self.conv = tl.layers.Conv2d(n_filter=num_init_features,
                                     filter_size=(7, 7),
                                     strides=(2, 2),
                                     padding="SAME")
        self.bn = tl.layers.BatchNorm()
        self.pool = tl.layers.MaxPool2d(filter_size=(3, 3),
                                        strides=(2, 2),
                                        padding="SAME")
        self.num_channels = num_init_features
        self.dense_block_1 = DenseBlock(num_layers=block_layers[0], growth_rate=growth_rate, drop_rate=drop_rate)
        self.num_channels += growth_rate * block_layers[0]
        self.num_channels = compression_rate * self.num_channels
        self.transition_1 = TransitionLayer(out_channels=int(self.num_channels))
        self.dense_block_2 = DenseBlock(num_layers=block_layers[1], growth_rate=growth_rate, drop_rate=drop_rate)
        self.num_channels += growth_rate * block_layers[1]
        self.num_channels = compression_rate * self.num_channels
        self.transition_2 = TransitionLayer(out_channels=int(self.num_channels))
        self.dense_block_3 = DenseBlock(num_layers=block_layers[2], growth_rate=growth_rate, drop_rate=drop_rate)
        self.num_channels += growth_rate * block_layers[2]
        self.num_channels = compression_rate * self.num_channels
        self.transition_3 = TransitionLayer(out_channels=int(self.num_channels))

        self.avgpool = tl.layers.GlobalMeanPool2d()
        # The head outputs logits; softmax is applied inside the cross-entropy loss during training.
        self.fc = tl.layers.Dense(n_units=10)

    def forward(self, inputs):
        x = self.conv(inputs)
        x = self.bn(x)
        x = tl.relu(x)
        x = self.pool(x)

        x = self.dense_block_1(x)
        x = self.transition_1(x)
        x = self.dense_block_2(x)
        x = self.transition_2(x)
        x = self.dense_block_3(x)
        x = self.transition_3(x)

        x = self.avgpool(x)
        x = self.fc(x)

        return x
```

# A Unified Wrapper for DenseNet-121, 169, 201, 264, and 100


```python
def densenet(x):
    if x == 'densenet-121':
        return DenseNet(num_init_features=64, growth_rate=32, block_layers=[6, 12, 24, 16], compression_rate=0.5,
                        drop_rate=0.5)
    elif x == 'densenet-169':
        return DenseNet(num_init_features=64, growth_rate=32, block_layers=[6, 12, 32, 32], compression_rate=0.5,
                        drop_rate=0.5)
    elif x == 'densenet-201':
        return DenseNet(num_init_features=64, growth_rate=32, block_layers=[6, 12, 48, 32], compression_rate=0.5,
                        drop_rate=0.5)
    elif x == 'densenet-264':
        return DenseNet(num_init_features=64, growth_rate=32, block_layers=[6, 12, 64, 48], compression_rate=0.5,
                        drop_rate=0.5)
    elif x == 'densenet-100':
        return DenseNet_100(num_init_features=64, growth_rate=12, block_layers=[16, 16, 16], compression_rate=0.5,
                            drop_rate=0.5)
```

# Training DenseNet-100 on the CIFAR-10 Dataset
The training code provided by the original project is used directly.
## Training on my own laptop was too slow, so the kernel was interrupted early


```python
import time
import multiprocessing
import tensorflow as tf
import os
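# The TL_BACKEND environment variable is set below *before* `import tensorlayer`,
# so that TensorLayer picks up the intended backend (TensorFlow here) when it is imported.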
os.environ['TL_BACKEND'] = 'tensorflow'

import tensorlayer as tl
from DenseNet.DenseNet_tensorlayer import densenet

tl.logging.set_verbosity(tl.logging.DEBUG)
X_train, y_train, X_test, y_test = tl.files.load_cifar10_dataset(shape=(-1, 32, 32, 3), plotable=False)

# get the network
net = densenet("densenet-100")

# training settings
batch_size = 128
n_epoch = 500
learning_rate = 0.0001
print_freq = 5
n_step_epoch = int(len(y_train) / batch_size)
n_step = n_epoch * n_step_epoch
shuffle_buffer_size = 128

train_weights = net.trainable_weights
optimizer = tl.optimizers.Adam(learning_rate)
metrics = tl.metric.Accuracy()


def generator_train():
    inputs = X_train
    targets = y_train
    if len(inputs) != len(targets):
        raise AssertionError("The length of inputs and targets should be equal")
    for _input, _target in zip(inputs, targets):
        # yield _input.encode('utf-8'), _target.encode('utf-8')
        yield _input, _target


def generator_test():
    inputs = X_test
    targets = y_test
    if len(inputs) != len(targets):
        raise AssertionError("The length of inputs and targets should be equal")
    for _input, _target in zip(inputs, targets):
        # yield _input.encode('utf-8'), _target.encode('utf-8')
        yield _input, _target


def _map_fn_train(img, target):
    # 1. Randomly crop a [height, width] section of the image.
    img = tf.image.random_crop(img, [24, 24, 3])
    # 2. Randomly flip the image horizontally.
    img = tf.image.random_flip_left_right(img)
    # 3. Randomly change brightness.
    img = tf.image.random_brightness(img, max_delta=63)
    # 4. Randomly change contrast.
    img = tf.image.random_contrast(img, lower=0.2, upper=1.8)
    # 5. Subtract off the mean and divide by the variance of the pixels.
    img = tf.image.per_image_standardization(img)
    target = tf.reshape(target, ())
    return img, target


def _map_fn_test(img, target):
    # 1. Crop the central [height, width] of the image.
    img = tf.image.resize_with_pad(img, 24, 24)
    # 2. Subtract off the mean and divide by the variance of the pixels.
    img = tf.image.per_image_standardization(img)
    img = tf.reshape(img, (24, 24, 3))
    target = tf.reshape(target, ())
    return img, target


# dataset API and augmentation
train_ds = tf.data.Dataset.from_generator(
    generator_train, output_types=(tf.float32, tf.int32)
)  # , output_shapes=((24, 24, 3), (1)))
train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())
# train_ds = train_ds.repeat(n_epoch)
train_ds = train_ds.shuffle(shuffle_buffer_size)
train_ds = train_ds.prefetch(buffer_size=4096)
train_ds = train_ds.batch(batch_size)
# value = train_ds.make_one_shot_iterator().get_next()

test_ds = tf.data.Dataset.from_generator(
    generator_test, output_types=(tf.float32, tf.int32)
)  # , output_shapes=((24, 24, 3), (1)))
# test_ds = test_ds.shuffle(shuffle_buffer_size)
test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())
# test_ds = test_ds.repeat(n_epoch)
test_ds = test_ds.prefetch(buffer_size=4096)
test_ds = test_ds.batch(batch_size)
# value_test = test_ds.make_one_shot_iterator().get_next()


class WithLoss(tl.layers.Module):

    def __init__(self, net, loss_fn):
        super(WithLoss, self).__init__()
        self._net = net
        self._loss_fn = loss_fn

    def forward(self, data, label):
        out = self._net(data)
        loss = self._loss_fn(out, label)
        return loss


net_with_loss = WithLoss(net, loss_fn=tl.cost.softmax_cross_entropy_with_logits)
net_with_train = tl.models.TrainOneStep(net_with_loss, optimizer, train_weights)

for epoch in range(n_epoch):
    start_time = time.time()
    net.set_train()
    train_loss, train_acc, n_iter = 0, 0, 0
    for X_batch, y_batch in train_ds:

        X_batch = tl.ops.convert_to_tensor(X_batch.numpy(), dtype=tl.float32)
        y_batch = tl.ops.convert_to_tensor(y_batch.numpy(), dtype=tl.int64)

        _loss_ce = net_with_train(X_batch, y_batch)
        train_loss += _loss_ce

        n_iter += 1
        _logits = net(X_batch)
        metrics.update(_logits, y_batch)
        train_acc += metrics.result()
        metrics.reset()
        # Progress is printed after every batch of the current epoch.
        print("Epoch {} of {} took {}".format(epoch + 1, n_epoch, time.time() - start_time))
        print("   train loss: {}".format(train_loss / n_iter))
        print("   train acc: {}".format(train_acc / n_iter))
```

    [TL] Load or Download cifar10 > data\cifar10
    [TL] Conv2d conv2d_134: n_filter: 64 filter_size: (7, 7) strides: (2, 2) pad: SAME act: No Activation
    [TL] BatchNorm batchnorm_134: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] MaxPool2d maxpool2d_6: filter_size: (3, 3) strides: (2, 2) padding: SAME
    [TL] BatchNorm batchnorm_135: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_135: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation
    [TL] BatchNorm batchnorm_136: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_136: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation
    [TL] Dropout dropout_63: keep: 0.500000
    [TL] PRelu prelu_129: channel_shared: True
    [TL] PRelu prelu_130: channel_shared: True
    [TL] BatchNorm batchnorm_137: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_137: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation
    [TL] BatchNorm batchnorm_138: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
    [TL] Conv2d conv2d_138: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation
    [TL] Dropout dropout_64: keep: 0.500000
    [TL]
PRelu prelu_131: channel_shared: True + [TL] PRelu prelu_132: channel_shared: True + [TL] BatchNorm batchnorm_139: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_139: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_140: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_140: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_65: keep: 0.500000 + [TL] PRelu prelu_133: channel_shared: True + [TL] PRelu prelu_134: channel_shared: True + [TL] BatchNorm batchnorm_141: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_141: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_142: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_142: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_66: keep: 0.500000 + [TL] PRelu prelu_135: channel_shared: True + [TL] PRelu prelu_136: channel_shared: True + [TL] BatchNorm batchnorm_143: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_143: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_144: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_144: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_67: keep: 0.500000 + [TL] PRelu prelu_137: channel_shared: True + [TL] PRelu prelu_138: channel_shared: True + [TL] BatchNorm batchnorm_145: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_145: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_146: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_146: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_68: keep: 0.500000 + [TL] PRelu prelu_139: channel_shared: True + [TL] PRelu prelu_140: channel_shared: True + [TL] BatchNorm batchnorm_147: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_147: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_148: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_148: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_69: keep: 0.500000 + [TL] PRelu prelu_141: channel_shared: True + [TL] PRelu prelu_142: channel_shared: True + [TL] BatchNorm batchnorm_149: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_149: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_150: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_150: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_70: keep: 0.500000 + [TL] PRelu prelu_143: channel_shared: True + [TL] PRelu prelu_144: channel_shared: True + [TL] BatchNorm batchnorm_151: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_151: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm 
batchnorm_152: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_152: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_71: keep: 0.500000 + [TL] PRelu prelu_145: channel_shared: True + [TL] PRelu prelu_146: channel_shared: True + [TL] BatchNorm batchnorm_153: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_153: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_154: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_154: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_72: keep: 0.500000 + [TL] PRelu prelu_147: channel_shared: True + [TL] PRelu prelu_148: channel_shared: True + [TL] BatchNorm batchnorm_155: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_155: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_156: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_156: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_73: keep: 0.500000 + [TL] PRelu prelu_149: channel_shared: True + [TL] PRelu prelu_150: channel_shared: True + [TL] BatchNorm batchnorm_157: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_157: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_158: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_158: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_74: keep: 0.500000 + [TL] PRelu prelu_151: channel_shared: True + [TL] PRelu prelu_152: channel_shared: True + [TL] BatchNorm batchnorm_159: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_159: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_160: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_160: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_75: keep: 0.500000 + [TL] PRelu prelu_153: channel_shared: True + [TL] PRelu prelu_154: channel_shared: True + [TL] BatchNorm batchnorm_161: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_161: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_162: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_162: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_76: keep: 0.500000 + [TL] PRelu prelu_155: channel_shared: True + [TL] PRelu prelu_156: channel_shared: True + [TL] BatchNorm batchnorm_163: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_163: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_164: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_164: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_77: keep: 0.500000 + [TL] PRelu prelu_157: channel_shared: True + [TL] PRelu prelu_158: channel_shared: 
True + [TL] BatchNorm batchnorm_165: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_165: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_166: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_166: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_78: keep: 0.500000 + [TL] PRelu prelu_159: channel_shared: True + [TL] PRelu prelu_160: channel_shared: True + [TL] BatchNorm batchnorm_167: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_167: n_filter: 128 filter_size: (1, 1) strides: (1, 1) pad: same act: No Activation + [TL] MaxPool2d maxpool2d_7: filter_size: (2, 2) strides: (2, 2) padding: SAME + [TL] BatchNorm batchnorm_168: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_168: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_169: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_169: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_79: keep: 0.500000 + [TL] PRelu prelu_161: channel_shared: True + [TL] PRelu prelu_162: channel_shared: True + [TL] BatchNorm batchnorm_170: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_170: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_171: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_171: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_80: keep: 0.500000 + [TL] PRelu prelu_163: channel_shared: True + [TL] PRelu prelu_164: channel_shared: True + [TL] BatchNorm batchnorm_172: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_172: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_173: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_173: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_81: keep: 0.500000 + [TL] PRelu prelu_165: channel_shared: True + [TL] PRelu prelu_166: channel_shared: True + [TL] BatchNorm batchnorm_174: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_174: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_175: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_175: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_82: keep: 0.500000 + [TL] PRelu prelu_167: channel_shared: True + [TL] PRelu prelu_168: channel_shared: True + [TL] BatchNorm batchnorm_176: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_176: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_177: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_177: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_83: keep: 0.500000 + [TL] PRelu prelu_169: channel_shared: True + [TL] PRelu prelu_170: channel_shared: True + [TL] BatchNorm 
batchnorm_178: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_178: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_179: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_179: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_84: keep: 0.500000 + [TL] PRelu prelu_171: channel_shared: True + [TL] PRelu prelu_172: channel_shared: True + [TL] BatchNorm batchnorm_180: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_180: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_181: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_181: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_85: keep: 0.500000 + [TL] PRelu prelu_173: channel_shared: True + [TL] PRelu prelu_174: channel_shared: True + [TL] BatchNorm batchnorm_182: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_182: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_183: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_183: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_86: keep: 0.500000 + [TL] PRelu prelu_175: channel_shared: True + [TL] PRelu prelu_176: channel_shared: True + [TL] BatchNorm batchnorm_184: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_184: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_185: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_185: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_87: keep: 0.500000 + [TL] PRelu prelu_177: channel_shared: True + [TL] PRelu prelu_178: channel_shared: True + [TL] BatchNorm batchnorm_186: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_186: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_187: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_187: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_88: keep: 0.500000 + [TL] PRelu prelu_179: channel_shared: True + [TL] PRelu prelu_180: channel_shared: True + [TL] BatchNorm batchnorm_188: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_188: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_189: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_189: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_89: keep: 0.500000 + [TL] PRelu prelu_181: channel_shared: True + [TL] PRelu prelu_182: channel_shared: True + [TL] BatchNorm batchnorm_190: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_190: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_191: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_191: 
n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_90: keep: 0.500000 + [TL] PRelu prelu_183: channel_shared: True + [TL] PRelu prelu_184: channel_shared: True + [TL] BatchNorm batchnorm_192: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_192: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_193: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_193: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_91: keep: 0.500000 + [TL] PRelu prelu_185: channel_shared: True + [TL] PRelu prelu_186: channel_shared: True + [TL] BatchNorm batchnorm_194: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_194: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_195: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_195: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_92: keep: 0.500000 + [TL] PRelu prelu_187: channel_shared: True + [TL] PRelu prelu_188: channel_shared: True + [TL] BatchNorm batchnorm_196: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_196: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_197: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_197: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_93: keep: 0.500000 + [TL] PRelu prelu_189: channel_shared: True + [TL] PRelu prelu_190: channel_shared: True + [TL] BatchNorm batchnorm_198: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_198: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_199: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_199: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_94: keep: 0.500000 + [TL] PRelu prelu_191: channel_shared: True + [TL] PRelu prelu_192: channel_shared: True + [TL] BatchNorm batchnorm_200: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_200: n_filter: 160 filter_size: (1, 1) strides: (1, 1) pad: same act: No Activation + [TL] MaxPool2d maxpool2d_8: filter_size: (2, 2) strides: (2, 2) padding: SAME + [TL] BatchNorm batchnorm_201: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_201: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_202: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_202: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_95: keep: 0.500000 + [TL] PRelu prelu_193: channel_shared: True + [TL] PRelu prelu_194: channel_shared: True + [TL] BatchNorm batchnorm_203: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_203: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_204: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_204: n_filter: 12 filter_size: (3, 
3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_96: keep: 0.500000 + [TL] PRelu prelu_195: channel_shared: True + [TL] PRelu prelu_196: channel_shared: True + [TL] BatchNorm batchnorm_205: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_205: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_206: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_206: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_97: keep: 0.500000 + [TL] PRelu prelu_197: channel_shared: True + [TL] PRelu prelu_198: channel_shared: True + [TL] BatchNorm batchnorm_207: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_207: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_208: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_208: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_98: keep: 0.500000 + [TL] PRelu prelu_199: channel_shared: True + [TL] PRelu prelu_200: channel_shared: True + [TL] BatchNorm batchnorm_209: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_209: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_210: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_210: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_99: keep: 0.500000 + [TL] PRelu prelu_201: channel_shared: True + [TL] PRelu prelu_202: channel_shared: True + [TL] BatchNorm batchnorm_211: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_211: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_212: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_212: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_100: keep: 0.500000 + [TL] PRelu prelu_203: channel_shared: True + [TL] PRelu prelu_204: channel_shared: True + [TL] BatchNorm batchnorm_213: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_213: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_214: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_214: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_101: keep: 0.500000 + [TL] PRelu prelu_205: channel_shared: True + [TL] PRelu prelu_206: channel_shared: True + [TL] BatchNorm batchnorm_215: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_215: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_216: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_216: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_102: keep: 0.500000 + [TL] PRelu prelu_207: channel_shared: True + [TL] PRelu prelu_208: channel_shared: True + [TL] BatchNorm batchnorm_217: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_217: 
n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_218: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_218: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_103: keep: 0.500000 + [TL] PRelu prelu_209: channel_shared: True + [TL] PRelu prelu_210: channel_shared: True + [TL] BatchNorm batchnorm_219: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_219: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_220: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_220: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_104: keep: 0.500000 + [TL] PRelu prelu_211: channel_shared: True + [TL] PRelu prelu_212: channel_shared: True + [TL] BatchNorm batchnorm_221: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_221: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_222: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_222: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_105: keep: 0.500000 + [TL] PRelu prelu_213: channel_shared: True + [TL] PRelu prelu_214: channel_shared: True + [TL] BatchNorm batchnorm_223: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_223: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_224: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_224: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_106: keep: 0.500000 + [TL] PRelu prelu_215: channel_shared: True + [TL] PRelu prelu_216: channel_shared: True + [TL] BatchNorm batchnorm_225: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_225: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_226: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_226: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_107: keep: 0.500000 + [TL] PRelu prelu_217: channel_shared: True + [TL] PRelu prelu_218: channel_shared: True + [TL] BatchNorm batchnorm_227: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_227: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_228: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_228: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_108: keep: 0.500000 + [TL] PRelu prelu_219: channel_shared: True + [TL] PRelu prelu_220: channel_shared: True + [TL] BatchNorm batchnorm_229: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_229: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_230: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_230: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout 
dropout_109: keep: 0.500000 + [TL] PRelu prelu_221: channel_shared: True + [TL] PRelu prelu_222: channel_shared: True + [TL] BatchNorm batchnorm_231: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_231: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation + [TL] BatchNorm batchnorm_232: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_232: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation + [TL] Dropout dropout_110: keep: 0.500000 + [TL] PRelu prelu_223: channel_shared: True + [TL] PRelu prelu_224: channel_shared: True + [TL] BatchNorm batchnorm_233: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True + [TL] Conv2d conv2d_233: n_filter: 176 filter_size: (1, 1) strides: (1, 1) pad: same act: No Activation + [TL] MaxPool2d maxpool2d_9: filter_size: (2, 2) strides: (2, 2) padding: SAME + [TL] GlobalMeanPool2d globalmeanpool2d_2 + [TL] Dense dense_2: 10 EagerTensor + Epoch 1 of 500 took 2.472132682800293 + train loss: 2.3026280403137207 + train acc: 0.0859375 + Epoch 1 of 500 took 3.1664297580718994 + train loss: 2.302629232406616 + train acc: 0.06640625 + Epoch 1 of 500 took 3.8870487213134766 + train loss: 2.302507162094116 + train acc: 0.0755208358168602 + Epoch 1 of 500 took 4.617677927017212 + train loss: 2.3025760650634766 + train acc: 0.087890625 + Epoch 1 of 500 took 5.326868057250977 + train loss: 2.3025622367858887 + train acc: 0.09218750149011612 + Epoch 1 of 500 took 6.068540573120117 + train loss: 2.3026044368743896 + train acc: 0.08984375 + Epoch 1 of 500 took 6.782266139984131 + train loss: 2.3025448322296143 + train acc: 0.0881696417927742 + Epoch 1 of 500 took 7.485439300537109 + train loss: 2.3025074005126953 + train acc: 0.0869140625 + Epoch 1 of 500 took 8.213810443878174 + train loss: 2.302518844604492 + train acc: 0.0902777761220932 + Epoch 1 of 500 took 8.954435586929321 + train loss: 2.302535057067871 + train acc: 0.08906249701976776 + Epoch 1 of 500 took 9.724130630493164 + train loss: 2.302517890930176 + train acc: 0.09375 + Epoch 1 of 500 took 10.50143551826477 + train loss: 2.302520751953125 + train acc: 0.095703125 + Epoch 1 of 500 took 11.254729986190796 + train loss: 2.302520751953125 + train acc: 0.09555288404226303 + Epoch 1 of 500 took 12.040328741073608 + train loss: 2.302544355392456 + train acc: 0.0959821417927742 + Epoch 1 of 500 took 12.847611427307129 + train loss: 2.302537679672241 + train acc: 0.09687499701976776 + Epoch 1 of 500 took 13.565277814865112 + train loss: 2.302561044692993 + train acc: 0.1005859375 + Epoch 1 of 500 took 14.284465312957764 + train loss: 2.302546977996826 + train acc: 0.0992647036910057 + Epoch 1 of 500 took 14.988042116165161 + train loss: 2.30257511138916 + train acc: 0.1002604141831398 + Epoch 1 of 500 took 15.723733901977539 + train loss: 2.3025801181793213 + train acc: 0.10115131735801697 + Epoch 1 of 500 took 16.431028127670288 + train loss: 2.3025593757629395 + train acc: 0.10234375298023224 + Epoch 1 of 500 took 17.1357638835907 + train loss: 2.302570343017578 + train acc: 0.103050597012043 + Epoch 1 of 500 took 17.839925050735474 + train loss: 2.3025553226470947 + train acc: 0.10475852340459824 + Epoch 1 of 500 took 18.530513525009155 + train loss: 2.30254864692688 + train acc: 0.10529891401529312 + Epoch 1 of 500 took 19.220801830291748 + train loss: 2.302539825439453 + train acc: 0.1048177108168602 + Epoch 1 of 500 took 19.934443950653076 + train loss: 
2.30253267288208 + train acc: 0.10343749821186066 + Epoch 1 of 500 took 20.631637573242188 + train loss: 2.3025319576263428 + train acc: 0.10306490212678909 + Epoch 1 of 500 took 21.35742998123169 + train loss: 2.302518129348755 + train acc: 0.10387731343507767 + Epoch 1 of 500 took 22.061012983322144 + train loss: 2.302506923675537 + train acc: 0.1040736585855484 + Epoch 1 of 500 took 22.77148962020874 + train loss: 2.302518129348755 + train acc: 0.10317888110876083 + Epoch 1 of 500 took 23.499772548675537 + train loss: 2.302516222000122 + train acc: 0.10260416567325592 + Epoch 1 of 500 took 24.261035680770874 + train loss: 2.302506446838379 + train acc: 0.1020665317773819 + Epoch 1 of 500 took 25.001662254333496 + train loss: 2.3024935722351074 + train acc: 0.10302734375 + Epoch 1 of 500 took 25.749852180480957 + train loss: 2.302504539489746 + train acc: 0.10321969538927078 + Epoch 1 of 500 took 26.513017654418945 + train loss: 2.302511215209961 + train acc: 0.10363051295280457 + Epoch 1 of 500 took 27.220292568206787 + train loss: 2.3025102615356445 + train acc: 0.10312499850988388 + Epoch 1 of 500 took 28.074881076812744 + train loss: 2.302503824234009 + train acc: 0.1028645858168602 + Epoch 1 of 500 took 28.793709993362427 + train loss: 2.302511215209961 + train acc: 0.1034628376364708 + Epoch 1 of 500 took 29.56246781349182 + train loss: 2.302523136138916 + train acc: 0.10259046405553818 + Epoch 1 of 500 took 30.309444665908813 + train loss: 2.302518129348755 + train acc: 0.10316506773233414 + Epoch 1 of 500 took 31.03504705429077 + train loss: 2.3025131225585938 + train acc: 0.10332031548023224 + Epoch 1 of 500 took 31.74468207359314 + train loss: 2.3025143146514893 + train acc: 0.10251524299383163 + Epoch 1 of 500 took 32.44689440727234 + train loss: 2.3025083541870117 + train acc: 0.102492555975914 + Epoch 1 of 500 took 33.181196451187134 + train loss: 2.3025104999542236 + train acc: 0.10247092694044113 + Epoch 1 of 500 took 33.900819301605225 + train loss: 2.3025102615356445 + train acc: 0.1015625 + Epoch 1 of 500 took 34.656460762023926 + train loss: 2.302510976791382 + train acc: 0.10086805373430252 + Epoch 1 of 500 took 35.36271095275879 + train loss: 2.3025131225585938 + train acc: 0.10071331262588501 + Epoch 1 of 500 took 36.06831693649292 + train loss: 2.30250883102417 + train acc: 0.10073138028383255 + Epoch 1 of 500 took 36.776965618133545 + train loss: 2.3025097846984863 + train acc: 0.10009765625 + Epoch 1 of 500 took 37.47527098655701 + train loss: 2.302520990371704 + train acc: 0.09948979318141937 + Epoch 1 of 500 took 38.20156741142273 + train loss: 2.3025150299072266 + train acc: 0.10000000149011612 + Epoch 1 of 500 took 38.94017839431763 + train loss: 2.3025131225585938 + train acc: 0.10018382221460342 + Epoch 1 of 500 took 39.68779230117798 + train loss: 2.3025169372558594 + train acc: 0.10036057978868484 + Epoch 1 of 500 took 40.50718331336975 + train loss: 2.302532434463501 + train acc: 0.09979363530874252 + Epoch 1 of 500 took 41.23833727836609 + train loss: 2.302534818649292 + train acc: 0.0998263880610466 + Epoch 1 of 500 took 42.00189256668091 + train loss: 2.3025405406951904 + train acc: 0.09985795617103577 + Epoch 1 of 500 took 42.86620116233826 + train loss: 2.3025457859039307 + train acc: 0.1001674085855484 + Epoch 1 of 500 took 43.598840951919556 + train loss: 2.3025429248809814 + train acc: 0.10005482286214828 + Epoch 1 of 500 took 44.333006620407104 + train loss: 2.302553415298462 + train acc: 0.09940733015537262 + Epoch 1 of 500 took 
45.103665828704834 + train loss: 2.3025565147399902 + train acc: 0.09957627207040787 + Epoch 1 of 500 took 45.88186049461365 + train loss: 2.3025529384613037 + train acc: 0.10013020783662796 + Epoch 1 of 500 took 46.64212989807129 + train loss: 2.302560806274414 + train acc: 0.09964139014482498 + + + + --------------------------------------------------------------------------- + + KeyboardInterrupt Traceback (most recent call last) + + ~\AppData\Local\Temp/ipykernel_25040/640882711.py in + 120 y_batch = tl.ops.convert_to_tensor(y_batch.numpy(), dtype=tl.int64) + 121 + --> 122 _loss_ce = net_with_train(X_batch, y_batch) + 123 train_loss += _loss_ce + 124 + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\models\core.py in __call__(self, data, label) + 525 + 526 def __call__(self, data, label): + --> 527 loss = self.net_with_train(data, label) + 528 return loss + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\models\core.py in __call__(self, data, label) + 448 def __call__(self, data, label): + 449 with tf.GradientTape() as tape: + --> 450 loss = self.net_with_loss(data, label) + 451 grad = tape.gradient(loss, self.train_weights) + 452 self.optimzer.apply_gradients(zip(grad, self.train_weights)) + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\layers\core\core_tensorflow.py in __call__(self, inputs, *args, **kwargs) + 164 def __call__(self, inputs, *args, **kwargs): + 165 + --> 166 output = self.forward(inputs, *args, **kwargs) + 167 + 168 return output + + + ~\AppData\Local\Temp/ipykernel_25040/640882711.py in forward(self, data, label) + 103 + 104 def forward(self, data, label): + --> 105 out = self._net(data) + 106 loss = self._loss_fn(out, label) + 107 return loss + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\layers\core\core_tensorflow.py in __call__(self, inputs, *args, **kwargs) + 164 def __call__(self, inputs, *args, **kwargs): + 165 + --> 166 output = self.forward(inputs, *args, **kwargs) + 167 + 168 return output + + + D:\DeepLearning_tensorflow\DenseNet\DenseNet_tensorlayer.py in forward(self, inputs) + 156 x = self.pool(x) + 157 + --> 158 x = self.dense_block_1(x) + 159 x = self.transition_1(x) + 160 x = self.dense_block_2(x) + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\layers\core\core_tensorflow.py in __call__(self, inputs, *args, **kwargs) + 164 def __call__(self, inputs, *args, **kwargs): + 165 + --> 166 output = self.forward(inputs, *args, **kwargs) + 167 + 168 return output + + + D:\DeepLearning_tensorflow\DenseNet\DenseNet_tensorlayer.py in forward(self, x) + 48 def forward(self, x): + 49 for layer in self.listLayers: + ---> 50 x = layer(x) + 51 return x + 52 + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\layers\core\core_tensorflow.py in __call__(self, inputs, *args, **kwargs) + 164 def __call__(self, inputs, *args, **kwargs): + 165 + --> 166 output = self.forward(inputs, *args, **kwargs) + 167 + 168 return output + + + D:\DeepLearning_tensorflow\DenseNet\DenseNet_tensorlayer.py in forward(self, x) + 31 y = x + 32 for layer in self.listLayers: + ---> 33 y = layer(y) + 34 y = tf.keras.layers.concatenate([x, y], axis=-1) + 35 return y + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\layers\core\core_tensorflow.py in __call__(self, inputs, *args, **kwargs) + 164 def __call__(self, inputs, *args, **kwargs): + 165 + --> 166 output = self.forward(inputs, *args, **kwargs) + 167 + 168 return output + + + 
D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\layers\normalization.py in forward(self, inputs) + 193 moving_var=self.moving_var, num_features=self.num_features, data_format=self.data_format, is_train=False + 194 ) + --> 195 outputs = self.batchnorm(inputs=inputs) + 196 if self.act_init_flag: + 197 outputs = self.act(outputs) + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorlayer\backend\ops\tensorflow_nn.py in __call__(self, inputs) + 1580 self.moving_mean, mean, self.decay, zero_debias=False + 1581 ) + -> 1582 self.moving_var = moving_averages.assign_moving_average(self.moving_var, var, self.decay, zero_debias=False) + 1583 outputs = batch_normalization(inputs, mean, var, self.beta, self.gamma, self.epsilon, self.data_format) + 1584 else: + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\training\moving_averages.py in assign_moving_average(variable, value, decay, zero_debias, name) + 109 return update(strategy, v, value) + 110 + --> 111 return replica_context.merge_call(merge_fn, args=(variable, value)) + 112 else: + 113 strategy = distribution_strategy_context.get_cross_replica_context() + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py in merge_call(self, merge_fn, args, kwargs) + 2713 merge_fn = autograph.tf_convert( + 2714 merge_fn, autograph_ctx.control_status_ctx(), convert_by_default=False) + -> 2715 return self._merge_call(merge_fn, args, kwargs) + 2716 + 2717 def _merge_call(self, merge_fn, args, kwargs): + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py in _merge_call(self, merge_fn, args, kwargs) + 2720 distribution_strategy_context._CrossReplicaThreadMode(self._strategy)) # pylint: disable=protected-access + 2721 try: + -> 2722 return merge_fn(self._strategy, *args, **kwargs) + 2723 finally: + 2724 _pop_per_thread_mode() + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\autograph\impl\api.py in wrapper(*args, **kwargs) + 273 def wrapper(*args, **kwargs): + 274 with ag_ctx.ControlStatusCtx(status=ag_ctx.Status.UNSPECIFIED): + --> 275 return func(*args, **kwargs) + 276 + 277 if inspect.isfunction(func) or inspect.ismethod(func): + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\training\moving_averages.py in merge_fn(strategy, v, value) + 107 value = strategy.extended.reduce_to(ds_reduce_util.ReduceOp.MEAN, value, + 108 v) + --> 109 return update(strategy, v, value) + 110 + 111 return replica_context.merge_call(merge_fn, args=(variable, value)) + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\training\moving_averages.py in update(strategy, v, value) + 98 return _zero_debias(strategy, v, value, decay) + 99 else: + --> 100 return _update(strategy, v, update_fn, args=(value,)) + 101 + 102 replica_context = distribution_strategy_context.get_replica_context() + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\training\moving_averages.py in _update(strategy, var, update_fn, args) + 190 return update_fn(var, *args) + 191 else: + --> 192 return strategy.extended.update(var, update_fn, args) + 193 + 194 + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py in update(self, var, fn, args, kwargs, group) + 2298 fn, autograph_ctx.control_status_ctx(), convert_by_default=False) + 2299 with self._container_strategy().scope(): + -> 2300 return self._update(var, fn, 
args, kwargs, group) + 2301 + 2302 def _update(self, var, fn, args, kwargs, group): + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py in _update(self, var, fn, args, kwargs, group) + 2953 # The implementations of _update() and _update_non_slot() are identical + 2954 # except _update() passes `var` as the first argument to `fn()`. + -> 2955 return self._update_non_slot(var, fn, (var,) + tuple(args), kwargs, group) + 2956 + 2957 def _update_non_slot(self, colocate_with, fn, args, kwargs, should_group): + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py in _update_non_slot(self, colocate_with, fn, args, kwargs, should_group) + 2959 # once that value is used for something. + 2960 with UpdateContext(colocate_with): + -> 2961 result = fn(*args, **kwargs) + 2962 if should_group: + 2963 return result + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\autograph\impl\api.py in wrapper(*args, **kwargs) + 273 def wrapper(*args, **kwargs): + 274 with ag_ctx.ControlStatusCtx(status=ag_ctx.Status.UNSPECIFIED): + --> 275 return func(*args, **kwargs) + 276 + 277 if inspect.isfunction(func) or inspect.ismethod(func): + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\training\moving_averages.py in update_fn(v, value) + 92 + 93 def update_fn(v, value): + ---> 94 return state_ops.assign_sub(v, (v - value) * decay, name=scope) + 95 + 96 def update(strategy, v, value): + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\ops\state_ops.py in assign_sub(ref, value, use_locking, name) + 162 return gen_state_ops.assign_sub( + 163 ref, value, use_locking=use_locking, name=name) + --> 164 return ref.assign_sub(value) + 165 + 166 + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py in assign_sub(self, delta, use_locking, name, read_value) + 1967 with ops.control_dependencies([self._parent_op]): + 1968 return super(_UnreadVariable, self).assign_sub(delta, use_locking, name, + -> 1969 read_value) + 1970 + 1971 def assign_add(self, delta, use_locking=None, name=None, read_value=True): + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py in assign_sub(self, delta, use_locking, name, read_value) + 799 assign_sub_op = gen_resource_variable_ops.assign_sub_variable_op( + 800 self.handle, ops.convert_to_tensor(delta, dtype=self.dtype), + --> 801 name=name) + 802 if read_value: + 803 return self._lazy_read(assign_sub_op) + + + D:\ProgramDate\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\ops\gen_resource_variable_ops.py in assign_sub_variable_op(resource, value, name) + 92 _result = pywrap_tfe.TFE_Py_FastPathExecute( + 93 _ctx._context_handle, tld.device_name, "AssignSubVariableOp", name, + ---> 94 tld.op_callbacks, resource, value) + 95 return _result + 96 except _core._NotOkStatusException as e: + + + KeyboardInterrupt:


# Training DenseNet-121 on the ImageNet Dataset
The script below reuses the CIFAR-10 training pipeline unchanged (including the 24×24 crops and the 10-unit output layer), so it should be read as a template rather than a ready-to-run ImageNet configuration.


```python
import time
import multiprocessing
import tensorflow as tf
import os
os.environ['TL_BACKEND'] = 'tensorflow'

import tensorlayer as tl
from DenseNet.DenseNet_tensorlayer import densenet

tl.logging.set_verbosity(tl.logging.DEBUG)


def load_ImageNet_dataset(shape=(-1, 256, 256, 3), plotable=False):
    '''ImageNet dataset already downloaded to the local machine (loading code omitted).'''
    return X_train, y_train, X_test, y_test

# get the network
net = densenet("densenet-121")

X_train, y_train, X_test, y_test = load_ImageNet_dataset(shape=(-1, 256, 256, 3), plotable=False)
# training settings
batch_size = 128
n_epoch = 500
learning_rate = 0.0001
print_freq = 5
n_step_epoch = int(len(y_train) / batch_size)
n_step = n_epoch * n_step_epoch
shuffle_buffer_size = 128

train_weights = net.trainable_weights
optimizer = tl.optimizers.Adam(learning_rate)
metrics = tl.metric.Accuracy()


def generator_train():
    inputs = X_train
    targets = y_train
    if len(inputs) != len(targets):
        raise AssertionError("The length of inputs and targets should be equal")
    for _input, _target in zip(inputs, targets):
        # yield _input.encode('utf-8'), _target.encode('utf-8')
        yield _input, _target


def generator_test():
    inputs = X_test
    targets = y_test
    if len(inputs) != len(targets):
        raise AssertionError("The length of inputs and targets should be equal")
    for _input, _target in zip(inputs, targets):
        # yield _input.encode('utf-8'), _target.encode('utf-8')
        yield _input, _target


def _map_fn_train(img, target):
    # 1. Randomly crop a [height, width] section of the image.
    img = tf.image.random_crop(img, [24, 24, 3])
    # 2. Randomly flip the image horizontally.
    img = tf.image.random_flip_left_right(img)
    # 3. Randomly change brightness.
    img = tf.image.random_brightness(img, max_delta=63)
    # 4. Randomly change contrast.
    img = tf.image.random_contrast(img, lower=0.2, upper=1.8)
    # 5. Subtract off the mean and divide by the variance of the pixels.
    img = tf.image.per_image_standardization(img)
    target = tf.reshape(target, ())
    return img, target


def _map_fn_test(img, target):
    # 1. Crop the central [height, width] of the image.
    img = tf.image.resize_with_pad(img, 24, 24)
    # 2. Subtract off the mean and divide by the variance of the pixels.
    img = tf.image.per_image_standardization(img)
    img = tf.reshape(img, (24, 24, 3))
    target = tf.reshape(target, ())
    return img, target


# dataset API and augmentation
train_ds = tf.data.Dataset.from_generator(
    generator_train, output_types=(tf.float32, tf.int32)
)  # , output_shapes=((24, 24, 3), (1)))
train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())
# train_ds = train_ds.repeat(n_epoch)
train_ds = train_ds.shuffle(shuffle_buffer_size)
train_ds = train_ds.prefetch(buffer_size=4096)
train_ds = train_ds.batch(batch_size)
# value = train_ds.make_one_shot_iterator().get_next()

test_ds = tf.data.Dataset.from_generator(
    generator_test, output_types=(tf.float32, tf.int32)
)  # , output_shapes=((24, 24, 3), (1)))
# test_ds = test_ds.shuffle(shuffle_buffer_size)
test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())
# test_ds = test_ds.repeat(n_epoch)
test_ds = test_ds.prefetch(buffer_size=4096)
test_ds = test_ds.batch(batch_size)
# value_test = test_ds.make_one_shot_iterator().get_next()


class WithLoss(tl.layers.Module):

    def __init__(self, net, loss_fn):
        super(WithLoss, self).__init__()
        self._net = net
        self._loss_fn = loss_fn

    def forward(self, data, label):
        out = self._net(data)
        loss = self._loss_fn(out, label)
        return loss


net_with_loss = WithLoss(net, loss_fn=tl.cost.softmax_cross_entropy_with_logits)
net_with_train = tl.models.TrainOneStep(net_with_loss, optimizer, train_weights)

for epoch in range(n_epoch):
    start_time = time.time()
    net.set_train()
    train_loss, train_acc, n_iter = 0, 0, 0
    for X_batch, y_batch in train_ds:

        X_batch = tl.ops.convert_to_tensor(X_batch.numpy(), dtype=tl.float32)
        y_batch = tl.ops.convert_to_tensor(y_batch.numpy(), dtype=tl.int64)

        _loss_ce = net_with_train(X_batch, y_batch)
        train_loss += _loss_ce

        n_iter += 1
        _logits = net(X_batch)
        metrics.update(_logits, y_batch)
        train_acc += metrics.result()
        metrics.reset()
        # Progress is printed after every batch of the current epoch.
        print("Epoch {} of {} took {}".format(epoch + 1, n_epoch, time.time() - start_time))
        print("   train loss: {}".format(train_loss / n_iter))
        print("   train acc: {}".format(train_acc / n_iter))
```
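The scripts above build `test_ds` but never consume it. As a possible follow-up, here is a minimal evaluation sketch that reuses the same `tl.metric.Accuracy()` metric to report test accuracy after training; it assumes `net`, `test_ds`, `metrics`, and the imports above are still in scope, and that the TensorLayer `Module` exposes `set_eval()` to mirror the `set_train()` call used during training.

```python
# Evaluation sketch (assumes the names defined in the training script above are in scope,
# and that net.set_eval() is available to disable dropout and BN statistics updates).
net.set_eval()
test_acc, n_iter = 0, 0
for X_batch, y_batch in test_ds:
    X_batch = tl.ops.convert_to_tensor(X_batch.numpy(), dtype=tl.float32)
    y_batch = tl.ops.convert_to_tensor(y_batch.numpy(), dtype=tl.int64)
    _logits = net(X_batch)          # forward pass only, no gradient step
    metrics.update(_logits, y_batch)
    test_acc += metrics.result()
    metrics.reset()
    n_iter += 1
print("   test acc: {}".format(test_acc / n_iter))
```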