ResNet极大地改变了如何参数化深层网络中函数的观点。
稠密连接网络 (DenseNet)[Huang.Liu.Van-Der-Maaten.ea.2017]
在某种程度上是 ResNet 的逻辑扩展。让我们先从数学上了解一下。
回想一下任意函数的泰勒展开式(Taylor expansion),它把这个函数分解成越来越高阶的项。在$x$接近0时,
$$f(x) = f(0) + f'(0) x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \ldots.$$
同样,ResNet 将函数展开为
$$f(\mathbf{x}) = \mathbf{x} + g(\mathbf{x}).$$
也就是说,ResNet 将 $f$ 分解为两部分:一个简单的线性项和一个更复杂的非线性项。
那么再向前拓展一步,如果我们想将 $f$ 拓展成超过两部分的信息呢?
一种方案便是 DenseNet。
ResNet 和 DenseNet 的关键区别在于,DenseNet 的输出是连结(用 $[\cdot,\cdot]$ 表示)而不是如 ResNet 那样的简单相加。
因此,在应用越来越复杂的函数序列后,我们执行从 $\mathbf{x}$ 到其展开式的映射:
$$\mathbf{x} \to \left[
\mathbf{x},
f_1(\mathbf{x}),
f_2([\mathbf{x}, f_1(\mathbf{x})]), f_3([\mathbf{x}, f_1(\mathbf{x}), f_2([\mathbf{x}, f_1(\mathbf{x})])]), \ldots\right].$$
最后,将这些展开式结合到多层感知机中,再次减少特征的数量。
实现起来非常简单:我们不需要将各项相加,而是将它们连结起来。
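为了直观说明这一点,下面给出一段极简的示意代码(张量形状均为演示用的假设值),对比相加与连结对通道数的不同影响:
import tensorflow as tf

# 假设输入有 3 个通道
x = tf.random.uniform((4, 8, 8, 3))
g_x = tf.random.uniform((4, 8, 8, 3))    # ResNet 中 g(x) 必须与 x 形状相同才能相加
f_x = tf.random.uniform((4, 8, 8, 10))   # DenseNet 中 f(x) 的通道数即增长率

res_out = x + g_x                          # 相加:输出通道数仍为 3
dense_out = tf.concat([x, f_x], axis=-1)   # 连结:输出通道数变为 3 + 10 = 13
print(res_out.shape, dense_out.shape)      # (4, 8, 8, 3) (4, 8, 8, 13)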
DenseNet 这个名字由变量之间的“稠密连接”而得来:最后一层与之前的所有层紧密相连。
稠密网络主要由两部分构成:稠密块(dense block)和过渡层(transition layer)。
前者定义如何连接输入和输出,而后者则控制通道数量,使其不会太复杂。
DenseNet 使用了 ResNet 改良版的“批量归一化、激活和卷积”结构。
我们首先实现一下这个结构。
import tensorflow as tf
import tensorlayer as tl
# 允许 GPU 显存按需增长,避免一次性占满显存
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)
Using TensorFlow backend.
class BottleNeck(tl.layers.Module):
def __init__(self, growth_rate, drop_rate):
super(BottleNeck, self).__init__()
self.bn1 = tl.layers.BatchNorm()
self.conv1 = tl.layers.Conv2d(n_filter=4 * growth_rate,
filter_size=(1, 1),
strides=(1,1),
padding="SAME")
self.bn2 = tl.layers.BatchNorm()
self.conv2 = tl.layers.Conv2d(n_filter=growth_rate,
filter_size=(3, 3),
strides=(1,1),
padding="SAME")
        # 注意:tl.layers.Dropout 的 keep 参数表示保留概率,这里直接传入了 drop_rate
        self.dropout = tl.layers.Dropout(keep=drop_rate)
self.listLayers = [self.bn1,
tl.layers.PRelu(channel_shared=True),
self.conv1,
self.bn2,
tl.layers.PRelu(channel_shared=True),
self.conv2,
self.dropout]
def forward(self, x):
y = x
for layer in self.listLayers:
y = layer(y)
y = tf.keras.layers.concatenate([x, y], axis=-1)
return y
一个稠密块由多个卷积块组成,每个卷积块使用相同数量的输出通道。
然而,在前向传播中,我们将每个卷积块的输入和输出在通道维上连结。
class DenseBlock(tl.layers.Module):
def __init__(self, num_layers, growth_rate, drop_rate=0.5):
super(DenseBlock, self).__init__()
self.num_layers = num_layers
self.growth_rate = growth_rate
self.drop_rate = drop_rate
self.listLayers = []
for _ in range(num_layers):
self.listLayers.append(BottleNeck(growth_rate=self.growth_rate, drop_rate=self.drop_rate))
def forward(self, x):
for layer in self.listLayers:
x = layer(x)
return x
在下面的例子中,我们定义一个有 2 个输出通道数为 10 的卷积块的 DenseBlock。
使用通道数为 3 的输入时,我们会得到通道数为 $3+2\times 10=23$ 的输出。
卷积块的通道数控制了输出通道数相对于输入通道数的增长,因此也被称为增长率(growth rate)。
blk = DenseBlock(2, 10)
X = tf.random.uniform((4, 8, 8, 3))
Y = blk(X)
Y.shape
[TL] BatchNorm batchnorm_9: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] Conv2d conv2d_9: n_filter: 40 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation
[TL] BatchNorm batchnorm_10: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] Conv2d conv2d_10: n_filter: 10 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation
[TL] Dropout dropout_3: keep: 0.500000
[TL] PRelu prelu_9: channel_shared: True
[TL] PRelu prelu_10: channel_shared: True
[TL] BatchNorm batchnorm_11: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] Conv2d conv2d_11: n_filter: 40 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation
[TL] BatchNorm batchnorm_12: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] Conv2d conv2d_12: n_filter: 10 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation
[TL] Dropout dropout_4: keep: 0.500000
[TL] PRelu prelu_11: channel_shared: True
[TL] PRelu prelu_12: channel_shared: True
TensorShape([4, 8, 8, 23])
由于每个稠密块都会使通道数增加,稠密块叠加过多会让模型变得过于复杂。
而过渡层可以用来控制模型复杂度:
它通过 $1\times 1$ 卷积层来减小通道数,并使用步幅为 2 的池化层将高和宽减半(下面的实现使用最大池化层),从而进一步降低模型复杂度。
class TransitionLayer(tl.layers.Module):
def __init__(self, out_channels):
super(TransitionLayer, self).__init__()
self.bn = tl.layers.BatchNorm()
self.conv = tl.layers.Conv2d(n_filter=out_channels,
filter_size=(1, 1),
strides=(1,1),
padding="same")
self.pool = tl.layers.MaxPool2d(filter_size=(2, 2),
strides=(2,2),
padding="SAME")
def forward(self, inputs):
x = self.bn(inputs)
x = tl.relu(x)
x = self.conv(x)
x = self.pool(x)
return x
对上一个例子中稠密块的输出使用通道数为 10 的过渡层。
此时输出的通道数减为 10,高和宽均减半。
blk = TransitionLayer(10)
blk(Y).shape
[TL] BatchNorm batchnorm_13: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] Conv2d conv2d_13: n_filter: 10 filter_size: (1, 1) strides: (1, 1) pad: same act: No Activation
[TL] MaxPool2d maxpool2d_1: filter_size: (2, 2) strides: (2, 2) padding: SAME
TensorShape([4, 4, 4, 10])
我们来构造 DenseNet 模型。DenseNet 首先使用同 ResNet 一样的单卷积层和最大池化层。
接下来,类似于ResNet使用的4个残差块,DenseNet使用的是4个稠密块。
与 ResNet 类似,我们可以通过参数 block_layers 设置每个稠密块使用多少个卷积块,
例如 DenseNet-121 的 4 个稠密块分别使用 6、12、24、16 个卷积块。
稠密块里卷积层的通道数(即增长率 growth_rate)设为 32,因此仅第一个稠密块就会增加 $6\times 32=192$ 个通道。
在每个模块之间,ResNet通过步幅为2的残差块减小高和宽,DenseNet则使用过渡层来减半高和宽,并减半通道数。
最后接上全局池化层和全连接层来输出结果。
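在进入实现之前,可以先用一小段示意代码核对各阶段的通道数(这里按下文 densenet-121 的配置推算,计算逻辑与 DenseNet 类中的 num_channels 一致,仅作演示):
# 按 DenseNet-121 的配置(初始 64 通道,增长率 32,压缩率 0.5)推算各阶段通道数
num_init_features, growth_rate = 64, 32
block_layers, compression_rate = [6, 12, 24, 16], 0.5

num_channels = num_init_features
for i, num_layers in enumerate(block_layers):
    num_channels += growth_rate * num_layers            # 稠密块:每个卷积块增加 growth_rate 个通道
    print(f"dense_block_{i + 1} 输出通道数: {num_channels}")
    if i < len(block_layers) - 1:                       # 最后一个稠密块后面没有过渡层
        num_channels = int(compression_rate * num_channels)
        print(f"transition_{i + 1} 输出通道数: {num_channels}")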
class DenseNet(tl.layers.Module):
def __init__(self, num_init_features, growth_rate, block_layers, compression_rate, drop_rate):
super(DenseNet, self).__init__()
self.conv = tl.layers.Conv2d(n_filter=num_init_features,
filter_size=(7, 7),
strides=(2,2),
padding="SAME")
self.bn = tl.layers.BatchNorm()
self.pool = tl.layers.MaxPool2d(filter_size=(3, 3),
strides=(2,2),
padding="SAME")
self.num_channels = num_init_features
self.dense_block_1 = DenseBlock(num_layers=block_layers[0], growth_rate=growth_rate, drop_rate=drop_rate)
self.num_channels += growth_rate * block_layers[0]
self.num_channels = compression_rate * self.num_channels
self.transition_1 = TransitionLayer(out_channels=int(self.num_channels))
self.dense_block_2 = DenseBlock(num_layers=block_layers[1], growth_rate=growth_rate, drop_rate=drop_rate)
self.num_channels += growth_rate * block_layers[1]
self.num_channels = compression_rate * self.num_channels
self.transition_2 = TransitionLayer(out_channels=int(self.num_channels))
self.dense_block_3 = DenseBlock(num_layers=block_layers[2], growth_rate=growth_rate, drop_rate=drop_rate)
self.num_channels += growth_rate * block_layers[2]
self.num_channels = compression_rate * self.num_channels
self.transition_3 = TransitionLayer(out_channels=int(self.num_channels))
self.dense_block_4 = DenseBlock(num_layers=block_layers[3], growth_rate=growth_rate, drop_rate=drop_rate)
self.avgpool = tl.layers.GlobalMeanPool2d()
        # 输出 logits,训练时由损失函数 softmax_cross_entropy_with_logits 内部做 softmax
        self.fc = tl.layers.Dense(n_units=10)
def forward(self, inputs):
x = self.conv(inputs)
x = self.bn(x)
x = tl.relu(x)
x = self.pool(x)
x = self.dense_block_1(x)
x = self.transition_1(x)
x = self.dense_block_2(x)
x = self.transition_2(x)
x = self.dense_block_3(x)
        x = self.transition_3(x)
x = self.dense_block_4(x)
x = self.avgpool(x)
x = self.fc(x)
return x
下面再构建一个只包含 3 个稠密块的变体(对应下文工厂函数中的 densenet-100):
class DenseNet_100(tl.layers.Module):
def __init__(self, num_init_features, growth_rate, block_layers, compression_rate, drop_rate):
super(DenseNet_100, self).__init__()
self.conv = tl.layers.Conv2d(n_filter=num_init_features,
filter_size=(7, 7),
strides=(2,2),
padding="SAME")
self.bn = tl.layers.BatchNorm()
self.pool = tl.layers.MaxPool2d(filter_size=(3, 3),
strides=(2,2),
padding="SAME")
self.num_channels = num_init_features
self.dense_block_1 = DenseBlock(num_layers=block_layers[0], growth_rate=growth_rate, drop_rate=drop_rate)
self.num_channels += growth_rate * block_layers[0]
self.num_channels = compression_rate * self.num_channels
self.transition_1 = TransitionLayer(out_channels=int(self.num_channels))
self.dense_block_2 = DenseBlock(num_layers=block_layers[1], growth_rate=growth_rate, drop_rate=drop_rate)
self.num_channels += growth_rate * block_layers[1]
self.num_channels = compression_rate * self.num_channels
self.transition_2 = TransitionLayer(out_channels=int(self.num_channels))
self.dense_block_3 = DenseBlock(num_layers=block_layers[2], growth_rate=growth_rate, drop_rate=drop_rate)
self.num_channels += growth_rate * block_layers[2]
self.num_channels = compression_rate * self.num_channels
self.transition_3 = TransitionLayer(out_channels=int(self.num_channels))
self.avgpool = tl.layers.GlobalMeanPool2d()
        # 输出 logits,训练时由损失函数 softmax_cross_entropy_with_logits 内部做 softmax
        self.fc = tl.layers.Dense(n_units=10)
def forward(self, inputs):
x = self.conv(inputs)
x = self.bn(x)
x = tl.relu(x)
x = self.pool(x)
x = self.dense_block_1(x)
x = self.transition_1(x)
x = self.dense_block_2(x)
x = self.transition_2(x)
x = self.dense_block_3(x)
        x = self.transition_3(x)
        x = self.avgpool(x)
        x = self.fc(x)
return x
def densenet(x):
if x == 'densenet-121':
return DenseNet(num_init_features=64, growth_rate=32, block_layers=[6, 12, 24, 16], compression_rate=0.5,
drop_rate=0.5)
elif x == 'densenet-169':
return DenseNet(num_init_features=64, growth_rate=32, block_layers=[6 , 12, 32, 32], compression_rate=0.5,
drop_rate=0.5)
elif x == 'densenet-201':
return DenseNet(num_init_features=64, growth_rate=32, block_layers=[6, 12, 48, 32], compression_rate=0.5,
drop_rate=0.5)
elif x == 'densenet-264':
return DenseNet(num_init_features=64, growth_rate=32, block_layers=[6, 12, 64, 48], compression_rate=0.5,
drop_rate=0.5)
elif x=='densenet-100':
return DenseNet_100(num_init_features=64, growth_rate=12, block_layers=[16, 16, 16], compression_rate=0.5,
drop_rate=0.5)
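可以先用一个随机张量粗略检查工厂函数返回的网络能否正常前向传播(下面 96×96 的输入尺寸只是演示用的假设值):
# 构建 densenet-121,并用随机输入检查输出形状
demo_net = densenet('densenet-121')
demo_X = tf.random.uniform((1, 96, 96, 3))
demo_Y = demo_net(demo_X)
print(demo_Y.shape)   # 期望为 (1, 10),对应 10 个类别的输出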
下面直接使用原项目提供的训练代码,在 CIFAR-10 数据集上训练 densenet-100。
import time
import multiprocessing
import tensorflow as tf
import os
os.environ['TL_BACKEND'] = 'tensorflow'
import tensorlayer as tl
from DenseNet.DenseNet_tensorlayer import densenet
tl.logging.set_verbosity(tl.logging.DEBUG)
X_train, y_train, X_test, y_test = tl.files.load_cifar10_dataset(shape=(-1, 32, 32, 3), plotable=False)
# get the network
net = densenet("densenet-100")
# training settings
batch_size = 128
n_epoch = 500
learning_rate = 0.0001
print_freq = 5
n_step_epoch = int(len(y_train) / batch_size)
n_step = n_epoch * n_step_epoch
shuffle_buffer_size = 128
train_weights = net.trainable_weights
optimizer = tl.optimizers.Adam(learning_rate)
metrics = tl.metric.Accuracy()
def generator_train():
inputs = X_train
targets = y_train
if len(inputs) != len(targets):
raise AssertionError("The length of inputs and targets should be equal")
for _input, _target in zip(inputs, targets):
# yield _input.encode('utf-8'), _target.encode('utf-8')
yield _input, _target
def generator_test():
inputs = X_test
targets = y_test
if len(inputs) != len(targets):
raise AssertionError("The length of inputs and targets should be equal")
for _input, _target in zip(inputs, targets):
# yield _input.encode('utf-8'), _target.encode('utf-8')
yield _input, _target
def _map_fn_train(img, target):
# 1. Randomly crop a [height, width] section of the image.
img = tf.image.random_crop(img, [24, 24, 3])
# 2. Randomly flip the image horizontally.
img = tf.image.random_flip_left_right(img)
# 3. Randomly change brightness.
img = tf.image.random_brightness(img, max_delta=63)
# 4. Randomly change contrast.
img = tf.image.random_contrast(img, lower=0.2, upper=1.8)
# 5. Subtract off the mean and divide by the variance of the pixels.
img = tf.image.per_image_standardization(img)
target = tf.reshape(target, ())
return img, target
def _map_fn_test(img, target):
# 1. Crop the central [height, width] of the image.
img = tf.image.resize_with_pad(img, 24, 24)
# 2. Subtract off the mean and divide by the variance of the pixels.
img = tf.image.per_image_standardization(img)
img = tf.reshape(img, (24, 24, 3))
target = tf.reshape(target, ())
return img, target
# dataset API and augmentation
train_ds = tf.data.Dataset.from_generator(
generator_train, output_types=(tf.float32, tf.int32)
) # , output_shapes=((24, 24, 3), (1)))
train_ds = train_ds.map(_map_fn_train,num_parallel_calls=multiprocessing.cpu_count())
# train_ds = train_ds.repeat(n_epoch)
train_ds = train_ds.shuffle(shuffle_buffer_size)
train_ds = train_ds.prefetch(buffer_size=4096)
train_ds = train_ds.batch(batch_size)
# value = train_ds.make_one_shot_iterator().get_next()
test_ds = tf.data.Dataset.from_generator(
generator_test, output_types=(tf.float32, tf.int32)
) # , output_shapes=((24, 24, 3), (1)))
# test_ds = test_ds.shuffle(shuffle_buffer_size)
test_ds = test_ds.map(_map_fn_test,num_parallel_calls=multiprocessing.cpu_count())
# test_ds = test_ds.repeat(n_epoch)
test_ds = test_ds.prefetch(buffer_size=4096)
test_ds = test_ds.batch(batch_size)
# value_test = test_ds.make_one_shot_iterator().get_next()
class WithLoss(tl.layers.Module):
def __init__(self, net, loss_fn):
super(WithLoss, self).__init__()
self._net = net
self._loss_fn = loss_fn
def forward(self, data, label):
out = self._net(data)
loss = self._loss_fn(out, label)
return loss
net_with_loss = WithLoss(net, loss_fn=tl.cost.softmax_cross_entropy_with_logits)
net_with_train = tl.models.TrainOneStep(net_with_loss, optimizer, train_weights)
for epoch in range(n_epoch):
start_time = time.time()
net.set_train()
train_loss, train_acc, n_iter = 0, 0, 0
for X_batch, y_batch in train_ds:
X_batch = tl.ops.convert_to_tensor(X_batch.numpy(), dtype=tl.float32)
y_batch = tl.ops.convert_to_tensor(y_batch.numpy(), dtype=tl.int64)
_loss_ce = net_with_train(X_batch, y_batch)
train_loss += _loss_ce
n_iter += 1
_logits = net(X_batch)
metrics.update(_logits, y_batch)
train_acc += metrics.result()
metrics.reset()
print("Epoch {} of {} took {}".format(epoch + 1, n_epoch, time.time() - start_time))
print(" train loss: {}".format(train_loss / n_iter))
print(" train acc: {}".format(train_acc / n_iter))
[TL] Load or Download cifar10 > data\cifar10
[TL] Conv2d conv2d_134: n_filter: 64 filter_size: (7, 7) strides: (2, 2) pad: SAME act: No Activation
[TL] BatchNorm batchnorm_134: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] MaxPool2d maxpool2d_6: filter_size: (3, 3) strides: (2, 2) padding: SAME
[TL] BatchNorm batchnorm_135: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] Conv2d conv2d_135: n_filter: 48 filter_size: (1, 1) strides: (1, 1) pad: SAME act: No Activation
[TL] BatchNorm batchnorm_136: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] Conv2d conv2d_136: n_filter: 12 filter_size: (3, 3) strides: (1, 1) pad: SAME act: No Activation
[TL] Dropout dropout_63: keep: 0.500000
[TL] PRelu prelu_129: channel_shared: True
[TL] PRelu prelu_130: channel_shared: True
……(其余卷积块与过渡层的构建日志与上面类似,此处省略)
[TL] BatchNorm batchnorm_233: decay: 0.900000 epsilon: 0.000010 act: No Activation is_train: True
[TL] Conv2d conv2d_233: n_filter: 176 filter_size: (1, 1) strides: (1, 1) pad: same act: No Activation
[TL] MaxPool2d maxpool2d_9: filter_size: (2, 2) strides: (2, 2) padding: SAME
[TL] GlobalMeanPool2d globalmeanpool2d_2
[TL] Dense dense_2: 10 EagerTensor
Epoch 1 of 500 took 2.472132682800293
train loss: 2.3026280403137207
train acc: 0.0859375
……(训练循环在每个 batch 后都会打印一次进度,后续输出类似,此处省略)
Epoch 1 of 500 took 46.64212989807129
train loss: 2.302560806274414
train acc: 0.09964139014482498
(此后训练在第 1 个 epoch 内被手动中断,抛出 KeyboardInterrupt,完整的错误回溯信息此处省略。)
下面将同样的训练流程应用到 densenet-121 上,并假设 ImageNet 数据集已在本地准备好,其中 load_ImageNet_dataset 为占位函数。
import time
import multiprocessing
import tensorflow as tf
import os
os.environ['TL_BACKEND'] = 'tensorflow'
import tensorlayer as tl
from DenseNet.DenseNet_tensorlayer import densenet
tl.logging.set_verbosity(tl.logging.DEBUG)
def load_ImageNet_dataset(shape=(-1, 256, 256, 3), plotable=False):
    '''占位函数:假设 ImageNet 数据集已在本地加载。
    实际使用时请替换为自己的加载逻辑,返回 NumPy 数组形式的 X_train, y_train, X_test, y_test。'''
    return X_train, y_train, X_test, y_test
# get the network
net = densenet("densenet-121")
X_train, y_train, X_test, y_test = load_ImageNet_dataset(shape=(-1, 256, 256, 3), plotable=False)
# training settings
batch_size = 128
n_epoch = 500
learning_rate = 0.0001
print_freq = 5
n_step_epoch = int(len(y_train) / batch_size)
n_step = n_epoch * n_step_epoch
shuffle_buffer_size = 128
train_weights = net.trainable_weights
optimizer = tl.optimizers.Adam(learning_rate)
metrics = tl.metric.Accuracy()
def generator_train():
inputs = X_train
targets = y_train
if len(inputs) != len(targets):
raise AssertionError("The length of inputs and targets should be equal")
for _input, _target in zip(inputs, targets):
# yield _input.encode('utf-8'), _target.encode('utf-8')
yield _input, _target
def generator_test():
inputs = X_test
targets = y_test
if len(inputs) != len(targets):
raise AssertionError("The length of inputs and targets should be equal")
for _input, _target in zip(inputs, targets):
# yield _input.encode('utf-8'), _target.encode('utf-8')
yield _input, _target
def _map_fn_train(img, target):
# 1. Randomly crop a [height, width] section of the image.
img = tf.image.random_crop(img, [24, 24, 3])
# 2. Randomly flip the image horizontally.
img = tf.image.random_flip_left_right(img)
# 3. Randomly change brightness.
img = tf.image.random_brightness(img, max_delta=63)
# 4. Randomly change contrast.
img = tf.image.random_contrast(img, lower=0.2, upper=1.8)
# 5. Subtract off the mean and divide by the variance of the pixels.
img = tf.image.per_image_standardization(img)
target = tf.reshape(target, ())
return img, target
def _map_fn_test(img, target):
# 1. Crop the central [height, width] of the image.
img = tf.image.resize_with_pad(img, 24, 24)
# 2. Subtract off the mean and divide by the variance of the pixels.
img = tf.image.per_image_standardization(img)
img = tf.reshape(img, (24, 24, 3))
target = tf.reshape(target, ())
return img, target
# dataset API and augmentation
train_ds = tf.data.Dataset.from_generator(
generator_train, output_types=(tf.float32, tf.int32)
) # , output_shapes=((24, 24, 3), (1)))
train_ds = train_ds.map(_map_fn_train,num_parallel_calls=multiprocessing.cpu_count())
# train_ds = train_ds.repeat(n_epoch)
train_ds = train_ds.shuffle(shuffle_buffer_size)
train_ds = train_ds.prefetch(buffer_size=4096)
train_ds = train_ds.batch(batch_size)
# value = train_ds.make_one_shot_iterator().get_next()
test_ds = tf.data.Dataset.from_generator(
generator_test, output_types=(tf.float32, tf.int32)
) # , output_shapes=((24, 24, 3), (1)))
# test_ds = test_ds.shuffle(shuffle_buffer_size)
test_ds = test_ds.map(_map_fn_test,num_parallel_calls=multiprocessing.cpu_count())
# test_ds = test_ds.repeat(n_epoch)
test_ds = test_ds.prefetch(buffer_size=4096)
test_ds = test_ds.batch(batch_size)
# value_test = test_ds.make_one_shot_iterator().get_next()
class WithLoss(tl.layers.Module):
def __init__(self, net, loss_fn):
super(WithLoss, self).__init__()
self._net = net
self._loss_fn = loss_fn
def forward(self, data, label):
out = self._net(data)
loss = self._loss_fn(out, label)
return loss
net_with_loss = WithLoss(net, loss_fn=tl.cost.softmax_cross_entropy_with_logits)
net_with_train = tl.models.TrainOneStep(net_with_loss, optimizer, train_weights)
for epoch in range(n_epoch):
start_time = time.time()
net.set_train()
train_loss, train_acc, n_iter = 0, 0, 0
for X_batch, y_batch in train_ds:
X_batch = tl.ops.convert_to_tensor(X_batch.numpy(), dtype=tl.float32)
y_batch = tl.ops.convert_to_tensor(y_batch.numpy(), dtype=tl.int64)
_loss_ce = net_with_train(X_batch, y_batch)
train_loss += _loss_ce
n_iter += 1
_logits = net(X_batch)
metrics.update(_logits, y_batch)
train_acc += metrics.result()
metrics.reset()
print("Epoch {} of {} took {}".format(epoch + 1, n_epoch, time.time() - start_time))
print(" train loss: {}".format(train_loss / n_iter))
print(" train acc: {}".format(train_acc / n_iter))