I'm studying a ResNet50 tutorial that contains the following piece of code:
```python
def create_dataset_cifar10(dataset_dir, usage, resize, batch_size, workers):
    data_set = ds.Cifar10Dataset(dataset_dir=dataset_dir,
                                 usage=usage,
                                 num_parallel_workers=workers,
                                 shuffle=True)
    trans = []
    if usage == "train":
        trans += [
            vision.RandomCrop((32, 32), (4, 4, 4, 4)),
            vision.RandomHorizontalFlip(prob=0.5)
        ]
    trans += [
        vision.Resize(resize),
        vision.Rescale(1.0 / 255.0, 0.0),
        vision.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),
        vision.HWC2CHW()
    ]
    ...
```
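As I understand it, the `Rescale` + `Normalize` pair amounts to `y = (x / 255 - mean) / std` per channel. A quick NumPy sketch of that arithmetic (not MindSpore itself, just the equivalent math on one made-up pixel):

```python
import numpy as np

# The per-channel constants from the pipeline above.
mean = np.array([0.4914, 0.4822, 0.4465])
std = np.array([0.2023, 0.1994, 0.2010])

# Rescale(1/255, 0) followed by Normalize(mean, std), applied to one
# mid-grey RGB pixel: y = (x / 255 - mean) / std
pixel = np.array([128, 128, 128], dtype=np.float64)
normalized = (pixel / 255.0 - mean) / std
print(normalized)
```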
I'm aware that normalization is an important pre-processing step. I'd just like to know where values such as 0.2023 come from.
Some friends guessed that the values are the per-channel means and standard deviations of the dataset. The following code is meant to verify that assumption:
```python
import numpy as np
from tensorflow.keras import datasets  # assuming the Keras-bundled CIFAR-10

(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
all_images = np.vstack((test_images, train_images))
reshaped_images = np.copy(all_images)
reshaped_images = reshaped_images.reshape((3, 60000, 32, 32))
reshaped_images = reshaped_images.reshape((3, 60000 * 32 * 32))
print(np.mean(reshaped_images, axis=1))
print(np.std(reshaped_images, axis=1))
```
which gives

```
[0.47562782 0.47245647 0.47361567]
[0.25186847 0.25178283 0.25087802]
```
while

```python
reshaped_images = np.copy(train_images)
reshaped_images = reshaped_images.reshape((3, 50000, 32, 32))
reshaped_images = reshaped_images.reshape((3, 50000 * 32 * 32))
print(np.mean(reshaped_images, axis=1))
print(np.std(reshaped_images, axis=1))
```

gives

```
[0.47410759, 0.4726623 , 0.47331911]
[0.2520572 , 0.25201249, 0.25063239]
```
Neither result is consistent with 0.2023.
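(One caveat I noticed afterwards: `reshape` alone never moves the channel axis. The Keras arrays are `(N, 32, 32, 3)`, channels last, so reshaping them to `(3, ...)` interleaves values from all three channels instead of separating them. A minimal toy example of the difference, with made-up data:)

```python
import numpy as np

# Tiny HWC "dataset": 2 images, 2x2 pixels, 3 channels, where channel c
# is filled with the constant value c (0.0, 1.0, 2.0).
batch = np.zeros((2, 2, 2, 3))
batch[..., 1] = 1.0
batch[..., 2] = 2.0

# Wrong: reshape reads elements in memory order, so each row mixes
# values from all three channels.
wrong = batch.reshape(3, -1)
print(np.mean(wrong, axis=1))   # not [0, 1, 2]: channels interleaved

# Right: move the channel axis to the front, then flatten.
right = batch.transpose(3, 0, 1, 2).reshape(3, -1)
print(np.mean(right, axis=1))   # [0. 1. 2.]: true per-channel means
```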