MNIST with binary color

Andreas

2020-02-23 16:00

A few days ago, our local machine learning user group discussed reducing the color depth of MNIST images and whether that would hurt accuracy.

To evaluate this, I changed the data loaders in the PyTorch MNIST example so that the grayscale images are reduced to pure black and white:

import torch
from torchvision import datasets, transforms

# args and kwargs are defined in main() of the PyTorch MNIST example.
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,)),
                       lambda x: x > 0,      # True where the normalized value is positive
                       lambda x: x.float(),  # cast the boolean mask to 0.0 / 1.0
                   ])),
    batch_size=args.batch_size, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,)),
                       lambda x: x > 0,      # same binarization as for training
                       lambda x: x.float(),
                   ])),
    batch_size=args.test_batch_size, shuffle=True, **kwargs)

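The first lambda binarizes every tensor by comparing each value with 0. Because it runs after Normalize, a value is positive exactly when the pixel is brighter than the dataset mean of 0.1307 (on the 0-to-1 scale that ToTensor produces). The second lambda converts the resulting booleans to 0.0 and 1.0, because the model expects float inputs.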
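To see where that cut-off falls in terms of the original 8-bit pixel values, here is a minimal pure-Python sketch that imitates the transform chain for a single pixel. It is not the torchvision code itself; the helper name `binarize_pixel` is made up for this illustration, and the only numbers used are the mean and standard deviation from the Normalize call above.

```python
# Constants copied from the Normalize call in the loaders.
MEAN = 0.1307
STD = 0.3081


def binarize_pixel(raw: int) -> float:
    """Imitate ToTensor -> Normalize -> (x > 0) -> float for one 8-bit pixel.

    Hypothetical helper for illustration only.
    """
    scaled = raw / 255.0                # ToTensor maps 0..255 to 0.0..1.0
    normalized = (scaled - MEAN) / STD  # Normalize: subtract mean, divide by std
    return float(normalized > 0)        # the two lambdas: threshold, then cast


# Smallest raw intensity that ends up white (1.0).
threshold = min(v for v in range(256) if binarize_pixel(v) == 1.0)
print(threshold)  # 34
```

So every pixel with a raw intensity of 34 or more becomes white and everything darker becomes black. Thresholding before Normalize would move the cut-off, so the order of the transforms matters here.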

To verify that the modification works, I plotted image 7777 from both data loaders:

import matplotlib.pyplot as plt
x, _ = test_loader.dataset[7777]       # x has shape (1, 28, 28)
plt.imshow(x.numpy()[0], cmap='gray')  # drop the channel axis for plotting
plt.show()

After training for 14 epochs with the original, unmodified loader, the result on the test set is:

Average loss: 0.0289, Accuracy: 9911/10000 (99%).

With the binary modification in the data loader, the result on the test set is only slightly worse:

Average loss: 0.0374, Accuracy: 9889/10000 (99%).
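Both runs are reported as 99%, so the rounded figure hides the gap. A small sketch of the arithmetic, using only the two counts above (the variable names are mine), makes the difference easier to judge:

```python
TEST_SET_SIZE = 10000
correct_gray = 9911    # original grayscale loader
correct_binary = 9889  # binarized loader

errors_gray = TEST_SET_SIZE - correct_gray      # misclassified, grayscale
errors_binary = TEST_SET_SIZE - correct_binary  # misclassified, binary

# Absolute accuracy drop in percentage points.
acc_drop_points = 100 * (correct_gray - correct_binary) / TEST_SET_SIZE
# Relative growth in the number of misclassified images.
relative_error_increase = (errors_binary - errors_gray) / errors_gray

print(f"accuracy drop: {acc_drop_points:.2f} percentage points")
print(f"misclassified: {errors_gray} -> {errors_binary} "
      f"({relative_error_increase:.0%} more errors)")
# accuracy drop: 0.22 percentage points
# misclassified: 89 -> 111 (25% more errors)
```

These numbers come from one training run per setting, so part of the gap may simply be run-to-run variation.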

Either I missed something or the difference isn't that big.