ffmpeg - images to videos and vice versa

To get training material for an image classifier, I recorded a few videos with my GoPro and later converted them to images using ffmpeg.

Here is my example usage:

ffmpeg -i GOPR0001.MP4 -s 1280x720 -vf fps=10 images/G01%04d_720p.png

The 4K MP4 video is scaled down to 720p and converted to 10 PNG images per second. A frame number, zero-padded to four digits, is added to the image name via %04d.

After writing some Python/OpenCV code to generate training data, I wanted a preview video to show my collaborators. So the other way around: a lot of images converted to a video.

cat images/* | ffmpeg -f image2pipe -framerate 5 -i - -f webm -vcodec libvpx-vp9 video.webm

The images are piped to ffmpeg, which generates a WebM (VP9) video with a framerate of 5 images per second.

I experimented a bit with codecs but found none that worked out of the box on every platform. The biggest problem here is the in-app video player of the Slack Android app. The WebM video at least shows a preview image and works on all other platforms.
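
If an MP4 is needed instead, an H.264 encode with the yuv420p pixel format is the usual compatibility fallback. This is an untested sketch of the same pipe (I did not verify it in the Slack Android player):

cat images/* | ffmpeg -f image2pipe -framerate 5 -i - -vcodec libx264 -pix_fmt yuv420p video.mp4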

PyTorch / Torchvision Learnings

As part of our machine learning user group workshops I learned a few things about PyTorch and torchvision.

This post describes some of them.

To crop images as part of the transform step I wanted to use the functional transforms. This is how I wrapped them in a class inside a transforms.Compose:

import torchvision

class CustomCropTransform:
    def __call__(self, img):
        # keep the top 75 pixel rows and all 133 pixel columns
        return torchvision.transforms.functional.crop(img, top=0, left=0, height=75, width=133)

transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(100),
    CustomCropTransform(),
    torchvision.transforms.ToTensor()
])

# example of usage
train_dataset = torchvision.datasets.ImageFolder("train", transform=transforms)

Here the image is first resized to 100x133 (Resize with a single int scales the shorter edge to 100 and keeps the aspect ratio), then CustomCropTransform removes the bottom 25 pixel rows.
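
To verify the output size, a single image can be pushed through the pipeline. A minimal sketch, using a synthetic 4:3 image as a stand-in for a real training image (with 4:3 sources, Resize(100) yields 133x100 before the crop):

from PIL import Image

# synthetic 4:3 stand-in for a real training image
img = Image.new("RGB", (800, 600))

out = transforms(img)
# ToTensor returns (channels, height, width)
print(out.shape)  # torch.Size([3, 75, 133])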

The second learning was about the WeightedRandomSampler. It is useful if the classes in your training dataset do not contain the same number of items.

import numpy as np
import torch
from torch.utils.data.sampler import WeightedRandomSampler

# get the target class of every item in train_dataset
# (for an ImageFolder this is also available as train_dataset.targets)
train_targets = [target for _, target in train_dataset]

# use bincount from numpy to count the number of items in each class
counts = np.bincount(train_targets)

# invert the counts to get one weight per class: rare classes get higher weights
weight = 1. / counts

# assign every training item the weight of its class
train_samples_weight = torch.tensor([weight[t] for t in train_targets])

# use the weights. replacement=True allows drawing a sample more than once
train_sampler = WeightedRandomSampler(train_samples_weight, len(train_targets), replacement=True)

# now use the train_sampler in the dataloader (a sampler replaces shuffle=True)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=16, sampler=train_sampler)
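
To check that the sampler actually balances the classes, the labels drawn during one pass over the dataloader can be counted. A small sketch, assuming the dataset and dataloader from above:

from collections import Counter

label_counts = Counter()
for _, targets in train_dataloader:
    label_counts.update(targets.tolist())

# with the weighted sampler the counts should be roughly equal per class
print(label_counts)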

The workshop series consists of problem-driven walkthroughs: https://github.com/mlugs/machine-learning-workshop

Compare Image Labellers' Votes

After using the image labeller tool with more labellers than just me, there was a need to compare the resulting YAML files.

The image labeller is a pygame-based tool that shows images and adds boolean labels to them. It is described in this blogpost: https://madflex.de/image-tagging/.

To compare the files I wrote a script that reads the tags.yml from every labeller and exports a CSV that looks like this:

[Screenshot: the CSV file uploaded to GitHub for easier preview]
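
The full script is linked at the end of this post. As a minimal sketch of the core idea, assuming a labels/<labeller>/tags.yml layout that maps each image filename to its boolean labels (both are illustrative assumptions; the majority columns used below are left out):

import csv
import pathlib

import yaml

# assumed layout: labels/<labeller>/tags.yml, each file mapping
# an image filename to {"blurred": true/false, ...}
votes = {
    path.parent.name: yaml.safe_load(path.read_text())
    for path in sorted(pathlib.Path("labels").glob("*/tags.yml"))
}

# union of all filenames any labeller has tagged
filenames = sorted({name for tags in votes.values() for name in tags})

with open("comparision.csv", "w", newline="") as fp:
    writer = csv.writer(fp)
    writer.writerow(["filename"] + list(votes))
    for name in filenames:
        writer.writerow([name] + [votes[labeller].get(name, {}).get("blurred") for labeller in votes])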

The actually interesting things, however, are easier to query and generate on the shell:

# count how many blurred/not_blurred images we agreed on with a majority
cat comparision.csv | grep -e ".*,True" | cut -d"," -f 8 | sort | uniq -c

# create train/test folders
mkdir -p {train,test}/{blurred,not_blurred}

# generate script to copy majority voted files to their train folder
#                     keep only these cols  only non-test imgs  only majority=True   only cols 1,3
cat comparision.csv | cut -d"," -f1,2,8,9 | grep "JPG,False" | grep "blurred,True" | cut -d"," -f1,3 | awk -F "," '{ print "cp " $1 " train/"$2 }' > run.sh

# run the generated script
sh run.sh

# generate script to copy test files based only on the majority decision (for all test images)
#                     keep only these cols  only test imgs
cat comparision.csv | cut -d"," -f1,2,8 | grep "JPG,True" | awk -F "," '{ print "cp " $1 " test/"$3 }' > run.sh

# run the generated script
sh run.sh

The code of the comparison script is in the image-tagger repository: https://github.com/mfa/image-tagger/blob/main/compare_tags.py.