AllenNLP: Machine Translation using configuration

I was inspired by this blog post: http://www.realworldnlpbook.com/blog/building-seq2seq-machine-translation-models-using-allennlp.html

My goal was to train my own model on the English-to-German language pair. Additionally, I wanted to use AllenNLP JSON configurations instead of Python code.

This post was tested with Python 3.7 and AllenNLP 0.8.3.

Data

First, fetch the language pair you want; see the realworldnlpbook.com blog post linked at the top for where to get it. For this example I fetched ENG-DEU. Store the files in the data directory.
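For reference, the seq2seq dataset reader expects one tab-separated sentence pair per line. A line of the training file therefore looks roughly like this (<TAB> stands for a literal tab character; the pair is the example sentence used later in this post):

Let's try something.<TAB>Lass uns etwas versuchen!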

Configuration

This configuration is pretty close to the Python code from the original blog post:

{
  "dataset_reader": {
    "type": "seq2seq",
    "source_tokenizer": {
      "type": "word"
    },
    "target_tokenizer": {
      "type": "character"
    },
    "source_token_indexers": {
      "tokens": {
        "type": "single_id"
      }
    },
    "target_token_indexers": {
      "tokens": {
        "type": "single_id",
        "namespace": "target_tokens"
      }
    }
  },
  "train_data_path": "data/tatoeba.eng_deu.train.tsv",
  "validation_data_path": "data/tatoeba.eng_deu.dev.tsv",
  "test_data_path": "data/tatoeba.eng_deu.test.tsv",
  "evaluate_on_test": true,
  "model": {
    "type": "simple_seq2seq",
    "source_embedder": {
      "type": "basic",
      "token_embedders": {
        "tokens": {
          "type": "embedding",
          "embedding_dim": 256
        }
      }
    },
    "encoder": {
      "type": "stacked_self_attention",
      "input_dim": 256,
      "hidden_dim": 256,
      "projection_dim": 128,
      "feedforward_hidden_dim": 128,
      "num_layers": 1,
      "num_attention_heads": 8
    },
    "max_decoding_steps": 20,
    "attention": {
      "type": "dot_product"
    },
    "beam_size": 8,
    "target_namespace": "target_tokens",
    "target_embedding_dim": 256
  },
  "iterator": {
    "type": "bucket",
    "batch_size": 32,
    "sorting_keys": [["source_tokens", "num_tokens"]]
  },
  "trainer": {
    "optimizer": {
      "type": "adam"
    },
    "patience": 10,
    "num_epochs": 100,
    "cuda_device": 0
  }
}

Change cuda_device to -1 if you have no GPU, but be aware that training will take a lot longer without one.

To use a different attention mechanism (as described in the original blog post), replace the attention section. For example, use:

"attention": {
    "type": "linear",
    "tensor_1_dim": 256,
    "tensor_2_dim": 256,
    "activation": "tanh"
},

instead of

"attention": {
    "type": "dot_product"
},

How to train

allennlp train mt_eng_deu.json -s output

At the end of training there will be a model.tar.gz in the output folder.
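The serialization directory also holds the vocabulary, the best model weights and the collected metrics; a quick check (the exact file names may vary slightly between AllenNLP versions):

ls output
# expected to contain, among others: best.th  metrics.json  model.tar.gz  vocabulary/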

How to evaluate

Run evaluate with the trained model:

allennlp evaluate model.tar.gz data/tatoeba.eng_deu.test.tsv
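If you trained on a GPU, you can evaluate on it as well by passing the optional cuda-device flag:

allennlp evaluate model.tar.gz data/tatoeba.eng_deu.test.tsv --cuda-device 0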

Predict one sentence

Generate a file with one sentence to predict:

cat <<EOF > inputs.txt
{"source": "Let's try something."}
EOF

Run predict with the trained model:

allennlp predict model.tar.gz inputs.txt --predictor seq2seq

The expected translation is "Lass uns etwas versuchen!", returned as single characters, since the target tokenizer is character-based.
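To keep the predictions instead of only printing them, you can write them to a file; the result is one JSON object per input line, and with this model the translation should end up in a field such as predicted_tokens:

allennlp predict model.tar.gz inputs.txt --predictor seq2seq --output-file predictions.json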

Nikola setup using Docker

Dockerfile

FROM python:3.6
MAINTAINER X

RUN apt-get update \
      && printf 'locales locales/locales_to_be_generated multiselect en_US.UTF-8 UTF-8\nlocales locales/default_environment_locale select en_US.UTF-8\n' | debconf-set-selections \
      && apt-get install --no-install-recommends -y \
              build-essential \
              libjpeg-dev \
              libxml2-dev \
              libxslt1-dev \
              libyaml-dev \
              libzmq3-dev \
              locales \
              python3-dev \
              python3-pip \
              zlib1g-dev \
      && pip install 'Nikola[extras]' \
      && useradd -c Nikola -m -s /bin/bash nikola

WORKDIR /home/nikola/my_blog/
EXPOSE 10000

USER nikola

ENTRYPOINT ["nikola"]

docker-compose.yml

version: "3"
services:
  nikola:
    build: .
    volumes:
      - .:/home/nikola
  auto:
    build: .
    ports:
      - "10000:10000"
    volumes:
      - .:/home/nikola
    command: "auto -p 10000 -a 0.0.0.0"

For the initial setup and all nikola commands except auto, use: docker-compose run --rm nikola <command>

The auto command is used like this: docker-compose up auto. You can then browse your live-updated Nikola site on port 10000.
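A typical workflow could look like this (assuming the site lives in the my_blog subdirectory of the mounted folder, matching the WORKDIR in the Dockerfile):

docker-compose run --rm nikola new_post -t "Hello world"
docker-compose run --rm nikola build
docker-compose up auto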

Restart hanging Docker container

Restart a Docker container if there has been no change in its log since the last run of the script below. This was used on an arm64 machine with a Docker container that kept deadlocking. The script was run via cron every 5 minutes.

#!/bin/bash

echo "started at $(date)"

# Capture the last log lines of the container; this grabs the container's
# stderr stream (docker logs replays container stderr on stderr).
docker logs --tail=10 DOCKERCONTAINER 2> /tmp/check_new

# If the log has not changed since the previous run, the container is
# considered hung and gets restarted.
if cmp -s "/tmp/check_new" "/tmp/check_prev"
then
  echo "restart"
  docker restart DOCKERCONTAINER
fi

# Keep the previous snapshot for debugging, then rotate the files.
cp /tmp/check_prev /tmp/check_prev_prev
mv /tmp/check_new /tmp/check_prev
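A matching crontab entry could look like this (the script path and log file are just placeholders):

*/5 * * * * /usr/local/bin/restart_hanging_container.sh >> /var/log/restart_hanging_container.log 2>&1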