Goodreads to Org-Mode

After Goodreads announced that all existing API keys would be revoked and no new API keys would be issued, it was time to move away. Without API access the migration has to be done by parsing HTML.

All code is in this GitHub repository: https://github.com/mfa/goodreads-to-orgmode

The first step is to download all "My Books" pages. Set the list to 100 books per page and save the HTML files. Put these files in the data folder. The convert.py script converts these HTML files into one big Org-Mode file.

For example, the entry for "The God Engines" by John Scalzi from my library has this internal representation:

{
  "author": "Scalzi, John",
  "date_added": "2017-02-20",
  "date_read": "2017-02-20",
  "isbn13": "9781596062801",
  "rating": "4 of 5",
  "title": "The God Engines",
  "url": "https://www.goodreads.com/book/show/6470498-the-god-engines"
}
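
For reference, a minimal parsing sketch using BeautifulSoup that yields dictionaries like the one above; the CSS selectors ("bookalike review", "field title", ...) are assumptions about the Goodreads table markup and may need adjusting:

from bs4 import BeautifulSoup

def parse_page(html):
    soup = BeautifulSoup(html, "html.parser")
    # the row and field class names are assumptions, not verified markup
    for row in soup.select("tr.bookalike.review"):
        def field(name):
            return row.select_one(f"td.field.{name} .value").get_text(strip=True)
        title_link = row.select_one("td.field.title a")
        yield {
            "author": field("author"),
            "date_added": field("date_added"),
            "date_read": field("date_read"),
            "isbn13": field("isbn13"),
            "rating": field("rating"),
            "title": title_link.get_text(strip=True),
            "url": "https://www.goodreads.com" + title_link["href"],
        }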

And the resulting Org-Mode block looks like:

*** The God Engines
:PROPERTIES:
:Author: Scalzi, John
:Added: 2017-02-20
:Read: 2017-02-20
:ISBN13: 9781596062801
:Rating: 4 of 5
:Url: https://www.goodreads.com/book/show/6470498-the-god-engines
:END:

After conversion I split Fiction from Non-Fiction books by creating new headlines and moving them around.

The rendering of the target markup is done via Jinja2. For a target other than Org-Mode, change the template accordingly.
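
A minimal sketch of how such a Jinja2 template could look; the real template is in the repository and may differ:

from jinja2 import Template

ORG_TEMPLATE = Template("""\
{% for book in books %}*** {{ book.title }}
:PROPERTIES:
:Author: {{ book.author }}
:Added: {{ book.date_added }}
:Read: {{ book.date_read }}
:ISBN13: {{ book.isbn13 }}
:Rating: {{ book.rating }}
:Url: {{ book.url }}
:END:
{% endfor %}""")

def render_org(books):
    # books is a list of dictionaries like the example above
    return ORG_TEMPLATE.render(books=books)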

cress.space final words

In 2016 we participated in the NASA Space Apps Challenge. Our team wanted to grow cress in a way that was as automated as possible.

Our mission statement was:

  • set up a demonstrator greenhouse

  • autonomous farming through machine learning

  • add a gaming part for users to help nurture the plants

We never implemented the last part, but the first two were successful.

The project started in April 2016 and ended in October 2018.
We met every Sunday afternoon to harvest and start a new one-week autonomous growing cycle.

Hardware

On the first Space Apps Challenge weekend we built our first planting box:

[Image: the first planting box]

On the following weekends we improved the first box a lot. We used a Raspberry Pi camera to shoot a photo from above every five minutes:

[Image: inside the box]

After a few months we started building a second and later a third box. Every iteration got better because we learned from the previous versions.

The boxes were built by cutting IKEA Samla boxes. They are cheap and easy to work with.

Every box had a DHT22 measuring temperature and humidity inside the box, plus one outside the box. After the first weeks we added fans to exchange air between the outside and the inside of the box; this was very important to prevent mold on the plants. We used very cheap pumps that can be operated by a Raspberry Pi. The timing was tricky because you cannot control the amount of water, only how long the pump is powered.
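
A minimal sketch of the sensor reading and the time-based watering, assuming the Adafruit_DHT and RPi.GPIO libraries; the pin numbers and the pump duration are made-up values, not the ones we actually used:

import time
import Adafruit_DHT
import RPi.GPIO as GPIO

DHT_PIN_INSIDE = 4   # assumed BCM pin of the inside DHT22
PUMP_PIN = 27        # assumed BCM pin switching the pump

def read_climate(pin=DHT_PIN_INSIDE):
    humidity, temperature = Adafruit_DHT.read_retry(Adafruit_DHT.DHT22, pin)
    return humidity, temperature

def water(seconds=5):
    # no flow control: the amount of water is determined only by
    # how long the pump is powered
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(PUMP_PIN, GPIO.OUT)
    GPIO.output(PUMP_PIN, GPIO.HIGH)
    time.sleep(seconds)
    GPIO.output(PUMP_PIN, GPIO.LOW)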

In the first iteration we measured soil moisture with very cheap metal sensors that measure the conductivity between two probes. Later we additionally used capacitive sensors.

To get the best possible photos we switched on a small 12V LED before taking the photo and switched it off afterwards.
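
As a sketch, the photo routine could look like this, assuming the picamera library and a GPIO pin that switches the 12V LED via a transistor or relay (the pin number and file path are made up):

import time
import RPi.GPIO as GPIO
from picamera import PiCamera

LED_PIN = 22  # assumed BCM pin driving the 12V LED

def take_photo(path="/data/photo.jpg"):
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(LED_PIN, GPIO.OUT)
    GPIO.output(LED_PIN, GPIO.HIGH)  # light on
    time.sleep(1)                    # let the lighting stabilize
    with PiCamera() as camera:
        camera.capture(path)
    GPIO.output(LED_PIN, GPIO.LOW)   # light off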

Image of the setup from above:

[Image: the setup seen from above]

In the final iterations (the last three months) we experimented with light, or rather with the absence of light. Cress still grows without any sunlight, but it tastes different and is more yellowish than green.

Software

The Raspberry Pis pushed every photo and every sensor value to a REST API. The website was built with Python using the Django web framework.
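
The upload from the Pis could have looked like this minimal sketch using requests; the endpoint URLs and payload fields are assumptions:

import requests

API = "https://cress.space/api"  # hypothetical base URL

def push_measurement(box, sensor, value):
    requests.post(f"{API}/measurements/",
                  json={"box": box, "sensor": sensor, "value": value},
                  timeout=10)

def push_photo(box, path):
    with open(path, "rb") as fh:
        requests.post(f"{API}/photos/",
                      data={"box": box}, files={"image": fh},
                      timeout=10)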

The cress.space website showed one image per day of the growing plants. Here are the last days of a cycle growing white clover:

[Image: the last days of a white clover cycle]

The code of the website is archived at https://github.com/aerospaceresearch/cress-website.

On the Raspberry Pis most of the code was written in either Bash or Python.

Machine Learning

For MRMCD 2017 we trained a machine learning model to predict, based on the camera images, whether we should water the plants. The system was a binary classifier built from CNN layers and trained on the images we took. We used TensorFlow when it was still pretty new.
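
A minimal sketch of such a binary CNN classifier, written with today's tf.keras API rather than the TensorFlow version we used back then; the input size and layer sizes are assumptions:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 3)),      # assumed image size
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # water / don't water
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])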

The talk at MRMCD (in German):

[Video: MRMCD 2017 talk]

The data isn't online anymore. If you want to experiment with it, send me an email.

Statistics

About 230 growing cycles with cress, phacelia or white clover.
The Raspberry Pi cameras shot over 900,000 photos, most of them of the growing area inside the boxes.

All our growing cycles, with the plant used, a success score and some statistics, are listed in https://raw.githubusercontent.com/mfa/cress-classify/master/experiments.org.

Conclusion

In those 2.5 years we saw a lot of things fail: too much water, no water, mold, sensor failures and a fair amount of human error.
But we learned a lot about growing plants, Raspberry Pis, water and electronics.

Meinsack with Datasette

At the end of 2015 we built a Django-based website to generate a calendar of our local recycling dates. We used Django, Django REST Framework and PostgreSQL because this was our hammer (and still is for a lot of things). Every year I imported the dates for the new year, but I never updated the core code of the project. Since then the router internals of Django REST Framework have changed, and we would have to rewrite our code to use a newer version. Instead I decided to use Datasette.

Datasette fits this use case perfectly:

  • the API is read-only

  • the database is small and can be shipped as a file

  • iCal is just a custom renderer (see the sketch after this list)

  • the site can be hosted on Google Cloud Run without hassle
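
A minimal sketch of such an iCal renderer using Datasette's register_output_renderer hook; the column names (date, type) are assumptions, the real renderer lives in the project:

from datasette import hookimpl

def render_ical(rows):
    # "date" and "type" are assumed column names of the joined view
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0"]
    for row in rows:
        lines += [
            "BEGIN:VEVENT",
            "DTSTART;VALUE=DATE:" + str(row["date"]).replace("-", ""),
            "SUMMARY:" + str(row["type"]),
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    return {"body": "\r\n".join(lines),
            "content_type": "text/calendar; charset=utf-8"}

@hookimpl
def register_output_renderer(datasette):
    return {"extension": "ics", "render": render_ical}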

Migration

For the migration I exported the PostgreSQL database using db-to-sqlite. Then I removed all tables not belonging to the main app and removed the main_ prefix from all remaining tables. To have one view with all necessary data joined, I added a database view (see readme.md).
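
The cleanup after db-to-sqlite could look like this sketch; the database file name is hypothetical and the actual view definition is in the readme:

import sqlite3

conn = sqlite3.connect("meinsack.db")  # hypothetical file name
tables = [name for (name,) in conn.execute(
    "select name from sqlite_master where type='table'")]
for name in tables:
    if name.startswith("sqlite_"):
        continue  # skip internal SQLite tables
    if name.startswith("main_"):
        # strip the Django app prefix
        conn.execute(f"ALTER TABLE {name} RENAME TO {name[len('main_'):]}")
    else:
        # drop tables of all other apps (auth, sessions, migrations, ...)
        conn.execute(f"DROP TABLE {name}")
conn.commit()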

One important part of the migration was that the URLs had to stay the same. This is achieved by redirect rules in the form of a Datasette plugin. The plugin also contains an additional renderer that provides the custom JSON export format used by the old API. The code of the plugin: old_api_compatibility.py
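
A sketch of one such redirect rule, using Datasette's register_routes hook; the old URL pattern and the new path are assumptions:

from datasette import hookimpl
from datasette.utils.asgi import Response

async def old_ical(request):
    # redirect an old API url to its new Datasette counterpart
    street = request.url_vars["street"]
    return Response.redirect(f"/meinsack/ical/{street}.ics", status=301)

@hookimpl
def register_routes():
    return [
        (r"^/api/v1/ical/(?P<street>[^/]+)/$", old_ical),
    ]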

The new frontend is based on Bulma CSS and doesn't use any jQuery. For autocompletion I used an HTML5 datalist with a bit of JavaScript to autofill it.

To keep my sanity for the next updates I added some tests to ensure the important endpoints stay the same. These tests are run via GitHub Actions on every push.
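
A sketch of such a test, using Datasette's built-in client with pytest and pytest-asyncio; the database file name and the endpoint path are assumptions:

import pytest
from datasette.app import Datasette

@pytest.mark.asyncio
async def test_endpoint_still_works():
    datasette = Datasette(["meinsack.db"])  # hypothetical file name
    response = await datasette.client.get("/meinsack.json")
    assert response.status_code == 200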

Running everything on Google Cloud Run takes only the publish command shown in the readme. There is no server to keep updated or monitor anymore.

Next Steps

The data import has to be ported to the new codebase, something I need to do before the end of 2020.

A nice add-on would be to port the code that adds streets, districts and areas to the database. This is a prerequisite for adding more cities.