Next Waste Disposal in Stuttgart as JSON

The city of Stuttgart has a website that generates an iCal link for every address in the city, listing when waste is collected. For a dashboard I wanted the next dates for waste, paper and recycling collection. But the iCal generated by the city is not RFC 5545 compliant, and all Python libraries I tried failed to parse the file: ics, ical and ical-library.

So I had to parse the file myself. Additionally I wanted no dependencies, so Python's urllib is used. The code is not fault tolerant in any way and solves exactly my problem: return the next 3 dates waste is collected for my address.
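For context, the events in the file look roughly like this (a simplified, illustrative excerpt, not the city's exact output; only the DTSTART and SUMMARY lines matter for the parser):

BEGIN:VEVENT
DTSTART;VALUE=DATE:20230210
SUMMARY:Restmüll 02-wöchentl.
END:VEVENT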

Full Code:

import datetime
import json
import sys
from urllib import request

def parse(stream):
    # minimal parser for the city's non-compliant iCal file:
    # collect DTSTART and SUMMARY per VEVENT and yield them as dicts
    ds = {}
    for line in stream.split("\n"):
        if line.strip() == "BEGIN:VEVENT" and ds:
            # a new event starts, so the previous one is complete
            yield ds
            ds = {}
        if line.startswith("DTSTART"):
            # the date is after the last ":"; drop a possible time part after "T"
            ds["date"] = str(
                datetime.datetime.strptime(
                    line.split(":")[-1].split("T")[0], "%Y%m%d"
                ).date()
            )
        if line.startswith("SUMMARY"):
            ds["summary"] = line.split(":")[-1].strip()
    if ds:
        # don't lose the last event in the file
        yield ds


# the iCal URL for the address is passed as the first CLI argument
r = request.urlopen(request.Request(sys.argv[1]))

# the summaries of the three collection types I care about
keys = ["Restmüll 02-wöchentl.", "Altpapier 03-wöchentl.", "Gelber Sack 03-wöchentl."]
result = []
for item in parse(r.read().decode("utf-8")):
    # keep only the first (= next) occurrence of each collection type
    if item.get("summary") in keys:
        result.append(item)
        keys.remove(item.get("summary"))

print(json.dumps(result, indent=2, ensure_ascii=False))

Example call:

python run.py https://service.stuttgart.de/lhs-services/aws/api/ical\?street\=Rathausplatz\&streetnr\=1

Results in:

[
  {
    "summary": "Restmüll 02-wöchentl.",
    "date": "2023-02-10"
  },
  {
    "summary": "Altpapier 03-wöchentl.",
    "date": "2023-02-22"
  },
  {
    "summary": "Gelber Sack 03-wöchentl.",
    "date": "2023-02-10"
  }
]

Explore Steampipe

A few weeks ago I stumbled upon steampipe, a SQL interface to APIs. Steampipe uses a PostgreSQL FDW (foreign data wrapper) in the background to do the actual queries. A lot of APIs are supported, e.g. AWS, Fly.io, Hetzner and many more. The three "APIs" I wanted to test are CSV, RSS and Mastodon.

Install steampipe

To install steampipe on Arch Linux I used yay -S steampipe-bin. Then I ran steampipe service start and it failed with:

Error: failed to connect to the database: connection setup failed:
failed to connect to `host=localhost user=root database=postgres`:
hostname resolving error (lookup localhost on 8.8.8.8:53: no such host) -
please try again or reset your steampipe database

To fix this I had to add 127.0.0.1 localhost to /etc/hosts on my machine.
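The added line is standard /etc/hosts syntax:

# /etc/hosts
127.0.0.1   localhost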

First one: Mastodon

I'm running a gotosocial instance on fly.io which should be compatible with the APIs used by the steampipe plugin. So let's try it.

The first step is to clone the repository and follow the quickstart in the readme. For the access_token I used https://takahashim.github.io/mastodon-access-token/ again and set the scope to read.

My .steampipe/config/mastodon.spc looks like this:

connection "fedi_social" {
  plugin = "mastodon"
  server = "https://fedi.cress.space"
  access_token = "add-the-access-token-here"
}

My server is not a Mastodon instance but a gotosocial instance, so I had to look into the database to find a valid id.

Some examples (run inside steampipe query):
> select id, content from mastodon_toot where timeline = 'home' order by updated_at limit 3;
+----------------------------+--------------------------+
| id                         | content                  |
+----------------------------+--------------------------+
| 01GSFQBKT7EE3ZACGW3BNJMTH1 |  Es ist jetzt 14:00 Uhr. |
| 01GSFTSFKJQGP77HG3XTVMSRA4 |  Es ist jetzt 15:00 Uhr. |
| 01GSFKXQY8G03N7BECBC2EM5WV |  Es ist jetzt 13:00 Uhr. |
+----------------------------+--------------------------+

> select id, acct, username, url from mastodon_account where id='01KP91G6P5N61GT2NH8Q0CTKR6';
+----------------------------+------------+------------+--------------------------------------+
| id                         | acct       | username   | url                                  |
+----------------------------+------------+------------+--------------------------------------+
| 01KP91G6P5N61GT2NH8Q0CTKR6 | stuendlich | stuendlich | https://fedi.cress.space/@stuendlich |
+----------------------------+------------+------------+--------------------------------------+

> select id, acct, username, server from mastodon_followers
+----------------------------+------------------+----------+--------------+
| id                         | acct             | username | server       |
+----------------------------+------------------+----------+--------------+
| 01FYWZ0V6EN8T4J0HJMASNA9G8 | mfa@chaos.social | mfa      | chaos.social |
+----------------------------+------------------+----------+--------------+

So the plugin seems to work with gotosocial, when used correctly.

Note: I updated the examples based on a message from @judell. Thanks!

Second one: RSS

First install the plugin with: steampipe plugin install rss.

Example using my blog's feed (run inside steampipe query):

> SELECT title, categories, published FROM rss_item WHERE feed_link='https://madflex.de/rss.xml' ORDER BY published DESC LIMIT 3
+---------------------------------------+-----------------------------------+---------------------------+
| title                                 | categories                        | published                 |
+---------------------------------------+-----------------------------------+---------------------------+
| Use supabase as simple store          | ["ax","flask","flyio","supabase"] | 2023-01-22T18:00:00+01:00 |
| Meinsack.click 2023 update            | ["datasette","ical"]              | 2023-01-15T18:00:00+01:00 |
| Using password-store with fzf and zsh | ["password-store"]                | 2023-01-06T15:00:00+01:00 |
+---------------------------------------+-----------------------------------+---------------------------+

Now let's inspect the published field a bit; all of this works because it is full PostgreSQL:

> SELECT published::TIME FROM rss_item WHERE feed_link='https://madflex.de/rss.xml' ORDER BY published DESC LIMIT 1
+-----------+
| published |
+-----------+
| 18:00:00  |
+-----------+

> SELECT TO_CHAR(published, 'YYYY-MM-DD') AS date FROM rss_item WHERE feed_link='https://madflex.de/rss.xml' ORDER BY published DESC LIMIT 1
+------------+
| date       |
+------------+
| 2023-01-22 |
+------------+

Even some math with dates is possible:

> SELECT (CURRENT_DATE - published::DATE) AS time_since_last_post FROM rss_item
  WHERE feed_link='https://madflex.de/rss.xml' ORDER BY published DESC LIMIT 1
+----------------------+
| time_since_last_post |
+----------------------+
| 7                    |
+----------------------+

Third one: CSV

I am tracking when I apply eye drops using a CLI script. The data is stored as CSV to keep it simple.

So again, first install the plugin: steampipe plugin install csv. The CSV files are searched for in the directory where the steampipe service was started, not where you run steampipe query! This can be fixed by adding full paths to ~/.steampipe/config/csv.spc.
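A minimal csv.spc for this could look like the following sketch (the directory is just an example, adjust it to wherever your CSV files live):

connection "csv" {
  plugin = "csv"
  # example path, adjust to your setup
  paths  = ["/home/mfa/data/*.csv"]
}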

As preparation I added the CSV as augentropfen.csv to the current folder (where I started the steampipe service).
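For reference, the file looks roughly like this (header plus one row, reconstructed from the query output below):

date,time,tz,dayname,date_rolling
2023-01-29,00:32,+0100,Sun,2023-01-28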

Now query the csv inside steampipe query:

> SELECT * FROM augentropfen ORDER BY date DESC, time DESC LIMIT 3
+------------+-------+-------+---------+--------------+---------------------------+
| date       | time  | tz    | dayname | date_rolling | _ctx                      |
+------------+-------+-------+---------+--------------+---------------------------+
| 2023-01-29 | 00:32 | +0100 | Sun     | 2023-01-28   | {"connection_name":"csv"} |
| 2023-01-28 | 20:16 | +0100 | Sat     | 2023-01-28   | {"connection_name":"csv"} |
| 2023-01-28 | 09:48 | +0100 | Sat     | 2023-01-28   | {"connection_name":"csv"} |
+------------+-------+-------+---------+--------------+---------------------------+

Next I wanted to run the same query I did in SQLite in a previous blogpost using datasette. The SQL dialects of SQLite and PostgreSQL are a bit different: there is no strftime, but date_part and to_char instead, e.g.:

# sqlite
SELECT strftime('%W', date_rolling) AS week, strftime('%Y-%W', date_rolling) AS year_week FROM data

# postgresql
SELECT date_part('week', date_rolling::date) AS week, to_char(date_rolling::date, 'YYYY-WW') AS year_week FROM augentropfen

Now recreate the essential part of the histogram:

> SELECT COUNT(*) as drops_applied, to_char(date_rolling::date, 'YYYY-WW') AS year_week
  FROM augentropfen GROUP BY year_week ORDER BY year_week DESC LIMIT 7
+---------------+-----------+
| drops_applied | year_week |
+---------------+-----------+
| 14            | 2023-04   |
| 14            | 2023-03   |
| 13            | 2023-02   |
| 12            | 2023-01   |
| 2             | 2022-53   |
| 14            | 2022-52   |
| 12            | 2022-51   |
+---------------+-----------+

Conclusion

Using SQL to analyse different data sources is very cool. I may revisit steampipe with new use cases.

Use supabase as simple store

I wanted to experiment a bit with supabase, a serverless API backed by Postgres. Supabase has more features that are not used here, like Auth, Edge Functions and Postgres extensions. As a use case I chose to save webhooks with text generated by the AX Platform.

For simplicity I created the table structure by clicking through the supabase frontend. On the code side I used technology I have used before: Flask on fly.io.

The full code is in this github repository: https://github.com/mfa/ax-supabase-fly. The focus here is on the supabase part, which uses supabase-py.

The Flask code checks whether the signature in the header is valid and matches the body of the webhook (not shown here). After this check the dataset is saved in the table "generated" with this code:

import os

from supabase import Client, create_client

# URL and secret key from the supabase project's API settings
url = os.environ.get("SUPABASE_URL")
key = os.environ.get("SUPABASE_KEY")
supabase = create_client(url, key)

# insert the webhook payload into the "generated" table
supabase.table("generated").insert(dataset).execute()

To use supabase, the URL and the secret key are needed. Both can be found in the API settings of your supabase project. supabase-py simplifies the handling of the datastore a lot but keeps the possibility to use PostgreSQL features in the future.

The saved datasets can later be used by another service to show and use the text somewhere.
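For example, another service could read the stored rows back with the same supabase-py client. A minimal sketch (reusing the client setup from above; selecting everything and limiting the result is just for illustration):

# fetch some of the stored webhook payloads from the "generated" table
response = supabase.table("generated").select("*").limit(10).execute()
for row in response.data:
    print(row)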