Strava: update gear using the Strava API

Andreas

2019-12-22 19:00

Introduction

I started using Pro bike garage to track the parts of my bike. But the app can only match activities for which the gear is set. I only started using the gear feature in Strava when I began riding a second bike, so all my activities before that (December 2018 in my case) have no gear set.

This blog post describes how I updated all past activities and set the gear_id for them.

The code here uses Python 3.6+ and requests.

You need your own Strava API application. For more information on creating an app see http://developers.strava.com/; your CLIENT_ID and CLIENT_SECRET are listed on your API page at https://www.strava.com/settings/api.

Get an access_token

Strava's Getting Started guide is pretty helpful here: http://developers.strava.com/docs/getting-started/.

The scope is set to read and write activities. Private activities are not included (this would need activity:read_all).

The same steps as in the guide but using Python:

import requests

params = {
    "client_id": <CLIENT_ID>,
    "response_type": "code",
    "redirect_uri": "http://localhost",
    "approval_prompt": "force",
    "scope": 'read,activity:read,activity:write',
}

response = requests.get("https://www.strava.com/oauth/authorize", params=params, allow_redirects=False)
# print the full authorize URL including the query string
print(response.url)

Now copy the URL into your browser and authorize the app. The redirect goes to localhost, where no server is running, so the page fails to load, but the code in the URL shown in the address bar is all we need.
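
Copying the code out of the address bar by hand works, but it can also be pulled out with the standard library. A minimal sketch, assuming the redirect URL carries the value in a code query parameter as described above; extract_code is a hypothetical helper and the URL below is a made-up example:

```python
from urllib.parse import parse_qs, urlparse


def extract_code(redirect_url):
    """Return the value of the `code` query parameter of the redirect URL."""
    query = parse_qs(urlparse(redirect_url).query)
    if "code" not in query:
        raise ValueError("no 'code' parameter found in the URL")
    return query["code"][0]


# made-up example URL; paste the one from your browser instead
example_url = "http://localhost/?state=&code=abc123&scope=read,activity:read,activity:write"
code = extract_code(example_url)
print(code)
```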

The next step is to exchange the code from that URL for an access_token:

data = {
   "client_id": <CLIENT_ID>,
   "client_secret": "<CLIENT_SECRET>",
   "code": "<CODE>",
   "grant_type": "authorization_code",
}

response = requests.post(url="https://www.strava.com/oauth/token",
                         data=data)
access_token = response.json()["access_token"]

All following API calls use this access_token. The token expires, currently after what seems to be a day, which is enough time to update all the activities we want to update.
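
To avoid guessing when the token runs out, the expiry can be read from the token response. A sketch, assuming the JSON returned by the token request contains an expires_at field holding a Unix timestamp (this is how I read the Strava documentation, so check your own response); token_is_expired is a hypothetical helper and the dictionary is invented data:

```python
import time


def token_is_expired(token_response, now=None, margin=60):
    """True if the token expires within `margin` seconds from `now`."""
    if now is None:
        now = time.time()
    return token_response["expires_at"] - margin <= now


# invented example response, not real data
example = {"access_token": "abc", "expires_at": 1577000000}
print(token_is_expired(example, now=1576990000))  # checked well before expiry
print(token_is_expired(example, now=1577000000))  # checked at the expiry time
```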

Update the gear of selected activities

All API calls will use the access_token in their header, so set a header:

headers = {
   "Authorization": f"Bearer {access_token}"
}

To get an ACTIVITY_ID, browse your Strava activities. We need one activity with the gear already set and one without.

Now get one activity with the gear already set to get the gear_id.

activity = requests.get(url="https://www.strava.com/api/v3/activities/<ACTIVITY_ID>",
                        headers=headers)

gear_id = activity.json().get("gear_id")

To test that everything works, let us first update the gear of a single activity that has no gear set.

activity = requests.get(url="https://www.strava.com/api/v3/activities/<ACTIVITY_ID>",
                        headers=headers)

# should return None
print(activity.json().get("gear_id"))

# update only the gear of that activity
response = requests.put(url=f"https://www.strava.com/api/v3/activities/{activity.json()['id']}",
                        data={"gear_id": gear_id},
                        headers=headers)

# should return the expected gear_id
print(response.json().get("gear_id"))

Update the gear of all old activities without gear set

We loop over all activities of the user and update the bicycle rides that have no gear. I have about 2200 rides and some other activities, so I set the loop range to 24 pages of data.

for page in range(1, 25):
    print(f"-- page: {page}")
    params = {"per_page": 100, "page": page}
    result = requests.get(url="https://www.strava.com/api/v3/athlete/activities",
                          params=params, headers=headers)
    for item in result.json():
        if item.get("type") == "Ride" and item.get("gear_id") is None:
            print(f"update {item.get('id')}")
            activity = requests.put(url=f"https://www.strava.com/api/v3/activities/{item.get('id')}",
                                    data={"gear_id": gear_id},
                                    headers=headers)
            print(activity.json().get("gear_id"))

The outer loop iterates over the pages of activities we want to update (100 per page).
The inner loop iterates over the activities of the athlete on that page.
Each activity is checked if no gear is set and the activity is a Ride.
If that is the case the gear_id is set for that activity.
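
Instead of hard-coding the number of pages, the loop can stop at the first empty page. A sketch, assuming the endpoint returns an empty list once the page number is past the last activity; fetch_page is a hypothetical callable standing in for the requests.get(...).json() call above, and the fake data only exists to make the example runnable offline:

```python
from itertools import count


def iter_activities(fetch_page, per_page=100):
    """Yield activities page by page until a page comes back empty."""
    for page in count(1):
        items = fetch_page(page=page, per_page=per_page)
        if not items:
            return
        yield from items


def rides_without_gear(activities):
    """Keep only rides that have no gear set."""
    return [a for a in activities
            if a.get("type") == "Ride" and a.get("gear_id") is None]


# fake data in place of the real API
fake_pages = {
    1: [{"id": 1, "type": "Ride", "gear_id": None},
        {"id": 2, "type": "Run", "gear_id": None}],
    2: [{"id": 3, "type": "Ride", "gear_id": "b123"}],
}


def fake_fetch(page, per_page):
    """Return the fake page, or an empty list past the last page."""
    return fake_pages.get(page, [])


todo = rides_without_gear(iter_activities(fake_fetch))
print([a["id"] for a in todo])
```

With the real API, fetch_page would wrap the GET request on athlete/activities and return its decoded JSON.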

The script ran for quite some time but was still too fast for the rate limit (600 requests per 15 minutes). When the rate limit was hit, I waited 15 minutes and started the for loop again.
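
The manual wait can be automated. A sketch, assuming the API answers with HTTP status 429 when the limit is hit and that a 15 minute pause is enough; send is a hypothetical callable that performs one request and returns an object with a status_code attribute (such as a requests response), and sleep is injectable so the example runs without actually waiting:

```python
import time


def call_with_retry(send, wait_seconds=15 * 60, max_tries=5, sleep=time.sleep):
    """Call send() until the response is not a 429 or max_tries is reached."""
    for attempt in range(max_tries):
        response = send()
        if response.status_code != 429:
            return response
        # rate limit hit: wait before trying again
        sleep(wait_seconds)
    raise RuntimeError("still rate limited after all retries")


# offline demonstration with a fake response object
class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code


answers = iter([FakeResponse(429), FakeResponse(429), FakeResponse(200)])
waits = []
result = call_with_retry(lambda: next(answers), sleep=waits.append)
print(result.status_code, waits)
```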

The waiting time for the updates to finish was long enough to write this post.

Parallel bzip2: lbzip2

Andreas

2019-10-27 12:00

A follow-up to a previous post: Splitting a big file with split.

If the speed is limited by the bzip2 process, the compression can be parallelized with lbzip2.

Lbzip2 works as a drop-in replacement for bzip2. All distributions should have a package for lbzip2.
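
Because the command line is the same, a script can pick lbzip2 when it is installed and fall back to bzip2 otherwise. A sketch in Python, assuming both tools accept -k (keep the input file); compress_command is a hypothetical helper and the file name is made up:

```python
import shutil


def compress_command(path):
    """Build the argument list, preferring lbzip2 over bzip2."""
    tool = "lbzip2" if shutil.which("lbzip2") else "bzip2"
    return [tool, "-k", path]


cmd = compress_command("big_file.part_aa")
print(cmd)
# to actually run it: subprocess.run(cmd, check=True)
```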

Django: run save on all elements in a table

Andreas

2019-10-17 14:00

To speed up a slow model I added cache fields to my Django model that are updated on save. New records now have the cached fields filled, but I needed to update the old ones too. The table is pretty big, so I wanted a progress bar (as always, tqdm).

The second problem is that Django's model save method returns None, and for a lot of elements collecting the results gives a pretty big list of None values. The Python standard library module collections to the rescue: a deque created with maxlen=0 consumes the iterator without storing any of the elements.

The code I ran in my shell_plus:

import collections
from tqdm import tqdm

iterator = map(lambda x: x.save(), MyModel.objects.all())
with tqdm(iterator, total=MyModel.objects.count(), ascii=True) as pbar:
    collections.deque(pbar, maxlen=0)
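
The deque trick can be tried without Django. A small sketch with a plain range in place of the queryset; fake_save is a made-up stand-in for the model's save method:

```python
import collections

saved = []


def fake_save(x):
    """Stand-in for Model.save(): has a side effect and returns None."""
    saved.append(x)


iterator = map(fake_save, range(5))
# maxlen=0 makes the deque drop every item right after consuming it
sink = collections.deque(iterator, maxlen=0)

print(len(sink))  # number of items kept in the deque
print(saved)      # elements that were processed
```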

For really big tables, the initial creation of the iterator will take quite some time too!

To save memory on really large tables .iterator() helps:

import collections
from tqdm import tqdm

qs = MyModel.objects.all()
iterator = map(lambda x: x.save(), qs.iterator())
with tqdm(iterator, total=qs.count(), ascii=True) as pbar:
    collections.deque(pbar, maxlen=0)