Wednesday 6 February 2013

Strava - Excel Download and Analysis

Update4 Jab 2014: Please see here: Strava API 1 withdrawn

Last updated 8 Feb 2013

Here is the 1st release of my Excel based Strava download tool. It works for me but it must be accepted that it is very much a Beta release. There are 2 parts to the solution - the first part is Excel file itself and the second part is the file "wget.exe".

wget.exe is a free tool to get all or part of a web page. More information can be got from here: wget software. As its says:

GNU Wget is a free software package for retrieving files using HTTP, HTTPS and FTP, the most widely-used Internet protocols. It is a non-interactive commandline tool, so it may easily be called from scripts, cron jobs, terminals without X-Windows support, etc.
GNU Wget has many features to make retrieving large files or mirroring entire web or FTP sites easy, including:
  • Can resume aborted downloads, using REST and RANGE
  • Can use filename wild cards and recursively mirror directories
  • NLS-based message files for many different languages
  • Optionally converts absolute links in downloaded documents to relative, so that downloaded documents may link to each other locally
  • Runs on most UNIX-like operating systems as well as Microsoft Windows
  • Supports HTTP proxies
  • Supports HTTP cookies
  • Supports persistent HTTP connections
  • Unattended / background operation
  • Uses local file timestamps to determine whether documents need to be re-downloaded when mirroring
Any way - you don't need to know the technical details, just follow the instructions below. 

Download it from here: http://sourceforge.net/projects/gnuwin32/files/wget/1.11.4-1/wget-1.11.4-1-setup.exe/download

Create a folder where you want to run the Excel file from. I have used:
"\Documents\Strava API Release" but the choice is yours.


Copy the Excel file from here: https://docs.google.com/file/d/0B3SjSCZb92mybVF3VEVhNHFYVjQ/edit?usp=sharing.

Note that this link opens the Excel file as a Google Doc as follows:




Press the "down arrow" button (circled in red) to download the file into Excel on your PC. Save it to the location you want (see above) and also copy in wget.exe to the same folder. These are the only 2 files you need.

Open the  Excel file. Make sure that you allow macros to run:




Hopefully you will see something like this:



You only need to enter your Strava Athlete ID in the yellow box and then press the button "1 - Get All Ride Details". You can add your name in the cell above if you want - it's not currently used.


Then wait!

The macros will run. It works by generating a set of batch files that call wget. You will see command windows open and close as the data is fetched from Strava and then processed. It's not pretty. wget returns a text files and these are processed by the macro to extract the data which is loaded into the sheet.

You'll see something like this - once for each block of 50 rides and then once for each segment:





It takes several minutes to run on my laptop to retrieve 320+ rides.

Firstly top level general ride data is retrieved and added to columns B to E in sheet "Summary". Further data for each of these rides is then added ride by ride on the same sheet to columns F to N and this reports data such as distance, average speed, height gain etc for each ride

Secondly having all the ride data, it then retrieves all the segment data for each ride. This is written to sheet "Efforts". This sheet has filters applied to the columns. This allows you to create a filter for each segment.

I have also added 2 graphs - one is "Avg Speed vs Distance" - this is just a scatter plot for average speed against distance for all rides, and "Avg vs Date" which plots average speed vs date. Can you see an improvement as time goes by or is it too depressing to look at!

Each time it runs it creates a lot of files in the folder: these are the batch files it uses to run wget, and the text files created by wget each time it is run. These may be deleted if required.

Further developments:
  • More configuration options (easy)
  • Check compatibility with Excel 2007 and 2010 (possible)
  • Select rides in a date range (easy)
  • Provide update option - only get new rides since last update (easy)
  • Delete files at end of update (easy) 
  • Error trappping (easy)
  • Better performance (possibly easy)
  • Automated analysis (easy)
  • Remove Need for wget (hard) 
  • Comment and improve the VBA code quality

These might be easy but is more about time than programming knowledge.