So in the last few weeks, ConnectStats had a lot of issues related to online services. That was quite a learning experience. I hope most of it is behind me, but there could be a few leftovers.
Strava made a change in its API related to how it authenticates users for access. This was a documented and announced change, unlike a lot of the Garmin API changes in the past. The issue is mostly that I hadn’t understood the change, so I didn’t get the right fix out in time. So I have only myself to blame…
Because I don’t use Strava in ConnectStats much myself except for testing, I had not noticed that my initial fix for the API change only worked after a fresh login. In my tests, I was always logging in from scratch and it was working. But many users who had already logged in before the change and my fix ended up stuck in a failure state. A workaround was to go to the config page and force a logout in the Strava settings; after the next login things should work again.
I have also now pushed a new version that will force a logout/login cycle when it is in the bad state, and I now have a much better understanding of the OAuth 2.0 authorisation protocol…
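To illustrate the kind of check involved, here is a minimal sketch in Python. It is not the actual ConnectStats code. It assumes the bad state can be recognised by stored credentials that have an access token but no refresh token or expiry time, as would be the case for credentials saved before the change. The names `StoredCredentials` and `next_auth_step` are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredCredentials:
    """Hypothetical shape of what the app keeps after a login."""
    access_token: Optional[str]
    refresh_token: Optional[str] = None
    expires_at: Optional[float] = None  # Unix time, in seconds


def next_auth_step(creds: Optional[StoredCredentials],
                   now: Optional[float] = None,
                   margin: float = 60.0) -> str:
    """Decide what to do before calling the API with stored credentials."""
    if now is None:
        now = time.time()
    # Nothing stored at all: a normal first login.
    if creds is None or not creds.access_token:
        return "login"
    # Assumed "bad state": a token saved before the change, with nothing
    # to refresh it with. Only a logout/login cycle can recover from this.
    if not creds.refresh_token or creds.expires_at is None:
        return "force_relogin"
    # Token expired or about to expire: exchange the refresh token.
    if creds.expires_at - now <= margin:
        return "refresh"
    return "use_token"
```

Testing only with a login from scratch exercises the `"login"` and `"use_token"` paths, which is why the `"force_relogin"` case is easy to miss.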
So after the previous set of issues, the new service had managed to be stable for new users for a bit over a week. Everything was running smoothly, so I decided to extend the rollout to existing users as well.
This worked well for a week: I added close to 2000 users, which is not bad. But that’s when I suddenly got a notification from my website service provider that my account was suspended because it was overloading the database resources…
So I called my service provider and I upgraded my service to the next tier.
This unfortunately didn’t work very well. I noticed that I had somehow received a huge surge of activities at once. I still don’t understand why that happened and have reported it to Garmin for investigation, but the result was that even the new server was overwhelmed.
So I did some scary “on the fly” surgery on the database and deleted what appeared to be the spurious activities until everything came back to normal.
In the process I noticed that the new server, while it had fewer constraints on resources, was actually processing information much slower than the old one. This caused further issues because I had implemented a timeout for processing, which was constantly being hit and therefore made the situation worse (after a timeout it would kill the process and restart it from the beginning).
So I reached out to my website hosting provider, who advised me to purchase the next tier of server up again, which I did.
Things got better: the spurious activities stopped coming and the server was happy processing requests.
That said, I noticed from the app that downloading activity files to ConnectStats was still noticeably slower. But at that point I clearly had a better server, and there was no way I needed more power. So after some research on MySQL optimisation, I found a simple solution, which I implemented, and now everything seems to be happy and rolling.
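A small simulation shows why a timeout that restarts from the beginning makes things worse on a slow server. This is only a sketch with simulated time, not the real server code. `process_batch`, `runs_until_done`, and the cost and budget numbers are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def process_batch(items: Sequence, start: int, budget: float,
                  cost: Callable[[object], float]) -> int:
    """Process items from index `start` until the time budget runs out.

    Returns the index of the first item that was NOT processed.
    """
    spent = 0.0
    i = start
    while i < len(items) and spent + cost(items[i]) <= budget:
        spent += cost(items[i])
        i += 1
    return i


def runs_until_done(items: Sequence, budget: float,
                    cost: Callable[[object], float],
                    resume: bool, max_runs: int = 100) -> Optional[int]:
    """Count how many timed runs it takes to get through all items.

    With resume=False every run restarts from index 0, as in the post.
    Returns None if the work never finishes within max_runs.
    """
    start = 0
    for run in range(1, max_runs + 1):
        reached = process_batch(items, start, budget, cost)
        if reached >= len(items):
            return run
        start = reached if resume else 0
    return None
```

With 10 items costing 1 time unit each and a budget of 4 units per run, `runs_until_done(list(range(10)), 4.0, lambda _: 1.0, resume=False)` returns `None` because every run redoes the same first four items, while the same call with `resume=True` returns `3`.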
I have also now learned a lot about building a scalable online service from scratch…
Another aspect that is hard for me to test is the use of ConnectStats with Virtual Rides or Runs from Zwift. They are clearly becoming more and more popular, as I have had quite a few users reach out to highlight issues with them.
Virtual activities broke when using the new service or Strava as a service in the app. So I have pushed a new version of ConnectStats that attempts to fix this. As with any feature I don’t use myself, it’s a bit tricky to fix, as I can’t test it. I have to iterate blindly: push the fix and see from other users whether it worked… Hopefully the latest release will fix all remaining issues, but if it doesn’t, I appreciate Zwift users’ patience and feedback until I get it right…