Feedburner’s URL switch

[Update on the morning after (Dec 4, 2014): Feedburner has now rolled this change back. Nevertheless, it remains a valid thing for any feed to do, if unusual, so podcast apps should be prepared to handle it.]

Today I noticed Tim Pritlove’s tweet about an important Feedburner change in how episode URLs are published, which has caused playback errors in several podcast clients. Here’s an overview, with technical detail to follow.

All you really need to know is that Player FM is back to normal now.

Feedburner’s URL switch

It’s still not clear when they made the switch, but at some point recently, Feedburner started publishing scheme-less URLs (also known as protocol-relative URLs). So instead of http://example.com, it would just be //example.com. Scheme-less URLs are perfectly valid for links on a web page, and simply mean “use the same scheme (e.g. http or https) as the current page”. Here’s an example: //google.com. I linked to //google.com, which the browser interprets as http://google.com since this blog is on an http URL. If you were instead reading this on Player FM’s website, which is now all SSL, the same link would go to https://google.com.
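To make the resolution rule concrete, here is a minimal sketch using Ruby’s standard URI library. The helper name resolve_link and the blog.example.com page URL are made up for illustration; only URI.join comes from the standard library.

```ruby
require "uri"

# Resolve a link against the page it appears on, the way a browser would.
# A scheme-less link keeps its own host but inherits the page's scheme.
def resolve_link(page_url, link)
  URI.join(page_url, link).to_s
end

puts resolve_link("http://blog.example.com/post", "//google.com")  # http://google.com
puts resolve_link("https://player.fm/", "//google.com")            # https://google.com
```

The same link resolves differently depending only on the scheme of the page that contains it.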

It’s possible other feeds use this standard too, but I’ve never come across it.

This works less well with feeds because many feed parsers don’t know about this standard. So they just save the URL as “//example.com”. Then when the episode is later downloaded or played by an app, the app doesn’t know about the standard either, and even if it did, it might be detached from the original feed URL. So the app tries to download or play a nonsense URL, and frustration ensues.
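An app that has lost track of the feed URL can at least detect the problem before handing the URL to a downloader or player. The following is only a sketch: scheme_less? and playable_url are hypothetical names, and falling back to plain http is an assumption, not something any particular app is known to do.

```ruby
require "uri"

# True when the URL names a host but carries no scheme, e.g. "//example.com/ep.mp3".
def scheme_less?(url)
  uri = URI.parse(url)
  uri.scheme.nil? && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

# Hypothetical fallback: with no feed URL to inherit a scheme from,
# assume http (an arbitrary choice for this sketch).
def playable_url(url, fallback_scheme = "http")
  scheme_less?(url) ? "#{fallback_scheme}:#{url}" : url
end
```

Guessing the scheme is a last resort; resolving against the feed URL at parse time is more reliable because the scheme is known there.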

Impact on Player FM

Player FM is indexing approximately 9000 Feedburner URLs. Of those, about 3000 were affected by this, judging by their URLs.

The first step was to update TestData, the open-source project I use to publish configurable test feeds. I patched it to allow the scheme to be configurable. With that done, by passing in an empty media_scheme parameter, I could simulate Feedburner’s scheme-less URLs and get some test coverage for the subsequent fix.
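The idea of a configurable scheme can be sketched as below. This is not TestData’s actual code: the media_url helper and its signature are invented for illustration, and only the media_scheme parameter name comes from the description above.

```ruby
# Hypothetical helper, not taken from TestData: build an episode media URL
# whose scheme is configurable. An empty media_scheme yields a scheme-less URL.
def media_url(host_and_path, media_scheme: "http")
  prefix = media_scheme.to_s.empty? ? "" : "#{media_scheme}:"
  "#{prefix}//#{host_and_path}"
end

puts media_url("example.com/ep1.mp3")                    # http://example.com/ep1.mp3
puts media_url("example.com/ep1.mp3", media_scheme: "")  # //example.com/ep1.mp3
```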

For the fix, I considered forking Feedjira, the Ruby feed parser, to deal with scheme-less URLs, but in the interests of a quick fix, I instead opted to just post-process its feed parsing. So after it parses the feed, some code will translate any //example.com URLs to the proper URL based on the scheme of the feed they’re contained in.
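A post-processing step along those lines might look like the sketch below. It is not Player FM’s actual code: the Entry struct stands in for whatever the parser returns, and the enclosure_url field name and both helper names are assumptions.

```ruby
require "uri"

# Stand-in for a parsed feed entry; a real parser's objects will differ.
Entry = Struct.new(:title, :enclosure_url)

# Prefix a scheme-less URL with the scheme of the feed it came from.
# Anything else (absolute URLs, nil) is returned untouched.
def absolutize(url, feed_url)
  return url unless url.is_a?(String) && url.start_with?("//")
  "#{URI.parse(feed_url).scheme}:#{url}"
end

# Run after parsing: rewrite every entry's media URL in place.
def fix_scheme_less_urls!(entries, feed_url)
  entries.each { |e| e.enclosure_url = absolutize(e.enclosure_url, feed_url) }
end

entries = [Entry.new("Ep 1", "//example.com/ep1.mp3")]
fix_scheme_less_urls!(entries, "http://feeds.feedburner.com/some-show")
puts entries.first.enclosure_url  # http://example.com/ep1.mp3
```

Doing this at crawl time means apps downstream only ever see absolute URLs, so they never need the feed URL to play an episode.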

Once that was fixed, I ensured feeds were re-fetched via Sidekiq (the message processor, so it would happen quickly and in parallel). The fetches were queued up with priority given to the most subscribed feeds, so for almost all users, it would only take about 10 minutes for feeds to be back to normal. The only delay after that depends on phone settings, i.e. how long until an update occurs. For the website, the re-fetches busted caches so that web pages were immediately fine again (as Player FM is SSL and these episode URLs weren’t, those episodes wouldn’t play until the re-fetches happened).
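The prioritisation can be sketched in plain Ruby as follows. The Feed struct, its subscriber_count field and both method names are hypothetical; in a real app the enqueue callable would hand each id to a background job, for example through a Sidekiq worker’s perform_async.

```ruby
# Stand-in for a feed record; field names are assumptions.
Feed = Struct.new(:id, :subscriber_count)

# Most-subscribed feeds first, so the largest audiences are fixed soonest.
def refetch_order(feeds)
  feeds.sort_by { |feed| -feed.subscriber_count }.map(&:id)
end

# `enqueue` is any callable that schedules a re-fetch for one feed id.
def enqueue_refetches(feeds, enqueue)
  refetch_order(feeds).each { |id| enqueue.call(id) }
end

feeds = [Feed.new(1, 12), Feed.new(2, 950), Feed.new(3, 87)]
enqueue_refetches(feeds, ->(id) { puts "queue re-fetch for feed #{id}" })
```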

Although this was short notice, the good news is this URL format could theoretically be used by other feeds too. So the update today should help the feed crawler to be one notch more compatible with the universe of podcast feeds out there.