Feed Fetching Improvements

I’ve made several improvements to feed fetching lately, making podcasts update faster, more reliably, and more accurately. Read on for the details.

First and foremost, feeds are once again based on push notifications, meaning new episodes will typically enter Player FM’s database within seconds of content providers hitting the Publish button, or in the worst case a few minutes later, capped at 15 minutes. Player FM’s new user interface – soon landing with v2.0 – now supports pull-to-refresh, which pairs well with push: you will frequently find fresh fodder when episodes are coming in every few seconds.

Second, the “plan B” polling solution is now more efficient, so if the aforementioned push process breaks down, feeds will still be updated within about 2 hours.
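As a rough illustration of the fallback idea (not Player FM’s actual code), a poller only needs to pick up feeds that have gone a full interval without any update, whether that update came from push or from an earlier poll. The function name and the exact 2-hour interval here are assumptions for the sketch.

```python
from datetime import datetime, timedelta

# Assumed interval: the post only says "within about 2 hours".
POLL_INTERVAL = timedelta(hours=2)

def needs_poll(last_updated, now, interval=POLL_INTERVAL):
    """Return True when a feed has gone a full interval without an update.

    last_updated is the time of the last successful update from either
    push or polling, or None if the feed has never been fetched.
    """
    return last_updated is None or now - last_updated >= interval

now = datetime(2014, 1, 1, 12, 0)
stale = needs_poll(now - timedelta(hours=3), now)     # push has been silent
fresh = needs_poll(now - timedelta(minutes=10), now)  # push delivered recently
```

Feeds that push keeps fresh are skipped, so polling effort goes only to feeds where push has gone quiet.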

Third, if the publisher edits an already-published post, Player FM’s index will update with the new details. This was a pain point for some time, and at various times users mailed me about a missing episode or a title discrepancy. It took a while to fix mainly because there was a risk it would break caching, causing excessive bandwidth and battery use for mobile users. But it’s now been done, and efficiently.
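One common way to pick up edits without invalidating caches on every fetch is to fingerprint each episode and only touch the index when the fingerprint changes. This is a minimal sketch under that assumption; the field names and the in-memory dict standing in for the index are hypothetical, and the post does not say how Player FM actually does it.

```python
import hashlib

# Hypothetical set of fields a publisher might edit after publishing.
EDITABLE_FIELDS = ("title", "description", "url")

def fingerprint(episode):
    """Hash the editable fields so any edit produces a different value."""
    payload = "\x1f".join(str(episode.get(k, "")) for k in EDITABLE_FIELDS)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def apply_update(index, episode):
    """Store the episode if it is new or edited; return True if the index changed."""
    guid = episode["guid"]
    new_print = fingerprint(episode)
    stored = index.get(guid)
    if stored is not None and stored["fingerprint"] == new_print:
        return False  # unchanged, so cached copies stay valid
    index[guid] = {"episode": episode, "fingerprint": new_print}
    return True
```

Because unchanged episodes return early, clients only need to re-download when something was really edited.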

Finally, no more duplicate episodes. Occasionally there were race conditions which caused the same feed to be indexed twice. This in itself was very rare, but the problem was that if it did happen, the duplicate would never be cleared. Now it should not happen at all, but even if the universe conspires for it to happen, the duplicate will be gone upon the next fetch a few hours later.
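The self-healing part can be pictured as a cleanup pass that runs on every fetch. The sketch below assumes each episode carries a unique guid, which is a hypothetical field name here rather than a detail from Player FM’s schema.

```python
def remove_duplicates(episodes):
    """Keep the first episode seen for each guid, preserving order."""
    seen = set()
    unique = []
    for episode in episodes:
        guid = episode["guid"]
        if guid in seen:
            continue  # a duplicate left behind by an earlier race
        seen.add(guid)
        unique.append(episode)
    return unique

episodes = [
    {"guid": "ep-1", "title": "Pilot"},
    {"guid": "ep-2", "title": "Second"},
    {"guid": "ep-1", "title": "Pilot"},
]
cleaned = remove_duplicates(episodes)
```

Running this on every fetch means a duplicate that slips through survives only until the next one.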

How feed fetching got slower and then faster

As further technical info, the performance of feed fetching regressed a couple of weeks ago. After some investigation, I found a few causes:

  • Push notifications from Superfeedr had failed because I made the site full-TLS (aka SSL, ie everything is now served from https://player.fm/* and http://player.fm no longer works). This point warrants a separate post later on, but suffice to say, it broke push notifications. I did actually have Superfeedr set up to push to the (valid) https address, but due to what appears weirdly to be a core Ruby library bug, it was not used and the original, now defunct, http URL was used instead. The http URLs do redirect to https, but the client wasn’t following redirects (it would not be common to follow redirects from a POST request anyway).
  • The “plan B” had its own problems due to a library that wasn’t thread-safe. It was causing background jobs (ie feed fetching) to fail. Having pinpointed that library, I patched the code to use an alternative library instead and this is working smoothly now.
  • Lack of information. As a meta point, I didn’t have enough visibility into what was happening. I’ve now built some RESTful services to expose statistics and, furthermore, services to verify how many episodes are being generated. If that number falls short of a threshold in any given hour, I get notified.
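The hourly check in the last point could look something like the following. It is only a sketch: the function name, the threshold value, and the shape of the statistics (a mapping from hour to episode count) are assumptions, not the actual services.

```python
def hours_below_threshold(counts_by_hour, threshold):
    """Return the hours, in order, whose episode count fell short of the threshold."""
    return sorted(hour for hour, count in counts_by_hour.items() if count < threshold)

# Hypothetical numbers: episodes indexed in each hour of the day.
counts = {9: 120, 10: 4, 11: 95, 12: 0}
quiet_hours = hours_below_threshold(counts, threshold=50)
```

Each hour returned would trigger a notification, so a silent failure in either push or polling shows up within the hour.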