Feed quality improvements

After the recent upgrade (2 weekends ago), I’ve been able to start enhancing the server again. If I’ve done it right, you won’t notice the latest changes because they are more about Doing It Right than adding any new features.

If you’re interested in the specific changes:

  • Most importantly, old episodes are now updated when the publisher changes them. This sounds like it should have been easy, and it would have been if I had ignored efficiency. I held off because I was worried that small changes would bust caching and cause a lot of unnecessary re-fetching. That’s been addressed, so the whole process is efficient.
  • Changes to the main series data (such as author info) are now propagated immediately to the channels that contain the series. Before, they would only propagate when an episode update triggered it.
  • No more duplicate episodes! This bug happened when the same series was fetched twice, which was rare and wouldn’t have been much of a problem on its own. The main problem was that the old copies would never go away after that one rare occurrence. Now duplicates are purged at the end of each fetch.
  • Images are now drawn from sources other than the iTunes image declaration. Some feeds include images in a separate tag, similar to how a normal blog would provide them. Those are now indexed when the iTunes image isn’t included. Thanks to Lonely Bob for pointing out some feeds like this.
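
For the curious, here is a minimal sketch of how episode change detection and duplicate purging can work in general. It is not the server's actual code: the field names (`guid`, `title`, `description`, `audio_url`), the functions `apply_fetch` and `purge_duplicates`, and the idea of fingerprinting with a hash are all assumptions made for illustration. The point is that comparing a small fingerprint lets a fetch skip unchanged episodes, so only real edits touch storage or caches.

```python
import hashlib
import json

# Hypothetical field names; the real episode schema is not described in this post.
FINGERPRINT_FIELDS = ("title", "description", "audio_url")


def fingerprint(episode: dict) -> str:
    """Hash only the fields we care about, in a stable key order."""
    relevant = {field: episode.get(field) for field in FINGERPRINT_FIELDS}
    payload = json.dumps(relevant, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def apply_fetch(stored: dict, fetched: list) -> list:
    """Update `stored` (guid -> episode record) in place.

    Returns the guids that are new or whose fingerprint changed, so the
    caller can invalidate caches for those episodes only.
    """
    changed = []
    for episode in fetched:
        guid = episode["guid"]
        new_print = fingerprint(episode)
        old = stored.get(guid)
        if old is None or old["fingerprint"] != new_print:
            stored[guid] = {**episode, "fingerprint": new_print}
            changed.append(guid)
    return changed


def purge_duplicates(episodes: list) -> list:
    """Keep one record per guid: the last one seen, in first-seen order."""
    latest = {}
    for episode in episodes:
        latest[episode["guid"]] = episode
    return list(latest.values())
```

Calling `apply_fetch` twice with the same feed returns an empty list the second time, which is the "no unnecessary re-fetching" property in miniature. Running `purge_duplicates` at the end of a fetch collapses any repeated `guid` down to a single record.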

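The image fallback can be sketched in a similar way. This assumes the "separate tag" is the standard RSS `<image><url>` element inside the channel, which is a guess on my part for the example; the function name `find_feed_image` is also hypothetical. The iTunes image is tried first, and the plain RSS image is only used when it is missing.

```python
import xml.etree.ElementTree as ET

# Namespace used by iTunes podcast tags such as <itunes:image href="...">.
ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"


def find_feed_image(feed_xml: str):
    """Return the channel image URL, preferring the iTunes declaration."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return None

    # First choice: <itunes:image href="..."/>
    itunes_image = channel.find(f"{{{ITUNES_NS}}}image")
    if itunes_image is not None and itunes_image.get("href"):
        return itunes_image.get("href")

    # Fallback (assumed): the plain RSS <image><url>...</url></image> element.
    url = channel.findtext("image/url")
    if url and url.strip():
        return url.strip()
    return None
```

The un-namespaced path `image/url` never matches the iTunes tag, so the two lookups don't interfere with each other.
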
I’ve also begun work on an open-source project to help with processing grunt-work like this.