- cross-posted to:
- selfhosted@lemmy.world
I listen to a lot of podcasts. I spend a pile of time where I need something to distract me and keep me awake, and I also just like podcasts. But there’s a lot of podcasts, especially from sources like IHeartRadio, that have scads of annoying ads (mainly for other podcasts, which seems weird, but OK).
I had gotten to the point where subscriptions like Behind the Bastards just weren’t worth listening to because the ads went on for like 5 minutes. I had to come up with something or drop them.
Enter Pinchflat. You can create a “Podcast” Media Profile that’s audio-only and respects SponsorBlock. If a podcast has a YouTube channel, you can pretty much eliminate ads this way. Pinchflat also generates an RSS feed that you can subscribe to in your favorite app, like AntennaPod. One thing I like to add to the Media Profile is a redownload after a day or two, so it picks up SponsorBlock info that might not have existed yet if Pinchflat grabbed the episode when it was very fresh.
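For anyone curious what this looks like without the web UI: Pinchflat drives yt-dlp, so a rough hand-rolled equivalent is below. This is only a sketch of the idea, not Pinchflat's actual invocation. The channel URL, output directory, and the `build_ytdlp_command` helper are made up for illustration; the flags themselves are standard yt-dlp options.

```python
import shlex


def build_ytdlp_command(channel_url, out_dir, categories=("sponsor", "selfpromo")):
    """Return a yt-dlp argument list for audio-only downloads with SponsorBlock cuts.

    channel_url, out_dir and the default categories are illustrative assumptions.
    """
    return [
        "yt-dlp",
        "--extract-audio",                               # keep only the audio track
        "--audio-format", "mp3",
        "--sponsorblock-remove", ",".join(categories),   # cut crowdsourced segments
        "--download-archive", f"{out_dir}/archive.txt",  # skip episodes already fetched
        "--output", f"{out_dir}/%(upload_date)s - %(title)s.%(ext)s",
        channel_url,
    ]


if __name__ == "__main__":
    cmd = build_ytdlp_command(
        "https://www.youtube.com/@ExampleChannel",  # hypothetical channel
        "/podcasts/example",                        # hypothetical directory
    )
    print(shlex.join(cmd))
```

The script only prints the command so you can inspect it before running anything. What Pinchflat adds on top is the scheduling, the redownload, and the RSS feed.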
Links:
Pinchflat Docker compose setup
Podcast RSS feeds (ignore the reverse proxy if you already use an always-on VPN like WireGuard/Tailscale, or if you download your episodes while on your LAN)
The heroes at Sponsorblock and the other heroes that contribute timeblock entries
Donate on behalf of Pinchflat to Zakkarry, a collaborator that the developer of Pinchflat has identified as a good donation target, as well as the EFF.
I’ll often get ads for the podcast I’m currently listening to. Like, I’m already here, what more do you want?
The thing that really annoys me is when they auto-insert variable-length ads, because it frequently messes up, and if you stop and resume a lot it can end up breaking playback.
As far as I’ve seen, if someone has submitted the SponsorBlock segments, this works perfectly. Of course it depends on someone getting to it, so a very lightly listened-to YT channel might not have anything submitted for Pinchflat to use.
This is what I’ve been doing for about a year and I love it.
Well, why the hell didn’t you tell me? You owe me 10 minutes per episode back for the last year.
Honestly I’m surprised more people don’t do it. It’s so easy.
Setting your VPN to Albania also works. Online advertising is banned there.
But there’s a lot of podcasts, especially from sources like IHeartRadio, that have scads of annoying ads
And they’re so repetitive. And each block is the same length if I’m not mistaken. This could even be automated - not relying on human input - or at least half-automated.
I’ve had a pipeline in mind for exactly this purpose that I want to build when I get around to it:
- Download the audio file from RSS feed
- Self hosted AI transcription model (with output that includes timestamps)
- Self hosted LLM to recognise ad sections and return the start and end timestamps as JSON
- ffmpeg to slice those timestamps out and stitch the rest back together
In theory, this should be able to remove ad and sponsor sections of any length completely automatically, and there’s nothing to stop it working on videos too.
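The last step of that pipeline is the easy one to sketch. Assuming the LLM hands back a JSON list of objects with `start` and `end` in seconds (that shape is my assumption, nothing standard), something like this would merge overlapping ad segments and build an ffmpeg command that drops them with the `aselect` filter. File names and function names are placeholders.

```python
import json


def merge_segments(segments):
    """Sort ad segments and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def build_ffmpeg_command(src, dst, llm_json):
    """Build an ffmpeg argument list that removes the ad segments from src."""
    ads = merge_segments([(s["start"], s["end"]) for s in json.loads(llm_json)])
    if not ads:
        # Nothing to cut: copy the stream untouched.
        return ["ffmpeg", "-i", src, "-c", "copy", dst]
    cuts = "+".join(f"between(t,{a},{b})" for a, b in ads)
    # aselect keeps frames outside the ad ranges; asetpts rebuilds timestamps
    # so the remaining audio plays back to back.
    audio_filter = f"aselect='not({cuts})',asetpts=N/SR/TB"
    return ["ffmpeg", "-i", src, "-af", audio_filter, dst]


if __name__ == "__main__":
    sample = '[{"start": 50, "end": 60}, {"start": 10, "end": 20}, {"start": 15, "end": 25}]'
    print(build_ffmpeg_command("episode.mp3", "episode_clean.mp3", sample))
```

This re-encodes the audio, which is the price of cutting at arbitrary timestamps. The hard part is still the two steps before it, since the cuts are only as good as the timestamps the model returns.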
It should be doable to do some audio analysis of the episodes. They “always” (I am sure some forget every now and then) have an outro and intro around the ad block, with a clearly defined jingle per podcast. You should be able to make a program that analyses the audio, listens for that block, and cuts it out for you.
Yep. Certain patterns are easily recognizable even by machines. One could have a relatively simple “IHeartRadio algorithm” that should work 99% of the time (esp. with Ed Zitron who brackets the blocks with that insane guitar riff).
Hell, I could even write that with ffmpeg and a shell script.
OK I’m being arrogant now, but not wrong.
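To make the jingle idea concrete, here is a toy sketch of the matching step: slide a known jingle over the episode and flag offsets where the normalized correlation is high. It assumes you have already decoded both to plain lists of samples (for example raw mono PCM out of ffmpeg). The function name and threshold are made up, and a real version would want FFT-based correlation or audio fingerprinting, because this loop is far too slow for an hour of audio.

```python
import math


def find_jingle(samples, jingle, threshold=0.9):
    """Return sample offsets where `jingle` closely matches a window of `samples`."""
    n = len(jingle)
    j_mean = sum(jingle) / n
    j_centered = [x - j_mean for x in jingle]
    j_norm = math.sqrt(sum(x * x for x in j_centered))
    hits = []
    for offset in range(len(samples) - n + 1):
        window = samples[offset:offset + n]
        w_mean = sum(window) / n
        w_centered = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w_centered))
        if w_norm == 0 or j_norm == 0:
            continue  # silent window (or flat jingle): correlation is undefined
        score = sum(a * b for a, b in zip(w_centered, j_centered)) / (w_norm * j_norm)
        if score >= threshold:
            hits.append(offset)
    return hits


if __name__ == "__main__":
    jingle = [0, 1, 0, -1, 0, 1]
    episode = [0.0] * 5 + jingle + [0.0] * 5 + jingle + [0.0] * 3
    print(find_jingle(episode, jingle))
```

Two hits that bracket a stretch of audio would mark the start and end of an ad block, and those offsets divided by the sample rate give the timestamps to cut.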
I had found one that used Whisper to convert the podcast to text and then ran it through an AI to find the ad text, but I couldn’t get it to work. I had considered building something myself and was about halfway through that when I found this method. It does the job better than I think an AI would considering it’s crowdsourced for the ad identification.
This is exactly the route I’ve been begging for for years now. It seriously should be doable.
This is a great tip! Wanted to check out Pinchflat anyway, great weekend project 😁
Does this get rid of ad-reads as well or is it just the ads served by YouTube?
SponsorBlock will strip out ad-reads.
Keep in mind that if it fetches an episode immediately after it’s posted, nobody will have had a chance to make SponsorBlock entries, hence my advice to have it re-fetch after a couple of days. By then, someone should have made contributions that you can take advantage of.
Also, be sure you have the correct SponsorBlock content types selected in your “Podcast” Media Profile. You might have to play around with the options in there to make it produce exactly what you want from it.
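If you want to see whether anyone has submitted segments for a given episode yet, you can ask the SponsorBlock API directly. This is a minimal sketch based on my understanding of the public `skipSegments` endpoint, so check the API docs before relying on it. The helper names are mine, and the category list is just an example.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

API = "https://sponsor.ajay.app/api/skipSegments"


def build_url(video_id, categories=("sponsor",)):
    """Build the query URL; `category` is repeated once per requested type."""
    query = urllib.parse.urlencode(
        {"videoID": video_id, "category": list(categories)}, doseq=True
    )
    return f"{API}?{query}"


def parse_segments(body):
    """Turn the JSON response body into (start, end, category) tuples."""
    return [(s["segment"][0], s["segment"][1], s["category"]) for s in json.loads(body)]


def fetch_segments(video_id, categories=("sponsor",)):
    """Return submitted segments, or an empty list if none exist yet."""
    try:
        with urllib.request.urlopen(build_url(video_id, categories), timeout=10) as resp:
            return parse_segments(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumption: 404 means nothing has been submitted
            return []
        raise
```

An empty list for a day-old episode is exactly the case where the delayed redownload pays off.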
I don’t know anything about this beyond what’s written in the post, but it sounds like if a podcast episode on YouTube has been properly sponsorblocked, then it should skip ad reads as well (keep in mind that SponsorBlock is crowd-sourced, it doesn’t automatically work its magic).
this sounds great. thanks for the tip!
I think I already have pinchflat. Didn’t know sponsorblock worked for podcasts. Thanks!