The first track I got was an audio summary of "philosophical razors"
I assumed each "track" was a 5-minute audio summary (LLM+TTS) of a random text ARTICLE from Wikipedia.
Apparently I was mistaken and these are actually random MEDIA uploaded to Wikipedia.
Now I have an idea for a weekend project :)
EDIT:
https://commons.wikimedia.org/wiki/File:Philosophical_Razors...
Apparently it was not a summary but the full article: https://en.wikipedia.org/wiki/Philosophical_razor
EDIT2:
Index of all spoken articles on Wikipedia: https://en.wikipedia.org/wiki/Wikipedia:Spoken_articles
EDIT3:
Here is my 10-minute vibe-coded implementation of "Wikipedia Radio" for spoken articles (no LLM or TTS at runtime here) -- https://d3rfhwexohg7ag.cloudfront.net/wikipedia-radio.html
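For anyone curious how the "random MEDIA" variant could be wired up, here is a minimal sketch. It assumes the MediaWiki Action API's `list=random` query restricted to the File namespace (6) on Wikimedia Commons, and it assumes that filtering by file extension is good enough to spot audio. The function names and the extension list are made up for illustration, and this is not how either site in this thread is actually built. The network call is left out on purpose so the logic can be checked offline.

```python
import random
from urllib.parse import urlencode

# Assumption: Wikimedia Commons exposes the standard MediaWiki Action API here.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"

# Assumption: these extensions are treated as audio. Note that ".ogg" files
# can also contain video, so a real player would need to check the media type.
AUDIO_EXTENSIONS = (".ogg", ".oga", ".opus", ".wav", ".mp3", ".flac", ".mid")


def random_files_url(limit: int = 50) -> str:
    """Build a query URL asking for `limit` random pages in the File namespace."""
    params = {
        "action": "query",
        "format": "json",
        "list": "random",
        "rnnamespace": 6,  # namespace 6 is "File:"
        "rnlimit": limit,
    }
    return f"{COMMONS_API}?{urlencode(params)}"


def audio_titles(api_response: dict) -> list[str]:
    """Keep only the titles whose extension looks like audio."""
    pages = api_response.get("query", {}).get("random", [])
    return [
        page["title"]
        for page in pages
        if page["title"].lower().endswith(AUDIO_EXTENSIONS)
    ]


def pick_track(api_response: dict, rng=random):
    """Pick one audio title at random, or None if the batch had no audio."""
    titles = audio_titles(api_response)
    return rng.choice(titles) if titles else None
```

Most random files on Commons are images, so a batch can easily contain no audio at all. A caller would fetch `random_files_url()` repeatedly and retry until `pick_track` returns something.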
Thank you for making what many of us thought this would be! It’s pretty fun
This is awesome and a great way to come up with ideas
I skipped 2 and landed on the 3rd audio piece and loved it. I dunno why. It is totally my jam for the time being. lost the wiki link, but here is the youtube link. "lil pants" :)
https://www.youtube.com/watch?v=FxcXVqgKd9c
WTF: That link landed me on something else after this comment that I am totally loving as well.
https://www.youtube.com/watch?v=qPu4XfWkRvc&list=OLAK5uy_kNl...
Never know what rabbit hole you'll find yourself in with HN. Bless y'all.
Why does it play an artificial voice saying "Number 9" over and over in between clips in Revolution 9 mode? it's super annoying especially given the clips are shorter
But this is really cool! I've gotten some animal sounds, weather sounds, music, a small kid talking about a soccer match in Spanish, "evil laugh", political speeches in several languages, and a telephone ringing. only pressed skip a couple times for some really unpleasant noises
It's a reference to the Beatles song Revolution Number 9 which includes a lot of clips like that. It does get old pretty quickly here.
I get why you added the static noise sound between clips but that gets annoying real quick.
It is extremely loud compared to the main audio for some reason, making the site unusable
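One possible way to tame an interlude that is much louder than the clips around it is to scale it so its RMS level matches the clip's. The sketch below assumes samples are plain floats in the range -1.0 to 1.0 and uses RMS as a rough stand-in for perceived loudness (a real player would more likely use a proper loudness measure). The function names are hypothetical and nothing here reflects how the site actually mixes its audio.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def matching_gain(interlude, reference, max_gain=1.0):
    """Gain that brings the interlude's RMS to the reference's RMS.

    The gain is capped at max_gain so a quiet interlude is never boosted
    above its original level by default.
    """
    level = rms(interlude)
    if level == 0.0:
        return 1.0  # silence: nothing to scale
    return min(rms(reference) / level, max_gain)


def apply_gain(samples, gain):
    """Return a new list with every sample multiplied by gain."""
    return [s * gain for s in samples]
```

For example, a static burst at an RMS of 0.8 played next to a clip at 0.2 would get a gain of about 0.25, which puts the two at roughly the same level.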
I think it's nice.
A lot of the music seems to be reuploaded from freemusicarchive.org, hadn't heard of that before but it seems cool.
Ahh I thought it would be TTS reading of various wikipedia articles and I got excited :/
Cool!
Had some Japanese song pass by and all the letters were vertically arranged (as is tradition), which made it impossible to find out what artist it was.
Uh wow. This is actual bubble-bursting technology. I love this. Getting StumbleUpon vibes. Also, crazy that so many of these things are actually good... Maybe humanity is not so bad after all? Hmm. Food for thought. :D
Give it time
wow - I've quickly found 2 songs that are better than anything I've discovered in weeks
Number 9. Number 9. Number 9....
Two Number 9s, a Number 9 large, a Number 6 with extra dip...
Very cool project, I'm jealous I didn't think of it!
From the title I expected this would be like talk radio (like NotebookLM style) discussion of random wiki pages.
That sounds like a great idea for a sleep aid: have an AI narrate random wikipedia pages. Maybe it could even allow you to specify topics you have no interest in so it doesn't accidentally pick a topic that might grab your interest.
No need for an AI. Plain text-to-speech (TTS) is more than good enough and much easier on CPU/GPU and the environment.
NotebookLM's audio mode doesn't just read out the given text, it creates a podcast format with 2 hosts where one will ask questions and the other will answer, and go back and forth in a discussion style.
Using an "AI" (LLM) enhanced TTS adds in tone and other markers to let the underlying TTS sound much more natural. You can then double down with an ML tuned TTS to get a more natural voice.
What's an example of that? Anything I can run locally?
A paid product, but https://elevenlabs.io/ does it pretty well. There is some work on open source versions you can run locally, they work reasonably well, but I haven't kept up with the FOSS field in several months, so I'm unsure which is currently best
There are some really good open source TTS models out there now. Dia 1.6B or OpenAudio S1 are good options, and you can always check whichever models are trending on huggingface: https://huggingface.co/models?pipeline_tag=text-to-speech
Would you pay for it?
Yes
That would be a fascinating next iteration - combining these random audio clips with LLM-generated summaries or discussions of the Wikipedia articles they're sourced from.
I love this, and I don’t mean to throw any shade on it, but this is the kind of thing I’d call the best to come out of the “vibe coding” revolution. I don’t know if this was vibe coded, but what I mean to say is that there are a million things that you just never get around to doing, and LLMs help you to actually _realize_ little cool ideas like this.
Hopefully the "small internet" has a resurgence of goofy websites due to reduced development time. Boilerplate gets super annoying, but LLMs don't procrastinate the way I do.
The implication of this comment is so insulting and unnecessary