Hacker News | CubsFan1060's comments


Note that this just covers the Speech-to-Text/Speech-Recognition aspect (à la Whisper); there are also models for long-form Text-To-Speech and streaming Text-To-Speech.

“VibeVoice can only handle up to an hour of audio”

Why?


Though of course Apple's rules aren't always consistent: I have 2 separate apps currently on my phone that can run (and are running) this: Google's Edge Gallery and Locally AI.


They've been slowly cutting them off of updates and/or taking them off the app store entirely.

See Anywhere and Replit. Anywhere was the #1 or #2 app and was taken off the app store entirely before being put back on and then taken off again.

Last I checked, Replit hasn't received an update on the iOS app store in over two months due to reviews denying them.


Can't be just a SaaSpocalypse. LLMs with the right harness could obliterate many of the TODO+ apps with a general assistant.

But it's more likely it's just walled garden + security theatre that'll keep them from allowing outside apps.


I wouldn't trust AI to run a TODO app, especially weak models. They can hallucinate tasks, forget to remind you, etc.


LLMs are stateless. But given an actual database of task-shaped items and some work, I could see the potential.

With a canonical source of truth, and set input/output expectations, the potential blast radius is quite small.
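To make that concrete, here is a rough sketch of what I mean, not a real product. The table layout, the `ALLOWED_ACTIONS` set, and the JSON shape of a proposal are all assumptions I made up for illustration, and the model call itself is left out: `apply_proposal` just takes whatever string the model produced. The database stays the source of truth, and anything outside the expected input/output shape is rejected.

```python
import json
import sqlite3

# Hypothetical action set; deletes are deliberately not allowed.
ALLOWED_ACTIONS = {"add", "complete"}


def open_store() -> sqlite3.Connection:
    """Create the task table that serves as the single source of truth."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        "id INTEGER PRIMARY KEY, "
        "title TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def apply_proposal(conn: sqlite3.Connection, proposal_json: str) -> bool:
    """Validate one model-proposed action; apply it or return False."""
    try:
        proposal = json.loads(proposal_json)
    except json.JSONDecodeError:
        return False
    if not isinstance(proposal, dict):
        return False
    action = proposal.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return False

    if action == "add":
        title = proposal.get("title")
        if not isinstance(title, str) or not title.strip():
            return False
        conn.execute("INSERT INTO tasks (title) VALUES (?)", (title.strip(),))
        return True

    # action == "complete": succeeds only for an open task that really exists,
    # so a hallucinated id changes nothing.
    task_id = proposal.get("id")
    if isinstance(task_id, bool) or not isinstance(task_id, int):
        return False
    cur = conn.execute(
        "UPDATE tasks SET done = 1 WHERE id = ? AND done = 0", (task_id,)
    )
    return cur.rowcount == 1


def open_tasks(conn: sqlite3.Connection) -> list:
    """Return (id, title) for every task not yet completed."""
    rows = conn.execute("SELECT id, title FROM tasks WHERE done = 0 ORDER BY id")
    return rows.fetchall()
```

With this shape, the worst a bad proposal can do is add a junk task or complete a real one early. It can't invent state, because reads always come from the table rather than from the model.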


And the end result is...? What? A todo app that takes 16GB of RAM?


Nothing that Mac and Windows users aren't already used to.


It’s tempting to be flippant about macOS/Windows, but in all seriousness, the resources required for an LLM to do the job of a typical lighter-weight app is a serious consideration. No amount of bloat matches what an LLM needs.


> No amount of bloat matches what an LLM needs.

I don't think that's necessarily true. For instance, LinkedIn uses more memory than Gemma E2B inference does.


LinkedIn is an entirely different category and an extreme case at that. We’re not talking about LLMs replacing LinkedIn either. It’s an entirely different comparison/discussion.


Finally, we've fully documented the Singularity-is-actually-just-bloated software.


It seems like you have an impossible ask? Why not 4 subscriptions to last you 5 hours?


You are not allowed to use multiple accounts to bypass the rate limit. You can only use different accounts for different uses like a work account and then a personal account. You can't rotate through 5 for personal use.


Ah, I missed that. Guess that makes sense and is a reasonably fair way to limit excessive users.


I don’t really follow what you’re saying. You mention the 5 hour limit. Is your expectation that they have enough capacity so that everyone can hit their 5 hour limit all the time? Or are you proposing that’s how they limit capacity for a subscription?

Do you have an example of them advertising or selling the plan this way? I don’t recall ever seeing any advertisement saying that their plan is simply prepaying for tokens.


41-minute-old account. "I built" post. LLM-sounding everything. I'd be surprised if there was a real person behind this at all.


This feels like we're still on the march to the dead internet.

What percentage of your interaction do you want/think is actually real people, and not just agents talking to other agents?


I’d be okay with it generating the posts and the reports of the financials and such but you need some human interaction in there.

Generate the posts with AI so it can free up your time to interact with people replying to the post.

Or write the bigger, longer, more content-heavy posts yourself, with maybe some AI assistance here and there, then use AI to create smaller posts from your larger ones, while still keeping the human interaction with those who reply to the posts.


100% agree. Content generation is where agents shine — it's repetitive and time-consuming. But genuine engagement is where trust gets built, and that needs to be real.

My engagement scripts do auto-reply to comments on my own posts, but they're rate-limited and context-aware (max 2 rounds). For anything meaningful — client conversations, community discussions like this one — it's always me.


> For anything meaningful — client conversations, community discussions like this one — it's always me.

In a six-minute time period, you posted 10 different comments here, totaling nearly 800 words. I don't believe you are being truthful.


"Fair catch on both points. The batch of replies: I had a list of expected questions and drafted answers beforehand. When I finally had time to respond, I posted them all at once. Not real-time typing — that's why the timing looks suspicious. Should've spaced them out. On MRR: I dodged it. Honest answer — client project revenue is irregular (project-based, not subscription), so I don't track it as MRR. MindThread subscription revenue is early and small, I'm not comfortable putting a number on it publicly yet. What I can say: it covers my infra costs and Claude subscription with room left over. Not life-changing, but real


You may be right about taste, but I think it takes a different dimension in the future.

"Dear Claude, please make me a clone of <fancy new saas> but make <these changes specific to my tastes>".

For many things, it's probably not "select the one of 100 that fits my taste"; it's probably going to be just making your own personal version that fits your taste in the first place. And, probably, never sharing that anywhere.


This has to be a bot account, right? 2 days old.

Yesterday "I don't know about you, but I benefit so much from using Claude at work that I would gladly pay $1,500-$2,000 per month to keep using it."


Agreed, those comments are all over the map, and so many comments in 2 days!


Agreed, those comments are all over the map, and 22 comments in 2 days!


Bots don't write like me


The interesting part about that is both of those things require some sort of time to start.

If I launch a new product, and 4 hours later competitors pop up, then there's not enough time for network effects or lock-in.

I'm guessing what is really going to be needed is something that can't be just copied. Non-public data, business contracts, something outside of software.


I can't tell if this is genius or terrifying given what their software does. Probably a bit of both.

I wonder what the security teams at companies that use StrongDM will think about this.


I doubt this would be allowed in regulated industries like healthcare

