Discussion about this post

Iustin Pop

A lot of this flew over my head, as I'm not familiar with the technology. But what I took away is that one of the reasons token generation is slow is that they're batching to save costs?

I am very annoyed that I ask a question (a long one, a few sentences) and many times I have to switch tasks while the LLM is completing its response. I would definitely be willing to pay more for fast replies.
