Ask HN: How do you scale transformer context lengths over multiple machines?
2 by sci-genie | 0 comments on Hacker News.
Context length is such an important aspect of today’s AI race, and all the major players actively advertise it. Given how the matrix math works, how do people run inference for a transformer when the context length is so long that it can’t fit on one GPU / one machine?
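The "can't fit" part mostly comes from the KV cache: every generated token attends over keys and values stored for the entire context, and that cache grows linearly with context length. A minimal back-of-the-envelope sketch, assuming a 70B-class model's dimensions (80 layers, 8 KV heads via grouped-query attention, head dim 128, fp16); all numbers are illustrative, not from the post:

```python
# Back-of-the-envelope KV-cache sizing (illustrative assumptions, not any
# specific model): memory grows linearly with context length, so a long
# enough context outruns a single GPU regardless of model size.

def kv_cache_bytes(seq_len, n_layers=80, n_kv_heads=8, head_dim=128, dtype_bytes=2):
    """Bytes needed to cache keys and values for seq_len tokens.

    Assumed layout: 2 (K and V) x layers x kv_heads x head_dim per token,
    roughly a 70B-class model with grouped-query attention in fp16.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * dtype_bytes
    return seq_len * per_token

for tokens in (8_192, 128_000, 1_000_000):
    gb = kv_cache_bytes(tokens) / 1e9
    print(f"{tokens:>9,} tokens -> ~{gb:,.0f} GB of KV cache")

# ~2.7 GB at 8K, ~42 GB at 128K, ~328 GB at 1M tokens: well past a single
# 80 GB GPU, which is why the cache (and the attention computation over it)
# has to be sharded across devices at very long contexts.
```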
