Now cooking the config layer and streaming.
Middleware for streaming might require some trial and error, but hey, that's the fun part!
The results might be nice - imagine streaming line by line instead of by a random number of tokens - the latency stays low and you can easily act on the produced text!
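The line-by-line idea can be sketched as a small rebuffering generator: it sits between the model's token stream and the consumer, holding back partial text until a full line has arrived. This is a minimal illustration, not the library's actual API - the helper name `line_stream` and the plain-string chunks are assumptions for the sketch.

```python
def line_stream(token_stream):
    """Rebuffer a stream of token chunks into complete lines.

    Tokens arrive in arbitrary pieces; we only yield once a
    newline shows up, so the consumer always sees whole lines.
    """
    buf = ""
    for chunk in token_stream:
        buf += chunk
        while "\n" in buf:
            line, buf = buf.split("\n", 1)
            yield line
    if buf:
        yield buf  # flush any trailing text without a newline

# usage with a fake token stream standing in for model output
tokens = ["Hel", "lo wor", "ld\nsecond li", "ne\npart", "ial"]
print(list(line_stream(tokens)))  # ['Hello world', 'second line', 'partial']
```

Latency stays low because each line is emitted as soon as its newline arrives, rather than waiting for the full response.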
Reposted from Data Geek
My answer to AiSuite - focused even more on the GPU-poor.
It's still in early development, but you can take a look here:
github.com/mobarski/ai-...
All feedback is more than welcome (comments, likes, github stars, reposts).