Is there a decent chunking algorithm library on NPM?
I know LangChain and LlamaIndex have some, but I figured there were probably some standalone ones, unbundled from frameworks.
By chunking I mean splitting text documents into pieces to be fed into a RAG system.
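To make the question concrete, here is a minimal sketch of the simplest kind of chunking: fixed-size character windows with some overlap. The function name `chunkText` and its parameters are hypothetical and not taken from any library. Real splitters usually also try to break on sentence or paragraph boundaries.

```typescript
// Hypothetical helper, not from any library: fixed-size character chunks
// with a configurable overlap between neighbouring chunks.
function chunkText(text: string, chunkSize: number, overlap: number = 0): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap is there so a sentence cut at a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.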
Comments
What it means to chunk an arbitrary Markdown README looks very different from what it means to chunk a dissertation in LaTeX.
So there is a wide gap among the openly available tools for this.
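As a rough illustration of why the format matters, a Markdown-aware splitter might cut at headings rather than at a fixed character count. This is only a sketch: `splitMarkdownByHeading` is a hypothetical name, and it ignores edge cases such as `#` lines inside fenced code blocks. A LaTeX document would need different boundaries entirely (for example `\section` commands), so this logic would not carry over.

```typescript
// Hypothetical helper: split Markdown into sections at ATX headings ("# ...").
// Simplified on purpose: it does not skip "#" lines inside fenced code blocks.
function splitMarkdownByHeading(markdown: string): string[] {
  const sections: string[] = [];
  let current: string[] = [];
  for (const line of markdown.split("\n")) {
    // A heading starts a new section, unless nothing has been collected yet.
    if (/^#{1,6}\s/.test(line) && current.length > 0) {
      sections.push(current.join("\n").trim());
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) sections.push(current.join("\n").trim());
  // Drop sections that were only whitespace.
  return sections.filter((s) => s.length > 0);
}
```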
https://www.npmjs.com/package/@langchain/textsplitters
Or possibly some kind of NLP library (though it will include more than just the splitters).