I’m tempted to write a tutorial on how to train a fully copyright-free LLM on your own content, just to see the shitstorm that will inevitably ensue once a workflow for ethical LLM training actually exists.
Comments
Just start with the OLMo models and you're 80% done
There's literally a tutorial there
I keep wanting to make a similar article but just can't be bothered... mostly because I know the replies will be full of people losing their absolute shit claiming it stole their art or whatever
I would definitely want them to run fully locally, if possible.
"Open this Colab Notebook and load the 7B model in 4bit mode and train a LoRA" might be too much friction.
https://allenai.org/blog/olmo2-32B
https://huggingface.co/collections/PleIAs/common-models-674cd0667951ab7c4ef84cc4