We already have a local sci-fi assistant and nobody is impressed.

I recently had an epiphany while showing a friend my local Ollama setup running the Llama 3 model. I asked it a question about The Talos Principle and got accurate information. This made me realize that we now have Google-like capabilities on our own computers, but instead of searching web pages, we can have a conversation with an assistant.

This is basically the level of AI assistant you find in sci-fi books, and yet we are hardly making anything of it.

These language models are pretty accurate and versatile. They can provide information and spark meaningful conversations, even if what they say is not original thought.

The Llama 3 model is impressive, taking up only 4GB of space yet containing a vast amount of human knowledge. It’s amazing that I can ask it any question and get a pretty good answer. This is NOW, running WITHOUT any kind of internet connection. To have anything like this previously, you would have needed a local searchable copy of Wikipedia, and it would still only scratch the surface.

This got me thinking about the state of tooling with LLMs. We’re still in the early stages, just like we were when the web first emerged. Our current chatbot interface is basic and lacks the sophistication needed to unlock the full potential of these models.

But this is where the genius of the Linux philosophy comes into play. I don’t need the tooling to be advanced, because I can build my own using these same LLMs. All I need is the text output from ollama run, and I can do anything I want with it.
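To make that concrete, here is a minimal sketch of treating the model as just another command that emits text. It assumes the `ollama` CLI is on your PATH, that a model named `llama3` has already been pulled, and that `ollama run MODEL PROMPT` prints the answer to standard output. The function names (`build_command`, `ask`, `bullet_points`) are made up for illustration; this is not part of Ollama or of any tool mentioned here.

```python
import subprocess


def build_command(prompt, model="llama3"):
    # Assumption: `ollama run <model> <prompt>` answers once and exits.
    return ["ollama", "run", model, prompt]


def ask(prompt, model="llama3", runner=subprocess.run):
    # `runner` is injectable so the function can be exercised without Ollama installed.
    result = runner(
        build_command(prompt, model),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def bullet_points(text):
    # Plain text post-processing: keep only lines that look like markdown bullets.
    return [
        line.lstrip("-* ").strip()
        for line in text.splitlines()
        if line.lstrip().startswith(("-", "*"))
    ]


# Example use (needs a working local Ollama install):
#   answer = ask("List three themes of The Talos Principle as bullet points.")
#   for theme in bullet_points(answer):
#       print(theme)
```

Once the answer is a plain string, everything after that is ordinary scripting: filter it, pipe it into another prompt, or write it to a file.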

In the long run, we will build up some niceties. I’m currently into nano bots because I can define my assistants in markdown files and use them both in Ruby scripts and from the command line. But that is a nice-to-have rather than a must-have. They also give me the flexibility of running a bot the same way with a non-local model if I need to.

At the moment, everything I’ve thrown at Llama has given me great results, and I have not needed to use non-local models.