▲TmuxAI: AI-Powered, Non-Intrusive Terminal Assistanttmuxai.dev

196 points by iaresee 72 days ago | 65 comments

kristopolous 72 days ago [-]

I've got a similar approach from a Unix philosophy.

Look at the savebrace screenshot here

https://github.com/kristopolous/Streamdown?tab=readme-ov-fil...

There's a markdown renderer which can extract code samples, a code sample viewer, and a tool to do the tmux handling and this all uses things like fzf and simple tools like simonw's llm. It's all I/O so it's all swappable.

It sits adjacent and you can go back and forth, using the chat when you need to but not doing everything through it.

You can also make it go away and then when it comes back it's the same context so you're not starting over.

Since I offload the actual llm loop, you can use whatever you want. The hooks are at the interface and parsing level.

When rendering the markdown, streamdown saves the code blocks as null-delimited chunks in the configurable /tmp/sd/savebrace. This allows things like xargs, fzf, or a suite of unix tools to manipulate it in sophisticated chains.

Again, it's not a package, it's an open architecture.

I know I don't have a slick pitch site but it's intentionally dispersive like Unix is supposed to be.

It's ready to go, just ask me. Everyone I've shown in person has followed up with things like "This has changed my life".

I'm trying to make llm workflow components. The WIMP of the LLM era. Things that are flexible, primitive in a good way, and also very easy to use.

Bug reports, contributions, and even opinionated designers are highly encouraged!

rane 71 days ago [-]

Maybe if you could explain what exactly is happening in the savebrace example because it's not clear how it relates to this.

Wilfred 71 days ago [-]

If I've understood this interesting workflow correctly, there's two major components.

streamdown: a markdown renderer for the terminal, intended for consuming LLM output. It has affordances to make it easier to run the code snippets: no indentation, easy insertion in the clipboard, fzf access to previous items.

llmehelp: tools to slurp the current tmux text content (i.e. recent command output) as well as slurp the current zsh prompt (i.e. the command you're currently writing).

I think the idea is then you bounce between the LLM helping you and just having a normal shell/editor tmux session. The LLM has relevant context to your work without having to explicitly give it anything.

kristopolous 71 days ago [-]

Basically.

About 20 years ago I had a since-long-disappeared article called "The Great Productivity Mental Experiment" which we can extend now for the AI era:

You've got 3 equally capable competent programmers with the same task, estimated to take on the order of days.

#1 has no Internet access and only debuggers, source and on system documentation

#2 has all that + search engines and the Internet

#3 has all #2 + all the SOTA AI tools.

They are all given the same task, and a timer starts.

Who gets to "first run" the fastest? 90% success rate? 99.9%?

The point of the exercise is the answer: "I don't know"

Ergo there is no clear objective time saver.

The next question is what would establish a clear victor without having to make a taxonomy of the tasks. We're looking for best time practice.

The answer is workflow, engagement, and behavior.

Current AI flow will get you to first run faster. But to the 99.9% pass? Without new flows it can't. It's a phenomenon called automation complacency and it can make bad workflows very costly.

(The original point of the exercise was to point out how better tools don't fix bad practice and how frameworks, linters, the internet, stronger type systems ... these can either solve problems or create larger ones based on how you use them. There is no silver bullet as Fred Brooks said in the 1980s)

kristopolous 71 days ago [-]

thanks. that's exactly the feedback I need! Appreciated. I put more screenshots here: https://github.com/kristopolous/llmehelp ... maybe that's clearer?

jarbus 71 days ago [-]

Just wanted to say I love this, didn't known I needed this until now.

kristopolous 71 days ago [-]

There's this new thing I'm currently working on. I have a tool that does a clean execvp of what you pass it through, as a total wrapper.

You can do ./tool "bash" and then open up nvim, emacs, do whatever, while the tool sits there passing things back and forth cleanly. Full modern terminal support.

Now here's the thing. You get context. Lots of it. Here's what it can do:

    psql# <ctrl-x - the tool sees this, looks at the previous N I/O bytes and reverses video to symbolize it's in a mode> I need to join the users and accounts table <enter>

Then it knows from the PPID chain you're in postrgresql, it knows the output of previous commands, it then sends that to an llm, which processes it and gives you this

    psql# I need to join the users and accounts table
    [ select * from users as u ... (Y/n) ]

Then it shows it. Here's the nice thing. You're STILL IN THE MODE and now you have more context. You can get out of it at any time through another ctrl-x toggle.

This way it follows you throughout your session and you can selectively beckon the LLM at your leisure, typing in english where you need to.

SSH into a host and you're still in it. Stuck in a weird emacs mode? Just press the hotkey and the i/o gets redirected as you ask the LLM to get you out.

But more importantly this is generic. It's a tool that allows you to intercept terminal session context windows and inject middleware, generically and then tie it to hotkeys.

As a result it works with any shell, inside of tmux, outside, in the vscode terminal, wherever you want... and you can make as many tools for it as you want.

I think it's fundamentally a new unix primitive. And I'm someone that researches this stuff (https://siliconfolklore.com/scale/ is a conference talk I gave last year).

If you know of anything else that's like this please tell me I haven't been able to find it.

Btw you cannot do this through pipes, the input of the left process isn't available to the piped process on the right. You can intercept stdin but you don't get the input file descriptor of the left process. The shell starts two processes at the time and then passes things through so you can't even use PPID cleanly without heuristic guessing. Trust me. I tried doing things this way many times. That's why nothing else works like this, you need new tricks.

I intend to package this up and release it in the next few days.

sshine 71 days ago [-]

Mind blown.

This is so simple.

It’s like rlwrap, but generic. So you could reimplement rlwrap with this.

I’ve been experimenting with schemesh recently. It’s a shell with ask control structures in scheme. Amazing, but a little immature still. Having a scheme middleware would be stronger: I can have a full zsh where a control key drops me onto a Scheme interpreter.

Now, how far up your process tree do you want to host it?

kristopolous 68 days ago [-]

https://9ol.es/tmp/fizzbuzz.webm

Finally kind of working. There's the first demo video

I escape out three times: once for basic shell help, the next while I'm inside of vim and finally at the python interpreter. The recent input and output are the context clues to solve the query, which is the part in reverse video.

Then the code is injected into the proxy wrapper

https://github.com/kristopolous/llmehelp/tree/main/shellwrap is the code.

jarbus 71 days ago [-]

Would need to re-read this a few more times to fully understand it, but very interested in the direction. Still struggling to wrap my mind around how it would work from nvim automatically, though. Excited to see what you've got

arkasan 71 days ago [-]

Had a terrible experience with warp. I personally don't use warp, but I know one colleague who uses it. One day, he ran `kubectl describe <resource> <resource name>` and warp suggested `kubectl delete <resource> <resource name>` and he pressed enter. He was lucky the resource was not critical and could be recreated without any damage. Think about what would have happened if the same thing had happened for the namespace resource. People go into automatic accept mode after some time, and this is very dangerous when you do anything at the terminal, because there is no UNDO button.

gregwebs 71 days ago [-]

I love warp but I always turned the AI off- the GUI improvements were enough for me. I have maintained a general rule for a long time now that when manual commands are run on production, someone else must be watching what’s going on and approving commands or approve them ahead of time.

literalAardvark 69 days ago [-]

Yes there is, Velero.

Yeah it might be problematic, but people make mistakes like this all the time, which is why rm -Rf /home /user/.file is a well known rite of passage.

mvanbaak 70 days ago [-]

and this is why you want mfa and gitops with multiple users signing off on a change

protocolture 71 days ago [-]

My first instinct is that this is super useful.

But then I realise that I do enough sensitive stuff on the terminal that I don't really want this unless I have a model running locally.

Then I worry about all the times I have seen a junior run a command from the internet and bricked a production server.

WinstonSmith84 71 days ago [-]

I suppose, you still need (and shall) understand what this command is doing eventually. But it certainly helps a lot when you don't exactly remember how the command is built. If not when googling stackoverflow, I personally rely a lot on previous commands which are saved in the history (ctrl + r)

malux85 72 days ago [-]

> TmuxAI » I'll help you find large files taking up space in this directory.

Get rid of this bit, so the user asks question, gets command.

Make it so the user can ask a follow up question if they want, but this is just noise, taking up valuable terminal space

jeltz 71 days ago [-]

Seems like something for everyone who loved Clippy.

tough 72 days ago [-]

tokens also cost money unless running locally

amelius 71 days ago [-]

Instead of showing:

    Do you want to execute this command? [Y]es/No/Edit

perhaps also add an "Explain" option, because for some commands it is not immediately obvious what they do (or are supposed to do).

iaresee 72 days ago [-]

A WIP but evolving, it watches your active tmux panes and allows you to work with AI agents who can interact with those panes. For command line folk, this could feel like a pretty good way to bring AI in to your working life.

rcarmo 71 days ago [-]

I already use aider and VS Code Agent Mode (which occasionally asks me to run commands for libraries, etc.)

This seems… like an amazing attack vector. Hope it integrates with litellm/ollama without fuss so I can run it locally.

make3 71 days ago [-]

I wish you could use Cursor in Terminal mode, eg I press a button, a Cursor window opens with the terminal tab taking up all the space. That way we could just reuse Cursor's special+k and special+l instead of having to have a different app with the same functionality

WinstonSmith84 71 days ago [-]

Wish so too, I guess it would require either a plugin or ... that something like TmuxAI partners with Cursor so that Cursor would provide an API key that could be used in TmuxAI (instead of the API key of OpenRouter https://tmuxai.dev/getting-started#post-installation-setup)

rcarmo 71 days ago [-]

I don’t see why I should spend time fiddling with Cursor when VS Code now does mostly the same thing.

WinstonSmith84 71 days ago [-]

Cursor is still slightly ahead - and as the OP has written, it works well in the embedded terminal of Cursor / Code (without "fiddling")

eranrund 71 days ago [-]

This looks interesting and I’m eager to try it, but my concern is this could easily send sensitive information such as API keys I paste to my terminal to the AI providers. How do you remedy that?

rpigab 71 days ago [-]

I would usually alt-tab to browser, open up any good LLM in 1 keystroke, write a short prompt, optionally paste the output of "ls" or "find" if context matters, then just copy and paste the result. This tool adds context but I'm fine without it.

dimatura 72 days ago [-]

The "non-intrusive" part is interesting. I've bit the bullet with AI assistance when coding - even when it feels like it gets in the way sometimes, overall I find it a net benefit. But I briefly tried AI in the shell with the warp terminal and found it just too clunky and distracting. I wasn't even interested in the AI features, just wanted to try a fancy new terminal. Not saying warp might not be useful for some people, just wasn't for me. So far I've found explicitly calling for assistance with a CLI command (I've used aichat for this, but there's several out there) to be more useful in those occasional instances where I can't remember some obscure flag combination.

mkbelieve 72 days ago [-]

Uninstalled warp, as the whole thing felt clunky and slow, and never even turned on the AI. You can accomplish everything it does with zsh + plugins without much fuss.

dcreater 71 days ago [-]

Same it performs like an electron app. And the whole must login thing soured me from the start. I don't need SaaS practices for the terminal and I don't trust that they aren't snooping either.

daft_pink 71 days ago [-]

I moved away from Warp as well… to Ghostty. I didn’t find I got much benefit from Warp.

rawoke083600 71 days ago [-]

I do love this, but haven't managed to actually try it out. ( I stopped trying and moved on)

But well done for launching (the following is not hate, but onboarding feedback)

Who else had issues about API key ?

1. What is a TMUXAI_OPENROUTER_API_KEY ?? (is like an OPENAI key) ?

2. If its an API key for TMUXAI ? Where do I find this ? Can't see on the website ? (probably haven't searched properly, but why make me search ?)

3. SUPER simple instructions to install, but ZERO (discoverable) instructions where/how to find and set API key ??

4. When running tmuxai instead of telling me I need an API key. How about putting an actual link to where I can find the API key.

Again well done for launching... sure it took hard word and effort.

alvinunreal 71 days ago [-]

Thanks for feedback, agree those stuff could be clearer. Will handle.

Just to answer your questions, it's an OpenAI API compatible service, which you can generate here: https://openrouter.ai/settings/keys

Also added recently in readme how you can use OpenAI, Claude or others.

tensility 71 days ago [-]

Why does running 'help' require tokens?

71 days ago [-]

dr_kretyn 72 days ago [-]

Interesting. I've been working on a similar project, though with more 'agentic' workflow. It's also in golang, CLI-native but also supports MCP and "just finishing" 'agentic tasks'. Potentially a nice overlap :) https://github.com/laszukdawid/terminal-agent

zipping1549 71 days ago [-]

LLM + Terminal integration just calls for disaster.

alvinunreal 71 days ago [-]

wouldn't manage production with ai too, however there are many tasks for e.g had recently to create an rpm, which is much better if ai can work on it

jph00 71 days ago [-]

Shellsage has provided this functionality for quite a while. I've been using it for months, and it's been a game-changer for me.

It was created by one of my colleagues, Nathan Cooper.

https://www.answer.ai/posts/2024-12-05-introducing-shell-sag...

inciampati 72 days ago [-]

Just got this running. It took a minute to figure out "where the config file is" but once I got it set up with openrouter keys... wow! This plus speech to text = Look ma no hands!

jmdots 71 days ago [-]

I would want the command to be shorter in `wc -c` terms -- but cool!

porcoda 71 days ago [-]

Can this be aimed at ollama or some other locally hosted model? It wasn’t clear from the docs since their config examples seem to presume you want to use a third party hosted API.

never_inline 71 days ago [-]

Given ollama has a openai compatible API[0], quick searching in repo returns this

https://github.com/alvinunreal/tmuxai/issues/6#issuecomment-...

1: https://ollama.com/blog/openai-compatibility

trees101 71 days ago [-]

edit your `.config/tmuxai/config.yaml`

to add these lines:

``` openrouter: api_key: "dummy_key" model: gemma3:4b base_url: http://localhost:11434/v1 ```

trees101 71 days ago [-]

I tried it and it didn't work too well. I suspect the prompts were optimized for Gemini, not local Gemma.

TBH I found the whole thing quite flaky even when using Gemini. I don't think I'll keep using it, although the concept was promising.

alvinunreal 71 days ago [-]

It's still early release, I know it should still be improved but my hope is community could help.

sepositus 71 days ago [-]

So I have yet to use any tool that needs an API key because I am concerned about costs. Does anyone have any idea what the daily usage of something like this would cost?

mdrzn 71 days ago [-]

Gemini has a very generous Free Tier for their API (including 2.5 Flash and 2.5 Pro): https://aistudio.google.com/app/apikey

kenmacd 71 days ago [-]

Keep in mind that using the free tier models means Google will keep what you send and use it for training.

In bold in their terms is: "Do not submit sensitive, confidential, or personal information to the Unpaid Services."

https://ai.google.dev/gemini-api/terms#unpaid-services

amelius 71 days ago [-]

Same here. Plus I want to run this locally because I'm concerned about my data (and hey, I'm a hacker and don't want to be dependent on machines out of my control, and besides I like reproducibility).

alvinunreal 71 days ago [-]

it's not that big with gemini flash model, no need for full model for this

alvinunreal 72 days ago [-]

Thanks iaresee! Yes, the non-intrusive observation of panes is the central idea, trying to integrate AI help without breaking the command-line workflow.

Appreciate the feedback as it evolves.

poulpy123 71 days ago [-]

Just curious: why specifically tmux and not any terminal ?

atsaloli 71 days ago [-]

Can I use it with Perplexity API without going through OpenRouter API? I want to try it but I don't want to go through a third party.

smallpipe 72 days ago [-]

TIL that `head -5` is equivalent to `head -n 5`, and that's not in the manual

ksala_ 71 days ago [-]

It’s called out as obsolete in the coreutils documentation, https://www.gnu.org/software/coreutils/manual/html_node/head..., which probably explains the lack of references in the manual

jclulow 71 days ago [-]

Depends on which manual you read; e.g., it's documented in https://illumos.org/man/1/head

neuroelectron 71 days ago [-]

I feel like heuristics would be a much better way to do this. Just an "newb assassitmant" with a long list of useful commands but I guess this frees up expert's time from doing something so boring.

pjmlp 71 days ago [-]

This would be interesting, if voice controled, all this lengthy prompt texts make COBOL feel like a pleasant language.

mathfailure 72 days ago [-]

Is it a locally running model?

gglon 72 days ago [-]

It seems like a model is not included; you need to set an API endpoint in a configuration https://tmuxai.dev/getting-started#environment-variables

mathfailure 72 days ago [-]

Then why would anyone let this thing into their terminal..?

dcre 72 days ago [-]

You can point it at any API you want, including local. The tool is agnostic, like nearly all such tools.

kurtis_reed 72 days ago [-]

it sees the visible contents of panes or the previous lines too?

alvinunreal 71 days ago [-]

It sees by default 200 lines from each pane, can be changed with max_capture_lines config parameter.

You can also override during session, with: /config set max_capture_lines 1000 - to increase capture lines for the current session only.

Loading comments...

kristopolous 72 days ago [-]

I've got a similar approach from a Unix philosophy.

Look at the savebrace screenshot here

https://github.com/kristopolous/Streamdown?tab=readme-ov-fil...

It sits adjacent and you can go back and forth, using the chat when you need to but not doing everything through it.

You can also make it go away and then when it comes back it's the same context so you're not starting over.

Since I offload the actual llm loop, you can use whatever you want. The hooks are at the interface and parsing level.

Again, it's not a package, it's an open architecture.

I know I don't have a slick pitch site but it's intentionally dispersive like Unix is supposed to be.

It's ready to go, just ask me. Everyone I've shown in person has followed up with things like "This has changed my life".

I'm trying to make llm workflow components. The WIMP of the LLM era. Things that are flexible, primitive in a good way, and also very easy to use.

Bug reports, contributions, and even opinionated designers are highly encouraged!

rane 71 days ago [-]

Maybe if you could explain what exactly is happening in the savebrace example because it's not clear how it relates to this.

Wilfred 71 days ago [-]

If I've understood this interesting workflow correctly, there's two major components.

llmehelp: tools to slurp the current tmux text content (i.e. recent command output) as well as slurp the current zsh prompt (i.e. the command you're currently writing).

kristopolous 71 days ago [-]

Basically.

About 20 years ago I had a since-long-disappeared article called "The Great Productivity Mental Experiment" which we can extend now for the AI era:

You've got 3 equally capable competent programmers with the same task, estimated to take on the order of days.

#1 has no Internet access and only debuggers, source and on system documentation

#2 has all that + search engines and the Internet

#3 has all #2 + all the SOTA AI tools.

They are all given the same task, and a timer starts.

Who gets to "first run" the fastest? 90% success rate? 99.9%?

The point of the exercise is the answer: "I don't know"

Ergo there is no clear objective time saver.

The next question is what would establish a clear victor without having to make a taxonomy of the tasks. We're looking for best time practice.

The answer is workflow, engagement, and behavior.

Current AI flow will get you to first run faster. But to the 99.9% pass? Without new flows it can't. It's a phenomenon called automation complacency and it can make bad workflows very costly.

kristopolous 71 days ago [-]

thanks. that's exactly the feedback I need! Appreciated. I put more screenshots here: https://github.com/kristopolous/llmehelp ... maybe that's clearer?

jarbus 71 days ago [-]

Just wanted to say I love this, didn't known I needed this until now.

kristopolous 71 days ago [-]

There's this new thing I'm currently working on. I have a tool that does a clean execvp of what you pass it through, as a total wrapper.

You can do ./tool "bash" and then open up nvim, emacs, do whatever, while the tool sits there passing things back and forth cleanly. Full modern terminal support.

Now here's the thing. You get context. Lots of it. Here's what it can do:

    psql# <ctrl-x - the tool sees this, looks at the previous N I/O bytes and reverses video to symbolize it's in a mode> I need to join the users and accounts table <enter>

Then it knows from the PPID chain you're in postrgresql, it knows the output of previous commands, it then sends that to an llm, which processes it and gives you this

    psql# I need to join the users and accounts table
    [ select * from users as u ... (Y/n) ]