
Add cancel() method to interrupt a stream #733

Open

simonchatts wants to merge 1 commit into abetlen:main from simonchatts:main

Conversation

@simonchatts

Fixes #599.

Thanks for all your work on this project!
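A minimal usage sketch of the new method (the model path, prompt, and exact cancellation semantics here are assumptions on my part, not taken from this PR's diff):

```python
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf")  # placeholder path

stream = llm("Write a long story.", max_tokens=512, stream=True)
for i, chunk in enumerate(stream):
    print(chunk["choices"][0]["text"], end="", flush=True)
    if i >= 10:       # stand-in for a user-triggered stop
        llm.cancel()  # method added by this PR (assumed to stop at the next token)
        break
```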

@tk-master
Contributor

Please accept this PR, @abetlen.

@tk-master
Contributor

Actually, I found an issue with this method: it only cancels after a token has been generated. If the LLM is slow or gets stuck processing the prompt, this doesn't cancel it.

We need a better method.

@tk-master
Contributor

I'm coming back to this because I need to figure out a better way to interrupt generation programmatically.

For a console-based scenario it's pretty easy in Python: all I have to do is wrap the code in try / except KeyboardInterrupt, and then I can press Ctrl+C at any point to gracefully interrupt the LLM.
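For reference, a minimal sketch of that console pattern (model path and prompt are placeholders):

```python
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf")  # placeholder path

try:
    for chunk in llm("Tell me a story.", stream=True):
        print(chunk["choices"][0]["text"], end="", flush=True)
except KeyboardInterrupt:
    # Ctrl+C is delivered between tokens, so the stream stops
    # gracefully and the Llama object keeps its context.
    print("\n[generation interrupted]")
```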

But if I'm using a front-end user interface, I haven't managed to make it work properly, say with a "Stop generating" button that calls a Python function, because of the issue I mentioned in the previous post.
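One partial workaround (a sketch only, not from this PR): consume the stream on a worker thread and poll a flag that the button handler sets. It still only takes effect once the next token arrives, which is exactly the limitation above:

```python
import threading

from llama_cpp import Llama

stop_event = threading.Event()  # the "Stop generating" button calls stop_event.set()

def generate(llm: Llama, prompt: str) -> str:
    """Run on a worker thread so the UI stays responsive."""
    parts = []
    for chunk in llm(prompt, stream=True):
        if stop_event.is_set():
            break  # only reached after the current token is produced
        parts.append(chunk["choices"][0]["text"])
    return "".join(parts)
```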

@abetlen, sorry to bother you again, but do you have any suggestions or ideas on how to accomplish this?

@abetlen force-pushed the main branch 2 times, most recently from 8c93cf8 to cc0fe43 on November 14, 2023 at 20:24
@woheller69

Why not add it now and improve it later if there is a better solution? For now this would work in most cases.

@woheller69

Has anyone found a reasonable solution for this? Or am I the only one unwilling to wait until the model finishes, short of killing the job and losing the context?

@jewser

jewser commented May 11, 2024

Any chance this gets merged for now?

@madprops

It indeed blocks until the first token is produced, but cancelling it after that is trivial. The other similar issue is cancelling a model that is loading.

@woheller69

The gpt4all Python bindings offer a similar mechanism, which allows stopping at the next token.
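For comparison, that style is a per-token callback whose return value decides whether generation continues; a generic sketch of the shape (not gpt4all's actual API):

```python
import threading

stop_flag = threading.Event()  # set from the UI to request a stop

def on_token(token_id: int, text: str) -> bool:
    """Per-token callback: returning False stops before the next token."""
    print(text, end="", flush=True)
    return not stop_flag.is_set()
```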

@ekcrisp

ekcrisp commented Nov 21, 2024

+1 can we merge this?

@kingbri1

Take a look at ggml-org/llama.cpp#10509, which should permanently solve this problem on llama.cpp's side.



Development

Successfully merging this pull request may close these issues.

Dynamically interrupt token generation

7 participants