Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ollama LLM provider tools support #14623

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dhuebner
Copy link
Member

What it does

Resolves #14610

Adds Tools handling for Ollama LMs

How to test

Set an Ollama model as workspace agent and ask questions about the workspace. For example:
@Workspace How many files are in my workspace?

Follow-ups

Breaking changes

  • This PR introduces breaking changes and requires careful review. If yes, the breaking changes section in the changelog has been updated.

Attribution

Review checklist

Reminder for reviewers

@planger
Copy link
Contributor

planger commented Dec 13, 2024

@dhuebner Thank you for the PR! Which models did you use for testing?

@dhuebner
Copy link
Member Author

@planger
llama3.1 as WS agent and llama3.2 as orchestrator, why?

@planger
Copy link
Contributor

planger commented Dec 13, 2024

@dhuebner thanks, no particular reason, just out of curiosity and to know how to best test the PR. Thank you!

@JonasHelming
Copy link
Contributor

JonasHelming commented Dec 15, 2024

Great feature addition!

I tried to test this with LLama3.1 but I was not really successful (see below).
Also I noticed:

  • You do not stream at all in the case of tools present? Is this intentional? For the user, they just see "generating" for quite a while and nothing happens
  • Because of not using streaming, we do not show the function calls in the chat as for the open AI provider. I think this is a really nice feature showing the user what happens in a transparent way.

see how it is visualized
image

This was my test with Ollama:

image

@dhuebner
Copy link
Member Author

@JonasHelming

You do not stream at all in the case of tools present? Is this intentional? For the user, they just see "generating" for quite a while and nothing happens

The reason is that Ollama doesn't seem to support tools with streaming. The document says:
tools: tools for the model to use if supported. Requires stream to be set to false

There is also no possibility to listen to e.g. onToolsCall(), means we will need to buffer the stream until we recognize if it is a tools_call and than present it to the user.

This was my test with Ollama:

I tried the questions mentioned in #14285 by @navr32 , that worked okay. I will test with more complex questions, although I don't know how to make it perform better with just calling the API provided...

@JonasHelming
Copy link
Contributor

@dhuebner OK that makes sense. However I think we should still add the tool call to the response so that is its rendered in the chat.

@dhuebner
Copy link
Member Author

@JonasHelming

However I think we should still add the tool call to the response so that is its rendered in the chat.

How should it look like? I can send only one response during a request, or is there a special API to achieve this?

@planger
Copy link
Contributor

planger commented Dec 20, 2024

I think at the moment we cannot really represent tool calls generically in the non-streaming case.

In the non-streaming case, we can only return a LanguageModelTextResponse:

export interface LanguageModelTextResponse {
    text: string;
}

In the streaming-case we use LanguageModelStreamResponsePart which can contain tool calls:

export interface LanguageModelStreamResponsePart {
    content?: string | null;
    tool_calls?: ToolCall[];
}

These are then translated in the method AbstractStreamParsingChatAgent.parse() to ToolCallChatResponseContentImpl, which will show in the UI.

So I think we may need to extend our LanguageModelResponse model to be able to capture tool calls also for non-streaming responses and then process them in the AbstractTextToModelParsingChatAgent to translate them into ToolCallChatResponseContentImpl, so they show up in the UI.

@dhuebner
Copy link
Member Author

@planger
Good idea!
But for now as we do not have this API, what is still missing in this PR?

@sdirix
Copy link
Member

sdirix commented Dec 20, 2024

Just an idea: Can't we map non-stream responses to streamed ones rather easily? We just pretend that there is a stream and once we have the answer of the LLM we send it as one blob. This way we could reuse the tools?

@dhuebner
Copy link
Member Author

@sdirix
Okay, we can do that

@dhuebner dhuebner force-pushed the dhuebner/ollama-tools-14610 branch from 65ed52c to 1579b14 Compare January 10, 2025 11:30
@dhuebner dhuebner force-pushed the dhuebner/ollama-tools-14610 branch from 1579b14 to 218c7f8 Compare January 10, 2025 13:47
@dhuebner
Copy link
Member Author

@sdirix @JonasHelming
Now returning a fake stream response with tool-call data attached. The current state looks like this:

Bildschirmfoto 2025-01-10 um 14 46 45

Please consider to make ToolCall.id mandatory. It cost me some time to figure out that without an id multiple tool-call message parts are merged into one in a wired way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Waiting on reviewers
Development

Successfully merging this pull request may close these issues.

[Theia AI] Ollama LLM provider does not support tools
4 participants