frequency_penalty at 0 causes no response content with Phi-3-mini-4k-cpu-int4-rtn-block-32-acc-level-4-onnx #90

Open
therealjohn opened this issue Aug 19, 2024 · 2 comments
Labels: bug

@therealjohn
Member

Setting frequency_penalty to 0 results in a response with no content. Values greater than 0 but less than 1 cause other weird responses. 1 seems to be the only reliable value, and it's unclear whether the cause is the model or something else.

POST http://127.0.0.1:5272/v1/chat/completions
content-type: application/json

{
    "messages": [
        {
            "role": "user",
            "content": "Whats the golden ratio"
        }
    ],
    "frequency_penalty": 0,
    "model": "Phi-3-mini-4k-cpu-int4-rtn-block-32-acc-level-4-onnx"
}
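
For anyone who wants to script the reproduction, here is a minimal Python sketch of the same request. The endpoint, model name, and message are copied from the report above; the helper names `build_chat_request` and `send_chat_request` are made up for illustration and are not part of AITK.

```python
import json
import urllib.request

# Endpoint and model name are taken from the request in this issue.
ENDPOINT = "http://127.0.0.1:5272/v1/chat/completions"
MODEL = "Phi-3-mini-4k-cpu-int4-rtn-block-32-acc-level-4-onnx"


def build_chat_request(frequency_penalty: float) -> urllib.request.Request:
    """Build (but do not send) the POST request used in this report."""
    body = {
        "messages": [{"role": "user", "content": "Whats the golden ratio"}],
        "frequency_penalty": frequency_penalty,
        "model": MODEL,
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def send_chat_request(frequency_penalty: float) -> dict:
    """Send the request; requires the local server to be running."""
    with urllib.request.urlopen(build_chat_request(frequency_penalty)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send_chat_request(0)` and `send_chat_request(1)` back to back should make it easy to compare the two behaviors described here.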

You will get a response like:

{
  "model": null,
  "choices": [
    {
      "delta": {
        "role": "assistant",
        "content": "",
        "name": null,
        "tool_call_id": null,
        "function_call": null,
        "tool_calls": null
      },
      "message": {
        "role": "assistant",
        "content": "",
        "name": null,
        "tool_call_id": null,
        "function_call": null,
        "tool_calls": null
      },
      "index": 0,
      "finish_reason": "stop",
      "finish_details": null,
      "logprobs": null
    }
  ],
  "usage": null,
  "created": 1724095112,
  "id": "chat.id.2641",
  "system_fingerprint": null,
  "object": "chat.completion",
  "Successful": true,
  "error": null,
  "HttpStatusCode": 0,
  "HeaderValues": null
}
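
Note that the response above still reports `"finish_reason": "stop"` and `"Successful": true`, so a client cannot rely on those fields to notice the problem. A small sketch of a client-side check, assuming the response has been parsed into a Python dict with the shape shown above (the function names are hypothetical):

```python
def first_choice_text(response: dict) -> str:
    """Return the assistant text of the first choice, or "" if absent."""
    choices = response.get("choices") or []
    if not choices:
        return ""
    message = choices[0].get("message") or {}
    return message.get("content") or ""


def is_empty_completion(response: dict) -> bool:
    """True when the first choice carries no visible text."""
    return first_choice_text(response).strip() == ""
```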
@swatDong added the bug label on Aug 23, 2024
@swatDong
Contributor

@a1exwang - is this caused by an invalid parameter? We may want to consider adding value checks for all input parameters.

@a1exwang
Collaborator

  1. AITK uses ONNX Runtime GenAI for inference, and frequency_penalty is converted to repetition_penalty behind the scenes.

  2. According to ONNX documentation, repetition_penalty cannot be 0.

  3. As the tooltip mentions, this parameter controls the likelihood of repetition. A lower value makes the model more likely to repeat itself, which is why you see weird responses for values between 0 and 1.

  4. The value 1 is not the only reliable value. You can also set it to a value greater than 1, which decreases the likelihood of repetition further.
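
To illustrate points 2 to 4, here is a sketch of one common repetition-penalty rule (the CTRL-style rule, also used by Hugging Face Transformers). Whether ONNX Runtime GenAI applies exactly this rule is an assumption, and the function name is made up. Under this rule 1 is the neutral value, values below 1 boost tokens that were already generated, and 0 would require dividing by zero.

```python
def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Rescale the logits of already-generated tokens (CTRL-style rule).

    Assumption: this mirrors a common formulation, not necessarily the
    exact ONNX Runtime GenAI implementation.
    """
    if penalty <= 0:
        # Positive logits are divided by the penalty, so 0 is not usable.
        raise ValueError("penalty must be greater than 0")
    adjusted = list(logits)
    for token_id in set(seen_token_ids):
        score = adjusted[token_id]
        # penalty > 1 lowers the score of seen tokens, penalty < 1 raises it.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted
```

For example, with logits `[2.0, 1.0, -1.0]` and tokens 0 and 2 already generated, a penalty of 2.0 pushes both seen tokens down, while 0.5 pushes both up, which matches the repetitive output seen for values between 0 and 1.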

I think we can add range validation for input parameters, as @swatDong suggested.
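
A minimal sketch of what such a range check could look like, assuming the only constraint is the one mentioned above (repetition_penalty must not be 0, taken here as "must be a finite number greater than 0"). The function name is hypothetical and this is not the actual AITK implementation.

```python
import math


def validate_repetition_penalty(value) -> float:
    """Return the value as a float, or raise ValueError if it is out of range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("repetition_penalty must be a number")
    if not math.isfinite(value) or value <= 0:
        raise ValueError("repetition_penalty must be a finite number greater than 0")
    return float(value)
```

Rejecting the request with a clear error would be easier to diagnose than an empty response that still reports success.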
