Skip to content

Commit

Permalink
Python: Fix function call content argument parsing (#10132)
Browse files Browse the repository at this point in the history
### Motivation and Context

<!-- Thank you for your contribution to the semantic-kernel repo!
Please help reviewers and future users, providing the following
information:
  1. Why is this change required?
  2. What problem does it solve?
  3. What scenario does it contribute to?
  4. If it fixes an open issue, please link to the issue here.
-->
This PR includes changes to improve the handling of JSON parsing when it
fails to be parsed as it is in function call arguments by adding a
regular expression for preprocessing JSON strings to remove single
quotes only when they are not escaped.

### Description

<!-- Describe your changes, the overall approach, the underlying design.
These notes will help understanding how your code works. Thanks! -->

Enhancements to JSON parsing:

*
[`python/semantic_kernel/contents/function_call_content.py`](diffhunk://#diff-1f5f27eade7117045bc9e62d66ebef5c93047142cdb4fd0115b0ff8eadf2f69cL152-R164):
Modified `parse_arguments` method to preprocess JSON strings by
replacing single quotes with double quotes, except for escaped single
quotes. Added a debug log for invalid JSON.

Test file updates:

*
[`python/tests/unit/contents/test_function_call_content.py`](diffhunk://#diff-6e891b3044d5bf5caa7abc97a44ef11ce291aba01fa8032ab1104773f4545c9dR135-R148):
Added new tests for parsing single-quoted JSON strings. Renamed the test
file from `test_function_call.py` to `test_function_call_content.py`.

### Contribution Checklist

<!-- Before submitting this PR, please make sure: -->

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄
  • Loading branch information
TaoChenOSU authored Jan 9, 2025
1 parent dcad32e commit 4882550
Show file tree
Hide file tree
Showing 3 changed files with 49 additions and 10 deletions.
31 changes: 23 additions & 8 deletions python/DEV_SETUP.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ make install
```

This will install uv, python, Semantic Kernel and all dependencies and the pre-commit config. It uses python 3.10 by default, if you want to change that set the `PYTHON_VERSION` environment variable to the desired version (currently supported are 3.10, 3.11, 3.12). For instance for 3.12"

```bash
make install PYTHON_VERSION=3.12
```
Expand All @@ -71,6 +71,8 @@ Alternatively you can run the VSCode task `Python: Install` to run the same comm

## VSCode Setup

Install the [Python extension](https://marketplace.visualstudio.com/items?itemName=ms-python.python) for VSCode.

Open the workspace in [VSCode](https://code.visualstudio.com/docs/editor/workspaces).
> The workspace for python should be rooted in the `./python` folder.
Expand All @@ -87,13 +89,19 @@ Read more about the extension [here](https://github.com/astral-sh/ruff-vscode).

- We have removed the strict dependency on forcing `pytest` usage via the `.vscode/settings.json` file.
- Developers are free to set up unit tests using their preferred framework, whether it is `pytest` or `unittest`.
- If needed, adjust VSCode's local `settings.json` (accessed via the Command Palette: **Open User Settings (JSON)**) to configure the test framework. For example:
- If needed, adjust VSCode's local `settings.json` (accessed via the Command Palette(`Ctrl+Shift+P`) and type `Preferences: Open User Settings (JSON)`) to configure the test framework. For example:

```json
"pythonTestExplorer.testFramework": "pytest"
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
```

Or, for `unittest`:

```json
"pythonTestExplorer.testFramework": "unittest"
"python.testing.unittestEnabled": true,
"python.testing.pytestEnabled": false,
```

## LLM setup

Expand All @@ -116,6 +124,7 @@ There are a lot of settings, for a more extensive list of settings, see [ALL_SET
To configure a `.env` file with just the keys needed for OpenAI Chat Completions, you can create a `openai.env` (this name is just as an example, a single `.env` with all required keys is more common) file in the root of the `python` folder with the following content:

Content of `openai.env`:

```env
OPENAI_API_KEY=""
OPENAI_CHAT_MODEL_ID="gpt-4o-mini"
Expand Down Expand Up @@ -153,7 +162,6 @@ You can also run all the tests together under the [tests](tests/) folder.
Alternatively, you can run them using VSCode Tasks. Open the command palette
(`Ctrl+Shift+P`) and type `Tasks: Run Task`. Select `Python: Tests - All` from the list.


## Implementation Decisions

### Asynchronous programming
Expand All @@ -170,17 +178,18 @@ We follow the [Google Docstring](https://github.com/google/styleguide/blob/gh-pa
They are currently not checked for private functions (functions starting with '_').

They should contain:

- Single line explaining what the function does, ending with a period.
- If necessary to further explain the logic a newline follows the first line and then the explanation is given.
- The following three sections are optional, and if used should be separated by a single empty line.
- Arguments are then specified after a header called `Args:`, with each argument being specified in the following format:
- `arg_name`: Explanation of the argument.
- `arg_name`: Explanation of the argument.
- if a longer explanation is needed for a argument, it should be placed on the next line, indented by 4 spaces.
- Type and default values do not have to be specified, they will be pulled from the definition.
- Returns are specified after a header called `Returns:` or `Yields:`, with the return type and explanation of the return value.
- Finally, a header for exceptions can be added, called `Raises:`, with each exception being specified in the following format:
- `ExceptionType`: Explanation of the exception.
- if a longer explanation is needed for a exception, it should be placed on the next line, indented by 4 spaces.
- `ExceptionType`: Explanation of the exception.
- if a longer explanation is needed for a exception, it should be placed on the next line, indented by 4 spaces.

Putting them all together, gives you at minimum this:

Expand All @@ -189,6 +198,7 @@ def equal(arg1: str, arg2: str) -> bool:
"""Compares two strings and returns True if they are the same."""
...
```

Or a complete version of this:

```python
Expand Down Expand Up @@ -290,6 +300,7 @@ To run the same checks that run during a commit and the GitHub Action `Python Co
```

or use the following task (using `Ctrl+Shift+P`):

- `Python - Run Checks` to run the checks on the whole project.
- `Python - Run Checks - Staged` to run the checks on the currently staged files only.

Expand All @@ -302,19 +313,23 @@ We try to maintain a high code coverage for the project. To run the code coverag
```bash
uv run pytest --cov=semantic_kernel --cov-report=term-missing:skip-covered tests/unit/
```

or use the following task (using `Ctrl+Shift+P`):

- `Python: Tests - Code Coverage` to run the code coverage on the whole project.

This will show you which files are not covered by the tests, including the specific lines not covered. Make sure to consider the untested lines from the code you are working on, but feel free to add other tests as well, that is always welcome!

## Catching up with the latest changes

There are many people committing to Semantic Kernel, so it is important to keep your local repository up to date. To do this, you can run the following commands:

```bash
git fetch upstream main
git rebase upstream/main
git push --force-with-lease
```

or:

```bash
Expand Down
14 changes: 12 additions & 2 deletions python/semantic_kernel/contents/function_call_content.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

import json
import logging
import re
from collections.abc import Mapping
from typing import TYPE_CHECKING, Any, ClassVar, Final, Literal, TypeVar
from xml.etree.ElementTree import Element # nosec
Expand Down Expand Up @@ -149,9 +150,18 @@ def parse_arguments(self) -> Mapping[str, Any] | None:
if isinstance(self.arguments, Mapping):
return self.arguments
try:
return json.loads(self.arguments.replace("'", '"'))
return json.loads(self.arguments)
except json.JSONDecodeError as exc:
raise FunctionCallInvalidArgumentsException("Function Call arguments are not valid JSON.") from exc
logger.debug("Function Call arguments are not valid JSON. Trying to preprocess.")
try:
# Python strings can be single quoted, but JSON strings should be double quoted.
# JSON keys and values should be enclosed in double quotes.
# Replace single quotes with double quotes, but not if it's an escaped single quote.
return json.loads(re.sub(r"(?<!\\)'", '"', self.arguments).replace("\\'", "'"))
except json.JSONDecodeError:
raise FunctionCallInvalidArgumentsException(
"Function Call arguments are not valid JSON even after preprocessing."
) from exc

def to_kernel_arguments(self) -> "KernelArguments":
"""Return the arguments as a KernelArguments instance."""
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,20 @@ def test_parse_arguments_none():
assert fc.parse_arguments() is None


def test_parse_arguments_single_quotes():
# Test parsing arguments that are single quoted
fc = FunctionCallContent(id="test", name="Test-Function", arguments="{'input': 'world'}")
assert fc.parse_arguments() == {"input": "world"}

fc = FunctionCallContent(id="test", name="Test-Function", arguments="{'input': 'Let\\'s go'}")
assert fc.parse_arguments() == {"input": "Let's go"}

# Note this the below test is not a valid json string
fc = FunctionCallContent(id="test", name="Test-Function", arguments="{'input': 'Let's go'}")
with pytest.raises(FunctionCallInvalidArgumentsException):
fc.parse_arguments()


def test_parse_arguments_fail():
# Test parsing arguments to dictionary
fc = FunctionCallContent(id=None, name="Test-Function", arguments="""{"input": "world}""")
Expand Down

0 comments on commit 4882550

Please sign in to comment.