[Bug]: Entity extraction with the nltk strategy fails with KeyError: "Column(s) ['description', 'source_id'] do not exist" #1601

Open
3 tasks
HENScience opened this issue Jan 9, 2025 · 0 comments
Labels
bug Something isn't working triage Default label assignment, indicates new issue needs reviewed by a maintainer

HENScience commented Jan 9, 2025

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the bug

When I set the entity extraction strategy to nltk, index creation fails with the following error:
KeyError: "Column(s) ['description', 'source_id'] do not exist"
graphrag\index\operations\extract_entities\extract_entities.py", line 171, in _merge_entities
.agg(description=("description", list), text_unit_ids=("source_id", list))
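
The same KeyError can be reproduced outside GraphRAG with plain pandas. This is a minimal sketch, not the library's code: the grouping keys (`title`, `type`), the sample rows, and the helper name `merge_entities` are assumptions made for illustration. Only the `.agg(...)` call is taken from the traceback above. It assumes the entity rows reach the merge step without `description` and `source_id` columns, which may or may not be what the nltk strategy actually produces.

```python
import pandas as pd

# Hypothetical entity rows that carry only "title" and "type"
# (assumption: no "description" or "source_id" columns are present).
entities = pd.DataFrame(
    [
        {"title": "MICROSOFT", "type": "organization"},
        {"title": "MICROSOFT", "type": "organization"},
        {"title": "SEATTLE", "type": "geo"},
    ]
)

# Columns that the .agg(...) call from the traceback reads.
REQUIRED = ["description", "source_id"]


def merge_entities(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the named aggregation shown in the traceback.

    The groupby keys are an assumption; the agg call is from the issue.
    """
    return (
        df.groupby(["title", "type"], sort=False)
        .agg(description=("description", list), text_unit_ids=("source_id", list))
        .reset_index()
    )


# Report which required columns are absent before aggregating.
missing = [col for col in REQUIRED if col not in entities.columns]
print("missing columns:", missing)

# pandas raises KeyError when a named aggregation refers to absent columns.
try:
    merge_entities(entities)
    error_message = None
except KeyError as exc:
    error_message = str(exc)

print("error:", error_message)
```

If the sketch matches what happens during indexing, checking the columns of the extracted entities just before the merge step should show whether `description` and `source_id` are present when the nltk strategy is used.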

Steps to reproduce

No response

Expected Behavior

No response

GraphRAG Config Used

entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1
  strategy: 
    type: nltk

Logs and screenshots

No response

Additional Information

  • GraphRAG Version: v1.1.1
  • Operating System: Windows 11 Professional
  • Python Version: 3.10