OpenSPG/KAG: KAG is a logical form-guided reasoning and retrieval framework based on the OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge bases, and it addresses shortcomings of the traditional RAG vector similarity calculation model.

Aljwadh, December 30, 2024



KAG is a logical reasoning and Q&A framework based on the OpenSPG engine and large language models. It is used to build logical reasoning and Q&A solutions for vertical domain knowledge bases. KAG addresses the ambiguity of traditional RAG vector similarity calculation and the noise that OpenIE introduces into GraphRAG. It supports logical reasoning, multi-hop factual Q&A, and related tasks, and its authors report that it performs significantly better than current SOTA methods.

The goal of KAG is to build a knowledge-enhanced LLM service framework for professional domains that supports logical reasoning, factual Q&A, and similar tasks. KAG combines the logical and factual characteristics of knowledge graphs (KGs). Its main features include:

Knowledge and Chunk Mutual Indexing structure to integrate more complete text information in context
Knowledge alignment using conceptual semantic reasoning to mitigate the noise problem caused by OpenIE
Schema-constrained knowledge construction to support domain expert knowledge representation and construction
Logical form-guided hybrid reasoning and retrieval to support logical reasoning and multi-hop reasoning Q&A

⭐️ Star our repository to stay up-to-date with exciting new features and improvements! Get instant notifications for new releases! 🌟

2.1 Knowledge Representation

In the context of private knowledge bases, unstructured data, structured information, and business expert experience often coexist. KAG draws on the DIKW hierarchy to upgrade SPG into an LLM-friendly version.

For unstructured data such as news, events, logs, and books, as well as structured data such as transactions, statistics, and approvals, along with business experience and domain knowledge rules, KAG uses techniques such as layout analysis, knowledge extraction, property normalization, and semantic alignment to integrate raw business data and expert rules into a unified business knowledge graph.

For the same knowledge type (e.g., entity type, event type), KAG is compatible with both schema-free information extraction and schema-constrained expert knowledge construction, and it supports a mutual-index representation between the graph structure and the original text chunks.

This mutual-index representation helps build an inverted index based on the graph structure, and it promotes unified representation of and reasoning over logical forms.
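The mutual index between graph nodes and text chunks can be pictured with a small in-memory structure. The following is only a minimal sketch under stated assumptions: the class `MutualIndex`, its method names, and the toy company data are all hypothetical and are not part of the KAG or OpenSPG API.

```python
from collections import defaultdict


class MutualIndex:
    """Toy knowledge/chunk mutual index (hypothetical; not KAG's API)."""

    def __init__(self):
        self.chunks = {}                            # chunk_id -> original text
        self.entity_to_chunks = defaultdict(set)    # inverted index: node -> chunks
        self.chunk_to_entities = defaultdict(set)   # forward index: chunk -> nodes
        self.triples = set()                        # (subject, predicate, object)

    def add_chunk(self, chunk_id, text, triples):
        """Store a chunk and link every entity in its triples back to it."""
        self.chunks[chunk_id] = text
        for subj, pred, obj in triples:
            self.triples.add((subj, pred, obj))
            for entity in (subj, obj):
                self.entity_to_chunks[entity].add(chunk_id)
                self.chunk_to_entities[chunk_id].add(entity)

    def chunks_for(self, entity):
        """Inverted lookup: graph node -> ids of the chunks that mention it."""
        return sorted(self.entity_to_chunks.get(entity, set()))

    def neighbors(self, entity):
        """Entities one hop away from `entity` in the toy graph."""
        out = {o for s, _, o in self.triples if s == entity}
        inc = {s for s, _, o in self.triples if o == entity}
        return sorted(out | inc)

    def context_for(self, entity):
        """Texts of chunks mentioning the entity or any one-hop neighbor."""
        ids = set(self.entity_to_chunks.get(entity, set()))
        for other in self.neighbors(entity):
            ids |= self.entity_to_chunks.get(other, set())
        return [self.chunks[c] for c in sorted(ids)]


if __name__ == "__main__":
    index = MutualIndex()
    # Invented example data, for illustration only.
    index.add_chunk("c1", "Acme Corp acquired Widget Ltd in 2021.",
                    [("Acme Corp", "acquired", "Widget Ltd")])
    index.add_chunk("c2", "Widget Ltd is based in Springfield.",
                    [("Widget Ltd", "basedIn", "Springfield")])
    print(index.chunks_for("Widget Ltd"))
    print(index.context_for("Acme Corp"))
```

Keeping both directions of the index means a graph hit can pull in the full source text for context, and a text hit can be expanded through the graph to related chunks.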

2.2 Mixed Reasoning Guided by Logic Forms

KAG proposes a logical form-guided hybrid solving and reasoning engine.

The engine includes three types of operators: planning, reasoning, and retrieval, which transform natural language problems into problem-solving processes that combine language and symbols.

In this process, each step can use a different operator, such as exact-match retrieval, text retrieval, numerical calculation, or semantic reasoning, so that four different problem-solving processes are integrated: retrieval, knowledge graph reasoning, language reasoning, and numerical calculation.
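To make the idea of per-step operators concrete, here is a minimal sketch of a plan executor that dispatches each step to a different operator. It rests on several assumptions: the plan is written by hand (in KAG the planning step would come from an LLM), and the operator names, the plan format, and the toy facts and documents are all hypothetical rather than KAG's actual logical form syntax.

```python
import operator

# Invented toy knowledge, for illustration only.
FACTS = {
    ("Acme Corp", "revenue_2022"): 100.0,
    ("Acme Corp", "revenue_2023"): 120.0,
}
DOCS = [
    "Acme Corp reported strong growth in 2023.",
    "Unrelated text about the weather.",
]


def exact_match(subject, predicate, memory):
    """Exact-match retrieval of a structured fact from the toy graph."""
    return FACTS.get((subject, predicate))


def text_retrieval(query, memory):
    """Keyword-overlap retrieval over raw text (stand-in for vector search)."""
    terms = set(query.lower().split())
    return [d for d in DOCS if terms & set(d.lower().rstrip(".").split())]


def numeric(op, left, right, memory):
    """Numerical calculation over the results of earlier steps."""
    ops = {"sub": operator.sub, "div": operator.truediv}
    return ops[op](memory[left], memory[right])


OPERATORS = {
    "exact_match": exact_match,
    "text_retrieval": text_retrieval,
    "numeric": numeric,
}


def run_plan(plan):
    """Execute steps in order; each step stores its result under `out`."""
    memory = {}
    for step in plan:
        fn = OPERATORS[step["op"]]
        memory[step["out"]] = fn(*step["args"], memory)
    return memory


# Hand-written plan for: "By what fraction did Acme Corp's revenue grow
# from 2022 to 2023, and what text supports it?"
PLAN = [
    {"op": "exact_match", "args": ("Acme Corp", "revenue_2023"), "out": "r23"},
    {"op": "exact_match", "args": ("Acme Corp", "revenue_2022"), "out": "r22"},
    {"op": "numeric", "args": ("sub", "r23", "r22"), "out": "delta"},
    {"op": "numeric", "args": ("div", "delta", "r22"), "out": "growth"},
    {"op": "text_retrieval", "args": ("Acme growth",), "out": "evidence"},
]

if __name__ == "__main__":
    result = run_plan(PLAN)
    print(result["growth"], result["evidence"])
```

Because every step names its inputs and its output explicitly, the intermediate values can be inspected, which is part of what separates this style of solving from a single similarity search.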

2024.11.21 : Support for uploading Word documents, model invocation concurrency settings, user experience optimization, etc.
2024.10.25 : KAG initial release

Domain knowledge injection, domain schema customization, QFS task support, visual query analysis, etc.
Logical reasoning optimization, conversational task support
kag-model release, and solutions for event reasoning knowledge graphs and medical knowledge graphs
KAG front-end open source, distributed build support, mathematical reasoning optimization

4.1 Product-based (for ordinary users)

4.1.1 Engine and Dependent Image Installation

Use the following commands to download the docker-compose-west.yml file and launch the services with Docker Compose.

# set the HOME environment variable (only Windows users need to execute this command)
# set HOME=%USERPROFILE%

curl -sSL https://raw.githubusercontent.com/OpenSPG/openspg/refs/heads/master/dev/release/docker-compose-west.yml -o docker-compose-west.yml
docker compose -f docker-compose-west.yml up -d

Open the default KAG product URL in your browser: http://127.0.0.1:8887
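Before opening the browser, it can be useful to confirm that something is answering on that port. The helper below is a small sketch that uses only the Python standard library; the function name `is_reachable` is hypothetical, and the check only tells you that an HTTP server responded, not that every KAG service finished starting.

```python
import http.client
import urllib.error
import urllib.request


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP server answers at `url` without a 5xx error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404 or 503).
        return err.code < 500
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, malformed response, etc.
        return False


if __name__ == "__main__":
    # Default product URL given in the text above.
    print(is_reachable("http://127.0.0.1:8887"))
```

If this prints False shortly after `docker compose ... up -d`, the containers may still be starting; `docker compose -f docker-compose-west.yml ps` shows their state.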

See the Quick Start for Product Mode for a detailed introduction.

4.2 Toolkit-based (for developers)

4.2.1 Engine and Dependent Image Installation

See section 4.1.1 to complete the installation of the engine and dependent images.

4.2.2 Installation of KAG

macOS / Linux developers

# Create conda env
conda create -n kag-demo python=3.10 && conda activate kag-demo

# Clone code
git clone https://github.com/OpenSPG/KAG.git

# Install KAG
cd KAG && pip install -e .

Windows developers

# Install the official Python 3.8.10 or later, and install Git.

# Create and activate Python venv
py -m venv kag-demo && kag-demo\Scripts\activate

# Clone code
git clone https://github.com/OpenSPG/KAG.git

# Install KAG
cd KAG && pip install -e .
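After installation, a quick environment check can catch the most common problems. This sketch assumes that the editable install exposes a top-level module named `kag` (inferred from the repository layout, so treat it as an assumption); the helper names are hypothetical, and the minimum version is the 3.8.10 stated above.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8, 10)  # minimum version stated in the install notes


def version_ok(version=None, minimum=MIN_PYTHON):
    """True if the given (major, minor, micro) tuple meets the minimum."""
    if version is None:
        version = tuple(sys.version_info[:3])
    return tuple(version) >= tuple(minimum)


def module_installed(name):
    """True if `name` can be located by the import system in this environment."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python version OK:", version_ok())
    # "kag" is the assumed module name; adjust if the package layout differs.
    print("kag importable:", module_installed("kag"))
```

Run it inside the activated `kag-demo` environment so that it inspects the interpreter that `pip install -e .` actually used.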

Please see the Quick Start for Developer Mode guide for a detailed introduction to the toolkit. You can then use the built-in components to reproduce the performance results on the built-in datasets, and apply those components to new business scenarios.

The KAG framework includes three parts: kg-builder, kg-solver, and kag-model. This release includes only the first two parts; kag-model will be gradually open-sourced in the future.

kg-builder implements a knowledge representation that is friendly to large language models (LLMs). Based on the hierarchical structure of DIKW (data, information, knowledge, and wisdom), it improves the knowledge representation ability of SPG. For the same knowledge type (such as entity type and event type), it is compatible with both schema-free information extraction and schema-constrained professional knowledge construction. It also supports a mutual-index representation between the graph structure and the original text chunks, which enables efficient retrieval during the reasoning and question-answering stage.

kg-solver uses a logical symbol-guided hybrid solving and reasoning engine that includes three types of operators: planning, reasoning, and retrieval. These operators transform natural language problems into a problem-solving process that combines language and symbols. In this process, each step can use a different operator, such as exact-match retrieval, text retrieval, numerical calculation, or semantic reasoning, so that four different problem-solving processes are integrated: retrieval, knowledge graph reasoning, language reasoning, and numerical calculation.

GitHub: https://github.com/OpenSPG/KAG

OpenSPG: https://spg.openkg.cn/

KAG introduction and applications: https://github.com/orgs/OpenSPG/discussions/52

If you use this software, please cite it as below:

@article{liang2024kag,
  title={KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation},
  author={Liang, Lei and Sun, Mengshu and Gui, Zhengke and Zhu, Zhongshu and Jiang, Zhouyu and Zhong, Ling and Qu, Yuan and Zhao, Peilong and Bo, Zhongpu and Yang, Jin and others},
  journal={arXiv preprint arXiv:2409.13731},
  year={2024}
}

@article{yikgfabric,
  title={KGFabric: A Scalable Knowledge Graph Warehouse for Enterprise Data Interconnection},
  author={Yi, Peng and Liang, Lei and Da Zhang, Yong Chen and Zhu, Jinye and Liu, Xiangyu and Tang, Kun and Chen, Jialin and Lin, Hao and Qiu, Leijie and Zhou, Jun}
}

Apache License 2.0

2024-12-30 02:55:00
