GraphGPT
https://github.com/varunshenoy/GraphGPT
GraphGPT
graphgpt.vercel.app
I recently came across GraphGPT. GraphGPT is a tool that identifies entities and the relationships between them and draws those relationships as a graph; it replaces conventional ontology-graph construction techniques with a GPT model. Besides connecting entities, it can also change the color of a node.
However, GraphGPT has some limitations.
First, it is based on the GPT-3 model.
Second, its prompt is written in English, so it performed poorly on Korean queries.
So I changed the program to use the GPT-3.5 (gpt-3.5-turbo) model,
and modified the prompt so that node and edge labels are output in Korean.
I also added more detailed examples to the prompt to address the performance problems of the original GraphGPT.
One more thing! GraphGPT already provides an excellent interactive visualization through its web application,
but I finished by drawing the graph with the NetworkX library so that font size, node size, and so on can be customized freely.
Writing a Korean prompt for the GPT-3.5 model
The prompts used by the developer can be found in the Prompts directory of the GraphGPT GitHub repository. As you can see, they are written in English, so I used them as the basis for a Korean version and added examples.
Given the current state of the graph and a prompt, first look for noun-type entities in the prompt, and those entities become nodes.
Find the right word to express the relationship between the entities. Those relationships become the edge. If you can't find relationships, don't force it.
Every node has an id, label, and color (in hex).
Every edge has a to and from with node ids, and a label.
Edges are directed, so the order of the from and to is important.
You MUST express in KOREAN.
Examples 1: The default color for nodes is #cccccc.
current state: {}
user prompt: 도연은 소연과 서로 친구입니다.
new state: { "nodes": [ {"id": 1, "label": "도연", "color": "#cccccc"}, {"id": 2, "label": "소연", "color": "#cccccc"} ], "edges": [ {"from": 1, "to": 2, "label": "친구"}, {"from": 2, "to": 1, "label": "친구"} ] }
Examples 2: Keep current state and update new state.
current state: { "nodes": [ { "id": 1, "label": "밥", "color": "#cccccc" } ], "edges": [] }
user prompt: 엘리스는 밥의 룸메이트입니다. 엘리스의 노드를 초록색으로 만들어주세요.
new state: { "nodes": [ {"id": 1, "label": "밥", "color": "#cccccc"}, {"id": 2, "label": "엘리스", "color": "#00ff00"} ], "edges": [ {"from": 1, "to": 2, "label": "룸메이트"} ] }
Examples 3: GPT can assign a label for an edge to a word that does not appear in the prompt.
current state: {}
user prompt: 도연은 대구에서 태어났습니다.
new state: { "nodes": [ {"id": 1, "label": "도연", "color": "#cccccc"}, {"id": 2, "label": "대구", "color": "#cccccc"} ], "edges": [ {"from": 1, "to": 2, "label": "출생"} ] }
Examples 4: GPT need to identify key relationships and put them into words.
current state: {}
user prompt: 사람들은 자주 만나면 유대감을 형성합니다.
new state: { "nodes": [ {"id": 1, "label": "만남", "color": "#cccccc"}, {"id": 2, "label": "유대감", "color": "#cccccc"} ], "edges": [ {"from": 1, "to": 2, "label": "형성"}] }
Examples 5: When a relationship is defined only as 'is(이다)', the label of the relationship (edge) is unified as 'is(이다)'.
current state: {}
user prompt: 1호선에 해당하는 역은 수원역, 신도림역, 광운대역 등이 있다.
new state : { "nodes": [ {"id": 1, "label": "1호선역", "color": "#cccccc"}, {"id": 2, "label": "수원역", "color": "#cccccc"}, {"id": 3, "label": "신도림역", "color": "#cccccc"}, {"id": 4, "label": "광운대역", "color": "#cccccc"} ],
"edges": [ {"from": 2, "to": 1, "label": "이다"}, {"from": 3, "to": 1, "label": "이다"}, {"from": 4, "to": 1, "label": "이다"}] }
===
์ 5๊ฐ์ง ์ฌ๋ก๋ฅผ ๊ธฐ๋ฐ์ผ๋ก new state๋ฅผ ๋ต๋ณํ๋ AI๊ฐ ๋์ธ์.
์ถ๊ฐ, ์์ ํ ํ๋กฌํํธ ์ ๋ฌธ์ ๋๋ค. ์ด๊ฒ์ ๊ฒ์ ๋ค ์ปค๋ฒํ๋ ค๋ค ๋ณด๋ ํ๋กฌํํธ๊ฐ ๊ธธ์ด์ง ๋ฏ ํ์ง๋ง, ๋ค์ํ ์ผ์ด์ค์์ ์ํ๋ ๊ฒฐ๊ณผ๊ฐ ๋์ฌ ์ ์๋๋ก ์์ฑํ์ต๋๋ค.
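The rules stated in the prompt (every node has an id, a label, and a hex color; every edge has a from, a to, and a label that refer to node ids) can be checked mechanically before a reply is used. The following is only a minimal sketch under that reading of the prompt; `validate_state` and `HEX_COLOR` are hypothetical names that do not appear in the original notebook.

```python
import re

# Six-digit hex colors such as "#cccccc", as used in the prompt examples.
HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


def validate_state(state):
    """Return a list of problems found in a graph state dict (empty if none)."""
    problems = []
    ids = set()
    for node in state.get("nodes", []):
        missing = {"id", "label", "color"} - set(node)
        if missing:
            problems.append(f"node {node!r} is missing {sorted(missing)}")
            continue
        if node["id"] in ids:
            problems.append(f"duplicate node id {node['id']}")
        ids.add(node["id"])
        if not HEX_COLOR.match(str(node["color"])):
            problems.append(f"node {node['id']} has a non-hex color {node['color']!r}")
    for edge in state.get("edges", []):
        missing = {"from", "to", "label"} - set(edge)
        if missing:
            problems.append(f"edge {edge!r} is missing {sorted(missing)}")
            continue
        # Edges are directed and must point at ids that exist in "nodes".
        for end in ("from", "to"):
            if edge[end] not in ids:
                problems.append(f"edge {edge!r} refers to unknown node id {edge[end]}")
    return problems


good = {
    "nodes": [
        {"id": 1, "label": "밥", "color": "#cccccc"},
        {"id": 2, "label": "엘리스", "color": "#00ff00"},
    ],
    "edges": [{"from": 1, "to": 2, "label": "룸메이트"}],
}
bad = {
    "nodes": [{"id": 1, "label": "밥", "color": "green"}],
    "edges": [{"from": 1, "to": 3, "label": "룸메이트"}],
}
print(validate_state(good))
print(validate_state(bad))
```

A check like this is useful because an edge that points at a missing node id would otherwise only surface later, when the graph is drawn.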
Implementing the knowledge graph with GPT-3.5 and NetworkX
https://colab.research.google.com/drive/1yHlAC2IHxPcKlA_nuH8kn2bAew4Vv3ZM?usp=sharing
ChatGPT API๋ฅผ ํตํ ์ง์๊ทธ๋ํ ๊ตฌ์ถ.ipynb
Colaboratory notebook
colab.research.google.com
Now let's implement it. The full code is available through the Colab link above.
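# Assumes `chat` (a LangChain ChatOpenAI instance) and `system_template` (the prompt above)
# are defined earlier in the notebook; the import path below matches 2023-era LangChain releases.
from langchain.schema import HumanMessage, SystemMessage

def making_prompt(is_start, current_state, user_prompt):
    # The first query starts from an empty graph; later queries pass the previous state back in.
    if is_start:
        human_template = 'current state: {}'
    else:
        human_template = 'current state: {}'.format(current_state)
    human_template += ' user prompt: ' + user_prompt + ' new state:'
    result = chat(
        [
            SystemMessage(content=system_template),
            HumanMessage(content=human_template),
        ]
    )
    return result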
This is the function that builds the prompt. It is set up so that the model answers with a new state every time the user enters a new sentence. As the prompt above shows, the basic idea is to keep the existing state and update it into the new state. For the very first query, the current state has to be {} (empty).
I used LangChain, combining the system template (Instruction) and the input template (Input) with LangChain's ChatOpenAI module.
After that, I repeatedly took user input and updated the new state. Below is an example I tried.
new state : {}
User Input : 아일랜드의 수도는 더블린이다. (The capital of Ireland is Dublin.)
new state :
{ "nodes": [ {"id": 1, "label": "아일랜드", "color": "#cccccc"}, {"id": 2, "label": "더블린", "color": "#cccccc"} ],
"edges": [ {"from": 1, "to": 2, "label": "수도"} ] }
User Input : 더블린의 명소에는 트리니티대학, 피닉스파크 등이 있다. (Dublin's attractions include Trinity College and Phoenix Park.)
new state :
{ "nodes": [ {"id": 1, "label": "아일랜드", "color": "#cccccc"}, {"id": 2, "label": "더블린", "color": "#cccccc"}, {"id": 3, "label": "트리니티대학", "color": "#cccccc"}, {"id": 4, "label": "피닉스파크", "color": "#cccccc"} ], "edges": [ {"from": 1, "to": 2, "label": "수도"}, {"from": 2, "to": 3, "label": "명소"}, {"from": 2, "to": 4, "label": "명소"} ] }
User Input : 아일랜드의 노드 색상을 파란색으로 해주세요. (Please make the Ireland node blue.)
new state :
{ "nodes": [ {"id": 1, "label": "아일랜드", "color": "#0000ff"}, {"id": 2, "label": "더블린", "color": "#cccccc"}, {"id": 3, "label": "트리니티대학", "color": "#cccccc"}, {"id": 4, "label": "피닉스파크", "color": "#cccccc"} ], "edges": [ {"from": 1, "to": 2, "label": "수도"}, {"from": 2, "to": 3, "label": "명소"}, {"from": 2, "to": 4, "label": "명소"} ] }
User Input : 아일랜드는 맥주와 커피가 유명합니다. (Ireland is famous for beer and coffee.)
new state :
{ "nodes": [ {"id": 1, "label": "아일랜드", "color": "#0000ff"}, {"id": 2, "label": "더블린", "color": "#cccccc"}, {"id": 3, "label": "트리니티대학", "color": "#cccccc"}, {"id": 4, "label": "피닉스파크", "color": "#cccccc"}, {"id": 5, "label": "맥주", "color": "#cccccc"}, {"id": 6, "label": "커피", "color": "#cccccc"} ], "edges": [ {"from": 1, "to": 2, "label": "수도"}, {"from": 2, "to": 3, "label": "명소"}, {"from": 2, "to": 4, "label": "명소"}, {"from": 1, "to": 5, "label": "유명"}, {"from": 1, "to": 6, "label": "유명"} ] }
User Input : 기네스가 유명한 맥주 중 하나입니다. (Guinness is one of the famous beers.)
new state :
{ "nodes": [ {"id": 1, "label": "아일랜드", "color": "#0000ff"}, {"id": 2, "label": "더블린", "color": "#cccccc"}, {"id": 3, "label": "트리니티대학", "color": "#cccccc"}, {"id": 4, "label": "피닉스파크", "color": "#cccccc"}, {"id": 5, "label": "맥주", "color": "#cccccc"}, {"id": 6, "label": "커피", "color": "#cccccc"}, {"id": 7, "label": "기네스", "color": "#cccccc"} ], "edges": [ {"from": 1, "to": 2, "label": "수도"}, {"from": 2, "to": 3, "label": "명소"}, {"from": 2, "to": 4, "label": "명소"}, {"from": 1, "to": 5, "label": "유명"}, {"from": 1, "to": 6, "label": "유명"}, {"from": 5, "to": 7, "label": "유명"} ] }
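The session above follows one pattern: send the current state plus the new sentence, then treat the reply as the next current state. A minimal sketch of that loop follows. It assumes the reply is a bare JSON string, and it serializes the state with `json.dumps` rather than the `str.format` call used in the notebook. `update_graph` and `fake_model` are hypothetical names, and `fake_model` is a canned stand-in for the real chat call so the example runs offline.

```python
import json


def update_graph(ask_model, user_inputs):
    """Feed each sentence to the model together with the previous state."""
    state = {}  # the first query starts from an empty graph
    for text in user_inputs:
        message = "current state: {} user prompt: {} new state:".format(
            json.dumps(state, ensure_ascii=False), text
        )
        reply = ask_model(message)
        try:
            state = json.loads(reply)
        except json.JSONDecodeError:
            # Keep the previous state if the reply is not valid JSON.
            pass
    return state


# Canned replies standing in for the chat model (assumption: not real model output).
sent = []
replies = iter([
    '{"nodes": [{"id": 1, "label": "아일랜드", "color": "#cccccc"},'
    ' {"id": 2, "label": "더블린", "color": "#cccccc"}],'
    ' "edges": [{"from": 1, "to": 2, "label": "수도"}]}',
    "Sorry, I could not build a graph.",
])


def fake_model(message):
    sent.append(message)
    return next(replies)


final_state = update_graph(
    fake_model,
    ["아일랜드의 수도는 더블린이다.", "아일랜드는 맥주와 커피가 유명합니다."],
)
print(final_state["edges"])
```

Keeping the previous state on a parse failure is one possible design choice; it means a single malformed reply does not wipe out the graph built so far.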
The final state produced this way can be passed directly to networkx as input.
Nodes:
[{'id': 1, 'label': '아일랜드', 'color': '#0000ff'},
{'id': 2, 'label': '더블린', 'color': '#cccccc'},
{'id': 3, 'label': '트리니티대학', 'color': '#cccccc'},
{'id': 4, 'label': '피닉스파크', 'color': '#cccccc'},
{'id': 5, 'label': '맥주', 'color': '#cccccc'},
{'id': 6, 'label': '커피', 'color': '#cccccc'},
{'id': 7, 'label': '기네스', 'color': '#cccccc'}]
Edges:
[{'from': 1, 'to': 2, 'label': '수도'},
{'from': 2, 'to': 3, 'label': '명소'},
{'from': 2, 'to': 4, 'label': '명소'},
{'from': 1, 'to': 5, 'label': '유명'},
{'from': 1, 'to': 6, 'label': '유명'},
{'from': 5, 'to': 7, 'label': '유명'}]
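Before the node and edge lists can be used, they have to be pulled out of the model's reply text. A minimal sketch, assuming the reply is a string containing a single JSON object that may be surrounded by extra text; `extract_state` is a hypothetical helper and is not taken from the notebook.

```python
import json


def extract_state(reply_text):
    """Return (nodes, edges) parsed from the JSON object inside reply_text."""
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    # Parse only the span from the first "{" to the last "}".
    state = json.loads(reply_text[start:end + 1])
    return state.get("nodes", []), state.get("edges", [])


reply = (
    'new state: {"nodes": [{"id": 5, "label": "맥주", "color": "#cccccc"},'
    ' {"id": 7, "label": "기네스", "color": "#cccccc"}],'
    ' "edges": [{"from": 5, "to": 7, "label": "유명"}]}'
)
nodes, edges = extract_state(reply)
print(nodes)
print(edges)
```

Slicing between the outermost braces is a simple heuristic; it would not cope with a reply that contains several separate JSON objects.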
import networkx as nx
import matplotlib.pyplot as plt

# Create the networkx graph
G = nx.DiGraph()

# Add nodes
for node in nodes:
    # print(node["id"], node["label"], node["color"])
    G.add_node(node["id"], label=node["label"], color=node["color"])

# Add edges
for edge in edges:
    # print((edge["from"], edge["to"], edge["label"]))
    G.add_edge(edge["from"], edge["to"], label=edge["label"], length=10)

# Visualize the graph
pos = nx.spring_layout(G, k=0.2)  # compute node positions
# pos = nx.kamada_kawai_layout(G)
# pos = nx.layout.fruchterman_reingold_layout(G)
# pos = nx.random_layout(G)
# pos = nx.spectral_layout(G)
I created a directed graph (DiGraph) and added the nodes and edges with the add_node and add_edge functions. The values in the reply from GPT can be passed as the arguments unchanged.
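# Node colors in the same order as G.nodes(). nColors is not defined in this excerpt,
# so this definition is an assumption about how the notebook builds it.
nColors = [G.nodes[node]['color'] for node in G.nodes()]
plt.figure(figsize=(120, 80))
nx.draw_networkx(G, pos,
                 labels={node: G.nodes[node]['label'] for node in G.nodes()},
                 node_size=10000, node_color=nColors,
                 font_size=80, font_family='NanumBarunGothic')
nx.draw_networkx_edge_labels(
    G, pos, font_size=50,
    edge_labels={(u, v): data['label'] for u, v, data in G.edges(data=True)},
    font_color='red', font_family='NanumBarunGothic'
)
plt.show()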
Font color, size, and other settings can be changed as you like.
The visualization shows that everything works as intended: relationships between entities are assigned suitable nouns, and the node color change is applied as well. This looks handy for summarizing long texts, or for quickly grasping the relationships in texts where the links between keywords matter, such as Wikipedia articles.