GitHub - cocacola-lab/GPT4IE: An open-source and powerful Information Extraction toolkit based on GPT (GPT for Information Extraction; GPT4IE for short). Note: a default OpenAI key is set in the tool; please let us know if the key reaches its limit.
GPT4IE

We also provide an IE tool based on ChatGPT; see ChatIE.

Description

Note: a default OpenAI key is set in the tool; please let us know if the key reaches its limit.

GPT4IE (GPT for Information Extraction) is an open-source and powerful IE tool demo. Built on GPT-3.5 and prompting, it automatically extracts structured information from a raw sentence and provides an in-depth analysis of the input. Such structured information can help companies make better-informed business decisions.

We support the following functions:

Task | Name | Languages
RE | entity-relation joint extraction | Chinese, English
NER | named entity recognition | Chinese, English
EE | event extraction | Chinese, English

RE

This task aims to extract triples from plain text, such as (China, capital, Beijing) and (《如懿传》, 主演, 周迅).

Input

  • sentence: plain text.
  • relation type list (rtl)* : ['relation type 1', 'relation type 2', ...]
  • subject type list (stl)* : ['subject type 1', 'subject type 2', ...]
  • object type list (otl)* : ['object type 1', 'object type 2', ...]

PS: * denotes an optional input for which a default value is set. For better extraction, specify all three lists according to your application scenario.
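To make the shape of these inputs concrete, here is a minimal Python sketch of an RE request with optional type lists that fall back to defaults. The names `REInput`, `build_re_prompt`, and the `DEFAULT_*` values are hypothetical; the README does not state the tool's real defaults or the prompt it sends to GPT, so the prompt text below is only an illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical defaults; the tool's real default lists are not given in this README.
DEFAULT_RTL = ["person-company", "country-capital"]
DEFAULT_STL = ["person", "organization", "country"]
DEFAULT_OTL = ["organization", "city", "country"]


@dataclass
class REInput:
    """Inputs for the RE task; the three type lists are optional."""
    sentence: str
    rtl: List[str] = field(default_factory=lambda: list(DEFAULT_RTL))
    stl: List[str] = field(default_factory=lambda: list(DEFAULT_STL))
    otl: List[str] = field(default_factory=lambda: list(DEFAULT_OTL))


def build_re_prompt(inp: REInput) -> str:
    """Assemble an illustrative prompt (not the prompt GPT4IE actually sends)."""
    return (
        f"Sentence: {inp.sentence}\n"
        f"Relation types: {', '.join(inp.rtl)}\n"
        f"Subject types: {', '.join(inp.stl)}\n"
        f"Object types: {', '.join(inp.otl)}\n"
        "List every (subject, relation, object) triple found in the sentence."
    )


# Only rtl is specified here; stl and otl fall back to the assumed defaults.
request = REInput(
    sentence="Bob worked for Google in Beijing, the capital of China.",
    rtl=["person-company", "country-capital"],
)
prompt = build_re_prompt(request)
```

Passing narrower, domain-specific lists in place of the defaults is what the PS above recommends for better extraction.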

Examples

sentence: Bob worked for Google in Beijing, the capital of China.
rtl: ['location-located_in', 'administrative_division-country', 'person-place_lived', 'person-company', 'person-nationality', 'company-founders', 'country-administrative_divisions', 'person-children', 'country-capital', 'deceased_person-place_of_death', 'neighborhood-neighborhood_of', 'person-place_of_birth']
stl: ['organization', 'person', 'location', 'country']
otl: ['person', 'location', 'country', 'organization', 'city']
output: [screenshot in the original README]

sentence: 第五部:《如懿传》《如懿传》是一部古装宫廷情感电视剧,由汪俊执导,周迅、霍建华、张钧甯、董洁、辛芷蕾、童瑶、李纯、邬君梅等主演
rtl: ['所属专辑', '成立日期', '海拔', '官方语言', '占地面积', '父亲', '歌手', '制片人', '导演', '首都', '主演', '董事长', '祖籍', '妻子', '母亲', '气候', '面积', '主角', '邮政编码', '简称', '出品公司', '注册资本', '编剧', '创始人', '毕业院校', '国籍', '专业代码', '朝代', '作者', '作词', '所在城市', '嘉宾', '总部地点', '人口数量', '代言人', '改编自', '校长', '丈夫', '主持人', '主题曲', '修业年限', '作曲', '号', '上映时间', '票房', '饰演', '配音', '获奖']
stl: ['国家', '行政区', '文学作品', '人物', '影视作品', '学校', '图书作品', '地点', '历史人物', '景点', '歌曲', '学科专业', '企业', '电视综艺', '机构', '企业/品牌', '娱乐人物']
otl: ['国家', '人物', 'Text', 'Date', '地点', '气候', '城市', '歌曲', '企业', 'Number', '音乐专辑', '学校', '作品', '语言']
output: [screenshot in the original README]


NER

This task aims to extract entities from plain text, such as (LOC, Beijing) and (人物, 周恩来).

Input

  • sentence: plain text.
  • entity type list (etl)* : ['entity type 1', 'entity type 2', ...]

PS: * denotes an optional input for which a default value is set. For better extraction, specify the list according to your application scenario.
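One way to see how the entity type list constrains results is to filter candidate (type, entity) pairs against it. The sketch below is an assumption about how a caller might post-process results, not a description of what GPT4IE does internally; `keep_known_types` and the `raw` pairs are hypothetical.

```python
from typing import List, Tuple

Entity = Tuple[str, str]  # (entity type, entity text), e.g. ("LOC", "Beijing")


def keep_known_types(entities: List[Entity], etl: List[str]) -> List[Entity]:
    """Drop pairs whose type is not in the entity type list, preserving order."""
    allowed = set(etl)
    return [(etype, text) for etype, text in entities if etype in allowed]


etl = ["LOC", "MISC", "ORG", "PER"]
# Hypothetical candidate pairs; "CITY" is not in etl and is discarded.
raw = [("PER", "Bob"), ("ORG", "Google"), ("CITY", "Beijing"), ("LOC", "China")]
entities = keep_known_types(raw, etl)
```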

Examples

sentence: Bob worked for Google in Beijing, the capital of China.
etl: ['LOC', 'MISC', 'ORG', 'PER']
output: [screenshot in the original README]

sentence: 在过去的五年中,致公党在邓小平理论指引下,遵循社会主义初级阶段的基本路线,努力实践致公党十大提出的发挥参政党职能、加强自身建设的基本任务。
etl: ['组织机构', '地点', '人物']
output: [screenshot in the original README]


EE

This task aims to extract events from plain text, such as {Life-Divorce: {Person: Bob, Time: today, Place: America}} and {竞赛行为-晋级: {时间: 无, 晋级方: 西北狼, 晋级赛事: 中甲榜首之争}}.

Input

  • sentence: plain text.
  • event type list (etl)* : {'event type 1': ['argument role 1', 'argument role 2', ...], ...}

PS: * denotes an optional input for which a default value is set. For better extraction, specify the list according to your application scenario.
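Because the event type list maps each event type to its allowed argument roles, it can double as a schema for checking extracted events. The following Python sketch assumes events are represented as a nested dictionary like the example above; `validate_events` and the sample `good` and `bad` values are hypothetical and are not taken from the tool's actual output.

```python
from typing import Dict, List

Schema = Dict[str, List[str]]       # event type -> allowed argument roles
Events = Dict[str, Dict[str, str]]  # event type -> {role: argument}


def validate_events(events: Events, etl: Schema) -> List[str]:
    """Return a list of problems; an empty list means the events fit the schema."""
    problems = []
    for event_type, arguments in events.items():
        if event_type not in etl:
            problems.append(f"unknown event type: {event_type}")
            continue
        for role in arguments:
            if role not in etl[event_type]:
                problems.append(f"unknown role for {event_type}: {role}")
    return problems


etl = {"Life:Divorce": ["Person", "Time", "Place"]}
# Hypothetical extraction results used only to exercise the check.
good = {"Life:Divorce": {"Person": "Bob", "Time": "Yesterday", "Place": "Guangzhou"}}
bad = {"Life:Divorce": {"Weapon": "none"}, "Life:Marry": {}}

good_problems = validate_events(good, etl)
bad_problems = validate_events(bad, etl)
```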

Examples

sentence: Yesterday Bob and his wife got divorced in Guangzhou.
etl: {'Personnel:Elect': ['Person', 'Entity', 'Position', 'Time', 'Place'], 'Business:Declare-Bankruptcy': ['Org', 'Time', 'Place'], 'Justice:Arrest-Jail': ['Person', 'Agent', 'Crime', 'Time', 'Place'], 'Life:Divorce': ['Person', 'Time', 'Place'], 'Life:Injure': ['Agent', 'Victim', 'Instrument', 'Time', 'Place']}
output: [screenshot in the original README]

sentence: 在2022年卡塔尔世界杯决赛中阿根廷以点球大战险胜法国
etl: {'组织行为-罢工': ['时间', '所属组织', '罢工人数', '罢工人员'], '竞赛行为-晋级': ['时间', '晋级方', '晋级赛事'], '财经/交易-涨停': ['时间', '涨停股票'], '组织关系-解雇': ['时间', '解雇方', '被解雇人员']}
output: [screenshot in the original README]


Setup

  1. Run npm install to download the required dependencies.
  2. Run npm run start. GPT4IE should open in a new browser tab.
  3. Note: Node.js version v14.17.4, npm version 9.6.0.
  4. You need an OpenAI API key.
