
Which Python libraries can extract text summaries?

1. google goose

>>> from goose import Goose
>>> url = 'http://edition.cnn.com/2012/02/22/world/europe/uk-occupy-london/index.html?hpt=ieu_c2'
>>> g = Goose()
>>> article = g.extract(url=url)
>>> article.title
u'Occupy London loses ev...


So far I have only found word-segmentation libraries such as jieba (结巴分词), but no library that can extract a summary from text. Libraries in other languages are also fine, as long as they can easily be called from Python.

TextTeaser looks quite good. You could also try snownlp.

Actually, implementing a simple version yourself is not hard: split the text into sentences, use pairwise sentence similarity as the edge weight linking them, and build a weight matrix. With that matrix you can run PageRank to get a final score for each sentence: compute each sentence's in- and out-degree weights, assign an initial score, iterate the updates until convergence, and take the highest-scoring sentences as the summary.
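A minimal pure-Python sketch of the approach described above (TextRank-style). The function names and the naive whitespace tokenization and punctuation-based sentence splitting are illustrative assumptions; a real system would use jieba or snownlp for segmentation:

```python
import math
import re
from collections import Counter

def sentence_similarity(a, b):
    """Cosine similarity between the word-count vectors of two sentences."""
    ca, cb = Counter(a.split()), Counter(b.split())
    num = sum(ca[w] * cb[w] for w in set(ca) & set(cb))
    den = (math.sqrt(sum(v * v for v in ca.values()))
           * math.sqrt(sum(v * v for v in cb.values())))
    return num / den if den else 0.0

def textrank_summary(text, top_n=1, d=0.85, iters=50):
    """Split text into sentences, build a similarity (weight) matrix,
    and run PageRank-style iterations to score each sentence."""
    sents = [s.strip() for s in re.split(r'[.!?]', text) if s.strip()]
    n = len(sents)
    if n == 0:
        return []
    # weight matrix: similarity between every pair of distinct sentences
    w = [[sentence_similarity(sents[i], sents[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    # out-degree: total outgoing edge weight of each sentence (avoid /0)
    out = [sum(row) or 1.0 for row in w]
    scores = [1.0] * n  # uniform initial score
    for _ in range(iters):  # iterate until (approximate) convergence
        scores = [(1 - d) + d * sum(w[j][i] / out[j] * scores[j]
                                    for j in range(n))
                  for i in range(n)]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [sents[i] for i in ranked[:top_n]]

result = textrank_summary(
    "the cat sat on the mat. the cat ate fish. dogs bark loudly.",
    top_n=1)
print(result)
```

Sentences that share vocabulary with many others accumulate score through the iteration, while isolated sentences stay near the damping floor of `1 - d`.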


