How to Use the jieba Library in Python


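1. Installing jieba

jieba is a third-party library, so it has to be installed locally before it can be used.

Installing from the command line on Windows: with a working internet connection, run pip install jieba in a command prompt. pip reports a success message once the installation has finished.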


Installing from PyCharm: open Settings, search for Project Interpreter, click the + button in the right-hand pane, type jieba into the search box, and click Install.
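Whichever route is used, it can help to confirm that the interpreter you are running actually sees the package. The following is a minimal sketch using only the standard library; the helper name is_installed is made up for this illustration and is not part of jieba.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    # find_spec returns None when no top-level module with this name exists.
    return importlib.util.find_spec(module_name) is not None


# Hypothetical usage: check for jieba without importing it.
print("jieba installed:", is_installed("jieba"))
```

If this prints False inside PyCharm but pip reported success on the command line, the two are probably using different interpreters.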

2. Using jieba's three modes


# -*- coding: utf-8 -*-
import jieba

seg_str = "好好学习，天天向上。"

print("/".join(jieba.lcut(seg_str)))    # precise mode (the default); lcut returns a list
print("/".join(jieba.lcut(seg_str, cut_all=True)))      # full mode, selected with cut_all=True
print("/".join(jieba.lcut_for_search(seg_str)))     # search-engine mode

Segmentation result:

[Figure: output of the three segmentation modes]
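The lcut functions used above return complete lists. jieba also provides jieba.cut and jieba.cut_for_search, which return generators and yield tokens one at a time. The sketch below collects the output of all three modes through the generator functions; the function name segment_in_all_modes and the dictionary keys are made up for this illustration, and jieba is imported inside the function only so that the snippet loads even where jieba is not yet installed.

```python
from typing import Dict, List


def segment_in_all_modes(text: str) -> Dict[str, List[str]]:
    """Return the tokens produced by each of jieba's three modes."""
    import jieba  # third-party; imported lazily (an assumption of this sketch)

    return {
        # jieba.cut yields tokens lazily; list() consumes the generator.
        "precise": list(jieba.cut(text)),
        "full": list(jieba.cut(text, cut_all=True)),
        "search": list(jieba.cut_for_search(text)),
    }
```

Calling segment_in_all_modes("好好学习，天天向上。") returns one token list per mode. When processing a large file, iterating over jieba.cut directly avoids building the whole list in memory.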

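3. A simple application of jieba segmentation

Task: segment a text with jieba and find the words that occur most often. The example uses the novel Romance of the Three Kingdoms (三国演义).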


# -*- coding: utf-8 -*-
import jieba

# Read the whole text; the with-block closes the file afterwards
with open("三国演义.txt", "r", encoding="utf-8") as f:
    txt = f.read()

words = jieba.lcut(txt)     # segment the text in precise mode
counts = {}     # maps each word to the number of times it occurs

for word in words:
    if len(word) == 1:    # skip single-character tokens (punctuation, whitespace, one-character words)
        continue
    counts[word] = counts.get(word, 0) + 1    # add 1 each time the word occurs

items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)    # sort by count, highest first

for word, count in items[:3]:    # print the three most frequent words
    print("{0:<5}{1:>5}".format(word, count))

Counting result:

[Figure: the three most frequent words and their counts]

Any text file will do for this example. The file used above can also be downloaded from https://github.com/coderjas/python-quick.
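The dictionary-and-sort steps above can also be expressed with collections.Counter from the standard library. This is a minimal sketch rather than the article's own method: the function name top_words and its parameters n, segment, and min_len are made up for illustration, and the segmenter is passed in as an argument so the counting logic can be tried on plain strings without jieba.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple


def top_words(
    text: str,
    n: int = 3,
    segment: Optional[Callable[[str], List[str]]] = None,
    min_len: int = 2,
) -> List[Tuple[str, int]]:
    """Return the n most frequent tokens that have at least min_len characters."""
    if segment is None:
        import jieba  # third-party; only needed for the default segmenter

        segment = jieba.lcut  # precise mode, as in the example above
    # Strip whitespace before measuring length so blank tokens are ignored.
    counts = Counter(w for w in segment(text) if len(w.strip()) >= min_len)
    # most_common sorts by count, highest first.
    return counts.most_common(n)


# Hypothetical usage with a whitespace segmenter instead of jieba:
print(top_words("aa bb aa cc aa bb c", n=2, segment=str.split))
```

With the default segmenter, top_words(txt) plays the same role as the loop, sort, and print steps of the script above, except that it returns the pairs instead of printing them.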

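4. Extension: counting English words

The example above found the most frequent words in a Chinese document. Next, we count the most frequent words in an English document. The principle is the same, except that English words are already separated by spaces, so jieba is not needed.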


# -*- coding: utf-8 -*-

def get_text():
    # Read the file, lowercase it, and replace punctuation with spaces
    with open("1.txt", "r", encoding="utf-8") as f:
        txt = f.read()
    txt = txt.lower()
    for ch in '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~':
        txt = txt.replace(ch, " ")      # replace each special character with a space
    return txt

file_txt = get_text()
words = file_txt.split()    # split on whitespace to get the list of words
counts = {}

for word in words:
    if len(word) == 1:    # skip one-letter words
        continue
    counts[word] = counts.get(word, 0) + 1

items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)    # sort by count, highest first

for word, count in items[:5]:    # print the five most frequent words
    print("{0:<5}->{1:>5}".format(word, count))

Counting result:

[Figure: the five most frequent words and their counts]
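The character-by-character replace loop in get_text can be swapped for a single str.translate call. The sketch below is one possible variant, not the article's code: the names top_english_words and _PUNCT_TO_SPACE are made up for illustration, and it uses string.punctuation, which, unlike the hand-written list above, also contains the apostrophe, so a word like "don't" is split into "don" and "t".

```python
import string
from collections import Counter
from typing import List, Tuple

# Translation table that maps every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def top_english_words(text: str, n: int = 5, min_len: int = 2) -> List[Tuple[str, int]]:
    """Return the n most frequent lowercase words with at least min_len letters."""
    # Lowercase, blank out punctuation in one pass, then split on whitespace.
    words = text.lower().translate(_PUNCT_TO_SPACE).split()
    counts = Counter(w for w in words if len(w) >= min_len)
    return counts.most_common(n)


# Hypothetical usage on an in-memory string instead of a file:
print(top_english_words("The cat, the dog; the end. A cat!", n=2))
```

To apply it to a file, read the contents with open(...).read() as in get_text and pass the resulting string in.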
