Two Ways to Count Word Frequencies in Python, Explained - 编程学习网

Counting how many times each word appears in a file

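Approach:

Collect the distinct words in the document and, separately, the number of times each one occurs.

Store the words and the counts in two lists, then pair the lists with zip() to print the result.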
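As a minimal sketch of that pairing step, the snippet below shows what zip() does with two parallel lists. The words and counts are hypothetical sample data chosen for illustration, not the contents of passage.txt.

```python
# Hypothetical sample data, not taken from the article's passage.txt.
unique_words = ["apple", "banana", "cherry"]
counts = [3, 1, 2]  # counts[i] belongs to unique_words[i]

# zip() walks both lists in step and yields (word, count) tuples.
pairs = list(zip(unique_words, counts))

for word, count in pairs:
    print(f"{word}:{count}")
```

Note that zip() stops at the end of the shorter list, so the two lists have to stay the same length for every word to be printed.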

With the idea settled, let's put it into practice.

Method 1:


# Read the file
with open("passage.txt", 'r') as file:
    lines = file.readlines()
# Split every line into words
words = []
for line in lines:
    # Replace the newline with an empty string, then split on spaces.
    # += extends the list; plain = would keep only the last line's words.
    words += line.replace("\n", "").split(" ")
setWords = list(set(words))  # a set removes duplicates automatically
num = []  # num[i] is how many times setWords[i] occurs
for k in setWords:
    count = 0
    for j in words:
        if k == j:
            count = count + 1
    num.append(count)
print(num)
print(setWords)
# Output
for x, y in zip(setWords, num):  # pair the two lists with zip
    print(x + ":" + str(y))

Output:

[screenshot of the result]
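The same two-list idea can be sketched more compactly with list.count(), which does the inner counting loop for you. This is a variation for illustration, not the article's code: the function name count_with_lists and the sample sentence are assumptions, and the words are sorted only so that the output order is stable (a set has no guaranteed order).

```python
def count_with_lists(words):
    """Return (unique_words, counts) as two parallel lists."""
    # Sorting the deduplicated words gives a repeatable output order.
    unique_words = sorted(set(words))
    # list.count() scans the whole list once per unique word,
    # which is the same amount of work as the nested loops in Method 1.
    counts = [words.count(w) for w in unique_words]
    return unique_words, counts


# Hypothetical sample text standing in for the file contents.
sample = "the cat and the hat and the bat".split()
unique_words, counts = count_with_lists(sample)

for word, n in zip(unique_words, counts):
    print(f"{word}:{n}")
```

Because every unique word triggers a full pass over the word list, this approach slows down noticeably on large files; the dictionary approach in Method 2 needs only one pass.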

Method 2:

This method uses a dictionary and is somewhat more concise than the previous one.


# Read the file
with open("passage.txt", 'r') as file:
    lines = file.readlines()
# Split every line into words
words = []
for line in lines:
    words += line.replace("\n", "").split(" ")
# Deduplicating with set() is not needed here
print(words)
print("-" * 40)
diccount = dict()
for word in words:
    if word not in diccount:
        diccount[word] = 1  # first time this word is seen: start its count at 1
    else:
        diccount[word] = diccount[word] + 1  # word already has an entry: add 1 to it
print(diccount)

Output:

[screenshot of the result]
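The if/else in the dictionary loop can be collapsed with dict.get(), and the standard library's collections.Counter performs the same tally in one call. The sketch below compares the two on a hypothetical sample sentence; the function name count_with_dict is an assumption made for illustration, and Counter is offered as an alternative rather than as the article's method.

```python
from collections import Counter


def count_with_dict(words):
    """Tally words in a plain dict, one pass over the list."""
    counts = {}
    for word in words:
        # get() returns 0 for a word not seen yet, so no if/else is needed.
        counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical sample text standing in for the file contents.
sample = "the cat and the hat and the bat".split()

by_hand = count_with_dict(sample)
by_counter = Counter(sample)  # standard-library equivalent of the loop above

print(by_hand)
print(by_counter.most_common(2))  # the two most frequent words with their counts
```

Counter is a dict subclass, so it can be used wherever the hand-built dictionary would be, and most_common() saves sorting the counts yourself.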

The document being counted:

[screenshot of the document]

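Summary

That is all for this article. Method 1 pairs a deduplicated word list with a parallel list of counts, while Method 2 keeps the counts in a single dictionary. Hopefully this is helpful to you.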
