How to fix garbled Chinese text in Python pdfkit

This article covers how to fix garbled Chinese text in Python pdfkit. Many people run into this problem in day-to-day work, so I went through various sources and put together a few simple, practical fixes. Let's go through them.

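Two problems come up when generating PDF files that involve Chinese with Python pdfkit:

The output file name cannot contain Chinese characters

Chinese text in the generated PDF content is garbled

The output file name cannot contain Chinese characters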

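Solution:

The workaround I have come up with for now is to generate the file under an English name first, then rename that file to the Chinese name.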

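# coding=utf8
import os
import pdfkit
from uuid import uuid1

# In Python 3 a str literal is already Unicode, so the Python 2 style
# .decode('utf8') call is not needed (and would raise AttributeError).
ret = '<html><head><meta charset="UTF-8"></head><body><h2>测试pdf内容部分</h2></body></html>'
file_name = str(uuid1())  # temporary ASCII-only name
pdfkit.from_string(ret, file_name)  # file_name must not contain Chinese characters, otherwise an error is raised
file_name_new = '测试.pdf'
os.rename(file_name, file_name_new)  # rename to the Chinese file name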
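The generate-then-rename workaround can be wrapped in a small helper. The following is only a sketch: `render_then_rename` and its `render` callback are hypothetical names introduced here, not part of pdfkit. The callback receives the temporary path, so any renderer can be plugged in.

```python
import os
from uuid import uuid1


def render_then_rename(render, target_path):
    """Call render(tmp_path) with an ASCII-only file name, then move the result.

    `render` is any callable that writes a file at the path it is given.
    Only the file name is made ASCII-only; the directory part is reused as is.
    """
    target_dir = os.path.dirname(os.path.abspath(target_path))
    tmp_path = os.path.join(target_dir, str(uuid1()) + ".pdf")
    try:
        render(tmp_path)
        # os.replace overwrites an existing file at target_path
        os.replace(tmp_path, target_path)
    finally:
        # Clean up the temporary file if rendering or the move failed
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return target_path


# Possible usage with pdfkit (assumes pdfkit and wkhtmltopdf are installed):
# import pdfkit
# render_then_rename(lambda path: pdfkit.from_string(html, path), '测试.pdf')
```

Writing the temporary file into the same directory as the target keeps the final move a plain rename on the same filesystem.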

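Chinese text in the generated PDF content is garbled

Cause 1:

pdfkit generates PDFs by calling wkhtmltopdf, a command-line tool built on the Qt WebKit rendering engine, so garbled Chinese in pdfkit output is really garbled Chinese in wkhtmltopdf. That in turn happens because no Chinese fonts are installed on the system.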

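Solution:

Add Chinese fonts to the system

My local machine runs Ubuntu 14.04, where the font files live under /usr/share/fonts (Chinese fonts are among them, though I don't know exactly which files they are). My server runs Red Hat and has no Chinese fonts, so on my local machine I ran: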

cd /usr/share/fonts
zip -r fonts.zip ./*
scp fonts.zip <server-user>@<server-ip>:/usr/share/fonts

Then on the server:

cd /usr/share/fonts
unzip fonts.zip
fc-cache -fv
fc-list # list the fonts, including the newly added ones

In short, find a machine that has Chinese fonts installed, copy its font files (the files under /usr/share/fonts), and follow the steps above.
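To check whether a machine already has fonts that cover Chinese, before or after copying, fontconfig can be queried with a language pattern. The sketch below assumes the fontconfig `fc-list` tool is on the PATH. The helper names `parse_families` and `chinese_font_families` are made up for this example.

```python
import shutil
import subprocess


def parse_families(fc_list_output):
    """Turn the output of `fc-list <pattern> family` into sorted unique names.

    Each output line may hold several comma-separated names for one family.
    """
    families = set()
    for line in fc_list_output.splitlines():
        for name in line.split(","):
            name = name.strip()
            if name:
                families.add(name)
    return sorted(families)


def chinese_font_families():
    """Return font families covering Chinese, or None if fc-list is missing."""
    if shutil.which("fc-list") is None:
        return None
    result = subprocess.run(
        ["fc-list", ":lang=zh", "family"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=False,
    )
    return parse_families(result.stdout)


if __name__ == "__main__":
    families = chinese_font_families()
    if families is None:
        print("fc-list not found; is fontconfig installed?")
    elif not families:
        print("No fonts covering Chinese were found.")
    else:
        print("Fonts covering Chinese:", ", ".join(families))
```

An empty list on the server would be consistent with cause 1 above.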

Cause 2:

The HTML must declare its character set as UTF-8

<head><meta charset="UTF-8"></head>
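When the HTML comes from elsewhere, it may help to make sure the charset declaration is present before handing it to pdfkit. This is a minimal sketch: `ensure_meta_charset` is a hypothetical helper, and its regex-based check is deliberately simple rather than a full HTML parser.

```python
import re

META_TAG = '<meta charset="UTF-8">'


def ensure_meta_charset(html):
    """Return html with a UTF-8 meta tag added if no charset is declared."""
    # Leave the document alone if any <meta ... charset ...> is already there
    if re.search(r"<meta[^>]+charset", html, flags=re.IGNORECASE):
        return html
    # Insert right after the opening <head> tag when there is one
    head = re.search(r"<head(\s[^>]*)?>", html, flags=re.IGNORECASE)
    if head:
        return html[:head.end()] + META_TAG + html[head.end():]
    # No <head> at all: prepend the tag to the fragment
    return META_TAG + html


# Possible usage with pdfkit (assumes pdfkit and wkhtmltopdf are installed).
# The 'encoding' option is passed through to wkhtmltopdf as --encoding:
# import pdfkit
# pdfkit.from_string(ensure_meta_charset(html), 'out.pdf',
#                    options={'encoding': 'UTF-8'})
```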

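Supplement: garbled Chinese when writing an HTML file from Python, and how to fix it

When the HTML fetched by a crawler is written to a file with open(), the text sometimes prints correctly on the console while the Chinese in the written HTML file comes out garbled.

Case analysis

Consider the following code: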

# Crawler without cookies
from urllib import request

if __name__ == '__main__':
    url = "http://www.renren.com/967487029/profile"
    rsp = request.urlopen(url)
    html = rsp.read().decode()
    # No encoding is given here, so open() uses the platform default
    with open("rsp.html", "w") as f:
        # Write the fetched page to the file
        print(html)
        f.write(html)

This looks fine, and the HTML printed on the console shows no garbled Chinese, but the Chinese in the created HTML file is garbled.


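Solution

Pass the encoding parameter to open(): adding encoding="utf-8" is enough. Without it, open() falls back to the platform's default encoding (for example GBK on a Chinese-locale Windows system), which typically does not match the UTF-8 that a browser expects for the page.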

# Crawler without cookies
from urllib import request

if __name__ == '__main__':
    url = "http://www.renren.com/967487029/profile"
    rsp = request.urlopen(url)
    html = rsp.read().decode()
    # Explicit encoding, so the file is written as UTF-8 on every platform
    with open("rsp.html", "w", encoding="utf-8") as f:
        # Write the fetched page to the file
        print(html)
        f.write(html)
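To see what is going on without hitting the network, the write step can be isolated and the bytes on disk inspected. This is a small sketch with hypothetical helper names (`write_html_utf8`, `default_text_encoding`); it only uses the standard library.

```python
import locale
import os
import tempfile


def default_text_encoding():
    """Encoding that open() uses in text mode when none is given.

    This depends on the platform and locale (unless Python's UTF-8 mode is on).
    """
    return locale.getpreferredencoding(False)


def write_html_utf8(path, html):
    """Write html to path as UTF-8, regardless of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


if __name__ == "__main__":
    html = '<html><head><meta charset="UTF-8"></head><body>中文</body></html>'
    path = os.path.join(tempfile.mkdtemp(), "rsp.html")
    write_html_utf8(path, html)

    # Read the raw bytes back to confirm what actually landed on disk
    with open(path, "rb") as f:
        raw = f.read()
    print("platform default encoding:", default_text_encoding())
    print("file is valid UTF-8:", raw.decode("utf-8") == html)
```

If the reported default is not UTF-8, a plain open(path, "w") on that machine would write different bytes than the page's declared charset suggests, which matches the symptom described above.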


That wraps up how to fix garbled Chinese text in Python pdfkit. The best way to make these fixes stick is to try them out on your own setup.
