Notes on Encoding and Decoding Chinese Text in Python

A few quick notes, written down so I don't forget them later:

1. In Python 2, the default encoding is ascii

In [1]: import sys
In [2]: sys.getdefaultencoding()
Out[2]: 'ascii'

2. Setting Python's default encoding

In [1]: import sys
In [2]: reload(sys)
<module 'sys' (built-in)>
In [3]: sys.setdefaultencoding('utf-8')
In [4]: sys.getdefaultencoding()
'utf-8'

3. The encoding declaration at the top of a Python file, # _*_ coding: utf-8 _*_, does not change Python's default encoding

#! /usr/bin/env python
# _*_ coding: utf-8 _*_

import sys
print sys.getdefaultencoding()

Running this prints ascii.

So what is the encoding declaration at the top of the file actually for?

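#1. If the source contains non-ASCII characters, such as Chinese comments or string literals, the declaration is required.
#2. More capable editors (my Emacs, for example) read the header declaration and treat it as the encoding of the source file.
#3. The interpreter uses the header declaration to decode unicode literals such as u"人生苦短" when it creates them (so the declaration must match the encoding the file is actually saved in).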

The points above come from this article: http://python.jobbole.com/81244/
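
Point #3 can be illustrated without a source file by treating a byte string as the file's contents. This is a minimal sketch in Python 3 syntax; the variable names are illustrative and do not come from the article cited above:

```python
# The text as it would be stored on disk in a UTF-8 encoded source file.
original = u"人生苦短"
stored_bytes = original.encode("utf-8")

# Decoding with the matching encoding recovers the text.
decoded_ok = stored_bytes.decode("utf-8")

# Decoding with a mismatched encoding (GBK here) cannot recover the text;
# errors="replace" keeps the sketch from raising on invalid byte sequences.
decoded_wrong = stored_bytes.decode("gbk", errors="replace")

print(decoded_ok == original)
print(decoded_wrong == original)
```

The same mismatch arises when a file saved as UTF-8 carries a header that declares a different encoding.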

Let's run a test:

#! /usr/bin/env python
# _*_ coding: utf-8 _*_

import sys
print sys.getdefaultencoding()

#reload(sys)
#sys.setdefaultencoding('utf-8')

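# a unicode object: the literal is decoded using the declared source encoding
s1 = u"这是一个测试1"

# a byte string (str): holds the raw UTF-8 bytes exactly as saved in the file
s2 = "这是一个测试2"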

s1.encode('gbk')
s2.encode('gbk')
print s1
print s2

Result of the test above:

ascii
Traceback (most recent call last):
  File "testunicoding.py", line 21, in <module>
    s2.encode('gbk')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 0: ordinal not in range(128)

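The cause is that s2 is a byte string. To encode it, Python first has to decode it to unicode, and it does so implicitly with the default encoding, ascii. The UTF-8 bytes of the Chinese text are not valid ASCII, so the decode step fails.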
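
The failure can also be avoided without touching the default encoding, by decoding the bytes explicitly before encoding them again. The sketch below is written for Python 3 and reproduces by hand the implicit ASCII decode that Python 2 performs; `raw` stands in for `s2` and is a name introduced only for this illustration:

```python
# raw plays the role of s2: the UTF-8 bytes of the literal in the source file.
raw = u"这是一个测试2".encode("utf-8")

# What Python 2 does implicitly inside s2.encode('gbk'): decode as ascii first.
try:
    raw.decode("ascii")
    implicit_decode_failed = False
except UnicodeDecodeError:
    implicit_decode_failed = True

# The explicit route: name the real encoding, then encode to the target.
gbk_bytes = raw.decode("utf-8").encode("gbk")

print(implicit_decode_failed)
print(gbk_bytes.decode("gbk"))
```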

After changing the default encoding to utf-8:

#! /usr/bin/env python
# _*_ coding: utf-8 _*_

import sys
print sys.getdefaultencoding()

reload(sys)
sys.setdefaultencoding('utf-8')

print sys.getdefaultencoding()

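# a unicode object: the literal is decoded using the declared source encoding
s1 = u"这是一个测试1"

# a byte string (str): holds the raw UTF-8 bytes exactly as saved in the file
s2 = "这是一个测试2"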

s1.encode('gbk')
s2.encode('gbk')
print s1
print s2

Result:

ascii
utf-8
这是一个测试1
这是一个测试2
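
Changing the default encoding affects every implicit conversion in the process, so an alternative worth considering is to decode byte strings where they enter the program and work with unicode text from then on. The helper below is a hypothetical sketch of that approach and is not part of the original test; `to_text` and its `encoding` parameter are names introduced here, and the code targets Python 3:

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


s1 = u"这是一个测试1"
s2 = u"这是一个测试2".encode("utf-8")  # bytes, as read from a UTF-8 file

# Both values go through the same path before being encoded to GBK.
gbk1 = to_text(s1).encode("gbk")
gbk2 = to_text(s2).encode("gbk")

print(gbk1.decode("gbk"))
print(gbk2.decode("gbk"))
```

The encoding is stated at each call site, so nothing depends on a global setting.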
