Python standard module shlex

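The shlex module implements a class for parsing simple shell-like syntax. It can be used to write domain-specific languages or to parse quoted strings.

A common problem when processing input text is that a sequence of quoted words has to be treated as a single entity. Splitting the text on quote characters may not behave as expected, especially when quotes are nested several levels deep. For example:

Suppose the file quotes.txt contains the following text: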

This string has embedded "double quotes" and 'single quotes' in it,
and even "a 'nested example'".

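A naive approach is to build a regular expression that finds the text outside the quotes and separates it from the text inside them, or vice versa. This can add unnecessary complexity, and it is easy to get wrong on edge cases such as apostrophes or typos. A better solution is to use a real parser, such as the one provided by the shlex module. The following simple example uses the shlex class to print the tokens found in an input file.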

#!/usr/bin/env python3

import shlex
import sys

# Require exactly one argument: the file to tokenize.
if len(sys.argv) != 2:
    print('Please specify one filename on the command line.')
    sys.exit(1)

filename = sys.argv[1]
with open(filename, 'rt') as f:
    body = f.read()
print('ORIGINAL:', repr(body))
print()

print('TOKENS:')
# A shlex instance built from a string can be iterated to yield tokens.
lexer = shlex.shlex(body)
for token in lexer:
    print(repr(token))

Run python shlex_example.py quotes.txt

Output:

ORIGINAL: 'This string has embedded "double quotes" and \'single quotes\' in it,\nand even "a \'nested example\'".\n'

TOKENS:
'This'
'string'
'has'
'embedded'
'"double quotes"'
'and'
"'single quotes'"
'in'
'it'
','
'and'
'even'
'"a \'nested example\'"'
'.'
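The tokens above still carry their quotation marks because shlex.shlex defaults to non-POSIX mode. As a minimal sketch of the alternative, the snippet below tokenizes the same text held in a string (the variable names are illustrative and not part of the original script) and compares the result with shlex.split, which applies POSIX rules, strips the quotes, and splits on whitespace only.

```python
import shlex

# Same content as quotes.txt, held in a string for illustration.
text = (
    'This string has embedded "double quotes" and \'single quotes\' in it,\n'
    'and even "a \'nested example\'".\n'
)

# Non-POSIX mode (the default for shlex.shlex): quotes stay part of the token.
non_posix_tokens = list(shlex.shlex(text))

# POSIX mode via shlex.split: quotes are removed, only whitespace separates tokens.
posix_tokens = shlex.split(text)

print(non_posix_tokens)
print(posix_tokens)
```

Under POSIX rules the punctuation stays attached to its neighbor, so the comma remains part of the preceding word and the final period remains part of the last quoted token.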

Isolated quote characters, such as the apostrophe in I'm, are handled as well. Consider the following file, apostrophe.txt:

This string has an embedded apostrophe, doesn't it?

shlex has no trouble finding the token that contains the embedded apostrophe.

Run python shlex_example.py apostrophe.txt

Output:

ORIGINAL: "This string has an embedded apostrophe, doesn't it?"

TOKENS:
'This'
'string'
'has'
'an'
'embedded'
'apostrophe'
','
"doesn't"
'it'
'?'
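Note that this tolerance belongs to the non-POSIX mode used by the script. The following sketch, which assumes the same sentence is held in a string rather than read from a file, shows that POSIX parsing (here through shlex.split) treats the apostrophe as an opening quote and raises ValueError because the quote is never closed.

```python
import shlex

text = "This string has an embedded apostrophe, doesn't it?"

# Non-POSIX mode: a quote character inside a word is kept as part of that word.
tokens = list(shlex.shlex(text))

# POSIX mode: the apostrophe opens a quoted string that never closes.
try:
    shlex.split(text)
    posix_error = None
except ValueError as err:
    posix_error = str(err)

print(tokens)
print('POSIX mode error:', posix_error)
```

If the input is natural-language text rather than shell-style commands, staying in non-POSIX mode is the safer choice for this reason.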

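As these examples show, shlex copes with nested quotes and embedded apostrophes without any hand-written patterns, which makes it far more convenient than regular expressions for this kind of task.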
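To make the comparison with regular expressions concrete, here is a small sketch. The pattern NAIVE_PATTERN is a hypothetical first attempt written for illustration, not something taken from the original article: it matches either a quoted run or a run of word characters.

```python
import re
import shlex

text = "This string has an embedded apostrophe, doesn't it?"

# Hypothetical naive pattern: a double-quoted run, a single-quoted run,
# or a run of word characters.
NAIVE_PATTERN = re.compile(r'"[^"]*"|\'[^\']*\'|\w+')

# The lone apostrophe has no closing partner, so the regex splits the word around it.
regex_tokens = NAIVE_PATTERN.findall(text)

# shlex in its default non-POSIX mode keeps the apostrophe inside the word.
shlex_tokens = list(shlex.shlex(text))

print(regex_tokens)
print(shlex_tokens)
```

The regex version breaks doesn't into two separate tokens and silently drops the punctuation, while shlex keeps the word intact and reports the comma and question mark as tokens of their own.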
