Making network requests with Python's urllib package

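1. Introduction
2. Sending a request
3. Sending a request with data
4. Reading the response
5. Setting headers
6. Using a proxy
7. Authentication
8. Handling cookies
9. Exception handling
10. HTTP errors
11. Timeouts
12. URL encoding and decoding
13. Building query strings
14. Parsing a URL
15. Joining URLs
16. Converting a dict to query parameters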

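1. Introduction

urllib is part of the Python standard library, so no installation is needed. It handles network requests and consists of four modules:
urllib.request: opens and reads URLs; use it to send requests and read the response body
urllib.error: defines the exceptions raised by urllib.request, so that failures can be caught and handled
urllib.parse: parses URLs, including splitting them into components and joining them back together
urllib.robotparser: parses robots.txt files to check whether a site permits a given URL to be crawled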
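
Of the four modules, urllib.robotparser is the only one the sections below do not demonstrate, so here is a minimal sketch of it. The robots.txt rules, the crawler name my-crawler, and the www.example.com URLs are made up for illustration; the rules are passed in as a list of lines so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, supplied inline instead of being downloaded.
robots_lines = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(robots_lines)  # parse() accepts an iterable of lines

# "my-crawler" is an assumed user-agent name.
allowed = parser.can_fetch("my-crawler", "http://www.example.com/index.html")
blocked = parser.can_fetch("my-crawler", "http://www.example.com/private/data.html")
print(allowed, blocked)
```

For a live site, one would normally call set_url() with the robots.txt address and then read() instead of parse().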

2. Sending a request

import urllib.request

# Option 1: pass the URL straight to urlopen (timeout is in seconds)
resp = urllib.request.urlopen('http://www.baidu.com', timeout=1)
print(resp.read().decode('utf-8'))

# Option 2: build a Request object first, then open it
request = urllib.request.Request('http://www.baidu.com')
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))

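3. Sending a request with data

Some pages expect data to be submitted with the request. When data is passed to urlopen, the request is sent as a POST with the data in the body.

import urllib.parse
import urllib.request

params = {
    'name': 'autofelix',
    'age': '25'
}

# The body must be bytes, so encode the URL-encoded string
data = bytes(urllib.parse.urlencode(params), encoding='utf8')
response = urllib.request.urlopen("http://www.baidu.com/", data=data)
print(response.read().decode('utf-8'))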
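
The effect of supplying data can be checked without sending anything. The sketch below builds two Request objects and asks each which HTTP method it would use; the names body, get_request, and post_request are chosen for this example only.

```python
import urllib.parse
import urllib.request

params = {'name': 'autofelix', 'age': '25'}
body = urllib.parse.urlencode(params).encode('utf-8')

# Nothing is sent here: the Request objects are only constructed and inspected.
get_request = urllib.request.Request('http://www.baidu.com/')
post_request = urllib.request.Request('http://www.baidu.com/', data=body)

print(get_request.get_method())   # GET
print(post_request.get_method())  # POST
print(post_request.data)          # b'name=autofelix&age=25'
```

Request also accepts a method argument, which is the way to request PUT, DELETE, and so on.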

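4. Reading the response

import urllib.request

resp = urllib.request.urlopen('http://www.baidu.com')
print(type(resp))                # http.client.HTTPResponse
print(resp.status)               # HTTP status code
print(resp.geturl())             # final URL, after any redirects
print(resp.getcode())            # status code again (older spelling of resp.status)
print(resp.info())               # response headers as a message object
print(resp.getheaders())         # response headers as a list of (name, value) pairs
print(resp.getheader('Server'))  # value of a single header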

5. Setting headers

import urllib.request

# Send a browser-like User-Agent instead of the default Python-urllib one
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
}
request = urllib.request.Request(url="http://tieba.baidu.com/", headers=headers)
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))
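
One detail worth knowing when inspecting headers on a Request: urllib stores header names with only the first letter capitalized, so lookups must use that spelling. The sketch below shows this without sending a request; the shortened User-Agent value and the Referer header are assumptions added for illustration.

```python
import urllib.request

request = urllib.request.Request(
    url='http://tieba.baidu.com/',
    headers={'User-Agent': 'Mozilla/5.0'},
)
# Headers can also be added after construction (hypothetical extra header).
request.add_header('Referer', 'http://www.baidu.com/')

stored = dict(request.header_items())
print(stored)
print(request.get_header('User-agent'))  # stored spelling: 'User-agent'
print(request.has_header('User-Agent'))  # False: the lookup is case-sensitive
```

The server still receives a valid header, since HTTP header names are case-insensitive on the wire.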

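6. Using a proxy

import urllib.request

# Map each URL scheme to the proxy that should handle it
# (proxy.cn:8080 is a placeholder address)
proxy_handler = urllib.request.ProxyHandler({
    'http': 'proxy.cn:8080',
    'https': 'proxy.cn:8080'
})

opener = urllib.request.build_opener(proxy_handler)
# install_opener makes this opener the default for every later urlopen call
urllib.request.install_opener(opener)

request = urllib.request.Request(url="http://www.baidu.com/")
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))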

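7. Authentication

Some sites require a username and password before they will serve a page. The handler below answers HTTP Basic authentication challenges with stored credentials.

import urllib.request

url = "http://www.baidu.com/"
user = 'autofelix'
password = '123456'

# Passing None as the realm makes these credentials the default for this URL
pwdmgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
pwdmgr.add_password(None, url, user, password)

auth_handler = urllib.request.HTTPBasicAuthHandler(pwdmgr)
opener = urllib.request.build_opener(auth_handler)
response = opener.open(url)
print(response.read().decode('utf-8'))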
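
To see which requests the stored credentials would apply to, the password manager can be queried directly. This is a small offline sketch; the realm string 'some realm' and the two lookup URLs are made up for the example.

```python
import urllib.request

pwdmgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
pwdmgr.add_password(None, 'http://www.baidu.com/', 'autofelix', '123456')

# Any realm falls back to the default (None) entry for URLs under the registered prefix.
match = pwdmgr.find_user_password('some realm', 'http://www.baidu.com/docs/page.html')
# A different host has no stored credentials.
no_match = pwdmgr.find_user_password('some realm', 'http://www.example.com/')

print(match)     # ('autofelix', '123456')
print(no_match)  # (None, None)
```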

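8. Handling cookies

If a page checks your identity on every request, cookies let you stay logged in instead of repeating the login each time.

import http.cookiejar
import urllib.request

# The CookieJar collects cookies set by the server; the opener sends them back
cookie = http.cookiejar.CookieJar()
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open("http://www.baidu.com/")

# Append each received cookie to a text file
with open('cookie.txt', 'a') as f:
    for item in cookie:
        f.write(item.name + " = " + item.value + '\n')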
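
To reuse cookies across runs, they have to be reloaded as well as written out. One possible approach, sketched below, is http.cookiejar.MozillaCookieJar, which can both save and load a cookie file. The cookie here is built by hand so the example runs offline; its name, value, and the www.example.com domain are invented, and in practice the jar would be filled by an opener as in the code above.

```python
import http.cookiejar
import os
import tempfile

# A hand-built session cookie standing in for one a server would normally set.
sample = http.cookiejar.Cookie(
    version=0, name='session', value='abc123',
    port=None, port_specified=False,
    domain='www.example.com', domain_specified=False, domain_initial_dot=False,
    path='/', path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={},
)

cookie_path = os.path.join(tempfile.mkdtemp(), 'cookies.txt')

saved = http.cookiejar.MozillaCookieJar(cookie_path)
saved.set_cookie(sample)
saved.save(ignore_discard=True)     # also write session cookies

restored = http.cookiejar.MozillaCookieJar(cookie_path)
restored.load(ignore_discard=True)  # also read session cookies back
restored_pairs = [(c.name, c.value) for c in restored]
print(restored_pairs)
```

A restored jar can be passed to urllib.request.HTTPCookieProcessor in the same way as a plain CookieJar.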

9. Exception handling

from urllib import error, request

try:
    resp = request.urlopen('http://www.baidu.com')
except error.URLError as e:
    # URLError covers failures such as DNS errors and refused connections
    print(e.reason)

10. HTTP errors

from urllib import error, request

try:
    resp = request.urlopen('http://www.baidu.com')
except error.HTTPError as e:
    # HTTPError is a subclass of URLError, so it must be caught first
    print(e.reason, e.code, e.headers, sep='\n')
except error.URLError as e:
    print(e.reason)
else:
    print('request succeeded')
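
The order of the except clauses matters because HTTPError inherits from URLError. The sketch below illustrates the relationship with exceptions constructed by hand, so it needs no network access; the helper describe and the sample URL and messages are assumptions for this example.

```python
import io
from urllib import error

def describe(exc):
    """Return a short label for an exception raised by urlopen."""
    # Check HTTPError first: it is a subclass of URLError.
    if isinstance(exc, error.HTTPError):
        return f'HTTP {exc.code}: {exc.reason}'
    if isinstance(exc, error.URLError):
        return f'URL error: {exc.reason}'
    return f'other: {exc!r}'

# Hand-built exceptions: HTTPError(url, code, msg, hdrs, fp)
http_exc = error.HTTPError('http://www.baidu.com/missing', 404, 'Not Found', None, io.BytesIO())
url_exc = error.URLError('name resolution failed')

print(issubclass(error.HTTPError, error.URLError))  # True
print(describe(http_exc))
print(describe(url_exc))
```

If the two isinstance checks were swapped, every HTTPError would be reported as a plain URL error and the status code would be lost.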

11. Timeouts

import socket
import urllib.error
import urllib.request

try:
    # A very short timeout (in seconds) to force the error
    resp = urllib.request.urlopen('http://www.baidu.com', timeout=0.01)
except urllib.error.URLError as e:
    print(type(e.reason))
    if isinstance(e.reason, socket.timeout):
        print('time out')
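
A timeout while connecting usually arrives wrapped in URLError, as above, but a timeout that happens later, for example during resp.read(), can surface as a bare socket.timeout. A helper that accepts both forms might look like the sketch below; is_timeout is a name invented here, and the exceptions are built by hand so the example runs offline.

```python
import socket
import urllib.error

def is_timeout(exc):
    """Report whether exc represents a timeout, wrapped in URLError or not."""
    if isinstance(exc, urllib.error.URLError):
        return isinstance(exc.reason, socket.timeout)
    return isinstance(exc, socket.timeout)

wrapped = urllib.error.URLError(socket.timeout('timed out'))
bare = socket.timeout('timed out')
refused = urllib.error.URLError(ConnectionRefusedError())

print(is_timeout(wrapped), is_timeout(bare), is_timeout(refused))  # True True False
```

In real code, the helper would be called inside an except OSError block, since both URLError and socket.timeout derive from OSError.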

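12. URL encoding and decoding

from urllib import parse

# Percent-encode non-ASCII text so it can be used in a URL
name = parse.quote('飞兔小哥')
print(name)

# Decode it back to the original string
print(parse.unquote(name))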
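
The sketch below shows what quote produces for the string above and how it differs from quote_plus. By default quote leaves "/" unescaped, which the safe argument controls. The sample strings 'a b&c' and '/path/to file' are made up for illustration.

```python
from urllib import parse

text = '飞兔小哥'
encoded = parse.quote(text)           # UTF-8 bytes, each written as %XX
round_trip = parse.unquote(encoded)
print(encoded)
print(round_trip)

space_as_percent = parse.quote('a b&c')                 # 'a%20b%26c'
space_as_plus = parse.quote_plus('a b&c')               # 'a+b%26c'
slash_kept = parse.quote('/path/to file')               # '/path/to%20file'
slash_encoded = parse.quote('/path/to file', safe='')   # '%2Fpath%2Fto%20file'
print(space_as_percent, space_as_plus, slash_kept, slash_encoded)
```

quote suits path segments, while quote_plus matches the form encoding used in query strings.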

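13. Building query strings

URLs often need to carry many query parameters, and assembling them by hand with string concatenation is tedious and error-prone. urlencode builds the query string from a dict and percent-encodes the values.

from urllib import parse

params = {'name': '飞兔', 'age': '27', 'height': '178'}
print(parse.urlencode(params))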
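
As a quick check of what urlencode produces, the sketch below encodes the same dict, decodes the result again with parse_qs, and shows the doseq option for parameters with several values. The tag parameter is an invented example.

```python
from urllib import parse

params = {'name': '飞兔', 'age': '27', 'height': '178'}
query = parse.urlencode(params)
print(query)  # name=%E9%A3%9E%E5%85%94&age=27&height=178

# parse_qs reverses the encoding; every value comes back as a list.
decoded = parse.parse_qs(query)
print(decoded)

# doseq=True expands a list value into one key=value pair per item.
multi = parse.urlencode({'tag': ['python', 'urllib']}, doseq=True)
print(multi)  # tag=python&tag=urllib
```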

14. Parsing a URL

from urllib.parse import urlparse

result = urlparse('http://www.baidu.com/index.html?user=autofelix')
print(type(result))  # urllib.parse.ParseResult, a named tuple
print(result)
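
The parsed result is a named tuple, so its components can be read by attribute name. A short sketch using the same URL, with parse_qs added to turn the query string into a dict:

```python
from urllib.parse import parse_qs, urlparse

result = urlparse('http://www.baidu.com/index.html?user=autofelix')

print(result.scheme)   # http
print(result.netloc)   # www.baidu.com
print(result.path)     # /index.html
print(result.query)    # user=autofelix

query_params = parse_qs(result.query)
print(query_params)    # {'user': ['autofelix']}
```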

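15. Joining URLs

If the second argument is itself an absolute URL, urljoin returns it unchanged.
If the second argument is a relative path or a query string, urljoin resolves it against the first URL and returns the combined result.

from urllib.parse import urljoin

print(urljoin('http://www.baidu.com', 'index.html'))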
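
The two rules can be seen side by side in the sketch below. The base URL http://www.baidu.com/a/b.html and the https://example.com/c target are invented for illustration.

```python
from urllib.parse import urljoin

base = 'http://www.baidu.com/a/b.html'

absolute = urljoin(base, 'https://example.com/c')  # second URL wins
sibling = urljoin(base, 'c.html')                  # replaces the last path segment
rooted = urljoin(base, '/c.html')                  # resolved from the site root
query_only = urljoin(base, '?page=2')              # keeps the path, sets the query

print(absolute)    # https://example.com/c
print(sibling)     # http://www.baidu.com/a/c.html
print(rooted)      # http://www.baidu.com/c.html
print(query_only)  # http://www.baidu.com/a/b.html?page=2
```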

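16. Converting a dict to query parameters

from urllib.parse import urlencode

params = {
    'name': 'autofelix',
    'age': 27
}
# The base URL ends with '?' so the encoded parameters can be appended directly
base_url = 'http://www.baidu.com?'
print(base_url + urlencode(params))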
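
Plain concatenation works when the base URL is known to end in "?". When the URL may already carry a query string, one alternative is to split it, extend the query, and reassemble it. The helper add_params below is a hypothetical name for this sketch, not a urllib function, and the wd and page parameters are invented.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def add_params(url, params):
    """Return url with params appended to any query string it already has."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)   # existing parameters, in order
    pairs.extend(params.items())     # new parameters go after them
    return urlunsplit(parts._replace(query=urlencode(pairs)))

print(add_params('http://www.baidu.com', {'name': 'autofelix', 'age': 27}))
# http://www.baidu.com?name=autofelix&age=27
print(add_params('http://www.baidu.com/s?wd=python', {'page': 2}))
# http://www.baidu.com/s?wd=python&page=2
```

This version appends without removing duplicates, so a key that is already present ends up in the URL twice.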
