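Requests is an HTTP library written in Python on top of urllib3 and released under the Apache 2.0 license. This post summarizes how several ways of sending a POST request with requests differ, which is useful when writing crawlers.
requests version: 2.24.0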
0x01 POST with form-encoded parameters
import requests
url = "http://192.168.1.122"
payload = {'key1': 'value#1', 'key2': 'value 2'}
# Alternatively, pass a list of tuples to data: payload = [('key1', 'value#1'), ('key2', 'value 2')]; both produce the same request
r = requests.post(url, data=payload)
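The request sent is shown below. The Content-Type header is application/x-www-form-urlencoded, the default encoding for HTML form POSTs. Names and values are URL-encoded: spaces become plus signs, reserved and non-ASCII characters are percent-encoded (for example, '#' becomes %23), each name is joined to its value with '=', and the pairs are joined with '&'.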
POST / HTTP/1.1
Host: 192.168.1.122
User-Agent: python-requests/2.24.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 27
Content-Type: application/x-www-form-urlencoded

key1=value%231&key2=value+2
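The encoding rules described above can be reproduced with the standard library alone. The following is a minimal sketch, not the code path requests itself runs; the variable names `body_from_dict` and `body_from_pairs` are illustrative.

```python
from urllib.parse import urlencode

# Same payload as in the example, once as a dict and once as a list of tuples.
payload = {'key1': 'value#1', 'key2': 'value 2'}
pairs = [('key1', 'value#1'), ('key2', 'value 2')]

# urlencode turns spaces into '+' and percent-encodes reserved characters such as '#'.
body_from_dict = urlencode(payload)
body_from_pairs = urlencode(pairs)

print(body_from_dict)                      # the form-encoded request body
print(len(body_from_dict.encode('ascii'))) # the value that goes into Content-Length
```

Both forms yield the same string, which matches the body and the Content-Length of 27 in the captured request.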
0x02 POST with parameters as JSON
import requests
url = "http://192.168.1.122"
payload = {'key1': 'value#1', 'key2': 'value 2'}
# The json parameter serializes the dict itself, so no separate json import is needed
r = requests.post(url, json=payload)
In this case the Content-Type header is application/json and the request body is sent as JSON:
POST / HTTP/1.1
Host: 192.168.1.122
User-Agent: python-requests/2.24.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 38
Content-Type: application/json

{"key1": "value#1", "key2": "value 2"}
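The body in the capture above has the same shape as the output of `json.dumps` with its default separators. A small sketch using only the standard library, assuming the default serialization settings (the name `body` is illustrative):

```python
import json

payload = {'key1': 'value#1', 'key2': 'value 2'}

# Default separators put a space after ':' and ',', as seen in the captured body.
body = json.dumps(payload)

print(body)
print(len(body.encode('utf-8')))  # byte length, comparable to Content-Length
```

Note that sending this string yourself with data=body would be a plain string upload, so the application/json header would have to be set by hand; the json parameter handles both steps.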
0x03 POST a raw string
import requests
url = "http://192.168.1.122"
xml = "my xml\n"
xml2 = """{"key":"value"}"""
xml3 = xml + xml2
# Headers can be customized as needed, e.g. headers = {'Content-Type': 'application/xml'}, then passed as headers=headers
r = requests.post(url, data=xml3)
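Captured traffic often shows POST bodies that are too complex to express as a dict. In that case, build the body as a string in the required format, pass it to data, and define the headers yourself as needed. When data is a string, requests sends it unchanged and adds no Content-Type header, as the request below shows: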
POST / HTTP/1.1
Host: 192.168.1.122
User-Agent: python-requests/2.24.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 22

my xml
{"key":"value"}
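The Content-Length of 22 in the request above is simply the byte length of the concatenated string. Below is a minimal sketch of preparing such a body, assuming UTF-8 encoding and a text/plain Content-Type; both are illustrative choices and were not part of the original request.

```python
# Build the raw body from the same two pieces as in the example.
xml = "my xml\n"
xml2 = '{"key":"value"}'

# Assumption: encode explicitly so the byte length and charset are predictable.
body = (xml + xml2).encode("utf-8")

# Assumption: an illustrative header; use whatever the target server expects.
headers = {"Content-Type": "text/plain; charset=utf-8"}

print(len(body))  # byte length that ends up in Content-Length

# The request itself would then be sent as:
# r = requests.post(url, data=body, headers=headers)
```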
0x04 POST a file upload
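import requests
url = "http://192.168.1.122"
# Open the file in binary mode; the with block closes it once the request has been sent
with open('test.txt', 'rb') as f:
    files = {'file': f}
    r = requests.post(url, files=files)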
Passing the files parameter uploads the file as a multipart/form-data form:
POST / HTTP/1.1
Host: 192.168.1.122
User-Agent: python-requests/2.24.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 157
Content-Type: multipart/form-data; boundary=2e6d3d07e843ebdfed4289659ba3a23a

--2e6d3d07e843ebdfed4289659ba3a23a
Content-Disposition: form-data; name="file"; filename="test.txt"

hello world!
--2e6d3d07e843ebdfed4289659ba3a23a--
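To make the structure of the multipart body explicit, the sketch below assembles it by hand with the standard library. It is an illustration of the wire format, not how requests builds the body internally. The boundary is copied from the capture above (requests normally generates a random one), and the file content is assumed to be "hello world!" followed by a newline.

```python
# Assumption: boundary copied from the captured request; normally it is random.
boundary = b"2e6d3d07e843ebdfed4289659ba3a23a"
# Assumption: test.txt contains this text, including a trailing newline.
file_content = b"hello world!\n"

parts = [
    b"--" + boundary,                                                    # part delimiter
    b'Content-Disposition: form-data; name="file"; filename="test.txt"', # part header
    b"",                                                                 # blank line ends the part headers
    file_content,                                                        # raw file bytes
    b"--" + boundary + b"--",                                            # closing delimiter
    b"",                                                                 # body ends with a final CRLF
]

# Lines in a multipart body are separated by CRLF.
body = b"\r\n".join(parts)

print(len(body))
```

Under these assumptions the length comes out to 157 bytes, which is consistent with the Content-Length in the captured request.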
References:
https://requests.readthedocs.io/zh_CN/latest/user/quickstart.html#post
https://blog.csdn.net/u013827143/article/details/86222486
https://www.cnblogs.com/lly-lcf/p/13876823.html