Every now and then the Requests project gets bored of fixing bugs and decides to break a whole ton of your code. But it doesn’t look good when we put it like that, so instead we call it a ‘major release’ and sell it as being full of shiny new features. Unfortunately it turns out that people complain if we break their code and don’t provide a nice way to find out what we broke.
So we provide a changelist with every release. The changelist is aimed at providing an easy-to-scan list of the changes so that busy downstream authors can quickly identify the things that will cause them pain and fix them. That’s great, but often people want a slightly more detailed explanation and description of what we did and why we did it.
Well, Requests just released version 2.0, and that’s a major release right there. To help make your life easier, I’ve whipped up this post: a lengthier explanation of what has changed in Requests to bring us up to 2.0. I’ll tackle each change in the order it appears in the changelist, and link to the relevant issues on GitHub for people who want to see what fool convinced Kenneth it was a good idea.
Let’s do it!
Header Dictionary Keys are always Native Strings
Previously, Requests would always encode any header keys you gave it to bytestrings on both Python 2 and Python 3. This was in principle fine. In practice, we had a few problems:
- It broke overriding headers that are otherwise automatically set by Requests, such as `Content-Type`.
- It could cause unpleasant `UnicodeDecodeError`s if you had unicode header values on Python 2.
- It didn’t work well with how `urllib3` expects its inputs.
So we now coerce to the native string type on each platform. Note that if you provide non-native string keys (unicode on Python 2, bytestring on Python 3), we will assume the encoding is UTF-8 when we convert the type. Be warned.
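A minimal sketch of that coercion rule (the helper name and logic below are illustrative, not the actual Requests internals):

```python
import sys

# Illustrative sketch, not the real Requests internals: header keys
# become the platform's native str type, and non-native keys are
# assumed to be UTF-8 when converting.
def to_native_string(string, encoding="utf-8"):
    if isinstance(string, str):
        return string
    if sys.version_info[0] == 2:
        return string.encode(encoding)  # unicode -> bytes on Python 2
    return string.decode(encoding)      # bytes -> str on Python 3

print(to_native_string(b"Content-Type"))  # native str either way
```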
Proxy URLs must now have an explicit scheme
Merged in Issue 1497, originally raised in Issue 1182
You used to be able to provide the proxy dictionary with proxies that didn’t have a scheme, like this:
{'http': '192.0.0.1:8888', 'https': '192.0.0.2:8888'}
This was useful for convenience, but it turned out to be a secret source of bugs. In the absence of a scheme, Requests would assume you wanted to use the scheme of the key, so that the above dictionary was interpreted as:
{'http': 'http://192.0.0.1:8888', 'https': 'https://192.0.0.2:8888'}
It turns out that this is often not what people wanted. Rather than continue to guess, as of 2.0 Requests will throw a `MissingSchema` exception if such a proxy URL is used. This includes any proxies sourced from environment variables.
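In other words, every proxy URL now needs its scheme spelled out. A hypothetical checker like the one below captures the new rule (the real exception is `MissingSchema` in `requests.exceptions`; this standalone function is illustrative only):

```python
# Sketch of the new rule: no scheme, no service. MissingSchema is the
# real exception name in requests.exceptions; this checker is a
# stand-in for illustration.
def ensure_scheme(proxy_url):
    if "://" not in proxy_url:
        raise ValueError("proxy URL %r needs an explicit scheme" % proxy_url)
    return proxy_url

proxies = {
    "http": ensure_scheme("http://192.0.0.1:8888"),
    "https": ensure_scheme("https://192.0.0.2:8888"),
}
print(sorted(proxies))
```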
Timeouts Are Better
Fixed downstream from us in urllib3.
Timeouts have been a source of pain for a lot of people for quite some time. They tend to behave in unintuitive ways, and we ended up adding notes to the documentation to attempt to fight this problem.
However, thanks to some sweet work done in urllib3, you now get better control over timeouts.
When `stream=True`, the timeout value now applies only to the connection attempt, not to any of the actual data download. When `stream=False`, we apply the timeout value to the connection process, and then to the data download.
To be clear, that means that this:
>>> r = requests.get(url, timeout=5, stream=False)
Could take up to 10 seconds to execute: 5 seconds will be the maximum wait for connection, and 5 seconds will be the maximum wait for a read to return.
RequestException is now a subclass of IOError
This is fairly simple. The Python docs describe `RuntimeError`, which `RequestException` used to subclass, like this:
Raised when an error is detected that doesn’t fall in any of the other categories.
Conceptually, `RequestException` should not be a subclass of `RuntimeError`; it should be a subclass of `IOError`. So now it is.
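The practical upshot, sketched here with stand-in classes (the real ones live in `requests.exceptions`): a single `except IOError` clause now covers Requests’ own errors too.

```python
# Stand-in classes sketching the new hierarchy; not the real
# requests source.
class RequestException(IOError):
    """Base class for all Requests errors (sketch)."""

class ConnectionError(RequestException):
    pass

# One `except IOError` now also catches Requests' errors:
try:
    raise ConnectionError("could not reach host")
except IOError as exc:
    caught = type(exc).__name__

print(caught)
```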
Added new method to PreparedRequest objects
We do a lot of internal copying of `PreparedRequest` objects, so there was a fair amount of redundant code in the library. We added the `PreparedRequest.copy()` method to clean that up, and it appeared to be sufficiently useful that it’s now part of the public API.
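A sketch of why `copy()` is handy, using a stand-in class rather than the real `PreparedRequest`: duplicate a prepared request, tweak the duplicate, and the original stays untouched.

```python
# Stand-in for PreparedRequest, just to show the copy() semantics.
class PreparedRequest:
    def __init__(self, method="GET", url=None, headers=None):
        self.method = method
        self.url = url
        self.headers = dict(headers or {})

    def copy(self):
        # A fresh object with its own copy of the mutable state.
        return PreparedRequest(self.method, self.url, self.headers)

p = PreparedRequest("GET", "http://example.com/", {"Accept": "*/*"})
q = p.copy()
q.headers["Accept"] = "application/json"
print(p.headers["Accept"])  # the original is unchanged
```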
Allow preparing of Requests with Session context
Proposed in Issue 1445, implemented in Issue 1507
This involved adding a new method to `Session` objects: `Session.prepare_request()`. This method takes a `Request` object and turns it into a `PreparedRequest`, while adding data specific to a single `Session`, e.g. any relevant cookie data. This has been a frequently requested feature ever since Kenneth added the `PreparedRequest` functionality in 1.0.
The new primary `PreparedRequest` workflow is:
r = Request()
# Do stuff with the Request object.
s = Session()
p = s.prepare_request(r)
# Then, later:
s.send(p)
This provides all the many benefits of Requests sessions for your `PreparedRequest`s.
Extended the HTTPAdapter subclass interface
Implemented as part of the proxy improvements mentioned later.
We have an `HTTPAdapter.add_headers()` method for adding HTTP headers to any request being sent through a Transport Adapter. As part of the extended work on proxies, we’ve added a new method, `HTTPAdapter.proxy_headers()`, that does the equivalent thing for requests being sent through proxies. This is particularly useful for requests that use the CONNECT verb to tunnel HTTPS data through proxies, as it enables them to specify headers that should be sent to the proxy, not the downstream target.
It’s expected that most users will never worry about this function, but it is a useful extension to the subclassing interface of the `HTTPAdapter`.
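As a sketch (with a stub base class standing in for the real `requests.adapters.HTTPAdapter`), a subclass might use the new hook to authenticate to a proxy:

```python
# Stub standing in for requests.adapters.HTTPAdapter, for illustration.
class HTTPAdapter:
    def proxy_headers(self, proxy):
        return {}

class ProxyAuthAdapter(HTTPAdapter):
    def proxy_headers(self, proxy):
        # These headers go to the proxy itself (e.g. on the CONNECT
        # request), never to the downstream target.
        headers = super(ProxyAuthAdapter, self).proxy_headers(proxy)
        headers["Proxy-Authorization"] = "Basic dXNlcjpwYXNz"
        return headers

adapter = ProxyAuthAdapter()
print(sorted(adapter.proxy_headers("http://proxy.example:8080")))
```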
Better Handling of Chunked Encoding Errors
Identified by many issues, but the catalyst was Issue 1397, and implemented in Issue 1498.
It turns out that a distressingly large number of websites report that they will be using chunked encoding (by setting `Transfer-Encoding: chunked` in the HTTP headers), but then send all the data as one blob. I’ve actually touched on this in a previous post.
Anyway, when that happens we used to throw an ugly `httplib.IncompleteRead` exception. We now catch that and throw the much nicer `requests.ChunkedEncodingError` instead. Far better.
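The fix is essentially an exception translation, sketched below with stand-in classes (the real ones are `httplib.IncompleteRead` and `requests.exceptions.ChunkedEncodingError`):

```python
# Stand-in exceptions sketching the translation Requests now does.
class IncompleteRead(Exception):
    pass

class ChunkedEncodingError(Exception):
    pass

def read_body(raw_read):
    try:
        return raw_read()
    except IncompleteRead as exc:
        # Re-raise the low-level httplib error as something friendlier.
        raise ChunkedEncodingError("Connection broken: %r" % exc)

def lying_server():
    raise IncompleteRead("server said chunked, sent one blob")

try:
    read_body(lying_server)
except ChunkedEncodingError as exc:
    print(type(exc).__name__)
```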
Invalid Percent-Escape Sequences Now Better Handled
Proposed in Issue 1510, resolved by Issue 1514.
This is fairly simple. If Requests encountered a URL that contained an invalid percent-escape sequence, such as the clearly invalid `http://%zz/`, we used to throw a `ValueError` moaning about an invalid literal for base 16. That, while true, was unhelpful. We now throw a `requests.InvalidURL` exception instead.
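To see where the old error came from: percent-decoding `%zz` means parsing `zz` as a base-16 integer, and that parse is the `ValueError` users used to hit.

```python
# Percent-decoding "%zz" tries to parse "zz" as hexadecimal, which
# is exactly where the old, confusing ValueError came from.
try:
    int("zz", 16)
except ValueError as exc:
    print(exc)  # invalid literal for int() with base 16: 'zz'
```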
Correct Some Reason Phrases
Proposed and fixed by Issue 1456.
We had an invalid reason phrase for the HTTP 208 response code. The correct phrase is `Already Reported`, but we were using `IM Used`. We fixed that up, and added the HTTP 226 status code, whose reason phrase actually is `IM Used`.
Vastly Improved Proxy Support
Proposed many many times, I wrote a whole post about it, and fixed by Issue 1515.
HTTPS proxies used to be totally broken: you could just never assume they worked. Thanks to some phenomenal work on `urllib3` by a number of awesome people, we can now announce support for the HTTP CONNECT verb, and as a result support for HTTPS proxies.
This is a huge positive for us, and I’m delighted it made it in. Special thanks go to Stanislav Vitkovskiy, Marc Schlaich, Thomas Weißschuh and Andrey Petrov for their great work getting this in place.
Miscellaneous Bug Fixes
We also fixed a number of bugs. In no particular order, they are:
- Cookies are now correctly sent on responses to 401 messages, and any 401s received that set cookies now have those cookies persisted.
- We now select chunked encoding only when we legitimately don’t know how large a file is, instead of when we have a zero-length file.
- Mixed case schemes are now supported throughout Requests, including when mounting Transport Adapters.
- We have a much more robust infrastructure for streaming downloads, which should now actually run to completion.
- We now collect environment proxies from more locations, such as the Windows registry.
- We have a few minor assorted cookies fixes: nothing dramatic.
- We no longer reuse `PreparedRequest` objects on redirects.
- Auth settings in `.netrc` files no longer override explicit auth values: instead it’s the other way around.
- Cookies that specify port numbers in their host field are now correctly parsed.
- You can perform streaming uploads with `BytesIO` objects now.
Summary
Requests 2.0 is an awesome release. In particular, the proxy and timeout improvements are a massive win. 2.0 has involved a lot of work from a ton of contributors, and coincides with Requests passing 5 million downloads. This is definitely another major milestone. So thanks for all your continuing support! On behalf of the Requests project, I want to say that you’re excellent, and we love you all.
I think Requests is getting better all the time, and hopefully you do too. I encourage you to download the new version and get using it. If you encounter any problems, raise an issue and let us know.
Enjoy yourself!