# Downloads
The `Downloads` package provides cross-platform, multi-protocol, in-process
download functionality implemented with
[libcurl](https://curl.haxx.se/libcurl/). Its main entry point is the `download`
function. The package uses libcurl's multi-handle callback API to present a
Julian API: `download(url)` blocks the task in which it occurs but yields to
Julia's scheduler, allowing arbitrarily many tasks to download URLs concurrently
and efficiently. As of Julia 1.6, this package is a standard library included
with Julia, but it can also be used with Julia 1.3 through 1.5.
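
As a rough illustration of the concurrency described above, the sketch below starts one task per URL and waits for all of them to finish. The helper name `download_all` is hypothetical and not part of the package; the call is written as `Downloads.download` to make clear which function is meant.

```jl
using Downloads

# Hypothetical helper (not part of Downloads): fetch every URL concurrently.
# Each `download` call blocks only its own task, so the transfers can overlap.
function download_all(urls)
    paths = Vector{String}(undef, length(urls))
    @sync for (i, url) in enumerate(urls)
        @async paths[i] = Downloads.download(url)
    end
    return paths  # temporary file paths, in the same order as `urls`
end
```

Because `@sync` waits for every task started with `@async`, the function returns only once all downloads have completed or one of them has thrown.
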
## API
The public API of `Downloads` consists of two functions and three types:
- `download` — download a file from a URL, erroring if it can't be downloaded
- `request` — request a URL, returning a `Response` object indicating success
- `Response` — a type capturing the status and other metadata about a request
- `RequestError` — an error type thrown by `download` and `request` on error
- `Downloader` — an object encapsulating shared resources for downloading
### download
```jl
download(url, [ output = tempfile() ];
    [ method = "GET", ]
    [ headers = <none>, ]
    [ timeout = <none>, ]
    [ progress = <none>, ]
    [ verbose = false, ]
    [ downloader = <default>, ]
) -> output
```
- `url :: AbstractString`
- `output :: Union{AbstractString, AbstractCmd, IO}`
- `method :: AbstractString`
- `headers :: Union{AbstractVector, AbstractDict}`
- `timeout :: Real`
- `progress :: (total::Integer, now::Integer) --> Any`
- `verbose :: Bool`
- `downloader :: Downloader`

Download a file from the given url, saving it to `output` or, if not specified,
to a temporary path. The `output` can also be an `IO` handle, in which case the
body of the response is streamed to that handle and the handle is returned. If
`output` is a command, the command is run and the response body is sent to it on
stdin.
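
A minimal sketch of the `IO` form of `output`, assuming the whole body fits comfortably in memory. The helper name `fetch_string` is hypothetical.

```jl
using Downloads

# Hypothetical helper: return the body of `url` as a String.
function fetch_string(url::AbstractString)
    io = IOBuffer()
    Downloads.download(url, io)  # the body is streamed into `io`
    return String(take!(io))
end
```

For large files, passing a path or an open file handle instead of an `IOBuffer` avoids holding the entire body in memory.
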
If the `downloader` keyword argument is provided, it must be a `Downloader`
object. Resources and connections will be shared between downloads performed by
the same `Downloader` and cleaned up automatically when the object is garbage
collected or there have been no downloads performed with it for a grace period.
See `Downloader` for more info about configuration and usage.

If the `headers` keyword argument is provided, it must be a vector or dictionary
whose elements are all pairs of strings. These pairs are passed as headers when
downloading URLs with protocols that support them, such as HTTP/S.

The `timeout` keyword argument specifies a timeout for the download in seconds,
with a resolution of milliseconds. By default no timeout is set, but this can
also be explicitly requested by passing a timeout value of `Inf`.
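
The `headers` and `timeout` keywords might be combined as in the sketch below. The helper name `download_with_options`, the header names and values, and the 10.5 second limit are all illustrative assumptions.

```jl
using Downloads

# Hypothetical helper: download `url` with custom headers and a time limit.
function download_with_options(url::AbstractString; token::AbstractString = "example-token")
    # A vector of string pairs; a Dict of strings would be accepted as well.
    headers = [
        "Accept" => "application/json",
        "Authorization" => "Bearer $token",
    ]
    # The timeout is given in seconds; fractional values are allowed.
    return Downloads.download(url; headers = headers, timeout = 10.5)
end
```
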
If the `progress` keyword argument is provided, it must be a callback function
which will be called whenever there are updates about the size and status of the
ongoing download. The callback must take two integer arguments: `total` and
`now`, which are the total size of the download in bytes and the number of bytes
which have been downloaded so far. Note that `total` starts out as zero and
remains zero until the server gives an indication of the total size of the
download (e.g. with a `Content-Length` header), which may never happen. So a
well-behaved progress callback should handle a total size of zero gracefully.
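
One way to write a callback that tolerates an unknown total is sketched below. The names `progress_message` and `report_progress`, the message format, and the URL in the usage comment are hypothetical.

```jl
using Downloads

# Hypothetical helper: build a status line, tolerating an unknown total size.
function progress_message(total::Integer, now::Integer)
    if total > 0
        percent = round(100 * now / total; digits = 1)
        return "$now of $total bytes ($percent%)"
    else
        # The server has not (yet) reported a size, so no percentage is shown.
        return "$now bytes (total size unknown)"
    end
end

# Callback with the two-argument shape that `download` expects.
report_progress(total, now) = println(stderr, progress_message(total, now))

# Usage (hypothetical URL):
# Downloads.download("https://example.com/file.bin"; progress = report_progress)
```

Keeping the formatting in a separate pure function makes the zero-total branch easy to check without performing a download.
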
If the `verbose` option is set to true, `libcurl`, which is used to implement
the download functionality, will print debugging information to `stderr`.
### request
```jl
request(url;
    [ input = <none>, ]
    [ output = <none>, ]
    [ method = input ? "PUT" : output ? "GET" : "HEAD", ]
    [ headers = <none>, ]
    [ timeout = <none>, ]
    [ progress = <none>, ]
    [ verbose = false, ]
    [ throw = true, ]
    [ downloader = <default>, ]
) -> Union{Response, RequestError}
```
- `url :: AbstractString`
- `input :: Union{AbstractString, AbstractCmd, IO}`
- `output :: Union{AbstractString, AbstractCmd, IO}`
- `method :: AbstractString`
- `headers :: Union{AbstractVector, AbstractDict}`
- `timeout :: Real`
- `progress :: (dl_total, dl_now, ul_total, ul_now) --> Any`
- `verbose :: Bool`
- `throw :: Bool`
- `downloader :: Downloader`

Make a request to the given url, returning a `Response` object capturing the
status, headers and other information about the response. The body of the
response is written to `output` if specified and discarded otherwise. For HTTP/S
requests, if an `input` stream is given, a `PUT` request is made; otherwise if
an `output` stream is given, a `GET` request is made; if neither is given a
`HEAD` request is made. For other protocols, appropriate default methods are
used based on what combination of input and output are requested. The following
options differ from the `download` function:

- `input` allows providing a request body; if provided, the method defaults to `PUT`
- `progress` is a callback taking four integers for upload and download progress
- `throw` controls whether to throw or return a `RequestError` on request error

Note that unlike `download`, which throws an error if the requested URL could not
be downloaded (indicated by a non-2xx status code), `request` returns a `Response`
object no matter what the status code of the response is. If there is an error
with getting a response at all, then a `RequestError` is thrown or returned,
depending on the `throw` option.
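
The difference between a returned `Response` and a returned `RequestError` can be sketched as follows, using `throw = false` so that neither case raises an exception. The helper name `describe_request` and the wording of its messages are hypothetical.

```jl
using Downloads

# Hypothetical helper: summarize the outcome of a request without throwing.
function describe_request(url::AbstractString)
    body = IOBuffer()
    result = Downloads.request(url; output = body, throw = false)
    if result isa Downloads.RequestError
        # No usable response was obtained at all.
        return "request error: " * result.message
    end
    # Any status code ends up here, including non-2xx ones.
    nbytes = length(take!(body))
    return "status $(result.status), $nbytes bytes of body"
end
```
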
### Response
```jl
struct Response
    proto :: String
    url :: String
    status :: Int
    message :: String
    headers :: Vector{Pair{String,String}}
end
```
`Response` is a type capturing the properties of a successful response to a
request as an object. It has the following fields:
- `proto`: the protocol that was used to get the response
- `url`: the URL that was ultimately requested after following redirects
- `status`: the status code of the response, indicating success, failure, etc.
- `message`: a textual message describing the nature of the response
- `headers`: any headers that were returned with the response

The meaning and availability of some of these fields depend on the protocol
used for the request. For many protocols, including HTTP/S and S/FTP, a 2xx
status code indicates a successful response. For responses in protocols that do
not support headers, the headers vector will be empty. HTTP/2 does not include a
status message, only a status code, so the message will be empty.
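
Since `headers` is a plain vector of pairs, looking up a single header takes a small amount of code. The sketch below uses a hypothetical helper, `header_value`, and compares names case-insensitively as a precaution, because HTTP header names are case-insensitive and the casing a server uses may vary.

```jl
# Hypothetical helper: find a header by name in a vector of `name => value` pairs,
# such as the `headers` field of a `Response`.
function header_value(headers::AbstractVector{<:Pair}, name::AbstractString)
    for (key, value) in headers
        lowercase(key) == lowercase(name) && return value
    end
    return nothing  # header not present
end

# Usage, assuming `resp` is a `Response` returned by `request`:
# header_value(resp.headers, "content-length")
```
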
### RequestError
```jl
struct RequestError <: ErrorException
    url :: String
    code :: Int
    message :: String
    response :: Response
end
```
`RequestError` is a type capturing the properties of a failed response to a
request as an exception object:
- `url`: the original URL that was requested without any redirects
- `code`: the libcurl error code; `0` if a protocol-only error occurred
- `message`: the libcurl error message indicating what went wrong
- `response`: response object capturing what response info is available

The same `RequestError` type is thrown by `download` if the request was
successful but there was a protocol-level error indicated by a status code that
is not in the 2xx range, in which case `code` will be zero and the `message`
field will be the empty string. The `request` API only throws a `RequestError`
if the libcurl error `code` is non-zero, in which case the included `response`
object is likely to have a `status` of zero and an empty message. There are,
however, situations where a curl-level error is thrown due to a protocol error,
in which case both the inner and outer code and message may be of interest.
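
A sketch of how a caller might tell the two kinds of failure apart when using `download`. The helper names `describe_failure` and `try_download` and the message wording are hypothetical.

```jl
using Downloads

# Hypothetical helper: turn a RequestError into a short description.
function describe_failure(err::Downloads.RequestError)
    if err.code == 0
        # Protocol-level failure: a response arrived, but with a non-2xx status.
        return "protocol error: status $(err.response.status) for $(err.url)"
    else
        # libcurl-level failure (for example, a connection problem).
        return "libcurl error $(err.code): $(err.message)"
    end
end

# Hypothetical wrapper: return the downloaded path, or `nothing` on failure.
function try_download(url::AbstractString)
    try
        return Downloads.download(url)
    catch err
        err isa Downloads.RequestError || rethrow()
        @warn describe_failure(err)
        return nothing
    end
end
```

Errors of any other type are rethrown unchanged, so only request failures are converted into a `nothing` result.
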
### Downloader
```jl
Downloader(; [ grace::Real = 30 ])
```
`Downloader` objects are used to perform individual `download` operations.
Connections, name lookups and other resources are shared within a `Downloader`.
These connections and resources are cleaned up once a configurable grace period
(default: 30 seconds) has passed since the last download performed with it, or
when it is garbage collected, whichever comes first. If the grace period is set
to zero, all resources will be cleaned up as soon as there are no more ongoing
downloads. If the grace period is set to `Inf`, then resources are not cleaned
up until the `Downloader` is garbage collected.
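
A minimal sketch of sharing one `Downloader` across several downloads. The helper name `download_batch` and the 5 second grace period are illustrative assumptions.

```jl
using Downloads

# Hypothetical helper: download several URLs through one shared Downloader,
# so that connections and name lookups can be reused between them.
function download_batch(urls; grace::Real = 5)
    downloader = Downloads.Downloader(; grace = grace)
    return [Downloads.download(url; downloader = downloader) for url in urls]
end
```

This version downloads the URLs one after another; the shared `Downloader` matters most when the URLs point at the same host.
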