Diffstat (limited to 'bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.xml')
-rw-r--r--bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.xml622
1 files changed, 622 insertions, 0 deletions
diff --git a/bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.xml b/bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.xml
new file mode 100644
index 0000000000..5aa53defc4
--- /dev/null
+++ b/bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.xml
@@ -0,0 +1,622 @@
1<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
2"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
3
4<chapter>
5<title>File Download Support</title>
6
7 <para>
8 BitBake's fetch module is a standalone piece of library code
9 that deals with the intricacies of downloading source code
10 and files from remote systems.
11 Fetching source code is one of the cornerstones of building software.
12 As such, this module forms an important part of BitBake.
13 </para>
14
15 <para>
16 The current fetch module is called "fetch2", a name that reflects
17 its status as the second major version of the API.
18 The original version is obsolete and has been removed from the codebase.
19 Thus, in all cases, "fetch" refers to "fetch2" in this
20 manual.
21 </para>
22
23 <section id='the-download-fetch'>
24 <title>The Download (Fetch)</title>
25
26 <para>
27 BitBake takes several steps when fetching source code or files.
28 The fetcher codebase deals with two distinct processes in order:
29 obtaining the files from somewhere (cached or otherwise)
30 and then unpacking those files into a specific location and
31 perhaps in a specific way.
32 Getting and unpacking the files is often followed
33 by patching.
34 Patching, however, is not covered by this module.
35 </para>
36
37 <para>
38 The code to execute the first part of this process, a fetch,
39 looks something like the following:
40 <literallayout class='monospaced'>
41 src_uri = (d.getVar('SRC_URI', True) or "").split()
42 fetcher = bb.fetch2.Fetch(src_uri, d)
43 fetcher.download()
44 </literallayout>
45 This code sets up an instance of the fetch class.
46 The instance uses a space-separated list of URLs from the
47 <link linkend='var-SRC_URI'><filename>SRC_URI</filename></link>
48 variable and then calls the <filename>download</filename>
49 method to download the files.
50 </para>
51
52 <para>
53 The instantiation of the fetch class is usually followed by:
54 <literallayout class='monospaced'>
55 rootdir = d.getVar('WORKDIR', True)
56 fetcher.unpack(rootdir)
57 </literallayout>
58 This code unpacks the downloaded files to the
59 directory specified by <filename>WORKDIR</filename>.
60 <note>
61 For convenience, the naming in these examples matches
62 the variables used by OpenEmbedded.
63 </note>
64 The <filename>SRC_URI</filename> and <filename>WORKDIR</filename>
65 variables are not coded into the fetcher.
66 The fetcher can be (and is) called with different variable names.
67 In OpenEmbedded for example, the shared state (sstate) code uses
68 the fetch module to fetch the sstate files.
69 </para>
70
71 <para>
72 When the <filename>download()</filename> method is called,
73 BitBake tries to fulfill the URLs by looking for source files
74 in a specific search order:
75 <itemizedlist>
76 <listitem><para><emphasis>Pre-mirror Sites:</emphasis>
77 BitBake first uses pre-mirrors to try and find source files.
78 These locations are defined using the
79 <link linkend='var-PREMIRRORS'><filename>PREMIRRORS</filename></link>
80 variable.
81 </para></listitem>
82 <listitem><para><emphasis>Source URI:</emphasis>
83 If pre-mirrors fail, BitBake uses the original URL (e.g. from
84 <filename>SRC_URI</filename>).
85 </para></listitem>
86 <listitem><para><emphasis>Mirror Sites:</emphasis>
87 If fetch failures occur, BitBake next uses mirror locations as
88 defined by the
89 <link linkend='var-MIRRORS'><filename>MIRRORS</filename></link>
90 variable.
91 </para></listitem>
92 </itemizedlist>
93 </para>
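 <para>
 The search order above can be sketched in a few lines of Python.
 This is an illustration only, not BitBake's implementation:
 <filename>fetch_with_fallback</filename>,
 <filename>try_fetch</filename>, and the example.com locations
 are hypothetical names introduced for the example.
 <literallayout class='monospaced'>
```python
from typing import Callable, Iterable, Optional


def fetch_with_fallback(
    source_url: str,
    premirrors: Iterable[str],
    mirrors: Iterable[str],
    try_fetch: Callable[[str], bool],
) -> Optional[str]:
    """Return the first location that try_fetch() accepts, or None.

    The order follows the list above: pre-mirrors first, then the
    original URL, then mirrors.
    """
    candidates = [*premirrors, source_url, *mirrors]
    for location in candidates:
        if try_fetch(location):
            return location
    return None


# Hypothetical locations; only the mirror "has" the file here.
available = {"http://mirror.example.com/sources/foo.tar.gz"}
result = fetch_with_fallback(
    "http://upstream.example.com/foo.tar.gz",
    premirrors=["http://premirror.example.com/sources/foo.tar.gz"],
    mirrors=["http://mirror.example.com/sources/foo.tar.gz"],
    try_fetch=lambda url: url in available,
)
print(result)
```
 </literallayout>
 Because the pre-mirror and the upstream location both fail in
 this example, the mirror location is the one that is returned.
 </para>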
94
95 <para>
96 For each URL passed to the fetcher, the fetcher
97 calls the submodule that handles that particular URL type.
98 This behavior can be the source of some confusion when you
99 are providing URLs for the <filename>SRC_URI</filename>
100 variable.
101 Consider the following two URLs:
102 <literallayout class='monospaced'>
103 http://git.yoctoproject.org/git/poky;protocol=git
104 git://git.yoctoproject.org/git/poky;protocol=http
105 </literallayout>
106 In the former case, the URL is passed to the
107 <filename>wget</filename> fetcher, which does not
108 understand "git".
109 Therefore, the latter case is the correct form since the
110 Git fetcher does know how to use HTTP as a transport.
111 </para>
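 <para>
 To make the distinction concrete, the following sketch splits a
 URL into its scheme, location, and parameters.
 It is a simplified model rather than the fetcher's own parser, and
 <filename>split_fetch_url</filename> is a hypothetical helper.
 The point it illustrates is that the scheme alone selects the
 submodule, while parameters such as "protocol" are interpreted
 afterwards by that submodule.
 <literallayout class='monospaced'>
```python
from typing import Dict, Tuple


def split_fetch_url(url: str) -> Tuple[str, str, Dict[str, str]]:
    """Split 'scheme://location;key=value;...' into its three parts."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("not a URL with a scheme: %r" % url)
    location, *raw_params = rest.split(";")
    params = {}
    for item in raw_params:
        if not item:
            continue  # tolerate a trailing ';'
        key, _, value = item.partition("=")
        params[key] = value
    return scheme, location, params


for url in (
    "http://git.yoctoproject.org/git/poky;protocol=git",
    "git://git.yoctoproject.org/git/poky;protocol=http",
):
    scheme, location, params = split_fetch_url(url)
    # The scheme decides which fetcher handles the URL.
    print(scheme, location, params)
```
 </literallayout>
 </para>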
112
113 <para>
114 Here are some examples that show commonly used mirror
115 definitions:
116 <literallayout class='monospaced'>
117 PREMIRRORS ?= "\
118 bzr://.*/.* http://somemirror.org/sources/ \n \
119 cvs://.*/.* http://somemirror.org/sources/ \n \
120 git://.*/.* http://somemirror.org/sources/ \n \
121 hg://.*/.* http://somemirror.org/sources/ \n \
122 osc://.*/.* http://somemirror.org/sources/ \n \
123 p4://.*/.* http://somemirror.org/sources/ \n \
124 svn://.*/.* http://somemirror.org/sources/ \n"
125
126 MIRRORS =+ "\
127 ftp://.*/.* http://somemirror.org/sources/ \n \
128 http://.*/.* http://somemirror.org/sources/ \n \
129 https://.*/.* http://somemirror.org/sources/ \n"
130 </literallayout>
131 It is useful to note that BitBake supports
132 cross-URLs.
133 It is possible to mirror a Git repository on an HTTP
134 server as a tarball.
135 This is what the <filename>git://</filename> mapping in
136 the previous example does.
137 </para>
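 <para>
 The following sketch shows one way to read such a definition:
 each entry is a pattern followed by a replacement location, and
 entries are separated by the two-character sequence
 backslash-n.
 It is a deliberate simplification that matches the pattern
 against the whole URL, which is an assumption of this example
 and may differ from how BitBake rewrites URLs internally.
 The names <filename>parse_mirrors</filename> and
 <filename>first_mirror</filename> are hypothetical.
 <literallayout class='monospaced'>
```python
import re
from typing import List, Optional, Tuple


def parse_mirrors(value: str) -> List[Tuple[str, str]]:
    """Turn a mirror variable value into (pattern, replacement) pairs."""
    pairs = []
    # "\\n" is a literal backslash followed by "n", as in the definitions.
    for entry in value.split("\\n"):
        fields = entry.split()
        if len(fields) == 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def first_mirror(url: str, pairs: List[Tuple[str, str]]) -> Optional[str]:
    """Return the replacement of the first pattern that matches url."""
    for pattern, replacement in pairs:
        if re.fullmatch(pattern, url):
            return replacement
    return None


premirrors = parse_mirrors(
    "git://.*/.* http://somemirror.org/sources/ \\n "
    "svn://.*/.* http://somemirror.org/sources/ \\n"
)
print(first_mirror("git://git.yoctoproject.org/git/poky", premirrors))
```
 </literallayout>
 </para>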
138
139 <para>
140 Since network accesses are slow, BitBake maintains a
141 cache of files downloaded from the network.
142 Any source files that are not local (i.e.
143 downloaded from the Internet) are placed into the download
144 directory, which is specified by the
145 <link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
146 variable.
147 </para>
148
149 <para>
150 File integrity is of key importance for reproducing builds.
151 For non-local archive downloads, the fetcher code can verify
152 sha256 and md5 checksums to ensure the archives have been
153 downloaded correctly.
154 You can specify these checksums by using the
155 <filename>SRC_URI</filename> variable with the appropriate
156 varflags as follows:
157 <literallayout class='monospaced'>
158 SRC_URI[md5sum] = "value"
159 SRC_URI[sha256sum] = "value"
160 </literallayout>
161 You can also specify the checksums as parameters on the
162 <filename>SRC_URI</filename> as shown below:
163 <literallayout class='monospaced'>
164 SRC_URI = "http://example.com/foobar.tar.bz2;md5sum=4a8e0f237e961fd7785d19d07fdb994d"
165 </literallayout>
166 If multiple URIs exist, you can specify the checksums either
167 directly as in the previous example, or you can name the URLs.
168 The following syntax shows how you name the URIs:
169 <literallayout class='monospaced'>
170 SRC_URI = "http://example.com/foobar.tar.bz2;name=foo"
171 SRC_URI[foo.md5sum] = "4a8e0f237e961fd7785d19d07fdb994d"
172 </literallayout>
173 After a file has been downloaded and has had its checksum checked,
174 a ".done" stamp is placed in <filename>DL_DIR</filename>.
175 BitBake uses this stamp during subsequent builds to avoid
176 downloading or comparing a checksum for the file again.
177 <note>
178 It is assumed that local storage is safe from data corruption.
179 If this were not the case, there would be bigger issues to worry about.
180 </note>
181 </para>
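 <para>
 As a rough illustration of what a checksum comparison involves,
 the sketch below computes a sha256 digest with Python's standard
 <filename>hashlib</filename> module and compares it with an
 expected value.
 The helper names and the temporary file standing in for a
 downloaded archive are assumptions of this example; the sketch
 does not reproduce the fetcher's own verification code or the
 ".done" stamp handling.
 <literallayout class='monospaced'>
```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_sha256: str) -> bool:
    """Compare the file's digest with the expected value."""
    return sha256_of(path) == expected_sha256.lower()


# Stand-in for a downloaded archive.
with tempfile.NamedTemporaryFile(delete=False) as archive:
    archive.write(b"example archive contents")

expected = hashlib.sha256(b"example archive contents").hexdigest()
ok = verify(archive.name, expected)
bad = verify(archive.name, "0" * 64)
os.unlink(archive.name)
print(ok, bad)
```
 </literallayout>
 </para>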
182
183 <para>
184 If
185 <link linkend='var-BB_STRICT_CHECKSUM'><filename>BB_STRICT_CHECKSUM</filename></link>
186 is set, any download without a checksum triggers an
187 error message.
188 The
189 <link linkend='var-BB_NO_NETWORK'><filename>BB_NO_NETWORK</filename></link>
190 variable can be used to make any attempted network access a fatal
191 error, which is useful, for example, for checking that mirrors
192 are complete.
193 </para>
194 </section>
195
196 <section id='bb-the-unpack'>
197 <title>The Unpack</title>
198
199 <para>
200 The unpack process usually immediately follows the download.
201 For all URLs except Git URLs, BitBake uses the common
202 <filename>unpack</filename> method.
203 </para>
204
205 <para>
206 A number of parameters exist that you can specify within the
207 URL to govern the behavior of the unpack stage:
208 <itemizedlist>
209 <listitem><para><emphasis>unpack:</emphasis>
210 Controls whether the URL components are unpacked.
211 If set to "1", which is the default, the components
212 are unpacked.
213 If set to "0", the unpack stage leaves the file alone.
214 This parameter is useful when you want an archive to be
215 copied in and not be unpacked.
216 </para></listitem>
217 <listitem><para><emphasis>dos:</emphasis>
218 Applies to <filename>.zip</filename> and
219 <filename>.jar</filename> files and specifies whether to
220 use DOS line ending conversion on text files.
221 </para></listitem>
222 <listitem><para><emphasis>basepath:</emphasis>
223 Instructs the unpack stage to strip the specified
224 directories from the source path when unpacking.
225 </para></listitem>
226 <listitem><para><emphasis>subdir:</emphasis>
227 Unpacks the specific URL to the specified subdirectory
228 within the root directory.
229 </para></listitem>
230 </itemizedlist>
231 The unpack call automatically decompresses and extracts files
232 with ".Z", ".z", ".gz", ".xz", ".zip", ".jar", ".ipk", ".rpm",
233 ".srpm", ".deb" and ".bz2" extensions as well as various combinations
234 of tarball extensions.
235 </para>
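 <para>
 The interaction between the "unpack" parameter and the file
 extension can be modelled as follows.
 This is only a sketch of the decision described above:
 <filename>unpack_action</filename> is a hypothetical helper, and
 the ".tar" entry is an assumption standing in for the tarball
 combinations that the text does not enumerate.
 <literallayout class='monospaced'>
```python
from typing import Dict

# Extensions named in the text above. Tarball variants are not listed
# there, so ".tar" is included as an assumption of this sketch.
ARCHIVE_SUFFIXES = (
    ".Z", ".z", ".gz", ".xz", ".zip", ".jar", ".ipk",
    ".rpm", ".srpm", ".deb", ".bz2", ".tar",
)


def unpack_action(filename: str, params: Dict[str, str]) -> str:
    """Return "extract" or "copy" for one fetched file."""
    if params.get("unpack", "1") == "0":
        return "copy"  # the archive is copied in and left alone
    if filename.endswith(ARCHIVE_SUFFIXES):
        return "extract"
    return "copy"  # plain files are simply copied


print(unpack_action("foobar.tar.bz2", {}))
print(unpack_action("foobar.tar.bz2", {"unpack": "0"}))
print(unpack_action("relativefile.patch", {}))
```
 </literallayout>
 </para>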
236
237 <para>
238 As mentioned, the Git fetcher has its own unpack method that
239 is optimized to work with Git trees.
240 Basically, this method works by cloning the tree into the final
241 directory.
242 The process is completed using references so that there is
243 only one central copy of the Git metadata needed.
244 </para>
245 </section>
246
247 <section id='bb-fetchers'>
248 <title>Fetchers</title>
249
250 <para>
251 As mentioned earlier, the URL prefix determines which
252 fetcher submodule BitBake uses.
253 Each submodule can support different URL parameters,
254 which are described in the following sections.
255 </para>
256
257 <section id='local-file-fetcher'>
258 <title>Local file fetcher (<filename>file://</filename>)</title>
259
260 <para>
261 This submodule handles URLs that begin with
262 <filename>file://</filename>.
263 The filename you specify within the URL can
264 be either an absolute or a relative path to a file.
265 If the filename is relative, the contents of the
266 <link linkend='var-FILESPATH'><filename>FILESPATH</filename></link>
267 variable are used in the same way
268 <filename>PATH</filename> is used to find executables.
269 Failing that,
270 <link linkend='var-FILESDIR'><filename>FILESDIR</filename></link>
271 is used to find the appropriate relative file.
272 <note>
273 <filename>FILESDIR</filename> is deprecated and can
274 be replaced with <filename>FILESPATH</filename>.
275 Because <filename>FILESDIR</filename> is likely to be
276 removed, you should not use this variable in any new code.
277 </note>
278 If the file cannot be found, it is assumed that it is available in
279 <link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
280 by the time the <filename>download()</filename> method is called.
281 </para>
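 <para>
 The <filename>PATH</filename>-style lookup can be sketched as
 shown below.
 The sketch assumes that <filename>FILESPATH</filename> holds a
 colon-separated list of directories;
 <filename>find_in_filespath</filename> is a hypothetical helper,
 and the temporary directories only stand in for real layer
 directories.
 <literallayout class='monospaced'>
```python
import os
import tempfile
from typing import Optional


def find_in_filespath(relative_name: str, filespath: str) -> Optional[str]:
    """Return the first existing match for relative_name, or None.

    Directories are searched in order, in the same way a shell
    searches PATH for an executable.
    """
    for directory in filespath.split(":"):
        if not directory:
            continue
        candidate = os.path.join(directory, relative_name)
        if os.path.exists(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    # Only the second directory holds the patch.
    patch = os.path.join(second, "relativefile.patch")
    open(patch, "w").close()
    search_path = first + ":" + second
    found = find_in_filespath("relativefile.patch", search_path)
    missing = find_in_filespath("absent.patch", search_path)
    print(found == patch, missing)
```
 </literallayout>
 </para>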
282
283 <para>
284 If you specify a directory, the entire directory is
285 unpacked.
286 </para>
287
288 <para>
289 Here are some example URLs:
290 <literallayout class='monospaced'>
291 SRC_URI = "file://relativefile.patch"
292 SRC_URI = "file://relativefile.patch;this=ignored"
293 SRC_URI = "file:///Users/ich/very_important_software"
294 </literallayout>
295 </para>
296 </section>
297
298 <section id='cvs-fetcher'>
299 <title>CVS fetcher (<filename>cvs://</filename>)</title>
300
301 <para>
302 This submodule handles checking out files from the
303 CVS version control system.
304 You can configure it using a number of different variables:
305 <itemizedlist>
306 <listitem><para><emphasis><filename>FETCHCMD_cvs</filename>:</emphasis>
307 The name of the executable to use when running
308 the <filename>cvs</filename> command.
309 This name is usually "cvs".
310 </para></listitem>
311 <listitem><para><emphasis><filename>SRCDATE</filename>:</emphasis>
312 The date to use when fetching the CVS source code.
313 A special value of "now" causes the checkout to
314 be updated on every build.
315 </para></listitem>
316 <listitem><para><emphasis><filename>CVSDIR</filename>:</emphasis>
317 Specifies where a temporary checkout is saved.
318 The location is often <filename>DL_DIR/cvs</filename>.
319 </para></listitem>
320 <listitem><para><emphasis><filename>CVS_PROXY_HOST</filename>:</emphasis>
321 The name to use as a "proxy=" parameter to the
322 <filename>cvs</filename> command.
323 </para></listitem>
324 <listitem><para><emphasis><filename>CVS_PROXY_PORT</filename>:</emphasis>
325 The port number to use as a "proxyport=" parameter to
326 the <filename>cvs</filename> command.
327 </para></listitem>
328 </itemizedlist>
329 As well as the standard username and password URL syntax,
330 you can also configure the fetcher with various URL parameters.
331 </para>
332
333 <para>
334 The supported parameters are as follows:
335 <itemizedlist>
336 <listitem><para><emphasis>"method":</emphasis>
337 The protocol over which to communicate with the cvs server.
338 By default, this protocol is "pserver".
339 If "method" is set to "ext", BitBake examines the
340 "rsh" parameter and sets <filename>CVS_RSH</filename>.
341 You can use "dir" for local directories.
342 </para></listitem>
343 <listitem><para><emphasis>"module":</emphasis>
344 Specifies the module to check out.
345 You must supply this parameter.
346 </para></listitem>
347 <listitem><para><emphasis>"tag":</emphasis>
348 Describes which CVS TAG should be used for
349 the checkout.
350 By default, the TAG is empty.
351 </para></listitem>
352 <listitem><para><emphasis>"date":</emphasis>
353 Specifies a date.
354 If no "date" is specified, the
355 <link linkend='var-SRCDATE'><filename>SRCDATE</filename></link>
356 of the configuration is used to check out a specific date.
357 The special value of "now" causes the checkout to be
358 updated on every build.
359 </para></listitem>
360 <listitem><para><emphasis>"localdir":</emphasis>
361 Used to rename the module.
362 Effectively, you are renaming the output directory
363 to which the module is unpacked.
364 You are forcing the module into a special
365 directory relative to <filename>CVSDIR</filename>.
366 </para></listitem>
367 <listitem><para><emphasis>"rsh":</emphasis>
368 Used in conjunction with the "method" parameter.
369 </para></listitem>
370 <listitem><para><emphasis>"scmdata":</emphasis>
371 Causes the CVS metadata to be maintained in the tarball
372 the fetcher creates when set to "keep".
373 The tarball is expanded into the work directory.
374 By default, the CVS metadata is removed.
375 </para></listitem>
376 <listitem><para><emphasis>"fullpath":</emphasis>
377 Controls whether the resulting checkout is at the
378 module level, which is the default, or is at deeper
379 paths.
380 </para></listitem>
381 <listitem><para><emphasis>"norecurse":</emphasis>
382 Causes the fetcher to check out only the specified
383 directory without recursing into any subdirectories.
384 </para></listitem>
385 <listitem><para><emphasis>"port":</emphasis>
386 The port on the CVS server to which the fetcher connects.
387 </para></listitem>
388 </itemizedlist>
389 Some example URLs are as follows:
390 <literallayout class='monospaced'>
391 SRC_URI = "cvs://CVSROOT;module=mymodule;tag=some-version;method=ext"
392 SRC_URI = "cvs://CVSROOT;module=mymodule;date=20060126;localdir=usethat"
393 </literallayout>
394 </para>
395 </section>
396
397 <section id='http-ftp-fetcher'>
398 <title>HTTP/FTP wget fetcher (<filename>http://</filename>, <filename>ftp://</filename>, <filename>https://</filename>)</title>
399
400 <para>
401 This fetcher obtains files from web and FTP servers.
402 Internally, the fetcher uses the wget utility.
403 </para>
404
405 <para>
406 The executable and parameters used are specified by the
407 <filename>FETCHCMD_wget</filename> variable, which defaults
408 to sensible values.
409 The fetcher supports a parameter "downloadfilename" that
410 allows the name of the downloaded file to be specified.
411 Specifying the name of the downloaded file is useful
412 for avoiding collisions in
413 <link linkend='var-DL_DIR'><filename>DL_DIR</filename></link>
414 when dealing with multiple files that have the same name.
415 </para>
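 <para>
 The effect of "downloadfilename" can be illustrated with a small
 sketch.
 The helper <filename>local_filename</filename> and the two
 example paths are hypothetical; the sketch assumes that, without
 the parameter, the stored name is taken from the last component
 of the URL path.
 <literallayout class='monospaced'>
```python
import posixpath
from typing import Dict


def local_filename(url_path: str, params: Dict[str, str]) -> str:
    """Pick the name a download would be stored under.

    "downloadfilename", when present, overrides the name taken
    from the last component of the URL path.
    """
    return params.get("downloadfilename") or posixpath.basename(url_path)


# Two hypothetical projects that both publish "v1.0.tar.gz".
first = local_filename("/foo/archive/v1.0.tar.gz", {})
second = local_filename(
    "/bar/archive/v1.0.tar.gz", {"downloadfilename": "bar-v1.0.tar.gz"}
)
print(first, second)
```
 </literallayout>
 Without the parameter, both downloads would be stored under the
 same name in <filename>DL_DIR</filename>.
 </para>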
416
417 <para>
418 Some example URLs are as follows:
419 <literallayout class='monospaced'>
420 SRC_URI = "http://oe.handhelds.org/not_there.aac"
421 SRC_URI = "ftp://oe.handhelds.org/not_there_as_well.aac"
422 SRC_URI = "ftp://you@oe.handhelds.org/home/you/secret.plan"
423 </literallayout>
424 </para>
425 </section>
426
427 <section id='svn-fetcher'>
428 <title>Subversion (SVN) Fetcher (<filename>svn://</filename>)</title>
429
430 <para>
431 This fetcher submodule fetches code from the
432 Subversion source control system.
433 The executable used is specified by
434 <filename>FETCHCMD_svn</filename>, which defaults
435 to "svn".
436 The fetcher's temporary working directory is set
437 by <filename>SVNDIR</filename>, which is usually
438 <filename>DL_DIR/svn</filename>.
439 </para>
440
441 <para>
442 The supported parameters are as follows:
443 <itemizedlist>
444 <listitem><para><emphasis>"module":</emphasis>
445 The name of the svn module to check out.
446 You must provide this parameter.
447 You can think of this parameter as the top-level
448 directory of the repository data you want.
449 </para></listitem>
450 <listitem><para><emphasis>"protocol":</emphasis>
451 The protocol to use, which defaults to "svn".
452 Other options are "svn+ssh" and "rsh".
453 For "rsh", the "rsh" parameter is also used.
454 </para></listitem>
455 <listitem><para><emphasis>"rev":</emphasis>
456 The revision of the source code to check out.
457 </para></listitem>
458 <listitem><para><emphasis>"date":</emphasis>
459 The date of the source code to check out.
460 Checking out a specific revision is generally much safer
461 than checking out by date because revisions do not involve
462 timezones (i.e. they are much more deterministic).
463 </para></listitem>
464 <listitem><para><emphasis>"scmdata":</emphasis>
465 Causes the ".svn" directories to be available during
466 compile-time when set to "keep".
467 By default, these directories are removed.
468 </para></listitem>
469 </itemizedlist>
470 Following are two examples using svn:
471 <literallayout class='monospaced'>
472 SRC_URI = "svn://svn.oe.handhelds.org/svn;module=vip;protocol=http;rev=667"
473 SRC_URI = "svn://svn.oe.handhelds.org/svn/;module=opie;protocol=svn+ssh;date=20060126"
474 </literallayout>
475 </para>
476 </section>
477
478 <section id='git-fetcher'>
479 <title>Git Fetcher (<filename>git://</filename>)</title>
480
481 <para>
482 This fetcher submodule fetches code from the Git
483 source control system.
484 The fetcher works by creating a bare clone of the
485 remote into <filename>GITDIR</filename>, which is
486 usually <filename>DL_DIR/git</filename>.
487 This bare clone is then cloned into the work directory during the
488 unpack stage when a specific tree is checked out.
489 This is done using alternates and by reference to
490 minimize the amount of duplicate data on the disk and
491 make the unpack process fast.
492 The executable used can be set with
493 <filename>FETCHCMD_git</filename>.
494 </para>
495
496 <para>
497 This fetcher supports the following parameters:
498 <itemizedlist>
499 <listitem><para><emphasis>"protocol":</emphasis>
500 The protocol used to fetch the files.
501 The default is "git" when a hostname is set.
502 If a hostname is not set, the Git protocol is "file".
503 You can also use "http", "https", "ssh" and "rsync".
504 </para></listitem>
505 <listitem><para><emphasis>"nocheckout":</emphasis>
506 When set to "1", tells the fetcher not to check out source
507 code when unpacking.
508 Set this option for the URL where there is a custom
509 routine to check out code.
510 The default is "0".
511 </para></listitem>
512 <listitem><para><emphasis>"rebaseable":</emphasis>
513 Indicates that the upstream Git repository can be rebased.
514 You should set this parameter to "1" if
515 revisions can become detached from branches.
516 In this case, the source mirror tarball is created per
517 revision, which is less efficient.
518 Rebasing the upstream Git repository could cause the
519 current revision to disappear from the upstream repository.
520 This option reminds the fetcher to preserve the local cache
521 carefully for future use.
522 The default value for this parameter is "0".
523 </para></listitem>
524 <listitem><para><emphasis>"nobranch":</emphasis>
525 When set to "1", tells the fetcher to skip the SHA
526 validation for the branch.
527 The default is "0".
528 Set this option for the recipe that refers to
529 the commit that is valid for a tag instead of
530 the branch.
531 </para></listitem>
532 <listitem><para><emphasis>"bareclone":</emphasis>
533 Tells the fetcher to create a bare clone in the
534 destination directory without checking out a working tree.
535 Only the raw Git metadata is provided.
536 This parameter implies the "nocheckout" parameter as well.
537 </para></listitem>
538 <listitem><para><emphasis>"branch":</emphasis>
539 The branch(es) of the Git tree to clone.
540 If unset, this is assumed to be "master".
541 The number of branch parameters must match the number of
542 name parameters.
543 </para></listitem>
544 <listitem><para><emphasis>"rev":</emphasis>
545 The revision to use for the checkout.
546 The default is "master".
547 </para></listitem>
548 <listitem><para><emphasis>"tag":</emphasis>
549 Specifies a tag to use for the checkout.
550 To correctly resolve tags, BitBake must access the
551 network.
552 For that reason, tags are often not used.
553 As far as Git is concerned, the "tag" parameter behaves
554 effectively the same as the "rev" parameter.
555 </para></listitem>
556 <listitem><para><emphasis>"subpath":</emphasis>
557 Limits the checkout to a specific subpath of the tree.
558 By default, the whole tree is checked out.
559 </para></listitem>
560 <listitem><para><emphasis>"destsuffix":</emphasis>
561 The name of the path in which to place the checkout.
562 By default, the path is <filename>git/</filename>.
563 </para></listitem>
564 </itemizedlist>
565 Here are some example URLs:
566 <literallayout class='monospaced'>
567 SRC_URI = "git://git.oe.handhelds.org/git/vip.git;tag=version-1"
568 SRC_URI = "git://git.oe.handhelds.org/git/vip.git;protocol=http"
569 </literallayout>
570 </para>
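 <para>
 The rule that the number of branch values must match the number
 of name values can be sketched as follows.
 The sketch assumes that multiple values are comma-separated and
 that an unset "name" falls back to "default"; neither detail is
 stated above, so treat both as assumptions of this example.
 <filename>branches_by_name</filename> is a hypothetical helper.
 <literallayout class='monospaced'>
```python
from typing import Dict


def branches_by_name(params: Dict[str, str]) -> Dict[str, str]:
    """Pair each name with its branch.

    An unset "branch" is treated as "master", as described above.
    The "default" name is an assumption of this sketch.
    """
    branches = params.get("branch", "master").split(",")
    names = params.get("name", "default").split(",")
    if len(branches) != len(names):
        raise ValueError(
            "%d branch value(s) but %d name value(s)"
            % (len(branches), len(names))
        )
    return dict(zip(names, branches))


print(branches_by_name({}))
print(branches_by_name({"branch": "stable,next", "name": "main,extra"}))
```
 </literallayout>
 </para>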
571 </section>
572
573 <section id='other-fetchers'>
574 <title>Other Fetchers</title>
575
576 <para>
577 Fetch submodules also exist for the following:
578 <itemizedlist>
579 <listitem><para>
580 Bazaar (<filename>bzr://</filename>)
581 </para></listitem>
582 <listitem><para>
583 Perforce (<filename>p4://</filename>)
584 </para></listitem>
585 <listitem><para>
586 Git Submodules (<filename>gitsm://</filename>)
587 </para></listitem>
588 <listitem><para>
589 Trees using Git Annex (<filename>gitannex://</filename>)
590 </para></listitem>
591 <listitem><para>
592 Secure FTP (<filename>sftp://</filename>)
593 </para></listitem>
594 <listitem><para>
595 Secure Shell (<filename>ssh://</filename>)
596 </para></listitem>
597 <listitem><para>
598 Repo (<filename>repo://</filename>)
599 </para></listitem>
600 <listitem><para>
601 OSC (<filename>osc://</filename>)
602 </para></listitem>
603 <listitem><para>
604 Mercurial (<filename>hg://</filename>)
605 </para></listitem>
606 </itemizedlist>
607 No documentation currently exists for these lesser-used
608 fetcher submodules.
609 However, you might find the code helpful and readable.
610 </para>
611 </section>
612 </section>
613
614 <section id='auto-revisions'>
615 <title>Auto Revisions</title>
616
617 <para>
618 We need to document <filename>AUTOREV</filename> and
619 <filename>SRCREV_FORMAT</filename> here.
620 </para>
621 </section>
622</chapter>