From 9b3e31e5e16238a7038d3fea409121be1e0e0709 Mon Sep 17 00:00:00 2001 From: Scott Rifenbark Date: Tue, 18 Feb 2014 17:43:15 -0600 Subject: bitbake: user-manual-fetching.xml: Re-write of the Fetching chapter. Based on a Richard Purdie re-write. (Bitbake rev: fad9a6258f8c04bbe0168e46898dd27b86c39ee0) Signed-off-by: Scott Rifenbark Signed-off-by: Richard Purdie --- bitbake/doc/user-manual/user-manual-fetching.xml | 582 +++++++++++++++++------ 1 file changed, 436 insertions(+), 146 deletions(-) (limited to 'bitbake') diff --git a/bitbake/doc/user-manual/user-manual-fetching.xml b/bitbake/doc/user-manual/user-manual-fetching.xml index 846968419b..87951fd4b4 100644 --- a/bitbake/doc/user-manual/user-manual-fetching.xml +++ b/bitbake/doc/user-manual/user-manual-fetching.xml @@ -5,57 +5,112 @@ File Download Support - BitBake's fetch and - fetch2 modules support downloading - files. - This chapter provides an overview of the fetching process - and also presents sections on each of the fetchers BitBake - supports. - - The original fetch code, for all - practical purposes, has been replaced by - fetch2 code. - Consequently, the information in this chapter does not - apply to fetch. - + BitBake's fetch module is a standalone piece of library code + that deals with the intricacies of downloading source code + and files from remote systems. + Fetching source code is one of the corner stones of building software. + As such, this module forms an important part of BitBake. -
- Overview + + The current fetch module is called "fetch2" and refers to the + fact that it is the second major version of the API. + The original version is obsolete and removed from the codebase. + Thus, in all cases, "fetch" refers to "fetch2" in this + manual. + + +
+ The Download (Fetch) + + + BitBake takes several steps when fetching source code or files. + The fetcher codebase deals with two distinct processes in order: + obtaining the files from somewhere (cached or otherwise) + and then unpacking those files into a specific location and + perhaps in a specific way. + Getting and unpacking the files is often followed + by patching. + Patching, however, is not covered by the fetch module. + - When BitBake starts to execute, the very first thing - it does is to fetch the source files needed. - This section overviews the process. + The code to execute the first part of this process, a fetch, + looks something like the following: + + src_uri = (d.getVar('SRC_URI', True) or "").split() + fetcher = bb.fetch2.Fetch(src_uri, d) + fetcher.download() + + This code sets up an instance of the fetch module. + The instance uses a space-separated list of URLs from the + SRC_URI + variable and then calls the download + method to download the files. - When BitBake goes looking for source files, it follows a search - order: - + The instance of the fetch module is usually followed by: + + rootdir = d.getVar('WORKDIR', True) + fetcher.unpack(rootdir) + + This code unpacks the downloaded files to the + location specified by WORKDIR. + + For convenience, the naming in these examples matches + the variables used by OpenEmbedded. + + The SRC_URI and WORKDIR + variables are not coded into the fetcher. + The fetcher can be (and is) called with different variable names. + In OpenEmbedded, for example, the shared state (sstate) code uses + the fetch module to fetch the sstate files. + + + + When the download() method is called, + BitBake tries to fulfill the URLs by looking for source files + in a specific search order: + Pre-mirror Sites: - BitBake first uses pre-mirrors to try and find source - files. + BitBake first uses pre-mirrors to try to find source files. These locations are defined using the PREMIRRORS variable.
Source URI: - If pre-mirrors fail, BitBake uses - SRC_URI. + If pre-mirrors fail, BitBake uses the original URL (e.g. from + SRC_URI). Mirror Sites: - If fetch failures occur using SRC_URI, - BitBake next uses mirror location as defined by the + If fetch failures occur, BitBake next uses mirror locations as + defined by the MIRRORS variable. - + + + + + For each URL passed to the fetcher, the fetcher + calls the submodule that handles that particular URL type. + This behavior can be the source of some confusion when you + are providing URLs for the SRC_URI + variable. + Consider the following two URLs: + + http://git.yoctoproject.org/git/poky;protocol=git + git://git.yoctoproject.org/git/poky;protocol=http + + In the former case, the URL is passed to the + wget fetcher, which does not + understand "git". + Therefore, the latter case is the correct form since the + Git fetcher does know how to use HTTP as a transport. - Because cross-URLs are supported, it is possible to mirror - a Git repository on an HTTP server as a tarball. Here are some examples that show commonly used mirror definitions: @@ -74,19 +129,29 @@ http://.*/.* http://somemirror.org/sources/ \n \ https://.*/.* http://somemirror.org/sources/ \n" + It is useful to note that BitBake supports + cross-URLs. + It is possible to mirror a Git repository on an HTTP + server as a tarball. + This is what the git:// mapping in + the previous example does. - Any source files that are not local (i.e. downloaded from - the Internet) are placed into the download directory, - which is specified by - DL_DIR. + Since network accesses are slow, BitBake maintains a + cache of files downloaded from the network. + Any source files that are not local (i.e. + downloaded from the Internet) are placed into the download + directory, which is specified by the + DL_DIR + variable. + File integrity is of key importance for reproducing builds.
For non-local archive downloads, the fetcher code can verify - sha256 and md5 checksums to ensure - the archives have been downloaded correctly. + sha256 and md5 checksums to ensure the archives have been + downloaded correctly. You can specify these checksums by using the SRC_URI variable with the appropriate varflags as follows: @@ -97,66 +162,133 @@ You can also specify the checksums as parameters on the SRC_URI as shown below: - SRC_URI="http://example.com/foobar.tar.bz2;md5sum=4a8e0f237e961fd7785d19d07fdb994d" + SRC_URI = "http://example.com/foobar.tar.bz2;md5sum=4a8e0f237e961fd7785d19d07f +db994d" - If - BB_STRICT_CHECKSUM - is set, any download without a checksum triggers an error message. - In cases where multiple files are listed using - SRC_URI, the name parameter is used - assign names to the URLs and these are then specified - in the checksums using the following form: + If multiple URIs exist, you can specify the checksums either + directly as in the previous example, or you can name the URLs. + The following syntax shows how you name the URIs: - SRC_URI[name.sha256sum] + SRC_URI = "http://example.com/foobar.tar.bz2;name=foo" + SRC_URI[foo.md5sum] = 4a8e0f237e961fd7785d19d07fdb994d + After a file has been downloaded and has had its checksum checked, + a ".done" stamp is placed in DL_DIR. + BitBake uses this stamp during subsequent builds to avoid + downloading or comparing a checksum for the file again. + + It is assumed that local storage is safe from data corruption. + If this were not the case, there would be bigger issues to worry about. + + + + + If + BB_STRICT_CHECKSUM + is set, any download without a checksum triggers an + error message. + The + BB_NO_NETWORK + variable can be used to make any attempted network access a fatal + error, which is useful for checking that mirrors are complete + as well as other things.
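The search order and the checksum check described in this section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not BitBake's implementation: `fetch_in_order`, `try_fetch`, and the example URLs are hypothetical names, and raising on a checksum mismatch is a simplification chosen for the sketch.

```python
import hashlib


def fetch_in_order(url, premirrors, mirrors, try_fetch,
                   expected_sha256=None, strict=False):
    """Return (source, data) from the first location that yields the file.

    Hypothetical helper: try_fetch(candidate) returns bytes, or None when
    the candidate location does not have the file.
    """
    if strict and expected_sha256 is None:
        # Same idea as BB_STRICT_CHECKSUM: a missing checksum is an error.
        raise ValueError("no checksum supplied for %s" % url)

    # Pre-mirrors first, then the original URL, then the mirrors.
    for candidate in [*premirrors, url, *mirrors]:
        data = try_fetch(candidate)
        if data is None:
            continue
        if expected_sha256 is not None:
            actual = hashlib.sha256(data).hexdigest()
            if actual != expected_sha256:
                # Assumption of this sketch: a mismatch is fatal.
                raise ValueError("checksum mismatch for %s" % candidate)
        return candidate, data
    raise RuntimeError("unable to fetch %s from any location" % url)
```

For experimentation, a dictionary's `get` method can stand in for `try_fetch`, because it returns `None` for locations that do not hold the file.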
-
- Fetchers +
+ The Unpack + + + The unpack process usually immediately follows the download. + For all URLs except Git URLs, BitBake uses the common + unpack method. + - As mentioned in the previous section, the - SRC_URI is normally used to - tell BitBake which files to fetch. - And, the fetcher BitBake uses depends on the how - SRC_URI is set. + A number of parameters exist that you can specify within the + URL to govern the behavior of the unpack stage: + + unpack: + Controls whether the URL components are unpacked. + If set to "1", which is the default, the components + are unpacked. + If set to "0", the unpack stage leaves the file alone. + This parameter is useful when you want an archive to be + copied in and not be unpacked. + + dos: + Applies to .zip and + .jar files and specifies whether to + use DOS line ending conversion on text files. + + basepath: + Instructs the unpack stage to strip the specified + directories from the source path when unpacking. + + subdir: + Unpacks the specific URL to the specified subdirectory + within the root directory. + + + The unpack call automatically decompresses and extracts files + with ".Z", ".z", ".gz", ".xz", ".zip", ".jar", ".ipk", ".rpm", + ".srpm", ".deb" and ".bz2" extensions as well as various combinations + of tarball extensions. - These next few sections describe the available fetchers and - their options. - Each fetcher honors a set of variables URI parameters, - which are separated by semi-colon characters and consist - of a key and a value. - The semantics of the variables and parameters are - defined by the fetcher. - BitBake tries to have consistent semantics between the - different fetchers. + As mentioned, the Git fetcher has its own unpack method that + is optimized to work with Git trees. + Basically, this method works by cloning the tree into the final + directory. + The process is completed using references so that there is + only one central copy of the Git metadata needed. +
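The unpack parameters listed in this section are appended to the URL as semicolon-separated key=value pairs. The following is a minimal sketch of how such a URL could be split and how the "unpack" default of "1" could be applied. The helper names `split_url_params` and `should_unpack` are hypothetical, and BitBake's own URL decoding handles more cases than this.

```python
def split_url_params(url):
    """Split 'base;key=value;key=value' into (base, params dict)."""
    base, *raw_params = url.split(";")
    params = {}
    for item in raw_params:
        if not item:
            continue  # tolerate a trailing or doubled semicolon
        key, _, value = item.partition("=")
        params[key] = value
    return base, params


def should_unpack(params):
    """Apply the documented default: unpack unless 'unpack' is set to "0"."""
    return params.get("unpack", "1") != "0"
```

For a URL such as `http://example.com/foobar.tar.bz2;unpack=0;subdir=foo`, this yields the base URL plus a dictionary holding "unpack" and "subdir", and `should_unpack` reports that the archive is to be left alone.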
-
- Local file fetcher +
+ Fetchers - - The URN for the local file fetcher is file. - + + As mentioned earlier, the URL prefix determines which + fetcher submodule BitBake uses. + Each submodule can support different URL parameters, + which are described in the following sections. + + +
+ Local file fetcher (<filename>file://</filename>) - The filename can be either absolute or relative. - If the filename is relative, + This submodule handles URLs that begin with + file://. + The filename you specify within the URL can + be either an absolute or a relative path to a file. + If the filename is relative, the contents of the FILESPATH - is used. + variable are used in the same way + PATH is used to find executables. Failing that, FILESDIR is used to find the appropriate relative file. + + FILESDIR is deprecated and can + be replaced with FILESPATH. + Because FILESDIR is likely to be + removed, you should not use this variable in any new code. + + If the file cannot be found, it is assumed that it is available in + DL_DIR + by the time the download() method is called. - The metadata usually extend these variables to include - variations of the values in - OVERRIDES. - Single files and complete directories can be specified. + If you specify a directory, the entire directory is + unpacked. + + + + Here are some example URLs: SRC_URI = "file://relativefile.patch" SRC_URI = "file://relativefile.patch;this=ignored" @@ -166,36 +298,53 @@
- CVS fetcher - - - The URN for the CVS fetcher is cvs. - + CVS fetcher (<filename>cvs://</filename>) - This fetcher honors the variables CVSDIR, - SRCDATE, FETCHCOMMAND_cvs, - UPDATECOMMAND_cvs. - The - DL_DIR - variable specifies where a - temporary checkout is saved. - The - SRCDATE - variable specifies which date to - use when doing the fetching. - The special value of "now" causes the checkout to be - updated on every build. - The FETCHCOMMAND and - UPDATECOMMAND variables specify the executables - to use for the CVS checkout or update. + This submodule handles checking out files from the + CVS version control system. + You can configure it using a number of different variables: + + FETCHCMD_cvs: + The name of the executable to use when running + the cvs command. + This name is usually "cvs". + + SRCDATE: + The date to use when fetching the CVS source code. + A special value of "now" causes the checkout to + be updated on every build. + + CVSDIR: + Specifies where a temporary checkout is saved. + The location is often DL_DIR/cvs. + + CVS_PROXY_HOST: + The name to use as a "proxy=" parameter to the + cvs command. + + CVS_PROXY_PORT: + The port number to use as a "proxyport=" parameter to + the cvs command. + + + As well as the standard username and password URL syntax, + you can also configure the fetcher with various URL parameters: The supported parameters are as follows: + "method": + The protocol over which to communicate with the CVS server. + By default, this protocol is "pserver". + If "method" is set to "ext", BitBake examines the + "rsh" parameter and sets CVS_RSH. + You can use "dir" for local directories. + "module": Specifies the module to check out. + You must supply this parameter. "tag": Describes which CVS TAG should be used for @@ -210,23 +359,36 @@ The special value of "now" causes the checkout to be updated on every build. - "method": - By default pserver.
- If "method" is set to "ext", BitBake examines the "rsh" - parameter and sets CVS_RSH. - "localdir": - Used to checkout force into a special + Used to rename the module. + Effectively, you are renaming the output directory + to which the module is unpacked. + You are forcing the module into a special directory relative to CVSDIR. "rsh" Used in conjunction with the "method" parameter. "scmdata": - I need a description for this. + Causes the CVS metadata to be maintained in the tarball + the fetcher creates when set to "keep". + The tarball is expanded into the work directory. + By default, the CVS metadata is removed. + + "fullpath": + Controls whether the resulting checkout is at the + module level, which is the default, or is at deeper + paths. + + "norecurse": + Causes the fetcher to check out only the specified + directory, without recursing into any subdirectories. + + "port": + The port on which to connect to the CVS server. - Following are two examples using cvs: + Some example URLs are as follows: SRC_URI = "cvs://CVSROOT;module=mymodule;tag=some-version;method=ext" SRC_URI = "cvs://CVSROOT;module=mymodule;date=20060126;localdir=usethat" @@ -235,19 +397,27 @@
- HTTP/FTP fetcher + HTTP/FTP wget fetcher (<filename>http://</filename>, <filename>ftp://</filename>, <filename>https://</filename>) - The URNs for the HTTP/FTP fetcher are http, https, and ftp. + This fetcher obtains files from web and FTP servers. + Internally, the fetcher uses the wget utility. - This fetcher honors the variables - FETCHCOMMAND_wget. - The FETCHCOMMAND variable - contains the command used for fetching. - “${URI}” and “${FILES}” are replaced by the URI and - the base name of the file to be fetched. + The executable and parameters used are specified by the + FETCHCMD_wget variable, which defaults + to sensible values. + The fetcher supports a parameter "downloadfilename" that + allows the name of the downloaded file to be specified. + Specifying the name of the downloaded file is useful + for avoiding collisions in + DL_DIR + when dealing with multiple files that have the same name. + + + + Some example URLs are as follows: SRC_URI = "http://oe.handhelds.org/not_there.aac" SRC_URI = "ftp://oe.handhelds.org/not_there_as_well.aac" @@ -257,36 +427,46 @@
- SVN Fetcher + Subversion (SVN) Fetcher (<filename>svn://</filename>) - The URN for the SVN fetcher is svn. - - - - This fetcher honors the variables - FETCHCOMMAND_svn, - SVNDIR, - and - SRCREV. - The FETCHCOMMAND variable contains the - subversion command. - The SRCREV variable specifies which revision - to use when doing the fetching. + This fetcher submodule fetches code from the + Subversion source control system. + The executable used is specified by + FETCHCMD_svn, which defaults + to "svn". + The fetcher's temporary working directory is set + by SVNDIR, which is usually + DL_DIR/svn. The supported parameters are as follows: - "proto": - The Subversion protocol. + "module": + The name of the svn module to check out. + You must provide this parameter. + You can think of this parameter as the top-level + directory of the repository data you want. + + "protocol": + The protocol to use, which defaults to "svn". + Other options are "svn+ssh" and "rsh". + For "rsh", the "rsh" parameter is also used. "rev": - The Subversion revision. + The revision of the source code to check out. + + "date": + The date of the source code to check out. + Checking out a specific revision is generally much safer + than checking out by date because revisions do not involve timezones + (i.e. they are much more deterministic). "scmdata": - Set to "keep" causes the “.svn” directories - to be available during compile-time. + Causes the “.svn” directories to be available during + compile-time when set to "keep". + By default, these directories are removed. Following are two examples using svn: @@ -298,40 +478,150 @@
- GIT Fetcher - - - The URN for the Git Fetcher is git. - + GIT Fetcher (<filename>git://</filename>) - The variable GITDIR is used as the - base directory in which the Git tree is cloned. + This fetcher submodule fetches code from the Git + source control system. + The fetcher works by creating a bare clone of the + remote into GITDIR, which is + usually DL_DIR/git. + This bare clone is then cloned into the work directory during the + unpack stage when a specific tree is checked out. + This is done using alternates and by reference to + minimize the amount of duplicate data on the disk and + make the unpack process fast. + The executable used can be set with + FETCHCMD_git. - The supported parameters are as follows: + This fetcher supports the following parameters: - "tag": - The Git tag. - The default is "master". - "protocol": - The Git protocol. + The protocol used to fetch the files. The default is "git" when a hostname is set. If a hostname is not set, the Git protocol is "file". + You can also use "http", "https", "ssh" and "rsync". - "scmdata": - When set to “keep”, the “.git” directory is available - during compile-time. + "nocheckout": + Tells the fetcher to not checkout source code when + unpacking when set to "1". + Set this option for the URL where there is a custom + routine to checkout code. + The default is "0". + + "rebaseable": + Indicates that the upstream Git repository can be rebased. + You should set this parameter to "1" if + revisions can become detached from branches. + In this case, the source mirror tarball is done per + revision, which has a loss of efficiency. + Rebasing the upstream Git repository could cause the + current revision to disappear from the upstream repository. + This option reminds the fetcher to preserve the local cache + carefully for future use. + The default value for this parameter is "0". + + "nobranch": + Tells the fetcher to not check the SHA validation + for the branch when set to "1". + The default is "0". 
+ Set this option for the recipe that refers to + the commit that is valid for a tag instead of + the branch. + + "bareclone": + Tells the fetcher to create a bare clone in the + destination directory without checking out a working tree. + Only the raw Git metadata is provided. + This parameter implies the "nocheckout" parameter as well. + + "branch": + The branch(es) of the Git tree to clone. + If unset, this is assumed to be "master". + The number of branch parameters must match the number of + name parameters. + + "rev": + The revision to use for the checkout. + The default is "master". + + "tag": + Specifies a tag to use for the checkout. + To correctly resolve tags, BitBake must access the + network. + For that reason, tags are often not used. + As far as Git is concerned, the "tag" parameter behaves + effectively the same as the "rev" parameter. + + "subpath": + Limits the checkout to a specific subpath of the tree. + By default, the whole tree is checked out. + + "destsuffix": + The name of the path in which to place the checkout. + By default, the path is git/. - Following are two examples using git: + Here are some example URLs: SRC_URI = "git://git.oe.handhelds.org/git/vip.git;tag=version-1" SRC_URI = "git://git.oe.handhelds.org/git/vip.git;protocol=http"
+ +
+ Other Fetchers + + + Fetch submodules also exist for the following: + + + Bazaar (bzr://) + + + Perforce (p4://) + + + SVK + + + Git Submodules (gitsm://) + + + Trees using Git Annex (gitannex://) + + + Secure FTP (sftp://) + + + Secure Shell (ssh://) + + + Repo (repo://) + + + OSC (osc://) + + + Mercurial (hg://) + + + No documentation currently exists for these lesser-used + fetcher submodules. + However, you might find the code helpful and readable. + +
+
+ +
+ Auto Revisions + + + We need to document AUTOREV and + SRCREV_FORMAT here. +