<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[<!ENTITY % poky SYSTEM "../poky.ent"> %poky; ] >
<chapter id='technical-details'>
<title>Technical Details</title>
<para>
This chapter provides technical details for various parts of the Yocto Project.
Currently, topics include Yocto Project components and shared state (sstate) cache.
</para>
<section id='usingpoky-components'>
<title>Yocto Project Components</title>
<para>
The BitBake task executor, together with various types of configuration files, forms the
Yocto Project core.
This section gives an overview of the BitBake task executor and the
configuration files by describing what they are used for and how they interact.
</para>
<para>
BitBake handles the parsing and execution of the data files.
The data itself is of various types:
<itemizedlist>
<listitem><para><emphasis>Recipes:</emphasis> Provide details about particular
pieces of software.</para></listitem>
<listitem><para><emphasis>Class Data:</emphasis> An abstraction of common build
information (e.g. how to build a Linux kernel).</para></listitem>
<listitem><para><emphasis>Configuration Data:</emphasis> Defines machine-specific settings,
policy decisions, etc.
Configuration data acts as the glue to bind everything together.</para></listitem>
</itemizedlist>
For more information on data, see the
"<ulink url='&YOCTO_DOCS_DEV_URL;#yocto-project-terms'>Yocto Project Terms</ulink>"
section in the Yocto Project Development Manual.
</para>
<para>
BitBake knows how to combine multiple data sources together and refers to each data source
as a "<link linkend='usingpoky-changes-layers'>layer</link>".
</para>
<para>
Following are some brief details on these core components.
For more detailed information on these components see the
"<link linkend='ref-structure'>Reference: Directory Structure</link>" appendix.
</para>
<section id='usingpoky-components-bitbake'>
<title>BitBake</title>
<para>
BitBake is the tool at the heart of the Yocto Project and is responsible
for parsing the metadata, generating a list of tasks from it,
and then executing those tasks.
To see a list of the options BitBake supports, use the following help command:
<literallayout class='monospaced'>
$ bitbake --help
</literallayout>
</para>
<para>
The most common usage for BitBake is <filename>bitbake &lt;packagename&gt;</filename>, where
<filename>packagename</filename> is the name of the package you want to build
(referred to as the "target" in this manual).
The target often equates to the first part of a <filename>.bb</filename> filename.
So, to run the <filename>matchbox-desktop_1.2.3.bb</filename> file, you
might type the following:
<literallayout class='monospaced'>
$ bitbake matchbox-desktop
</literallayout>
Several different versions of <filename>matchbox-desktop</filename> might exist.
BitBake chooses the one selected by the distribution configuration.
You can get more details about how BitBake chooses between different
target versions and providers in the
"<link linkend='ref-bitbake-providers'>Preferences and Providers</link>" section.
</para>
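<para>
As a rough illustration of how a distribution configuration might select one of
several versions, the following line uses the <filename>PREFERRED_VERSION</filename>
variable.
The version number is an assumption carried over from the
<filename>matchbox-desktop_1.2.3.bb</filename> example above and is not taken from
an actual distribution configuration file:
<literallayout class='monospaced'>
# Illustrative only: prefer version 1.2.3 of the matchbox-desktop recipe
PREFERRED_VERSION_matchbox-desktop = "1.2.3"
</literallayout>
</para>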
<para>
BitBake also tries to execute any dependent tasks first.
So for example, before building <filename>matchbox-desktop</filename>, BitBake
would build a cross compiler and <filename>eglibc</filename> if they had not already
been built.
<note>This release of the Yocto Project does not support the <filename>glibc</filename>
GNU version of the Unix standard C library. By default, the Yocto Project builds with
<filename>eglibc</filename>.</note>
</para>
<para>
A useful BitBake option to consider is the <filename>-k</filename> or
<filename>--continue</filename> option.
This option instructs BitBake to continue processing the job for as long
as possible even after encountering an error.
When an error occurs, the target that
failed and those that depend on it cannot be built.
However, when you use this option, other dependencies can still be processed.
</para>
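<para>
As a brief illustration of the option described above, the following command
reuses the <filename>matchbox-desktop</filename> target from earlier in this section:
<literallayout class='monospaced'>
$ bitbake -k matchbox-desktop
</literallayout>
With this invocation, a failure in one task should not stop BitBake from working on
tasks that do not depend on the failed one.
</para>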
</section>
<section id='usingpoky-components-metadata'>
<title>Metadata (Recipes)</title>
<para>
The <filename>.bb</filename> files are usually referred to as "recipes."
In general, a recipe contains information about a single piece of software.
The information includes the location from which to download the source, any patches
to apply (if needed), which special configuration options to use,
how to compile the source files, and how to package the compiled output.
</para>
<para>
The term "package" can also be used to describe recipes.
However, since the same word is used for the packaged output from the Yocto
Project (i.e. <filename>.ipk</filename> or <filename>.deb</filename> files),
this document avoids using the term "package" when referring to recipes.
</para>
</section>
<section id='usingpoky-components-classes'>
<title>Classes</title>
<para>
Class files (<filename>.bbclass</filename>) contain information that is useful to share
between metadata files.
An example is the Autotools class, which contains
common settings for any application that uses Autotools.
The "<link linkend='ref-classes'>Reference: Classes</link>" appendix provides details
about common classes and how to use them.
</para>
</section>
<section id='usingpoky-components-configuration'>
<title>Configuration</title>
<para>
The configuration files (<filename>.conf</filename>) define various configuration variables
that govern the Yocto Project build process.
These files fall into several areas that define machine configuration options,
distribution configuration options, compiler tuning options, general common configuration
options and user configuration options (<filename>local.conf</filename>, which is found
in the Yocto Project build directory).
</para>
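<para>
The following fragment is a minimal sketch of the kind of user configuration that
might appear in <filename>local.conf</filename>.
The variables are standard ones, but the values are assumptions chosen for
illustration and should be adjusted to the build host and target:
<literallayout class='monospaced'>
# Target machine to build for (example value)
MACHINE ?= "qemux86"
# Number of BitBake tasks to run in parallel (example value)
BB_NUMBER_THREADS = "4"
# Options passed to make when compiling (example value)
PARALLEL_MAKE = "-j 4"
</literallayout>
</para>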
</section>
</section>
<section id="shared-state-cache">
<title>Shared State Cache</title>
<para>
By design, the Yocto Project build system builds everything from scratch unless
BitBake can determine that parts don't need to be rebuilt.
Fundamentally, building from scratch is attractive as it means all parts are
built fresh and there is no possibility of stale data causing problems.
When developers hit problems, they typically default back to building from scratch
so they know the state of things from the start.
</para>
<para>
Building an image from scratch is both an advantage and a disadvantage to the process.
As mentioned in the previous paragraph, building from scratch ensures that
everything is current and starts from a known state.
However, building from scratch also takes much longer as it generally means
rebuilding things that don't necessarily need to be rebuilt.
</para>
<para>
The Yocto Project implements shared state code that supports incremental builds.
The implementation of the shared state code answers the following questions that
were fundamental roadblocks within the Yocto Project incremental build support system:
<itemizedlist>
<listitem><para>What pieces of the system have changed and what pieces have not changed?</para></listitem>
<listitem><para>How are changed pieces of software removed and replaced?</para></listitem>
<listitem><para>How are pre-built components that don't need to be rebuilt from scratch
used when they are available?</para></listitem>
</itemizedlist>
</para>
<para>
For the first question, the build system detects changes in the "inputs" to a given task by
creating a checksum (or signature) of the task's inputs.
If the checksum changes, the system assumes the inputs have changed and the task needs to be
rerun.
For the second question, the shared state (sstate) code tracks which tasks add which output
to the build process.
This means the output from a given task can be removed, upgraded or otherwise manipulated.
The third question is partly addressed by the solution for the second question,
assuming the build system can fetch the sstate objects from remote locations and
install them if they are deemed to be valid.
</para>
<para>
The rest of this section goes into detail about the overall incremental build
architecture, the checksums (signatures), shared state, and some tips and tricks.
</para>
<section id='overall-architecture'>
<title>Overall Architecture</title>
<para>
When determining what parts of the system need to be built, BitBake
works on a per-task basis rather than a per-recipe basis.
You might wonder why a per-task basis is preferred over a per-recipe basis.
To help explain, consider having the IPK packaging backend enabled and then switching to DEB.
In this case, <filename>do_install</filename> and <filename>do_package</filename>
output are still valid.
However, with a per-recipe approach, the build would not include the
<filename>.deb</filename> files.
Consequently, you would have to invalidate the whole build and rerun it.
Rerunning everything is not the best situation.
Also in this case, the core must be "taught" much about specific tasks.
This methodology does not scale well and does not allow users to easily add new tasks
in layers or as external recipes without touching the packaged-staging core.
</para>
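<para>
The backend switch described above can be sketched as a one-line configuration change.
This sketch assumes the packaging backend is selected through the
<filename>PACKAGE_CLASSES</filename> variable in <filename>local.conf</filename>:
<literallayout class='monospaced'>
# Previous setting, which produced .ipk files:
# PACKAGE_CLASSES = "package_ipk"
# New setting, which produces .deb files:
PACKAGE_CLASSES = "package_deb"
</literallayout>
With a per-task approach, the existing <filename>do_install</filename> and
<filename>do_package</filename> output can be reused after this change, and only the
task that writes the <filename>.deb</filename> files needs to run.
</para>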
</section>
<section id='checksums'>
<title>Checksums (Signatures)</title>
<para>
The shared state code uses a checksum, which is a unique signature of a task's
inputs, to determine if a task needs to be run again.
Because it is a change in a task's inputs that triggers a rerun, the process
needs to detect all the inputs to a given task.
For shell tasks, this turns out to be fairly easy because
the build process generates a "run" shell script for each task and
it is possible to create a checksum that gives you a good idea of when
the task's data changes.
</para>
<para>
To complicate the problem, there are things that should not be included in
the checksum.
First, there is the actual specific build path of a given task -
the <filename>WORKDIR</filename>.
It does not matter if the working directory changes because it should not
affect the output for target packages.
Also, the build process has the objective of making native/cross packages relocatable.
The checksum therefore needs to exclude <filename>WORKDIR</filename>.
The simplistic approach for excluding the working directory is to set
<filename>WORKDIR</filename> to some fixed value and create the checksum
for the "run" script.
</para>
<para>
Another problem results from the "run" scripts containing functions that
might or might not get called.
The incremental build solution contains code that figures out dependencies
between shell functions.
This code is used to prune the "run" scripts down to the minimum set,
thereby alleviating this problem and making the "run" scripts much more
readable as a bonus.
</para>
<para>
So far we have solutions for shell scripts.
What about python tasks?
The same approach applies even though these tasks are more difficult.
The process needs to figure out what variables a python function accesses
and what functions it calls.
Again, the incremental build solution contains code that first figures out
the variable and function dependencies, and then creates a checksum for the data
used as the input to the task.
</para>
<para>
Like the <filename>WORKDIR</filename> case, situations exist where dependencies
should be ignored.
For these cases, you can instruct the build process to ignore a dependency
by using a line like the following:
<literallayout class='monospaced'>
PACKAGE_ARCHS[vardepsexclude] = "MACHINE"
</literallayout>
This example ensures that the <filename>PACKAGE_ARCHS</filename> variable does not
depend on the value of <filename>MACHINE</filename>, even if it does reference it.
</para>
<para>
Equally, there are cases where we need to add dependencies BitBake is not able to find.
You can accomplish this by using a line like the following:
<literallayout class='monospaced'>
PACKAGE_ARCHS[vardeps] = "MACHINE"
</literallayout>
This example explicitly adds the <filename>MACHINE</filename> variable as a
dependency for <filename>PACKAGE_ARCHS</filename>.
</para>
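<para>
The same flag can also be set on a task.
The following sketch uses a hypothetical task (<filename>do_report</filename>) and a
hypothetical variable (<filename>MY_SETTING</filename>) to show one situation in which
BitBake is unlikely to find a dependency by itself, because the variable name is
only assembled when the function runs:
<literallayout class='monospaced'>
# Hypothetical task and variable names, shown for illustration only
MY_SETTING = "value"
python do_report () {
    # The variable name is assembled at run time, so the parser
    # cannot detect the dependency on MY_SETTING by itself
    name = "MY_" + "SETTING"
    bb.note(d.getVar(name, True))
}
addtask do_report
# Tell BitBake explicitly that the task depends on MY_SETTING
do_report[vardeps] += "MY_SETTING"
</literallayout>
</para>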
<para>
Consider a case with inline python, for example, where BitBake is not
able to figure out dependencies.
When running in debug mode (i.e. using <filename>-DDD</filename>), BitBake
produces output when it discovers something for which it cannot figure out
dependencies.
The Yocto Project team has not yet managed to cover those dependencies
in detail and is aware of the need to fix this situation.
</para>
<para>
Thus far, this section has limited discussion to the direct inputs into a task.
Information based on direct inputs is referred to as the "basehash" in the
code.
However, there is still the question of a task's indirect inputs - the
things that were already built and present in the build directory.
The checksum (or signature) for a particular task needs to add the hashes
of all the tasks on which the particular task depends.
Choosing which dependencies to add is a policy decision.
However, the effect is to generate a master checksum that combines the basehash
and the hashes of the task's dependencies.
</para>
<para>
At the code level, there are a variety of ways both the basehash and the
dependent task hashes can be influenced.
Within the BitBake configuration file, we can give BitBake some extra information
to help it construct the basehash.
The following statements effectively result in a list of global variable
dependency excludes - variables never included in any checksum:
<literallayout class='monospaced'>
BB_HASHBASE_WHITELIST ?= "TMPDIR FILE PATH PWD BB_TASKHASH BBPATH"
BB_HASHBASE_WHITELIST += "DL_DIR SSTATE_DIR THISDIR FILESEXTRAPATHS"
BB_HASHBASE_WHITELIST += "FILE_DIRNAME HOME LOGNAME SHELL TERM USER"
BB_HASHBASE_WHITELIST += "FILESPATH USERNAME STAGING_DIR_HOST STAGING_DIR_TARGET"
</literallayout>
The previous example actually excludes
<link linkend='var-WORKDIR'><filename>WORKDIR</filename></link>
since it is actually constructed as a path within
<link linkend='var-TMPDIR'><filename>TMPDIR</filename></link>, which is on
the whitelist.
</para>
<para>
The rules for deciding which hashes of dependent tasks to include through
dependency chains are more complex and are generally accomplished with a
python function.
The code in <filename>meta/lib/oe/sstatesig.py</filename> shows two examples
of this and also illustrates how you can insert your own policy into the system
if so desired.
This file defines the two basic signature generators <filename>OE-Core</filename>
uses: "OEBasic" and "OEBasicHash".
By default, there is a dummy "noop" signature handler enabled in BitBake.
This means that behavior is unchanged from previous versions.
<filename>OE-Core</filename> uses the "OEBasic" signature handler by default
through this setting in the <filename>bitbake.conf</filename> file:
<literallayout class='monospaced'>
BB_SIGNATURE_HANDLER ?= "OEBasic"
</literallayout>
The "OEBasicHash" <filename>BB_SIGNATURE_HANDLER</filename> is the same as the
"OEBasic" version but adds the task hash to the stamp files.
As a result, any metadata change that changes the task hash automatically
causes the task to be run again.
This removes the need to bump <link linkend='var-PR'><filename>PR</filename></link>
values, and changes to metadata automatically ripple across the build.
Currently, this behavior is not the default behavior for <filename>OE-Core</filename>
but is the default in <filename>poky</filename>.
</para>
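<para>
For a configuration that does not already use it by default, selecting the
"OEBasicHash" handler might look like the following line.
Where the line is placed (for example, in <filename>local.conf</filename> or in a
distribution configuration file) is an assumption and depends on your setup:
<literallayout class='monospaced'>
# Opt in to the handler that also adds the task hash to the stamp files
BB_SIGNATURE_HANDLER = "OEBasicHash"
</literallayout>
</para>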
<para>
It is also worth noting that the end result of these signature generators is to
make some dependency and hash information available to the build.
This information includes:
<literallayout class='monospaced'>
BB_BASEHASH_task-&lt;taskname&gt; - the base hashes for each task in the recipe
BB_BASEHASH_&lt;filename:taskname&gt; - the base hashes for each dependent task
BBHASHDEPS_&lt;filename:taskname&gt; - the task dependencies for each task
BB_TASKHASH - the hash of the currently running task
</literallayout>
</para>
</section>
<section id='shared-state'>
<title>Shared State</title>
<para>
Checksums and dependencies, as discussed in the previous section, solve half the
problem.
The other part of the problem is being able to use checksum information during the build
and being able to reuse or rebuild specific components.
</para>
<para>
The shared state class (<filename>sstate.bbclass</filename>)
is a relatively generic implementation of how to "capture" a snapshot of a given task.
The idea is that the build process does not care about the source of a task's output.
Output could be freshly built or it could be downloaded and unpacked from
somewhere - the build process doesn't need to worry about its source.
</para>
<para>
There are two types of output.
One type is just about creating a directory
in <filename>WORKDIR</filename>.
A good example is the output of either <filename>do_install</filename> or
<filename>do_package</filename>.
The other type of output occurs when a set of data is merged into a shared directory
tree such as the sysroot.
</para>
<para>
The Yocto Project team has tried to keep the details of the implementation hidden in
<filename>sstate.bbclass</filename>.
From a user's perspective, adding shared state wrapping to a task
is as simple as this <filename>do_deploy</filename> example taken from
<filename>deploy.bbclass</filename>:
<literallayout class='monospaced'>
DEPLOYDIR = "${WORKDIR}/deploy-${PN}"
SSTATETASKS += "do_deploy"
do_deploy[sstate-name] = "deploy"
do_deploy[sstate-inputdirs] = "${DEPLOYDIR}"
do_deploy[sstate-outputdirs] = "${DEPLOY_DIR_IMAGE}"
python do_deploy_setscene () {
    sstate_setscene(d)
}
addtask do_deploy_setscene
</literallayout>
In the example, we add some extra flags to the task: a name field ("deploy"), an
input directory where the task sends data, and the output
directory where the data from the task should eventually be copied.
We also add a <filename>_setscene</filename> variant of the task and add the task
name to the <filename>SSTATETASKS</filename> list.
</para>
<para>
If you have a directory whose contents you need to preserve, you can do this with
a line like the following:
<literallayout class='monospaced'>
do_package[sstate-plaindirs] = "${PKGD} ${PKGDEST}"
</literallayout>
This method, as well as the following example, also works for multiple directories.
<literallayout class='monospaced'>
do_package[sstate-inputdirs] = "${PKGDESTWORK} ${SHLIBSWORKDIR}"
do_package[sstate-outputdirs] = "${PKGDATA_DIR} ${SHLIBSDIR}"
do_package[sstate-lockfile] = "${PACKAGELOCK}"
</literallayout>
These methods also include the ability to take a lockfile when manipulating
shared state directory structures since some cases are sensitive to file
additions or removals.
</para>
<para>
Behind the scenes, the shared state code works by looking in
<filename>SSTATE_DIR</filename> and
<filename>SSTATE_MIRRORS</filename> for shared state files.
Here is an example:
<literallayout class='monospaced'>
SSTATE_MIRRORS ?= "\
file://.* http://someserver.tld/share/sstate/ \n \
file://.* file:///some/local/dir/sstate/"
</literallayout>
</para>
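<para>
The local directory that is searched can be set in a similar way.
The following line is a sketch that assumes you want several build directories on
one machine to share the same shared state files; the path is hypothetical:
<literallayout class='monospaced'>
# Hypothetical path: keep shared state files outside any single build directory
SSTATE_DIR = "/some/shared/dir/sstate-cache"
</literallayout>
</para>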
<para>
The shared state package validity can be detected just by looking at the
filename since the filename contains the task checksum (or signature) as
described earlier in this section.
If a valid shared state package is found, the build process downloads it
and uses it to accelerate the task.
</para>
<para>
The build process uses the <filename>*_setscene</filename> tasks
for the task acceleration phase.
BitBake goes through this phase before the main execution code and tries
to accelerate any tasks for which it can find shared state packages.
If a shared state package for a task is available, the shared state
package is used.
This means the task and any tasks on which it is dependent are not
executed.
</para>
<para>
As a real world example, the aim is that when building an IPK-based image,
only the <filename>do_package_write_ipk</filename> tasks have their
shared state packages fetched and extracted.
Since the sysroot is not used, it is never extracted.
This is another reason why a task-based approach is preferred over a
recipe-based approach, which would have to install the output from every task.
</para>
</section>
<section id='tips-and-tricks'>
<title>Tips and Tricks</title>
<para>
The code in the Yocto Project that supports incremental builds is not
simple code.
This section presents some tips and tricks that help you work around
issues related to shared state code.
</para>
<section id='debugging'>
<title>Debugging</title>
<para>
When things go wrong, debugging needs to be straightforward.
Because of this, the Yocto Project team included strong debugging
tools:
<itemizedlist>
<listitem><para>Whenever a shared state package is written out, so is a
corresponding <filename>.siginfo</filename> file.
This practice results in a pickled python database of all
the metadata that went into creating the hash for a given shared state
package.</para></listitem>
<listitem><para>If BitBake is run with the <filename>--dump-signatures</filename>
(or <filename>-S</filename>) option, BitBake dumps out
<filename>.siginfo</filename> files in
the stamp directory for every task it would have executed instead of
building the specified target package.</para></listitem>
<listitem><para>There is a <filename>bitbake-diffsigs</filename> command that
can process these <filename>.siginfo</filename> files.
If one file is specified, it will dump out the dependency
information in the file.
If two files are specified, it will compare the two files and dump out
the differences between the two.
This allows the question of "What changed between X and Y?" to be
answered easily.</para></listitem>
</itemizedlist>
</para>
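<para>
A debugging session that uses the tools listed above might look like the following sketch.
The target is the <filename>matchbox-desktop</filename> example used earlier in this
chapter, and the two <filename>.siginfo</filename> file names are placeholders for files
found in the stamp directory:
<literallayout class='monospaced'>
$ bitbake -S matchbox-desktop
$ bitbake-diffsigs old.siginfo new.siginfo
</literallayout>
The exact form of the <filename>-S</filename> option can differ between BitBake releases,
so check the output of <filename>bitbake --help</filename> for the version you are using.
</para>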
</section>
<section id='invalidating-shared-state'>
<title>Invalidating Shared State</title>
<para>
The shared state code uses checksums and the shared state
cache to avoid unnecessarily rebuilding tasks.
As with all schemes, this one has some drawbacks.
It is possible that you could make implicit changes that are not factored
into the checksum calculation, but do affect a task's output.
A good example is perhaps when a tool changes its output.
Let's say that the output of <filename>rpmdeps</filename> needed to change.
The result of the change should be that all the "package", "package_write_rpm",
and "package_deploy-rpm" shared state cache items would become invalid.
But, because this is a change that is external to the code and therefore implicit,
the associated shared state cache items do not become invalidated.
In this case, the build process would use the cached items rather than running the
task again.
Obviously, these types of implicit changes can cause problems.
</para>
<para>
To avoid these problems during the build, you need to understand the effects of any
change you make.
Note that any changes you make directly to a function are automatically factored into
the checksum calculation and thus invalidate the associated area of the sstate cache.
You need to be aware of any implicit changes that are not obvious changes to the
code and could affect the output of a given task.
Once you are aware of such a change, you can take steps to invalidate the cache
and force the task to run.
The step to take is as simple as changing a function's comments in the source code.
For example, to invalidate package shared state files, change the comment statements
of <filename>do_package</filename> or the comments of one of the functions it calls.
The change is purely cosmetic, but it causes the checksum to be recalculated and
forces the task to be run again.
</para>
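<para>
The cosmetic change described above might look like the following sketch.
The function name and its body are hypothetical and are not taken from
<filename>package.bbclass</filename>; the point is that only the comment line would
differ between the old and new versions of the function:
<literallayout class='monospaced'>
python my_package_helper () {
    # Cosmetic edit to force a new checksum: rpmdeps output changed (revision 2)
    bb.note("my_package_helper is running")
}
</literallayout>
</para>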
<note>
For an example of a commit that makes a cosmetic change to invalidate
a shared state, see this
<ulink url='&YOCTO_GIT_URL;/cgit.cgi/poky/commit/meta/classes/package.bbclass?id=737f8bbb4f27b4837047cb9b4fbfe01dfde36d54'>commit</ulink>.
</note>
</section>
</section>
</section>
</chapter>
<!--
vim: expandtab tw=80 ts=4
-->