-rw-r--r-- | documentation/poky-ref-manual/technical-details.xml | 381 |
1 files changed, 378 insertions, 3 deletions
diff --git a/documentation/poky-ref-manual/technical-details.xml b/documentation/poky-ref-manual/technical-details.xml
index b34179533b..1657431495 100644
--- a/documentation/poky-ref-manual/technical-details.xml
+++ b/documentation/poky-ref-manual/technical-details.xml
@@ -151,12 +151,386 @@ | |||
151 | 151 | ||
152 | <para> | 152 | <para> |
153 | By design, the Yocto Project builds everything from scratch unless it can determine that | 153 | By design, the Yocto Project builds everything from scratch unless it can determine that |
154 | a given task's inputs have not changed. | 154 | parts don't need to be rebuilt. |
155 | While building from scratch ensures that everything is current, it does also | 155 | Fundamentally, building from scratch is an attraction as it means all parts are |
156 | mean that a lot of time could be spent rebuiding things that don't necessarily need built. | 156 | built fresh and there is no possibility of stale data causing problems. |
157 | When developers hit problems, they typically default back to building from scratch | ||
158 | so they know the state of things from the start. | ||
159 | </para> | ||
160 | |||
161 | <para> | ||
162 | Building an image from scratch is both an advantage and a disadvantage to the process. | ||
163 | As mentioned in the previous paragraph, building from scratch ensures that | ||
164 | everything is current and starts from a known state. | ||
165 | However, building from scratch also takes much longer as it generally means | ||
166 | rebuilding things that don't necessarily need to be rebuilt. | ||
167 | </para> | ||
168 | |||
169 | <para> | ||
170 | The Yocto Project implements shared state code that supports incremental builds. | ||
171 | The implementation of the shared state code answers the following questions that | ||
172 | were fundamental roadblocks within the Yocto Project incremental build support system: | ||
173 | <itemizedlist> | ||
174 | <listitem>What pieces of the system have changed and what pieces have not changed?</listitem> | ||
175 | <listitem>How are changed pieces of software removed and replaced?</listitem> | ||
176 | <listitem>How are pre-built components that don't need to be rebuilt from scratch | ||
177 | used when they are available?</listitem> | ||
178 | </itemizedlist> | ||
157 | </para> | 179 | </para> |
158 | 180 | ||
159 | <para> | 181 | <para> |
182 | For the first question, the build system detects changes in the "inputs" to a given task by | ||
183 | creating a checksum (or signature) of the task's inputs. | ||
184 | If the checksum changes, the system assumes the inputs have changed and the task needs to be | ||
185 | rerun. | ||
186 | For the second question, the shared state (sstate) code tracks which tasks add which output | ||
187 | to the build process. | ||
188 | This means the output from a given task can be removed, upgraded or otherwise manipulated. | ||
189 | The third question is partly addressed by the solution for the second question | ||
190 | assuming the build system can fetch the sstate objects from remote locations and | ||
191 | install them if they are deemed to be valid. | ||
192 | </para> | ||
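The change-detection idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration, not BitBake's code: the function names, the input dictionary, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def task_signature(inputs):
    """Return a hex digest over a task's inputs (a dict of name -> value).

    Hypothetical helper: the real build system collects its inputs differently.
    """
    h = hashlib.sha256()
    for name in sorted(inputs):  # sort so the result does not depend on dict order
        h.update(name.encode("utf-8"))
        h.update(b"\0")
        h.update(str(inputs[name]).encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()


def needs_rerun(inputs, previous_signature):
    """A task is rerun when the signature of its inputs no longer matches."""
    return task_signature(inputs) != previous_signature


old = task_signature({"CFLAGS": "-O2", "script": "make install"})
print(needs_rerun({"CFLAGS": "-O2", "script": "make install"}, old))  # False
print(needs_rerun({"CFLAGS": "-O3", "script": "make install"}, old))  # True
```

Sorting the input names keeps the signature stable regardless of the order in which the inputs were gathered.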
193 | |||
194 | <para> | ||
195 | The rest of this section goes into detail about the overall incremental build | ||
196 | architecture, the checksums (signatures), shared state, and some tips and tricks. | ||
197 | </para> | ||
198 | |||
199 | <section id='overall-architecture'> | ||
200 | <title>Overall Architecture</title> | ||
201 | |||
202 | <para> | ||
203 | When determining what parts of the system need to be built, the Yocto Project | ||
204 | uses a per-task basis and does not use a per-recipe basis. | ||
205 | You might wonder why using a per-task basis is preferred over a per-recipe basis. | ||
206 | To help explain, consider having the IPK packaging backend enabled and then switching to DEB. | ||
207 | In this case, <filename>do_install</filename> and <filename>do_package</filename> | ||
208 | output are still valid. | ||
209 | However, with a per-recipe approach, the build would not include the | ||
210 | <filename>.deb</filename> files. | ||
211 | Consequently, you would have to invalidate the whole build and rerun it. | ||
212 | Rerunning everything is not the best situation. | ||
213 | Also in this case, the core must be "taught" much about specific tasks. | ||
214 | This methodology does not scale well and does not allow users to easily add new tasks | ||
215 | in layers or as external recipes without touching the packaged-staging core. | ||
216 | </para> | ||
217 | </section> | ||
218 | |||
219 | <section id='checksums'> | ||
220 | <title>Checksums (Signatures)</title> | ||
221 | |||
222 | <para> | ||
223 | The Yocto Project uses a checksum, which is a unique signature of a task's | ||
224 | inputs, to determine if a task needs to be run again. | ||
225 | Because it is a change in a task's inputs that triggers a rerun, the process | ||
226 | needs to detect all the inputs to a given task. | ||
227 | For shell tasks, this turns out to be fairly easy because | ||
228 | the build process generates a "run" shell script for each task and | ||
229 | it is possible to create a checksum that gives you a good idea of when | ||
230 | the task's data changes. | ||
231 | </para> | ||
232 | |||
233 | <para> | ||
234 | To complicate the problem, there are things that should not be included in | ||
235 | the checksum. | ||
236 | First, there is the actual specific build path of a given task - | ||
237 | the <filename>WORKDIR</filename>. | ||
238 | It does not matter if the working directory changes because it should not | ||
239 | affect the output for target packages. | ||
240 | Also, the build process has the objective of making native/cross packages relocatable. | ||
241 | The checksum therefore needs to exclude <filename>WORKDIR</filename>. | ||
242 | The simplistic approach for excluding the working directory is to set | ||
243 | <filename>WORKDIR</filename> to some fixed value and create the checksum | ||
244 | for the "run" script. | ||
245 | </para> | ||
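The "fixed value" approach described above might look roughly like the following Python sketch. The placeholder path, the function name, and the use of SHA-256 are assumptions for illustration and do not come from the build system itself.

```python
import hashlib

# Hypothetical placeholder that stands in for the real working directory.
FIXED_WORKDIR = "/fixed/workdir"


def run_script_checksum(script_text, workdir):
    """Checksum a "run" script after replacing the build-specific WORKDIR."""
    normalized = script_text.replace(workdir, FIXED_WORKDIR)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# The same task built in two different locations yields the same checksum.
a = run_script_checksum("cd /home/alice/build/tmp/work/foo\nmake\n",
                        "/home/alice/build/tmp/work/foo")
b = run_script_checksum("cd /srv/ci/tmp/work/foo\nmake\n",
                        "/srv/ci/tmp/work/foo")
print(a == b)  # True
```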
246 | |||
247 | <para> | ||
248 | Another problem results from the "run" scripts containing functions that | ||
249 | might or might not get called. | ||
250 | The Yocto Project contains code that figures out dependencies between shell | ||
251 | functions. | ||
252 | This code is used to prune the "run" scripts down to the minimum set, | ||
253 | thereby alleviating this problem and making the "run" scripts much more | ||
254 | readable as a bonus. | ||
255 | </para> | ||
256 | |||
257 | <para> | ||
258 | So far we have solutions for shell scripts. | ||
259 | What about python tasks? | ||
260 | Handling these tasks is more difficult, but the same approach | ||
261 | applies. | ||
262 | The process needs to figure out what variables a python function accesses | ||
263 | and what functions it calls. | ||
264 | Again, the Yocto Project contains code that first figures out the variable and function | ||
265 | dependencies, and then creates a checksum for the data used as the input to | ||
266 | the task. | ||
267 | </para> | ||
268 | |||
269 | <para> | ||
270 | Like the <filename>WORKDIR</filename> case, situations exist where dependencies | ||
271 | should be ignored. | ||
272 | For these cases, you can instruct the build process to ignore a dependency | ||
273 | by using a line like the following: | ||
274 | <literallayout class='monospaced'> | ||
275 | PACKAGE_ARCHS[vardepsexclude] = "MACHINE" | ||
276 | </literallayout> | ||
277 | This example ensures that the <filename>PACKAGE_ARCHS</filename> variable does not | ||
278 | depend on the value of <filename>MACHINE</filename>, even if it does reference it. | ||
279 | </para> | ||
280 | |||
281 | <para> | ||
282 | Equally, there are cases where we need to add in dependencies | ||
283 | BitBake is not able to find. | ||
284 | You can accomplish this by using a line like the following: | ||
285 | <literallayout class='monospaced'> | ||
286 | PACKAGE_ARCHS[vardeps] = "MACHINE" | ||
287 | </literallayout> | ||
288 | This example explicitly adds the <filename>MACHINE</filename> variable as a | ||
289 | dependency for <filename>PACKAGE_ARCHS</filename>. | ||
290 | </para> | ||
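One way to picture the combined effect of the <filename>vardeps</filename> and <filename>vardepsexclude</filename> flags is as set arithmetic on the automatically detected dependencies. The Python sketch below is a simplified model under that assumption; the function name and the order in which the two flags are applied are hypothetical and are not taken from BitBake.

```python
def effective_deps(detected, vardeps=(), vardepsexclude=()):
    """Model: start from detected dependencies, add vardeps, drop vardepsexclude."""
    return (set(detected) | set(vardeps)) - set(vardepsexclude)


# Detected automatically because the variable's value references them.
detected = {"MACHINE", "TUNE_PKGARCH"}

print(sorted(effective_deps(detected, vardepsexclude=["MACHINE"])))
# ['TUNE_PKGARCH']
print(sorted(effective_deps({"TUNE_PKGARCH"}, vardeps=["MACHINE"])))
# ['MACHINE', 'TUNE_PKGARCH']
```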
291 | |||
292 | <para> | ||
293 | Consider a case with inline python, for example, where BitBake is not | ||
294 | able to figure out dependencies. | ||
295 | When running in debug mode (i.e. using <filename>-DDD</filename>), BitBake | ||
296 | produces output when it discovers something for which it cannot figure out | ||
297 | dependencies. | ||
298 | The Yocto Project team has currently not managed to cover those dependencies | ||
299 | in detail and is aware of the need to fix this situation. | ||
300 | </para> | ||
301 | |||
302 | <para> | ||
303 | Thus far, this section has limited discussion to the direct inputs into a | ||
304 | task. | ||
305 | Information based on direct inputs is referred to as the "basehash" in the code. | ||
306 | However, there is still the question of a task's indirect inputs, the things that | ||
307 | were already built and present in the build directory. | ||
308 | The checksum (or signature) for a particular task needs to add the hashes of all the | ||
309 | tasks the particular task depends upon. | ||
310 | Choosing which dependencies to add is a policy decision. | ||
311 | However, the effect is to generate a master checksum that combines the | ||
312 | basehash and the hashes of the task's dependencies. | ||
313 | </para> | ||
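As a rough illustration of how a basehash and the hashes of dependent tasks could be folded into one checksum, consider the following Python sketch. The function name, the dictionary layout, and the hash algorithm are assumptions; which dependencies are included remains the policy decision mentioned above.

```python
import hashlib


def task_hash(basehash, dep_hashes):
    """Combine a task's basehash with the hashes of the tasks it depends on.

    dep_hashes maps a dependency name (hypothetical "recipe:task" strings here)
    to that dependency's hash.
    """
    h = hashlib.sha256(basehash.encode("utf-8"))
    for dep in sorted(dep_hashes):  # fixed order keeps the result deterministic
        h.update(dep_hashes[dep].encode("utf-8"))
    return h.hexdigest()


deps = {"zlib:do_install": "1111", "gcc-cross:do_install": "2222"}
before = task_hash("abcd", deps)

# A change in any dependency's hash changes the combined checksum.
deps["zlib:do_install"] = "9999"
print(task_hash("abcd", deps) != before)  # True
```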
314 | |||
315 | <para> | ||
316 | While figuring out the dependencies and creating these checksums is good, | ||
317 | what does the Yocto Project build system do with the checksum information? | ||
318 | The build system uses a signature handler that is responsible for | ||
319 | processing the checksum information. | ||
320 | By default, there is a dummy "noop" signature handler enabled in BitBake. | ||
321 | This means that behaviour is unchanged from previous versions. | ||
322 | OECore uses the "basic" signature handler through this setting in the | ||
323 | <filename>bitbake.conf</filename> file: | ||
324 | <literallayout class='monospaced'> | ||
325 | BB_SIGNATURE_HANDLER ?= "basic" | ||
326 | </literallayout> | ||
327 | Also within the BitBake configuration file, we can give BitBake | ||
328 | some extra information to help it handle this information. | ||
329 | The following statements effectively result in a global | ||
330 | list of variable dependency excludes - variables never included in | ||
331 | any checksum: | ||
332 | <literallayout class='monospaced'> | ||
333 | BB_HASHBASE_WHITELIST ?= "TMPDIR FILE PATH PWD BB_TASKHASH BBPATH" | ||
334 | BB_HASHBASE_WHITELIST += "DL_DIR SSTATE_DIR THISDIR FILESEXTRAPATHS" | ||
335 | BB_HASHBASE_WHITELIST += "FILE_DIRNAME HOME LOGNAME SHELL TERM USER" | ||
336 | BB_HASHBASE_WHITELIST += "FILESPATH USERNAME STAGING_DIR_HOST STAGING_DIR_TARGET" | ||
337 | BB_HASHTASK_WHITELIST += "(.*-cross$|.*-native$|.*-cross-initial$| \ | ||
338 | .*-cross-intermediate$|^virtual:native:.*|^virtual:nativesdk:.*)" | ||
339 | </literallayout> | ||
340 | This example is actually where <filename>WORKDIR</filename> | ||
341 | is excluded since <filename>WORKDIR</filename> is constructed as a | ||
342 | path within <filename>TMPDIR</filename>, which is on the whitelist. | ||
343 | </para> | ||
344 | |||
345 | <para> | ||
346 | The <filename>BB_HASHTASK_WHITELIST</filename> covers dependent tasks and | ||
347 | excludes certain kinds of tasks from the dependency chains. | ||
348 | The effect of the previous example is to isolate the native, target, | ||
349 | and cross components. | ||
350 | So, for example, toolchain changes do not force a rebuild of the whole system. | ||
351 | </para> | ||
352 | |||
353 | <para> | ||
354 | The end result of the "basic" handler is to make some dependency and | ||
355 | hash information available to the build. | ||
356 | This includes: | ||
357 | <literallayout class='monospaced'> | ||
358 | BB_BASEHASH_task-<taskname> - the base hashes for each task in the recipe | ||
359 | BB_BASEHASH_<filename:taskname> - the base hashes for each dependent task | ||
360 | BBHASHDEPS_<filename:taskname> - The task dependencies for each task | ||
361 | BB_TASKHASH - the hash of the currently running task | ||
362 | </literallayout> | ||
363 | There is also a "basichash" <filename>BB_SIGNATURE_HANDLER</filename>, | ||
364 | which is the same as the basic version but adds the task hash to the stamp files. | ||
365 | As a result, any metadata change that changes the task hash | ||
366 | automatically causes the task to be run again. | ||
367 | This removes the need to bump <filename>PR</filename> | ||
368 | values, and changes to metadata automatically ripple across the build. | ||
369 | Currently, this behavior is not the default behavior. | ||
370 | However, it is likely that the Yocto Project team will go forward with this | ||
371 | behavior in the future since all the functionality exists. | ||
372 | The reason for the delay is the potential impact on distribution feed | ||
373 | creation, because feeds need increasing <filename>PR</filename> fields | ||
374 | and the Yocto Project currently lacks a mechanism to automate incrementing | ||
375 | this field. | ||
376 | </para> | ||
377 | </section> | ||
378 | |||
379 | <section id='shared-state'> | ||
380 | <title>Shared State</title> | ||
381 | |||
382 | <para> | ||
383 | Checksums and dependencies as discussed in the previous section solve half the | ||
384 | problem. | ||
385 | The other part of the problem is being able to use checksum information during the build | ||
386 | and being able to reuse or rebuild specific components. | ||
387 | </para> | ||
388 | |||
389 | <para> | ||
390 | The shared state class (<filename>sstate.bbclass</filename>) | ||
391 | is a relatively generic implementation of how to | ||
392 | "capture" a snapshot of a given task. | ||
393 | The idea is that the build process does not care about the source of a | ||
394 | task's output. | ||
395 | Output could be freshly built or it could be downloaded and unpacked from | ||
396 | somewhere - the build process doesn't need to worry about its source. | ||
397 | </para> | ||
398 | |||
399 | <para> | ||
400 | There are two types of output. One is just about creating a directory | ||
401 | in <filename>WORKDIR</filename>. | ||
402 | A good example is the output of either <filename>do_install</filename> or | ||
403 | <filename>do_package</filename>. | ||
404 | The other type of output occurs when a set of data is merged into a shared directory | ||
405 | tree such as the sysroot. | ||
406 | </para> | ||
407 | |||
408 | <para> | ||
409 | The Yocto Project team has tried to keep the details of the implementation hidden in | ||
410 | <filename>sstate.bbclass</filename>. | ||
411 | From a user's perspective, adding shared state wrapping to a task | ||
412 | is as simple as this <filename>do_deploy</filename> example taken from | ||
413 | <filename>deploy.bbclass</filename>: | ||
414 | <literallayout class='monospaced'> | ||
415 | DEPLOYDIR = "${WORKDIR}/deploy-${PN}" | ||
416 | SSTATETASKS += "do_deploy" | ||
417 | do_deploy[sstate-name] = "deploy" | ||
418 | do_deploy[sstate-inputdirs] = "${DEPLOYDIR}" | ||
419 | do_deploy[sstate-outputdirs] = "${DEPLOY_DIR_IMAGE}" | ||
420 | |||
421 | python do_deploy_setscene () { | ||
422 | sstate_setscene(d) | ||
423 | } | ||
424 | addtask do_deploy_setscene | ||
425 | </literallayout> | ||
426 | In the example, we add some extra flags to the task, a name field ("deploy"), an | ||
427 | input directory where the task sends data, and the output | ||
428 | directory where the data from the task should eventually be copied. | ||
429 | We also add a <filename>_setscene</filename> variant of the task and add the task | ||
430 | name to the <filename>SSTATETASKS</filename> list. | ||
431 | </para> | ||
432 | |||
433 | <para> | ||
434 | If you have a directory whose contents you need to preserve, | ||
435 | you can do this with a line like the following: | ||
436 | <literallayout class='monospaced'> | ||
437 | do_package[sstate-plaindirs] = "${PKGD} ${PKGDEST}" | ||
438 | </literallayout> | ||
439 | This method, as well as the following example, also works for multiple directories. | ||
440 | <literallayout class='monospaced'> | ||
441 | do_package[sstate-inputdirs] = "${PKGDESTWORK} ${SHLIBSWORKDIR}" | ||
442 | do_package[sstate-outputdirs] = "${PKGDATA_DIR} ${SHLIBSDIR}" | ||
443 | do_package[sstate-lockfile] = "${PACKAGELOCK}" | ||
444 | </literallayout> | ||
445 | These methods also include the ability to take a lockfile when manipulating | ||
446 | shared state directory structures since some cases are sensitive to file | ||
447 | additions or removals. | ||
448 | </para> | ||
449 | |||
450 | <para> | ||
451 | Behind the scenes, the shared state code works by looking in | ||
452 | <filename>SSTATE_DIR</filename> and | ||
453 | <filename>SSTATE_MIRRORS</filename> for shared state files. | ||
454 | Here is an example: | ||
455 | <literallayout class='monospaced'> | ||
456 | SSTATE_MIRRORS ?= "\ | ||
457 | file://.* http://someserver.tld/share/sstate/ \n \ | ||
458 | file://.* file:///some/local/dir/sstate/" | ||
459 | </literallayout> | ||
460 | </para> | ||
461 | |||
462 | <para> | ||
463 | The shared state package validity can be detected just by looking at the | ||
464 | filename since the filename contains the task checksum (or signature) as | ||
465 | described earlier in this section. | ||
466 | If a valid shared state package is found, the build process downloads it | ||
467 | and uses it to accelerate the task. | ||
468 | </para> | ||
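Because the checksum is part of the file name, a lookup needs to do no more than match names. The Python sketch below illustrates that idea against a local directory only. The function name and the example file name are hypothetical, and the real shared state naming scheme and mirror handling are not reproduced here.

```python
import os
import tempfile


def find_sstate_package(sstate_dir, task_hash):
    """Return the path of the first file whose name contains task_hash, or None."""
    for name in sorted(os.listdir(sstate_dir)):
        if task_hash in name:
            return os.path.join(sstate_dir, name)
    return None


with tempfile.TemporaryDirectory() as cache:
    # Hypothetical file name; only the embedded hash matters for this sketch.
    open(os.path.join(cache, "sstate-foo-abc123_deploy.tgz"), "w").close()
    print(find_sstate_package(cache, "abc123") is not None)  # True
    print(find_sstate_package(cache, "def456"))              # None
```

No file contents are read: a matching name is treated as a valid package, and a missing name means the task has to run normally.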
469 | |||
470 | <para> | ||
471 | The build process uses the <filename>*_setscene</filename> tasks | ||
472 | for the task acceleration phase. | ||
473 | BitBake goes through this phase before the main execution code and tries | ||
474 | to accelerate any tasks for which it can find shared state packages. | ||
475 | If a shared state package for a task is available, the shared state | ||
476 | package is used. | ||
477 | This means the task and any tasks on which it is dependent are not | ||
478 | executed. | ||
479 | </para> | ||
480 | |||
481 | <para> | ||
482 | As a real-world example, the aim is that, when building an IPK-based image, | ||
483 | only the <filename>do_package_write_ipk</filename> tasks would have their | ||
484 | shared state packages fetched and extracted. | ||
485 | Since the sysroot is not used, it would never get extracted. | ||
486 | This is another reason to prefer the task-based approach over a | ||
487 | recipe-based approach, which would have to install the output from every task. | ||
488 | </para> | ||
489 | </section> | ||
490 | |||
491 | <section id='tips-and-tricks'> | ||
492 | <title>Tips and Tricks</title> | ||
493 | |||
494 | <para> | ||
495 | The code in the Yocto Project that supports incremental builds is not | ||
496 | simple code. | ||
497 | Consequently, when things go wrong, debugging needs to be straightforward. | ||
498 | Because of this, the Yocto Project team included strong debugging | ||
499 | tools. | ||
500 | </para> | ||
501 | |||
502 | <para> | ||
503 | First, whenever a shared state package is written out, so is a | ||
504 | corresponding <filename>.siginfo</filename> file. | ||
505 | This practice results in a pickled python database of all | ||
506 | the metadata that went into creating the hash for a given shared state | ||
507 | package. | ||
508 | </para> | ||
509 | |||
510 | <para> | ||
511 | Second, if BitBake is run with the <filename>--dump-signatures</filename> | ||
512 | (or <filename>-S</filename>) option, BitBake dumps out | ||
513 | <filename>.siginfo</filename> files in | ||
514 | the stamp directory for every task it would have executed instead of | ||
515 | building the target package specified. | ||
516 | </para> | ||
517 | |||
518 | <para> | ||
519 | Finally, there is a <filename>bitbake-diffsigs</filename> command that | ||
520 | can process these <filename>.siginfo</filename> files. | ||
521 | If one file is specified, it will dump out the dependency | ||
522 | information in the file. | ||
523 | If two files are specified, it will compare the | ||
524 | two files and dump out the differences between the two. | ||
525 | This allows the question of "What changed between X and Y?" to be | ||
526 | answered easily. | ||
527 | </para> | ||
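The kind of comparison described above can be pictured as a diff of two dictionaries of signature inputs. The Python sketch below is not <filename>bitbake-diffsigs</filename> and does not read <filename>.siginfo</filename> files. The function name and the plain-dictionary data model are assumptions chosen to show the shape of the answer to "What changed between X and Y?".

```python
def diff_signature_data(a, b):
    """List (kind, name, old, new) tuples for every input that differs."""
    changes = []
    for key in sorted(set(a) | set(b)):
        if key not in a:
            changes.append(("added", key, None, b[key]))
        elif key not in b:
            changes.append(("removed", key, a[key], None))
        elif a[key] != b[key]:
            changes.append(("changed", key, a[key], b[key]))
    return changes


x = {"CFLAGS": "-O2", "PV": "1.0"}
y = {"CFLAGS": "-O3", "PV": "1.0", "EXTRA_OECONF": "--disable-nls"}
for change in diff_signature_data(x, y):
    print(change)
# ('changed', 'CFLAGS', '-O2', '-O3')
# ('added', 'EXTRA_OECONF', None, '--disable-nls')
```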
528 | </section> | ||
529 | </section> | ||
530 | |||
531 | <!-- | ||
532 | |||
533 | <para> | ||
160 | The Yocto Project build process uses a shared state caching scheme to avoid having to | 534 | The Yocto Project build process uses a shared state caching scheme to avoid having to |
161 | rebuild software when it is not necessary. | 535 | rebuild software when it is not necessary. |
162 | Because the build time for a Yocto image can be significant, it is helpful to try and | 536 | Because the build time for a Yocto image can be significant, it is helpful to try and |
@@ -222,6 +596,7 @@ | |||
222 | <ulink url='http://git.yoctoproject.org/cgit.cgi/poky/commit/meta/classes/package.bbclass?id=737f8bbb4f27b4837047cb9b4fbfe01dfde36d54'>commit</ulink>. | 596 | <ulink url='http://git.yoctoproject.org/cgit.cgi/poky/commit/meta/classes/package.bbclass?id=737f8bbb4f27b4837047cb9b4fbfe01dfde36d54'>commit</ulink>. |
223 | </note> | 597 | </note> |
224 | </section> | 598 | </section> |
599 | --> | ||
225 | 600 | ||
226 | </chapter> | 601 | </chapter> |
227 | <!-- | 602 | <!-- |