3.3.3. Shared State

Checksums and dependencies, as discussed in the previous section, solve half the problem of supporting a shared state. The other part of the problem is being able to use checksum information during the build and being able to reuse or rebuild specific components.

The sstate class is a relatively generic implementation of how to "capture" a snapshot of a given task. The idea is that the build process does not care about the source of a task's output. Output could be freshly built or it could be downloaded and unpacked from somewhere - the build process does not need to worry about its origin.

There are two types of output, one is just about creating a directory in WORKDIR. A good example is the output of either do_install or do_package. The other type of output occurs when a set of data is merged into a shared directory tree such as the sysroot.

The Yocto Project team has tried to keep the details of the implementation hidden in sstate class. From a user's perspective, adding shared state wrapping to a task is as simple as this do_deploy example taken from the deploy class:

     DEPLOYDIR = "${WORKDIR}/deploy-${PN}"
     SSTATETASKS += "do_deploy"
     do_deploy[sstate-inputdirs] = "${DEPLOYDIR}"
     do_deploy[sstate-outputdirs] = "${DEPLOY_DIR_IMAGE}"

     python do_deploy_setscene () {
         sstate_setscene(d)
     }
     addtask do_deploy_setscene
     do_deploy[dirs] = "${DEPLOYDIR} ${B}"

The following list explains the previous example:

Adding "do_deploy" to SSTATETASKS adds some required sstate-related processing, which is implemented in the sstate class, to before and after the do_deploy task.
The do_deploy[sstate-inputdirs] = "${DEPLOYDIR}" declares that do_deploy places its output in ${DEPLOYDIR} when run normally (i.e. when not using the sstate cache). This output becomes the input to the shared state cache.
The do_deploy[sstate-outputdirs] = "${DEPLOY_DIR_IMAGE}" line causes the contents of the shared state cache to be copied to ${DEPLOY_DIR_IMAGE}.

Note
If do_deploy is not already in the shared state cache or if its input checksum (signature) has changed from when the output was cached, the task will be run to populate the shared state cache, after which the contents of the shared state cache is copied to ${DEPLOY_DIR_IMAGE}. If do_deploy is in the shared state cache and its signature indicates that the cached output is still valid (i.e. if no relevant task inputs have changed), then the contents of the shared state cache will be copied directly to ${DEPLOY_DIR_IMAGE} by the do_deploy_setscene task instead, skipping the do_deploy task.
The following task definition is glue logic needed to make the previous settings effective:
```
     python do_deploy_setscene () {
         sstate_setscene(d)
     }
     addtask do_deploy_setscene
                        
```
sstate_setscene() takes the flags above as input and accelerates the do_deploy task through the shared state cache if possible. If the task was accelerated, sstate_setscene() returns True. Otherwise, it returns False, and the normal do_deploy task runs. For more information, see the "setscene" section in the BitBake User Manual.
The do_deploy[dirs] = "${DEPLOYDIR} ${B}" line creates ${DEPLOYDIR} and ${B} before the do_deploy task runs, and also sets the current working directory of do_deploy to ${B}. For more information, see the "Variable Flags" section in the BitBake User Manual.
Note
In cases where sstate-inputdirs and sstate-outputdirs would be the same, you can use sstate-plaindirs. For example, to preserve the ${PKGD} and ${PKGDEST} output from the do_package task, use the following:
```
     do_package[sstate-plaindirs] = "${PKGD} ${PKGDEST}"
                            
```
sstate-inputdirs and sstate-outputdirs can also be used with multiple directories. For example, the following declares PKGDESTWORK and SHLIBWORK as shared state input directories, which populates the shared state cache, and PKGDATA_DIR and SHLIBSDIR as the corresponding shared state output directories:
```
     do_package[sstate-inputdirs] = "${PKGDESTWORK} ${SHLIBSWORKDIR}"
     do_package[sstate-outputdirs] = "${PKGDATA_DIR} ${SHLIBSDIR}"
                        
```
These methods also include the ability to take a lockfile when manipulating shared state directory structures, for cases where file additions or removals are sensitive:
```
     do_package[sstate-lockfile] = "${PACKAGELOCK}"
                        
```

Behind the scenes, the shared state code works by looking in SSTATE_DIR and SSTATE_MIRRORS for shared state files. Here is an example:

     SSTATE_MIRRORS ?= "\
     file://.* http://someserver.tld/share/sstate/PATH;downloadfilename=PATH \n \
     file://.* file:///some/local/dir/sstate/PATH"

Note

The shared state directory (SSTATE_DIR) is organized into two-character subdirectories, where the subdirectory names are based on the first two characters of the hash. If the shared state directory structure for a mirror has the same structure as SSTATE_DIR, you must specify "PATH" as part of the URI to enable the build system to map to the appropriate subdirectory.

The shared state package validity can be detected just by looking at the filename since the filename contains the task checksum (or signature) as described earlier in this section. If a valid shared state package is found, the build process downloads it and uses it to accelerate the task.

The build processes use the *_setscene tasks for the task acceleration phase. BitBake goes through this phase before the main execution code and tries to accelerate any tasks for which it can find shared state packages. If a shared state package for a task is available, the shared state package is used. This means the task and any tasks on which it is dependent are not executed.

As a real world example, the aim is when building an IPK-based image, only the do_package_write_ipk tasks would have their shared state packages fetched and extracted. Since the sysroot is not used, it would never get extracted. This is another reason why a task-based approach is preferred over a recipe-based approach, which would have to install the output from every task.