The automatic packaging process in Yocto (updating with latest Yocto/OE)
========================================================================

Basics about Yocto [important]
------------------------------

The Yocto project utilizes bitbake (a metadata interpreter and task
scheduler) and metadata from OE to automate the process of building a
customized Linux distro.

If you've built a running Linux system yourself, you probably know the
basic process of doing so. Take LFS (Linux From Scratch) as an example:
it first builds a toolchain, which is then used to build the real root
file system. In LFS, building a package and installing it into the
rootfs mainly involves the following steps:

1. get the source package and needed patches
2. unpack the compressed source tarball
3. patch the source code
4. configure the source code
5. compile the source code
6. install the generated files into the root file system

With the above workflow, we can build a Linux system. However, it still
has several shortcomings.

First and most important, it lacks automation. Repeating the
'configure, make, make install' process for hundreds of packages is not
only boring but also error prone. The above six steps are common to all
packages, so there's no reason why we can't automate this process.

Second, it lacks package management. While this is OK for small systems
with only a few packages, it would really be a disaster for large
systems with hundreds of packages. The importance of package management
is so obvious that I will not detail the reason here.

Of course there are other shortcomings, such as cross compilation
complexity. I'm not going to detail them here because this document
mainly focuses on packaging automation in Yocto.

Let's look at how Yocto solves the above two problems. Before that,
let's look at some basic tasks in Yocto for building a package.

1. do_fetch
2. do_unpack
3. do_patch
4. do_configure
5. do_compile
6. do_install
7. do_package

There are other tasks, but the above seven are the most important ones.
The first six tasks correspond to the six common steps in LFS; the last
task is used to support package management.

Now, let's answer the question of how Yocto automates the building of a
package. As stated above, the Yocto project comprises two components,
bitbake and metadata. In brief, bitbake reads and interprets metadata
into tasks, resolves dependencies among tasks and runs each task in the
proper order. Tasks like do_configure are written as metadata in classes
or recipes. Task dependencies are also encoded in classes or recipes. In
this way, people can reuse others' work. Normally, users may not even
need to look into the recipes.

Yocto supports three package backends: rpm, ipk and deb. This document
only discusses rpm package management support in Yocto, as the other two
are analogous.

To make package management work in Yocto, two things are essential. One
is to build rpm packages and the other is to make the database generated
at rootfs time usable on target. Note that we're building packages and
installing them into the target root file system on our build machine,
so we have to be careful that the database generated while installing
packages can still be used on target. Please refer to
package_rpm.bbclass and rootfs_rpm.bbclass to see how we make this work.

One might argue that a package management tool is also essential.
Absolutely correct. But it's just a matter of installing the tool into
the rootfs. Nothing complex.

How to build rpm packages is a more complex task, and this document
endeavors to explain it clearly.

Basics about packaging [important]
----------------------------------

I'll first give a brief overview of the packaging process in Yocto, then
I'll explain the technical details in 'Q & A' format.

We first have to make clear the inputs and outputs of the packaging
process.
Input:  the output of the do_install task in ${D}
Output: split rpm packages in ${DEPLOY_DIR}

The process of packaging is as follows:

1) do_package
   Input is ${D}, output is ${PKGDEST}.
       D       = ${WORKDIR}/image
       PKGD    = ${WORKDIR}/package
       PKGDEST = ${WORKDIR}/packages-split
   *) set up PKGD (from D)
      *) perform_packagecopy
      *) ${PACKAGE_PREPROCESS_FUNCS}
      *) split_and_strip_files
      *) fixup_perms
   *) split PKGD into PKGDEST
      *) package_do_split_locales
      *) populate_packages
   *) process PKGDEST
      *) package_fixsymlinks
      *) package_name_hook
      *) package_do_filedeps
      *) package_do_shlibs
      *) package_do_pkgconfig
      *) read_shlibdeps
      *) package_depchains
      *) emit_pkgdata

2) do_packagedata
   -- dummy task to mark when dealing with package data is complete

3) do_package_write_rpm
   *) read_subpackage_metadata
      PKGDATA_DIR = ${WORKDIR}/pkgdata
      This function gets information from ${PKGDATA_DIR}/runtime/pkg and
      sets the corresponding variables. The ${PKGDATA_DIR}/runtime/pkg
      file is created in the do_package task by the emit_pkgdata
      function.
   *) do_package_rpm
      *) write spec file
      *) build .src.rpm packages if necessary
      *) build rpm packages

4) do_package_write
   -- dummy task to mark when all packaging is complete
   -- do_package_write[noexec] = "1"

Q1: As we can see, ${PKGD} is actually an intermediate directory which
    is basically a copy of ${D}. So why do we need this directory?

    We need this directory because we should not change the contents of
    the ${D} directory during the packaging process. In this way, the
    packaging process is separated from the installing process. We can
    safely rerun the packaging tasks without re-executing the do_install
    task.

Q2: Why do we need the pkgdata files (${WORKDIR}/pkgdata/*)?

    We need these files to store information about packages so that the
    do_package_write_rpm task is separated from the do_package task, or,
    put another way, the do_packagedata task is separated from the
    do_package_write task. This avoids unnecessary reruns of the work
    done during do_package.
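
The role of ${PKGD} described in Q1 can be illustrated with a small
Python sketch. This is a toy model under stated assumptions, not the
code in package.bbclass: the helper names copy_install_tree and
split_packages, the package names and the file patterns are all
hypothetical, and the real populate_packages function handles many more
cases. The sketch only shows that the install tree is copied first, and
that the split then works on the copy, leaving ${D} untouched.

```python
import fnmatch
import shutil
import tempfile
from pathlib import Path


def copy_install_tree(d: Path, pkgd: Path) -> None:
    """Copy D to PKGD so later steps never modify the do_install output."""
    if pkgd.exists():
        shutil.rmtree(pkgd)
    shutil.copytree(d, pkgd, symlinks=True)


def split_packages(pkgd: Path, pkgdest: Path, files: dict) -> dict:
    """Move each file in PKGD to PKGDEST/<package>.

    A file goes to the first package (in dict order) with a matching
    pattern. Returns {package: [absolute-style paths it received]}.
    """
    result = {pkg: [] for pkg in files}
    # Build the full list first, because files are moved while looping.
    candidates = sorted(p for p in pkgd.rglob("*") if p.is_file())
    for path in candidates:
        rel = "/" + path.relative_to(pkgd).as_posix()
        for pkg, patterns in files.items():
            if any(fnmatch.fnmatchcase(rel, pat) for pat in patterns):
                target = pkgdest / pkg / rel.lstrip("/")
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), str(target))
                result[pkg].append(rel)
                break
    return result


def list_files(root: Path) -> list:
    """Return the sorted relative paths of all regular files under root."""
    return sorted(p.relative_to(root).as_posix()
                  for p in root.rglob("*") if p.is_file())


def demo() -> dict:
    with tempfile.TemporaryDirectory() as workdir:
        work = Path(workdir)
        d = work / "image"                # stands in for ${D}
        pkgd = work / "package"           # stands in for ${PKGD}
        pkgdest = work / "packages-split" # stands in for ${PKGDEST}

        # Pretend do_install produced these files (hypothetical content).
        for rel in ("usr/bin/hello", "usr/include/hello.h",
                    "usr/share/doc/hello/README"):
            f = d / rel
            f.parent.mkdir(parents=True, exist_ok=True)
            f.write_text("x")

        before = list_files(d)
        copy_install_tree(d, pkgd)
        split = split_packages(pkgd, pkgdest, {
            "hello-dev": ["/usr/include/*"],
            "hello-doc": ["/usr/share/doc/*"],
            "hello": ["/usr/bin/*"],
        })
        return {"split": split, "d_untouched": before == list_files(d)}


if __name__ == "__main__":
    print(demo())
```

Because the split consumes the copy rather than the original, demo()
could call copy_install_tree and split_packages again on the same ${D}
and get the same result, which is the property Q1 relies on.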

Relevant Classes [important]
----------------------------

package.bbclass
package_rpm.bbclass
packagedata.bbclass
prserv.bbclass

Other Questions and Answers [important]
---------------------------------------

1. Why is rpm-native needed for every package backend? In
   package.bbclass, the comment states that rpm-native is needed for
   per-file dependency identification. What does that mean? Also, why is
   file-native needed for every package backend?

   rpm-native is needed for per-file dependency identification, which is
   performed by the `package_do_filedeps' function. It's also needed by
   the `split_and_strip_files' function, which uses debugedit from
   rpm-native.

   file-native is needed for the stripping process to work correctly.

2. Why must the do_packagedata task of each DEPENDS entry have completed
   before the do_package task starts?

       d.appendVarFlag('do_package', 'deptask', " do_packagedata")
       (do_package[deptask] = "do_packagedata")

   For example, if `findutils' depends on `autoconf-native', the
   do_packagedata task of autoconf-native must have completed before the
   do_package task of findutils can start.

   The shlibs code requires every DEPENDS entry to have already been
   packaged so that its *.list files are available.

3. What is the configure.sstate file under ${WORKDIR} used for?

       meta/classes/autotools.bbclass:
       CONFIGURESTAMPFILE = "${WORKDIR}/configure.sstate"

   The value stored in this file is compared with the BB_TASKHASH value
   to determine whether the build directory needs to be cleaned up.
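
The comparison described in question 3 can be sketched in a few lines of
Python. This is only an illustration of the idea, not the shell code in
autotools.bbclass: the helper names needs_clean and record_hash and the
hash strings are hypothetical, and the sketch assumes that a missing
stamp file means there is nothing stale to clean.

```python
import tempfile
from pathlib import Path


def needs_clean(stamp_file: Path, task_hash: str) -> bool:
    """Return True when a recorded hash exists and differs from task_hash."""
    if not stamp_file.exists():
        # Assumption: nothing was configured before, so nothing to clean.
        return False
    return stamp_file.read_text().strip() != task_hash


def record_hash(stamp_file: Path, task_hash: str) -> None:
    """Remember the hash the build directory was configured with."""
    stamp_file.write_text(task_hash)


def demo() -> list:
    with tempfile.TemporaryDirectory() as workdir:
        # Stands in for ${WORKDIR}/configure.sstate.
        stamp = Path(workdir) / "configure.sstate"
        results = [needs_clean(stamp, "aaa")]      # first configure run
        record_hash(stamp, "aaa")
        results.append(needs_clean(stamp, "aaa"))  # same task hash again
        results.append(needs_clean(stamp, "bbb"))  # task hash changed
        return results


if __name__ == "__main__":
    print(demo())
```

Only the third call reports that a clean is needed, since that is the
only case where a stored value exists and no longer matches the current
task hash.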