Software Management on embedded systems

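As embedded systems become more and more complex, their software reflects the increased complexity. It is vital that the software on an embedded system can be updated in a completely reliable way as new features and fixes are added.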

On a Linux-based system, we can find in most cases the following elements:

  • the boot loader.

  • the kernel and the DT (Device Tree) file.

  • the root file system

  • other file systems, mounted at a later point

  • customer data, in raw format or on a file system

  • application-specific software, for example firmware to be downloaded to connected microcontrollers.

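Generally speaking, in most cases it is required to update the kernel and the root file system while preserving user data, but cases vary.

In only a few cases is it required to update the boot loader, too. In fact, updating the boot loader is always risky, because a failed update breaks the board. Restoring a broken board is possible in some cases, but in most cases this cannot be left to the end user, and the system must be sent back to the manufacturer.

There are many different concepts for updating software. I would like to present some of them and then explain why I implemented this project.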

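Updating through the boot loader

Boot loaders do much more than simply start the kernel. They have their own shell and can be managed through one of the processor’s peripherals, in most cases a serial line. They are often scriptable, which makes it possible to implement some kind of software update mechanism.

However, I found some drawbacks in this approach that led me to look for another solution, based on an application running on Linux.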

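Boot loaders have limited access to peripherals

Not all peripherals supported by the kernel are available in the boot loader. While it makes sense to add support to the kernel, because the peripheral then becomes available to the main application, it does not always make sense to duplicate the effort by porting the driver to the boot loader.

Boot loader’s drivers are not updated

Boot loader drivers are mostly ported from the Linux kernel, but because of the adaptations they are not fixed or synchronized with the kernel later, while bug fixes flow regularly into the Linux kernel. Some peripherals can then work unreliably, and fixing the issues is not easy. Drivers in boot loaders are more or less a fork of the respective drivers in the kernel.

As an example, UBI / UBIFS for NAND devices has received a lot of fixes in the kernel that have not been ported to the boot loaders. The same can be seen with the USB stack. The effort to support new peripherals or protocols is better spent on the kernel than on the boot loaders.

Reduced file systems

The number of supported file systems is limited, and porting a file system to the boot loader requires a lot of effort.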

Network support is limited

The network stack is limited: generally, an update is possible via UDP but not via TCP.

Interaction with the operator

It is difficult to expose an interface to the operator, such as a GUI with a browser or on a display.

Complex logic is easier to implement inside an application than in the boot loader. Extending the boot loader becomes complicated because the full range of services and libraries is not available.

Boot loader’s update advantages

However, this approach has some advantages, too:

  • software for update is generally simpler.

  • smaller footprint: a stand-alone application used only for software management requires its own kernel and root file system. Even if they can be trimmed by dropping whatever is not required for updating the software, their size is not negligible.

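Updating through a package manager

All Linux distributions are updated with a package manager. Why is this not suitable for embedded systems?

I cannot say it cannot be used, but this approach has an important drawback. Embedded systems are well tested with a specific software. Using a package manager can lead to odd behavior, because the software itself is no longer atomic but split into a long list of packages. How can we be sure that an application that works with library version x.y also works with different versions of the same library? How can it be successfully tested?

For a manufacturer, it is generally better to say that a new release of the software (well tested by its test engineers) has been published and that the new software (or firmware) is available for updating. Splitting it into packages can generate a nightmare and a lot of effort for the testers.

The ease of replacing single files can speed up development, but it is a software-version nightmare at the customer site. If a customer reports a bug, how can the software still be “version 2.5” when patches for some files were previously sent to the customer?

An atomic update is generally a must-have feature for an embedded system.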

Strategies for an application doing software upgrade

Instead of using the boot loader, an application can take charge of upgrading the system. The application can use all services provided by the OS. The proposed solution is stand-alone software that follows customer rules, performs checks to determine whether a software image is installable, and then installs it on the desired storage.

The application can detect whether the provided new software is suitable for the hardware, and it can also check whether the software was released by a verified authority. The range of features can grow from a small system to a complex one, including the possibility of pre- and post-install scripts, and so on.
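
As a rough illustration of such checks, the Python sketch below accepts an image only if it targets the running board and matches an expected SHA-256 digest. The function name, its parameters, and the use of a plain digest are assumptions made for this example and are not SWUpdate’s interface. A digest only detects corruption, so checking that the software comes from a verified authority would additionally need a cryptographic signature, which is not shown here.

```python
import hashlib


def is_installable(image, expected_sha256, compatible_boards, board):
    """Return True only if the image targets this board and is intact.

    All names are hypothetical: `compatible_boards` is the set of board
    identifiers the image declares, `board` identifies the running hardware.
    """
    if board not in compatible_boards:
        return False  # image was built for different hardware
    # Integrity check only; this does NOT prove who released the image.
    return hashlib.sha256(image).hexdigest() == expected_sha256
```

An update agent would typically run checks like these before writing anything to storage, so that a rejected image leaves the target untouched.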

Different strategies can be used, depending on the system’s resources. I list some of them here.

Double copy with fall-back

If there is enough space on the storage to save two copies of the whole software, it is possible to guarantee that there is always a working copy even if the software update is interrupted or a power off occurs.

Each copy must contain the kernel, the root file system, and every further component that can be updated. A mechanism is required to identify which version is running.

SWUpdate should be inserted in the application software, and the application software will trigger it when an update is required. The duty of SWUpdate is to update the stand-by copy, leaving the running copy of the software untouched.

Cooperation with the boot loader is often necessary, because the boot loader must decide which copy to start, and it must be possible to switch between the two copies. After a reboot, the boot loader decides which copy should run.

_images/double_copy_layout.png

See the chapter about the boot loader for the mechanisms that can be implemented to guarantee that the target is not broken after an update.

The most evident drawback is the amount of required space. The available space for each copy is less than half the size of the storage. However, an update is always safe even in case of power off.

This project supports this strategy. The application provided by this project should be installed in the root file system and started or triggered as required. There is no need for a separate kernel, because the two copies guarantee that it is always possible to upgrade the copy that is not running.

SWUpdate will set a boot loader variable to signal that a new image has been successfully installed.
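
The double-copy logic described in this section can be sketched in a few lines of Python. The slot names, the `boot_slot` variable, and the `write_image` callback are hypothetical stand-ins for the real storage layout and boot loader environment; this is not SWUpdate’s code.

```python
# Hypothetical slot names; a real system would map them to partitions.
SLOTS = ("A", "B")


def standby_slot(running):
    """Return the copy that is not currently running."""
    if running not in SLOTS:
        raise ValueError(f"unknown slot: {running!r}")
    return SLOTS[1 - SLOTS.index(running)]


def update(env, write_image):
    """Install into the stand-by copy, then flip the boot target.

    `env` is a dict standing in for the boot loader environment.
    `write_image` is a callable that writes the new software to the given
    slot and raises an exception on error.
    """
    target = standby_slot(env["boot_slot"])
    write_image(target)        # the running copy is never touched
    env["boot_slot"] = target  # switch only after a successful install
    return target
```

Because the boot target is changed only after `write_image` returns, an interruption at any earlier point leaves the boot loader pointing at the copy that was running before.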

Single copy - running as standalone image

The software upgrade application consists of a kernel (possibly reduced by dropping drivers that are not required) and a small root file system containing the application and its libraries. The whole size is much less than a single copy of the system software. Depending on the setup, I get sizes from 2.5 to 8 MB for the stand-alone root file system. While this size matters on small systems, it becomes negligible on systems with a lot of storage or big NANDs.

The system can be put in “upgrade” mode simply by signaling to the boot loader that the upgrading software must be started. How this is done can differ, for example by setting a boot loader environment variable or by using an external GPIO.

The boot loader starts “SWUpdate”, booting the SWUpdate kernel with the initrd image as root file system. Because it runs in RAM, it is possible to upgrade the whole storage. Unlike in the double-copy strategy, the system must reboot to put itself in update mode.

This concept consumes less storage space than having two copies, but it does not guarantee a fall-back without updating the software again. However, it can be guaranteed that the system automatically goes into upgrade mode when the production software is not found or is corrupted, as well as when the upgrade process is interrupted for some reason.

_images/single_copy_layout.png

In fact, the upgrade procedure can be considered a transaction: only after a successful upgrade is the new software marked as “bootable”. With these considerations, an upgrade with this strategy is safe: it is always guaranteed that the system boots and is ready to receive new software if the old one is corrupted or cannot run. With U-Boot as boot loader, SWUpdate is able to manage U-Boot’s environment, setting variables to indicate the start and the end of a transaction and that the storage contains valid software. A similar feature for GRUB environment block modification as well as for EFI Boot Guard has been introduced.

SWUpdate is mainly used in this configuration. The recipes for Yocto generate an initrd image containing the SWUpdate application, which is automatically started after mounting the root file system.

_images/swupdate_single.png

Something went wrong?

Many things can go wrong, and it must be guaranteed that the system is able to run again and, if necessary, to load new software to fix a damaged image. SWUpdate works together with the boot loader to identify the possible causes of failures. Currently U-Boot, GRUB, and EFI Boot Guard are supported.

We can at least group some of the common causes:

  • damaged / corrupted image during installation. SWUpdate is able to recognize it and the update process is interrupted. The old software is preserved and nothing is copied into the target’s storage.

  • corrupted image in the storage (flash)

  • remote update interrupted due to a communication problem.

  • power-failure

SWUpdate works as a transaction process. The boot loader environment variable “recovery_status” is set to signal the update’s status to the boot loader. Of course, further variables can be added for fine-tuning and to report error causes. recovery_status can have the values “progress” or “failed”, or it can be unset.

When SWUpdate starts, it sets recovery_status to “progress”. After an update is finished with success, the variable is erased. If the update ends with an error, recovery_status has the value “failed”.

When an update is interrupted, regardless of the cause, the boot loader recognizes it because the recovery_status variable is “progress” or “failed”. The boot loader can then start SWUpdate again to reload the software (single-copy case) or run the old copy of the application (double-copy case).
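
The transaction can be sketched as follows, with a Python dict standing in for the boot loader environment. Only the variable name recovery_status and its values come from the description above; the function names and the returned labels are hypothetical.

```python
def run_update(env, install):
    """Updater side: wrap an installation in the recovery_status transaction."""
    env["recovery_status"] = "progress"   # transaction starts
    try:
        install()
    except Exception:
        env["recovery_status"] = "failed"  # update ended with an error
        return False
    env.pop("recovery_status", None)       # success: the variable is erased
    return True


def boot_target(env, double_copy):
    """Boot loader side: choose what to start based on recovery_status."""
    if env.get("recovery_status") in ("progress", "failed"):
        # The last update did not complete.
        return "old_copy" if double_copy else "swupdate"
    return "application"
```

If power is lost while `install()` is running, neither branch after it executes, so the variable keeps the value “progress” and the boot loader treats the update as interrupted at the next boot.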

Power Failure

If a power off occurs, it must be guaranteed that the system is able to work again, either by starting SWUpdate again or by restoring an old copy of the software.

Generally, the behavior can be split according to the chosen scenario:

  • single copy: SWUpdate is interrupted and the update transaction did not end with a success. The boot loader is able to start SWUpdate again, having the possibility to update the software again.

  • double copy: SWUpdate did not switch between stand-by and current copy. The same version of software, that was not touched by the update, is started again.

To be completely safe, SWUpdate and the bootloader need to exchange some information. The bootloader must detect if an update was interrupted due to a power-off, and restart SWUpdate until an update is successful. SWUpdate supports the U-Boot, GRUB, and EFI Boot Guard bootloaders. U-Boot and EFI Boot Guard have a power-safe environment which SWUpdate is able to read and change in order to communicate with them. In case of GRUB, a fixed 1024-byte environment block file is used instead. SWUpdate sets a variable as flag when it starts to update the system and resets the same variable after completion. The bootloader can read this flag to check if an update was running before a power-off.
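
To illustrate the GRUB case, the sketch below reads and writes a 1024-byte environment block in Python. It assumes the commonly documented layout produced by grub-editenv: a fixed signature line, then name=value lines, with the remainder of the block padded with '#' characters. Value escaping is not handled, extra comment lines are skipped on input but never written, and the function names are made up for this example.

```python
SIGNATURE = b"# GRUB Environment Block\n"
BLOCK_SIZE = 1024


def parse_envblk(block):
    """Return the variables stored in an environment block as a dict."""
    if len(block) != BLOCK_SIZE or not block.startswith(SIGNATURE):
        raise ValueError("not a GRUB environment block")
    variables = {}
    for line in block[len(SIGNATURE):].split(b"\n"):
        if not line or line.startswith(b"#"):  # padding or comment line
            continue
        name, sep, value = line.partition(b"=")
        if sep:
            variables[name.decode()] = value.decode()
    return variables


def build_envblk(variables):
    """Serialize variables into a block of exactly BLOCK_SIZE bytes."""
    body = b"".join(f"{k}={v}\n".encode() for k, v in variables.items())
    block = SIGNATURE + body
    if len(block) > BLOCK_SIZE:
        raise ValueError("variables do not fit in the block")
    return block.ljust(BLOCK_SIZE, b"#")  # pad the rest with '#'
```

For example, `build_envblk({"recovery_status": "progress"})` yields a block that `parse_envblk` turns back into the same dict, and the block length stays constant regardless of how many variables are set.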

_images/SoftwareUpdateU-Boot.png

What about upgrading SWUpdate itself?

SWUpdate is intended to be used throughout the whole development process, replacing custom procedures for updating the software during development. Before going into production, SWUpdate is well tested for a project.

If SWUpdate itself should be updated, the update cannot be safe if there is only one copy of SWUpdate in the storage. Safe update can be guaranteed only if SWUpdate is duplicated.

There are some ways to circumvent this issue if SWUpdate is part of the upgraded image:

  • have two copies of SWUpdate

  • take the risk, but have a rescue procedure using the boot loader.

What about upgrading the boot loader?

Updating the boot loader is in most cases a one-way process. On most SoCs, there is no possibility to have multiple copies of the boot loader, and when the boot loader is broken, the board simply does not boot.

Some SoCs allow one to have multiple copies of the boot loader. But again, there is no general solution for this because it is very hardware specific.

In my experience, most targets do not allow one to update the boot loader. It is very uncommon that the boot loader must be updated when the product is ready for production.

The situation is different if the U-Boot environment must be updated, which is common practice. U-Boot provides a double copy of the whole environment, and updating the environment from SWUpdate is power-off safe. Other boot loaders may or may not have this feature.
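
The idea behind a redundant environment can be illustrated with a simplified Python model: each copy carries a CRC32 and a counter, a save always overwrites the copy that is not active, and a reader picks the newest copy whose CRC matches. The header layout (two little-endian 32-bit fields) and all function names are assumptions chosen for clarity; this is not U-Boot’s actual on-flash format.

```python
import struct
import zlib

HEADER = struct.Struct("<II")  # hypothetical header: CRC32, counter


def pack_copy(counter, data):
    """Serialize one copy: CRC32 over the data, a counter, then the data."""
    return HEADER.pack(zlib.crc32(data), counter) + data


def unpack_copy(raw):
    """Return (counter, data) if the CRC matches, else None."""
    if len(raw) < HEADER.size:
        return None
    crc, counter = HEADER.unpack_from(raw)
    data = raw[HEADER.size:]
    return (counter, data) if zlib.crc32(data) == crc else None


def active_copy(copies):
    """Pick the valid copy with the highest counter, or None."""
    valid = [c for c in map(unpack_copy, copies) if c is not None]
    return max(valid, key=lambda c: c[0], default=None)


def save(copies, data):
    """Overwrite the copy that is not active, so the active one survives."""
    decoded = [unpack_copy(c) for c in copies]
    valid = [(c[0], i) for i, c in enumerate(decoded) if c is not None]
    if valid:
        counter, active = max(valid)
        target, counter = 1 - active, counter + 1
    else:
        target, counter = 0, 0
    copies[target] = pack_copy(counter, data)
```

If a write is torn by a power loss, the CRC of the half-written copy no longer matches, so `active_copy` falls back to the previous one. Counter wrap-around is ignored in this sketch.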