Skip to main content
Category

Working Group

ELISA Safety Analysis Approach

By Ambassadors, Blog, Working Group

Written By: Paul Albertella, ELISA Project TSC member, Chair for Open Source Engineering Process Working Group and Consultant at Codethink

The ELISA Open Source Engineering Process (OSEP) Working Group plans to document and apply a safety analysis process based on STPA that is suitable for use with Linux and other open source software.

Our objective is to specify a system context and an example set of safety goals for a safety-related system involving the Linux kernel, in order to enable the safety analysis and specification of a set of safety responsibilities that we may assign to the kernel in that context (and possibly other contexts), at a useful level of detail.

  • By system context, we mean either a concrete system design, or an abstraction representing a class of system designs.
  • By safety goals, we mean a set of system-level criteria that must be satisfied in order to avoid specific negative outcomes or consequences.
  • By safety responsibilities, we mean the behaviour or properties that are required to avoid violating the safety goals for the given system context. This may involve cooperation with other safety mechanisms, which are required to operate when it is not possible to avoid violating a high level safety goal.

In ISO 26262 terminology, this is equivalent to defining the assumptions of use (AoU) for Linux (or any FOSS component, or integration of components) as a Safety Element out of Context (SEooC).

You can find more information about this approach and its intended role as part of a wider engineering process in the Refining the RAFIA approach talk from last year’s ELISA Spring Workshop.

Our purpose with undertaking this kind of analysis in ELISA is to describe and provide examples of a method for identifying and documenting the risks associated with using Linux in a given context, and to examine how its existing features may be used to help to identify, control and/or mitigate these risks. The results of this analysis may then be used to derive the safety requirements that should apply for a system using Linux in such a context

For example, certain Linux configurations may help to address some of the risks that we identify, while others – including default kernel configurations – may introduce additional and possibly unacceptable risks (e.g performance optimisations that may have unintended or unpredictable consequences). Some mitigations for these risks may be outside the scope of Linux, involving AoUs on applications using Linux or on other components integrated with Linux as part of an operating system.

The outputs of this process are expected to be:

  • A description of the system context and safety goals that we have assumed for the purpose of the analysis
  • A specification of the risks that are considered in the analysis, and any exclusions from scope
  • Safety responsibilities for both Linux and the other system components identified in the system context, and specific requirements relating to these
  • Specific scenarios that might lead to these safety requirements being violated, which can be used to derive test cases and fault injections, and to identify where additional mitigations, safety mechanisms and/or requirements are needed to deal with these scenarios
  • Documents capturing the results of Linux feature analysis (or analysis of other interacting components) that was undertaken as part of the investigation

The OSEP group plans to start by applying this approach to the Automotive working group’s Telltale use case, documenting and refining the process as well as recording the results of the analysis.

We have invited other ELISA working groups and contributors to consider the following:

  • What other system contexts and safety goals should we consider for analysis?
  • What specific Linux features or properties should we focus on in our analyses?
  • How might ELISA working groups collaborate in applying, refining and documenting this process and its results?
  • Are there any external communities with which we might collaborate?

We would also welcome input from other open source contributors and communities interested in functional safety.

If you would like to get involved or learn more about this approach, please join the OSEP mailing list, where you can also find details of our meetings and how to participate.

What happens when OpenAPS commands run on Linux?

By Blog, Technical Update, Working Group

Written by Shuah Khan, Linux Fellow at the Linux Foundation and member of the ELISA Project TSC

Key Points

  • Understanding system resources necessary to build and run a workload is important.
  • Linux tracing and strace can be used to discover the system resources in use by a workload.
  • Tracing OpenAPS commands with strace generated detailed view of system activity during the common run and helped with generating flowcharts for normal and error paths when commands detected device busy conditions.
  • Once we discover and understand the workload needs, we can focus on them to avoid regressions and evaluate safety.

OpenAPS is an open source Artificial Pancreas System designed to automatically adjust an insulin pump’s insulin delivery to keep Blood Glucose in a safe range at all times. It is an open and transparent effort to make safe and effective basic Automatic Pancreas System technology widely available to anyone with compatible medical devices who is willing to build their own system.

Broadly speaking, the OpenAPS system can be thought of performing 3 main functions. Monitoring the environment and operational status of devices with as much data relevant to therapy as possible collected, predicting what should happen to glucose levels next, and enacting changes through issuing commands, emails and even phone calls.

The ELISA Medical Devices Working Group has set out to discover the Linux kernel subsystems used by OpenAPS. Understanding the kernel footprint necessary to run a workload helps us focus on the  subsystem and modules that make up the footprint for safety. We set out to answer the following questions:

  • What happens when an OpenAPS workload runs on Linux and discovers the subsystems and modules that are in active use when OpenAPS is running?
  • What are the interactions between OpenAPS and the kernel when a user checks how much insulin is left in the insulin pump?

Fine grained view of system activity

As mentioned earlier, the approach we gathered higher level of information about the OpenAPS usage. This higher information doesn’t tell us the system usage by individual OpenAPS commands. For example, watching the brain activity during the entire dinner vs. isolating the activity as we take the first bite of a delicious dessert and enjoy it.

Following up on our previous work, we gathered the fine grained information about individual OpenAPS commands and important use-cases. We used the strace command to trace the OpenAPS commands based on the process we identified to trace a generic workload which is now in the Linux kernel user’s and administrator’s guide. We traced several OpenAPS commands on an OpenAPS instance running on RasPi managing a Medtronic Insulin Pump. “mdt” in the following text refers to the command provided by https://github.com/ecc1/medtronic:

  • Get Insulin Pump model (mdt model)
  • Get Insulin Pump time (mdt clock)
  • Get Insulin Pump battery (mdt battery)
  • Get Insulin Pump basal rate schedule (mdt basal)
  • Suspend Insulin Pump (mdt suspend)
  • Resume Insulin Pump (mdt resume)
  • Get the remaining insulin in the Insulin Pump reservoir (mdt reservoir)
  • Get Insulin Pump temporary basal rate (mdt tempbasal)
  • Get Insulin Pump history (pumphistory)
    • Reference: https://github.com/ecc1/medtronic/tree/main/cmd/pumphistory
  • Send button press to the Insulin Pump  (mdt button)
  • Command the Insulin Pump to deliver a given amount of insulin as bolus  (mdt bolus)
    • Reference: https://github.com/ecc1/medtronic/blob/main/button.go
  • Set Insulin Pump temporary basal rate (mdt settempbasal)
    • Reference: Advanced OpenAPS features: https://openaps.readthedocs.io/en/latest/docs/Customize-Iterate/oref1.html

We ran these commands in normal mode and under strace to get summary (strace -c ) and complete trace (strace) information. The following shows a few selected commands, their output, trace information, and our analysis of the trace.

Running the command to get  the remaining insulin in the reservoir

mdt reservoir &> reservoir.out      # Get the remaining insulin in the reservoir 

strace -c mdt reservoir &> reservoir.summary  # trace summary

strace mdt reservoir &> reservoir.full  # full trace

Output from the get the remaining insulin in the reservoir command

Run the get the reservoir status command

connected to CC111x radio on /dev/spidev0.0

setting frequency to 916.600

waking pump

model 722 pump

108.5

disconnecting CC111x radio on /dev/spidev0.0

Process startup (process mgmt)

Open files (fs -> driver sysfs)

Open pump device “/dev/spidev0.0”

Open “/sys/class/gpio/gpio4/active_low”

Open “/sys/class/gpio/gpio4/direction”

Open “/sys/class/gpio/gpio4/value”

Open “/sys/class/gpio/gpio4/value”

Send SPI_IOC_MESSAGE(s) to talk to the pump (driver ioctl)

Close files (fs -> driver sysfs)

Exit program  (process mgmt)

This output shows that the Pump wakeup path was invoked.

mdt reservoir command flowchart

Complete strace -c output

2022/10/12 11:50:59 connected to CC111x radio on /dev/spidev0.0

2022/10/12 11:50:59 setting frequency to 916.600

2022/10/12 11:51:00 model 722 pump

108.5

2022/10/12 11:51:00 disconnecting CC111x radio on /dev/spidev0.0

% time     seconds  usecs/call     calls    errors syscall

—— ———– ———– ——— ——— —————-

 38.55    0.012979         270        48           ioctl

 31.50    0.010605          87       121         1 futex

  9.90    0.003333          27       120           rt_sigaction

  3.86    0.001300          54        24           clock_gettime

  3.04    0.001022         170         6           write

  1.95    0.000658         658         1           readlinkat

  1.73    0.000584          26        22           mmap2

  1.66    0.000558         111         5           read

  1.63    0.000548          49        11           openat

  1.61    0.000543         181         3           clone

  1.55    0.000521          47        11           fcntl

  1.08    0.000362          40         9           rt_sigprocmask

  0.77    0.000259          25        10           close

  0.72    0.000243          24        10           mprotect

  0.26    0.000088          22         4           brk

  0.14    0.000046          23         2           sigaltstack

  0.06    0.000020          20         1           gettid

  0.00    0.000000           0         1           execve

  0.00    0.000000           0         1           getpid

  0.00    0.000000           0         1           access

  0.00    0.000000           0         1           readlink

  0.00    0.000000           0         2           munmap

  0.00    0.000000           0         1           uname

  0.00    0.000000           0         1           flock

  0.00    0.000000           0         1           ugetrlimit

  0.00    0.000000           0         5           fstat64

  0.00    0.000000           0         1           sched_getaffinity

  0.00    0.000000           0         8           epoll_ctl

  0.00    0.000000           0         1           set_tid_address

  0.00    0.000000           0         2           fstatat64

  0.00    0.000000           0         1           set_robust_list

  0.00    0.000000           0         1           epoll_create1

  0.00    0.000000           0         1           set_tls

—— ———– ———– ——— ——— —————-

100.00    0.033669                   437         1 total

Get temp basal rate

mdt tempbasal &> tempbasal.out # Get temp basal rate

strace -c mdt tempbasal &> tempbasal.summary # trace summary

strace mdt tempbasal &> tempbasal.full # full trace

Output from the get temp basal rate command

Run get basal temperature command

connected to CC111x radio on /dev/spidev0.0

setting frequency to 916.600

model 722 pump

“duration”: 28,

  “temp”: “absolute”,

  “rate”: 0.9

}

disconnecting CC111x radio on /dev/spidev0.0

Process startup (process mgmt)

Open files (fs -> driver sysfs)

Open pump device “/dev/spidev0.0”

Open “/sys/class/gpio/gpio4/active_low”

Open “/sys/class/gpio/gpio4/direction”

Open “/sys/class/gpio/gpio4/value”

Open “/sys/class/gpio/gpio4/value”

Sends SPI_IOC_MESSAGE(s) to talk to the pump (driver ioctl SPI_IOC_MESSAGE)

Close files (fs -> driver sysfs)

Exit program  (process mgmt)

mdt tempbasal command flowchart

Complete strace -c output

2022/10/12 11:51:06 connected to CC111x radio on /dev/spidev0.0

2022/10/12 11:51:06 setting frequency to 916.600

2022/10/12 11:51:06 model 722 pump

{

  “duration”: 28,

  “temp”: “absolute”,

  “rate”: 0.9

}

2022/10/12 11:51:06 disconnecting CC111x radio on /dev/spidev0.0

% time     seconds  usecs/call     calls    errors syscall

—— ———– ———– ——— ——— —————-

 44.07    0.015195         316        48           ioctl

 31.19    0.010755          93       115         1 futex

  8.35    0.002879          23       120           rt_sigaction

  2.43    0.000837          34        24           clock_gettime

  2.10    0.000723          65        11           openat

  2.10    0.000723         723         1           readlinkat

  1.90    0.000655         109         6           write

  1.46    0.000503         167         3           clone

  1.05    0.000362          72         5           read

  0.95    0.000326          32        10           close

  0.92    0.000317          28        11           fcntl

  0.80    0.000277          34         8           epoll_ctl

  0.67    0.000232          10        22           mmap2

  0.62    0.000215          23         9           rt_sigprocmask

  0.52    0.000181          18        10           mprotect

  0.41    0.000140          70         2           fstatat64

  0.27    0.000094          23         4           brk

  0.11    0.000039          39         1           flock

  0.08    0.000029          29         1           epoll_create1

  0.00    0.000000           0         1           execve

  0.00    0.000000           0         1           getpid

  0.00    0.000000           0         1           access

  0.00    0.000000           0         1           readlink

  0.00    0.000000           0         2           munmap

  0.00    0.000000           0         1           uname

  0.00    0.000000           0         2           sigaltstack

  0.00    0.000000           0         1           ugetrlimit

  0.00    0.000000           0         5           fstat64

  0.00    0.000000           0         1           gettid

  0.00    0.000000           0         1           sched_getaffinity

  0.00    0.000000           0         1           set_tid_address

  0.00    0.000000           0         1           set_robust_list

  0.00    0.000000           0         1           set_tls

—— ———– ———– ——— ——— —————-

100.00    0.034482                   431         1 total

Get insulin pump history

pumphistory -n 1 &> pumphistory.out # Get insulin pump history

strace -c pumphistory -n 1 &> pumphistory.summary # trace summary

strace pumphistory -n 1 &> pumphistory.full # full trace

Output from the get pumphistory command

Run pumphistory command

retrieving pump history since 2022-10-12 10:54:03

cannot connect to CC111x radio on /dev/spidev0.0

null

/dev/spidev0.0: device is in use

Process startup (process mgmt)

Open files (fs -> driver sysfs)

Open pump device “/dev/spidev0.0”

Open “/sys/class/gpio/gpio4/active_low”

Open “/sys/class/gpio/gpio4/direction”

Close files (fs -> driver sysfs)

Exit program  (process mgmt)

This output shows that the pump busy path is invoked

Pump history command flowchart

Complete strace -c output

2022/10/12 11:54:03 retrieving pump history since 2022-10-12 10:54:03

2022/10/12 11:54:03 cannot connect to CC111x radio on /dev/spidev0.0

null

2022/10/12 11:54:03 /dev/spidev0.0: device is in use

% time     seconds  usecs/call     calls    errors syscall

—— ———– ———– ——— ——— —————-

 40.92    0.003106          25       120           rt_sigaction

 15.91    0.001208         109        11           futex

  8.09    0.000614         204         3           clone

  7.84    0.000595          59        10           mprotect

  7.52    0.000571         190         3           fcntl

  5.23    0.000397          44         9           rt_sigprocmask

  5.20    0.000395          17        22           mmap2

  4.23    0.000321         321         1           readlinkat

  3.77    0.000286          13        21           clock_gettime

  1.29    0.000098          24         4           brk

  0.00    0.000000           0         5           read

  0.00    0.000000           0         4           write

  0.00    0.000000           0         7           close

  0.00    0.000000           0         1           execve

  0.00    0.000000           0         1           getpid

  0.00    0.000000           0         1           access

  0.00    0.000000           0         1           readlink

  0.00    0.000000           0         2           munmap

  0.00    0.000000           0         1           uname

  0.00    0.000000           0         1         1 flock

  0.00    0.000000           0         2           sigaltstack

  0.00    0.000000           0         1           ugetrlimit

  0.00    0.000000           0         5           fstat64

  0.00    0.000000           0         1           gettid

  0.00    0.000000           0         1           sched_getaffinity

  0.00    0.000000           0         1           set_tid_address

  0.00    0.000000           0         7           openat

  0.00    0.000000           0         1           set_robust_list

  0.00    0.000000           0         1           set_tls

—— ———– ———– ——— ——— —————-

100.00    0.007591                   248         1 total

The following diagram shows the run-time context for these commands and their mapping to the Linux subsystems used by them.

System view

The following system view was updated after high level tracing analysis. This system view remains unchanged after the fine grained tracing analysis as expected.

References

https://github.com/ecc1/medtronic/blob/main/button.go

Conclusion

Using strace to trace OpenAPS commands helped us understand the detailed view of system files opened and closed while the commands run. We were able to generate flowcharts for normal and error paths when commands detected device busy conditions. As mentioned earlier, this tracing method gave us insight into the parts of the kernel used by the individual OpenAPS commands.

Credits

The ELISA Medical Working Group would like to sincerely acknowledge Chance Harrison for running the OpenAPS commands and providing information critical for making this effort a successful one.

SPDX-License-Identifier: CC-BY-4.0

This document is released under the Creative Commons Attribution 4.0 International License, available at https://creativecommons.org/licenses/by/4.0/legalcode. Pursuant to Section 5 of the license, please note that the following disclaimers apply (capitalized terms have the meanings set forth in the license). To the extent possible, the Licensor offers the Licensed Material as-is and as-available, and makes no representations or warranties of any kind concerning the Licensed Material, whether express, implied, statutory, or other. This includes, without limitation, warranties of title, merchantability, fitness for a particular purpose, non-infringement, absence of latent or other defects, accuracy, or the presence or absence of errors, whether or not known or discoverable. Where disclaimers of warranties are not allowed in full or in part, this disclaimer may not apply to You.

To the extent possible, in no event will the Licensor be liable to You on any legal theory (including, without limitation, negligence) or otherwise for any direct, special, indirect, incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or damages arising out of this Public License or use of the Licensed Material, even if the Licensor has been advised of the possibility of such losses, costs, expenses, or damages. Where a limitation of liability is not allowed in full or in part, this limitation may not apply to You.

The disclaimer of warranties and limitation of liability provided above shall be interpreted in a manner that, to the extent possible, most closely approximates an absolute disclaimer and waiver of all liability.

ELISA CI enablement – Automation tools for easier collaboration

By Ambassadors, Blog, Working Group

Written by Philipp Ahmann, Chair of the ELISA Project TSC (Robert Bosch GmbH) and Sudip Mukherjee, Member of the Tools Working Group (Codethink Ltd)

This article describes how the ELISA Project has enabled Continuous Integration (CI), using the Automotive working group use case as an example. The key goals are to:

  • Make it easier for others to onboard the project 
  • Experience deliverables from the various working groups 
  • Make the work reproducible and more dependable

Additionally, it describes which elements are part of the pipeline along with the tools involved in the creation and testing of the release images. A remarkable element is that the complete pipeline is coupled to the documented development flow. This means that the CI reproduces the steps a developer would do on a PC.

In this way, it approaches people who are new to safety as well as those which are new to Linux. It is a collaborative approach of multiple working groups within ELISA and should foster more collaboration in upcoming ELISA activities.

Motivational factors that enable the CI

In a collaborative project, people of different interests and technical backgrounds unite to work together. After a successful onboarding process, it is important that contributions and work results become visible to the wider community. They need to be properly basellined and distributed fastly to stakeholders. 

A high degree of automation increases the dependability of the development process and reduces failures introduced by human slip. It also gives more time back to developers as they do not have to do everything from scratch.

The ELISA project also relies on external sources for support – for example – the Linux Foundation sister projects Automotive Grade Linux (AGL) and the Yocto Project.  The CI guarantees the rebuild whenever updated packages from AGL or Yocto are provided.

In summary, one can say that these demands drive the enablement of the CI within ELISA Automotive WG and put a set of requirements to it:

  • A entry point for stakeholder with different (technical) background and experience
  • A respect of different stakeholder interests in the work products
  • Faster ramp-up for new project contributors 
  • Better and faster  integration and distribution of new features and improvements
  • Automated and dependable software product creation
  • Extendable testing to validate concepts and assumptions.

With the successful setup of the prototype for the automotive use case, the goal is to scale towards more complex architectures as tackled within the ELISA Systems WG. The basic approach and tools will remain the same in the extension towards new use cases. 

The elements currently involved in the CI include:

All elements are coupled and tied together to build a dependency chain, which makes it easier to automatically detect failures, caused by changes along the elements of the pipeline.

The CI elements and flow within ELISA

The ELISA Project uses GitLab to run the CI pipeline, which interacts with an ELISA hosted build server (which is also used for other purposes as running tools for code improvement). The build sources are taken from Automotive Grade Linux Gerrit, Yocto Project repositories as well as the ELISA GitHub repositories.

Currently, the generated build artifacts are the kernel and root filesystem for QEMU x86_64 image. It is planned to extend these to additional images for physical hardware in the future.

While the CI is running in GitLab, ELISA is hosting source code and documentation in GitHub. This is the starting point of the CI flow. 

The build is triggered on a regular basis once per day to provide a fresh image and check the functionality of the generated image. Due to the usage of cached build artifacts, the longer build times of a full yocto build are reduced to a minimum (in the range of 15 minutes). The generated images are only kept for one day, so that a new download will always have the latest changes included. This further supports the idea of continuous integration. The latest image can be downloaded from the GitLab project packages

Also a manual trigger of the CI can be done by referencing a GitHub branch of meta-elisa from a PR. In this way a pre-check from changes is possible before formal images are released from the master. It further reduces eventual regressions. To avoid misuse, access to trigger new builds from referenced GitHub forks is restricted to ELISA working group members.

The strong bound of build server to documentation also makes sure that the docker file for manual build is used to create a docker image, which is then used to create the build artifacts for the QEMU image. Also, this is newly created on a regular basis, to make sure that documentation and tool updates will always result in a working image using the latest documentation.

This is needed to achieve the different entry points for the stakeholders who want to try ELISA artifacts or contribute to the project work.

Dependency to the AGL instrument cluster demo

As already mentioned before, but not directly visible from the build flow, is the dependency to the AGL instrument cluster demo. As ELISA collaborates with AGL, it enhances the existing work of AGL and uses it for demonstrating. The results are still a work in progress; therefore the generated meta-elisa layer is not integrated into the formal AGL repositories (yet). Missing this upstreaming causes the risk that the ELISA automotive WG tell-tale demo will break in case AGL makes changes to interfaces used by ELISA. Similar changes can happen for changes in the yocto-poky base.

By building on a daily basis and testing the generated images for the added functionality, it is guaranteed that any breaking change will become directly visible and immediate actions can take place. In these scenarios the latest image will not be deployed for download and analysis of the reported errors can be done.

As a full yocto build can be resource intensive and time consuming the use of cached build artifacts with an enabled SSTATE helps to reduce the build time to a minimum.

Benefits of using Yocto’s SSTATE cache

Building a yocto AGL image from scratch is taking time and a lot of computation performance. At least 100GB of disc space is consumed and a build can easily take several hours to complete even on decent hardware. 

In order to reduce the build time and limit the necessary disc space, it is possible to turn on cached build results. The so-called SSTATE is created with the first full build and shared for upcoming builds, demanding to only build deltas. Any change in sources and updates are detected and necessary binaries and their dependencies will be recompiled in case needed.

The generated cache is used within the CI, but can also be used for a local build at a developer PCc. 

Additionally, from time to time a new cache is generated to check that the build will also work in case someone builds the system from scratch. Building from scratch can make sense for low network speeds or networks with traffic limitations. This further makes sure that the GitLab pipeline fits to the documentation as much as possible. 

GitLab pipeline coupling to the meta-elisa documentation

The GitLab pipeline basically represents the steps which users would take when manually creating the ELISA enhanced demo of the instrument cluster provided by Automotive Grade Linux. As the current version is based on QEMU it is easy to reproduce the demo at a local machine and also to perform automated testing. This is illustrated by transferring the created images to the OpenQA server hosted by Codethink. 

OpenQA testing

It is important to confirm that the build was successful. As the original AGL demo has been enhanced with the Signal-Source-Control application, which also renders additional screen content, it makes sense to do an output comparison. For the meta-elisa demo OpenQA was used as a proven tool to check proper boot and functionality of the code modifications. This should also avoid potential regressions, e.g. due to external dependency or interface changes. An example output of a comparison as generated by OpenQA can be found below.

In the OpenQA tool, the orange divider in the presented screenshot can be moved left and right with the cursor. This makes it easy to identify differences between reference and recent execution. The greenish frame shows the region of interest which is checked for content. The 100% shows the fit ratio. 

Additionally, during the test execution a screen recording is performed and further logs such as those from serial terminal are provided.

With this option, a fully automated flow from source repository to generated and tested image is achieved. It is possible that working group members can fork the existing meta-elisa repository on GitHub, do modifications, check that it is built properly using GitLab and that the image remains functionally by checking the OpenQA result. This flow would be completely web based without the demand of even having a development environment setup locally on a PC.

However, this may more be used as an exceptional case. The interaction with the system to understand and learn is rather limited in this scenario. As an intention is to develop the system further and gain a larger understanding of the demand of a safety-critical system with Linux based workloads, there are many more entry points for different levels of expertise and personal interest.  

The different entry points to experience the automotive use case 

In order to make it easier for new contributors to join the ELISA Automotive WG, the demo has multiple entry points. 

  • Natively on a development PC using the Readme within meta-elisa
  • Starting from a docker file on a development PC
  • Using a docker image based on the ELISA provided docker file
  • Enhance the build time by using the yocto SSTATE 
  • Download QEMU based build images

Closing the loop to the motivation, the aforementioned entry points fulfill the demands of a collaborative project, where people of different interests come together to collaborate. It is possible to understand the system creation from scratch with an own build from source, or just trace a workload by using the pre-built QEMU images. The development flow is accelerated by providing docker images and cached build artifacts. As the outputs of different pipeline steps are available, debugging gets simplified as well (like checking a native build against a docker build or behavior of the own vs. generated QEMU image).

In summary, a comprehensive documentation (including this post) helps to tailor the pipeline and images to individual demands. Though, to have the best compatibility for exchange with others, the usage of a docker container is recommended. 

Docker container for better compatibility of CI and manual reproduction

Taking into account the large number of different host machines and Linux distributions that exist, from practical experience the ELISA CI is providing a docker file as a common base for development. The docker file is used to generate the docker image, which again is used to build the meta-elisa. By providing the docker file and docker image also to others, who want to create a local build on their machine, influence of the underlying Linux distribution is reduced. Documentation about the docker container creation and usage can be found in the wg-automotive github repository. To pull the docker container directly use:

docker pull registry.gitlab.com/elisa-tech/docker-image/elisa

Some general remarks regarding container should be mentioned for competition:

  • There is still a (limited) dependency to the underlying OS outside the container, as the docker image uses for example the Linux kernel from the host system. However, side effects of missing environment setups, conflicting or overwriting packages are minimized by using the docker container.
  • Regarding build time and performance the impact of the container could be neglected. The duration and system load during the build with and without container usage was very similar. 

The docker file will be kept up to date to reflect demands as per meta-elisa or dependency from yocto or AGL updates. This makes sure to always have the latest features and improvements also for the planned next steps reflected for better reproduction of results. 

Outlook and next steps

As an evolution of the use case, activities have been started to enable a reference image of an example architecture including Xen and Zephyr on real automotive hardware. Any result from the automotive as well as other working groups within ELISA should feed into the example architecture tailorable to specific use case demands.

Additionally having a proper software bill of material (SBOM) generation will also enhance the quality of the reproducible demo. It will support upcoming regulations which require an understanding of the content of your software product. As AGL also works on the reproducible build to generate binary compatibility of builds this may be an additional enhancement considered as future activity.

Summary

  • By setting up the CI with QEMU image deployment to the openQA testing it is possible to quickly track changes and validate code changes for proper appliance along the chain from source to deployment. 
  • Active contributors in ELISA have the chance to check their own commits or pull requests by using a standardized GitHub development flow and start the CI by a two line change in the pipeline reference to the meta-elisa fork and branch commit.
  • The different entry points and precompiled build results enable interested people to decide to build an image from scratch, use a docker container to reduce environment setup steps and reduce build time to a fraction by taking cached build artifacts from the server. 
  • By provisioning the final QEMU build image, it is even possible to access and boot the generated image without the need of performing an own build.

As all pipeline elements from documentation to later qa testing are dependent on each other and are coupled to an integrated flow. A high degree of dependability and reproduction is achieved with this solution. With a future enhancement of SBOM generation and utilization of reproducible builds project, the system dependability will further be increased. The automated execution supports that aligned development flows are kept and regressions can be tackled before they occur. Best entry points for reproduction of the results are the meta-elisa and wg-automotive docker container documentation in GitHub.

Linux in Aerospace: A Personal Journey

By Blog, Working Group

Written by Steven H. VanderLeest, Software Engineering Technical Lead at The Boeing Company and Chair of the ELISA Aerospace Working Group

Introduction

From the early days of Linux, I was a fan of this innovative, open-source Operating System (OS). I appreciated it as a hobbyist, helping me run Linux at home. I appreciated it as an educator, helping my computer engineering students walk with Linux through OS concepts. However, as a professional working in the safety-critical domain of aerospace, I wondered: could Linux fly?

My Pre-flight Taxi with Linux

My journey with Linux had its roots in the 1980s before Linus Torvalds introduced his new OS to the world in 1991. During my undergraduate degree in the 1980s, my engineering program had some labs equipped with the relatively recent IBM Personal Computer (PC). The machines were amazing, but my ability to command their power was somewhat limited by the OS, which was the Microsoft Disk Operating System (MS-DOS). When I reached my third year, I gained access to a Sun Workstation running SunOS, a variant of Unix. I quickly learned to appreciate the rich menagerie of shell commands, the power of combining them with redirection such as pipes, and the aesthetics of the fledgling X-Windows GUI.

I first heard about Linux in graduate school in the early 1990s at the University of Illinois at Urbana-Champaign. My doctoral thesis was on Input/Output (I/O) performance, especially on multiprocessor systems. My research analyzed and quantified I/O performance on OSs such as SunOS, SGI IRIX, DEC OSF/1, HP-UX, and Linux. One key finding of my research was that I/O performance could be impacted by the interference caused by unrelated transactions contending for shared resources within a multi-processor system. The magnitude of the impact was heavily dependent not only on the computing hardware architecture but also on the architecture of the OS. Interference could even occur on a uni-processor where independent processes had I/O tasks clustered in time.

As an educator, I applied Linux in my teaching. After finishing my Ph.D., I returned to my alma mater, Calvin College (now University), to take a position as a professor of engineering, teaching computer engineering topics. Linux provided a rich learning environment where my students could look under the hood while learning about operating systems. The transparency of open-source code made an ideal environment for learning and innovation. I also wanted to share my love for working at the interface between computer hardware and software. Studying the Linux kernel provided key insights into how the OS manages the hardware on behalf of applications. The overall system’s performance will improve if the OS is reasonably tuned to take advantage of the hardware architecture.

As a hobbyist, I used Linux at home. I set it up on any extra desktop or laptop I could get my hands on. The whole family got involved when I set up MythTV, an open-source streaming media system, and installed it on a spare Linux desktop system along with an expansion card to capture and record live television. We were asynchronously watching programs and never missing an episode well before any of our friends or neighbors followed suit with ReplayTV or TiVo.

As an engineering professional, I found opportunities to bolster my work with Linux. The challenge was that my employers often required MS Windows as the standard a bureaucratic IT department imposed. Nevertheless, I discovered ways to use Linux by dual-booting or a LiveCD approach and eventually run Linux in a virtual machine using hypervisors like VirtualBox. Like its Unix forebears, Linux was much more stable and reliable than Windows. Even if an application program went astray, I got a segmentation fault warning at most, and the other processes continued. Windows was prone to the Blue Screen of Death, bringing the system to a halt much too often. While it might be distressing to lose your work when this happened, losing a few minutes of labor (or hours if you didn’t save often) was a minor albeit annoying inconvenience. I couldn’t expect higher reliability since that wasn’t a use case for office desktop systems. I quickly realized that Windows doesn’t apply to safety-critical systems.

I also would not expect an operating system designed for an office desktop/laptop to work for embedded systems where the available main memory and secondary storage are limited. Embedded computing platforms are all around us but hidden inside our vehicles, more sophisticated consumer electronics, and smart devices. Windows might not work in these use cases, but Linux could! I started using Linux on embedded development boards when chip manufacturers such as Freescale (later NXP), Intel, Texas Instruments, and others began providing a Linux Board Support Package. The chip makers found this approach was the most effective way to get developers up and running quickly on their new hardware.

Taking Flight with Linux

Within safety-critical domains such as aerospace, Linux provides the foundation for multiple software development environments that run on desktops and laptops. As we move toward distributed development, Linux is a ubiquitous cloud guest OS.

For embedded, safety-critical applications, Linux is less common than a Real-Time Operating System (RTOS). However, a group of Linux developers has been slowly improving real-time performance since the 1990s. Attention coalesced into the PREEMPT-RT patch since 2004, with key parts of the patch making their way to the mainline kernel code. Today, almost all PREEMPT-RT functionality is mainlined but must be enabled through kernel configuration parameters. As for the safety-critical need, in the early 2010s, several research groups examined Linux as a foundation for an Integrated Modular Avionics (IMA) system. I led one of these efforts as the Principal Investigator for a Small Business Innovation Research (SBIR) contract with the US Defense Advanced Research Projects Agency (DARPA). We developed a proof-of-concept safety-critical system that combined the Xen hypervisor with Linux as a guest OS, to provide ARINC 653 partitioning, a key standard related to IMA.

Over the past decade, multiple private endeavors have applied Linux in aeronautical and astronautical computing systems, even platforms with modest safety criticality, though only a few of these efforts have been publicized. Demonstrating that software is reliable enough for flight is ambitious. I work for Boeing, one of the aerospace companies tackling that challenge. The next section provides an overview of the four key characteristics necessary to put aircraft using Linux into the air.

Developing Software for Aerospace is Challenging

For use in avionics (an electronic computing platform used on an aircraft), the software must be fast, deterministic, embedded, and assured.

Fast

For use in avionics, Linux must be fast. The Linux developer community is already heavily focused on speed, constantly innovating kernel performance improvements.

The aerospace industry can largely leverage the Linux community effort toward high performance. There may be a few specialized devices where drivers must be further optimized. However, those devices will almost always follow the existing design patterns and take advantage of community innovations, such as io_uring. Another example of an area that might need more attention is boot time. For aerospace, certain fault-tolerance techniques require a fast boot-up (or in-air re-boot) time. In these cases, the system must be operational in only a few seconds or even less.

Deterministic

For use in avionics, Linux must be deterministic. Remember the action thriller series 24? Jack Bauer (played by Kiefer Sutherland) would introduce the series with a voice-over claiming “events occur in real-time”. The audience understood that we were watching as if it were airing live. This commonly understood definition of real-time is not quite the same idea as a real-time computing system. For an RTOS, real-time means that the response to critical events will occur within a deterministic amount of time, even in the worst case. Most computing systems- hardware and software- are tuned to optimize the average response time. Most users and actions enjoy a rapid response, but sometimes at the expense of a slow response for certain users or certain actions. A deterministic system is not necessarily fast — it simply means that we can bound, with confidence, the maximum for critical response times. We want a guaranteed maximum response time in a real-time system, even in the worst case. If we were grading responses like students, we don’t care if the best score was an A+ or the average score was a C. We care that the worst score is still a passing grade in real-time systems. Let’s say the system must always respond within 50 milliseconds, or something bad happens. Over a series of tests, perhaps you find that the fastest response is 12 milliseconds, the average is 27 milliseconds, and the worst is 42 milliseconds. For determinism, we only care that the worst response is still under the requirement (in this example, it appears to be meeting our needs).

The aerospace industry can leverage the Linux community’s effort toward determinism. The PREEMPT_RT patches developed over the last 20 years have largely been mainlined, but must still be configured to enable them. Deterministic boot time has received less attention than deterministic response time, but both are important for aerospace applications.

Embedded

For use in avionics, Linux must be embedded. Embedded use cases are constrained with limited size, weight, and power. The most widely deployed embedded instance of Linux is probably the Android OS, used on the largest number of smartphones around the globe today. The vast majority of the billions of embedded devices that make our digital world run smoothly are not this visible — they are under the hood in your car, behind the panel of your home thermostat, and in many other behind-the-scenes locations.

Many industries, including the aerospace industry, continue to turn to Linux for embedded systems. Chip manufacturers continue to support Linux, often the first OS for which they provide starter software development kits. Developers from across the open-source community continue to develop drivers for new devices.

Assured

Regulatory agencies often oversee safety-critical systems to ensure the software is correct to a high confidence level. Because public safety is at stake, the agencies generally have the authority to enforce standards before a product can be released. For use in avionics, Linux must be assured. For avionics software in civilian aircraft, the authority to approve flight certification is specific to a geographic region. For example, in the United States, it is the Federal Aviation Administration (FAA); in most of Europe, it is the European Union Aviation Safety Agency (EASA).

The details of safety standards vary across industries such as nuclear, automotive, medical, aeronautical, rail, and others. However, the same basic concepts are found in all of them, such as expert peer review or formal means of verification and validation to show the software is suited to purpose. Most have two aspects: ensuring the software is reliable (it does the things we want) and safe (it does not do things we do not want).

A key standard for avionics software is DO-178C, which describes software development life cycle processes and objectives that must be met. DO-178C defines five software levels. The lowest is level E, where a software bug has no impact on the safety of the crew or passengers. An example might be the passenger entertainment system. The highest is level A, where a software bug could have catastrophic results. An example might be the flight control software that responds to pilot commands.

The aerospace industry can leverage much less from the Linux community regarding assurance than the other criteria stated earlier. On the one hand, Linux has been extensively field-tested, so it has a strong product history. Due to the crowd-sourcing nature of open source, Linux likely has more expert peer reviews than any other existing software. Assurance of Linux also benefits from the reasonably large number of tests available within several test frameworks. On the other hand, Linux was not designed expressly for aerospace, nor even for safety-critical use cases in general. The design has been much more iterative and ad-hoc, making it more challenging to demonstrate the correct design to software safety regulatory authorities.

Conclusion

Linux is already being used in flight-certified systems at level D. Aerospace companies like Boeing are now poised to use Linux more broadly and at higher levels of assurance, with groups like ELISA leading the effort. ELISA is the Enabling Linux In Safety Applications project under the Linux Foundation. Its mission is to make it easier for companies to build and certify Linux-based safety-critical applications. ELISA recently formed a new working group focused on Aerospace, which will tackle some of the challenges outlined above. We are just getting this group started and welcome new members!

I have crawled, walked, and run with Linux. Now it is time to fly!

For more information

This article previously ran on Linux.com.

ELISA Summit: Medical Devices Working Group Update (Video)

By Blog, ELISA Summit, Working Group

An estimated 185 people registered for the ELISA Summit, which took place virtually on September 7-8 to gather Linux community members and attendees from around the world. The event, which featured 15 sessions and 20 speakers, was open to anyone involved or interested in defining, using, or learning about common elements, processes, and tools that can be incorporated into Linux-based, safety-critical systems amenable to safety certification. Members of the ELISA Project community presented best practices and overviews on emerging trends and hot topics to using open source software in safety-critical applications and detailed working group updates.

We’ll be featuring event videos in blogs each week. Today, we focus on a session presented by the team members from ELISA Medical Device Working Group: Jason Smith, Jeffrey (Jefro) Osier-Mixon, Kate Stewart, Milan Lakhani,Nicole Pappler, Shefali Sharma, Shuah Khan on the topic of Medical Device Working Group update.

The main goal of this working group is to develop best practices to analyze systems and identify the components of Linux that will be participating in safety analysis, in the context of medical device safety standards. The main activities include 

  • Analysis of open source medical device application (openAPS)
  • Create documentation of results of STPA analysis (system, requirements, architecture, design, …)
  • Comparison of results of STPA analysis to 62304 Software of Unknown Provenance (SOUP)
  • Create documentation on usage of tooling to support kernel analysis 

In this session, the team shares progress to date, as well as some of the lessons learned and areas where they could use some help. The deliverables being worked on for the next quarter will be previewed as well.

Watch the video below or check out the presentation materials here.

For more details about the ELISA Project, visit the main website here. To learn more about the Medical Device Working Group or to join the community, click here.

ELISA Summit: Kernel Tracing (Video)

By Blog, ELISA Summit, Working Group

An estimated 185 people registered for the ELISA Summit, which took place virtually on September 7-8 to gather Linux community members and attendees from around the world. The event, which featured 15 sessions and 20 speakers, was open to anyone involved or interested in defining, using, or learning about common elements, processes, and tools that can be incorporated into Linux-based, safety-critical systems amenable to safety certification. Members of the ELISA Project community presented best practices and overviews on emerging trends and hot topics to using open source software in safety-critical applications and detailed working group updates.

We’ll be featuring event videos in blogs each week. Today, we focus on a session presented by Shefali Sharma, Senior year CSE Student, India and LFX Mentee at ELISA Medical Devices WG on the topic “Kernel Tracing.” In this video, Shefali presents the work she did during her ELISA Mentorship Program including:

  • Understanding system resources necessary to build and run a workload is important.
  • The highlights of theLinux tracing and strace can be used to discover the system resources in use by a workload. 
  • The completeness of the system usage information depends on the completeness of coverage of a workload.
  • Performance and security of the operating system can be analyzed with the help of tools like ftrace, perf, stress-ng, paxtest.
  • Once we discover and understand the workload needs, we can focus on them to avoid regressions and use it to evaluate safety considerations.

In addition to these topics, she also explains about her mentorship experience with ELISA Medical Working Group.  Watch the video below or check out the presentation materials here.

If you’re interested in becoming a ELISA Project or Linux Foundation mentee, you can review mentorships and all here: https://lfx.linuxfoundation.org/tools/mentorship/.

ELISA Summit: Generation of Static Architecture Diagrams for Specific Kernel Images (Video)

By Blog, ELISA Summit, Working Group

An estimated 185 people registered for the ELISA Summit, which took place virtually on September 7-8 to gather Linux community members and attendees from around the world. The event, which featured 15 sessions and 20 speakers, was open to anyone involved or interested in defining, using, or learning about common elements, processes, and tools that can be incorporated into Linux-based, safety-critical systems amenable to safety certification. Members of the ELISA Project community presented best practices and overviews on emerging trends and hot topics to using open source software in safety-critical applications and detailed working group updates.

We’ll be featuring event videos in blogs each week. Today, we focus on a session presented by Alessandro Carminati, Red Hat and Maurizio Papini, Red Hat on the topic Generation of Static Architecture Diagrams for Specific Kernel Images.”

In this talk, the experts shared how they generated a static architecture diagram of the Kernel based on radare2. To analyze the kernel for safety is challenging since it is a huge monolithic piece of code. Subsystems exist within the kernel, but they are not well defined nor documented. ISO26262 part6 requires a ‘Software architectural design specification’ that can be used to support safety analysis and drive the function of tests.

Watch the video below or check out the presentation materials here.

ELISA Summit: Automotive Working Group Update – Tell-tales an Evolution Use Case Towards Driver Assistance ?!(Video)

By Blog, ELISA Summit, Working Group

An estimated 185 people registered for the ELISA Summit, which took place virtually on September 7-8 to gather Linux community members and attendees from around the world. The event, which featured 15 sessions and 20 speakers, was open to anyone involved or interested in defining, using, or learning about common elements, processes, and tools that can be incorporated into Linux-based, safety-critical systems amenable to safety certification. Members of the ELISA Project community presented best practices and overviews on emerging trends and hot topics to using open source software in safety-critical applications and detailed working group updates.

We’ll be featuring event videos in blogs each week. Today we’ll feature the session by Philipp Ahmann, Robert Bosch GmbH supported by work from Paul Albertella, Codethink, and Christopher Temple, Arm on the topic Automotive Working Group Update – Tell tales an evolution use case towards driver assistance.

The session mainly covered the topics such as what is a tell tale and why is it the use case of the Automotive WG? What is STPA and advantages of it. This session gave an update on the latest activities of the Automotive Working Group status. Focus was put on the explanation why the Automotive Working Group has selected the use case of “safe displaying of warning signs on instrument cockpit” also called “telltales”. The benefits of the use case is illustrated as well. The relationship to other use cases is provided and the natural evolution to other automotive use cases like driver assistance features is shown.

Watch the video below or check out the presentation materials here.

To learn more about the Automotive Working Group or to join the mailing list or meetings, click here.

Addressing Space Isolation for Enhanced Safety of the Linux Kernel (Video)

By Blog, Technical Update, Working Group

Written by Igor Stoppa, Senior Software Architect at Nvidia

For more than two decades, Linux has made inroad in new fields of applications, from data centres, to embedded. We see now a growing demand for Linux in safety critical applications, ranging from automotive to robotics, to medical appliances.

However, Linux was not designed with these applications in mind, and unsurprisingly it is not an ideal fit, at the moment.In particular, one major pain point is the very limited resilience to spatial interferences originating from within the kernel itself.

Furthermore, the code base if much larger than what can be found in other operating systems traditionally found in safe applications. This is also compounded by the fact that Linux does not follow the processes traditionally in use for Functional Safety.

Summary

In the video, I describe my ongoing experiment of modifying the Linux kernel, to introduce a form of Address Space Isolation, meant to provide a mechanism enforcing freedom from interference. The presentation describes the problems, possible means to address it, and the current progress with the implementation. You’ll see a methodology for the safety analysis of a Linux system and mechanism for improving the safety of selected components.

This presentation ties both into the scope of the Linux Features for Safety-Critical Systems Working Group and the Critical SW track at Open Source Summit Europe. Though this work is not formally sponsored nor endorsed by ELISA, it is something I shared with the community for brainstorm and discussion purposes.

If you’d like to learn more about the Linux Features for Safety-Critical Systems Working Group or you’d like to continue this conversation, please join the mailing list or a WG meeting here.

Open Source Summit North America (Videos)

By Blog, Industry Conference, Working Group

This year, Open Source Summit North America was held as an umbrella conference, composed of a collection of 14 events covering the most important technologies, topics, and issues affecting open source today in June. There were a total of 2,771 attendees with 1,286 of those attending in person in Austin, from 1,041 organizations across 68 countries around the globe. The event attracted a diversified mix of open source community members from across the ecosystem. 54% of attendees were in technical positions, and developers comprised more than a quarter of attendees. You can read the post-event report here. You can also view all of the event playlists on the Linux Foundation Youtube Channel.

The ELISA Project was featured in several sessions and represented by ambassadors and community members at the conference. If you missed these presentations, you can watch the videos below:

Enabling Linux in Safety Applications (panel discussion)Gabriele Paoloni, Red Hat (ELISA board chair) Kate Stewart, Linux Foundation (ELISA Executive Director) Paul Albertella, CodeThink (Open Source Engineering Process) Elana Copperman, Intel (Linux Features) Philipp Ahmann, Bosch GmbH (Automotive) Milan Lakhani, Codethink (Medical Devices) 

Meeting business and safety objectives while building safety critical applications is a huge challenge for any industry, particularly those who have not had previous experience with open source and Linux. ELISA’s charter is to help industries navigate technical and non-technical challenges in order to bring the benefits of open source to safety applications and help organizations provide the rigor needed for certification. This panel features ELISA working group leads who will share their vision of making Linux a prominent player for FuSa applications in several industries. Join us to learn more about the project and how you can contribute to the community’s overall success.

Finding the Path from Embedded to Edge using Product LinesSteffen Evers, Bosch.IO & Philipp Ahmann, Robert Bosch GmBH

Linux is used for many embedded device classes today. However, it is increasingly desirable to connect these devices with each other and with the cloud. Embedded container technology can be used to make this easier by merging server/cloud and embedded technologies. However, it also leads to more challenges e.g. in respect to security, safety, traceability, and SBOMs. Using Linux across multiple device classes and product lines, and adding cloud technology, causes the complexity and efforts to explode.

In this talk, we describe how Bosch, and others, use embedded containers and “reference systems” to avoid redundant work and get a large number of embedded projects under control.

A reference system is an adjustable compilation of tools along with a pre-configured bundle of packages for a common use case and defined set of devices. This reuse significantly reduces development and maintenance costs, and speeds up the time to market. In this way, reference systems can form the base for your product lines.

Bosch uses the in-house Debian-based embedded distribution “Apertis” as the basis for several reference systems, e.g. for automotive infotainment systems. In doing so we push as many efforts as possible from individual projects into Apertis, as the meta-layer. Thereby, the users can focus more on the actual functionality and applications. e.g. one issue that we have addressed in the context of software management is the handling of GPLv3 in embedded devices. Another topic has been mainline support for kernel drivers.

BOF: SBOMs for Embedded Systems: What’s Working? What’s Not? – Kate Stewart

With the recent focus on improving Cybersecurity in IoT & Embedded, the expectation that a Software Bill of Materials (SBOM) can be produced, is becoming the norm. Having a clear understanding of the software running on an embedded system, especially in safety critical applications,  like medical devices, energy infrastructure, etc. has become essential.  Regulatory authorities have recognized this and are starting to expect it as a condition for engagement.  This BOF will provide an overview of the emerging regulatory landscape, as well as examples of how SBOMs are already being generated today for embedded systems by open source projects such as Zephyr, Yocto and others,  followed by a discussion of the gaps folks are seeing in practice, and ways we might tackle them.

Static Partitioning with Xen, LinuxRT, and Zephyr: A Concrete End-to-end Example – Stefano Stabellini, AMD

Static partitioning enables multiple domains to run alongside each other with no interference. They could be running Linux, an RTOS, or another OS, and all of them have direct access to different portions of the SoC. In the last five years, the Xen community introduced several new features to make Xen-based static partitioning possible. Dom0less to start multiple static domains in parallel at boot, and Cache Coloring to minimize cache interference effects are among them. Static inter-domain communications mechanisms were introduced this year, while “ImageBuilder” has been making system-wide configurations easier. An easy-to-use complete solution is within our grasp. This talk will show the progress made on Xen static partitioning. The audience will learn to configure a realistic reference design with multiple partitions: a LinuxRT partition, a Zephyr partition, and a larger Linux partition. The presentation will show how to set up communication channels and direct hardware access for the domains. It will explain how to measure interrupt latency and use cache coloring to zero cache interference effects. The talk will include a live demo of the reference design.

RTLA: Real-time Linux Analysis Toolset – Daniel Bristot De Oliveira, Red Hat

Currently, Real-time Linux is evaluated using a black-box approach. While the black-box method provides an overview of the system, it fails to provide a root cause analysis for unexpected values. Developers have to use kernel trace features to debug these cases, requiring extensive knowledge about the system and fastidious tracing setup and breakdown. Such analysis will be even more impactful after the PREEMPT_RT merge. To support these cases, since version 5.17, the Linux kernel includes a new tool named rtla, which stands for Real-time Linux Analysis. The rtla is a meta-tool that consists of a set of commands that aims to analyze the real-time properties of Linux. Instead of testing Linux as a black box, rtla leverages kernel tracing capabilities to provide precise information about latencies and root causes of unexpected results. In this talk, Daniel will present two tools provided by rtla. The timerlat tool to measure IRQ and thread latency for interrupt-driven applications and the osnoise tool to evaluate the ability of Linux to isolate workload from the interferences from the rest of the system. The presentation includes examples of how to use the tool to find the root cause analysis and collect extra tracing information directly from the tool.