Igor Ljubuncic
on 8 August 2019
Slow snap? Trace-exec to the rescue!
Slow applications are never fun. But not knowing why an application is not behaving correctly can be even more frustrating. A well-designed system that can diagnose performance or startup issues and inform the user about the problem goes a long way toward mitigating the frustration, and may even help resolve the root cause altogether.
Back in January, we reviewed several tools used to troubleshoot snap usage, including the strace functionality built into the snap binary. However, strace requires some knowledge of system internals, so it may not be the first choice for users or developers looking to quickly diagnose and profile the behavior of their snaps. From version 2.36 onwards, snapd ships with the –trace-exec run option. This convenient and friendly feature lists the slowest processes in the snap startup and runtime chain, which can help you pinpoint the source of the issue.
Trace-exec in action
Technical troubleshooting is like an onion – layered and complex. Which is why you always start with simple things first. Going back to strace, if you want to profile the execution of an application, you can run strace with the -c (summary) flag. This will give you the list of all the system calls, the percentage of CPU time spent in each, average time per system call, errors, and several other important details. At first glance, this can be a useful indicator, but it requires a keen eye, and the data is not shown on a per-process level.
Trace-exec offers a terser output, listing up to 10 slowest processes that ran and finished during the run of a snap. Typically, this will include the actual snap startup and some portion of its runtime. While not as comprehensive as strace, it can give you a good overview of all the commands, command wrappers and helper scripts used to bootstrap the snap.
Let’s look at a practical example with the VLC media player.
snap run --trace-exec vlc
Slowest 10 exec calls during snap run:
0.007s snap-update-ns
0.088s /snap/core/7270/usr/lib/snapd/snap-confine
0.006s /usr/lib/snapd/snap-exec
0.007s /usr/bin/getent
0.006s /bin/mkdir
0.006s /bin/mkdir
0.007s /bin/mkdir
0.160s /snap/vlc/1049/bin/desktop-launch
0.035s /snap/vlc/1049/usr/bin/glxinfo
0.042s /snap/vlc/1049/bin/vlc-snap-wrapper.sh
Total time: 2.295s
From this run, we can see that a portion of time is dedicated to the snap setup, like the confinement. You also get the total time – but part of this number may also be the actual time the application runs, so it is not strictly the startup portion of the entire sequence. Nevertheless, it is a useful indicator, and we’ve discussed this in much greater detail in the I have a need, a need for snap article some time ago.
On its own, of course, the information will not resolve the problem – but it can be used by snap developers and application developers to optimize their tools. For instance, an enhancement or a big fix could be implemented into snapd, or perhaps a developer may realize that certain libraries staged into the snap contribute the highest penalty to the startup and behavior of their software.
In comparison, with strace, you would see the following:
snap run --strace='-c' vlc
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ------------
53.40 4.041220 3419 1182 1 poll
34.21 2.589384 2281 1135 30 futex
4.87 0.368380 251 1466 81 read
4.61 0.348855 2474 141 56 wait4
1.09 0.082512 15 5607 4457 open
0.55 0.041282 41282 1 rt_sigtimedw
0.36 0.027104 4 6200 2616 stat
0.09 0.006461 3 2536 mprotect
0.08 0.006255 20 316 getdents
0.08 0.006148 8 740 490 access
0.07 0.005506 3 1853 6 close
…
0.00 0.000000 0 2 capset
0.00 0.000000 0 1 prlimit64
------ ----------- ----------- --------- --------- ------------
100.00 7.568409 31057 7960 total
This output is not for the faint-hearted. You have no instant visibility into which processes contributed to which system call, and you also need some understanding of what each system call means. If you want to know more, you can always use the man 2 pages to learn more about specific calls, e.g.:
man 2 mprotect
Summary
Trace-exec complements other troubleshooting features in snapd quite nicely. Alongside other assets in the snap ecosystems, it provides developers as well as users with means to better understand why sometimes their applications do not always perform quite as they should.
This valuable information, when available, also allows the snap teams to implement fixes and changes that will make the overall snap experience smoother and more enjoyable. If there are other features or capabilities you’d like to see introduced in snapd, please let us now by joining our forum for a discussion.
Photo by Marten Bjork on Unsplash.