Introduction
I am using a Raspberry Pi 3 Model B+ as my media center. Libreelec, Kodi and HDMI-CEC (using the remote of the TV to navigate through the content) are very nice, and I’m even planning to buy a DVD-T adapter so that I won’t have to use the TV as a source but always stay in Kodi.
What was bothering me was a high CPU load when Kodi was supposed to be idling. The 15min CPU load average was hovering at around 0.15 when it should have been around 0.00 – as no media was played, no Weather/RSS updates were active, so was causing this Kodi CPU load ? It turned out to be the way the PlexKodiConnect plugin integrates with Kodi and works around Kodi limitations. But this is not immediately obvious on why.
strace
and gdb
were extremely helpful and this post is mostly a log of the debugging session. I was able to identify a bug in PKC and also delve deep into decompiling and modifying live code.
strace
-ing the Kodi process
Before we start: strace
is not provided on current libreelec distribution (version 8.2.4), only the GNU debugger is – so you have to statically cross-compile your own for armv7-a 32bit architecture (for example, using a buildroot environment).
1. Obtaining the PID:
# ps -ef | grep kodi.bin | grep -v grep 374 root 219:32 /usr/lib/kodi/kodi.bin --standalone -fs --lircdev /run/lirc/lircd
2. A simple strace
shows a LOT of noise (partially because the system is not on 64bit and the clock_gettime
is not hidden from strace
by Linux vDSO). Anyway, trying to cleanup the output there are many grep excludes and one thing caught my eye, a read
from file descriptor 4 that happened more than once a second:
# strace -p 374 -s 1025 2>&1 | grep -v futex | grep -v clock | grep -v fstat64 | grep -v poll | grep -v read\(17 | grep -v read\(18 | grep -v nan | grep -v ioctl read(4, "cpu 855760 72483 525659 188998247 1348 0 9062 0 0 0\ncpu0 200130 12847 187892 46476225 322 0 8342 0 0 0\ncpu1 207595 18898 106326 47532512 354 0 233 0 0 0\ncpu2 230989 21896 118157 47476558 292 0 244 0 0 0\ncpu3 217043 18839 113282 47512950 378 0 241 0 0 0\nintr 1247031637 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 172313668 0 0 0 0 0 8166 50892998 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6850 0 0 44 0 0 0 0 0 0 0 965828508 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4864 7129 0 0 0 0 2905 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\nctxt 542777988\nbtime 1529838905\nprocesses 8886\nprocs_running 1\nprocs_blocked 0\nsoftirq 31460344 3216317 11069637 95868 1126126 0 0 2231344 9288352 0 4432700\n", 1024) = 912 ...
Investigating /proc/374/fd/4
surprise surprise, Kodi is reading /proc/stat
very often.
# ls -al /proc/374/fd/4 lr-x------ 1 root root 64 Jun 24 13:26 /proc/374/fd/4 -> /proc/stat
Why would Kodi do that ? I needed to get the stack trace upon the reads and then to check the source code to find out. Enter gdb
.
gdb
– getting the stack trace (registry, breakpoints)
There were few challenges with gdb since I have never used it in this scenario:
- Can it break onto syscall ?
- If so, how can it break into a specific syscall (where first argument is 4) ?
- Can it write memory (so that I can NOP the instruction) ?
- Can it write specific memory from a batch script (so that I don’t do the manual steps every time Kodi starts) ?
Google answered ‘yes’ to all these questions.
One thing to mention, to find out how the names of the registers we need to use the info reg
command – but for that, gdb
needs to be already at a breakpoint. So I’ll skip ahead and show the output (luckily, it seemed to have stopped at a read
from the file descriptor 4 (found in register r0
)).
Thread 1 "kodi.bin" hit Breakpoint 1, 0x75272ad0 in read () at ../sysdeps/unix/syscall-template.S:84 84 ../sysdeps/unix/syscall-template.S: No such file or directory. (gdb) info reg r0 0x4 4 r1 0x136b490 20362384 r2 0x400 1024 r3 0x0 0 r4 0x136b058 20361304 r5 0x7521ae34 1965141556 r6 0x5e8 1512 r7 0x752e5be4 1965972452 r8 0xa 10 r9 0x7e8b44c8 2123056328 r10 0x0 0 r11 0x136b490 20362384 r12 0x1 1 sp 0x7e8b43f0 0x7e8b43f0 lr 0x7521bb68 1965144936 pc 0x75272ad0 0x75272ad0 <read> cpsr 0x480e0010 1208877072
The summary of the gdb
commands is below:
gdb -p 374
– attach to Kodi pid.break read if $r0 == 4
– break point on theread
syscall if the file descriptor (r0
register on Cortex-A53 (ARMv8) of RPI3B+)c
– is to continue (gdb will start paying attention to the process)bt
– to get the stack trace.
Let’s see them in action:
# gdb -p 374 (gdb) break read if $r0 == 4 (gdb) c Thread 1 "kodi.bin" hit Breakpoint 1, 0x75272ad0 in read () at ../sysdeps/unix/syscall-template.S:84 84 ../sysdeps/unix/syscall-template.S: No such file or directory. (gdb) bt #0 0x75272ad0 in read () at ../sysdeps/unix/syscall-template.S:84 #1 0x7521bb68 in _IO_new_file_underflow (fp=0x136b058) at fileops.c:600 #2 0x7521cd24 in __GI__IO_default_uflow (fp=0x136b058) at genops.c:413 #3 0x7520f490 in __GI__IO_getline_info (fp=fp@entry=0x136b058, buf=buf@entry=0x7e8b44c8 "", n=n@entry=255, delim=delim@entry=10, extract_delim=1, extract_delim@entry=2123056720, eof=eof@entry=0x0) at iogetline.c:60 #4 0x7520f554 in __GI__IO_getline (fp=fp@entry=0x136b058, buf=buf@entry=0x7e8b44c8 "", n=n@entry=255, delim=delim@entry=10, extract_delim=extract_delim@entry=1) at iogetline.c:34 #5 0x7520e120 in _IO_fgets (buf=0x7e8b44c8 "", n=256, fp=0x136b058) at iofgets.c:53 #6 0x0057073c in CCPUInfo::readProcStat(unsigned long long&, unsigned long long&, unsigned long long&, unsigned long long&, unsigned long long&) () #7 0x005709a8 in CCPUInfo::getUsedPercentage() () #8 0x006fa1e4 in CApplication::Process() () #9 0x00749358 in CXBApplicationEx::Run(CFileItemList&) () #10 0x005d1b80 in XBMC_Run () #11 0x00350d38 in main ()
Cool. It seems that CApplication::Process()
calls to CCPUInfo::getUsedPercentage()
. Checking github of Application.cpp
, I found this code:
#if !defined(TARGET_DARWIN) g_cpuInfo.getUsedPercentage(); // must call it to recalculate pct values #endif
and indeed within CCPUInfo.cpp
there is a call to readProcStat()
:
if (!readProcStat(userTicks, niceTicks, systemTicks, idleTicks, ioTicks)) return m_lastUsedPercentage;
The trouble I see is see is that the result value of getUsedPercentage()
is not used. I would understand to call it if the Debug is on (when the UI is supposed to show the CPU load) but why during Application main loop ? It does seem to update the values internally, but at a quick glance with Github search (without having the code locally) I was not able to find who uses them (beside a test and GUIWindowDebugInfo.cpp
, but that is also calling the getUsedPercentage()
just at the beginning of the CGUIWindowDebugInfo::Process
method, which makes the Application call useless).
Anyway, I wanted the comment out the call, but how, without recompiling ? It seemed to have no configuration option, and recompiling Kodi … so let’s go low level.
Disassembly
If I can identify the assembly call to CCPUInfo::getUsedPercentage()
within CApplication::Process()
it will be easier just to NOP the instruction. The address of CApplication::Process()
is 0x006fa1e4
according to the listing above. gdb allows disassembling any memory location using the disassemble [address]
command:
# gdb -p 374 (gdb) disassemble 0x006fa1e4 ... 0x006fa1cc <+312>: mov r0, r5 0x006fa1d0 <+316>: bl 0x6fa2b8 <_ZN10CStopWatch5ResetEv> 0x006fa1d4 <+320>: mov r0, r4 0x006fa1d8 <+324>: bl 0x6f9f40 <_ZN12CApplication11ProcessSlowEv> 0x006fa1dc <+328>: ldr r0, [pc, #76] ; 0x6fa230 <_ZN12CApplication7ProcessEv+412> 0x006fa1e0 <+332>: bl 0x570938 <_ZN8CCPUInfo17getUsedPercentageEv> 0x006fa1e4 <+336>: add sp, sp, #28 0x006fa1e8 <+340>: pop {r4, r5, pc} 0x006fa1ec <+344>: add r0, sp, #12 0x006fa1f0 <+348>: bl 0x349290 <_ZNSsD1Ev@plt> 0x006fa1f4 <+352>: bl 0x349230 <__cxa_end_cleanup@plt> 0x006fa1f8 <+356>: add r0, sp, #12 ...
And scrolling few pages through the code, it is immediately visible … the branch instruction 0x006fa1e0 <+332>: bl 0x570938 <_ZN8CCPUInfo17getUsedPercentageEv>
which is the actual call to the getUsedPercentage(). Replacing it with NOP provided a challenge, what is the instruction code for NOP ?! After some searching, it seems that the Cortex CPU has a NOP instruction code of 0xE320F000
.
gdb also allows to write any address of the memory using the set [address] = [val]
so let’s replace the bl
with a nop
, effectively commenting out the call to read/parse /proc/stat
:
# gdb -p 374 (gdb) set {int}0x006fa1e0 = 0xE320F000 (gdb) disassemble 0x006fa1e4 ... 0x006fa1cc <+312>: mov r0, r5 0x006fa1d0 <+316>: bl 0x6fa2b8 <_ZN10CStopWatch5ResetEv> 0x006fa1d4 <+320>: mov r0, r4 0x006fa1d8 <+324>: bl 0x6f9f40 <_ZN12CApplication11ProcessSlowEv> 0x006fa1dc <+328>: ldr r0, [pc, #76] ; 0x6fa230 <_ZN12CApplication7ProcessEv+412> 0x006fa1e0 <+332>: nop {0} 0x006fa1e4 <+336>: add sp, sp, #28 0x006fa1e8 <+340>: pop {r4, r5, pc} 0x006fa1ec <+344>: add r0, sp, #12 0x006fa1f0 <+348>: bl 0x349290 <_ZNSsD1Ev@plt> 0x006fa1f4 <+352>: bl 0x349230 <__cxa_end_cleanup@plt> 0x006fa1f8 <+356>: add r0, sp, #12 ...
Permanent fix
A nice feature of gdb allows executing command scripts in batch mode. It also seems that the address of the code is not relocatable, and once Kodi is up, the jump to getUsedPercentage() is always at the same position, 0x006fa1e0
. By creating a small file with the set
command, gdb can execute it after Kodi starts:
# cat /storage/.config/gdb-disable-proc-stat.cmd set {int}0x006fa1e0 = 0xE320F000
And then the .config/autostart.sh has to be (created or) edited to have gdb applying the fix:
# cat .config/autostart.sh ( gdb --batch -x /storage/.config/gdb-disable-proc-stat.cmd -p `pgrep kodi.bin` ) &
Observations
- gdb breakpoints is causing Kodi to restart.
- Kodi does so many
clock_gettime
calls that trigger kernel context switches. Linux vDSO comes to mind. - no stability issues encountered disabling the
bl
instruction at boot. Debug mode works correctly once activated.
As a side node: you can actually manipulate the file descriptors and make Kodi read /dev/zero
when accessing file descriptor 4, but still, it feels like CPU cycles will still be wasted trying to do a syscall just to get noting out of it. Few links describing the approach:
https://groups.google.com/forum/#!topic/alt.hackers/0ZMsMc5DvUw
http://ingvar.blog.redpill-linpro.com/2010/07/10/changing-a-process-file-descriptor-on-the-fly/
Leave a Reply