DistroKit Mailinglist
 help / color / mirror / Atom feed
* [DistroKit] How to debug systemd services failing to start with 11/SEGV?
@ 2024-04-09 12:42 Alexander Dahl
  2024-04-09 13:53 ` [DistroKit] [systemd-devel] " Lennart Poettering
  2024-04-09 13:58 ` [DistroKit] " Ahmad Fatoum
  0 siblings, 2 replies; 7+ messages in thread
From: Alexander Dahl @ 2024-04-09 12:42 UTC (permalink / raw)
  To: systemd-devel; +Cc: distrokit

Hello everyone,

I'm currently trying to build a firmware for an embedded device and
running into trouble because systemd seems to crash.  The BSP is
based on pengutronix DistroKit (master) built with ptxdist and the
target is the Microchip SAM9X60-Curiosity board, which is arm v5te
architecture (that board is not part of DistroKit, support for that is
in an upper layer of mine not public yet (?)).

Everything is quite recent, building systemd version 255.2 currently.
On startup I get messages like this (this is the first one, later on
there are lot more, all with the same status):

   [   11.175650] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
   [   11.239679] systemd[1]: systemd-journald.service: Failed with result 'signal'.
   [   11.292640] systemd[1]: Failed to start systemd-journald.service.
   [FAILED] Failed to start systemd-journald.service.
   See 'systemctl status systemd-journald.service' for details.

The system drops me on a shell later, where I can run the above
mentioned command, which gives:

    ~ # systemctl status systemd-journald.service
    x systemd-journald.service - Journal Service
         Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static)
         Active: failed (Result: signal) since Tue 2024-04-09 11:44:52 UTC; 11min a>
    TriggeredBy: x systemd-journald-dev-log.socket
                 * systemd-journald-audit.socket
                 x systemd-journald.socket
           Docs: man:systemd-journald.service(8)
                 man:journald.conf(5)
        Process: 197 ExecStart=/usr/lib/systemd/systemd-journald (code=killed, sign>
       Main PID: 197 (code=killed, signal=SEGV)
       FD Store: 0 (limit: 4224)
            CPU: 330ms

This does not help me much.  Other services crashing: systemd-udevd
and systemd-timesyncd, also with status 11/SEGV which is segmentation
fault, right?

I had this board running with an older version of systemd, but I can
not remember which was the last good version.

Could anyone give me a hint please how to debug this?

(The same version runs fine btw. when building that BSP for an arm v7a
target!)

Greets
Alex




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [DistroKit] [systemd-devel] How to debug systemd services failing to start with 11/SEGV?
  2024-04-09 12:42 [DistroKit] How to debug systemd services failing to start with 11/SEGV? Alexander Dahl
@ 2024-04-09 13:53 ` Lennart Poettering
  2024-04-09 14:21   ` Alexander Dahl
  2024-04-09 13:58 ` [DistroKit] " Ahmad Fatoum
  1 sibling, 1 reply; 7+ messages in thread
From: Lennart Poettering @ 2024-04-09 13:53 UTC (permalink / raw)
  To: systemd-devel, distrokit

On Di, 09.04.24 14:42, Alexander Dahl (ada@thorsis.com) wrote:

> Hello everyone,
>
> I'm currently trying to build a firmware for an embedded device and
> running into trouble because systemd seems to crash.  The BSP is
> based on pengutronix DistroKit (master) built with ptxdist and the
> target is the Microchip SAM9X60-Curiosity board, which is arm v5te
> architecture (that board is not part of DistroKit, support for that is
> in an upper layer of mine not public yet (?)).
>
> Everything is quite recent, building systemd version 255.2 currently.
> On startup I get messages like this (this is the first one, later on
> there are lot more, all with the same status):
>
>    [   11.175650] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
>    [   11.239679] systemd[1]: systemd-journald.service: Failed with result 'signal'.
>    [   11.292640] systemd[1]: Failed to start systemd-journald.service.
>    [FAILED] Failed to start systemd-journald.service.
>    See 'systemctl status systemd-journald.service' for details.
>
> The system drops me on a shell later, where I can run the above
> mentioned command, which gives:
>
>     ~ # systemctl status systemd-journald.service
>     x systemd-journald.service - Journal Service
>          Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static)
>          Active: failed (Result: signal) since Tue 2024-04-09 11:44:52 UTC; 11min a>
>     TriggeredBy: x systemd-journald-dev-log.socket
>                  * systemd-journald-audit.socket
>                  x systemd-journald.socket
>            Docs: man:systemd-journald.service(8)
>                  man:journald.conf(5)
>         Process: 197 ExecStart=/usr/lib/systemd/systemd-journald (code=killed, sign>
>        Main PID: 197 (code=killed, signal=SEGV)
>        FD Store: 0 (limit: 4224)
>             CPU: 330ms
>
> This does not help me much.  Other services crashing: systemd-udevd
> and systemd-timesyncd, also with status 11/SEGV which is segmentation
> fault, right?

Yes.

> I had this board running with an older version of systemd, but I can
> not remember which was the last good version.
>
> Could anyone give me a hint please how to debug this?

"coredumpctl gdb" should get open the most recent backtrace for you.

The coredump should also show up in the logs with a backtrace.

Lennart

--
Lennart Poettering, Berlin



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [DistroKit] How to debug systemd services failing to start with 11/SEGV?
  2024-04-09 12:42 [DistroKit] How to debug systemd services failing to start with 11/SEGV? Alexander Dahl
  2024-04-09 13:53 ` [DistroKit] [systemd-devel] " Lennart Poettering
@ 2024-04-09 13:58 ` Ahmad Fatoum
  1 sibling, 0 replies; 7+ messages in thread
From: Ahmad Fatoum @ 2024-04-09 13:58 UTC (permalink / raw)
  To: systemd-devel, distrokit

Hello Alexander,

On 09.04.24 14:42, Alexander Dahl wrote:
> I had this board running with an older version of systemd, but I can
> not remember which was the last good version.
> 
> Could anyone give me a hint please how to debug this?
> 
> (The same version runs fine btw. when building that BSP for an arm v7a
> target!)

I had similar issues with SAMA5D3, which despite being ARMv7-A has no NEON
support. An instruction set mismatch sounds likely in your case as well.

Can you generate a coredump and analyze it off-target to see which function
is crashing?

Cheers,
Ahmad

> 
> Greets
> Alex
> 
> 
> 

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [DistroKit] [systemd-devel] How to debug systemd services failing to start with 11/SEGV?
  2024-04-09 13:53 ` [DistroKit] [systemd-devel] " Lennart Poettering
@ 2024-04-09 14:21   ` Alexander Dahl
  2024-04-10 10:37     ` Alexander Dahl
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Dahl @ 2024-04-09 14:21 UTC (permalink / raw)
  To: Lennart Poettering; +Cc: distrokit, systemd-devel

Hello Lennart,

thanks for your quick reply, see below.

Am Tue, Apr 09, 2024 at 03:53:24PM +0200 schrieb Lennart Poettering:
> On Di, 09.04.24 14:42, Alexander Dahl (ada@thorsis.com) wrote:
> 
> > Hello everyone,
> >
> > I'm currently trying to build a firmware for an embedded device and
> > running into trouble because systemd seems to crash.  The BSP is
> > based on pengutronix DistroKit (master) built with ptxdist and the
> > target is the Microchip SAM9X60-Curiosity board, which is arm v5te
> > architecture (that board is not part of DistroKit, support for that is
> > in an upper layer of mine not public yet (?)).
> >
> > Everything is quite recent, building systemd version 255.2 currently.
> > On startup I get messages like this (this is the first one, later on
> > there are lot more, all with the same status):
> >
> >    [   11.175650] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
> >    [   11.239679] systemd[1]: systemd-journald.service: Failed with result 'signal'.
> >    [   11.292640] systemd[1]: Failed to start systemd-journald.service.
> >    [FAILED] Failed to start systemd-journald.service.
> >    See 'systemctl status systemd-journald.service' for details.
> >
> > The system drops me on a shell later, where I can run the above
> > mentioned command, which gives:
> >
> >     ~ # systemctl status systemd-journald.service
> >     x systemd-journald.service - Journal Service
> >          Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static)
> >          Active: failed (Result: signal) since Tue 2024-04-09 11:44:52 UTC; 11min a>
> >     TriggeredBy: x systemd-journald-dev-log.socket
> >                  * systemd-journald-audit.socket
> >                  x systemd-journald.socket
> >            Docs: man:systemd-journald.service(8)
> >                  man:journald.conf(5)
> >         Process: 197 ExecStart=/usr/lib/systemd/systemd-journald (code=killed, sign>
> >        Main PID: 197 (code=killed, signal=SEGV)
> >        FD Store: 0 (limit: 4224)
> >             CPU: 330ms
> >
> > This does not help me much.  Other services crashing: systemd-udevd
> > and systemd-timesyncd, also with status 11/SEGV which is segmentation
> > fault, right?
> 
> Yes.
> 
> > I had this board running with an older version of systemd, but I can
> > not remember which was the last good version.
> >
> > Could anyone give me a hint please how to debug this?
> 
> "coredumpctl gdb" should get open the most recent backtrace for you.

This gives:

    ~ # coredumpctl gdb
    No journal files were found.
    No match found.

> The coredump should also show up in the logs with a backtrace.

I only have serial console output.  journald is crashing.  With
`dmesg` I see systemd messages in kernel log, but no backtrace.
gdbserver is installed on target, no gdb currently.  Trying to get a
coredump tomorrow.

Greets
Alex




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [DistroKit] [systemd-devel] How to debug systemd services failing to start with 11/SEGV?
  2024-04-09 14:21   ` Alexander Dahl
@ 2024-04-10 10:37     ` Alexander Dahl
  2024-04-10 13:08       ` Alexander Dahl
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Dahl @ 2024-04-10 10:37 UTC (permalink / raw)
  To: Lennart Poettering, systemd-devel, distrokit

Hello everone,

I thought I knew how to let the kernel create coredumps … (see below).

Am Tue, Apr 09, 2024 at 04:21:21PM +0200 schrieb Alexander Dahl:
> Hello Lennart,
> 
> thanks for your quick reply, see below.
> 
> Am Tue, Apr 09, 2024 at 03:53:24PM +0200 schrieb Lennart Poettering:
> > On Di, 09.04.24 14:42, Alexander Dahl (ada@thorsis.com) wrote:
> > 
> > > Hello everyone,
> > >
> > > I'm currently trying to build a firmware for an embedded device and
> > > running into trouble because systemd seems to crash.  The BSP is
> > > based on pengutronix DistroKit (master) built with ptxdist and the
> > > target is the Microchip SAM9X60-Curiosity board, which is arm v5te
> > > architecture (that board is not part of DistroKit, support for that is
> > > in an upper layer of mine not public yet (?)).
> > >
> > > Everything is quite recent, building systemd version 255.2 currently.
> > > On startup I get messages like this (this is the first one, later on
> > > there are lot more, all with the same status):
> > >
> > >    [   11.175650] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
> > >    [   11.239679] systemd[1]: systemd-journald.service: Failed with result 'signal'.
> > >    [   11.292640] systemd[1]: Failed to start systemd-journald.service.
> > >    [FAILED] Failed to start systemd-journald.service.
> > >    See 'systemctl status systemd-journald.service' for details.
> > >
> > > The system drops me on a shell later, where I can run the above
> > > mentioned command, which gives:
> > >
> > >     ~ # systemctl status systemd-journald.service
> > >     x systemd-journald.service - Journal Service
> > >          Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static)
> > >          Active: failed (Result: signal) since Tue 2024-04-09 11:44:52 UTC; 11min a>
> > >     TriggeredBy: x systemd-journald-dev-log.socket
> > >                  * systemd-journald-audit.socket
> > >                  x systemd-journald.socket
> > >            Docs: man:systemd-journald.service(8)
> > >                  man:journald.conf(5)
> > >         Process: 197 ExecStart=/usr/lib/systemd/systemd-journald (code=killed, sign>
> > >        Main PID: 197 (code=killed, signal=SEGV)
> > >        FD Store: 0 (limit: 4224)
> > >             CPU: 330ms
> > >
> > > This does not help me much.  Other services crashing: systemd-udevd
> > > and systemd-timesyncd, also with status 11/SEGV which is segmentation
> > > fault, right?
> > 
> > Yes.
> > 
> > > I had this board running with an older version of systemd, but I can
> > > not remember which was the last good version.
> > >
> > > Could anyone give me a hint please how to debug this?
> > 
> > "coredumpctl gdb" should get open the most recent backtrace for you.
> 
> This gives:
> 
>     ~ # coredumpctl gdb
>     No journal files were found.
>     No match found.
> 
> > The coredump should also show up in the logs with a backtrace.
> 
> I only have serial console output.  journald is crashing.  With
> `dmesg` I see systemd messages in kernel log, but no backtrace.
> gdbserver is installed on target, no gdb currently.  Trying to get a
> coredump tomorrow.

Well I tried for like three hours now, and I could not get a coredump
from journald nor udevd.  Tried with the systemd way of creating
coredumps, which means /usr/lib/systemd/systemd-coredump is registered
as kernel.core_pattern with settings from
/usr/lib/sysctl.d/50-coredump.conf and I even made sure
systemd-coredump.socket is active (started from the maintenance
shell):

    ~ # systemctl status systemd-coredump.socket
    * systemd-coredump.socket - Process Core Dump Socket
         Loaded: loaded (/usr/lib/systemd/system/systemd-coredump.socket; static)
         Active: active (listening) since Tue 2024-04-09 11:46:21 UTC; 2min 44s ago
           Docs: man:systemd-coredump(8)
         Listen: /run/systemd/coredump (SequentialPacket)
       Accepted: 1; Connected: 0;
         CGroup: /system.slice/systemd-coredump.socket

Started a long running process like this:

    ~ # /usr/lib/systemd/systemd-udevd -d
    Starting systemd-udevd version 255.2

Killed it like this:

    ~ # kill -11 269

Got this on serial console (kernel log):

    [  329.282462] systemd[1]: Started systemd-coredump@1-271-0.service.
    [  329.454739] systemd[1]: Listening on systemd-journald-dev-log.socket.
    [  329.552429] systemd[1]: Starting systemd-journald.service...
    [  329.953204] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
    [  329.966458] systemd[1]: systemd-journald.service: Failed with result 'signal'.
    [  329.977845] systemd[1]: Failed to start systemd-journald.service.
    [  330.002231] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 1.
    [  330.092289] systemd[1]: Starting systemd-journald.service...
    [  330.513858] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
    [  330.533602] systemd[1]: systemd-journald.service: Failed with result 'signal'.
    [  330.544728] systemd[1]: Failed to start systemd-journald.service.
    [  330.570419] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 2.
    [  330.652355] systemd[1]: Starting systemd-journald.service...
    [  331.071540] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
    [  331.084789] systemd[1]: systemd-journald.service: Failed with result 'signal'.
    [  331.104092] systemd[1]: Failed to start systemd-journald.service.
    [  331.121208] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 3.
    [  331.212357] systemd[1]: Starting systemd-journald.service...
    [  331.648631] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
    [  331.662108] systemd[1]: systemd-journald.service: Failed with result 'signal'.
    [  331.673588] systemd[1]: Failed to start systemd-journald.service.
    [  331.702297] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 4.
    [  331.782346] systemd[1]: Starting systemd-journald.service...
    [  332.221658] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
    [  332.244381] systemd[1]: systemd-journald.service: Failed with result 'signal'.
    [  332.255494] systemd[1]: Failed to start systemd-journald.service.
    [  332.265870] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 5.
    [  332.291243] systemd[1]: systemd-journald.service: Start request repeated too quickly.
    [  332.299240] systemd[1]: systemd-journald.service: Failed with result 'signal'.
    [  332.308392] systemd[1]: Failed to start systemd-journald.service.
    [  332.315720] systemd[1]: systemd-journald.socket: Failed with result 'service-start-limit-hit'.
    [  332.335020] systemd[1]: systemd-journald-dev-log.socket: Failed with result 'service-start-limit-hit'.
    [  332.838500] systemd[1]: systemd-coredump@1-271-0.service: Deactivated successfully.
    [  332.851322] systemd[1]: systemd-coredump@1-271-0.service: Consumed 1.004s CPU time.

As you can see the systemd-coredump@1-271-0.service is started, and
systemd tries to start systemd-journald.service again and again, each
failing with segfault.  However after that folder
/var/lib/systemd/coredump/ is completely empty.

I tried with a different (simple) core pipe handler which works all
the time on a non systemd system:

    ~ # cat /usr/local/bin/pipecore.sh 
    #!/bin/sh
    /usr/bin/xz -z -0 - > /mnt/data/tmp/cores/core-${1}-s${2}-${3}.xz
   ~ # echo '|/usr/local/bin/pipecore.sh %e %s %t' > /proc/sys/kernel/core_pattern

For stuff not started with systemctl this successfully creates
coredumps in my folder.  Trying `systemctl start
systemd-journald.service` after that: nothing.  Something is
preventing coredumps of systemd itself, and I have no idea what or
why.  Read https://systemd.io/COREDUMP/ but it did not help me in this
case.

Will try to go back to a working previous version now, and maybe
bisect later if my frustration got back to a normal level.

Greets
Alex




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [DistroKit] [systemd-devel] How to debug systemd services failing to start with 11/SEGV?
  2024-04-10 10:37     ` Alexander Dahl
@ 2024-04-10 13:08       ` Alexander Dahl
       [not found]         ` <CAPWNY8Vk8zO_fLqiC0CbbPZmF8yk89R9pwR1UepJJ1XH9SDNFA@mail.gmail.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Dahl @ 2024-04-10 13:08 UTC (permalink / raw)
  To: Lennart Poettering, systemd-devel, distrokit; +Cc: Topi Miettinen

Hello,

gave it a try and bisected systemd … see below.

Am Wed, Apr 10, 2024 at 12:37:39PM +0200 schrieb Alexander Dahl:
> Hello everone,
> 
> I thought I knew how to let the kernel create coredumps … (see below).
> 
> Am Tue, Apr 09, 2024 at 04:21:21PM +0200 schrieb Alexander Dahl:
> > Hello Lennart,
> > 
> > thanks for your quick reply, see below.
> > 
> > Am Tue, Apr 09, 2024 at 03:53:24PM +0200 schrieb Lennart Poettering:
> > > On Di, 09.04.24 14:42, Alexander Dahl (ada@thorsis.com) wrote:
> > > 
> > > > Hello everyone,
> > > >
> > > > I'm currently trying to build a firmware for an embedded device and
> > > > running into trouble because systemd seems to crash.  The BSP is
> > > > based on pengutronix DistroKit (master) built with ptxdist and the
> > > > target is the Microchip SAM9X60-Curiosity board, which is arm v5te
> > > > architecture (that board is not part of DistroKit, support for that is
> > > > in an upper layer of mine not public yet (?)).
> > > >
> > > > Everything is quite recent, building systemd version 255.2 currently.
> > > > On startup I get messages like this (this is the first one, later on
> > > > there are lot more, all with the same status):
> > > >
> > > >    [   11.175650] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
> > > >    [   11.239679] systemd[1]: systemd-journald.service: Failed with result 'signal'.
> > > >    [   11.292640] systemd[1]: Failed to start systemd-journald.service.
> > > >    [FAILED] Failed to start systemd-journald.service.
> > > >    See 'systemctl status systemd-journald.service' for details.
> > > >
> > > > The system drops me on a shell later, where I can run the above
> > > > mentioned command, which gives:
> > > >
> > > >     ~ # systemctl status systemd-journald.service
> > > >     x systemd-journald.service - Journal Service
> > > >          Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static)
> > > >          Active: failed (Result: signal) since Tue 2024-04-09 11:44:52 UTC; 11min a>
> > > >     TriggeredBy: x systemd-journald-dev-log.socket
> > > >                  * systemd-journald-audit.socket
> > > >                  x systemd-journald.socket
> > > >            Docs: man:systemd-journald.service(8)
> > > >                  man:journald.conf(5)
> > > >         Process: 197 ExecStart=/usr/lib/systemd/systemd-journald (code=killed, sign>
> > > >        Main PID: 197 (code=killed, signal=SEGV)
> > > >        FD Store: 0 (limit: 4224)
> > > >             CPU: 330ms
> > > >
> > > > This does not help me much.  Other services crashing: systemd-udevd
> > > > and systemd-timesyncd, also with status 11/SEGV which is segmentation
> > > > fault, right?
> > > 
> > > Yes.
> > > 
> > > > I had this board running with an older version of systemd, but I can
> > > > not remember which was the last good version.
> > > >
> > > > Could anyone give me a hint please how to debug this?
> > > 
> > > "coredumpctl gdb" should get open the most recent backtrace for you.
> > 
> > This gives:
> > 
> >     ~ # coredumpctl gdb
> >     No journal files were found.
> >     No match found.
> > 
> > > The coredump should also show up in the logs with a backtrace.
> > 
> > I only have serial console output.  journald is crashing.  With
> > `dmesg` I see systemd messages in kernel log, but no backtrace.
> > gdbserver is installed on target, no gdb currently.  Trying to get a
> > coredump tomorrow.
> 
> Well I tried for like three hours now, and I could not get a coredump
> from journald nor udevd.  Tried with the systemd way of creating
> coredumps, which means /usr/lib/systemd/systemd-coredump is registered
> as kernel.core_pattern with settings from
> /usr/lib/sysctl.d/50-coredump.conf and I even made sure
> systemd-coredump.socket is active (started from the maintenance
> shell):
> 
>     ~ # systemctl status systemd-coredump.socket
>     * systemd-coredump.socket - Process Core Dump Socket
>          Loaded: loaded (/usr/lib/systemd/system/systemd-coredump.socket; static)
>          Active: active (listening) since Tue 2024-04-09 11:46:21 UTC; 2min 44s ago
>            Docs: man:systemd-coredump(8)
>          Listen: /run/systemd/coredump (SequentialPacket)
>        Accepted: 1; Connected: 0;
>          CGroup: /system.slice/systemd-coredump.socket
> 
> Started a long running process like this:
> 
>     ~ # /usr/lib/systemd/systemd-udevd -d
>     Starting systemd-udevd version 255.2
> 
> Killed it like this:
> 
>     ~ # kill -11 269
> 
> Got this on serial console (kernel log):
> 
>     [  329.282462] systemd[1]: Started systemd-coredump@1-271-0.service.
>     [  329.454739] systemd[1]: Listening on systemd-journald-dev-log.socket.
>     [  329.552429] systemd[1]: Starting systemd-journald.service...
>     [  329.953204] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
>     [  329.966458] systemd[1]: systemd-journald.service: Failed with result 'signal'.
>     [  329.977845] systemd[1]: Failed to start systemd-journald.service.
>     [  330.002231] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 1.
>     [  330.092289] systemd[1]: Starting systemd-journald.service...
>     [  330.513858] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
>     [  330.533602] systemd[1]: systemd-journald.service: Failed with result 'signal'.
>     [  330.544728] systemd[1]: Failed to start systemd-journald.service.
>     [  330.570419] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 2.
>     [  330.652355] systemd[1]: Starting systemd-journald.service...
>     [  331.071540] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
>     [  331.084789] systemd[1]: systemd-journald.service: Failed with result 'signal'.
>     [  331.104092] systemd[1]: Failed to start systemd-journald.service.
>     [  331.121208] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 3.
>     [  331.212357] systemd[1]: Starting systemd-journald.service...
>     [  331.648631] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
>     [  331.662108] systemd[1]: systemd-journald.service: Failed with result 'signal'.
>     [  331.673588] systemd[1]: Failed to start systemd-journald.service.
>     [  331.702297] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 4.
>     [  331.782346] systemd[1]: Starting systemd-journald.service...
>     [  332.221658] systemd[1]: systemd-journald.service: Main process exited, code=killed, status=11/SEGV
>     [  332.244381] systemd[1]: systemd-journald.service: Failed with result 'signal'.
>     [  332.255494] systemd[1]: Failed to start systemd-journald.service.
>     [  332.265870] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 5.
>     [  332.291243] systemd[1]: systemd-journald.service: Start request repeated too quickly.
>     [  332.299240] systemd[1]: systemd-journald.service: Failed with result 'signal'.
>     [  332.308392] systemd[1]: Failed to start systemd-journald.service.
>     [  332.315720] systemd[1]: systemd-journald.socket: Failed with result 'service-start-limit-hit'.
>     [  332.335020] systemd[1]: systemd-journald-dev-log.socket: Failed with result 'service-start-limit-hit'.
>     [  332.838500] systemd[1]: systemd-coredump@1-271-0.service: Deactivated successfully.
>     [  332.851322] systemd[1]: systemd-coredump@1-271-0.service: Consumed 1.004s CPU time.
> 
> As you can see the systemd-coredump@1-271-0.service is started, and
> systemd tries to start systemd-journald.service again and again, each
> failing with segfault.  However after that folder
> /var/lib/systemd/coredump/ is completely empty.
> 
> I tried with a different (simple) core pipe handler which works all
> the time on a non systemd system:
> 
>     ~ # cat /usr/local/bin/pipecore.sh 
>     #!/bin/sh
>     /usr/bin/xz -z -0 - > /mnt/data/tmp/cores/core-${1}-s${2}-${3}.xz
>    ~ # echo '|/usr/local/bin/pipecore.sh %e %s %t' > /proc/sys/kernel/core_pattern
> 
> For stuff not started with systemctl this successfully creates
> coredumps in my folder.  Trying `systemctl start
> systemd-journald.service` after that: nothing.  Something is
> preventing coredumps of systemd itself, and I have no idea what or
> why.  Read https://systemd.io/COREDUMP/ but it did not help me in this
> case.
> 
> Will try to go back to a working previous version now, and maybe
> bisect later if my frustration got back to a normal level.

Well, I still have no idea on that one, but bisecting systemd spit out
this changeset, after that things were segfaulting:

    7a114ed4b39e9670f6a511f3eecb6fd58274d27b is the first bad commit
    commit 7a114ed4b39e9670f6a511f3eecb6fd58274d27b
    Author: Topi Miettinen <toiwoton@gmail.com>
    Date:   Sun Nov 6 21:12:45 2022 +0200
    
        execute: use prctl(PR_SET_MDWE) for MemoryDenyWriteExecute=yes
        
        On some ARM platforms, the dynamic linker could use PROT_BTI memory protection
        flag with `mprotect(..., PROT_BTI | PROT_EXEC)` to enable additional memory
        protection for executable pages. But `MemoryDenyWriteExecute=yes` blocks this
        with seccomp filter denying all `mprotect(..., x | PROT_EXEC)`.
        
        Newly preferred method is to use prctl(PR_SET_MDWE) on supported kernels. Then
        in-kernel implementation can allow PROT_BTI as necessary, without weakening
        MDWE. In-kernel version may also be extended to more sophisticated protections
        in the future.
    
     man/systemd.exec.xml      |  8 +++++---
     src/basic/missing_prctl.h |  8 ++++++++
     src/core/execute.c        | 14 ++++++++++++++
     3 files changed, 27 insertions(+), 3 deletions(-)

Had a look at that changeset, and I can not see anything which lets
systemd crash in that function.  With debug logging on, I see this in
dmesg output:

    [  +0.004439] (journald)[90]: systemd-journald.service: Enabled MemoryDenyWriteExecute= with PR_SET_MDWE

Looks like the success return path of the changed function.  Maybe
that setting now leads to crashing as a consequence of the enabled
memory protection?

Note: platform here is 32 bit arm, namely v5te on Microchip SAM9X60
SoC.  Kernel is 6.6, maybe I did not get the kernelconfig right and
some options are not set correctly?  Or maybe those crashes are real?
Then I could need some help how to _really_ enable coredumps for
journald, udevd, and timesyncd.  Got a hint off-list to pass
'systemd.dump_core=true' to kernel cmdline, but that had no effect on
coredump creation.

Greets
Alex




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [DistroKit] [systemd-devel] How to debug systemd services failing to start with 11/SEGV?
       [not found]         ` <CAPWNY8Vk8zO_fLqiC0CbbPZmF8yk89R9pwR1UepJJ1XH9SDNFA@mail.gmail.com>
@ 2024-04-10 13:49           ` Alexander Dahl
  0 siblings, 0 replies; 7+ messages in thread
From: Alexander Dahl @ 2024-04-10 13:49 UTC (permalink / raw)
  To: Mantas Mikulėnas; +Cc: systemd-devel, distrokit, Topi Miettinen

Hello Mantas,

Am Wed, Apr 10, 2024 at 04:45:58PM +0300 schrieb Mantas Mikulėnas:
> On Wed, Apr 10, 2024 at 4:08 PM Alexander Dahl <ada@thorsis.com> wrote:
> 
> > Note: platform here is 32 bit arm, namely v5te on Microchip SAM9X60
> > SoC.  Kernel is 6.6, maybe I did not get the kernelconfig right and
> > some options are not set correctly?  Or maybe those crashes are real?
> > Then I could need some help how to _really_ enable coredumps for
> > journald, udevd, and timesyncd.  Got a hint off-list to pass
> > 'systemd.dump_core=true' to kernel cmdline, but that had no effect on
> > coredump creation.
> >
> 
> I would just set kernel.core_pattern to a *file* path, e.g.
> "/var/log/core.%P". Then use the shell's ulimit command to raise the
> coredump size limit as it defaults to zero (ulimit -c unlimited), and
> manually start /usr/lib/systemd/systemd-timesyncd from the shell (timesyncd
> is the simplest one and doesn't do anything system-critical).

Tried that already.  The service does not crash in that case.

As a workaround for now I added this to
/etc/systemd/system/systemd-journald.service.d/override.conf but I'm
pretty sure it just hides the underlying issue?

    [Service]
    MemoryDenyWriteExecute=no

Greets
Alex

> 
> Alternatively, run the service under the debugger: `gdb /usr/.../timesyncd`.
> 
> -- 
> Mantas Mikulėnas



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-04-10 13:50 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-09 12:42 [DistroKit] How to debug systemd services failing to start with 11/SEGV? Alexander Dahl
2024-04-09 13:53 ` [DistroKit] [systemd-devel] " Lennart Poettering
2024-04-09 14:21   ` Alexander Dahl
2024-04-10 10:37     ` Alexander Dahl
2024-04-10 13:08       ` Alexander Dahl
     [not found]         ` <CAPWNY8Vk8zO_fLqiC0CbbPZmF8yk89R9pwR1UepJJ1XH9SDNFA@mail.gmail.com>
2024-04-10 13:49           ` Alexander Dahl
2024-04-09 13:58 ` [DistroKit] " Ahmad Fatoum

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox