Self-Hosted Version
25.11.1
CPU Architecture
x86_64
Docker Version
25.0.3
Docker Compose Version
2.32.3
Machine Specification
Installation Type
Upgrade from 25.8.0 to 25.11.1
Steps to Reproduce
Environment
- Self-hosted version: 25.11.1
- Upgraded from: 25.8.0
- Host OS: Linux x86_64
- Available RAM: ~12 GiB free of 48 GiB total
- Shared memory (/dev/shm): 22 GiB available, <1 MB in use
- Disk: 1.3 TB available
- Swap: within expected system requirements (16 GB)
Description
After upgrading from 25.8.0 to 25.11.1 (with a data migration to SeaweedFS), the process-segments consumer enters a continuous crash-restart loop and never reaches a stable running state. All other containers are healthy and the Sentry application itself is functional. Only process-segments is affected.
The container starts, initializes its multiprocessing pool (including running parallel_worker_initializer), begins consuming from the buffered-segments Kafka topic, and then crashes approximately 20 to 30 seconds into processing every time, without exception.
Observed behaviour
The crash cycle follows this consistent pattern:
- Container starts, multiprocessing pool initializes successfully
- Consumer is assigned Partition(topic=Topic(name='buffered-segments'), index=0)
- Worker begins processing, then emits one or more incomplete batch warnings
- A child process terminates and the parent receives SIGCHLD (signal 17)
- Parent process crashes with ChildProcessTerminated: 17
- Container restarts and the cycle repeats
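For readers unfamiliar with the number in ChildProcessTerminated: 17, here is a minimal sketch that maps a signal number to its symbolic name. The helper name describe_signal is hypothetical and not part of arroyo. Signal numbers are platform-dependent; on Linux x86_64, 17 is SIGCHLD, which only tells the parent that some child changed state. It does not say what killed the child.

```python
import signal


def describe_signal(signum: int) -> str:
    """Return the symbolic name of a signal number on the current platform.

    Hypothetical helper for illustration; not part of arroyo or Sentry.
    """
    try:
        return signal.Signals(signum).name
    except ValueError:
        # The number does not correspond to a signal on this platform.
        return f"unknown signal {signum}"


if __name__ == "__main__":
    # Look up the platform's own SIGCHLD value rather than hard-coding 17,
    # because the number differs between operating systems.
    print(int(signal.SIGCHLD), describe_signal(int(signal.SIGCHLD)))
```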
Error output
WARNING arroyo.processing.strategies.run_task_with_multiprocessing: Received incomplete batch (57.00% complete), resubmitting
Traceback (most recent call last):
File ".../run_task_with_multiprocessing.py", line 860, in __reset_batch_builder
input_block = self.__input_blocks.pop()
IndexError: pop from empty list
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File ".../processor.py", line 440, in _run_once
self.__processing_strategy.submit(message)
File ".../healthcheck.py", line 29, in submit
self.__next_step.submit(message)
File ".../run_task_with_multiprocessing.py", line 879, in submit
self.__reset_batch_builder()
File ".../run_task_with_multiprocessing.py", line 862, in __reset_batch_builder
raise MessageRejected("no available input blocks") from e
arroyo.processing.strategies.abstract.MessageRejected: no available input blocks
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
File ".../kafka.py", line 67, in run_processor_with_signals
processor.run()
...
raise ChildProcessTerminated(signum)
arroyo.processing.strategies.run_task_with_multiprocessing.ChildProcessTerminated: 17
ERROR arroyo.processing.processor: Caught exception, shutting down...
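The first two tracebacks are linked by "The above exception was the direct cause of the following exception", which is what Python prints for raise ... from e. The toy below reproduces only that chaining pattern, assuming a pool of input blocks that has run dry. ToyBlockPool and the local MessageRejected class are stand-ins written for this sketch; they are not arroyo's implementation.

```python
class MessageRejected(Exception):
    """Local stand-in for the exception named in the traceback (not imported from arroyo)."""


class ToyBlockPool:
    """Hypothetical pool of reusable input blocks, for illustration only."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    def take(self):
        try:
            return self._blocks.pop()
        except IndexError as e:
            # Chain the low-level error so the traceback shows both exceptions.
            raise MessageRejected("no available input blocks") from e


pool = ToyBlockPool(["block-0"])
pool.take()  # succeeds and empties the pool

caught = None
try:
    pool.take()  # nothing left, so the chained exception is raised
except MessageRejected as exc:
    caught = exc

print(type(caught.__cause__).__name__, "->", caught)
```

In this toy, the pool is empty because every block has been handed out and none returned. That matches the situation the traceback describes, but it does not explain why the blocks were not returned in the real consumer.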
Investigation
We investigated the following potential causes:
- Memory pressure: Ruled out. 12 GiB RAM available, /dev/shm has 22 GiB available with negligible usage. No OOM entries in dmesg.
- Shared memory exhaustion: Ruled out. df -h /dev/shm confirms ample space and only a single unrelated semaphore present.
- Kafka connectivity: Ruled out. Consumer successfully connects, receives partition assignment, and begins consuming before crashing.
- Stale or corrupt topic data: Ruled out. In addition to recreating the consumer groups, we also deleted and recreated all Kafka topics related to this consumer, including buffered-segments itself, to ensure no stale messages, corrupt data, or leftover topic configuration was contributing to the crash. The issue persists on a completely fresh topic with no existing messages.
The error originates in arroyo's run_task_with_multiprocessing.py, where the parent process attempts to reset the batch builder after the child process has died, finds no available input blocks, and crashes. The root cause appears to be the child worker process terminating silently before the parent can recover; the parent only sees SIGCHLD. However, the child's own stderr output is not surfaced in the Docker logs, so we cannot see why the child exits.
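Since SIGCHLD alone does not say how the child died, the child's exit status is the more useful piece of information. The sketch below shows how Python reports it on POSIX: a negative return code -N means the child was killed by signal N. This is a standalone illustration using subprocess, with a hypothetical child that kills itself with SIGKILL to stand in for a worker killed externally. It is not a patch for arroyo, and describe_exit is a name made up for this example.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Describe a child's exit status as reported by subprocess on POSIX."""
    if returncode < 0:
        # A negative value -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


# Hypothetical child process that sends SIGKILL to itself.
child = subprocess.run(
    [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
)
print(describe_exit(child.returncode))
```

The exitcode attribute of multiprocessing.Process follows the same negative-number convention, so a SIGKILL or SIGSEGV on the worker would be distinguishable from a normal exit if that value were logged.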
Steps to reproduce
- Run a healthy self-hosted Sentry 25.8.0 instance
- Upgrade to 25.11.1 following the standard upgrade procedure
- Observe the process-segments container entering a crash-restart loop
What we have tried
- Restarting the full application
- Re-running the full ./install.sh / docker compose up sequence
- Verifying resource availability (RAM, shm, disk, swap)
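For anyone repeating the resource check, here is a small sketch that reports free space for a mount point using only the Python standard library. The function name summarize_space and the list of paths are choices made for this example. A container has its own /dev/shm mount, so the numbers can differ from the host's; the sketch is meant to be run inside whichever environment you want to inspect.

```python
import os
import shutil

GIB = 2**30


def summarize_space(path: str) -> dict:
    """Return total and free space (GiB) and the used percentage for a mount point."""
    usage = shutil.disk_usage(path)
    # Guard against pseudo-filesystems that report a total size of zero.
    used_pct = (usage.used / usage.total * 100) if usage.total else 0.0
    return {
        "path": path,
        "total_gib": round(usage.total / GIB, 2),
        "free_gib": round(usage.free / GIB, 2),
        "used_pct": round(used_pct, 1),
    }


if __name__ == "__main__":
    # Paths chosen for illustration; /dev/shm may be absent on non-Linux systems.
    for candidate in ("/dev/shm", "/"):
        if os.path.exists(candidate):
            print(summarize_space(candidate))
```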
Full logs
logs.txt
Expected Result
All containers, including process-segments, should be healthy after the upgrade and the application should be fully operational. Currently the application is online, but because of the crash loop on process-segments it is not fully operational.
Actual Result
See the attached full logs in section 1.
Event ID
No response