RTEMS / Tools / RTEMS Tools

Merge Requests Summary


Issues

22 - RTEMS tester fails to run correctly under python 3.13 (opened)

Id

22

State

opened

Type

ISSUE

Author

Kinsey Moore

Assignee(s)

Kinsey Moore

Created

2025-02-28T22:39:31.754Z

Updated

2025-03-02T02:43:13.317Z

Milestone

6.2

Labels

tool::test

Link

https://gitlab.rtems.org/rtems/tools/rtems-tools/-/issues/22

Merges

0

Summary

When attempting to run the RTEMS tester with Python 3.13, these error messages are produced:

.../rtems-tools/tester/rt/test.py:76: SyntaxWarning: invalid escape sequence '\['
status_regx = re.compile('^\[\s*\d+/\s*\d+\] p:.+')
.../rtems-tools/tester/rt/test.py:188: SyntaxWarning: invalid escape sequence '\.'
norun = re.compile('.*\.norun.*')
Incorrect RTEMS Tools installation

This was tested on the 6.1 branch, but there appear to be no changes to main that would have fixed this.

Author: Kinsey Moore

2025-03-02T02:21:55.776Z

assigned to @opticron

Author: Kinsey Moore

2025-02-28T22:39:31.957Z

assigned to @opticron

Author: Kinsey Moore

2025-02-28T22:39:32.416Z

Author: Kinsey Moore

2025-03-02T02:13:13.434Z

On further investigation, these are two separate issues. The escapes need to be fixed to resolve the warnings, and the actual exit is caused by the missing telnetlib module, which was deprecated in 3.11 and removed in 3.13.
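
A minimal sketch of both fixes, assuming the two patterns quoted in the warnings are otherwise unchanged. Raw string literals pass the backslashes straight to `re`, and a guarded import lets the caller report the missing module instead of exiting at import time. The `telnetlib = None` fallback is an illustrative assumption, not the change made in rtems-tools.

```python
import re

# Raw strings (r'...') stop Python from interpreting '\[' and '\.' itself,
# so the regex engine receives them unchanged and no SyntaxWarning is raised.
status_regx = re.compile(r'^\[\s*\d+/\s*\d+\] p:.+')
norun = re.compile(r'.*\.norun.*')

# telnetlib was removed from the standard library in Python 3.13.
# Assumption for illustration: fall back to None so that callers can report
# a clear error only when a telnet console is actually requested.
try:
    import telnetlib
except ImportError:
    telnetlib = None
```

The compiled patterns behave the same as before on Python versions that only warned. Only the string literals change.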

Author: Kinsey Moore

2025-03-02T02:13:13.403Z

On Debian systems, the exit issue can currently be worked around by installing python3-zombie-telnetlib.

Author: Amar Takhar

2025-03-02T02:28:45.323Z

The current state of telnetlib isn’t great; I don’t think there are any maintained ones. This one is its own project, but there has been no activity from the maintainer in years: https://github.com/jquast/telnetlib3

The Python one was just removed. Nobody is touching the code anymore, which is why they called it ‘zombie’. An apt name.

telnetlib3 is at least installable via pypi. I’m not sure if it’s a drop-in replacement. Does it work? Can you try it out?

FreeBSD, for example, has no telnet library for Python available, and I’m sure that’s true for most OSes. The Python project itself recommends either telnetlib3 or Exscript, which is even less maintained.

Author: Kinsey Moore

2025-03-02T02:34:43.202Z

telnetlib3 appears to be a mostly drop-in replacement (s/telnetlib/telnetlib3/g), but I don’t have an example of a tester configuration that actually uses it, so I can’t verify that functionality completely.

Author: Chris Johns

2025-03-02T02:37:05.927Z

https://docs.rtems.org/docs/main/user/testing/configuration.html#console and look for:

%define bsp_tty_dev      1.2.3.4:8989

Author: Kinsey Moore

2025-03-02T02:43:13.300Z

Ah, I do actually have a test configuration that I can verify this on. It’ll probably have to wait until Monday.

Author: Kinsey Moore

2025-03-02T02:21:56.083Z

moved from rtems-source-builder#95

Author: Kinsey Moore

2025-03-02T02:22:42.566Z

mentioned in merge request !53

15 - rtems-test: Leaks pipes when using a gdb based simulator (opened)

Id

15

State

opened

Type

ISSUE

Author

Joel Sherrill

Created

2024-10-15T21:28:33.821Z

Updated

2024-12-04T21:40:36.736Z

Milestone

6.2

Link

https://gitlab.rtems.org/rtems/tools/rtems-tools/-/issues/15

Merges

0

Summary

When used with a gdb based simulator like psim, rtems-test uses multiple file descriptors – one for gdb.cfg and multiple for pipes. On a server with 56 cores, I encountered the following failure:

[572/675] p:491 f:5   u:6   e:22  I:0   B:3   t:0   L:0   i:0   W:0   | powerpc/psim: spsem_err02.exe
[564/675] p:491 f:5   u:6   e:22  I:0   B:3   t:0   L:0   i:0   W:0   | powerpc/psim: spregion_err01.exe
error: error opening config file: /home/joel/rtems-cron-6.1-rc4/tools/6/share/rtems/tester/rtems/version.cfg

The error opening the config file occurs because the rtems-test process has exceeded its allowed maximum number of file descriptors. The following snippet is from watching lsof with the pid for rtems-test. We can see that gdb.cfg is open 52 times, which seems reasonable given a 56-core machine.

I did the rtems-test run in one terminal and the lsof commands in another.

$ lsof -p 1232936 | grep gdb.cfg | wc -l
52

But the process has an increasing number of pipes open as the tests continue to execute.

[joel@gitlab ~]$ lsof -p 1232936 | grep pipe | wc -l
776
[joel@gitlab ~]$ lsof -p 1232936 | grep pipe | wc -l
819
[joel@gitlab ~]$ lsof -p 1232936 | grep pipe | wc -l
842
[joel@gitlab ~]$ lsof -p 1232936 | grep pipe | wc -l
943
[joel@gitlab ~]$ lsof -p 1232936 | grep pipe | wc -l
951

At ~55/675 tests, there are ~165 pipes in the lsof. At ~160/675 tests, there are ~300 pipes in the lsof. It continues to increase.

All go away when the rtems-test fails and exits.

The number of file descriptors open for rtems-test does NOT increase when using SIS.
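
The `lsof | grep pipe | wc -l` sampling above can also be scripted. This is a rough, Linux-only sketch that assumes `/proc/<pid>/fd` is readable for the target process. `count_pipe_fds` is a hypothetical helper name, not part of rtems-tools, and its count may differ slightly from the lsof output.

```python
import os

def count_pipe_fds(pid):
    """Count descriptors of process `pid` that refer to pipes (Linux only)."""
    fd_dir = '/proc/%d/fd' % pid
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        # Pipe descriptors appear as symlinks of the form 'pipe:[inode]'.
        if target.startswith('pipe:'):
            count += 1
    return count
```

Calling it in a loop with the rtems-test pid while the tests run would show whether the count tracks the number of jobs or keeps growing.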

Steps to reproduce

You should be able to see the number of pipe file descriptors increase on any multi-core machine when using a gdb based simulator. There must be some mechanism that normally reclaims them, and it appears not to be running in the higher core count environment.

Author: Joel Sherrill

2024-10-15T21:29:23.567Z

changed the description

Author: Kinsey Moore

2024-10-29T01:26:47.904Z

Interesting, I can’t actually reproduce this on my 8c16t machine when running through all the sptests. It sits at exactly 3x16 pipes and varies a bit around 1x16 gdb.cfg descriptors. I’ll have to go try on the actual machine that Joel was using.

Author: Amar Takhar

2024-10-29T01:26:47.875Z

What OS? We should probably start collecting platform details to see what we’re testing on.

Author: Kinsey Moore

2024-10-29T01:55:57.260Z

The OS I just tested on is Debian Bookworm (headless). The machine that demonstrates the issue is Rocky 9.4 (not headless). I was able to cause something to happen on the 8c16t system by setting --jobs=52, which caused the numbers to fluctuate up and down with a peak around 300 pipes. That is twice the number that should be visible if things are getting cleaned up in a timely manner. The Rocky machine is also running gitlab, so that may also have an effect.

Author: Amar Takhar

2024-10-29T02:07:31.042Z

Have you taken a look to see how busy the disk is? Is the machine heavily CPU bound? It may be calling for them to be cleaned up but be too busy to do it. No idea, just throwing ideas out. Memory/swapping may be an issue too.

Author: Kinsey Moore

2024-10-29T02:39:38.221Z

I found a recommendation that when calling proc.kill(), you should follow it with proc.communicate() to ensure that all I/O is finalized and culled. This seems to keep the pipe count to almost exactly 3x jobs, but exposes other problems on my server. The 28c56t machine is not under any load other than gitlab, but seems to be less performant than my server. AArch64 toolchain build times are:

* 8c16t (personal server): 13.75m
* 28c56t: 22m
* 8c16t VM (virtualbox on a laptop): 16m

Something is definitely going on on that machine.
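
The kill-then-communicate recommendation mentioned in this comment matches the pattern shown in the Python `subprocess` documentation. Below is a generic sketch, assuming a plain `subprocess.Popen` child. `run_with_timeout` is a hypothetical helper, and this is not the code in rtemstoolkit/execute.py.

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout):
    """Run cmd; on timeout kill it, then drain and close its pipes."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        # communicate() after kill() reads to EOF, closes the pipe
        # descriptors, and reaps the child.
        out, err = proc.communicate()
    return proc, out, err

if __name__ == '__main__':
    # Illustrative child that would outlive the timeout if not killed.
    child = [sys.executable, '-c', 'import time; time.sleep(30)']
    proc, out, err = run_with_timeout(child, timeout=0.5)
    print(proc.returncode, proc.stdout.closed, proc.stderr.closed)
```

Without the second `communicate()` call, the parent keeps its ends of the pipes open until the `Popen` object is garbage collected, which is one way pipe descriptors could accumulate.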

Once I fixed the proc.communicate() issue, other issues started popping up on my server when the tester is coerced to run at 52 jobs:

[232/675] p:152 f:4   u:3   e:21  I:0   B:3   t:0   L:0   i:0   W:0   | powerpc/psim: psxcancel01.exe
error: gdb.cfg:54: macro '%{rtems_version}' not found
error: gdb.cfg:54: macro '%{rtems_version}' not found
warning: switched to dry run due to errors
error: gdb.cfg:54: macro '%{rtems_version}' not found
error: gdb.cfg:54: macro '%{rtems_version}' not found
error: gdb.cfg:54: macro '%{rtems_version}' not found
error: gdb.cfg:54: macro '%{rtems_version}' not found
error: gdb.cfg:60: macro '%{rtems_version}' not found
error: gdb.cfg:60: macro '%{rtems_version}' not found
error: gdb.cfg:60: macro '%{rtems_version}' not found
error: gdb.cfg:60: macro '%{rtems_version}' not found
[233/675] p:153 f:4   u:3   e:21  I:0   B:3   t:0   L:0   i:0   W:0   | powerpc/psim: psxchroot01.exe
...
[249/675] p:170 f:4   u:4   e:21  I:0   B:3   t:0   L:0   i:1   W:0   | powerpc/psim: psxfatal01.exe
error: config error: gdb.cfg:36: No 'target' defined
[253/675] p:170 f:4   u:4   e:21  I:0   B:3   t:0   L:0   i:1   W:0   | powerpc/psim: psxfile01.exe
[251/675] p:170 f:4   u:4   e:21  I:0   B:3   t:0   L:0   i:1   W:0   | powerpc/psim: psxfchx01.exe

Author: Amar Takhar

2024-10-29T02:46:50.979Z

Interesting, what area of the code did you put the communicate() in? Are we using popen to open the process to send data? You’re right that you typically do want to call communicate(), but I wonder if something is getting overwritten or we don’t have timeouts set.

Author: Chris Johns

2024-10-29T02:58:49.155Z

Is the macro not found error new?

Author: Kinsey Moore

2024-10-29T03:04:21.617Z

It doesn’t show up if I run at 16 jobs (with or without the patch).

Author: Kinsey Moore

2024-10-29T03:09:54.736Z

Running at 52 jobs without the proc.communicate() patch shows different errors and backtraces such as:

Traceback (most recent call last):
AttributeError: 'NoneType' object has no attribute 'target_end'

AttributeError: 'NoneType' object has no attribute 'target_end'

During handling of the above exception, another exception occurred:

File "/usr/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
File "/media/32b58c68-8211-4697-9049-c4881aa9125e/rtems-dev/rtems-tools/tester/rt/config.py", line 487, in capture
Traceback (most recent call last):
self.run()
File "/usr/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
During handling of the above exception, another exception occurred:
File "/usr/lib/python3.11/threading.py", line 975, in run
self.process.target_end()

^^^^^^^^^^^^^^^^^^^^^^^
self.run()
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.11/threading.py", line 975, in run
File "/media/32b58c68-8211-4697-9049-c4881aa9125e/rtems-dev/rtems-tools/rtemstoolkit/execute.py", line 256, in _readthread
AttributeError: 'NoneType' object has no attribute 'target_end'

Author: Chris Johns

2024-10-29T05:01:32.044Z

The reader is stuck?

Author: Gedare Bloom

2024-12-04T21:40:37.719Z

Merge Requests