path: root/bitbake/lib/bb/server
Commit message | Author | Age | Files | Lines
* bitbake: process/server: Fix typo | Richard Purdie | 2024-02-10 | 1 | -1/+1
    Ensure the message matches the filenames the code actually uses.
    (Bitbake rev: deb7db2e2b125c6a6732db4f185f4de5926494fd)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: process: Add profile logging for main loop | Richard Purdie | 2024-02-10 | 1 | -0/+16
    When the idle/main loop was added, we didn't include profiling information for it. There is a performance issue in there, so add logging for it.
    (Bitbake rev: d8d5cd43a60560f67e86f4f625113b0f73b944c0)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: catch and expand multiprocessing connection exceptions | Mark Asselstine | 2024-01-10 | 1 | -2/+8
    When doing builds on systems with limited resources, or with high-demand package builds such as chromium, it isn't uncommon for the OOM Killer to be triggered and for bitbake-server to be selected as the process to be killed. When the bitbake-server does terminate unexpectedly, due to the OOM Killer or otherwise, this currently results in a generic Python traceback with little indication as to what has failed.
    Here we trap and re-raise the exceptions while extending the exception text in runCommand() to make it clear that this is most likely caused by the bitbake-server unexpectedly terminating. Callers of runCommand() should be updated to properly handle the BrokenPipeError and EOFError exceptions to avoid printing a Python traceback, but even if they don't, the added text in the exceptions should provide some hints as to what might have caused the failure.
    (Bitbake rev: 5ff62b802f79acc86bbd6a99484f08501ff5dc2d)
    Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
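A minimal sketch of the trap-and-extend pattern this commit describes. The helper name `send_and_receive` and the hint wording are illustrative assumptions, not bitbake's actual runCommand() code; `conn` is assumed to behave like a `multiprocessing.connection.Connection`.

```python
# Hypothetical hint text; bitbake's real message is not reproduced here.
HINT = "the server may have terminated unexpectedly (for example, via the OOM killer)"


def send_and_receive(conn, command):
    """Send a command and wait for a reply, adding context to connection errors.

    `conn` is any object with send()/recv() that raises BrokenPipeError or
    EOFError when the peer has gone away, as multiprocessing connections do.
    """
    try:
        conn.send(command)
    except BrokenPipeError as exc:
        # Re-raise the same exception type so existing handlers still match.
        raise BrokenPipeError("Unable to send %r: %s" % (command, HINT)) from exc
    try:
        return conn.recv()
    except EOFError as exc:
        raise EOFError("No reply to %r: %s" % (command, HINT)) from exc
```

Re-raising the same exception type keeps callers that already catch BrokenPipeError or EOFError working, while callers that do not catch them at least print the extra hint.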
* bitbake: bitbake/lib: spawn server/worker using the current Python interpreter | Ross Burton | 2023-09-26 | 1 | -1/+1
    The user may have invoked ./bin/bitbake using a different Python interpreter than whatever python3 is on $PATH (for example, explicitly using a different version). However, as the server and workers are spawned directly they'll use the hashbang and thus a different Python.
    We also ensure that argv[0] is set to sys.executable instead of 'bitbake-server' or 'bitbake-worker', so that sys.executable is set to the right value inside the child. Without this the server won't be able to start any workers.
    (Bitbake rev: b44d5d2a53d3082c8ce94e09c0cf833e33e25aec)
    Signed-off-by: Ross Burton <ross.burton@arm.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
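The idea of launching children with the parent's interpreter rather than relying on a hashbang can be sketched as follows. This is an illustration using `subprocess`, not the spawn code bitbake uses; `run_with_current_python` is a hypothetical helper.

```python
import subprocess
import sys


def run_with_current_python(code):
    """Run `code` in a child started with the parent's own interpreter.

    Passing sys.executable explicitly means the child does not depend on
    whichever python3 happens to be first on $PATH.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

For example, `run_with_current_python("import sys; print(sys.version_info[:2])")` reports the same version as the parent process.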
* bitbake: server/process: Disable the flush() call in server logging | Richard Purdie | 2023-09-18 | 1 | -1/+2
    We've been chasing bitbake timeouts for a while and it was unclear where things were blocking on IO. It appears the flush() call in server logging can cause pauses up to minutes long on systems with slow (spinning) disks that are heavily loaded with IO.
    Since the flush() was added to aid debugging of other timing issues, we shouldn't need it now and it can be disabled. Leave a comment as a reminder of the pain this can cause.
    (Bitbake rev: afbc169e1490a86d6250969f780062c426eb4682)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: lib: Drop inotify support and replace with mtime checks | Richard Purdie | 2023-09-18 | 1 | -6/+0
    With the flush in serverlog() removed and a memory resident bitbake with a 60s timeout, the following could fail in strange ways:
        rm bitbake-cookerdaemon.log
        bitbake-layers add-layer ../meta-virtualization/
        bitbake-layers add-layer ../meta-openembedded/meta-oe/
        bitbake -m
    specifically that it might error adding meta-oe with an error related to meta-virt. This clearly shows that whilst bblayers.conf was modified, bitbake was not recognising that. This would fit with the random autobuilder issues seen when the serverlog flush() call was removed.
    The issue appears to be that you have no way to "sync()" the inotify events with the command stream coming over the socket. There is no way to know if there are changes in the IO queue which bitbake needs to wait for before proceeding with the next command. I did experiment with os.sync() and fsync on the inotify fd, however nothing addressed the issue.
    Since it is extremely important we have accurate cache data, the only realistic thing to do is to switch to stat() calls and check mtime. For bitbake commands, this is straightforward since we can revalidate the cache upon new connections/commands. For tinfoil this is problematic and we need to introduce an explicit command, "revalidateCaches", that the code can use to force bitbake to re-check its cache validity. I've exposed this through tinfoil with a new "modified_files" function.
    So, this patch:
    a) drops inotify support within bitbake's cooker/server and switches to using mtime
    b) requires a new function call in tinfoil when metadata has been modified
    (Bitbake rev: da3ec3801bdb80180b3f1ac24edb27a698415ff7)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
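The stat()-and-compare approach described above can be illustrated with a small sketch. The `MtimeTracker` class and its method names are assumptions made for this example and do not mirror bitbake's cache code.

```python
import os


class MtimeTracker:
    """Remember file mtimes and report which watched files changed since."""

    def __init__(self):
        self._seen = {}  # path -> mtime in nanoseconds, or None if missing

    @staticmethod
    def _mtime(path):
        try:
            return os.stat(path).st_mtime_ns
        except FileNotFoundError:
            return None

    def watch(self, path):
        """Record the current mtime of `path`."""
        self._seen[path] = self._mtime(path)

    def changed_paths(self):
        """Re-stat every watched file; return and re-record those that differ."""
        changed = []
        for path, old in self._seen.items():
            new = self._mtime(path)
            if new != old:
                changed.append(path)
                self._seen[path] = new
        return changed
```

Unlike an event queue, the check runs at a moment the caller chooses, for instance when a new command arrives, so there is nothing asynchronous to synchronise against.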
* bitbake: server/process: Add more timing debug | Richard Purdie | 2023-09-05 | 1 | -4/+9
    It is helpful to have timestamps on the ping failures so that they can be matched against the bitbake logs. It is also useful to understand how long the server takes to form a reply versus when it is sent.
    (Bitbake rev: 65969a7a8f5ae22c230431d2db080eb187a27708)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: fix sig handle | Yang Xu | 2023-08-11 | 1 | -5/+4
    process.signal_received is a list for signum and not iterable, change a suitable method to handle sig.
    (Bitbake rev: bfc53b190bd2530c2bfcea0690127d7eff620f45)
    Signed-off-by: Yang Xu <yang.xu@mediatek.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Show command in timeout message | Richard Purdie | 2023-06-30 | 1 | -1/+1
    To learn more about the server timeout issues, be clear in the error message about which command is showing the timeout. It is currently unclear if this is the original command or a ping to the server.
    (Bitbake rev: ac3cd866274f67b29eff89e393132bdabf76dbfd)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server: Fix crash when checking lock file | Joshua Watt | 2023-06-01 | 1 | -1/+1
    Fixes a crash when the server process attempts to check the PID of the lock file. The crash occurred because an integer (os.getpid()) was being concatenated to a string.
    (Bitbake rev: 5d499682a0a739b5269247a8f6dbb874e3eec456)
    Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/xmlrpc: Fix after currentAsyncCommand locking changes | Richard Purdie | 2023-03-07 | 1 | -1/+1
    After changes in bitbake b5215887d2f8ea3f28f1ebda721bd5b8f93ec7f3, "process/cooker/command: Fix currentAsyncCommand locking/races", command.py assumes it has access to the process server but the xmlrpc backend was passing in the xmlrpc server object leading to errors like:
        xmlrpc.client.Fault: <Fault 1: "<class 'AttributeError'>:'BitBakeXMLRPCServer' object has no attribute 'set_async_cmd'">
    Fixing to pass the process server to command.py resolves this issue.
    [YOCTO #15008]
    (Bitbake rev: ce5b65d5fada474ef21ac28440af6ad45287650a)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Improve idle thread exception handling | Richard Purdie | 2023-02-20 | 1 | -1/+9
    If the inotifier code has an exception, bitbake currently hangs. Catch any exception and exit if seen. Also check the idle thread is alive and exit if it disappears. This should stop bitbake hanging if such a situation arises in future such as this example:
        3323260 21:48:31.554468 Running command ['getVariable', 'BBINCLUDELOGS']
        Exception in thread Thread-1 (idle_thread):
        Traceback (most recent call last):
          File "/usr/lib64/python3.10/threading.py", line 1016, in _bootstrap_inner
            self.run()
          File "/usr/lib64/python3.10/threading.py", line 953, in run
            self._target(*self._args, **self._kwargs)
          File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/server/process.py", line 408, in idle_thread
            self.cooker.process_inotify_updates()
          File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/cooker.py", line 256, in process_inotify_updates
            n.read_events()
          File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/pyinotify.py", line 1207, in read_events
            if fcntl.ioctl(self._fd, termios.FIONREAD, buf_, 1) == -1:
        OSError: [Errno 9] Bad file descriptor
        3323260 21:48:32.206995 Command Completed (socket: True)
    (Bitbake rev: 358b5b02d5de1ab0f98104c4ec4953e46999b9a5)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Fix lockfile contents check bug | Richard Purdie | 2023-01-24 | 1 | -1/+1
    We need to check against the first line of the file, fix the typo.
    (Bitbake rev: 4abc598fb01d426394f4222dfc752e620a8e1b7b)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Improve lockfile handling at exit | Richard Purdie | 2023-01-14 | 1 | -4/+5
    If memory resident bitbake is active and the build directory is renamed upon build completion, several bad things can happen:
    * the old build directory could be re-created to contain a lockfile, leaving an empty directory behind
    * a lockfile for a new build could be found and an attempt made to lock it
    This patch avoids creating an empty directory (not perfectly, but it should work in the majority of cases, and an empty directory is only a cosmetic issue). It also now compares the lock file contents to its own pid and just exits if they don't match, since the file then clearly belongs to some new process.
    This will be combined with bitbake shutdown calls on the autobuilder to ensure "saved" build directories, or build directories being deleted by clobberdir, don't do strange things.
    (Bitbake rev: b986eac18b6a8bf633f5ef15f32f68de4c86173b)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
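The pid comparison can be sketched as below. This assumes, for illustration only, that the first line of the lock file holds just the owning pid; the real lock file layout is not shown in this log, and `lockfile_is_ours` is a hypothetical helper.

```python
import os


def lockfile_is_ours(path, pid=None):
    """Return True if the first line of the lock file names our own pid.

    Assumed layout: the first line contains only the pid as decimal text.
    The pid is converted with str() before comparing, because mixing an
    integer and a string directly would raise a TypeError.
    """
    pid = os.getpid() if pid is None else pid
    try:
        with open(path) as f:
            first_line = f.readline().strip()
    except FileNotFoundError:
        # The file was removed or the directory renamed: not ours any more.
        return False
    return first_line == str(pid)
```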
* bitbake: server/process: Move heartbeat to idle thread | Richard Purdie | 2023-01-11 | 1 | -22/+22
    Rather than risk the heartbeat event code locking up the server control socket, handle it in the 'idle' thread with the other work. The aim is to remove it as a possible issue with some ongoing hangs.
    (Bitbake rev: 0f9a0c7853b181817bf01863a26da21412376294)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: process/cooker/command: Fix currentAsyncCommand locking/races | Richard Purdie | 2023-01-11 | 1 | -6/+24
    currentAsyncCommand currently doesn't have any locking and we have a conflict in "idle" conditions since the idle functions count needs to be zero *and* there needs to be no active command.
    Move the changes/checks of currentAsyncCommand to within the lock and then we can add it to the condition for idle, simplifying some of the code.
    (Bitbake rev: b5215887d2f8ea3f28f1ebda721bd5b8f93ec7f3)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: cooker: Clean up inotify idle handler | Richard Purdie | 2023-01-06 | 1 | -2/+3
    We no longer need to abstract the inotify callback handler, remove the abstraction and simplify/clean up the code.
    (Bitbake rev: af4ccab8acc49e91bf7647f209d69f4858618466)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: lib/bb: Update thread/process locks to use a timeout | Richard Purdie | 2023-01-05 | 1 | -8/+8
    The thread/process locks we use translate to futexes in Linux. If a process dies holding the lock, anything else trying to take the lock will hang indefinitely. An example would be the OOM killer taking out a parser process.
    To avoid bitbake processes just hanging indefinitely, add a timeout to our lock calls using a context manager. If we can't obtain the lock after waiting 5 minutes, hard exit out using os._exit(1). Use _exit() to avoid locking in any other places trying to write error messages to event handler queues (which also need locks).
    Whilst a bit harsh, this should mean we stop having lots of long running processes in cases where things are never going to work out and also avoids hanging builds on the autobuilder.
    (Bitbake rev: d2a3f662b0eed900fc012a392bfa0a365df0df9b)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
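A minimal sketch of a lock-with-timeout context manager in the spirit of this commit. The name `lock_with_timeout` and the injectable `on_timeout` hook are assumptions added so the behaviour can be exercised without killing the process; the default follows the commit text (five minutes, then os._exit(1)).

```python
import contextlib
import os
import threading


@contextlib.contextmanager
def lock_with_timeout(lock, timeout=300, on_timeout=lambda: os._exit(1)):
    """Acquire `lock`, giving up after `timeout` seconds.

    On timeout the default handler hard-exits with os._exit(1), which skips
    cleanup handlers that might themselves need the same lock.
    """
    held = lock.acquire(timeout=timeout)
    if not held:
        on_timeout()  # does not return with the default handler
    try:
        yield held
    finally:
        if held:
            lock.release()
```

Usage would look like `with lock_with_timeout(some_lock): ...`. Passing a different `on_timeout` is only meant for experiments such as logging instead of exiting.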
* bitbake: server/process: Run idle commands in a separate idle thread | Richard Purdie | 2022-12-31 | 2 | -38/+69
    When bitbake is off running heavier "idle" commands, it doesn't service its command socket, which means stopping/interrupting it is hard. It also means we can't "ping" from the UI to know if it is still alive.
    For those reasons, split idle command execution into its own thread.
    The commands are generally already self-contained so this is easier than expected. We do have to be careful to only handle inotify poll() from a single thread at a time. It also means we always have to use a thread lock when sending events since both the idle thread and the command thread may generate log messages (and hence events). The patch depends on previous fixes to the builtins locking in event.py and the heartbeat enable/disable changes as well as other locking additions.
    We use a condition to signal from the idle thread when other sections of code can continue, thanks to Joshua Watt for the review and tweaks squashed into this patch. We do have some sync points where we need to ensure any currently executing commands have finished before we can start a new async command, for example.
    (Bitbake rev: 67dd9a5e84811df8869a82da6a37a41ee8fe94e2)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Add locking around idle functions accesses | Richard Purdie | 2022-12-31 | 1 | -5/+14
    In preparation for splitting bitbake's work into two threads, add locking around the idle functions list accesses.
    (Bitbake rev: a9c63ce8932898b595fb7776cf5467d3c0afe4f7)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Improve idle loop exit code | Richard Purdie | 2022-12-31 | 1 | -1/+10
    When idle handlers want to exit, returning "False" isn't very clear and also causes challenges with the ordering of removing the idle handler and marking that no async command is running.
    Use a specific class to signal the exit condition, allowing clearer code and allowing the async command to be cleared after the handler has been removed, reducing any opportunity for races.
    (Bitbake rev: 102e8d0d4c5c0dd8c7ba09ad26589deec77e4308)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Improve exception and idle function logging | Richard Purdie | 2022-12-31 | 1 | -0/+4
    Currently if the idle functions loop suffers a traceback, it is silently dropped and there is no log message to say what happened. This change at least means the traceback is in the cooker log, making some debugging possible.
    Add some logging to show when handlers are added/removed to allow a better idea of what the server code is doing from the server log file.
    (Bitbake rev: 9cf3102dc36513124fe5ead2f1e448b51833b6ac)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: event: Add enable/disable heartbeat code | Richard Purdie | 2022-12-31 | 1 | -2/+2
    Currently heartbeat events are always generated by the server whilst it is active. Change this so they only appear when builds are running, which is when most code would expect to be executed. This removes a number of races around changes in the datastore which can happen outside of builds.
    (Bitbake rev: 8c36c90afc392980d999a981a924dc7d22e2766e)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Add bitbake.sock race handling | Richard Purdie | 2022-12-21 | 1 | -1/+11
    We've seen cases where the bitbake.sock file appears to disappear but the server continues to hold bitbake.lock. The most likely explanation is that some previous build directory was moved out of the way, a server there kept running, eventually exited and removed the sock file from the wrong directory.
    To guard against this, save the inode information for the sock file and check it before deleting the file. The new code isn't entirely race free but should guard against what is a rare but annoying potential issue.
    (Bitbake rev: b02ebbffdae27e564450446bf84c4e98d094ee4a)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
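The save-the-inode-then-check idea can be sketched like this. The helper names are hypothetical, and as the commit itself notes for the real code, a check followed by an unlink is not fully race free.

```python
import os


def file_identity(path):
    """Return (device, inode) for `path`, which together identify the file."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def unlink_if_same(path, identity):
    """Delete `path` only if it is still the file we recorded earlier.

    Returns True if the file was removed, False if it was missing or has
    been replaced by a different file (for example by a newer server).
    """
    try:
        if file_identity(path) != identity:
            return False
        os.unlink(path)
        return True
    except FileNotFoundError:
        return False
```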
* bitbake: process: log odd unlink events with bitbake.sock | Frank de Brabander | 2022-12-17 | 1 | -2/+3
    Log when the socket file already exists and is removed before recreating a new socket. Log when unlinking the socket file failed.
    (Bitbake rev: cfd7c9899f988bab6d9fe7bbfbdb60603fb5ed34)
    Signed-off-by: Frank de Brabander <debrabander@gmail.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: main/process: Add extra sockname debugging | Richard Purdie | 2022-12-13 | 1 | -3/+4
    We're struggling to understand how bitbake.sock can sometimes disappear in live builds when we can't see where it could have been deleted. This causes connection failures to the server and failed builds.
    Add some extra debugging around the server log and client retry log messages to give more information for the next time this issue occurs.
    (Bitbake rev: 376a516dc8c96727fd042ada65f803013601ee2d)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: main/server: Add lockfile debugging upon server retry | Richard Purdie | 2022-12-09 | 1 | -22/+31
    We keep seeing server issues where the lockfile is present but we can't connect to it. Reuse the lockfile debugging code from the server to dump better information to the console from the client side when we run into this issue.
    Whilst not pretty, this might give us a chance of being able to debug the problems further.
    (Bitbake rev: 22685460b5ecb1aeb4ff3436088ecdacb43044d7)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server: Ensure cooker profiling works | Richard Purdie | 2022-11-20 | 1 | -3/+5
    The previous cleanups meant that when the cooker was started, profiling was always disabled as configuration was sent to the server later and this was too late to profile the main loop.
    Pass the "profile" option over the server commandline so that we can profile cooker itself again; the setting can now take effect early enough.
    (Bitbake rev: c97c1f1c127ef3f8fbbd1b4e187ab58bfb0a73e5)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Fix logging issues where only the first message was displayed | Richard Purdie | 2022-06-25 | 1 | -2/+5
    I realised only the first logging message was being displayed in a given parsing process. The reason turned out to be the UI handler failing with a "pop from empty list". The default handler was then lost and no further messages were processed.
    Fix this by catching the exception correctly in the connection writer code.
    (Bitbake rev: d3e64f64525187f1409531a0bd99df576e627f7f)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Avoid tracebacks at exit | Richard Purdie | 2022-06-10 | 1 | -1/+1
    In theory this should have been worked around but is still occurring. Add it to the list of things to ignore when bitbake is shutting down.
        Traceback (most recent call last):
          File "/usr/lib64/python3.9/threading.py", line 973, in _bootstrap_inner
            self.run()
          File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/server/process.py", line 698, in startCallbackHandler
            event = self.reader.get()
          File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/server/process.py", line 722, in get
            res = self.reader.recv_bytes()
          File "/usr/lib64/python3.9/multiprocessing/connection.py", line 221, in recv_bytes
            buf = self._recv_bytes(maxlength)
          File "/usr/lib64/python3.9/multiprocessing/connection.py", line 426, in _recv_bytes
            return self._recv(size)
          File "/usr/lib64/python3.9/multiprocessing/connection.py", line 384, in _recv
            chunk = read(handle, remaining)
        TypeError: an integer is required (got type NoneType)'
    (Bitbake rev: 7a28ac4fe478bee1e52e84412da9626495f9c6c7)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Remove daemonic thread usage | Richard Purdie | 2022-06-08 | 1 | -5/+9
    We're seeing UI deadlocks occasionally and this is possibly due to the use of a daemonic thread in the UI event queue processing. This thread could terminate holding a threading Lock() which would cause issues for the process when exiting.
    Change the shutdown process to handle this more cleanly.
    (Bitbake rev: f5ad8349a5dbff9824a89f5708cfd011d61888c9)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Avoid risk of exception deadlocks | Richard Purdie | 2022-06-08 | 1 | -14/+9
    The open coded lock acquire/release in the UI event handler doesn't cover the case where an exception occurs and, if one did, it could deadlock the code. Switch to use 'with' statements which would handle this possibility.
    We have seen deadlocks in the UI at exit so this removes a possible cause.
    (Bitbake rev: bd12792f28efd2f03510653ec947ebf961315272)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Drop unused import | Richard Purdie | 2022-04-21 | 1 | -1/+0
    (Bitbake rev: 543315e6463f15ca7ab2b4ef3e8ed41bb4207ccf)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Disable gc around critical section | Richard Purdie | 2022-04-03 | 1 | -0/+3
    The python gc can trigger whilst we're holding the event stream lock and when cleaning up objects, they can trigger warnings. This translates into a new event which would then need the lock and we can deadlock.
    Disable gc whilst we hold that lock to avoid this unfortunate and problematic situation.
    (Bitbake rev: 96a6303949cefd469bcf5ed250ff512271354357)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
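A sketch of pausing the garbage collector while a lock is held, as this commit describes. The `locked_without_gc` helper is hypothetical and only illustrates the pattern; it also restores the previous gc state rather than unconditionally re-enabling it, which is a choice made for this example.

```python
import contextlib
import gc
import threading


@contextlib.contextmanager
def locked_without_gc(lock):
    """Hold `lock` with the cyclic garbage collector paused.

    While gc is disabled, no collection can run finalisers that might try
    to take the same lock again from inside the critical section.
    """
    was_enabled = gc.isenabled()
    with lock:
        gc.disable()
        try:
            yield
        finally:
            # Restore the previous state before the lock is released.
            if was_enabled:
                gc.enable()
```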
* bitbake: cooker/process: Fix signal handling lockups | Richard Purdie | 2022-03-30 | 1 | -2/+20
    If a parser process is terminated while holding a write lock, then it will lead to a deadlock (see https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Process.terminate).
    With SIGTERM, we don't want to terminate holding the lock. We also don't want a SIGINT to cause a partial write to the event stream.
    I tried using signal masks to avoid this but it doesn't work, see https://bugs.python.org/issue47139
    Instead, add a signal handler and catch the calls around the critical section. We also need a thread lock to ensure other threads in the same process don't handle the signal until all the threads are not in the lock.
    (Bitbake rev: a40efaa5556a188dfe46c8d060adde37dc400dcd)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Correct a typo in a comment | Peter Kjellerstedt | 2022-03-28 | 1 | -1/+1
    (Bitbake rev: b4a157b2fe2fb481ffa40e0f32659d05dd6320c2)
    Signed-off-by: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Move threads left debug to after cooker shutdown | Richard Purdie | 2022-03-26 | 1 | -3/+3
    This debug is useful but the cooker shutdown or post_serve() may have cleanup left so run after those.
    (Bitbake rev: 1463fc0448d1a6a7265806a4a8b165b610dfb43f)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/xmlrpcserver: Add missing xmlrpcclient import | Richard Purdie | 2022-03-08 | 1 | -0/+1
    This avoids backtraces when starting toaster or using bitbake in remote mode.
    (Bitbake rev: bf723f2cb5d288ca730e4f029110b36380420a01)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: lib/bb: Fix string concatination potential performance issues | Richard Purdie | 2021-11-03 | 1 | -3/+3
    Python scales badly when concatenating strings in loops. Most of these references aren't problematic but at least one (in data.py) is probably a performance issue as the issue is compounded as strings become large. The way to handle this in python is to create lists which don't reconstruct all the objects when appending to them. We may as well fix all the references since it stops them being copy/pasted into something problematic in the future.
    This patch was based on issues highlighted by a report from AWS Codeguru.
    (Bitbake rev: d654139a833127b16274dca0ccbbab7e3bb33ed0)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
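The list-then-join idiom the commit refers to looks roughly like this. The function and data are made up for illustration and are not taken from data.py.

```python
def render_assignments(items):
    """Build one string from many pieces without repeated concatenation.

    Appending to a list and joining once at the end avoids copying the
    growing string on every loop iteration.
    """
    parts = []
    for key, value in items:
        parts.append("%s=%s\n" % (key, value))
    return "".join(parts)
```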
* bitbake: bitbake: correct deprecation warning in process.py | Alexander Kanavin | 2021-09-17 | 1 | -1/+1
    (Bitbake rev: aff52fe21a0b27f6302555c1e52a864550eb46ce)
    Signed-off-by: Alexander Kanavin <alex@linutronix.de>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: cooker/process: Fix typos in exiting message | Martin Jansa | 2021-09-01 | 1 | -1/+1
    (Bitbake rev: 1ff1ea3880d293b14ce0fc65e3bc4c938d587a2f)
    Signed-off-by: Martin Jansa <Martin.Jansa@gmail.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: process: Improve traceback error reporting from main loop | Richard Purdie | 2021-08-06 | 1 | -2/+4
    Currently the code can just show nothing as the exception if there was a double fault, which in this code path is quite likely. This leads to an error log which effectively says "it failed" with no information about how.
    Improve things so we get a nice verbose traceback left in the logs/output which is preferable to no logs.
    (Bitbake rev: e5782b71647d1eb6de53bde7bc4f6019a5589f21)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server: Fix early parsing errors preventing zombie bitbake | Joshua Watt | 2021-07-20 | 1 | -1/+1
    If the client process never sends cooker data, the server timeout will be 0.0, not None. This will prevent the server from exiting, as it is waiting for a new client. In particular, the client will disconnect with a bad "INHERIT" line, such as:
        INHERIT += "this-class-does-not-exist"
    Instead of checking explicitly for None, check for a false value, which means either 0.0 or None.
    (Bitbake rev: 13e2855bff6a6ead6dbd33c5be4b988aafcd4afa)
    Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Handle error in heartbeat funciton in OOM case | Richard Purdie | 2021-05-18 | 1 | -1/+6
    We've seen cases where an OOM error causes bitbake server to hang:
        9171 02:21:09.127810 Command Completed
        Traceback (most recent call last):
          File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/bin/bitbake-server", line 51, in <module>
            bb.server.process.execServer(lockfd, readypipeinfd, lockname, sockname, timeout, xmlrpcinterface)
          File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/server/process.py", line 550, in execServer
            server.run()
          File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/server/process.py", line 108, in run
            ret = self.main()
          File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/server/process.py", line 242, in main
            ready = self.idle_commands(.1, fds)
          File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/server/process.py", line 370, in idle_commands
            bb.event.fire(heartbeat, self.cooker.data)
          File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/event.py", line 216, in fire
            fire_class_handlers(event, d)
          File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/event.py", line 123, in fire_class_handlers
            execute_handler(name, handler, event, d)
          File "/home/pokybuild/yocto-worker/qemux86/build/bitbake/lib/bb/event.py", line 93, in execute_handler
            ret = handler(event)
          File "/home/pokybuild/yocto-worker/qemux86/build/meta/classes/buildstats.bbclass", line 182, in defaultrun_buildstats
            write_host_data(os.path.join(bsdir, "host_stats"), e, d, "interval")
          File "/home/pokybuild/yocto-worker/qemux86/build/meta/classes/buildstats.bbclass", line 160, in write_host_data
            output = subprocess.check_output(c.split(), stderr=subprocess.STDOUT, timeout=limit).decode('utf-8')
          File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
            **kwargs).stdout
          File "/usr/lib/python3.6/subprocess.py", line 423, in run
            with Popen(*popenargs, **kwargs) as process:
          File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
            restore_signals, start_new_session)
          File "/usr/lib/python3.6/subprocess.py", line 1295, in _execute_child
            restore_signals, start_new_session, preexec_fn)
        OSError: [Errno 12] Cannot allocate memory
    We need to wrap the calls in the same high level wrapper as idle function calls and trigger an exit upon an unhandled exception.
    (Bitbake rev: 74042b5b89d5a170013fc1a327ce3a6530fbf7d5)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: bitbake-server: ensure server timeout is a float | Ross Burton | 2021-04-20 | 1 | -1/+1
    bitbake-server is spawned by process.py and passes the arguments it is given to ProcessServer. There's some type confusion here: bitbake-server is called with a string representation of the timeout, which may be None.
    If the timeout is not set, pass 0 instead of None. Inside bitbake-server a ProcessServer is created which expects the timeout to be a float not a string, so always float() the value.
    [ YOCTO #14350 ]
    (Bitbake rev: c93ae1f861208f6d39fd15c84fbcd0e2b54331f5)
    Signed-off-by: Ross Burton <ross.burton@arm.com>
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
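The type confusion can be illustrated with a pair of small helpers for passing an optional timeout through a command line. The names `timeout_to_arg` and `arg_to_timeout` are hypothetical and the code is a sketch of the idea, not the bitbake-server argument handling.

```python
def timeout_to_arg(timeout):
    """Encode an optional timeout as a command-line string.

    An unset timeout (None) is sent as "0", because str(None) would give
    "None", which float() cannot parse on the receiving side.
    """
    return str(0 if timeout is None else timeout)


def arg_to_timeout(arg):
    """Decode the command-line string back into a float."""
    return float(arg)
```

On the receiving side a value of 0.0 then stands for "no timeout", so a truthiness test covers both 0.0 and None.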
* bitbake: process: Show command exceptions in the server log as well | Richard Purdie | 2020-10-11 | 1 | -0/+1
    There are autobuilder logs where the server commands are failing but we have no debug info in the server log. Improve this to try and understand what is failing.
    (Bitbake rev: 04d3a79226c9ea448b22f4efbab33876a72c9bdb)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Note when commands complete in logs | Richard Purdie | 2020-09-05 | 1 | -0/+1
    It's hard to tell from the server logs whether commands complete or not (or how long they take). Add extra info to allow more debugging of server timeouts.
    (Bitbake rev: 56285ada585ec1481449522282b335bcb5a2671e)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Prefix the log data with pid/time information | Richard Purdie | 2020-09-05 | 1 | -2/+2
    Knowing which process printed which messages and the timestamp of the message is useful for debugging, so add this. Ensure the log parsing isn't affected by using search() instead of match().
    (Bitbake rev: 1d043666710df1fa9d9586fd974c0371dd1514b0)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
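Why search() is needed once lines carry a pid/time prefix can be shown with a small sketch. The marker pattern below is a hypothetical stand-in for whatever the log parser actually looks for; the sample line follows the prefixed format quoted elsewhere in this log.

```python
import re

# Hypothetical marker; the pattern bitbake's log parsing uses is not shown here.
MARKER = re.compile(r"Command Completed")


def line_has_marker(line):
    """Return True if the marker appears anywhere in the line.

    re.match() only matches at the start of the string, so it stops working
    once a "pid time " prefix is added; re.search() scans the whole line.
    """
    return MARKER.search(line) is not None
```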
* bitbake: server/process: Ensure we don't keep looping if some other server is started | Richard Purdie | 2020-09-05 | 1 | -1/+20
    Showing "leftover process" messages when a new server has started and is being used by some UI is horrible. Compare the PID data from the lockfile to avoid this (and the ton of confusing log data it generates).
    Also, move the time.sleep() call to be after the first lock attempt, which reduces noise in the logs significantly.
    (Bitbake rev: ce1897a31afb5a14997bc3d2f459b90d43eecb7d)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
* bitbake: server/process: Don't show tracebacks if the lockfile is removed | Richard Purdie | 2020-09-05 | 1 | -0/+6
    lsof/fuser error if the file doesn't exist. It can be deleted by something else so ignore this if it happens and loop.
    (Bitbake rev: b100d22ce37b7548b50e59a71802bcc903acd6ea)
    Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>