Tasks crashing

Message boards : Number crunching : Tasks crashing
gemini8
Joined: 8 Oct 20
Posts: 7
Credit: 1,109,540
RAC: 6,266
Message 361 - Posted: 4 Jan 2023, 9:31:24 UTC

Hello.
Nice that we have new work for the New Year, but some of it is flawed.
I get this stderr output on several tasks, and so do other users on the same tasks:
<core_client_version>7.18.1</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>

</stderr_txt>
]]>

Thank you for having a look at the issues.
- - - - - - - - - -
Greetings, Jens
gemini8
Joined: 8 Oct 20
Posts: 7
Credit: 1,109,540
RAC: 6,266
Message 362 - Posted: 7 Jan 2023, 12:31:54 UTC

New work and apparently no issues anymore.
Thx.
- - - - - - - - - -
Greetings, Jens
[AF>Le_Pommier] Jerome_C2005

Joined: 28 Sep 20
Posts: 11
Credit: 559,814
RAC: 2,663
Message 363 - Posted: 13 Jan 2023, 8:48:17 UTC - in response to Message 362.  

The new work didn't last; I didn't even see that there was any...
stfn

Joined: 21 Apr 21
Posts: 15
Credit: 1,032,403
RAC: 2,355
Message 365 - Posted: 13 Jan 2023, 9:12:05 UTC

This morning I got a batch of Gaia_3 tasks; sadly, all of them failed at 16% :(
stfn

Joined: 21 Apr 21
Posts: 15
Credit: 1,032,403
RAC: 2,355
Message 366 - Posted: 13 Jan 2023, 9:16:11 UTC
Last modified: 13 Jan 2023, 9:30:41 UTC

<message>
Disk usage limit exceeded</message>


Huh, interesting: it seems Gaia broke my 10 GB limit for BOINC files. I removed all limits; let's see what happens now.

EDIT:

Even with no limits, tasks failed around the 15% mark, and by then Gaia was using about 18 GB of disk space (!). I'll suspend crunching new tasks until more is known.
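
For anyone who wants to check how much disk space a project is taking before a task hits the limit, here is a minimal Python sketch that sums the file sizes under a directory. The function names are my own, and the path in the usage note is an assumption (a typical Linux BOINC data directory); adjust it for your install.

```python
import os

def dir_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total

def human_gb(num_bytes):
    """Format a byte count in GB (10^9 bytes) with one decimal place."""
    return f"{num_bytes / 1e9:.1f} GB"
```

Usage, assuming the data directory is /var/lib/boinc-client: `print(human_gb(dir_size_bytes("/var/lib/boinc-client/slots")))`. Running it a few times while a task is in progress shows how fast the output is growing.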
zupa

Joined: 21 Aug 19
Posts: 110
Credit: 888,695
RAC: 8,066
Message 367 - Posted: 13 Jan 2023, 12:47:49 UTC - in response to Message 366.  

Hi, I'm trying to understand why the previous data needed 3h of calculations while the current input data is computed so quickly. Because a fixed calculation time is assumed, the result file grows to a monstrous size. I have reduced the calculation time to 5 min and increased the allowed size of the result file.
I am very surprised by this situation and apologise for the problem. It was a new packet of data as usual, and then such a surprise. A review of the results will show whether the calculations are correct, but there is nothing to indicate that they are wrong, just that you are computing incredibly fast this time.
stfn

Joined: 21 Apr 21
Posts: 15
Credit: 1,032,403
RAC: 2,355
Message 368 - Posted: 13 Jan 2023, 13:20:21 UTC - in response to Message 367.  

In that 16% I didn't see a change in calculation speed. Based on how fast it was going, the estimated task time would have been around 1.5-2h, so not really a change from the previous batches. At least for me.
zupa

Joined: 21 Aug 19
Posts: 110
Credit: 888,695
RAC: 8,066
Message 369 - Posted: 13 Jan 2023, 13:29:54 UTC - in response to Message 368.  

The program randomizes the orbits based on the covariance matrix, then integrates the motion. It does this for 2h.
If the processor is fast it will draw 100 orbits; if it is slow it will draw 3 orbits.
And at this point, more than 1,000,000 orbits are being drawn.
I'm checking this data packet, because it seems to contain a bug...
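
To illustrate why a time-based stopping rule can blow up the result file, here is a minimal Python sketch. It is not the project's actual code: it draws samples from a covariance matrix via a Cholesky factor and loops until a time budget runs out. All function names are hypothetical, `integrate` is a placeholder for the orbit integration, and `max_samples` is a safety cap added for illustration only.

```python
import math
import random
import time

def cholesky(cov):
    """Lower-triangular L with L * L^T == cov (cov must be symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def draw_sample(mean, L, rng):
    """Draw one vector from a normal distribution with the given mean and covariance L * L^T."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]

def run_for(budget_s, mean, cov, integrate, max_samples=None, seed=0):
    """Sample and integrate until the time budget or the (hypothetical) cap is reached."""
    rng = random.Random(seed)
    L = cholesky(cov)
    results = []
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        if max_samples is not None and len(results) >= max_samples:
            break
        results.append(integrate(draw_sample(mean, L, rng)))
    return results
```

If `integrate` returns almost immediately for some input, the uncapped loop keeps appending results for the whole budget, which would be consistent with the million-plus orbits and the oversized output described above. A cap on the number of samples bounds the output size regardless of how fast each step runs.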
stfn

Joined: 21 Apr 21
Posts: 15
Credit: 1,032,403
RAC: 2,355
Message 370 - Posted: 13 Jan 2023, 17:24:53 UTC

Still no luck: every task I try fails with a computation error a few minutes in.
Gaia01902USA

Joined: 6 Aug 21
Posts: 9
Credit: 154,524
RAC: 332
Message 376 - Posted: 26 Jan 2023, 23:22:53 UTC

Today (1/26) I'm getting a Computation Error on every task I try, so I've stopped in case it's me.
I don't know my way around BOINC very well, but for what it's worth I see
Output file 8_... for task 8_... absent. That might just be because there were no actual computations.
Every task runs for only a few seconds before it errors out.
mikey
Joined: 26 Feb 20
Posts: 17
Credit: 3,807,621
RAC: 50,596
Message 377 - Posted: 27 Jan 2023, 3:45:15 UTC - in response to Message 376.  

Nope, not you. It happened on several of my PCs, and I too stopped getting new tasks. Mine were all version 8 tasks.
zupa

Joined: 21 Aug 19
Posts: 110
Credit: 888,695
RAC: 8,066
Message 378 - Posted: 27 Jan 2023, 6:05:59 UTC - in response to Message 377.  

All computers are returning an error. I'm looking for the cause; thanks for the heads-up.
stfn

Joined: 21 Apr 21
Posts: 15
Credit: 1,032,403
RAC: 2,355
Message 379 - Posted: 27 Jan 2023, 9:23:24 UTC
Last modified: 27 Jan 2023, 9:24:40 UTC

Hi!

I am no longer getting any calculation errors, yay! But around 20-30% of my tasks fail with "download failed". Can it be a problem on my side?

EDIT: Probably not, such tasks fail for everyone: http://150.254.66.104/gaiaathome/workunit.php?wuid=9753
zupa

Joined: 21 Aug 19
Posts: 110
Credit: 888,695
RAC: 8,066
Message 380 - Posted: 27 Jan 2023, 14:44:27 UTC - in response to Message 379.  

Hi,

I see. The input files are on the system; I don't know what happened :(

I will check once all the tasks have finished.
stfn

Joined: 21 Apr 21
Posts: 15
Credit: 1,032,403
RAC: 2,355
Message 381 - Posted: 30 Jan 2023, 14:01:47 UTC
Last modified: 30 Jan 2023, 14:02:52 UTC

And now the tasks are flying: no errors, very smooth. Thank you for your work in making this happen, Professor! :)
Conan
Joined: 27 Apr 20
Posts: 40
Credit: 2,829,759
RAC: 58,401
Message 440 - Posted: 22 Oct 2024, 1:26:39 UTC
Last modified: 22 Oct 2024, 1:49:53 UTC

Getting a heap of "output file absent" errors


process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
12:10:18 (916877): wrapper (7.15.26016): starting
12:10:18 (916877): wrapper (7.15.26016): starting
12:10:18 (916877): wrapper: running ../../projects/gaiaathome.eu_gaiaathome/4_gaia@home[20241019.1457]_x86_64-pc-linux-gnu ()
12:10:24 (916877): 4_gaia@home[20241019.1457] exited; CPU time 0.002967
12:10:24 (916877): app exit status: 0x8b
12:10:24 (916877): called boinc_finish(195)

They only run for a few seconds, but lots are coming in now.

They seem to be from the 4_21xxx series of work units; the successful ones only run for a few minutes.
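
A small Python sketch for reading the numbers in logs like this one. The value in parentheses matches the exit code minus 256 in the logs in this thread (1 gives -255, 195 gives -61). As far as I know, 195 is BOINC's EXIT_CHILD_FAILED, meaning the wrapped program failed. Reading the wrapper's `app exit status` with the usual shell convention (128 + signal number, 127 for a program that could not be started) is my assumption, and the function names are hypothetical.

```python
import signal

def paren_value(code):
    """Reproduce the negative number shown in parentheses after the exit code."""
    return code - 256

def describe_status(status):
    """Guess what an 8-bit exit status means, using the shell convention (an assumption)."""
    if status == 0:
        return "success"
    if status == 127:
        return "program could not be started (e.g. missing shared library)"
    if status > 128:
        signum = status - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # Not a signal number known on this platform.
            return f"killed by signal {signum}"
    return f"exited with error code {status}"
```

Under that assumption, `describe_status(0x8b)` points to signal 11 (a segmentation fault), which would fit a task that dies within seconds and leaves no output file.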

Conan
Matthias Lehmkuhl

Joined: 12 Oct 20
Posts: 12
Credit: 1,558,306
RAC: 2,008
Message 488 - Posted: 24 Oct 2024, 6:59:18 UTC
Last modified: 24 Oct 2024, 7:09:35 UTC

On my tasks I get the message "error while loading shared libraries":
../../projects/gaiaathome.eu_gaiaathome/4_gaia@home[20241019.1457]_x86_64-pc-linux-gnu: error while loading shared libraries: libquadmath.so.0: cannot open shared object file: No such file or directory

I can try to install the library, but the better way would be to bundle the missing library with the gaia@home program.
https://gaiaathome.eu/gaiaathome/result.php?resultid=117971

<core_client_version>8.0.4</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
01:17:25 (133274): wrapper (7.15.26016): starting
01:17:25 (133274): wrapper (7.15.26016): starting
01:17:25 (133274): wrapper: running ../../projects/gaiaathome.eu_gaiaathome/4_gaia@home[20241019.1457]_x86_64-pc-linux-gnu ()
../../projects/gaiaathome.eu_gaiaathome/4_gaia@home[20241019.1457]_x86_64-pc-linux-gnu: error while loading shared libraries: libquadmath.so.0: cannot open shared object file: No such file or directory
01:17:26 (133274): 4_gaia@home[20241019.1457] exited; CPU time 0.001857
01:17:26 (133274): app exit status: 0x7f
01:17:26 (133274): called boinc_finish(195)

</stderr_txt>
]]>
Matthias
Matthias Lehmkuhl

Joined: 12 Oct 20
Posts: 12
Credit: 1,558,306
RAC: 2,008
Message 493 - Posted: 24 Oct 2024, 11:16:54 UTC - in response to Message 488.  
Last modified: 24 Oct 2024, 11:17:41 UTC

Hello,
I checked my other Ubuntu Linux computers and found that the libquadmath0 library is already installed on them.
So I installed the library manually on this machine, and now 4_gaia@home[20241019.1457]_x86_64-pc-linux-gnu is running without error.
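
For others who want to check a host before fetching work, here is a minimal Python sketch that asks the dynamic loader whether a shared library can be opened. The function name is hypothetical; the library name is the one from the error message above.

```python
import ctypes

def can_load(soname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # The loader could not find or open the library.
        return False
```

Usage: `can_load("libquadmath.so.0")`. If it returns False on Ubuntu, installing the libquadmath0 package mentioned above (presumably with `sudo apt install libquadmath0`) should make it return True.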
Matthias
mmonnin

Joined: 3 Sep 19
Posts: 21
Credit: 27,755,419
RAC: 168,983
Message 496 - Posted: 24 Oct 2024, 19:36:51 UTC



©2024 GAVIP-GC