#1 WU stuck

Posted: Mon Dec 10, 2018 8:27 pm
by davidbam
Grrrrrr - just aborted a Collatz WU which had become stuck! 18hrs of wasted time on an AMD RX580.

Anyone else seen that kind of problem? I only noticed when I saw that the machine hadn't uploaded to the Collatz website in a while.

#2 Re: WU stuck

Posted: Tue Dec 11, 2018 12:10 pm
by Alez
Some projects are notorious for this issue. Collatz, strangely, is not one of them.

#3 Re: WU stuck

Posted: Tue Dec 11, 2018 12:27 pm
by davidbam
I wonder if BoincTasks has the capability (or any plans) to spot this kind of thing? It would seem (relatively) easy to do on Linux clients. Hmmmm
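A rough sketch of how that could look on a Linux client, in Python. It assumes you capture the text output of `boinccmd --get_tasks` twice, some time apart, and that each task block contains a `name:` line and a `fraction done:` line; those field names are an assumption about the client's output and may need adjusting. The function names are made up for illustration, and this is not how BoincTasks or any existing tool does it.

```python
def parse_fraction_done(tasks_text):
    """Map task name -> fraction done from `boinccmd --get_tasks`-style text.

    Assumption: each task block has a 'name:' line followed later by a
    'fraction done:' line. Tasks without a progress line are skipped.
    """
    progress = {}
    current = None
    for line in tasks_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "name":
            current = value
        elif key == "fraction done" and current is not None:
            try:
                progress[current] = float(value)
            except ValueError:
                pass  # unparseable number, ignore this task
    return progress


def find_stuck(earlier, later, min_gain=0.0001):
    """Return names seen in both snapshots whose progress rose by less than min_gain."""
    return sorted(
        name
        for name, frac in later.items()
        if name in earlier and frac - earlier[name] < min_gain
    )
```

Take one snapshot, wait long enough that a healthy WU must have moved (an hour, say), take another, and pass both parsed dictionaries to `find_stuck`. Note that a task sitting at 100% in both snapshots is flagged too. Anything it reports is only a candidate, so check it by hand before aborting.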

#4 Re: WU stuck

Posted: Wed Dec 12, 2018 9:01 am
by Alez
Scole wrote a program to detect and abort stuck units on another project. It's on here somewhere. I'll look for it some time today, if I get a chance. Most of today should hopefully be spent travelling.

#5 Re: WU stuck

Posted: Thu Dec 13, 2018 11:34 pm
by davidbam
davidBAM wrote: Mon Dec 10, 2018 8:27 pm Grrrrrr - just aborted a Collatz WU which had become stuck! 18hrs of wasted time on an AMD RX580.

Anyone else seen that kind of problem? I only noticed when I saw that that machine hadn't uploaded to Collatz website in a while
Ditto - also an RX580 (on the T1500 this time, in case it happens again)

#6 Re: WU stuck

Posted: Thu Dec 13, 2018 11:45 pm
by scole of TSBT
Alez wrote: Wed Dec 12, 2018 9:01 am Scole wrote a program to detect and abort stuck units on another project. It's on here somewhere. I'll look for it some time today, if I get a chance. Most of today should hopefully be spent travelling.
https://tsbt.co.uk/forum/viewtopic.php?f=172&t=2927

#7 Re: WU stuck

Posted: Fri Dec 14, 2018 7:45 am
by davidbam
It was happening on every Collatz WU on that machine. I have reset the project to see if that helps.

#8 Re: WU stuck

Posted: Fri Dec 14, 2018 8:18 am
by davidbam
Hmmm - I think the card may be faulty

#9 Re: WU stuck

Posted: Fri Dec 14, 2018 1:21 pm
by Alez
It may be, or it could be something else. Try running something other than Collatz and see.
I have an AMD 7970 that locks up the entire system on Moo! Wrapper almost immediately, but runs everything else fine. I can't find why it won't run Moo, it just won't.

#10 Re: WU stuck

Posted: Fri Dec 14, 2018 4:45 pm
by davidbam
It got stuck permanently on a PrimeGrid WU as well. It crept up to 100% eventually but then just sat there.

#11 Re: WU stuck

Posted: Fri Dec 14, 2018 11:24 pm
by davidbam
I reckon I've got to the bottom of it. My theory is that the RX580 doesn't like running (under Linux) in an X79 motherboard. It works fine in others.

I also note that the optimisation needs to go in 2 places to be effective on all Collatz WUs:
/var/lib/boinc-client/projects/boinc.thesonntags.com_collatz/collatz_sieve_1.40_x86_64-pc-linux-gnu__opencl_ati.config
/var/lib/boinc-client/projects/boinc.thesonntags.com_collatz/collatz_sieve_1.40_x86_64-pc-linux-gnu__opencl_ati_gpu.config
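To keep the two files from drifting apart, something like this Python sketch could write the same settings text to both. The directory and file names are the ones listed above; `write_settings` is a made-up helper name, and the settings text is whatever optimisation lines you already use (none are assumed here). Writing under /var/lib/boinc-client will normally need root or the boinc user.

```python
from pathlib import Path

# Paths as given in the post above; adjust if your install differs.
PROJECT_DIR = Path("/var/lib/boinc-client/projects/boinc.thesonntags.com_collatz")
CONFIG_NAMES = [
    "collatz_sieve_1.40_x86_64-pc-linux-gnu__opencl_ati.config",
    "collatz_sieve_1.40_x86_64-pc-linux-gnu__opencl_ati_gpu.config",
]


def write_settings(settings_text, project_dir=PROJECT_DIR, names=CONFIG_NAMES):
    """Write identical settings text to every config file and return the paths."""
    written = []
    for name in names:
        target = Path(project_dir) / name
        target.write_text(settings_text)  # overwrites any existing content
        written.append(target)
    return written
```

For example, `write_settings(Path("my_collatz.config").read_text())` would copy a local file (hypothetical name) into both places in one go.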

#12 Re: WU stuck

Posted: Fri Dec 14, 2018 11:54 pm
by scole of TSBT
Is the BIOS up to date?

#13 Re: WU stuck

Posted: Sat Dec 15, 2018 12:04 am
by davidbam
I'll need to check - they were cheap boards off of eBay so I am probably paying the real price now. Mind you - they work fine with nVidia

#14 Re: WU stuck

Posted: Sat Dec 15, 2018 12:49 am
by Alez
davidBAM wrote: Sat Dec 15, 2018 12:04 am I'll need to check - they were cheap boards off of eBay so I am probably paying the real price now. Mind you - they work fine with nVidia
A very common scenario: Linux and nVidia seem pretty much fine, AMD not so much.