WCG Problems
#1 WCG Problems
I am having problems with WCG tasks. They run for a bit but then stop processing. In the BOINC manager they appear to be running but are not actually using any CPU power. It happened to three tasks the other day, which meant my PC was only crunching on 1 core. At first glance everything looks OK; I only realised what was happening when I noticed my RAC going down and my CPU temp was much lower.
It can be difficult to get the task working again. A combination of suspending and restarting can work, but it means I need to constantly monitor what's happening. Also, when the task resumes it usually goes up to about 108% done!
Is this normal? Has anyone else had issues? I run 64-bit Fedora, so I'm not sure if it is an issue with WCG on Linux.
It is pissing me off so I am going to convert to Rosetta and 3x+1 for now.
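For anyone who would rather not watch the manager all day, here is a rough sketch of how the "running but using no CPU" symptom described above could be spotted on Linux. It samples each process's CPU time from /proc twice and flags any that barely moved. The function names, the 60 second interval and the 1 second threshold are my own assumptions, not anything from BOINC or WCG, and you would have to supply the PIDs of the science apps yourself (from `ps` or `top`, say).

```python
import os
import time

def cpu_seconds(pid):
    """Total user + system CPU time consumed by a process, read from /proc (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The command name sits in parentheses and may contain spaces,
    # so split on the last ')' before breaking the rest into fields.
    fields = stat.rsplit(")", 1)[1].split()
    utime, stime = int(fields[11]), int(fields[12])  # fields 14 and 15 in proc(5)
    return (utime + stime) / os.sysconf("SC_CLK_TCK")

def find_stalled(before, after, min_cpu_seconds=1.0):
    """Return the keys whose CPU time grew by less than min_cpu_seconds between two samples."""
    return sorted(
        key for key in before
        if key in after and after[key] - before[key] < min_cpu_seconds
    )

def check(pids, interval=60):
    """Sample the given PIDs twice, `interval` seconds apart, and list the stalled ones."""
    before = {pid: cpu_seconds(pid) for pid in pids}
    time.sleep(interval)
    after = {pid: cpu_seconds(pid) for pid in pids}
    return find_stalled(before, after)
```

A task that is genuinely crunching should gain roughly `interval` seconds of CPU time per sample, so anything returned by `check` is worth suspending and resuming by hand.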
#2
Hmmm, interesting.
I had exactly the same issue a couple of weeks ago, also running FC6x64. I found 3 boxes that for no apparent reason had stopped crunching units on one core. These machines have been rock solid (save for power induced problems back in January), so I don't think it's my hardware.
To be honest I didn't notice it on the CPU temps, but only when I went to shut down the boxes for an upcoming business trip. I surmise that the stuck units must have only been stuck for a day or so, because my RAC didn't suffer. Stopping BOINC and restarting it fixed the issue, but then I had to shut down anyway.
Since returning, I've not seen any problems, but then again I haven't been looking. I use BoincView, which will tell me if a WU approaches its deadline. Problem is that may be a week or two down the track.
Maybe there is a problem with some WU? Odd that it happened on Fedora, the same as you, but that is a very common Linux distro, so we need to be cautious about jumping to conclusions on that. I'll have a dig around in the forums and see what can be found.
- FlyingfocRS
- Boinc Warrant Officer Class 1
- Posts: 438
- Joined: Wed Jun 14, 2006 6:41 am
#3
RANT ALERT
I just wish that WCG would adopt the BOINC format a bit more.
For example in the "my grid" they should give you the total in BOINC cobblestones as well as the UD total.
Every time I divide the figure they have by the suggested factor of 7, I always get less than what BS is showing as my total.
Seems they are still stuck half and half because if you do manage to navigate their site you can find your w/u's with the results shown in BOINC figures but with no totals.
Basically it's just a shambles.
RANT OVER
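To make the arithmetic in the rant concrete, here is a small sketch of the conversion as described: divide the WCG figure by 7 and compare it with the total a stats site reports. The factor of 7 is only the "suggested factor" quoted above, and the function names and example numbers are made up for illustration.

```python
# Conversion factor as quoted in the post above; treat it as an assumption,
# not an official figure.
WCG_POINTS_PER_BOINC_CREDIT = 7

def wcg_points_to_boinc_credit(wcg_points):
    """Convert a WCG points total into BOINC credit using the quoted factor."""
    return wcg_points / WCG_POINTS_PER_BOINC_CREDIT

def shortfall(wcg_points, stats_site_credit):
    """How much lower the converted WCG total is than the stats site total."""
    return stats_site_credit - wcg_points_to_boinc_credit(wcg_points)

# Hypothetical numbers: 700 WCG points convert to 100 credits,
# so a stats site showing 120 would leave a gap of 20.
print(wcg_points_to_boinc_credit(700))
print(shortfall(700, 120))
```

A positive shortfall is the mismatch being complained about; the sketch does nothing to explain where the gap comes from.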
- Megacruncher
- G.L.S.B.
- Posts: 4702
- Joined: Mon May 29, 2006 11:33 pm
- Location: Edinburgh, Scotland
#4
I'm with Flying on this. With any other BOINC project, if you think there is a problem it's not too hard to track it down, but with WCG there is no way of finding out f*** all about anything.
Anyway, like Sneaky, I've decided that it is time to go hell for leather for Rosetta and that is just what I'm going to do. After Rosetta is 100th or better I shall return to WCG but Adios until then!
Willie the Megacruncher
- Buster Gunn
- Boinc Second Lieutenant
- Posts: 513
- Joined: Sun Aug 13, 2006 7:20 am
- Location: Wilmington Delaware USA
#5
Glad to see that others are finding WCG to be run by amateurs. It is just terrible trying to find anything or figure out what's going on. If they are steering everyone to the BOINC client, then why not adopt the standard format?
But once again, ya can't tell IBM people anything. Story of my entire career in data processing.
Buster is my dog. Sadly, Buster is gone.
Buster is the Malamute, Obi is the Golden.
Obi is the dog of the house now.
- FlyingfocRS
- Boinc Warrant Officer Class 1
- Posts: 438
- Joined: Wed Jun 14, 2006 6:41 am
#6
Can anybody else get onto the WCG forum?
I thought it was just my work connection but can't get on here either.
#8
Found this thread:
http://www.worldcommunitygrid.org/forum ... read=17066
Doesn't seem to be any pattern. Happens on AMD, Intel, Windows, Linux. Some people were, strangely, suggesting software firewalls; others blamed lack of resources.
I noticed on another thread people were saying it was only happening on the Dengue Drugs project, but I moved to doing only Dengue since it kept happening on the Cancer and Protein ones.
So I am none the wiser. Just seems to be something odd with WCG.
- FlyingfocRS
- Boinc Warrant Officer Class 1
- Posts: 438
- Joined: Wed Jun 14, 2006 6:41 am
#10
Jeez what is it with them.
I'm logged into "my grid"
But not the forum and there's no login or register option to be seen anywhere!!!!!!!!!!!!!!!!!
Seriously considering switching this piece of junk off and doing Rosetta!!!!
-
- Boinc Second Lieutenant
- Posts: 543
- Joined: Sun Sep 24, 2006 5:03 am
- Location: Edinburgh
#11
Just checked a couple of results at WCG and got this
Result Log
<core_client_version>5.10.28</core_client_version>
<![CDATA[
<stderr_txt>
Failed to get VersionInfo size: 2
Failed to get VersionInfo size: 2
Failed to get VersionInfo size: 2
Failed to get VersionInfo size: 2
Failed to get VersionInfo size: 2
Failed to get VersionInfo size: 2
Failed to get VersionInfo size: 2
</stderr_txt>
]]>
Anyone got the foggiest what happened to my 9 hour WU?