AQUA UK Number 1 in Jeopardy!

rowpie
Peon
Posts: 239
Joined: Mon May 29, 2006 11:26 pm

#1 AQUA UK Number 1 in Jeopardy!

Post by rowpie »

Hey guys. Looking through the stats (yes, boring Friday night) and we are going to lose UK number 1 in AQUA in 2 days if we don't treble our output. Do we let UBT have it or send full power to the cores?
FlyingfocRS
Boinc Warrant Officer Class 1
Posts: 438
Joined: Wed Jun 14, 2006 6:41 am

#2

Post by FlyingfocRS »

I moved 2 GPUs over to this a few days ago; got one WU but as yet no results.
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#3

Post by jockmacmad2 »

What I want to know is how a single machine on AQUA has managed 1,331,542.65 credits in a DAY?

Here
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#4

Post by Megacruncher »

The AQUA workunits seem to take a long time & give excellent credit but how UL1 is getting the credit he is, I couldn't begin to guess. Anyone know about optimized apps for this project?

Anyhoo, with GPUGrid worthlessly unstable on a couple of my PCs I've put them (3 GPUs in total) onto AQUA. It'll be midweek before any of them finish. Hopefully they will last the distance and I'll get megacredits. I've another which has just crashed a GPUGrid WU so I'll let it finish the two remaining GPUGrid WUs & then get it onto AQUA too.

BTW, UBT did displace us as UK #1. :( Hopefully we can fight back. :)
Willie the Megacruncher
FlyingfocRS
Boinc Warrant Officer Class 1
Posts: 438
Joined: Wed Jun 14, 2006 6:41 am

#5

Post by FlyingfocRS »

Well I am getting loads of errors on one machine, maybe driver sensitive as this machine has an older driver version, will have a fiddle on Tuesday.
The other has reported back one result with a not too shabby 200+ cr/hr. 8)
PinkPenguin

#6

Post by PinkPenguin »

Megacruncher wrote:The AQUA workunits seem to take a long time & give excellent credit but how UL1 is getting the credit he is, I couldn't begin to guess. Anyone know about optimized apps for this project?
Looks like UL1 and probably [SG]marodeur6 are boincing 160 and 200 qubit work units on the v3.24 and v3.26 versions of the app. Haven't seen a 240 qubit WU yet. Anyway here's what I've found:

UL1 is using 2x quad-core Q9550s with an NVIDIA GTX 295 (895Mbyte) on both machines. One machine is running v3.24 of the app against 200-2M WUs for about 1.84 credits per second of run time. The other machine is running v3.26 against 160-3M WUs for 1.75 credits per second of run time. It seems UL1 had the only machines that could complete these WUs OK; all the others that ran the same WUs got errors or aborted.

Just to compare this I checked some of Merlyn's WUs (... em! sorry Steve) - I only found 1x 160 qubit (160-2M) on v2.27 of the app using a Q6600 and GTX 260 (895Mbyte). This was worth only 0.10 credits/run time sec. The other work units were all 30 or 40 qubit WUs which returned between 0.08 and 0.10 credits/run time sec depending on the machine.

This explains why UL1 is getting between 80,000 and 330,000 credits/run whereas others are getting between 3,000 and 6,000 even with GPUs at full throttle. Naturally there is always the problem of "run time" not being a good reference when it comes to GPUs, but it was the only number available for comparison in the output... unless, of course, one divides "called boinc_finish" by "granted credit".

160 in the filename indicates the number of qubits, and the 2M, 3M numbers indicate the number of MC sweeps in millions; both numbers give an idea of the length of the job. In the cases I looked at, the 160 qubit jobs went through in a couple of days on the GTX cards; the 200 qubit jobs seem to take a lot longer (but I don't think sent and received dates are that reliable for timing).
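
For anyone who wants to sort their own results by job size, here is a rough Python sketch of pulling the qubit count and sweep count out of a work unit name. It assumes the label always looks like "160-2M" somewhere in the name, which is only what the names in this thread suggest; the function name `parse_wu_label` is made up for the example.

```python
import re

# Matches the "<qubits>-<sweeps>M" part of a work unit name, e.g. "160-2M".
# Assumption: every AQUA name carries the label in this form.
WU_LABEL = re.compile(r"(\d+)-(\d+)M")

def parse_wu_label(name):
    """Return (qubits, mc_sweeps) from a work unit name, or None if no label is found."""
    match = WU_LABEL.search(name)
    if match is None:
        return None
    qubits = int(match.group(1))
    sweeps = int(match.group(2)) * 1_000_000  # the "M" is millions of MC sweeps
    return qubits, sweeps

for label in ("160-2M", "160-3M", "200-2M"):
    print(label, parse_wu_label(label))
```

Whatever follows the label in the name is ignored here.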

I should also note that the AQUA WUs are also multi-threaded so they can hog all the available cores on a machine. You either use CUDA or CPUs - I don't think this explains the difference in credits though.

Hope this helps - always presuming that I got it right... :?
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#7

Post by jockmacmad2 »

Looks like Skulltrails FTW then lol

Well hmmm. Maybe I should go back to trying that on the i7 & 295s and compare.
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#8

Post by Megacruncher »

Thanks guys but I'm not sure I understood 100% of that. :? The question is what do I need to do to do as well as UL1?
Willie the Megacruncher
PinkPenguin

#9

Post by PinkPenguin »

Megacruncher wrote:Thanks guys but I'm not sure I understood 100% of that. :? The question is what do I need to do to do as well as UL1?
... 'fraid that's the bit I couldn't work out (probably 'cos I'm new to this) - the trick is to work out how to persuade the server to give you a version 3.24 or more recent app together with the new longer WUs... (160-Mx / 200-Mx or 240-Mx)....

In any case you have about 4-5 nice big ones coming up. They're all 240-M5s on your GTX 260 (895Mbyte) machines - it'll be interesting to see how these come out (they may take a little time though).

You might be able to guess the app version by looking at the AQUA sections in client_state.xml in the BOINC data directory (for Rosetta it is part of the .exe file name).
PinkPenguin

#10

Post by PinkPenguin »

.... P.S. AQUA - the v3.26 apps are CUDA apps and the v2.2x apps are normal multi-threaded CPU hogs.

http://aqua.dwavesys.com/apps.php

From what I can see you need the 2.1 drivers on Linux 64-bit (6.4.5 client which can be recompiled to bring the benchmarks into line with Windows benchmarks). On Windows the 2.2 drivers seem OK.

http://www.free-dc.org/forum/showthread ... e6&t=19264

http://aqua.dwavesys.com/result.php?resultid=910907

... if you have a GX2 (or maybe a couple) then ...

http://aqua.dwavesys.com/result.php?resultid=912538

:shock:
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#11

Post by Megacruncher »

The multithreaded CPU ones have been very gratifying.

If the GPU WU are as humungous as you suggest they might be then I can look forward to being an Aqua millionaire in about a week. :lol: 8)

Let's hope they don't all crash after 10 days and 98% :shock: :shock:

All eggs in one basket or what?
Willie the Megacruncher
PinkPenguin

#12

Post by PinkPenguin »

Megacruncher wrote:Let's hope they don't all crash after 10 days and 98% :shock: :shock:

All eggs in one basket or what?
... judging from the forums and the results you could probably open a book on the subject quite successfully.... well, it's one way of keeping the eggs from all hopping into one basket, isn't it ? 8)
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#13

Post by Megacruncher »

At present I have 3 Aqua Cuda WU in progress - @ 34%, 39% & 43%. Altogether they have been running for 247 GPU hours. I'll keep you posted as to survival and hopefully credit. If they are sufficiently bountiful I might even swap generally to Aqua & give GPUGrid a rest given that:

a) Apart from one machine it has been unstable to the point of utter pishness recently.
b) It now accounts for 43% of my total credit - too much for one project
c) It is ages since I last got another 1 million project.

The multithreading AQUA CPU WU are proving pretty fruitful too. If CPDN wasn't a priority I'd be doing even more of them.
Willie the Megacruncher
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#14

Post by jockmacmad2 »

WTB some CUDA WU.

All I am getting are CPU ones as it seems not to be handing out CUDA ones right now.
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#15

Post by Megacruncher »

On the preferences page of Your Account tick everything and you might just get lucky.
Willie the Megacruncher
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#16

Post by jockmacmad2 »

I have both CPU and GPU set there, it's just I get about 30 CPU units for each GPU.

The upside is (after aborting lots) I now have some of the said CUDA units. You need to watch them for a goodly while to see it's actually running lol.

i.e. 5 1/2 hours here for 7.5% complete...
aardvark
Boinc Colour Sergeant
Posts: 223
Joined: Fri Mar 20, 2009 2:05 am
Location: Aberdeen

#17

Post by aardvark »

I just completed my first AQUA CUDA work unit 64.5 hours of crunching on one core of a 9800GX2, and received a credit of 5,397.92. About one fifth of the return I would have gotten on GPUGRID.
Is it me ???
PinkPenguin

#18

Post by PinkPenguin »

aardvark wrote:I just completed my first AQUA CUDA work unit 64.5 hours of crunching on one core of a 9800GX2, and received a credit of 5,397.92. About one fifth of the return I would have gotten on GPUGRID.
Is it me ???
Doing a straight division of Granted Credit / Run Time, you are getting about 0.99 credits per unit of run time on a 128-4M WU. This is better than others are getting on the 20, 30 and 40 qubit WUs which run on CPUs.

However I have taken another look at Run Times and credits and this is what I have seen:

UL1 is running Linux boxes with the 6.4.5 core-client and the AQUA application on his boxes is 3.26 (whereas yours is a Windows version 3.29 with 6.6.36 core client).

Everything seems to hinge on Run Time - UL1's run times are very large, between 300,000 and 500,000, and it would seem that UL1 gets more credits as a result - for instance:
  • a 200-2M WU: the run time was 339,776 for 327,982 credits (0.97 credits / unit RT)
  • a 160-3M WU: the run time was 502,847 for 873,836 credits (1.74 credits / unit RT)
It would appear that Run Time is a significant factor in calculating credits. I have noticed that credits / unit of run time remain the same for WUs of the same type (e.g. 160-3M).
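
The division used for the two figures above is simple enough to script if you want to run it over a whole results page. This is only a sketch using the two run time / credit pairs quoted in this post; the dictionary and the function name are made up for the example, and the run time is taken at face value in whatever unit the result page reports.

```python
# (run time, granted credit) pairs as quoted in the post above.
results = {
    "200-2M": (339_776, 327_982),
    "160-3M": (502_847, 873_836),
}

def credits_per_runtime(run_time, granted_credit):
    """Granted credit divided by the reported run time."""
    return granted_credit / run_time

for label, (run_time, credit) in results.items():
    rate = credits_per_runtime(run_time, credit)
    print(f"{label}: {rate:.2f} credits per unit of run time")
```

This prints 0.97 for the 200-2M and 1.74 for the 160-3M, matching the numbers in the list.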

The really big credits are associated with a "no heart beat from core client" error message which means that either the core client or app has stopped working (presumably they were subsequently restarted).

I believe that runtime is reported by the BOINC client and therefore it may be a question of which version of BOINC client you are using and also which operating system you are running.

Ref: http://boinc.berkeley.edu/boinc_papers/api/text.pdf

... Sorry about the length... just trying to work it out for myself and I have to think aloud ... :D
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#19

Post by Megacruncher »

Ah so I can't be assured that my triplets (42%@95 hrs, 54%@108hrs & 49%@109hrs) will be whamtastic UBT thrashers :?

Still, it'll be fun finding out. This is like a scratchcard that takes 9 days to open!
Willie the Megacruncher
PinkPenguin

#20

Post by PinkPenguin »

Megacruncher wrote:Ah so I can't be assured that my triplets (42%@95 hrs, 54%@108hrs & 49%@109hrs) will be whamtastic UBT thrashers :?

Still, it'll be fun finding out. This is like a scratchcard that takes 9 days to open!
... you got it... looks like a scratchcard lottery! I think the benchmarks may only influence credits per unit of run time but it's the run time number returned that really counts...

(UL1 has another machine with practically the same benchmarks, hardware and OS and got only 6,119 credits for a 200-4M - the run time was 3,504 so he got 1.75 credits / run time unit - the same as the 160-3M WUs which got granted 6 figure credits. This WU was sent on 24/06 and finished on 1/07, which doesn't mean much as it may only have started crunching on the previous day).

I should note that what comes after the xxx-yM number in the task name are parameters for the AQUA app and I expect these may significantly affect run time.

If you want an idea of how it's coming along try taking a look at the client_state.xml file and look for the tag:
<final_cpu_time>7145.548000</final_cpu_time>
I think this gets updated at each checkpoint (above example is from CPDN - still running).
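
If you don't fancy scrolling through the file by hand, something like this Python sketch could pull the tag out. It works on a made-up, heavily trimmed sample rather than a real client_state.xml; the assumption that the tag sits inside a `<result>` block next to a `<name>` tag is mine, and the real layout may differ between BOINC client versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for client_state.xml.
# The task name is invented; the CPU time is the value quoted above.
SAMPLE = """
<client_state>
  <result>
    <name>example_task_1</name>
    <final_cpu_time>7145.548000</final_cpu_time>
  </result>
</client_state>
"""

def final_cpu_times(xml_text):
    """Return {task name: final_cpu_time in seconds} for each <result> holding both tags."""
    root = ET.fromstring(xml_text)
    times = {}
    for result in root.iter("result"):
        name = result.findtext("name")
        cpu = result.findtext("final_cpu_time")
        if name is not None and cpu is not None:
            times[name] = float(cpu)
    return times

for name, seconds in final_cpu_times(SAMPLE).items():
    print(f"{name}: {seconds / 3600:.2f} CPU hours")
```

To try it on a real file, read the file into a string first and pass that in; work on a copy so the running client is left alone.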

... One should probably expect this kind of behaviour with Quantum algorithms as the bits don't know whether they're 0 or 1 and can be neither... reminds me of some operating systems I know... :grommit: :grommit: :grommit:
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#21

Post by Megacruncher »

Aye well the dice are cast and will fall...as they do.
In the meantime my CUDA babies are 66.6% @ 134hrs, 60.0% @ 134hrs & 45% @ 103hrs.

So expect the first results on Sunday.
Willie the Megacruncher
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#22

Post by jockmacmad2 »

According to a thread on the Aqua forums it seems Linux machines are getting and I quote
10x the points of Windows for the same WU
Now I'm not sure of the validity of the 10x but there seems to be something in the fact they get a lot more based on CPU runtime.
steve

#23

Post by steve »

jockmacmad2 wrote:According to a thread on the Aqua forums it seems Linux machines are getting and I quote
10x the points of Windows for the same WU
Now I'm not sure of the validity of the 10x but there seems to be something in the fact they get a lot more based on CPU runtime.

Here are my 2 Linux boxes; it doesn't appear to be 10x

http://aqua.dwavesys.com/results.php?hostid=18204

http://aqua.dwavesys.com/results.php?hostid=18213


It is more probable that they are using modified BOINC clients; take a look at the top hosts
PinkPenguin

#24

Post by PinkPenguin »

steve wrote: Here are my 2 Linux boxes; it doesn't appear to be 10x
...
It is more probable that they are using modified BOINC clients; take a look at the top hosts
... seems like for v2.27 WUs there isn't much difference between Windows and Linux as similar Windows machines get, more or less, the same amount of credits for the same amount of run time.

The v3.24 / v3.26 CUDA WUs are altogether another story. Looking at the top hosts and aardvark's WU: his is on a Vista box with benchmarks about half those of the top hosts... he gets the same credit / run time unit as the top hosts but his run time is much lower.

I haven't found a v3.29 app that gives the exceptionally long run times encountered in the top hosts (this is the version in aardvark's WU). There are a couple of i7s with GTX 280/295s that are getting 50,000 / 60,000 credits at 0.8 credits per run time unit.

... modifying the client isn't difficult (I imagine most Linux users, at least, recompile the client to bring the benchmarks into line with Windows). But then could it be the app itself, even if it is the core client that returns the run time calculation.... ?
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#25

Post by jockmacmad2 »

No, you're right Steve, but I see you returned results in 2 days. Take a look at the other 10x(ish) WUs and you will see it's 10 or 12 days to return. Now if the CPU is somehow ticking, which it should not be, but if it is and the units are based on CPU runtime, which they are, then ....

We increased our AQUA to 246k yesterday but UKBT managed 900+k :roll:
PinkPenguin

#26

Post by PinkPenguin »

... Yep! Steve was right I should have looked further down the top hosts. Looks like a few Windows biggies have turned up in the last couple of days:

440,456 at 1.02 per run time unit; this is a 128-4M like aardvark's run.

520,153 at 1.09 per run time unit a 160-3M

505,450 at 1.41 per run time unit a 160-3M

... this isn't as good as the 1.75 that the 160-3M Linux units were getting but that is probably due to benchmarks. Same turnaround as Jock said, around 7 to 10 days (all sent on 24th June and returned between 1st and 3rd July).

... does this mean that the CPU is not playing cricket ? damned dastardly, I say - CPUs should not be allowed to tick unless authorised ! :grommit:
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#27

Post by Megacruncher »

If long runtimes are the key that unlocks high credits, then I am ever hopeful. My babies are maturing sloooowly indeed:

240-5M 51% after 117hr
240-5M 78% after 158hr
240-5M 71% after 158hr

And boy do I need some big credits. My remaining GPUGrid units keep crashing after 11 hrs (imagine the swearing that causes) and I keep having to switch off PCs to keep the house habitable.

Never mind, the heatwave seems to have broken and even with all the crunchers on the farm temp has just dipped below 27C for the first time this week! 8)
Willie the Megacruncher
steve

#28

Post by steve »

Finished the first 200m wu today 91k credit

http://aqua.dwavesys.com/result.php?resultid=924897
PinkPenguin

#29

Post by PinkPenguin »

This does seem comparable to 200-2M WUs from linux boxes:

http://aqua.dwavesys.com/results.php?hostid=11169

Though it does get between 90K and 120K with less than half the run time.

For Windows XP Pro SP2 there are these which are 200-4M WUs like yours:
http://aqua.dwavesys.com/results.php?hostid=16622

There is the "No heartbeat... " message which is common to some of the big scoring WUs on both Linux and Windows. But then take a look at the Benchmarks... are Q6600s that good ? (I thought one of the problems with GPU credits is that the GPU is not included in the benchmark).

From the message board they'll be updating the app on Monday, though they don't say if they'll do anything about the credits... :?
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#30

Post by jockmacmad2 »

Did you see the recent post? There is a new Windows client on Monday that is 6x faster.
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#31

Post by Megacruncher »

Hopefully I'm less than a day away from finding out just how bountiful AQUA CUDA is for me:
My frontrunner is @ 89% after 182hrs, the runner-up manages 81% after 182hrs and the one that got switched off during the heatwave is trailing with 60% after 140hrs.

In the meantime the multithreaded CPU version is well worth running.

This result yielded 13K of credit on an E6600. The elapsed time according to BM was 36hrs but the Runtime according to the link is 58.3hrs. Presumably this is the total core time, which, since it works out at over 220 credits per core-hour, is nevertheless something to be coveted!
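
For the curious, the arithmetic behind that figure can be sketched in a few lines of Python. It assumes, as presumed above, that the reported Runtime is core time summed over both cores; the 13K is the rounded figure from the post, so treat the output as approximate.

```python
credit = 13_000        # "13K", rounded as in the post
elapsed_hours = 36.0   # wall-clock time shown by BOINC Manager
runtime_hours = 58.3   # Runtime shown on the result page (assumed to be summed core time)

def per_hour(total_credit, hours):
    """Credit earned per hour of the given time measure."""
    return total_credit / hours

print(f"credit per core-hour:    {per_hour(credit, runtime_hours):.0f}")
print(f"credit per elapsed hour: {per_hour(credit, elapsed_hours):.0f}")
print(f"average cores busy:      {runtime_hours / elapsed_hours:.2f}")
```

The last line is only a sanity check: the ratio comes out between 1 and 2, which fits a dual-core chip that was not fully loaded for the whole run.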
Willie the Megacruncher
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#32

Post by Megacruncher »

Well I had hoped to have a result to tell you about by now. However as I type my front runner still has 0.809% to go which is predicted to take 1hr 39min to finish off. Even a lifelong insomniac like me can't be arsed to stay up well after 2am to see what happens. I'll let you know in the morning unless you find out first.
Willie the Megacruncher
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#33

Post by Megacruncher »

Here you are

131.5K for 9 days of GPU crunching isn't too bad. About 50% better than I'd have got from GPUGrid on that machine.
Willie the Megacruncher
PinkPenguin

#34

Post by PinkPenguin »

Aye, thar she blows, cap'n and a fine big'un 'tis too.... ! :shock:

Can I come down from the crows nest now ?

Took a look this morning and it wasn't there so it must have come while I was driving to work!
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#35

Post by Megacruncher »

And another whopper! 138K this time. 8) :D :wav:

Both these were run on the same machine which all told will, I reckon, have managed 300K in the last 9 days.
Willie the Megacruncher
Reeltime
Boinc Second Lieutenant
Posts: 543
Joined: Sun Sep 24, 2006 5:03 am
Location: Edinburgh

#36

Post by Reeltime »

Megacruncher wrote:And another whopper! 138K this time. 8) :D :wav:

Both these were run on the same machine which all told will, I reckon, have managed 300K in the last 9 days.
I can't get any work at all for Aqua :(
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#37

Post by Megacruncher »

Nor can anyone else! :? Their admin keeps promising more and I think might even have delivered more today but it's all been snaffled already.

Check out this thread for details.

I've got one multi-threaded CPU unit that'll be finished tomorrow and 4 GPU units at 0%, 4%, 12% and 75%. After that it's back to GPUGrid unless Milkyway gets itself sorted.
Willie the Megacruncher
PinkPenguin

#38

Post by PinkPenguin »

Megacruncher wrote:I've got one multi-threaded CPU unit that'll be finished tomorrow and 4 GPU units at 0%, 4%, 12% and 75%. After that it's back to GPUGrid unless Milkyway gets itself sorted.
Looks like you got another 500T / 600T credits due in - congratulations. 8)

Judging from the thread they will have the new apps and WUs up before the end of the week - which should be good news even if they look like they won't be as fast as they had hoped.

I don't think I'll try AQUA at the moment as CPDN is my long runner and I wouldn't feel right ditching them - my P4 zombies would probably drop dead (sic) if they saw an AQUA WU.

Apologies if I drove everyone up the wall with the newbie theorising and thanks for the pointers.
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#39

Post by Megacruncher »

Aqua & CPDN would coexist quite happily and all of your machines would cope with the multithreaded CPU application. So it might be worth a go - if you can get any work for it. 8)

As for newbie theorising, it was way too clever for me. :)
Willie the Megacruncher
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#40

Post by Megacruncher »

AQUA has, CUDA only, work available. Grab it while it lasts! :)
Willie the Megacruncher
Reeltime
Boinc Second Lieutenant
Posts: 543
Joined: Sun Sep 24, 2006 5:03 am
Location: Edinburgh

#41

Post by Reeltime »

Megacruncher wrote:AQUA has, CUDA only, work available. Grab it while it lasts! :)

And then there was none :(
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#42

Post by Megacruncher »

It didn't last. I hope I'm not the only one of us who grabbed it! :lol:
Willie the Megacruncher
steve

#43

Post by steve »

Megacruncher wrote:It didn't last. I hope I'm not the only one of us who grabbed it! :lol:

More now available --- get em while they last...
Reeltime
Boinc Second Lieutenant
Posts: 543
Joined: Sun Sep 24, 2006 5:03 am
Location: Edinburgh

#44

Post by Reeltime »

steve wrote:
Megacruncher wrote:It didn't last. I hope I'm not the only one of us who grabbed it! :lol:

More now available --- get em while they last...
Got a bunch. Seem to be a lot longer than the ones yesterday though
Even managed to grab a CPU unit
Not that I'm complaining, got about 8k credits for 10 hours' crunching last night :)
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#45

Post by Megacruncher »

I got quite a few queued up now. The short CUDA ones aren't too bad - I got about 1,200 for what was probably 90 minutes crunching last night.
To revert to the original reason for this thread, UBT are now 2 million in front of us :( but a few big results could overturn that lead quite easily, maybe. :lol:
Willie the Megacruncher
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#46

Post by jockmacmad2 »

Just had a crash when a 200-4M got to 100% after about 100 hours. No sign on the tasks list that it got the results and it's not in my client.

In fact I see:-

09/07/2009 13:41:14 AQUA@home Task 26jun09-200-4M-64-a_6_7_1 exited with zero status but no 'finished' file
09/07/2009 13:41:14 AQUA@home If this happens repeatedly you may need to reset the project.
09/07/2009 13:41:14 AQUA@home Task 26jun09-200-4M-64-a_6_8_1 exited with zero status but no 'finished' file
09/07/2009 13:41:14 AQUA@home If this happens repeatedly you may need to reset the project.
Megacruncher
G.L.S.B.
Posts: 4699
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland

#47

Post by Megacruncher »

Ouch! 100 hours of crunching is a lot to lose. :( I've had quite a few AQUA units die on me but usually within a millisecond of starting, which is thoughtful of them.
Willie the Megacruncher
PinkPenguin

#48

Post by PinkPenguin »

Megacruncher wrote: In the meantime the multithreaded CPU version is well worth running
In the end I took your advice and am trying out a few 200-4M WUs. Looks like they give fixed credit 26,776 on the v2.29 (multi-threaded) app. They seem to run in between 12-30 hours on quad machines (on a Core Duo more like 50-60 hours). Which is good.
jockmacmad2 wrote:Just had a crash when a 200-4M got to 100% after about 100 hours. No sign on the tasks list it got the results and its not in my client.
If the output turns up in the task list eventually I would be interested to know if there is a "No heartbeat... " message. (This happened to me on a CPDN WU both output and credits did eventually turn up and I have noted that the "No heartbeat..." message is frequent in AQUA WUs).
jockmacmad2
Boinc Warrant Officer Class 2
Posts: 321
Joined: Tue Jan 27, 2009 7:18 am

#49

Post by jockmacmad2 »

I have the no heartbeat on both Aqua and I think my failed GPUGrid tasks. I will look a little later.
PinkPenguin

#50

Post by PinkPenguin »

jockmacmad2 wrote:I have the no heartbeat on both Aqua and I think my failed GPUGrid tasks. I will look a little later.
You might like to check this with someone who has more experience.

I restarted the BOINC client (actually shut down the portable, went home and restarted). You lose any data after the last checkpoint so you have to redo a part of the WU but it did deliver at the end.

There is an explanation here as to why the two sets of messages are associated - but I wouldn't reset the project unless you have lost all hope... :( judging from other AQUA v3.29 (cuda) and v2.29 (mt) tasks it is not infrequent :? If the output is like the following (copied from another AQUA cuda WU) you might be OK:

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
No heartbeat from core client for 30 sec - exiting
.. repeated a lot ...
No heartbeat from core client for 30 sec - exiting
called boinc_finish
</stderr_txt>
]]>
If it is not like the above check for -161 return code at the end - that may be a problem....
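
If you have a lot of results to go through, the same check can be roughed out in Python. This sketch only looks for the two message strings shown in the output above; the function name is made up, and finding "called boinc_finish" is no guarantee that the result validated or that credit was granted.

```python
HEARTBEAT_MSG = "No heartbeat from core client"
FINISH_MSG = "called boinc_finish"

def summarise_stderr(stderr_text):
    """Count heartbeat warnings and report whether boinc_finish was logged."""
    lines = [line.strip() for line in stderr_text.splitlines()]
    heartbeats = sum(1 for line in lines if line.startswith(HEARTBEAT_MSG))
    finished = any(line == FINISH_MSG for line in lines)
    return heartbeats, finished

# Sample text copied from the stderr output shown above (shortened).
sample = """\
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
called boinc_finish
"""
heartbeats, finished = summarise_stderr(sample)
print(f"{heartbeats} heartbeat warning(s); boinc_finish logged: {finished}")
```

Paste the stderr section from a result page into `sample` to try it on your own task.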

This is the output from the CPDN task which gave the same "no 'finished' file" message in case you're interested:
http://climateapps2.oucs.ox.ac.uk/cpdnb ... id=8912081

Explanation: http://boinc-wiki.info/Result_%27%28res ... ed%27_file