QuChemPedIA@home

Dirk Broer
Corsair
Posts: 1687
Joined: Thu Feb 20, 2014 11:24 pm
Location: Leiden, South Holland, Netherlands
Has thanked: 20 times
Been thanked: 34 times

#1 Re: QuChemPedIA@home

Post by Dirk Broer »

Do you have the invitation code then?

#2 Re: QuChemPedIA@home

Post by Dirk Broer »

Thanks! They did not accept the code you supplied, but a new one popped up after I pressed 'join'.

davidbam
General Bitchin'
Posts: 6023
Joined: Wed Aug 15, 2018 1:15 pm
Location: Huntly, Scotland
Has thanked: 72 times
Been thanked: 67 times

#3 Re: QuChemPedIA@home

Post by davidbam »

This is now a FB project and looks awkward enough that it might put people off. A chance of some points, perhaps?
I think this is fool-proof but could you just try it for me please? • There are 10 types of people in the world; those who understand binary, and those who don’t

Bryan
Boinc Brigadier
Posts: 2621
Joined: Thu May 21, 2015 6:18 pm
Has thanked: 0
Been thanked: 0

#4 Re: QuChemPedIA@home

Post by Bryan »

There is a native Linux app that doesn't require VBox, and it runs quite well. The only caveat is that you need to turn HT OFF if you have more than 32 threads. The app issues a taskset command with the affinity mask 0xFFFFFFFF, which puts all running WU onto the first 32 threads. The program launches child processes quite often, and every time one of those starts it reissues the 32-thread affinity mask, so you can't even set up your own script to use more than 32 threads.

I asked damotbe to change the affinity mask to allow at least 72 threads, but he said that they don't currently have a programmer, so it may or may not happen at some point.
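
For anyone unsure what that mask means: each bit in a taskset affinity mask stands for one logical CPU, so 0xFFFFFFFF allows CPUs 0 to 31 and nothing above. A small Python sketch of the arithmetic (the helper names are made up for illustration and are not part of the project's app):

```python
def mask_to_cpus(mask: int) -> list:
    """Return the logical CPU indices whose bits are set in an affinity mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def cpus_to_mask(n_cpus: int) -> int:
    """Build a mask with the lowest n_cpus bits set (CPUs 0 .. n_cpus - 1)."""
    return (1 << n_cpus) - 1


# The mask quoted in the post: 32 bits set, i.e. logical CPUs 0..31.
allowed = mask_to_cpus(0xFFFFFFFF)
print(len(allowed), allowed[0], allowed[-1])

# A mask covering 72 threads, the figure requested in the post.
# 72 bits is 18 hex digits, all of them 'f'.
print(hex(cpus_to_mask(72)))
```

Any process pinned with the 32-bit mask can never be scheduled on logical CPU 32 or higher, however many threads the machine has.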

#5 Re: QuChemPedIA@home

Post by davidbam »

So I guess it doesn't help to run several instances of 32 threads?

Is it worth trying with HT on and a different project on threads 33 and up?

#6 Re: QuChemPedIA@home

Post by Bryan »

No, regardless of how many instances are run ALL QuChem WU will be assigned to the 1st 32 threads with the taskset affinity mask.

You could certainly run another project on the top 32 threads, but then you are using HT and the QuChem WU will take twice as long running on threads as they would on full cores. The WU take from 18 to 30 hours running with HT off (IIRC).

Of course you could always get off your lazy butt and turn HT OFF .... just sayin' :roll:

#7 Re: QuChemPedIA@home

Post by davidbam »

Naw - won't be doing that. It would take all day to put a monitor / keyboard on all the headless workstations. We can't all afford a server mobo with IPMI :D

If I left HT on but only ran 32 threads on the 2990WX machines, would that be ALMOST the same as turning HT off?

#8 Re: QuChemPedIA@home

Post by Bryan »

No, you would wind up with the top 2 dies sitting idle and all 32 WU stuck on the 16 cores of the bottom 2 dies.
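
How many physical cores the first 32 logical CPUs cover depends on how the kernel numbers the HT siblings, and that can be checked from sysfs. A hedged Python sketch: the two layouts below are hypothetical 64-thread examples, not measurements from a 2990WX, and `read_siblings` only returns data on Linux.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> set:
    """Parse a kernel CPU list such as "0,32" or "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cores_covered(siblings: dict, n_logical: int) -> int:
    """Count distinct physical cores among logical CPUs 0 .. n_logical - 1.

    `siblings` maps a logical CPU index to its thread_siblings_list string.
    """
    cores = {frozenset(parse_cpu_list(siblings[cpu]))
             for cpu in range(n_logical) if cpu in siblings}
    return len(cores)


def read_siblings() -> dict:
    """Read the sibling lists from sysfs (Linux only; empty dict elsewhere)."""
    out = {}
    base = Path("/sys/devices/system/cpu")
    for path in base.glob("cpu[0-9]*/topology/thread_siblings_list"):
        out[int(path.parts[-3][3:])] = path.read_text()
    return out


# Hypothetical layout A: siblings are numbered next to each other (0-1, 2-3, ...).
adjacent = {cpu: f"{cpu - cpu % 2}-{cpu - cpu % 2 + 1}" for cpu in range(64)}
# Hypothetical layout B: the sibling of CPU k is CPU k + 32.
split = {cpu: f"{cpu % 32},{cpu % 32 + 32}" for cpu in range(64)}

print(cores_covered(adjacent, 32))  # 16 cores: two WU share each core
print(cores_covered(split, 32))     # 32 cores: one WU per core
```

On a real machine, `cores_covered(read_siblings(), 32)` shows which case applies; the 16-core outcome described above corresponds to layout A.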

BTW, unless they've changed it, the WU do NOT checkpoint, so once started you want to make sure they keep on keepin' on.

When they first started the project, ALL WU were assigned to the first thread on each CPU, so it had 2 threads available for processing. I had a script that ran every 2 minutes and would set the affinity mask for 72 threads. Needless to say, I was kicking some butt since I could use all threads. They changed the executable/wrapper so it would set the affinity mask to the first 32 threads/cores. The script is no longer useful because the program launches child processes quite frequently, and every time a new one launches it issues the taskset command and slams ALL running executables onto the first 32 threads.

HERE is a link to a PHP implementation, but as I said, all child processes issue the affinity mask, so it isn't all that useful anymore.
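
For illustration only, a re-pinning script of that general kind might look like the Python sketch below. This is not Bryan's script or the linked PHP one. The process name passed in is a placeholder, since the thread never names the executable, and the code relies on Linux-only facilities (`/proc` and `os.sched_setaffinity`).

```python
import os


def pids_named(name: str) -> list:
    """Return PIDs whose /proc/<pid>/comm equals `name` (Linux only).

    Note: the kernel truncates comm to 15 characters.
    """
    found = []
    if not os.path.isdir("/proc"):
        return found
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as fh:
                if fh.read().strip() == name:
                    found.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return found


def repin(name: str, n_cpus: int) -> int:
    """Allow every process called `name` to run on CPUs 0 .. n_cpus - 1.

    Returns the number of processes whose affinity was changed.
    """
    cpus = set(range(min(n_cpus, os.cpu_count() or 1)))
    changed = 0
    for pid in pids_named(name):
        try:
            os.sched_setaffinity(pid, cpus)
            changed += 1
        except OSError:
            continue  # process gone, or not ours to change
    return changed
```

Calling something like `repin("quchem_app", 72)` from cron every 2 minutes (with the placeholder replaced by the real process name) would widen the mask, but as described above the app reapplies its own 32-thread mask whenever a child process starts, so the effect does not last.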

#9 Re: QuChemPedIA@home

Post by davidbam »

30 hours with no checkpointing !!!!! Sounds as bad as SRbase

I'll maybe try it on one 2990wx with HT off but, man, when I said 'difficult', I didn't realise how bdooly difficult.

#10 Re: QuChemPedIA@home

Post by Bryan »

The server status is showing the avg to be 1.7 hours so I guess they are running a shorter batch of WU now.

Hal Bregg
Boinc Sergeant
Posts: 175
Joined: Thu Nov 08, 2018 1:22 pm
Location: Cumbria
Has thanked: 0
Been thanked: 0

#11 Re: QuChemPedIA@home

Post by Hal Bregg »

davidBAM wrote: Wed Jan 01, 2020 3:20 pm
30 hours with no checkpointing !!!!! Sounds as bad as SRbase

I'll maybe try it on one 2990wx with HT off but, man, when I said 'difficult', I didn't realise how bdooly difficult.

SRBase has checkpoints. The progress bar doesn't reflect the actual percentage of work done if you restart the task (see the stderr.txt file for actual work progress).

More about checkpoints in this thread
http://srbase.my-firewall.org/sr5/forum ... =1001#4460

#12 Re: QuChemPedIA@home

Post by davidbam »

Oh, TYVM. That would certainly help a lot

Megacruncher
G.L.S.B.
Posts: 4426
Joined: Mon May 29, 2006 11:33 pm
Location: Edinburgh, Scotland
Has thanked: 7 times
Been thanked: 14 times

#13 Re: QuChemPedIA@home

Post by Megacruncher »

I only joined this project yesterday, mostly because I hadn't noticed it before.
It doesn't seem to be causing any problems and the credit isn't too bad. As well as getting some FB points for us, I'm making it my next million-credit target.
I notice an issue above about it not working well with hyperthreading. But my experience suggests that it is working okay. My Threadripper 1950X 16-Core Processor Linux machine is running 32 WU at a time without error and is more or less matching a 32 CPU instance on my Threadripper 2990WX 32-Core Processor, also Linux, machine.
Willie the Megacruncher

#14 Re: QuChemPedIA@home

Post by Bryan »

The problem isn't hyperthreading. The issue is it slams ALL WU onto the 1st 32 threads of a machine. If you only have 32 threads then it isn't an issue.

Alez
[ TSBT's Pirate ]
Posts: 10238
Joined: Thu Oct 04, 2012 1:22 pm
Location: roaming the planet
Has thanked: 21 times
Been thanked: 57 times

#15 Re: QuChemPedIA@home

Post by Alez »

It also seems to require quite a bit of memory. I had more than a few tasks sitting at "waiting for memory" whilst running it.
The best form of help from above is a sniper on the rooftop....

#16 Re: QuChemPedIA@home

Post by Megacruncher »

I just tried it on Windows. Total wipeout. 169 tasks. 169 errors. I'll stick to Linux.

#17 Re: QuChemPedIA@home

Post by Bryan »

Linux runs native, Win runs VBox.

#18 Re: QuChemPedIA@home

Post by Megacruncher »

Bryan wrote: Sun Jan 05, 2020 11:32 pm
Win runs VBox.

I'll not bother then!

#19 Re: QuChemPedIA@home

Post by Alez »

Same for me, VBox has been a pointless exercise.

#20 Re: QuChemPedIA@home

Post by davidbam »

Not a good start from me. Two WU errored and 50 were invalid after significant run times. Only 3 validated, at about 8 pts per thread per hour on an OC 3900X. To add insult to injury, I forgot to join the team, so even they didn't count.

Is this normal? (Linux)

UBT - Woodles
Boinc Corporal
Posts: 69
Joined: Wed Jan 24, 2018 9:43 am
Has thanked: 2 times
Been thanked: 0

#21 Re: QuChemPedIA@home

Post by UBT - Woodles »

The only errors I've had have been during downloading or cancelled by the server, none resulted in any time being lost.

However, I do have about a third ending up as invalid, most after 30-ish seconds but some running to the normal execution time.

Everyone has invalids, it's a project "feature" :)

I'm getting about 50 credits an hour with a single thread, are you running them multithreaded?

#22 Re: QuChemPedIA@home

Post by davidbam »

Thanks. No, not running MT. Does the last setting really mean the number of threads per WU?
Image

#23 Re: QuChemPedIA@home

Post by UBT - Woodles »

You would think so but no. I'm not sure what it's used for, I have it set to "No Limit" but tasks run on a single thread.

They have no work at the moment so I can't try different options.

The project developer has said that they've experimented and there's no advantage to using more than one core per workunit.

#24 Re: QuChemPedIA@home

Post by davidbam »

Okay thanks. Maybe I'll try again later on a lesser machine but there is no way I am putting my thoroughbred 3900X onto something which has 50 out of 55 invalid :D

#25 Re: QuChemPedIA@home

Post by UBT - Woodles »

Makes sense :lol:

#26 Re: QuChemPedIA@home

Post by davidbam »

Tried an Intel machine this time, with an older OS (Ubuntu 18.10): 3 valid, 74 invalid.

Not a good ratio, especially when some of the invalid WU have run for 20-30 mins before failing.

#27 Re: QuChemPedIA@home

Post by UBT - Woodles »

I only have the one box on QuChem so can't help with different CPUs or OS.

If it helps, I've just downloaded four tasks and all ended up as invalid, three after 30 seconds, one after twenty minutes. Normally I'd get at least one valid out of four but it's a small sample size.

#28 Re: QuChemPedIA@home

Post by davidbam »

The CPU/OS may be a red herring but, out of interest, what are you running on, please? I'll see if I have anything close.

#29 Re: QuChemPedIA@home

Post by UBT - Woodles »

It's a 1950X, default settings with 32 Gig of RAM running Ubuntu 19.04 if I remember correctly. QuChem is down again so I can't check.

Edit: Just had a thought and checked on WUProp, details are correct (also Boinc version 7.14.2 if it matters?)

#30 Re: QuChemPedIA@home

Post by Dirk Broer »

Any specs on the machine in question? 64-bit is pretty standard on modern machines now.
I'd even go so far as saying that QuChemPedIA only has 64-bit apps for vbox (used for Windows and MacOS), and needs at least a 64-bit CPU for the rest (=Linux).