Project server code update

Claggy

Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 112875 - Posted: 4 Jun 2014, 20:17:21 UTC - in response to Message 112874.  
Last modified: 4 Jun 2014, 20:31:41 UTC

Holmis reported the same for BRP4G-cuda32-nv301 in the problems area, except he inoculated his against "Exit status 197 EXIT_TIME_LIMIT_EXCEEDED" with a big boost to rsc_fpops_bound.

I guess one of us (and that probably means me) should fire up a GPU fetch and compare the calculations in the server log with what actually ends up in client_state.xml.

I saw his post after I posted mine. I'm letting them all error, as I want the fix to come from the project or the BOINC devs rather than from a workaround.

From my client_state.xml, the BRP4G flops value has three more digits than the BRP5 app's (I haven't received any BRP5 work yet, so that entry shouldn't have been updated yet):

<app_version>
    <app_name>einsteinbinary_BRP5</app_name>
    <version_num>139</version_num>
    <platform>windows_x86_64</platform>
    <avg_ncpus>0.929041</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <flops>38787392469.934639</flops>
    <plan_class>BRP5-opencl-ati</plan_class>
    <api_version>7.2.2</api_version>
    <file_ref>
        <file_name>einsteinbinary_BRP5_1.39_windows_x86_64__BRP5-opencl-ati.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>einsteinbinary_BRP4_1.00_graphics_windows_intelx86.exe</file_name>
        <open_name>graphics_app</open_name>
    </file_ref>
    <coproc>
        <type>ATI</type>
        <count>1.000000</count>
    </coproc>
    <gpu_ram>377487360.000000</gpu_ram>
    <dont_throttle/>
</app_version>
<app_version>
    <app_name>einsteinbinary_BRP4G</app_name>
    <version_num>134</version_num>
    <platform>windows_x86_64</platform>
    <avg_ncpus>0.989277</avg_ncpus>
    <max_ncpus>0.989277</max_ncpus>
    <flops>41492924173738.344000</flops>
    <plan_class>BRP4G-opencl-ati</plan_class>
    <api_version>7.1.0</api_version>
    <file_ref>
        <file_name>einsteinbinary_BRP4G_1.34_windows_x86_64__BRP4G-opencl-ati.exe</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>einsteinbinary_BRP4_1.00_graphics_windows_intelx86.exe</file_name>
        <open_name>graphics_app</open_name>
    </file_ref>
    <coproc>
        <type>ATI</type>
        <count>1.000000</count>
    </coproc>
    <gpu_ram>377487360.000000</gpu_ram>
    <dont_throttle/>
</app_version>
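
As a rough cross-check of the "extra three digits" observation, the sketch below parses trimmed copies of the two app_version entries above and prints each flops value in GFLOPS along with their ratio. The client_state wrapper element and the helper name flops_by_plan_class are assumptions added only so the fragment parses on its own; on a real host you would read the full client_state.xml instead.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two <app_version> blocks quoted above.
# The <client_state> wrapper is an assumption so the fragment has one root.
SNIPPET = """
<client_state>
  <app_version>
    <app_name>einsteinbinary_BRP5</app_name>
    <flops>38787392469.934639</flops>
    <plan_class>BRP5-opencl-ati</plan_class>
  </app_version>
  <app_version>
    <app_name>einsteinbinary_BRP4G</app_name>
    <flops>41492924173738.344000</flops>
    <plan_class>BRP4G-opencl-ati</plan_class>
  </app_version>
</client_state>
"""


def flops_by_plan_class(xml_text):
    """Map each plan_class to its <flops> value (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        av.findtext("plan_class"): float(av.findtext("flops"))
        for av in root.iter("app_version")
    }


flops = flops_by_plan_class(SNIPPET)
for plan_class, value in flops.items():
    print(f"{plan_class}: {value / 1e9:.2f} GFLOPS")

# How many times faster the client believes the BRP4G app is than BRP5.
ratio = flops["BRP4G-opencl-ati"] / flops["BRP5-opencl-ati"]
print(f"BRP4G / BRP5 ratio: {ratio:.0f}x")
```

A ratio in the region of a thousand between two apps on the same ATI card is what the three extra digits amount to.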


Claggy
ID: 112875
Richard Haselgrove

Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 112876 - Posted: 4 Jun 2014, 20:37:34 UTC
Last modified: 4 Jun 2014, 20:49:38 UTC

And from my lappy:

2014-06-04 20:28:09.8459 [PID=26529] [version] [AV#720] app_plan() returned false
2014-06-04 20:28:09.8459 [PID=26529] [version] [AV#716] (BRP4G-cuda32-nv301) adjusting projected flops based on PFC avg: 2124.60G
2014-06-04 20:28:09.8459 [PID=26529] [version] Best version of app einsteinbinary_BRP4G is [AV#716] (2124.60 GFLOPS)
2014-06-04 20:28:09.8459 [PID=26529] [send] est delay 0, skipping deadline check
2014-06-04 20:28:09.8460 [PID=26529] [version] get_app_version(): getting app version for WU#599227 (p2030.20131124.G175.87-01.48.S.b1s0g0.00000_1328) appid:29
2014-06-04 20:28:09.8460 [PID=26529] [version] returning cached version: [AV#716]
2014-06-04 20:28:09.8460 [PID=26529] [send] est delay 0, skipping deadline check
2014-06-04 20:28:09.8509 [PID=26529] [send] Sending app_version einsteinbinary_BRP4G 2 133 BRP4G-cuda32-nv301; projected 2124.60 GFLOPS
2014-06-04 20:28:09.8510 [PID=26529] [send] est. duration for WU 599227: unscaled 131.79 scaled 131.85
2014-06-04 20:28:09.8510 [PID=26529] [send] [HOST#11359] sending [RESULT#1455845 p2030.20131124.G175.87-01.48.S.b1s0g0.00000_1328_4] (est. dur. 131.85s (0h02m11s85)) (max time 2635.79s (0h43m55s79))

Both server and client are estimating 131 seconds. But a laptop NV GT 420M with 192 GFLOPS peak, running at 2.1 TFLOPS? Well, we wanted to check the PFC averages...
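
To put numbers on that, here is a back-of-the-envelope sketch using only figures from the log lines above plus the 192 GFLOPS peak quoted for the GT 420M. It assumes the estimated duration is simply the task's estimated work divided by the projected speed, which is how the figures in the log appear to relate; the variable names are illustrative, not BOINC's.

```python
# Figures copied from the scheduler log lines above.
PROJECTED_GFLOPS = 2124.60  # "projected 2124.60 GFLOPS"
EST_UNSCALED_S = 131.79     # "est. duration ... unscaled 131.79"
MAX_TIME_S = 2635.79        # "max time 2635.79s"
PEAK_GFLOPS = 192.0         # theoretical peak quoted for the GT 420M

# Assumption: estimate = work / projected speed, so work = estimate * speed.
implied_work_gflop = EST_UNSCALED_S * PROJECTED_GFLOPS

# Ratio between the time limit and the estimate, as observed in the log.
bound_factor = MAX_TIME_S / EST_UNSCALED_S

# Run time if the card somehow sustained its full theoretical peak.
best_case_s = implied_work_gflop / PEAK_GFLOPS

# Slowest sustained speed that still finishes inside the time limit.
min_gflops = implied_work_gflop / MAX_TIME_S

print(f"implied work:        {implied_work_gflop:.0f} GFLOP")
print(f"limit / estimate:    {bound_factor:.1f}")
print(f"best case at peak:   {best_case_s:.0f} s")
print(f"min speed for limit: {min_gflops:.1f} GFLOPS")
```

If the assumption holds, the time limit is about 20 times the estimate, so the task only survives if the card sustains a little over 100 GFLOPS, more than half its theoretical peak. Anything slower ends in EXIT_TIME_LIMIT_EXCEEDED.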

Edit: the real problem in client_state.xml is:

<app_version>
    <app_name>einsteinbinary_BRP4G</app_name>
    <version_num>133</version_num>
    <platform>windows_intelx86</platform>
    <avg_ncpus>0.895024</avg_ncpus>
    <max_ncpus>0.895024</max_ncpus>
    <flops>2124597383074.081300</flops>
    <plan_class>BRP4G-cuda32-nv301</plan_class>
ID: 112876
Richard Haselgrove

Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 112877 - Posted: 4 Jun 2014, 21:21:15 UTC - in response to Message 112875.  

I saw his post after I posted mine. I'm letting them all error, as I want the fix to come from the project or the BOINC devs rather than from a workaround.

Understood. I'm going to try to run mine, to establish a real APR to contrast with that stupid 'PFC avg' initial estimate. Hopefully that'll generate some more ammunition to throw at David. Thank goodness the limit of 32 tasks per day worked properly...
ID: 112877
Richard Haselgrove

Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 112878 - Posted: 5 Jun 2014, 7:17:12 UTC

So far, every single one of the CasA tasks I've run since this test started has ended in 'validate error'. That's across several machines, but the worst example is host 9130.
ID: 112878
Richard Haselgrove

Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 112879 - Posted: 5 Jun 2014, 8:54:59 UTC
Last modified: 5 Jun 2014, 8:57:03 UTC

I see the CasA WUs (which were very old, generated in January, and incompatible with the current validator) have now been cancelled.

I'll abort all unstarted examples: should we abort jobs in progress too?

Edit: hold that thought. There are newly generated tasks in the database too; don't abort those.
ID: 112879
Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer

Joined: 15 Oct 04
Posts: 1956
Credit: 6,218,130
RAC: 0
Message 112880 - Posted: 5 Jun 2014, 8:56:19 UTC - in response to Message 112879.  

I see the CasA WUs (which were very old, generated in January, and incompatible with the current validator) have now been cancelled.

I'll abort all unstarted examples: should we abort jobs in progress too?


Yes, please.

BM
ID: 112880
Richard Haselgrove

Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 112881 - Posted: 5 Jun 2014, 9:22:15 UTC - in response to Message 112880.  

All the suspect CasA (GW) tasks in the database have been cancelled and unconditionally aborted by the project. Any you still have running on your computers (after doing a project update) should run OK, as should any new ones you get issued.
ID: 112881
Claggy

Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 112882 - Posted: 5 Jun 2014, 9:30:21 UTC - in response to Message 112881.  
Last modified: 5 Jun 2014, 9:52:24 UTC

All the suspect CasA (GW) tasks in the database have been cancelled and unconditionally aborted by the project. Any you still have running on your computers (after doing a project update) should run OK, as should any new ones you get issued.

I got some fresh CasA tasks on my last contact, but I also got a single BRP task, even though BRP is deselected in the work venue for that host
(I do have 'Run beta/test application versions?' and 'Run CPU versions of applications for which GPU versions are available' selected, though):

https://albert.phys.uwm.edu/host_sched_logs/8/8143

https://albert.phys.uwm.edu/result.php?resultid=1431784

In progress tasks for computer 8143

Edit: added the log so we don't lose it:

2014-06-05 09:04:29.2602 [PID=4416] Request: [USER#xxxxx] [HOST#8143] [IP xxx.xxx.xxx.103] client 7.2.42
2014-06-05 09:04:29.2613 [PID=4416 ] [send] [HOST#8143] app version 588 is reliable
2014-06-05 09:04:29.2613 [PID=4416 ] [send] set_trust: random choice for cons valid 43: yes
2014-06-05 09:04:29.2613 [PID=4416 ] [send] [AV#649] not reliable; cons valid 1 < 10
2014-06-05 09:04:29.2613 [PID=4416 ] [send] set_trust: cons valid 1 < 10, don't use single replication
2014-06-05 09:04:29.2613 [PID=4416 ] [send] [AV#707] not reliable; cons valid 0 < 10
2014-06-05 09:04:29.2613 [PID=4416 ] [send] set_trust: cons valid 0 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#710] not reliable; cons valid 1 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 1 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#711] not reliable; cons valid 1 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 1 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#712] not reliable; cons valid 4 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 4 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#713] not reliable; cons valid 2 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 2 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#716] not reliable; cons valid 7 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 7 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#721] not reliable; cons valid 0 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 0 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#728] not reliable; cons valid 6 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 6 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#729] not reliable; cons valid 5 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 5 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#737] not reliable; cons valid 0 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 0 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#766] not reliable; cons valid 1 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 1 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [AV#768] not reliable; cons valid 3 < 10
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: cons valid 3 < 10, don't use single replication
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [HOST#8143] app version 842 is reliable
2014-06-05 09:04:29.2614 [PID=4416 ] [send] set_trust: random choice for cons valid 13: yes
2014-06-05 09:04:29.2614 [PID=4416 ] [send] [HOST#8143] app version 843 is reliable
2014-06-05 09:04:29.2615 [PID=4416 ] [send] set_trust: random choice for cons valid 12: yes
2014-06-05 09:04:29.2615 [PID=4416 ] [send] Not using matchmaker scheduling; Not using EDF sim
2014-06-05 09:04:29.2615 [PID=4416 ] [send] CPU: req 259200.00 sec, 3.00 instances; est delay 0.00
2014-06-05 09:04:29.2615 [PID=4416 ] [send] AMD/ATI GPU: req 0.00 sec, 0.00 instances; est delay 0.00
2014-06-05 09:04:29.2616 [PID=4416 ] [send] work_req_seconds: 259200.00 secs
2014-06-05 09:04:29.2616 [PID=4416 ] [send] available disk 96.01 GB, work_buf_min 64800
2014-06-05 09:04:29.2616 [PID=4416 ] [send] on_frac 0.741574 active_frac 0.983919 gpu_active_frac 0.983914
2014-06-05 09:04:29.2616 [PID=4416 ] [send] CPU features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx tm2 pbe
2014-06-05 09:04:29.2622 [PID=4416 ] [mixed] sending locality work first
2014-06-05 09:04:29.2622 [PID=4416 ] [locality] [HOST#8143] removing file rand_PAS.bank.v3 from file_infos list
2014-06-05 09:04:29.2623 [PID=4416 ] [locality] [HOST#8143] removing file JPLEPH.405 from file_infos list
2014-06-05 09:04:29.2624 [PID=4416 ] [locality] [HOST#8143] removing file stochastic_full.bank from file_infos list
2014-06-05 09:04:29.2624 [PID=4416 ] [locality] [HOST#8143] removing file earth_09_11 from file_infos list
2014-06-05 09:04:29.2624 [PID=4416 ] [locality] [HOST#8143] removing file sun_09_11 from file_infos list
2014-06-05 09:04:29.2625 [PID=4416 ] [locality] [HOST#8143] removing file l1_0071.20_S6Direct from file_infos list
2014-06-05 09:04:29.2625 [PID=4416 ] [locality] [HOST#8143] removing file l1_0071.25_S6Direct from file_infos list
2014-06-05 09:04:29.2625 [PID=4416 ] [locality] [HOST#8143] removing file l1_0071.30_S6Direct from file_infos list
2014-06-05 09:04:29.2625 [PID=4416 ] [locality] [HOST#8143] removing file l1_0071.35_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0071.40_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0071.45_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0996.80_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0996.85_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0996.90_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0996.95_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.00_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.05_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.10_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.15_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.20_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.25_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.30_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.35_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.40_S6Direct from file_infos list
2014-06-05 09:04:29.2626 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.45_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.50_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.55_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.60_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.65_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.70_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.75_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.80_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] removing file l1_0997.85_S6Direct from file_infos list
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] has file h1_0071.20_S6Direct
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] has file h1_0071.25_S6Direct
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] has file h1_0071.30_S6Direct
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] has file h1_0071.35_S6Direct
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] has file h1_0071.40_S6Direct
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] has file h1_0071.45_S6Direct
2014-06-05 09:04:29.2627 [PID=4416 ] [locality] [HOST#8143] has file h1_0996.80_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0996.85_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0996.90_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0996.95_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.00_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.05_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.10_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.15_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.20_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.25_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.30_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.35_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.40_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.45_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.50_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.55_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.60_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.65_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.70_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.75_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.80_S6Direct
2014-06-05 09:04:29.2628 [PID=4416 ] [locality] [HOST#8143] has file h1_0997.85_S6Direct
2014-06-05 09:04:29.2657 [PID=4416 ] [version] get_app_version(): getting app version for WU#595851 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1943) appid:21
2014-06-05 09:04:29.2657 [PID=4416 ] [version] looking for version of einsteinbinary_BRP4
2014-06-05 09:04:29.2658 [PID=4416 ] [version] Checking plan class 'BRP4X64'
2014-06-05 09:04:29.2667 [PID=4416 ] [version] reading plan classes from file '/BOINC/projects/AlbertAtHome/plan_class_spec.xml'
2014-06-05 09:04:29.2667 [PID=4416 ] [version] [AV#588] (BRP4X64) setting projected flops based on host elapsed time avg: 4.93G
2014-06-05 09:04:29.2667 [PID=4416 ] [version] [AV#588] (BRP4X64) comparison pfc: 5.17G et: 4.93G
2014-06-05 09:04:29.2667 [PID=4416 ] [version] Best app version is now AV588 (5.05 GFLOP)
2014-06-05 09:04:29.2668 [PID=4416 ] [version] [AV#588] (BRP4X64) setting projected flops based on host elapsed time avg: 4.93G
2014-06-05 09:04:29.2668 [PID=4416 ] [version] [AV#588] (BRP4X64) comparison pfc: 5.17G et: 4.93G
2014-06-05 09:04:29.2668 [PID=4416 ] [version] Best version of app einsteinbinary_BRP4 is [AV#588] (4.93 GFLOPS)
2014-06-05 09:04:29.2668 [PID=4416 ] [send] est delay 0, skipping deadline check
2014-06-05 09:04:29.2688 [PID=4416 ] [send] Sending app_version einsteinbinary_BRP4 7 133 BRP4X64; projected 4.93 GFLOPS
2014-06-05 09:04:29.2689 [PID=4416 ] [CRITICAL] No filename found in [WU#595851 p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1943]
2014-06-05 09:04:29.2689 [PID=4416 ] [send] est. duration for WU 595851: unscaled 3548.71 scaled 4863.59
2014-06-05 09:04:29.2689 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1431784 p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1943_0] (est. dur. 4863.59s (1h21m03s58)) (max time 70974.20s (19h42m54s19))
2014-06-05 09:04:29.2714 [PID=4416 ] [locality] send_old_work(p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1943_0) sent result created 463.2 hours ago [RESULT#1431784]
2014-06-05 09:04:29.2714 [PID=4416 ] [locality] Note: sent NON-LOCALITY result p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1943_0
2014-06-05 09:04:29.2714 [PID=4416 ] [locality] send_results_for_file(h1_0071.20_S6Direct)
2014-06-05 09:04:29.2734 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.20_S6Direct, 0) prev_result.id=1454321
2014-06-05 09:04:29.2747 [PID=4416 ] [version] get_app_version(): getting app version for WU#606152 (h1_0071.20_S6Direct__S6CasAf40_71.35Hz_1) appid:28
2014-06-05 09:04:29.2747 [PID=4416 ] [version] looking for version of einstein_S6CasA
2014-06-05 09:04:29.2747 [PID=4416 ] [version] Checking plan class 'SSE2'
2014-06-05 09:04:29.2748 [PID=4416 ] [version] [AV#707] (SSE2) adjusting projected flops based on PFC avg: 2.45G
2014-06-05 09:04:29.2748 [PID=4416 ] [version] Best app version is now AV707 (8.22 GFLOP)
2014-06-05 09:04:29.2748 [PID=4416 ] [version] [AV#707] (SSE2) adjusting projected flops based on PFC avg: 2.45G
2014-06-05 09:04:29.2748 [PID=4416 ] [version] Best version of app einstein_S6CasA is [AV#707] (2.45 GFLOPS)
2014-06-05 09:04:29.2748 [PID=4416 ] [send] est. duration for WU 606152: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.2748 [PID=4416 ] [send] [WU#606152] meets deadline: 607.95 + 36212.11 < 604800
2014-06-05 09:04:29.2768 [PID=4416 ] [send] Sending app_version einstein_S6CasA 2 105 SSE2; projected 2.45 GFLOPS
2014-06-05 09:04:29.2770 [PID=4416 ] [locality] [HOST#8143] Already has file h1_0071.20_S6Direct
2014-06-05 09:04:29.2770 [PID=4416 ] [locality] [HOST#8143] reducing disk needed for WU by 5387688 bytes (length of h1_0071.20_S6Direct)
2014-06-05 09:04:29.2770 [PID=4416 ] [send] est. duration for WU 606152: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.2770 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1454335 h1_0071.20_S6Direct__S6CasAf40_71.35Hz_1_0] (est. dur. 36212.11s (10h03m32s10)) (max time 528442.38s (146h47m22s38))
2014-06-05 09:04:29.4721 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.20_S6Direct, 1) prev_result.id=1454335
2014-06-05 09:04:29.4737 [PID=4416 ] [version] get_app_version(): getting app version for WU#606153 (h1_0071.20_S6Direct__S6CasAf40_71.35Hz_0) appid:28
2014-06-05 09:04:29.4737 [PID=4416 ] [version] returning cached version: [AV#707]
2014-06-05 09:04:29.4738 [PID=4416 ] [send] est. duration for WU 606153: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.4738 [PID=4416 ] [send] [WU#606153] meets deadline: 5134.46 + 36212.11 < 604800
2014-06-05 09:04:29.4748 [PID=4416 ] [send] Sending app_version einstein_S6CasA 2 105 SSE2; projected 2.45 GFLOPS
2014-06-05 09:04:29.4751 [PID=4416 ] [locality] [HOST#8143] Already has file h1_0071.20_S6Direct
2014-06-05 09:04:29.4752 [PID=4416 ] [locality] [HOST#8143] reducing disk needed for WU by 5387688 bytes (length of h1_0071.20_S6Direct)
2014-06-05 09:04:29.4752 [PID=4416 ] [send] est. duration for WU 606153: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.4752 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1454337 h1_0071.20_S6Direct__S6CasAf40_71.35Hz_0_0] (est. dur. 36212.11s (10h03m32s10)) (max time 528442.38s (146h47m22s38))
2014-06-05 09:04:29.4766 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.20_S6Direct, 2) prev_result.id=1454337
2014-06-05 09:04:29.4779 [PID=4416 ] [debug] [locality] trigger h1_0071.20_S6Direct state after retrieval: nw=0 wa=1 nwa=1 wsr=0
2014-06-05 09:04:29.4780 [PID=4416 ] [locality] work generator says no work remaining for trigger h1_0071.20_S6Direct
2014-06-05 09:04:29.4780 [PID=4416 ] [locality] make_more_work_for_file(h1_0071.20_S6Direct, 2)=-1
2014-06-05 09:04:29.4785 [PID=4416 ] [locality] send_results_for_file(h1_0071.25_S6Direct)
2014-06-05 09:04:29.4792 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.25_S6Direct, 0) prev_result.id=1281036
2014-06-05 09:04:29.4804 [PID=4416 ] [version] get_app_version(): getting app version for WU#606150 (h1_0071.25_S6Direct__S6CasAf40_71.35Hz_4) appid:28
2014-06-05 09:04:29.4804 [PID=4416 ] [version] returning cached version: [AV#707]
2014-06-05 09:04:29.4805 [PID=4416 ] [send] est. duration for WU 606150: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.4805 [PID=4416 ] [send] [WU#606150] meets deadline: 9660.98 + 36212.11 < 604800
2014-06-05 09:04:29.4812 [PID=4416 ] [send] Sending app_version einstein_S6CasA 2 105 SSE2; projected 2.45 GFLOPS
2014-06-05 09:04:29.4814 [PID=4416 ] [locality] [HOST#8143] Already has file h1_0071.25_S6Direct
2014-06-05 09:04:29.4814 [PID=4416 ] [locality] [HOST#8143] reducing disk needed for WU by 5387688 bytes (length of h1_0071.25_S6Direct)
2014-06-05 09:04:29.4814 [PID=4416 ] [send] est. duration for WU 606150: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.4814 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1454328 h1_0071.25_S6Direct__S6CasAf40_71.35Hz_4_1] (est. dur. 36212.11s (10h03m32s10)) (max time 528442.38s (146h47m22s38))
2014-06-05 09:04:29.4824 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.25_S6Direct, 1) prev_result.id=1454328
2014-06-05 09:04:29.4834 [PID=4416 ] [version] get_app_version(): getting app version for WU#606151 (h1_0071.25_S6Direct__S6CasAf40_71.35Hz_3) appid:28
2014-06-05 09:04:29.4834 [PID=4416 ] [version] returning cached version: [AV#707]
2014-06-05 09:04:29.4834 [PID=4416 ] [send] est. duration for WU 606151: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.4835 [PID=4416 ] [send] [WU#606151] meets deadline: 14187.49 + 36212.11 < 604800
2014-06-05 09:04:29.4841 [PID=4416 ] [send] Sending app_version einstein_S6CasA 2 105 SSE2; projected 2.45 GFLOPS
2014-06-05 09:04:29.4845 [PID=4416 ] [locality] [HOST#8143] Already has file h1_0071.25_S6Direct
2014-06-05 09:04:29.4846 [PID=4416 ] [locality] [HOST#8143] reducing disk needed for WU by 5387688 bytes (length of h1_0071.25_S6Direct)
2014-06-05 09:04:29.4846 [PID=4416 ] [send] est. duration for WU 606151: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.4846 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1454330 h1_0071.25_S6Direct__S6CasAf40_71.35Hz_3_1] (est. dur. 36212.11s (10h03m32s10)) (max time 528442.38s (146h47m22s38))
2014-06-05 09:04:29.5814 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.25_S6Direct, 2) prev_result.id=1454330
2014-06-05 09:04:29.5829 [PID=4416 ] [version] get_app_version(): getting app version for WU#606727 (h1_0071.25_S6Direct__S6CasAf40_71.35Hz_2) appid:28
2014-06-05 09:04:29.5829 [PID=4416 ] [version] returning cached version: [AV#707]
2014-06-05 09:04:29.5829 [PID=4416 ] [send] est. duration for WU 606727: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.5829 [PID=4416 ] [send] [WU#606727] meets deadline: 18714.00 + 36212.11 < 604800
2014-06-05 09:04:29.5838 [PID=4416 ] [send] Sending app_version einstein_S6CasA 2 105 SSE2; projected 2.45 GFLOPS
2014-06-05 09:04:29.5840 [PID=4416 ] [locality] [HOST#8143] Already has file h1_0071.25_S6Direct
2014-06-05 09:04:29.5840 [PID=4416 ] [locality] [HOST#8143] reducing disk needed for WU by 5387688 bytes (length of h1_0071.25_S6Direct)
2014-06-05 09:04:29.5840 [PID=4416 ] [send] est. duration for WU 606727: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.5840 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1455921 h1_0071.25_S6Direct__S6CasAf40_71.35Hz_2_0] (est. dur. 36212.11s (10h03m32s10)) (max time 528442.38s (146h47m22s38))
2014-06-05 09:04:29.5851 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.25_S6Direct, 3) prev_result.id=1455921
2014-06-05 09:04:29.5863 [PID=4416 ] [version] get_app_version(): getting app version for WU#606728 (h1_0071.25_S6Direct__S6CasAf40_71.4Hz_1) appid:28
2014-06-05 09:04:29.5863 [PID=4416 ] [version] returning cached version: [AV#707]
2014-06-05 09:04:29.5863 [PID=4416 ] [send] est. duration for WU 606728: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.5863 [PID=4416 ] [send] [WU#606728] meets deadline: 23240.52 + 36212.11 < 604800
2014-06-05 09:04:29.5871 [PID=4416 ] [send] Sending app_version einstein_S6CasA 2 105 SSE2; projected 2.45 GFLOPS
2014-06-05 09:04:29.5873 [PID=4416 ] [locality] [HOST#8143] Already has file h1_0071.25_S6Direct
2014-06-05 09:04:29.5873 [PID=4416 ] [locality] [HOST#8143] reducing disk needed for WU by 5387688 bytes (length of h1_0071.25_S6Direct)
2014-06-05 09:04:29.5873 [PID=4416 ] [send] est. duration for WU 606728: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:29.5873 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1455923 h1_0071.25_S6Direct__S6CasAf40_71.4Hz_1_0] (est. dur. 36212.11s (10h03m32s10)) (max time 528442.38s (146h47m22s38))
2014-06-05 09:04:29.5884 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.25_S6Direct, 4) prev_result.id=1455923
2014-06-05 09:04:29.5893 [PID=4416 ] [debug] [locality] trigger h1_0071.25_S6Direct state after retrieval: nw=0 wa=1 nwa=0 wsr=0
2014-06-05 09:04:29.5902 [PID=4416 ] [locality] make_more_work_for_file(h1_0071.25_S6Direct, 4)=0
2014-06-05 09:04:31.5905 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.25_S6Direct, 5) prev_result.id=1455923
2014-06-05 09:04:31.5915 [PID=4416 ] [debug] [locality] trigger h1_0071.25_S6Direct state after retrieval: nw=0 wa=1 nwa=1 wsr=0
2014-06-05 09:04:31.5915 [PID=4416 ] [locality] work generator says no work remaining for trigger h1_0071.25_S6Direct
2014-06-05 09:04:31.5915 [PID=4416 ] [locality] make_more_work_for_file(h1_0071.25_S6Direct, 5)=-1
2014-06-05 09:04:31.5920 [PID=4416 ] [locality] send_results_for_file(h1_0071.30_S6Direct)
2014-06-05 09:04:31.5926 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.30_S6Direct, 0) prev_result.id=0
2014-06-05 09:04:31.5935 [PID=4416 ] [version] get_app_version(): getting app version for WU#606767 (h1_0071.30_S6Direct__S6CasAf40_71.4Hz_5) appid:28
2014-06-05 09:04:31.5935 [PID=4416 ] [version] returning cached version: [AV#707]
2014-06-05 09:04:31.5935 [PID=4416 ] [send] est. duration for WU 606767: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:31.5935 [PID=4416 ] [send] [WU#606767] meets deadline: 27767.03 + 36212.11 < 604800
2014-06-05 09:04:31.5944 [PID=4416 ] [send] Sending app_version einstein_S6CasA 2 105 SSE2; projected 2.45 GFLOPS
2014-06-05 09:04:31.5945 [PID=4416 ] [locality] [HOST#8143] Already has file h1_0071.30_S6Direct
2014-06-05 09:04:31.5946 [PID=4416 ] [locality] [HOST#8143] reducing disk needed for WU by 5387688 bytes (length of h1_0071.30_S6Direct)
2014-06-05 09:04:31.5946 [PID=4416 ] [send] est. duration for WU 606767: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:31.5946 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1456528 h1_0071.30_S6Direct__S6CasAf40_71.4Hz_5_0] (est. dur. 36212.11s (10h03m32s10)) (max time 528442.38s (146h47m22s38))
2014-06-05 09:04:31.5960 [PID=4416 ] [locality] in_send_results_for_file(h1_0071.30_S6Direct, 1) prev_result.id=1456528
2014-06-05 09:04:31.5971 [PID=4416 ] [version] get_app_version(): getting app version for WU#606768 (h1_0071.30_S6Direct__S6CasAf40_71.4Hz_4) appid:28
2014-06-05 09:04:31.5971 [PID=4416 ] [version] returning cached version: [AV#707]
2014-06-05 09:04:31.5971 [PID=4416 ] [send] est. duration for WU 606768: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:31.5971 [PID=4416 ] [send] [WU#606768] meets deadline: 32293.54 + 36212.11 < 604800
2014-06-05 09:04:31.5980 [PID=4416 ] [send] Sending app_version einstein_S6CasA 2 105 SSE2; projected 2.45 GFLOPS
2014-06-05 09:04:31.5987 [PID=4416 ] [locality] [HOST#8143] Already has file h1_0071.30_S6Direct
2014-06-05 09:04:31.5988 [PID=4416 ] [locality] [HOST#8143] reducing disk needed for WU by 5387688 bytes (length of h1_0071.30_S6Direct)
2014-06-05 09:04:31.5988 [PID=4416 ] [send] est. duration for WU 606768: unscaled 26422.12 scaled 36212.11
2014-06-05 09:04:31.5988 [PID=4416 ] [send] [HOST#8143] sending [RESULT#1456530 h1_0071.30_S6Direct__S6CasAf40_71.4Hz_4_0] (est. dur. 36212.11s (10h03m32s10)) (max time 528442.38s (146h47m22s38))
2014-06-05 09:04:31.6014 [PID=4416 ] [send] don't need more work
2014-06-05 09:04:31.6015 [PID=4416 ] [send] don't need more work
2014-06-05 09:04:31.6015 [PID=4416 ] [send] don't need more work
2014-06-05 09:04:31.6015 [PID=4416 ] [mixed] sending non-locality work second
2014-06-05 09:04:31.6074 [PID=4416 ] [version] get_app_version(): getting app version for WU#595892 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1984) appid:21
2014-06-05 09:04:31.6075 [PID=4416 ] [version] looking for version of einsteinbinary_BRP4
2014-06-05 09:04:31.6075 [PID=4416 ] [version] Checking plan class 'BRP4X64'
2014-06-05 09:04:31.6075 [PID=4416 ] [version] [AV#588] Don't need CPU jobs, skipping
2014-06-05 09:04:31.6075 [PID=4416 ] [version] Checking plan class 'BRP4SSE'
2014-06-05 09:04:31.6075 [PID=4416 ] [version] [AV#598] Don't need CPU jobs, skipping
2014-06-05 09:04:31.6075 [PID=4416 ] [version] returning NULL; platforms:
2014-06-05 09:04:31.6075 [PID=4416 ] [version] windows_x86_64
2014-06-05 09:04:31.6075 [PID=4416 ] [version] windows_intelx86
2014-06-05 09:04:31.6075 [PID=4416 ] [version] get_app_version(): getting app version for WU#606386 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3280) appid:29
2014-06-05 09:04:31.6075 [PID=4416 ] [version] looking for version of einsteinbinary_BRP4G
2014-06-05 09:04:31.6076 [PID=4416 ] [version] Checking plan class 'BRP4G-opencl-ati'
2014-06-05 09:04:31.6076 [PID=4416 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 1.000000
2014-06-05 09:04:31.6076 [PID=4416 ] [version] [AV#721] Skipping AMD/ATI GPU version - user prefs say no AMD/ATI GPU
2014-06-05 09:04:31.6076 [PID=4416 ] [version] Checking plan class 'BRP4G-cuda32'
2014-06-05 09:04:31.6076 [PID=4416 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 1.000000
2014-06-05 09:04:31.6076 [PID=4416 ] [version] plan_class_spec: No NVIDIA GPUs found
2014-06-05 09:04:31.6076 [PID=4416 ] [version] [AV#723] app_plan() returned false
2014-06-05 09:04:31.6076 [PID=4416 ] [version] Checking plan class 'BRP4G-cuda32-nv301'
2014-06-05 09:04:31.6076 [PID=4416 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 1.000000
2014-06-05 09:04:31.6076 [PID=4416 ] [version] plan_class_spec: No NVIDIA GPUs found
2014-06-05 09:04:31.6076 [PID=4416 ] [version] [AV#716] app_plan() returned false
2014-06-05 09:04:31.6077 [PID=4416 ] [version] Checking plan class 'BRP4G-opencl-ati'
2014-06-05 09:04:31.6077 [PID=4416 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 1.000000
2014-06-05 09:04:31.6077 [PID=4416 ] [version] [AV#720] Skipping AMD/ATI GPU version - user prefs say no AMD/ATI GPU
2014-06-05 09:04:31.6077 [PID=4416 ] [version] returning NULL; platforms:
2014-06-05 09:04:31.6077 [PID=4416 ] [version] windows_x86_64
2014-06-05 09:04:31.6077 [PID=4416 ] [version] windows_intelx86
2014-06-05 09:04:31.6077 [PID=4416 ] [version] get_app_version(): getting app version for WU#595880 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1972) appid:21
2014-06-05 09:04:31.6077 [PID=4416 ] [version] get_app_version(): getting app version for WU#595895 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1987) appid:21
2014-06-05 09:04:31.6077 [PID=4416 ] [version] get_app_version(): getting app version for WU#603298 (p2030.20131124.G176.16-01.04.S.b4s0g0.00000_1728) appid:29
2014-06-05 09:04:31.6078 [PID=4416 ] [version] get_app_version(): getting app version for WU#595893 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1985) appid:21
2014-06-05 09:04:31.6078 [PID=4416 ] [version] get_app_version(): getting app version for WU#595894 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1986) appid:21
2014-06-05 09:04:31.6078 [PID=4416 ] [version] get_app_version(): getting app version for WU#606383 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3232) appid:29
2014-06-05 09:04:31.6079 [PID=4416 ] [version] get_app_version(): getting app version for WU#595881 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1973) appid:21
2014-06-05 09:04:31.6079 [PID=4416 ] [version] get_app_version(): getting app version for WU#595882 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1974) appid:21
2014-06-05 09:04:31.6079 [PID=4416 ] [version] get_app_version(): getting app version for WU#606396 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3440) appid:29
2014-06-05 09:04:31.6079 [PID=4416 ] [version] get_app_version(): getting app version for WU#595882 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1974) appid:21
2014-06-05 09:04:31.6079 [PID=4416 ] [version] get_app_version(): getting app version for WU#595883 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1975) appid:21
2014-06-05 09:04:31.6080 [PID=4416 ] [version] get_app_version(): getting app version for WU#606384 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3248) appid:29
2014-06-05 09:04:31.6080 [PID=4416 ] [version] get_app_version(): getting app version for WU#595843 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1935) appid:21
2014-06-05 09:04:31.6080 [PID=4416 ] [version] get_app_version(): getting app version for WU#595883 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1975) appid:21
2014-06-05 09:04:31.6080 [PID=4416 ] [version] get_app_version(): getting app version for WU#606397 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3456) appid:29
2014-06-05 09:04:31.6080 [PID=4416 ] [version] get_app_version(): getting app version for WU#595844 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1936) appid:21
2014-06-05 09:04:31.6081 [PID=4416 ] [version] get_app_version(): getting app version for WU#595884 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1976) appid:21
2014-06-05 09:04:31.6081 [PID=4416 ] [version] get_app_version(): getting app version for WU#606397 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3456) appid:29
2014-06-05 09:04:31.6081 [PID=4416 ] [version] get_app_version(): getting app version for WU#595884 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1976) appid:21
2014-06-05 09:04:31.6081 [PID=4416 ] [version] get_app_version(): getting app version for WU#595885 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1977) appid:21
2014-06-05 09:04:31.6081 [PID=4416 ] [version] get_app_version(): getting app version for WU#606398 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3472) appid:29
2014-06-05 09:04:31.6082 [PID=4416 ] [version] get_app_version(): getting app version for WU#595885 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1977) appid:21
2014-06-05 09:04:31.6082 [PID=4416 ] [version] get_app_version(): getting app version for WU#595886 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1978) appid:21
2014-06-05 09:04:31.6082 [PID=4416 ] [version] get_app_version(): getting app version for WU#603293 (p2030.20131124.G176.16-01.04.S.b4s0g0.00000_1648) appid:29
2014-06-05 09:04:31.6082 [PID=4416 ] [version] get_app_version(): getting app version for WU#595886 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1978) appid:21
2014-06-05 09:04:31.6082 [PID=4416 ] [version] get_app_version(): getting app version for WU#595887 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1979) appid:21
2014-06-05 09:04:31.6083 [PID=4416 ] [version] get_app_version(): getting app version for WU#606398 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3472) appid:29
2014-06-05 09:04:31.6083 [PID=4416 ] [version] get_app_version(): getting app version for WU#595837 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1929) appid:21
2014-06-05 09:04:31.6083 [PID=4416 ] [version] get_app_version(): getting app version for WU#595887 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1979) appid:21
2014-06-05 09:04:31.6083 [PID=4416 ] [version] get_app_version(): getting app version for WU#603295 (p2030.20131124.G176.16-01.04.S.b4s0g0.00000_1680) appid:29
2014-06-05 09:04:31.6084 [PID=4416 ] [version] get_app_version(): getting app version for WU#595888 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1980) appid:21
2014-06-05 09:04:31.6084 [PID=4416 ] [version] get_app_version(): getting app version for WU#595888 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1980) appid:21
2014-06-05 09:04:31.6084 [PID=4416 ] [version] get_app_version(): getting app version for WU#603300 (p2030.20131124.G176.16-01.04.S.b4s0g0.00000_1760) appid:29
2014-06-05 09:04:31.6084 [PID=4416 ] [version] get_app_version(): getting app version for WU#595839 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1931) appid:21
2014-06-05 09:04:31.6084 [PID=4416 ] [version] get_app_version(): getting app version for WU#595889 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1981) appid:21
2014-06-05 09:04:31.6085 [PID=4416 ] [version] get_app_version(): getting app version for WU#603317 (p2030.20131124.G176.16-01.04.S.b4s0g0.00000_2032) appid:29
2014-06-05 09:04:31.6085 [PID=4416 ] [version] get_app_version(): getting app version for WU#595864 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1956) appid:21
2014-06-05 09:04:31.6085 [PID=4416 ] [version] get_app_version(): getting app version for WU#595889 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1981) appid:21
2014-06-05 09:04:31.6085 [PID=4416 ] [version] get_app_version(): getting app version for WU#603297 (p2030.20131124.G176.16-01.04.S.b4s0g0.00000_1712) appid:29
2014-06-05 09:04:31.6085 [PID=4416 ] [version] get_app_version(): getting app version for WU#595890 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1982) appid:21
2014-06-05 09:04:31.6086 [PID=4416 ] [version] get_app_version(): getting app version for WU#595865 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1957) appid:21
2014-06-05 09:04:31.6086 [PID=4416 ] [version] get_app_version(): getting app version for WU#595867 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1959) appid:21
2014-06-05 09:04:31.6086 [PID=4416 ] [version] get_app_version(): getting app version for WU#603296 (p2030.20131124.G176.16-01.04.S.b4s0g0.00000_1696) appid:29
2014-06-05 09:04:31.6086 [PID=4416 ] [version] get_app_version(): getting app version for WU#595868 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1960) appid:21
2014-06-05 09:04:31.6086 [PID=4416 ] [version] get_app_version(): getting app version for WU#595862 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1954) appid:21
2014-06-05 09:04:31.6087 [PID=4416 ] [version] get_app_version(): getting app version for WU#606388 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3312) appid:29
2014-06-05 09:04:31.6087 [PID=4416 ] [version] get_app_version(): getting app version for WU#595894 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1986) appid:21
2014-06-05 09:04:31.6087 [PID=4416 ] [version] get_app_version(): getting app version for WU#595869 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1961) appid:21
2014-06-05 09:04:31.6087 [PID=4416 ] [version] get_app_version(): getting app version for WU#606388 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3312) appid:29
2014-06-05 09:04:31.6087 [PID=4416 ] [version] get_app_version(): getting app version for WU#595863 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1955) appid:21
2014-06-05 09:04:31.6088 [PID=4416 ] [version] get_app_version(): getting app version for WU#595869 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1961) appid:21
2014-06-05 09:04:31.6088 [PID=4416 ] [version] get_app_version(): getting app version for WU#606389 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3328) appid:29
2014-06-05 09:04:31.6088 [PID=4416 ] [version] get_app_version(): getting app version for WU#595851 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1943) appid:21
2014-06-05 09:04:31.6088 [PID=4416 ] [version] get_app_version(): getting app version for WU#595870 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1962) appid:21
2014-06-05 09:04:31.6088 [PID=4416 ] [version] get_app_version(): getting app version for WU#606389 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3328) appid:29
2014-06-05 09:04:31.6089 [PID=4416 ] [version] get_app_version(): getting app version for WU#595870 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1962) appid:21
2014-06-05 09:04:31.6089 [PID=4416 ] [version] get_app_version(): getting app version for WU#595859 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1951) appid:21
2014-06-05 09:04:31.6089 [PID=4416 ] [version] get_app_version(): getting app version for WU#606390 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3344) appid:29
2014-06-05 09:04:31.6089 [PID=4416 ] [version] get_app_version(): getting app version for WU#595871 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1963) appid:21
2014-06-05 09:04:31.6090 [PID=4416 ] [version] get_app_version(): getting app version for WU#595860 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1952) appid:21
2014-06-05 09:04:31.6090 [PID=4416 ] [version] get_app_version(): getting app version for WU#606390 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3344) appid:29
2014-06-05 09:04:31.6090 [PID=4416 ] [version] get_app_version(): getting app version for WU#595852 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1944) appid:21
2014-06-05 09:04:31.6090 [PID=4416 ] [version] get_app_version(): getting app version for WU#595871 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1963) appid:21
2014-06-05 09:04:31.6090 [PID=4416 ] [version] get_app_version(): getting app version for WU#606391 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3360) appid:29
2014-06-05 09:04:31.6091 [PID=4416 ] [version] get_app_version(): getting app version for WU#595872 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1964) appid:21
2014-06-05 09:04:31.6091 [PID=4416 ] [version] get_app_version(): getting app version for WU#595872 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1964) appid:21
2014-06-05 09:04:31.6091 [PID=4416 ] [version] get_app_version(): getting app version for WU#606391 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3360) appid:29
2014-06-05 09:04:31.6091 [PID=4416 ] [version] get_app_version(): getting app version for WU#595853 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1945) appid:21
2014-06-05 09:04:31.6092 [PID=4416 ] [version] get_app_version(): getting app version for WU#595873 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1965) appid:21
2014-06-05 09:04:31.6092 [PID=4416 ] [version] get_app_version(): getting app version for WU#606392 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3376) appid:29
2014-06-05 09:04:31.6092 [PID=4416 ] [version] get_app_version(): getting app version for WU#595873 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1965) appid:21
2014-06-05 09:04:31.6092 [PID=4416 ] [version] get_app_version(): getting app version for WU#595831 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1923) appid:21
2014-06-05 09:04:31.6092 [PID=4416 ] [version] get_app_version(): getting app version for WU#606392 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3376) appid:29
2014-06-05 09:04:31.6093 [PID=4416 ] [version] get_app_version(): getting app version for WU#595874 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1966) appid:21
2014-06-05 09:04:31.6093 [PID=4416 ] [version] get_app_version(): getting app version for WU#595874 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1966) appid:21
2014-06-05 09:04:31.6093 [PID=4416 ] [version] get_app_version(): getting app version for WU#606393 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3392) appid:29
2014-06-05 09:04:31.6093 [PID=4416 ] [version] get_app_version(): getting app version for WU#595861 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1953) appid:21
2014-06-05 09:04:31.6094 [PID=4416 ] [version] get_app_version(): getting app version for WU#595875 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1967) appid:21
2014-06-05 09:04:31.6094 [PID=4416 ] [version] get_app_version(): getting app version for WU#606393 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3392) appid:29
2014-06-05 09:04:31.6094 [PID=4416 ] [version] get_app_version(): getting app version for WU#595866 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1958) appid:21
2014-06-05 09:04:31.6094 [PID=4416 ] [version] get_app_version(): getting app version for WU#595875 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1967) appid:21
2014-06-05 09:04:31.6095 [PID=4416 ] [version] get_app_version(): getting app version for WU#606394 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3408) appid:29
2014-06-05 09:04:31.6095 [PID=4416 ] [version] get_app_version(): getting app version for WU#595876 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1968) appid:21
2014-06-05 09:04:31.6095 [PID=4416 ] [version] get_app_version(): getting app version for WU#595876 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1968) appid:21
2014-06-05 09:04:31.6095 [PID=4416 ] [version] get_app_version(): getting app version for WU#606394 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3408) appid:29
2014-06-05 09:04:31.6096 [PID=4416 ] [version] get_app_version(): getting app version for WU#595877 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1969) appid:21
2014-06-05 09:04:31.6096 [PID=4416 ] [version] get_app_version(): getting app version for WU#595890 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1982) appid:21
2014-06-05 09:04:31.6096 [PID=4416 ] [version] get_app_version(): getting app version for WU#606395 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3424) appid:29
2014-06-05 09:04:31.6096 [PID=4416 ] [version] get_app_version(): getting app version for WU#595891 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1983) appid:21
2014-06-05 09:04:31.6097 [PID=4416 ] [version] get_app_version(): getting app version for WU#595891 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1983) appid:21
2014-06-05 09:04:31.6097 [PID=4416 ] [version] get_app_version(): getting app version for WU#603324 (p2030.20131124.G176.16-01.04.S.b4s0g0.00000_2144) appid:29
2014-06-05 09:04:31.6097 [PID=4416 ] [version] get_app_version(): getting app version for WU#595878 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1970) appid:21
2014-06-05 09:04:31.6097 [PID=4416 ] [version] get_app_version(): getting app version for WU#595892 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1984) appid:21
2014-06-05 09:04:31.6097 [PID=4416 ] [version] get_app_version(): getting app version for WU#606310 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_2064) appid:29
2014-06-05 09:04:31.6098 [PID=4416 ] [version] get_app_version(): getting app version for WU#595879 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1971) appid:21
2014-06-05 09:04:31.6834 [PID=4416 ] Sending reply to [HOST#8143]: 9 results, delay req 60.00
2014-06-05 09:04:31.6840 [PID=4416 ] Scheduler ran 2.429 seconds


Claggy
ID: 112882
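
For anyone trying to digest a scheduler log of this length, a small script can tally it instead of reading it line by line. What follows is only a minimal sketch and not part of BOINC: the regular expressions are inferred purely from the log format pasted in this thread, and `LINE_RE`, `WU_RE` and `summarize` are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Line layout inferred from the excerpts in this thread (an assumption):
#   <timestamp> [PID=<n> ] [<tag>] <message>
# The [<tag>] part is optional, e.g. "Sending reply to [HOST#...]" has none.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"\[PID=(?P<pid>\d+)\s*\]\s+"
    r"(?:\[(?P<tag>\w+)\]\s+)?"
    r"(?P<msg>.*)$"
)
WU_RE = re.compile(r"getting app version for WU#(?P<wu>\d+) .* appid:(?P<appid>\d+)")


def summarize(lines):
    """Count log lines per debug tag and get_app_version() lookups per appid."""
    tags = Counter()
    lookups_per_app = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a scheduler log line
        tags[m.group("tag") or "(none)"] += 1
        wu = WU_RE.search(m.group("msg"))
        if wu:
            lookups_per_app[int(wu.group("appid"))] += 1
    return tags, lookups_per_app


# A few lines copied from the log above.
SAMPLE = [
    "2014-06-05 09:04:31.6074 [PID=4416 ] [version] get_app_version(): getting app version for WU#595892 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1984) appid:21",
    "2014-06-05 09:04:31.6075 [PID=4416 ] [version] get_app_version(): getting app version for WU#606386 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3280) appid:29",
    "2014-06-05 09:04:31.6077 [PID=4416 ] [version] get_app_version(): getting app version for WU#595880 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1972) appid:21",
    "2014-06-05 09:04:31.6834 [PID=4416 ] Sending reply to [HOST#8143]: 9 results, delay req 60.00",
]

if __name__ == "__main__":
    tags, lookups = summarize(SAMPLE)
    print(dict(tags))
    print(dict(lookups))
```

Fed the whole excerpt, the per-appid counter makes it easy to see how many `get_app_version()` lookups were made for appid 21 versus appid 29 without scrolling through every line.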
Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 1956
Credit: 6,218,130
RAC: 0
Message 112883 - Posted: 5 Jun 2014, 9:38:18 UTC - in response to Message 112882.  

Thanks for reporting. This looks like a bug to me in current server (scheduler) code. May take a bit of time to investigate, though.

BM
ID: 112883
Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 112884 - Posted: 5 Jun 2014, 10:30:16 UTC - in response to Message 112883.  
Last modified: 5 Jun 2014, 11:05:57 UTC

This morning, my ATI BRP4G tasks report the same wacky speeds (and short estimated durations) as last night.

Edit: got them all physically removed from my client_state.xml so they can be resent when the scheduler is fixed.

https://albert.phys.uwm.edu/host_sched_logs/8/8143

2014-06-05 09:56:29.7913 [PID=7201 ] [version] looking for version of einsteinbinary_BRP4G
2014-06-05 09:56:29.7913 [PID=7201 ] [version] Checking plan class 'BRP4G-opencl-ati'
2014-06-05 09:56:29.7913 [PID=7201 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 1.000000
2014-06-05 09:56:29.7913 [PID=7201 ] [version] [AV#721] (BRP4G-opencl-ati) adjusting projected flops based on PFC avg: 34968.78G
2014-06-05 09:56:29.7913 [PID=7201 ] [version] Best app version is now AV721 (18620.28 GFLOP)
2014-06-05 09:56:29.7913 [PID=7201 ] [version] [AV#721] (BRP4G-opencl-ati) adjusting projected flops based on PFC avg: 34968.78G
2014-06-05 09:56:29.7914 [PID=7201 ] [version] Best version of app einsteinbinary_BRP4G is [AV#721] (34968.78 GFLOPS)
2014-06-05 09:56:29.7914 [PID=7201 ] [send] est delay 0, skipping deadline check
2014-06-05 09:56:29.7914 [PID=7201 ] [version] get_app_version(): getting app version for WU#606395 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3424) appid:29
2014-06-05 09:56:29.7914 [PID=7201 ] [version] returning cached version: [AV#721]
2014-06-05 09:56:29.7914 [PID=7201 ] [send] est delay 0, skipping deadline check
2014-06-05 09:56:29.7923 [PID=7201 ] [RESULT#1454918] expected to be unsent; instead, state is 4
2014-06-05 09:56:29.7923 [PID=7201 ] [version] get_app_version(): getting app version for WU#595902 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1994) appid:21
2014-06-05 09:56:29.7923 [PID=7201 ] [version] returning cached version: [AV#588]
2014-06-05 09:56:29.7923 [PID=7201 ] [version] get_app_version(): getting app version for WU#595873 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1965) appid:21
2014-06-05 09:56:29.7923 [PID=7201 ] [version] returning cached version: [AV#588]
2014-06-05 09:56:29.7923 [PID=7201 ] [version] get_app_version(): getting app version for WU#606395 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3424) appid:29
2014-06-05 09:56:29.7923 [PID=7201 ] [version] returning cached version: [AV#721]
2014-06-05 09:56:29.7923 [PID=7201 ] [send] est delay 0, skipping deadline check
2014-06-05 09:56:29.7924 [PID=7201 ] [version] get_app_version(): getting app version for WU#606395 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3424) appid:29
2014-06-05 09:56:29.7924 [PID=7201 ] [version] returning cached version: [AV#721]
2014-06-05 09:56:29.7924 [PID=7201 ] [send] est delay 0, skipping deadline check
2014-06-05 09:56:29.7928 [PID=7201 ] [RESULT#1454919] expected to be unsent; instead, state is 4
2014-06-05 09:56:29.7928 [PID=7201 ] [version] get_app_version(): getting app version for WU#595902 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_1994) appid:21
2014-06-05 09:56:29.7928 [PID=7201 ] [version] returning cached version: [AV#588]
2014-06-05 09:56:29.7928 [PID=7201 ] [version] get_app_version(): getting app version for WU#595915 (p2030.20131124.G175.86-01.90.N.b2s0g0.00000_2007) appid:21
2014-06-05 09:56:29.7928 [PID=7201 ] [version] returning cached version: [AV#588]
2014-06-05 09:56:29.7928 [PID=7201 ] [version] get_app_version(): getting app version for WU#606407 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3616) appid:29
2014-06-05 09:56:29.7929 [PID=7201 ] [version] returning cached version: [AV#721]
2014-06-05 09:56:29.7929 [PID=7201 ] [send] est delay 0, skipping deadline check
2014-06-05 09:56:29.7929 [PID=7201 ] [version] get_app_version(): getting app version for WU#606407 (p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3616) appid:29
2014-06-05 09:56:29.7929 [PID=7201 ] [version] returning cached version: [AV#721]
2014-06-05 09:56:29.7929 [PID=7201 ] [send] est delay 0, skipping deadline check
2014-06-05 09:56:29.7974 [PID=7201 ] [send] Sending app_version einsteinbinary_BRP4G 7 134 BRP4G-opencl-ati; projected 34968.78 GFLOPS
2014-06-05 09:56:29.7976 [PID=7201 ] [send] est. duration for WU 606407: unscaled 8.01 scaled 10.96
2014-06-05 09:56:29.7976 [PID=7201 ] [send] [HOST#8143] sending [RESULT#1454943 p2030.20131124.G176.16-01.04.S.b2s0g0.00000_3616_1] (est. dur. 10.96s (0h00m10s95)) (max time 160.14s (0h02m40s14))


Claggy
ID: 112884
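
The numbers in these two log excerpts can be cross-checked with a little arithmetic. The sketch below assumes, without confirmation from the server code, that the "unscaled" duration is simply the job size in floating-point operations divided by the projected speed. The figures are copied from the logs in this thread; the names `JOBS`, `implied_fpops` and `ratios` are hypothetical.

```python
# Figures copied from the scheduler log excerpts in this thread.
JOBS = {
    "S6CasA CPU": {
        "unscaled": 26422.12, "scaled": 36212.11,
        "max_time": 528442.38, "gflops": 2.45,
    },
    "BRP4G ATI GPU": {
        "unscaled": 8.01, "scaled": 10.96,
        "max_time": 160.14, "gflops": 34968.78,
    },
}


def implied_fpops(unscaled_seconds, projected_gflops):
    """Job size implied by the log, ASSUMING unscaled = fpops / projected speed."""
    return unscaled_seconds * projected_gflops * 1e9


def ratios(job):
    """Return (scaled / unscaled, max_time / unscaled) for one job."""
    return job["scaled"] / job["unscaled"], job["max_time"] / job["unscaled"]


if __name__ == "__main__":
    for name, job in JOBS.items():
        scale, bound = ratios(job)
        size = implied_fpops(job["unscaled"], job["gflops"])
        print(f"{name}: scale x{scale:.3f}, max time x{bound:.2f}, ~{size:.3g} FLOPs")
```

Both jobs come out with the same scaling factor (about 1.37) and a maximum run time of about 20 times the unscaled estimate, so the part that looks off for the GPU task is the projected speed of 34968.78 GFLOPS. An estimate of roughly 8 seconds leaves only about 160 seconds before the time limit is hit, which would be consistent with the EXIT_TIME_LIMIT_EXCEEDED errors mentioned earlier in the thread.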
Holmis
Joined: 4 Jan 05
Posts: 104
Credit: 2,104,736
RAC: 0
Message 112885 - Posted: 5 Jun 2014, 10:34:40 UTC

I tried asking for more tasks for my NVIDIA GPU and got the following in BOINC's Event Log:

05/06/2014 12:17:53 | Albert@Home | Requesting new tasks for NVIDIA
05/06/2014 12:17:53 | Albert@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
05/06/2014 12:17:53 | Albert@Home | [sched_op] NVIDIA work request: 102560.41 seconds; 0.00 devices
05/06/2014 12:17:53 | Albert@Home | [sched_op] intel_gpu work request: 0.00 seconds; 0.00 devices
05/06/2014 12:17:55 | Albert@Home | Scheduler request completed: got 0 new tasks
05/06/2014 12:17:55 | Albert@Home | [sched_op] Server version 703
05/06/2014 12:17:55 | Albert@Home | Project requested delay of 60 seconds
05/06/2014 12:17:55 | Albert@Home | [sched_op] Deferring communication for 00:01:00
05/06/2014 12:17:55 | Albert@Home | [sched_op] Reason: requested by project

As you can see there was no reason given for why I didn't receive any tasks.
Next step was checking the server contact log and I found this:

2014-06-05 10:17:54.8969 [PID=8307 ]    [version] Checking plan class 'BRP4G-cuda32-nv301'
2014-06-05 10:17:54.8969 [PID=8307 ]    [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 0.500000
2014-06-05 10:17:54.8969 [PID=8307 ]    [version] [AV#716] daily quota exceeded

So the reason was that I've already had my fill for the day.
Checking the Application details for my host gives:

Binary Radio Pulsar Search (Arecibo, GPU) 1.33 windows_intelx86 (BRP4G-cuda32-nv301)
Number of tasks completed  13
Max tasks per day	   45
Number of tasks today      54
Consecutive valid tasks    13
Average processing rate    56.59266205016
Average turnaround time    0.29 days

So I'm over the daily quota, but why didn't the scheduler tell me so in the reply to Boinc?
ID: 112885
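
The check itself is simple to picture. The sketch below is not the scheduler's actual code: `quota_status` is a hypothetical helper that compares the two counters from the application details page and builds the kind of user-visible reason that was missing from the reply. The real quota logic may adjust the limit in ways not shown here.

```python
def quota_status(tasks_today, max_tasks_per_day):
    """Return (exceeded, reason). Hypothetical helper, not a BOINC function."""
    if tasks_today >= max_tasks_per_day:
        reason = (
            f"daily quota exceeded: {tasks_today} tasks today, "
            f"limit {max_tasks_per_day}"
        )
        return True, reason
    return False, ""


if __name__ == "__main__":
    # Figures from the application details above: 54 tasks today, limit 45.
    exceeded, reason = quota_status(54, 45)
    print(exceeded, reason)
```

Returning the reason alongside the flag is the point of the sketch: the server log already records "daily quota exceeded", so a message of this kind could in principle be passed back to the client as well.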
Bernd Machenschalk
Volunteer moderator
Project administrator
Project developer
Joined: 15 Oct 04
Posts: 1956
Credit: 6,218,130
RAC: 0
Message 112886 - Posted: 5 Jun 2014, 11:34:43 UTC

I enabled another debug flag (debug_array) to possibly get a grip on the app selection issue.

This means that the scheduler log excerpts that you see published for your hosts will get even longer. Please don't post these here in all gory detail, these are kept for ~200d on the server for the devs & admins anyway.

BM
ID: 112886
Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 112887 - Posted: 5 Jun 2014, 11:49:55 UTC - in response to Message 112886.  
Last modified: 5 Jun 2014, 11:50:47 UTC

I got some of those tasks resent:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy
ID: 112887
Holmis
Joined: 4 Jan 05
Posts: 104
Credit: 2,104,736
RAC: 0
Message 112888 - Posted: 5 Jun 2014, 15:58:07 UTC - in response to Message 112886.  

I enabled another debug flag (debug_array) to possibly get a grip on the app selection issue.

This means that the scheduler log excerpts that you see published for your hosts will get even longer. Please don't post these here in all gory detail, these are kept for ~200d on the server for the devs & admins anyway.

BM

I just made a work request for CPU work and was granted 10 S6CasA tasks and one BRP4 task.
In my Einstein@home prefs the BRP4 search is not selected but Beta-apps are.

Unfortunately BOINC contacted the scheduler again before I could check the server log, so I missed it. I just wanted to point out that there should be two logs at around 15:46 today.

This is the first line from the second contact, the first contact that assigned the CPU tasks should have occurred a few minutes before this one.
2014-06-05 15:46:56.9050 [PID=16227] Request: [USER#xxxxx] [HOST#2267] [IP xxx.xxx.xxx.226] client 7.2.42
ID: 112888
Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 112894 - Posted: 6 Jun 2014, 10:55:17 UTC - in response to Message 112886.  
Last modified: 6 Jun 2014, 10:55:36 UTC

Got some of the tasks resent again. Still the same: tasks are predicted to take 16 seconds. This host hasn't completed its 11 validations of that app_version yet, so it's using the initial estimate rather than its app_version APR:

Binary Radio Pulsar Search (Arecibo, GPU) 1.34 windows_x86_64 (BRP4G-opencl-ati)
Number of tasks completed 7
Max tasks per day 1
Number of tasks today 0
Consecutive valid tasks 0
Average processing rate 61.916362373902
Average turnaround time 0.82 days


Claggy
ID: 112894
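
The switch described in that post can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's implementation: the threshold of 11 comes from the post, but which counter the server compares against it is not stated, so `validated_count` is a placeholder. The APR is taken to be in GFLOPS, the initial speed of 34968.78 GFLOPS is the projected figure from the earlier scheduler log, and the job size of 2.8e14 operations is an assumed round value.

```python
MIN_VALIDATIONS = 11  # threshold quoted in the post above


def gflops_for_estimate(validated_count, initial_gflops, apr_gflops):
    """Use the host's measured APR only once enough results have validated."""
    if validated_count >= MIN_VALIDATIONS:
        return apr_gflops
    return initial_gflops


def estimated_seconds(job_fpops, gflops):
    """Estimated run time for a job of job_fpops operations at the given speed."""
    return job_fpops / (gflops * 1e9)


if __name__ == "__main__":
    JOB_FPOPS = 2.8e14      # assumed job size, for illustration only
    INITIAL = 34968.78      # projected GFLOPS from the earlier log
    APR = 61.916362373902   # average processing rate from the details above
    for validated in (7, 11):
        speed = gflops_for_estimate(validated, INITIAL, APR)
        print(validated, round(estimated_seconds(JOB_FPOPS, speed), 1))
```

With these inputs the estimate moves from a few seconds to well over an hour once the measured rate is used, which shows how far the initial projection is from what the host has actually delivered.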
zombie67 [MM]
Joined: 10 Oct 06
Posts: 130
Credit: 30,924,459
RAC: 0
Message 112896 - Posted: 6 Jun 2014, 13:38:37 UTC - in response to Message 112853.  

If you are going to use Dave's random number generator, I'll leave the project. Some CPU projects have fixed it to generate numbers in an expected and acceptable range, but no GPU project has been successful in that. Good luck.

We know all that. The purpose of this test is, very specifically, to test and try out some fixes to CreditNew that some volunteers have spent the last nine months developing.

It would be most helpful if you would remain attached to the project, to generate some baseline data from a good range of hosts.

Albert has been chosen for this task specifically because it's a test project where nothing is expected to work anyway!


At first, I thought this test must be going on with other apps that I am not running, because my Binary Radio Pulsar Search (Arecibo, GPU) tasks were still getting a flat 1000 per. I guess it took a while to kick in. This morning, I can see all of the validated tasks with differing credits awarded. There are a couple with ~500. A couple ~300-400. All the rest range from 90-120 credits.

So, what do we need to do to get this CreditNew problem fixed?
Dublin, California
Team: SETI.USA

ID: 112896
Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 112897 - Posted: 6 Jun 2014, 14:11:34 UTC - in response to Message 112896.  

If you are going to use Dave's random number generator, I'll leave the project. Some CPU projects have fixed it to generate numbers in an expected and acceptable range, but no GPU project has been successful in that. Good luck.

We know all that. The purpose of this test is, very specifically, to test and try out some fixes to CreditNew that some volunteers have spent the last nine months developing.

It would be most helpful if you would remain attached to the project, to generate some baseline data from a good range of hosts.

Albert has been chosen for this task specifically because it's a test project where nothing is expected to work anyway!


At first, I thought this test must be going on with other apps that I am not running, because my Binary Radio Pulsar Search (Arecibo, GPU) tasks were still getting a flat 1000 per. I guess it took a while to kick in. This morning, I can see all of the validated tasks with differing credits awarded. There are a couple with ~500. A couple ~300-400. All the rest range from 90-120 credits.

So, what do we need to do to get this CreditNew problem fixed?

We're still generating the baseline - as you noticed, it took a few attempts to disable the previous fixed credits: now we can see and quantify the scale of the problem. There was another glitch with the CasA (GW) tasks this morning, so they still haven't properly started.

But rest assured, there are people editing away in the background even as I type.
ID: 112897
Holmis
Joined: 4 Jan 05
Posts: 104
Credit: 2,104,736
RAC: 0
Message 112898 - Posted: 6 Jun 2014, 16:17:51 UTC - in response to Message 112897.  

We're still generating the baseline - as you noticed, it took a few attempts to disable the previous fixed credits: now we can see and quantify the scale of the problem. There was another glitch with the CasA (GW) tasks this morning, so they still haven't properly started.

But rest assured, there are people editing away in the background even as I type.

So, a few questions about this test of the credit system:

Do we mere mortals need to do anything special, or do we just run tasks and let the wizards take care of things in the background?

Is there something I or any other regular user can do to help and/or speed things up?

Should I/we focus on a special search or run them all?
ID: 112898
Richard Haselgrove

Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 112899 - Posted: 6 Jun 2014, 16:56:02 UTC - in response to Message 112898.  

We're still generating the baseline - as you noticed, it took a few attempts to disable the previous fixed credits: now we can see and quantify the scale of the problem. There was another glitch with the CasA (GW) tasks this morning, so they still haven't properly started.

But rest assured, there are people editing away in the background even as I type.

So a few questions about this test of the credit system:

Do we mere mortals need to do anything special or do we just run tasks and let the wizards take care of things in the background?

Is there something I or any other regular user can do to help and/or speed things up?

Should I/we focus on a special search or run them all?

Well, I can only answer as the Sorcerer's Apprentice - I'm following what the wizards are doing, and trying to interpret their Delphic utterances.

The whole CreditNew structure - if you can dignify it as a structure - is basically built round CPU applications, with coprocessors tacked on as an afterthought. So it would perhaps be a good idea - most helpful - to fire through some extra CasA/GW tasks, so the baseline for those catches up after the slow start. But we're just going into a long (3-day) weekend in Germany, so there's no rush. Just keep taking the tablets as usual, and see how dirty the laundry gets. Can anybody beat Zombie for variability? I've seen him get from below 100, to above 10,000, for the par-1000 BRP4G tasks.
ID: 112899
Holmis

Joined: 4 Jan 05
Posts: 104
Credit: 2,104,736
RAC: 0
Message 112900 - Posted: 6 Jun 2014, 17:25:39 UTC - in response to Message 112899.  

So it would perhaps be a good idea - most helpful - to fire through some extra CasA/GW tasks, so the baseline for those catches up after the slow start. But we're just going into a long (3-day) weekend in Germany, so there's no rush. Just keep taking the tablets as usual, and see how dirty the laundry gets.

Roger that, will run some extra CasA tasks and then let the server decide.

Can anybody beat Zombie for variability?

Well, the server seems to think I've had too much and is now issuing credit between 88.84 and 127.05 per BRP4G task. Wish I could get 10,000+ for a task, would be good for my RAC! =)
ID: 112900

This material is based upon work supported by the National Science Foundation (NSF) under Grant PHY-0555655 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2024 Bruce Allen for the LIGO Scientific Collaboration