WARNING: This website is obsolete! Please follow this link to get to the new Albert@Home website!

Posts by Claggy

21) Message boards : Problems and Bug Reports : Errors - 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED (Message 113123)
Posted 23 Jun 2014 by Claggy
Post:
We're in the middle of Boinc server software testing here (see the news threads). The rsc_fpops_bound is O.K.; the server is supplying ridiculous speed estimates for the initial tasks.

Claggy
22) Message boards : Problems and Bug Reports : question about "quorum" for units (Message 113111)
Posted 19 Jun 2014 by Claggy
Post:
It will be sent in time; the scheduler just has to wait for the right moment to send it, and it won't necessarily send both tasks at the same time.

Claggy
23) Message boards : News : Project server code update (Message 113086)
Posted 18 Jun 2014 by Claggy
Post:
So you're saying that a host which has a very low actual throughput, relative to its marketing rating, will 'claim high' for credit?


My HD7770 against another HD7770 (3,215):

https://albert.phys.uwm.edu/workunit.php?wuid=620885

My HD7770 against another HD7770 (4,555):

https://albert.phys.uwm.edu/workunit.php?wuid=618068

against a HD 7500/7600/8500/8600 series (2,927):

https://albert.phys.uwm.edu/workunit.php?wuid=620828

against a HD 5800/5900 series (2,897):

https://albert.phys.uwm.edu/workunit.php?wuid=620875

against a HD 6900 series (3,409):

https://albert.phys.uwm.edu/workunit.php?wuid=619539

against a GeForce G210 (3,218):

https://albert.phys.uwm.edu/workunit.php?wuid=620250

against a 8800GTX (3,013):

https://albert.phys.uwm.edu/workunit.php?wuid=619497

against a 8800GTX (4,890):

https://albert.phys.uwm.edu/workunit.php?wuid=617804

against a 9600 GT (3,525):

https://albert.phys.uwm.edu/workunit.php?wuid=618083

against a 9600 GT (3,258):

https://albert.phys.uwm.edu/workunit.php?wuid=618072

against a 9600 GT (3,374):

https://albert.phys.uwm.edu/workunit.php?wuid=618075

against a 9600 GT (3,441):

https://albert.phys.uwm.edu/workunit.php?wuid=618080

against a NVS 4200M (4,598):

https://albert.phys.uwm.edu/workunit.php?wuid=606864

against a GT 555M (4,229):

https://albert.phys.uwm.edu/workunit.php?wuid=612309

against a GTX 670M (3,388):

https://albert.phys.uwm.edu/workunit.php?wuid=617797

against a GTX 680 (3,363)

https://albert.phys.uwm.edu/workunit.php?wuid=617769

Against Richard's GTX670 (all around 2400):

https://albert.phys.uwm.edu/workunit.php?wuid=620884

https://albert.phys.uwm.edu/workunit.php?wuid=620851

https://albert.phys.uwm.edu/workunit.php?wuid=620495

https://albert.phys.uwm.edu/workunit.php?wuid=620346

I guess AMDs, legacy NVs, and modern mobile NVs have a high peak flops rating relative to their actual throughput.
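That pattern would fit CreditNew-style accounting, where a result's raw claim scales with the device's *advertised* peak flops rather than its delivered throughput. A minimal sketch of that relationship (illustrative numbers and function name, not the actual server code):

```python
# Sketch of why low-efficiency devices "claim high" when the raw claim
# is elapsed time x advertised peak flops (illustrative, not server code).

def raw_claim(elapsed_secs, peak_flops):
    """Peak FLOP count: elapsed time times the device's advertised peak speed."""
    return elapsed_secs * peak_flops

# Two hypothetical GPUs finishing the same workunit in the same time
# (actual work done is identical; only the marketing rating differs):
efficient   = raw_claim(elapsed_secs=1200, peak_flops=1.075e12)  # GTX 460-class rating
inefficient = raw_claim(elapsed_secs=1200, peak_flops=3.584e12)  # HD 7770-class rating

# Same runtime, ~3.3x larger raw claim for the higher-rated card.
print(inefficient / efficient)
```

So a card whose real throughput lags its GFLOPS rating inflates every claim it co-validates.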

Claggy
24) Message boards : News : Project server code update (Message 113066)
Posted 18 Jun 2014 by Claggy
Post:
I've started documenting the wingmates who co-validate my 'high outlier' credit scores, but no pattern has emerged yet.

Validated with different app versions, like x86 on one and x64 on another?

I've been running a number of CPU hosts on and off for months, mostly ARM. Before the upgrade the best app, i.e. the Neon app, was the only one sent to my ARM hosts, unless I aborted tasks to drive the Max tasks per day down low enough.
(My 2012 HTC One S and the 1.43 Neon app only produced validate errors, and the scheduler wouldn't send the 1.43 VFP app unless I did that; it completed 5 of those O.K.)
The 1.44 Neon app is good though, and has completed over 200 now with hardly a problem; no more VFP tasks have been sent.
The two Parallellas were only doing Neon tasks beforehand; afterwards they started picking up non-Neon tasks. They are at 11 and 10 validations so far for non-Neon, and 21 and 23 for Neon.
The 2012 Nexus 7 had done only Neon tasks beforehand; afterwards it picked up VFP tasks, and has done 8 of those against 37 Neon. The VFP app is about half the speed of the Neon app.
The C2D T8100 Linux x64 host beforehand only picked up x64 BRP tasks; it's completed 271 so far, and SSE2 x86 tasks have never been sent.

On the HD7770 it picked up 1.34 windows_x86_64 (BRP4G-opencl-ati) and 1.34 windows_intelx86 (BRP4G-opencl-ati) work. These are different apps, with different file sizes; the x86_64 app had some validations from beforehand.
Afterwards they were failing with max time exceeded errors for a few days, and the x86 work got sent when the x64 Max tasks per day got too low.
Since the x64 tasks got a reasonable speed estimate, their tasks complete O.K.; x86 tasks haven't been sent again, and I have no idea which app is faster.
It looks as if there are scheduler differences between sending CPU and GPU apps; I would have expected some x86 work to be sent.

It's similar with the Perseus Arm Survey: I've had work from 1.39 windows_x86_64 (BRP5-opencl-ati) and 1.39 windows_intelx86 (BRP5-opencl-ati). The x64 app has some validations, the x86 none.
While you can't tell the difference from the tasks page, Boinc Manager shows different duration estimates for the two: 33 secs for x64, 20 secs for x86. stderr.txt doesn't seem to tell them apart.

CreditNew seems to use different calculations depending on whether an app version is above or below a sample level; could it be that one app version is above the sample level and the other isn't?

Claggy
25) Message boards : News : Project server code update (Message 113035)
Posted 17 Jun 2014 by Claggy
Post:
Unfortunately I missed the server log for a fetch - just got a 'report only' RPC instead. Could you grab a log if it does another work_fetch, please?

I did another request, and suspended network:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy

[version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G


That's not teraflops (speed); that's the peak flop count, as in the number of operations.

(verifying in code now)

*scratch that* looks broken, walking the lot with beer


Boinc startup says:

17/06/2014 18:17:17 | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1848, 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version 1348.5 (VM), device version OpenCL 1.2 AMD-APP (1348.5), 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.5 (sse2,avx), device version OpenCL 1.2 AMD-APP (1348.5))

The GTX460 always had a much lower GFLOPS peak value, but was a lot more effective at Seti v6, v7 and AP v6; the exceptions being here and the OpenCL Gamma-ray pulsar search #3 1.07 app, where the HD7770 was a little faster:

https://albert.phys.uwm.edu/host_app_versions.php?hostid=8143

Gamma-ray pulsar search #3 1.07 windows_x86_64 (FGRPopencl-ati)
Number of tasks completed 13
Max tasks per day 45
Number of tasks today 0
Consecutive valid tasks 13
Average processing rate 3.55 GFLOPS
Average turnaround time 0.37 days

Gamma-ray pulsar search #3 1.07 windows_x86_64 (FGRPopencl-nvidia)
Number of tasks completed 12
Max tasks per day 44
Number of tasks today 0
Consecutive valid tasks 12
Average processing rate 2.87 GFLOPS
Average turnaround time 0.88 days


http://boinc.berkeley.edu/dev/forum_thread.php?id=8767&postid=51659

04/12/2013 21:25:07 | | CUDA: NVIDIA GPU 0: GeForce GTX 460 (driver version 331.58, CUDA version 6.0, compute capability 2.1, 1024MB, 854MB available, 1075 GFLOPS peak)
04/12/2013 21:25:07 | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1848, 1024MB, 984MB available, 3584 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL: NVIDIA GPU 0: GeForce GTX 460 (driver version 331.58, device version OpenCL 1.1 CUDA, 1024MB, 854MB available, 1075 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version 1348.4 (VM), device version OpenCL 1.2 AMD-APP (1348.4), 1024MB, 984MB available, 3584 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.4 (sse2,avx), device version OpenCL 1.2 AMD-APP (1348.4))

Claggy
26) Message boards : News : Project server code update (Message 113029)
Posted 17 Jun 2014 by Claggy
Post:
Unfortunately I missed the server log for a fetch - just got a 'report only' RPC instead. Could you grab a log if it does another work_fetch, please?

I did another request, and suspended network:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy
27) Message boards : News : Project server code update (Message 113025)
Posted 17 Jun 2014 by Claggy
Post:
For your info, my i7-2600K/HD7770 is now picking up Gamma-ray pulsar search #3 tasks. The initial CPU estimates look O.K. at 4 hrs 55 mins; the ATI estimates are at 5 seconds.
(This application type has CPU, Nvidia, ATI and Intel apps across Windows, Mac and Linux (But no Intel app on Linux))

Claggy


Whetstone, flops and rsc_fpops_est for GPU and CPU?

edit: 'please' - sorry ::)


CPU p_fpops is 4514900817.923695

HD7770 peak_flops is 3584000000000.000000

flops for the CPU app_version of hsgamma_FGRP3 is 845960315.482654

flops for the ATI GPU app_version of hsgamma_FGRP3 is 2950327174499.708000

rsc_fpops_est is 15000000000000.000000, with rsc_fpops_bound at 300000000000000.000000
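Those figures account for the durations in the scheduler log: the unscaled estimate is rsc_fpops_est divided by the projected app_version flops, and the max time comes from rsc_fpops_bound the same way. A quick arithmetic check against the values quoted above:

```python
# Check the scheduler's duration figures from the values quoted above.
rsc_fpops_est   = 15_000_000_000_000.0    # 15e12 ops
rsc_fpops_bound = 300_000_000_000_000.0   # 300e12 ops
ati_flops       = 2_950_327_174_499.708   # projected flops for the ATI app_version

est_duration = rsc_fpops_est / ati_flops    # the log's "unscaled 5.08"
max_time     = rsc_fpops_bound / ati_flops  # the log's "max time 101.68s"

print(round(est_duration, 2), round(max_time, 2))  # 5.08 101.68
```

With the projected flops inflated to ~2950 GFLOPS, a task that really needs an hour blows through that 101-second limit immediately.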

With a Gamma-ray pulsar search #3-only request I got:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

2014-06-17 17:18:23.1994 [PID=2155 ] [send] CPU: req 8330.13 sec, 0.00 instances; est delay 0.00
2014-06-17 17:18:23.1995 [PID=2155 ] [send] AMD/ATI GPU: req 8692.21 sec, 0.00 instances; est delay 0.00
2014-06-17 17:18:23.1995 [PID=2155 ] [send] work_req_seconds: 8330.13 secs
2014-06-17 17:18:23.1995 [PID=2155 ] [send] available disk 95.78 GB, work_buf_min 95040
2014-06-17 17:18:23.1995 [PID=2155 ] [send] on_frac 0.923624 active_frac 0.985800 gpu_active_frac 0.984082
2014-06-17 17:18:23.1995 [PID=2155 ] [send] CPU features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx tm2 pbe
2014-06-17 17:18:23.3103 [PID=2155 ] [mixed] sending locality work first


2014-06-17 17:18:23.3223 [PID=2155 ] [version] get_app_version(): getting app version for WU#604131 (LATeah0109C_32.0_0_-1.48e-10) appid:30
2014-06-17 17:18:23.3223 [PID=2155 ] [version] looking for version of hsgamma_FGRP3
2014-06-17 17:18:23.3224 [PID=2155 ] [version] Checking plan class 'FGRPopencl-ati'
2014-06-17 17:18:23.3234 [PID=2155 ] [version] reading plan classes from file '/BOINC/projects/AlbertAtHome/plan_class_spec.xml'
2014-06-17 17:18:23.3234 [PID=2155 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_fgrp' : true : 1.000000
2014-06-17 17:18:23.3234 [PID=2155 ] [version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G
2014-06-17 17:18:23.3234 [PID=2155 ] [version] Best app version is now AV911 (85.84 GFLOP)
2014-06-17 17:18:23.3235 [PID=2155 ] [version] Checking plan class 'FGRPopencl-intel_gpu'
2014-06-17 17:18:23.3235 [PID=2155 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_fgrp' : true : 1.000000
2014-06-17 17:18:23.3235 [PID=2155 ] [version] [version] No Intel GPUs found
2014-06-17 17:18:23.3235 [PID=2155 ] [version] [AV#912] app_plan() returned false
2014-06-17 17:18:23.3235 [PID=2155 ] [version] Checking plan class 'FGRPopencl-nvidia'
2014-06-17 17:18:23.3235 [PID=2155 ] [version] plan_class_spec: parsed project prefs setting 'gpu_util_fgrp' : true : 1.000000
2014-06-17 17:18:23.3235 [PID=2155 ] [version] plan_class_spec: No NVIDIA GPUs found
2014-06-17 17:18:23.3235 [PID=2155 ] [version] [AV#925] app_plan() returned false
2014-06-17 17:18:23.3235 [PID=2155 ] [version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G
2014-06-17 17:18:23.3235 [PID=2155 ] [version] Best version of app hsgamma_FGRP3 is [AV#911] (2950.33 GFLOPS)
2014-06-17 17:18:23.3236 [PID=2155 ] [send] est delay 0, skipping deadline check
2014-06-17 17:18:23.3264 [PID=2155 ] [send] Sending app_version hsgamma_FGRP3 7 111 FGRPopencl-ati; projected 2950.33 GFLOPS
2014-06-17 17:18:23.3265 [PID=2155 ] [CRITICAL] No filename found in [WU#604131 LATeah0109C_32.0_0_-1.48e-10]
2014-06-17 17:18:23.3265 [PID=2155 ] [send] est. duration for WU 604131: unscaled 5.08 scaled 5.59
2014-06-17 17:18:23.3265 [PID=2155 ] [send] [HOST#8143] sending [RESULT#1450173 LATeah0109C_32.0_0_-1.48e-10_1] (est. dur. 5.59s (0h00m05s59)) (max time 101.68s (0h01m41s68))
2014-06-17 17:18:23.3291 [PID=2155 ] [locality] send_old_work(LATeah0109C_32.0_0_-1.48e-10_1) sent result created 344.0 hours ago [RESULT#1450173]
2014-06-17 17:18:23.3291 [PID=2155 ] [locality] Note: sent NON-LOCALITY result LATeah0109C_32.0_0_-1.48e-10_1
2014-06-17 17:18:23.3292 [PID=2155 ] [locality] send_results_for_file(h1_0997.00_S6Direct)
2014-06-17 17:18:23.3365 [PID=2155 ] [locality] in_send_results_for_file(h1_0997.00_S6Direct, 0) prev_result.id=1488887

Claggy
28) Message boards : News : Project server code update (Message 113019)
Posted 17 Jun 2014 by Claggy
Post:
For your info, my i7-2600K/HD7770 is now picking up Gamma-ray pulsar search #3 tasks. The initial CPU estimates look O.K. at 4 hrs 55 mins; the ATI estimates are at 5 seconds.
(This application type has CPU, Nvidia, ATI and Intel apps across Windows, Mac and Linux (But no Intel app on Linux))

All Gamma-ray pulsar search #3 tasks for computer 8143

Claggy
29) Message boards : News : Web code updated (Message 112982)
Posted 16 Jun 2014 by Claggy
Post:
Hm, I can't reproduce that. For me it shows my name and "log out" as expected...

You're an administrator; the rest of us don't need to be logged onto that page, and can't log on because we aren't administrators.

Claggy
30) Message boards : Problems and Bug Reports : 'User aborted' (Message 112963)
Posted 15 Jun 2014 by Claggy
Post:
The event log is too small, it only goes back to the 13th (I'm running some 40 projects). The same PC is running Einstein on its HD 4000 IGP since...

Look at stdoutdae.txt or stdoutdae.old in your Boinc Data directory; you'll find they go back further.

Claggy
31) Message boards : Problems and Bug Reports : 'User aborted' (Message 112960)
Posted 14 Jun 2014 by Claggy
Post:
Hm,

201 (0xc9) EXIT_MISSING_COPROC,

https://albert.phys.uwm.edu/result.php?resultid=1490485

I wonder if the client aborted them, and there's a mismatch between what the client says and what the web code reports.

What does the Event log say?

Claggy
32) Message boards : Problems and Bug Reports : 'User aborted' (Message 112958)
Posted 14 Jun 2014 by Claggy
Post:
Your computers are hidden, so there is no evidence of what is happening:

https://albert.phys.uwm.edu/show_user.php?userid=108127

Claggy
33) Message boards : News : Project server code update (Message 112956)
Posted 14 Jun 2014 by Claggy
Post:
Attached a new host to Albert; looking through the logs I keep getting the following download error:

14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_05.png
14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_07.png
14-Jun-2014 06:06:32 [Albert@Home] Started download of eah_slide_08.png
14-Jun-2014 06:06:33 [Albert@Home] Finished download of eah_slide_07.png
14-Jun-2014 06:06:33 [Albert@Home] Started download of EatH_mastercat_1344952579.txt
14-Jun-2014 06:06:34 [Albert@Home] Finished download of eah_slide_05.png
14-Jun-2014 06:06:34 [Albert@Home] Finished download of eah_slide_08.png
14-Jun-2014 06:06:34 [Albert@Home] Giving up on download of EatH_mastercat_1344952579.txt: permanent HTTP error

On this new host (as well as on my HD7770) I'm still getting the very short estimates for Perseus Arm Survey GPU tasks, so I've added two zeros to the rsc_fpops values so they'll complete.
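Appending two zeros multiplies rsc_fpops_est by 100, which stretches the client's duration estimate by the same factor, since the estimate is roughly rsc_fpops_est divided by the projected flops. A sketch with illustrative numbers (the flops and rsc_fpops_est values below are made up, not this host's actual figures):

```python
# Effect of "adding two zeros" to rsc_fpops_est on the duration estimate
# (estimate ~= rsc_fpops_est / projected flops). Illustrative values only.

projected_flops = 3.0e12             # hypothetical inflated projection
original_est    = 4.5e11             # hypothetical rsc_fpops_est
padded_est      = original_est * 100 # two zeros appended

print(original_est / projected_flops)  # 0.15 s: far too short to finish
print(padded_est / projected_flops)    # 15.0 s: 100x longer estimate
```

The same factor of 100 applies to rsc_fpops_bound, which is what keeps the tasks from hitting the max-time abort.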

Computer 11441

Claggy
34) Message boards : News : Web code updated (Message 112954)
Posted 14 Jun 2014 by Claggy
Post:
Server status page should work again.

BM

It shows "Log In" at the top right, but I'm already logged in. I think it's supposed to show my username and "Log out". Clicking on it indeed prompts me to log in (again).

Seti Beta had that a year or two ago; I believe they removed/hid it.

Claggy
35) Message boards : News : Web code updated (Message 112950)
Posted 13 Jun 2014 by Claggy
Post:
I'd noticed that the website availability had been erratic earlier today. Not a problem in itself, and I can work round some of the odder side effects.

But I also got a few download errors on workunits around the same time. They showed as 'download error' on GW (CasA) data files like 'h1_0072.55_S6Direct'. The actual error code was

ERR_HTTP_PERMANENT  -224
    // represents HTTP 404 or 416 error

I'm still getting a fair amount of those:

<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>h1_1000.20_S6Direct</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>l1_1000.20_S6Direct</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>

https://albert.phys.uwm.edu/result.php?resultid=1488062

Claggy
36) Message boards : News : Web code updated (Message 112937)
Posted 12 Jun 2014 by Claggy
Post:
...but still can't get work for my intel GPU mac mini. I went to see if there was work available, but that page is blown up right now.

The only Mac intel_gpu app deployed here is for Gamma-ray pulsar search #3, and as far as I know they aren't producing work for it at the moment.

Albert applications

Claggy
37) Message boards : News : Project server code update (Message 112924)
Posted 11 Jun 2014 by Claggy
Post:
I tried to get those BRP (Arecibo, GPU) tasks resent, but got them expired instead (I had 'use ATI GPU' set to No), so I managed to get fresh GPU tasks, a mixture of BRP (Arecibo, GPU) and BRP (Perseus Arm Survey).
The (Arecibo, GPU) tasks now have estimates of 13 minutes, while they take an hour, so they are now completable; the (Perseus Arm Survey) tasks have estimates of 16 seconds, so they aren't. I'll let the ones I have run and error:

All tasks for computer 8143

Application details for host 8143

Claggy
38) Message boards : News : Project server code update (Message 112921)
Posted 11 Jun 2014 by Claggy
Post:
Oh, you're going to love this one

		Jason	Holmis	Claggy	Zombie	Zombie (Mac)
Host:		11363	2267	9008	6490	6109
		GTX 780	GTX 660	GT 650M	TITAN	GTX 680MX

Credit for BRP4G (GPU)						

Maximum		1170.48	1036.86	10239.0	1654.85	11847.50
Minimum		115.82	88.84	153.90	25.79	94.88
Average		548.33	463.98	3875.88	874.96	2256.70
Median		468.80	390.21	2977.38	865.33	1591.80
Std Dev		431.90	268.52	2873.26	362.30	2395.61

I'll upload a graph after lunch, when my monitor has cooled down and I've stopped laughing.

For your info, my GT650M is running one task at a time, and I'm only running two CPU tasks at a time too.
(It runs very hot; the 2.5GHz i5-3210M is a dual core with hyper-threading. With it running in its turbo mode of 2.89GHz the CPU cores sit at 99°C;
add another core crunching, or the Intel GPU crunching, and it starts downclocking, both CPU and Nvidia GPU.)

Since I've now got Intel GPU tasks, the CPU is fluctuating between 1.90GHz and 2.89GHz in 0.1GHz steps, i.e. 2.89, 2.79, 2.69, 2.59, 2.50, 2.40, 2.20, 2.10, etc.,
and the GT650M is switching between 950MHz and 118MHz, while the HD Graphics 4000 is switching between 950MHz, 1.0GHz, 1.05GHz and 1.10GHz,
so expect all task durations to fluctuate. ;-)

Claggy
39) Message boards : News : Project server code update (Message 112894)
Posted 6 Jun 2014 by Claggy
Post:
Got some of those tasks resent again; still the same, tasks are predicted to take 16 seconds. This host hasn't completed its 11 validations of that app_version yet, so it's using the initial estimate, and not its app_version APR yet:

Binary Radio Pulsar Search (Arecibo, GPU) 1.34 windows_x86_64 (BRP4G-opencl-ati)
Number of tasks completed 7
Max tasks per day 1
Number of tasks today 0
Consecutive valid tasks 0
Average processing rate 61.916362373902
Average turnaround time 0.82 days
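The switch described above (initial projected estimate until enough validated results, then the host app_version's measured APR) can be sketched as follows. The 11-validation threshold comes from the post above; the function and the initial-flops figure are illustrative, not the real server code:

```python
# Sketch: which speed figure the scheduler uses for duration estimates.
# The 11-validation threshold is the one mentioned above; names and the
# initial flops value are illustrative assumptions.

MIN_CONSECUTIVE_VALID = 11

def flops_for_estimate(consecutive_valid, apr_flops, initial_flops):
    if consecutive_valid < MIN_CONSECUTIVE_VALID:
        return initial_flops  # still on the (often wildly wrong) initial projection
    return apr_flops          # enough history: trust the measured APR

# This host: 0 consecutive valid tasks, so the measured ~61.9 GFLOPS APR
# is ignored and the inflated initial projection stays in force.
used = flops_for_estimate(0, apr_flops=61.9e9, initial_flops=3.584e12)
print(used == 3.584e12)  # True: initial estimate still used
```

That is why the 16-second predictions persist: every max-time abort resets the consecutive-valid count, so the host never reaches the threshold where its APR would take over.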


Claggy
40) Message boards : News : Project server code update (Message 112887)
Posted 5 Jun 2014 by Claggy
Post:
I got some of those tasks resent:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy





This material is based upon work supported by the National Science Foundation (NSF) under Grant PHY-0555655 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2024 Bruce Allen for the LIGO Scientific Collaboration