Project server code update

jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113164 - Posted: 29 Jun 2014, 10:34:16 UTC - in response to Message 113162.  

> From treblehit's server log https://albert.phys.uwm.edu/host_sched_logs/11/11519
>
> 2014-06-29 09:21:30.4581 [PID=3880 ] [version] [AV#738] (BRP5-opencl-ati) adjusting projected flops based on PFC avg: 16250.85G
> 2014-06-29 09:21:30.4581 [PID=3880 ] [version] Best app version is now AV738 (0.89 GFLOP)
> 2014-06-29 09:21:30.4581 [PID=3880 ] [version] [AV#738] (BRP5-opencl-ati) adjusting projected flops based on PFC avg: 16250.85G
> 2014-06-29 09:21:30.4581 [PID=3880 ] [version] Best version of app einsteinbinary_BRP5 is [AV#738] (16250.85 GFLOPS)
>
> I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.


Sure. First, from the client perspective:
Referring to the dodgy diagram, factoring in the bad on-ramp period default pfc_scale of 0.1 for GPUs, and inactive host_scale (x1), results in:

wu pfc ('peak flop claim') est = 0.1 * 1 * wu_est (10% of the minimum possible)
device peak_flops is likely ~20x the actual rate for a standard GPU (app, card & system dependent)
--> estimate about 1/200th of the required elapsed time --> bound exceeded
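As a rough worked example of that arithmetic (every number below is invented for illustration - task size, peak and actual throughput are assumptions, not figures from any host):

#include <cstdio>

int main() {
    const double wu_est       = 450e12;           // assumed task size, FLOPs
    const double peak_flops   = 1000e9;           // device 'marketing' peak
    const double actual_flops = peak_flops / 20;  // real app throughput, ~1/20 of peak
    const double pfc_scale    = 0.1;              // on-ramp default for GPUs
    const double host_scale   = 1.0;              // inactive --> x1

    double pfc_est        = pfc_scale * host_scale * wu_est;  // 'peak flop claim' estimate
    double est_elapsed    = pfc_est / peak_flops;             // projected elapsed time
    double actual_elapsed = wu_est / actual_flops;            // real elapsed time

    printf("estimated elapsed: %.0f s\n", est_elapsed);       // 45 s
    printf("actual elapsed:    %.0f s\n", actual_elapsed);    // 9000 s
    printf("ratio: 1/%.0f\n", actual_elapsed / est_elapsed);  // 1/200
    return 0;
}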

Now digging through server end...
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113165 - Posted: 29 Jun 2014, 10:34:59 UTC - in response to Message 113163.  
Last modified: 29 Jun 2014, 10:36:25 UTC

>> I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.
>
> Doesn't the Main project have this adjustment because they have a single DCF there? But we don't use DCF here, so this adjustment shouldn't be used?
>
> Claggy


There is no adjustment; the adjustment is a lie. <dont_use_dcf> is hard-wired active for all clients >= 7.0.28.
Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 113166 - Posted: 29 Jun 2014, 10:41:35 UTC - in response to Message 113165.  
Last modified: 29 Jun 2014, 10:43:33 UTC

>>> I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.
>>
>> Doesn't the Main project have this adjustment because they have a single DCF there? But we don't use DCF here, so this adjustment shouldn't be used?
>>
>> Claggy
>
> There is no adjustment; the adjustment is a lie. <dont_use_dcf> is hard-wired active for all clients >= 7.0.28.

But only on projects that don't use DCF; Einstein on my i7-2600K/HD7770 has a DCF of:

<duration_correction_factor>1.267963</duration_correction_factor>

Albert, of course, has: <dont_use_dcf/>

Claggy
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113167 - Posted: 29 Jun 2014, 10:43:58 UTC - in response to Message 113166.  
Last modified: 29 Jun 2014, 10:45:33 UTC

>>>> I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.
>>>
>>> Doesn't the Main project have this adjustment because they have a single DCF there? But we don't use DCF here, so this adjustment shouldn't be used?
>>>
>>> Claggy
>>
>> There is no adjustment; the adjustment is a lie. <dont_use_dcf> is hard-wired active for all clients >= 7.0.28.
>
> But only on projects that don't use DCF; Einstein on my i7-2600K/HD7770 has a DCF of:
>
> <duration_correction_factor>1.267963</duration_correction_factor>
>
> Claggy

Well, you've lost me there, because every scheduler reply to a >= 7.0.28 client, according to the scheduler code, pushes <dont_use_dcf/> [and there is no configuration switch for it].
Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 113168 - Posted: 29 Jun 2014, 10:47:59 UTC - in response to Message 113167.  
Last modified: 29 Jun 2014, 10:49:25 UTC

>>>>> I do think we ought to try and work out exactly where those figures come from. As with the numbers Claggy and I saw right at the beginning of this thread, they are vastly higher than any known 'peak FLOPs' value calculated and displayed by the BOINC client for any known GPU. At the very most, that calculated speed (or some rule-of-thumb fraction of it) should be used as a sanity cap on the PFC avg number - once we've understood what PFC avg is in this context, and how it came to be that way.
>>>>
>>>> Doesn't the Main project have this adjustment because they have a single DCF there? But we don't use DCF here, so this adjustment shouldn't be used?
>>>>
>>>> Claggy
>>>
>>> There is no adjustment; the adjustment is a lie. <dont_use_dcf> is hard-wired active for all clients >= 7.0.28.
>>
>> But only on projects that don't use DCF; Einstein on my i7-2600K/HD7770 has a DCF of:
>>
>> <duration_correction_factor>1.267963</duration_correction_factor>
>>
>> Claggy
>
> Well, you've lost me there, because every scheduler reply to a >= 7.0.28 client, according to the scheduler code, pushes <dont_use_dcf/> [and there is no configuration switch for it].

Einstein has an older scheduler than Albert (or at least an older server version):

29/06/2014 11:45:58 | Einstein@Home | sched RPC pending: Requested by user
29/06/2014 11:45:58 | Einstein@Home | [sched_op] Starting scheduler request
29/06/2014 11:45:58 | Einstein@Home | Sending scheduler request: Requested by user.
29/06/2014 11:45:58 | Einstein@Home | Not requesting tasks: "no new tasks" requested via Manager
29/06/2014 11:45:58 | Einstein@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
29/06/2014 11:45:58 | Einstein@Home | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
29/06/2014 11:46:00 | Einstein@Home | Scheduler request completed
29/06/2014 11:46:00 | Einstein@Home | [sched_op] Server version 611
29/06/2014 11:46:00 | Einstein@Home | Project requested delay of 60 seconds
29/06/2014 11:46:00 | Einstein@Home | [sched_op] Deferring communication for 00:01:00
29/06/2014 11:46:00 | Einstein@Home | [sched_op] Reason: requested by project
29/06/2014 11:46:05 | Albert@Home | sched RPC pending: Requested by user
29/06/2014 11:46:05 | Albert@Home | [sched_op] Starting scheduler request
29/06/2014 11:46:05 | Albert@Home | Sending scheduler request: Requested by user.
29/06/2014 11:46:05 | Albert@Home | Reporting 2 completed tasks
29/06/2014 11:46:05 | Albert@Home | Not requesting tasks: don't need
29/06/2014 11:46:05 | Albert@Home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
29/06/2014 11:46:05 | Albert@Home | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
29/06/2014 11:46:08 | Albert@Home | Scheduler request completed
29/06/2014 11:46:08 | Albert@Home | [sched_op] Server version 703
29/06/2014 11:46:08 | Albert@Home | Project requested delay of 60 seconds
29/06/2014 11:46:08 | Albert@Home | [sched_op] handle_scheduler_reply(): got ack for task h1_0997.10_S6Direct__S6CasAf40_997.55Hz_1017_1
29/06/2014 11:46:08 | Albert@Home | [sched_op] handle_scheduler_reply(): got ack for task p2030.20130202.G202.32-01.96.N.b2s0g0.00000_2384_5
29/06/2014 11:46:08 | Albert@Home | [sched_op] Deferring communication for 00:01:00
29/06/2014 11:46:08 | Albert@Home | [sched_op] Reason: requested by project


Claggy
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113169 - Posted: 29 Jun 2014, 10:48:58 UTC - in response to Message 113168.  

Ah, all right.
Yeah, I'm only interested in fixing the current code, rather than diagnosing/patching old versions :)
Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 113170 - Posted: 29 Jun 2014, 11:03:47 UTC - in response to Message 113169.  

> Ah, all right.
> Yeah, I'm only interested in fixing the current code, rather than diagnosing/patching old versions :)

I was thinking that they were using Einstein customisations here that might not be needed; looking at robl's Einstein log shows it's the durations that get scaled there:

http://einstein.phys.uwm.edu/hosts_user.php?userid=613597

2014-06-29 09:28:50.6296 [PID=17986] [send] [HOST#7536795] Sending app_version 483 einsteinbinary_BRP5 7 139 BRP5-cuda32-nv270; 49.97 GFLOPS
2014-06-29 09:28:50.6312 [PID=17986] [send] est. duration for WU 193304662: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6312 [PID=17986] [HOST#7536795] Sending [RESULT#443159459 PB0024_00191_182_0] (est. dur. 18527.18 seconds)
2014-06-29 09:28:50.6324 [PID=17986] [send] est. duration for WU 193307638: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6324 [PID=17986] [send] [WU#193307638] meets deadline: 18527.18 + 18527.18 < 1209600
2014-06-29 09:28:50.6332 [PID=17986] [send] [HOST#7536795] Sending app_version 483 einsteinbinary_BRP5 7 139 BRP5-cuda32-nv270; 49.97 GFLOPS
2014-06-29 09:28:50.6347 [PID=17986] [send] est. duration for WU 193307638: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6347 [PID=17986] [HOST#7536795] Sending [RESULT#443165551 PB0024_00141_24_0] (est. dur. 18527.18 seconds)
2014-06-29 09:28:50.6356 [PID=17986] [send] est. duration for WU 193249827: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6356 [PID=17986] [send] [WU#193249827] meets deadline: 37054.37 + 18527.18 < 1209600
2014-06-29 09:28:50.6364 [PID=17986] [send] [HOST#7536795] Sending app_version 483 einsteinbinary_BRP5 7 139 BRP5-cuda32-nv270; 49.97 GFLOPS
2014-06-29 09:28:50.6380 [PID=17986] [send] est. duration for WU 193249827: unscaled 9004.88 scaled 18527.18
2014-06-29 09:28:50.6381 [PID=17986] [HOST#7536795] Sending [RESULT#443038987 PB0023_01561_144_0] (est. dur. 18527.18 seconds)

Claggy
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113171 - Posted: 29 Jun 2014, 11:06:47 UTC
Last modified: 29 Jun 2014, 11:07:38 UTC

Now for the server side: that 'Best version of app' string comes from sched_version.cpp (scheduler inbuilt functions) and uses the following resources:
app->name, bavp->avp->id, bavp->host_usage.projected_flops/1e9

That projected_flops is set during app version selection; as the number of samples will be < 10, flops will be adjusted based on the pfc samples average for the app version (there will be 100 of those from other users).

Since that's normalised elsewhere (see the red ellipse on the dodgy diagram), the net effect translates the pfc of 0.1 used for the original estimate to 1, so peak_flops is multiplied by 10-20x.
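For reference, that log line is presumably assembled along these lines (a paraphrase from memory, not the literal sched_version.cpp source; the format string is inferred from the log output quoted at the top of this page):

log_messages.printf(MSG_NORMAL,
    "[version] Best version of app %s is [AV#%d] (%.2f GFLOPS)\n",
    app->name, bavp->avp->id, bavp->host_usage.projected_flops/1e9
);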
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113172 - Posted: 29 Jun 2014, 11:11:20 UTC - in response to Message 113170.  
Last modified: 29 Jun 2014, 11:13:32 UTC

> I was thinking that they were using Einstein customisations here that might not be needed; looking at robl's Einstein log shows it's the durations that get scaled there:

Yeah, they were before. Bernd had to do quite a lot of work to get here onto updated stock server code. Now (here) it should be pretty close or identical (for our purposes) to current BOINC master, IIRC.
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113173 - Posted: 29 Jun 2014, 11:14:35 UTC - in response to Message 113171.  

> Now for the server side: that 'Best version of app' string comes from sched_version.cpp (scheduler inbuilt functions) and uses the following resources:
> app->name, bavp->avp->id, bavp->host_usage.projected_flops/1e9
>
> That projected_flops is set during app version selection; as the number of samples will be < 10, flops will be adjusted based on the pfc samples average for the app version (there will be 100 of those from other users).
>
> Since that's normalised elsewhere (see the red ellipse on the dodgy diagram), the net effect translates the pfc of 0.1 used for the original estimate to 1, so peak_flops is multiplied by 10-20x.

Richard, do you want code line numbers for that?
Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113174 - Posted: 29 Jun 2014, 11:35:36 UTC - in response to Message 113169.  

> Ah, all right.
> Yeah, I'm only interested in fixing the current code, rather than diagnosing/patching old versions :)

Yes, concentrating on the current code and moving it forward is certainly the right approach - but it's probably worth just being aware of the steps we moved through to reach this point, because it can influence compatibility problems that could arise in the future.

As we've discussed, DCF was deprecated from client v7.0.28, and in the server code from a little earlier. But not everything in the BOINC world moves in lockstep, so we have older and newer servers in use, and we also have older and newer clients in use.

Older servers take account of client DCF when scaling runtime estimates prior to allocating work:
[send] active_frac 0.999987 on_frac 0.999802 DCF 0.776980

Newer servers don't:
[send] on_frac 0.999802 active_frac 0.999987 gpu_active_frac 0.999978

Those are both the same machine (the one I've been graphing here), which explains why on_frac and active_frac are identical. But the first line comes from the Einstein server log, and the second line from the Albert server log.

So, even my late-alpha version of BOINC (v7.3.19) is maintaining, using and reporting DCF against an 'old server' project which needs it. Good compatibility choice.

But the reverse case is not so happy. An older client (I'm talking standard stock clients here, not Jason's specially-tweaked client) will go on using and reporting DCF as before, because it doesn't parse the <dont_use_dcf/> tag. But the newer server code has discarded DCF completely, and doesn't scale its internal runtime estimates when presented with a work request from a client which is still using it.

This can - and does - result in servers allocating vastly different volumes of work from what the client expects, because the estimation process doesn't have all the same inputs.

Say, for the sake of argument, that an 'old' (pre-v7.0.28) client has got itself into a state with DCF=100, and asks for 1 day of work. For the BRP4G tasks we're studying here, we'd all expect the server to allocate maybe 20 tasks, and the client to agree with the server's calculation of estimated runtime, slightly over 1 day. But if the client is using DCF and the server isn't, that can appear as a 100-day work cache when the client does the local calculation. That's a case where server-client compatibility breaks down, and breaks down badly.
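To put toy numbers on that mismatch (everything below is invented; 4500 s is just a plausible per-task estimate):

#include <cstdio>

int main() {
    const double est_runtime = 4500.0;   // server's per-task runtime estimate, seconds
    const double dcf         = 100.0;    // pathological client-side DCF
    const double request     = 86400.0;  // client asks for 1 day of work

    // the new-style server ignores the client's DCF, so it sends ~19 tasks:
    int tasks_sent = (int)(request / est_runtime + 0.5);

    // an old client multiplies each estimate by its DCF when the work arrives:
    double client_view_days = tasks_sent * est_runtime * dcf / 86400.0;

    printf("server sends %d tasks (~1 day by its estimate)\n", tasks_sent);
    printf("client sees %.0f days of work in its cache\n", client_view_days);  // ~99
    return 0;
}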
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113175 - Posted: 29 Jun 2014, 11:48:26 UTC - in response to Message 113174.  
Last modified: 29 Jun 2014, 11:53:21 UTC

It's a bit of a stretch to examine border cases when the standard setup doesn't even work right. IMO let's start at the common case & work outward, because I guarantee that if the numbers come up flaky there, they aren't going to be magically better with incompatible servers and clients.

For the present question (treblehit's example), specifically: the old Project DCF isn't involved in treblehit's example on Albert in any way (even though it's maintained by the client). It's the improper normalisation with inactive host scale appearing in another form.

... however ...

Since both host_scale and pfc_scale are somewhat noisy and unstable 'per-app DCFs' in disguise, and improperly normalised, it amounts to familiar sets of wacky-number symptoms. If you keep looking for those you will find them everywhere, because the entire system depends on them, and you'd just end up swearing Project DCF is active server side - which, in a sense, through a lot of spaghetti, it is, though it isn't called that, and it's per app version and per host app version instead.

i.e. forget Project DCF (for now); use pfc_scale & host_scale.
Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113176 - Posted: 29 Jun 2014, 11:55:36 UTC - in response to Message 113173.  

>> Now for the server side: that 'Best version of app' string comes from sched_version.cpp (scheduler inbuilt functions) and uses the following resources:
>> app->name, bavp->avp->id, bavp->host_usage.projected_flops/1e9
>>
>> That projected_flops is set during app version selection; as the number of samples will be < 10, flops will be adjusted based on the pfc samples average for the app version (there will be 100 of those from other users).
>>
>> Since that's normalised elsewhere (see the red ellipse on the dodgy diagram), the net effect translates the pfc of 0.1 used for the original estimate to 1, so peak_flops is multiplied by 10-20x.
>
> Richard, do you want code line numbers for that?

That's OK, I can do a text search in sched_version.cpp same as you.

What would perhaps be most useful would be an expanded table of all those TLA variable names, with your assessment of what David intended them to mean, and of what they actually mean in practice.

Looking back at the thread openers, I reported:
client 192 GFLOPS peak, based on PFC avg: 2124.60G

I can't quickly find the client GFLOPS peak number for Claggy's ATI 'Capeverde' with "based on PFC avg: 34968.78G". I'd like to look for the variable (presumably a struct member) where we might expect GFLOPS peak to be stored, and see what it's multiplied by in those initial stages before 11 completions establish an APR. We might expect 0.1 from the words, but we seem to be using >10 by the numbers.
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113177 - Posted: 29 Jun 2014, 12:04:42 UTC - in response to Message 113176.  

Right, that's what I meant by line numbers (with a brief description).

Claggy's case:
if (av.pfc.n > MIN_VERSION_SAMPLES) {
    hu.projected_flops = hu.peak_flops/av.pfc.get_avg();
    if (config.debug_version_select) {
        log_messages.printf(MSG_NORMAL,
            "[version] [AV#%d] (%s) adjusting projected flops based on PFC avg: %.2fG\n",


That is his marketing flops estimate: peak_flops divided by the app version's pfc average.

The app version pfc is normalised to 0.1 (design flaw), and any real samples would have driven it toward 0.05 or lower. So that figure comes out as 10-20x+ the marketing flops, which is NOT the intent, nor remotely correct design. It's gibberish.
Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 113178 - Posted: 29 Jun 2014, 12:05:51 UTC - in response to Message 113176.  

> I can't quickly find the client GFLOPS peak number for Claggy's ATI 'Capeverde' with "based on PFC avg: 34968.78G". I'd like to look for the variable (presumably a struct member) where we might expect GFLOPS peak to be stored, and see what it's multiplied by in those initial stages before 11 completions establish an APR. We might expect 0.1 from the words, but we seem to be using >10 by the numbers.

17/06/2014 18:17:17 | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1848, 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version 1348.5 (VM), device version OpenCL 1.2 AMD-APP (1348.5), 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.5 (sse2,avx), device version OpenCL 1.2 AMD-APP (1348.5))

Claggy
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113179 - Posted: 29 Jun 2014, 12:10:42 UTC - in response to Message 113178.  
Last modified: 29 Jun 2014, 12:17:22 UTC

There you go: the app version pfc average (!) is 3584 GFLOPS / 34968.78 GFLOPS ~= 0.102**

[Edit:]
** Unfortunately, that's improperly normalised, so it's meaningless without the normalisation reference app version figure, as per the red ellipse on the diagram... so the true figure will likely be around 0.02 or so, but it's anybody's guess without knowing which app version sits at 0.1.
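For the record, that back-of-envelope division, using only the two figures quoted from the logs:

#include <cstdio>

int main() {
    const double peak_flops = 3584e9;     // client-reported Capeverde peak, from Claggy's startup log
    const double projected  = 34968.78e9; // 'PFC avg' GFLOPS figure from the server log
    // implied app version pfc average:
    printf("pfc avg ~= %.3f\n", peak_flops / projected);  // ~0.102
    return 0;
}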
Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113180 - Posted: 29 Jun 2014, 12:20:03 UTC - in response to Message 113177.  
Last modified: 29 Jun 2014, 12:26:26 UTC

> The app version pfc is normalised to 0.1 (design flaw), and any real samples would have driven it toward 0.05 or lower. So that figure comes out as 10-20x+ the marketing flops, which is NOT the intent, nor remotely correct design. It's gibberish.

The advice given to project administrators in http://boinc.berkeley.edu/trac/wiki/AppPlanSpec is:

<gpu_peak_flops_scale>x</gpu_peak_flops_scale>
scale GPU peak speed by this (default 1).

I'm wondering whether they put in 0.1, expecting this to be a multiplier (real flops are lower than peak flops), but it ends up dividing by 0.1 instead? And from what you say, 'default 1' doesn't match the code either?

Edit: the alternative C++ documentation for plan_classes is in http://boinc.berkeley.edu/trac/wiki/PlanClassFunc. There, the example is

.21            // estimated GPU efficiency (actual/peak FLOPS)

At least one of those must be upside down.
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113181 - Posted: 29 Jun 2014, 12:26:01 UTC - in response to Message 113180.  
Last modified: 29 Jun 2014, 12:30:29 UTC

>> The app version pfc is normalised to 0.1 (design flaw), and any real samples would have driven it toward 0.05 or lower. So that figure comes out as 10-20x+ the marketing flops, which is NOT the intent, nor remotely correct design. It's gibberish.
>
> The advice given to project administrators in http://boinc.berkeley.edu/trac/wiki/AppPlanSpec is:
>
> <gpu_peak_flops_scale>x</gpu_peak_flops_scale>
> scale GPU peak speed by this (default 1).
>
> I'm wondering whether they put in 0.1, expecting this to be a multiplier (real flops are lower than peak flops), but it ends up dividing by 0.1 instead? And from what you say, 'default 1' doesn't match the code either?


Nope [0.1 is hardwired via a 'magic number'], and 1 wouldn't be right for a GPU anyway. Correct would be ~0.05, don't normalise (except for credit), and enable+set a default host_scale of 1 from the start... which would yield a projected flops (before convergence) of 0.05 x 1 x peak_flops... basically one twentieth of the marketing flops... then [let it] scale itself.
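A minimal sketch of that proposed starting point (variable names are illustrative, not BOINC's; the peak figure is Claggy's Capeverde from the log above, and 0.05 is the suggested default GPU efficiency):

#include <cstdio>

int main() {
    const double peak_flops = 3584e9;  // e.g. Claggy's Capeverde 'marketing' peak
    const double pfc        = 0.05;    // proposed default GPU efficiency, no normalisation
    const double host_scale = 1.0;     // enabled and set to 1 from the start

    double projected_flops = pfc * host_scale * peak_flops;  // ~1/20 of marketing flops
    printf("initial projected_flops = %.1f GFLOPS\n", projected_flops / 1e9);  // ~179.2
    return 0;
}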
Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113182 - Posted: 29 Jun 2014, 12:33:15 UTC - in response to Message 113181.  

See edit to my last. In my view, if the relevant numbers are all <<1, we should be multiplying by them, not dividing by them.

Out of coffee error - going shopping. Back soon.
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113183 - Posted: 29 Jun 2014, 12:43:10 UTC - in response to Message 113182.  
Last modified: 29 Jun 2014, 12:50:19 UTC

> See edit to my last. In my view, if the relevant numbers are all <<1, we should be multiplying by them, not dividing by them.
>
> Out of coffee error - going shopping. Back soon.


The main issue is really that he starts with real marketing flops (more or less usable), works out an average efficiency there (yuck, but still OK-ish), but then normalises to some other app version... IOW, he multiplies by some arbitrarily large number (or divides by some fraction, if you prefer) with no connection to real throughputs or efficiencies for this device+app.

That's OK for a relative number for credit (debatable)... but totally useless for time and throughput estimates (which are absolute estimates). The improper normalisation shrank your time estimate by multiplying projected_flops to 10x+ the already-bloated marketing flops.