Project server code update

jason_gee

Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113184 - Posted: 29 Jun 2014, 13:03:37 UTC - in response to Message 113180.  
Last modified: 29 Jun 2014, 13:15:27 UTC

[quote]At least one of those must be upside down.[/quote]

In a sense, yes. GPU app+device+conditions efficiency would be actual/peak, and must be less than 1 (and it is: it should be around 0.05 for a single-task CUDA GPU, for example). Normalisation could be viewed as turning it upside down. It'll raise the GFlops & shrink the time estimate artificially --> the exact opposite of the kind of behaviour we want for new hosts/apps.

Things will become a bit clearer when I have the next dodgy diagram ready. Getting bogged down in broken code is a bit of a red herring at the moment, as there are design-level issues to tackle first.

In particular, debugging the normalisation, including the absurd GFlops numbers it produces, is pointless in the context of estimates. That's because neither the time nor the GFlops should be being normalised [AT ALL], so it all gets disabled in estimates and restricted to credit-related uses, where it's applicable for getting the same credit claims from different apps.
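
To put rough numbers on that inversion, a minimal Python sketch with made-up but representative values (an illustration only, not the actual server code):

    # Illustrative only: dividing by a small efficiency "normalises" a
    # GPU's speed back up towards peak and shrinks the time estimate.
    PEAK_GFLOPS = 500.0       # hypothetical GPU peak speed
    EFFICIENCY = 0.05         # actual/peak for a single-task CUDA app
    TASK_GFLOP = 450_000.0    # hypothetical flop content of one task

    actual_gflops = PEAK_GFLOPS * EFFICIENCY        # 25 GFLOPS really delivered
    sane_estimate = TASK_GFLOP / actual_gflops      # 18000 s, about right

    # Normalising divides by the efficiency, i.e. turns it upside down:
    normalised_gflops = actual_gflops / EFFICIENCY  # 500 GFLOPS, absurd
    bogus_estimate = TASK_GFLOP / normalised_gflops # 900 s, 20x too short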
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113185 - Posted: 29 Jun 2014, 13:30:28 UTC - in response to Message 113184.  

[quote]In particular, debugging the normalisation, including the absurd GFlops numbers it produces, is pointless in the context of estimates. That's because neither the time nor the GFlops should be being normalised [AT ALL], so it all gets disabled in estimates and restricted to credit-related uses, where it's applicable for getting the same credit claims from different apps.[/quote]

Well, we do (crudely) have two separate cases to deal with.

1) initial attach. We have to get rid of that divide-by-almost-zero, or hosts can't run: they get an absurdly low runtime estimate/bound and error out when they exceed it (see the sketch after point 2 below).

2) steady state. In my (political) opinion, trying to bring back client-side DCF will be flogging one dead horse too many. We need some sort of server-side control of runtime estimates, so that client scheduling works and user expectations are met. I'm happy to accept that the new version will be different to the one we have now, and look forward to seeing it.
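
For case 1, the failure mode looks roughly like this, in illustrative Python rather than the real scheduler code (the names and the 0.01 floor are mine, not BOINC's):

    # Illustrative only: not BOINC's actual code or identifiers.
    def projected_flops(raw_flops, scale):
        # On a freshly attached host 'scale' can be ~0; dividing by it
        # inflates the speed to absurd GFlops levels. Flooring the
        # divisor is one crude guard.
        return raw_flops / max(scale, 0.01)

    def est_duration(fpops_est, flops):
        return fpops_est / flops    # seconds; tiny when flops is inflated

    def max_time(fpops_bound, flops):
        return fpops_bound / flops  # tasks are aborted past this bound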

OK, I'll get out of your hair, and take my coffee downstairs to grab some more stats.

jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113186 - Posted: 29 Jun 2014, 13:37:12 UTC - in response to Message 113185.  

[quote]Well, we do (crudely) have two separate cases to deal with. [...] OK, I'll get out of your hair, and take my coffee downstairs to grab some more stats.[/quote]

LoL, always appreciate bouncing it around, thanks. At the moment it's a bit like pointing to a bucket of kittens and saying 'that's not the flower-pot I ordered!'. Yeah, it's possible to debate intent versus function further, but when push comes to shove it's just wrong & gives wacky numbers. Not really any more complicated than that, in some sense ;)
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Snow Crash
Joined: 11 Aug 13
Posts: 10
Credit: 5,011,603
RAC: 0
Message 113187 - Posted: 29 Jun 2014, 18:32:40 UTC

June 29, 2014 18:00 UTC
[url]https://albert.phys.uwm.edu/show_host_detail.php?hostid=9649[/url]
BRP4G  2x using 1 cpu thread each (app_config), GPU utilization = 92%
       running an additional 4x Skynet POGs cpu WUs
GPU    7950 mem=1325, gpu=1150, pcie v2 x16
OS     Win7 x64 Home Premium
CPU    980X running at 3.41 GHz with HT off
MEM    Triple channel 1600 (7.7.7.20.2)

treblehit
Joined: 12 Mar 05
Posts: 5
Credit: 35,119
RAC: 0
Message 113188 - Posted: 29 Jun 2014, 20:01:58 UTC - in response to Message 113185.

[quote]1) initial attach. We have to get rid of that divide-by-almost-zero, or hosts can't run: they get an absurdly low runtime estimate/bound and error out when they exceed it.[/quote]

I'll be bringing more machines online today in a desperate attempt to provide steady, un-fiddled-with, untweaked, vanilla BRP4G work for you.

I just need instructions: A) let them fail so you can see that, or B) somehow prevent them from failing so that you have the reliable work-flow.

Instructions, please.

Bret

Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113189 - Posted: 29 Jun 2014, 20:22:36 UTC - in response to Message 113188.  

Um, if you don't mind, I think it might be best to wait a little time. The administrators on this project are based in Europe, and as you know Jason is ahead of our time-zone, in Australia. I think it might be better to wait 12 hours or so, until we have a chance to compare notes by email when the lab opens in the morning.

After all, we don't want to use up our entire supply of unattached new hosts in one hit, or else we won't have anything left to test Jason's patches with....

treblehit
Joined: 12 Mar 05
Posts: 5
Credit: 35,119
RAC: 0
Message 113190 - Posted: 29 Jun 2014, 23:39:59 UTC - in response to Message 113189.  

[quote]Um, if you don't mind, I think it might be best to wait a little time.[/quote]

I completely understand, Richard. I was reluctant to bring it up in the first place.

Unfortunately for me I have to deal with the hardware side of it when I can, so I'm going to cope with that today. I'll get it ready to connect remotely when you guys are ready for it.

Let me know. You both know how to find me when and if you want me.

In the meantime, I'm going to detach this host and go away to stop being a distraction.

I only started this because "She Who Must Be Obeyed" had indicated you guys needed a reliable and unchanging stream of BRP4G tasks over on the GPU User's Group team message board.


Bret

jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113191 - Posted: 30 Jun 2014, 2:26:09 UTC - in response to Message 113189.  
Last modified: 30 Jun 2014, 2:30:37 UTC

[quote]After all, we don't want to use up our entire supply of unattached new hosts in one hit, or else we won't have anything left to test Jason's patches with....[/quote]


Yes, unhooking that normalisation (which divides by ~0.1, multiplies the GPU GFlops by ~10 to absurd levels, and shrinks time estimates) is going to take quite some preparation to do *safely*. That same mechanism is hooked into credit (where it does make sense), so quite a lot of backwards & forwards for clarification, discussion and debate will be needed to get it 'right', and part of that's going to be me communicating effectively (which isn't always easy :)).

The other aspect is that some bandaids will be painful to rip off, and still other odd artefacts might be hiding inside... and the only way to tell for sure is to open it up.

The next few days will tell if we're all on the same page (looking from different angles is fine, though). To me, we are well through the tricky bits of understanding the current system enough to say it needs to be a lot better.
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113197 - Posted: 1 Jul 2014, 10:55:05 UTC

Latest scattergram.

I've reverted my 5367 to normal running (early afternoon yesterday), so my timings *should* be lower and steadier - doesn't really seem to show in credit yet. I wonder why Claggy's laptop gets such variable credit?

jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113198 - Posted: 1 Jul 2014, 12:34:59 UTC - in response to Message 113197.  
Last modified: 1 Jul 2014, 13:02:43 UTC

[quote]I wonder why Claggy's laptop gets such variable credit?[/quote]


Multiple tasks on a smaller GPU, each running longer, will generate higher raw peak flop claims (pfc's), which are then averaged with the wingman's (the yellow triangle on the dodgy diagram). So the result can be anywhere from the normal range to the jackpot, as we previously assessed, depending on the wingman's claim. Though the prevalence of the jackpot conditions is less obvious, the noise in the system is still there.
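
Roughly, with made-up numbers in Python (an illustration of the averaging, not the actual credit code):

    # Illustrative only: a slower host running longer claims a bigger
    # raw peak flop count (pfc), and credit follows the averaged claim.
    def claimed_pfc(peak_flops, runtime_s):
        return peak_flops * runtime_s   # raw peak flop claim

    normal_wingman = claimed_pfc(1.5e12, 12_000)   # ~1.8e16 flops claimed
    slow_laptop = claimed_pfc(3.0e11, 80_000)      # ~2.4e16 flops claimed

    average_claim = (normal_wingman + slow_laptop) / 2
    # Pair two similar hosts and the average looks 'normal'; pair
    # mismatched ones and it can land anywhere up to the jackpot.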
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 113203 - Posted: 1 Jul 2014, 20:22:26 UTC - in response to Message 113198.  

[quote]Multiple tasks on a smaller GPU, each running longer, will generate higher raw peak flop claims (pfc's) [...][/quote]

I'm just running a single GPU task on both my GPU hosts (the T8100's 128MB 8400M GS doesn't count).

Claggy

jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113206 - Posted: 2 Jul 2014, 3:11:12 UTC - in response to Message 113203.  
Last modified: 2 Jul 2014, 3:18:06 UTC

[quote]I'm just running a single GPU task on both my GPU hosts (the T8100's 128MB 8400M GS doesn't count).[/quote]


Could be the wingmen. (There's a number of combinations of wingman types that'll give random results between two regions. Two similar wingmen tend to cancel with averaging and become 'normal'.)
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113207 - Posted: 2 Jul 2014, 10:26:08 UTC - in response to Message 113206.  

[quote]Could be the wingmen. (There's a number of combinations of wingman types that'll give random results between two regions. Two similar wingmen tend to cancel with averaging and become 'normal'.)[/quote]

Conversely, when he's paired with me - now back to lower, stable, runtimes - no jackpot, no bonus. Sorry 'bout that.

jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113209 - Posted: 2 Jul 2014, 10:54:43 UTC - in response to Message 113207.  

[quote]Conversely, when he's paired with me - now back to lower, stable, runtimes - no jackpot, no bonus. Sorry 'bout that.[/quote]


LoL, yep, throwing the dice to get an answer is as good as any ;)
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage

juan BFB
Joined: 10 Dec 12
Posts: 8
Credit: 1,674,320
RAC: 0
Message 113211 - Posted: 2 Jul 2014, 17:28:06 UTC
Last modified: 2 Jul 2014, 17:30:46 UTC

@Richard/Claggy

Should I continue to crunch BRP4G only, or do you suggest crunching another type of WU too? (I can only do GPU work here.)

BTW, I slowed down my crunchers here, since I don't believe quantity is what you're looking for; they will now produce a stable number of daily WUs.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113212 - Posted: 2 Jul 2014, 19:15:34 UTC - in response to Message 113211.  

[quote]BTW, I slowed down my crunchers here, since I don't believe quantity is what you're looking for; they will now produce a stable number of daily WUs.[/quote]

I think that's probably a good idea. We're already at the stage where my last 12 consecutive validations have been against one or other of your hosts (5 different machines, I think). And the machines are all pretty similar, to each other and to mine: GTX 670/690/780, running Win7/64 or (in one case) Server 2008.

In order to see (now) and test (later) BOINC's behaviour in the real world, we probably need a reasonable variation in hosts to give us realistic variation in the times and credits.

Bernd has launched a new 'BRP5' (Perseus Arm Survey) v1.40, with a Beta app tag on it, to test that new feature in the BOINC scheduler. I'm in the process of switching my machine over to run that instead. Some company would be nice, but be warned: we're half expecting to fall over the 'EXIT_TIME_LIMIT_EXCEEDED' problem at some stage with BRP5 Beta, so hosts running it probably need to be watched quite closely for strange estimated runtimes, and you need to be ready to take action to correct it.

Holmis
Joined: 4 Jan 05
Posts: 104
Credit: 2,104,736
RAC: 0
Message 113213 - Posted: 2 Jul 2014, 19:51:55 UTC - in response to Message 113212.  
Last modified: 2 Jul 2014, 19:52:14 UTC

[quote]... some company would be nice, but be warned: we're half expecting to fall over the 'EXIT_TIME_LIMIT_EXCEEDED' problem at some stage with BRP5 Beta...[/quote]

I just downloaded my first v1.40 BRP5 task and I'd say it's looking pretty good so far! The estimated completion time shown in BOINC is 5h03m08s.
These are the relevant lines from the scheduler log:

2014-07-02 19:35:03.2067 [PID=25783] [version] Best version of app einsteinbinary_BRP5 is [AV#934] (24.74 GFLOPS)
2014-07-02 19:35:03.2067 [PID=25783] [send] est delay 0, skipping deadline check
2014-07-02 19:35:03.2067 [PID=25783] [version] get_app_version(): getting app version for WU#625766 (PB0020_006A1_164) appid:27
2014-07-02 19:35:03.2067 [PID=25783] [version] returning cached version: [AV#934]
2014-07-02 19:35:03.2067 [PID=25783] [send] est delay 0, skipping deadline check
2014-07-02 19:35:03.3000 [PID=25783] [send] Sending app_version einsteinbinary_BRP5 2 140 BRP5-cuda32-nv301; projected 24.74 GFLOPS
2014-07-02 19:35:03.3001 [PID=25783] [send] est. duration for WU 625766: unscaled 18188.26 scaled 18306.56
2014-07-02 19:35:03.3001 [PID=25783] [send] [HOST#2267] sending [RESULT#1514790 PB0020_006A1_164_4] (est. dur. 18306.56s (5h05m06s55)) (max time 363765.12s (101h02m45s11))

And I've got this in the application details:

Binary Radio Pulsar Search (Perseus Arm Survey) 1.40 windows_intelx86 (BRP5-cuda32-nv301)
Number of tasks completed   0
Max tasks per day	    0
Number of tasks today	    1
Consecutive valid tasks	    0
Average turnaround time	    0.00 days

For v1.39 the tasks took less than 5 hours and the APR was 21.91 GFlops.
Whatever was changed seems to be working with regard to the initial estimates, assuming the app and workload are more or less the same. Keep up the good work!
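
As a rough cross-check of the log above (working backwards in Python, so the task's flop count is inferred rather than read from the server):

    # The unscaled estimate is just the task's flop content divided by
    # the projected speed, so the flop estimate must be about:
    projected_gflops = 24.74
    unscaled_est_s = 18188.26
    rsc_fpops_est = unscaled_est_s * projected_gflops * 1e9  # ~4.5e14 flops

    # And the max-time bound in the log is exactly 20x the estimate:
    print(363765.12 / 18188.26)   # ~20.0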

Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113214 - Posted: 2 Jul 2014, 20:03:50 UTC - in response to Message 113213.  

Nothing's been changed yet...

I got something similar - 25.25 GFlops and 4h57m02s24:

2014-07-02 17:43:24.7141 [PID=19995] [version] [AV#934] (BRP5-cuda32-nv301) using conservative projected flops: 25.25G
2014-07-02 17:43:24.7141 [PID=19995] [version] Best app version is now AV934 (102.01 GFLOP)
2014-07-02 17:43:24.7142 [PID=19995] [version] Checking plan class 'BRP5-opencl-ati'
2014-07-02 17:43:24.7142 [PID=19995] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 0.480000
2014-07-02 17:43:24.7142 [PID=19995] [version] plan_class_spec: No AMD GPUs found
2014-07-02 17:43:24.7142 [PID=19995] [version] [AV#937] app_plan() returned false
2014-07-02 17:43:24.7142 [PID=19995] [version] Checking plan class 'BRP5-opencl-intel_gpu'
2014-07-02 17:43:24.7142 [PID=19995] [version] plan_class_spec: parsed project prefs setting 'gpu_util_brp' : true : 0.480000
2014-07-02 17:43:24.7142 [PID=19995] [version] [AV#935] Skipping Intel GPU version - user prefs say no Intel GPU
2014-07-02 17:43:24.7142 [PID=19995] [version] [AV#934] (BRP5-cuda32-nv301) using conservative projected flops: 25.25G
2014-07-02 17:43:24.7142 [PID=19995] [version] Best version of app einsteinbinary_BRP5 is [AV#934] (25.25 GFLOPS)
2014-07-02 17:43:24.7142 [PID=19995] [send] est delay 0, skipping deadline check
2014-07-02 17:43:24.7142 [PID=19995] [version] get_app_version(): getting app version for WU#625736 (PB0020_006A1_104) appid:27
2014-07-02 17:43:24.7143 [PID=19995] [version] returning cached version: [AV#934]
2014-07-02 17:43:24.7143 [PID=19995] [send] est delay 0, skipping deadline check
2014-07-02 17:43:24.7197 [PID=19995] [send] Sending app_version einsteinbinary_BRP5 2 140 BRP5-cuda32-nv301; projected 25.25 GFLOPS
2014-07-02 17:43:24.7198 [PID=19995] [send] est. duration for WU 625736: unscaled 17819.43 scaled 17822.25
2014-07-02 17:43:24.7198 [PID=19995] [send] [HOST#5367] sending [RESULT#1523511 PB0020_006A1_104_6] (est. dur. 17822.25s (4h57m02s24)) (max time 356388.68s (98h59m48s67))

But note the line I've picked out, 'using conservative projected flops': that means there are fewer than 100 completed tasks for this app_version yet, across the project as a whole.

The worry is that when 100 tasks have been completed, but before you have completed 11 tasks on your host (needed to use APR), you'll see 'adjusting projected flops based on PFC avg' and some absurdly large number. That'll be when the errors (if any) start.
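
In outline, the progression looks something like this (illustrative Python; the names and structure are mine, not actual BOINC identifiers, with the 100 and 11 thresholds as above):

    # Illustrative only: which speed figure drives the runtime estimate.
    def speed_estimate(av_completed, host_completed,
                       conservative, pfc_based, host_apr):
        if av_completed < 100:    # app version still young project-wide
            return conservative   # "using conservative projected flops"
        if host_completed < 11:   # host has no usable APR yet
            return pfc_based      # "adjusting projected flops based on PFC avg"
        return host_apr           # host's own Average Processing Rate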

Holmis
Joined: 4 Jan 05
Posts: 104
Credit: 2,104,736
RAC: 0
Message 113215 - Posted: 2 Jul 2014, 20:12:52 UTC - in response to Message 113214.  

Roger that, will keep a close watch on things until I've completed my first 11 tasks then.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113217 - Posted: 2 Jul 2014, 21:04:15 UTC

Well, here's the first conundrum:

All Binary Radio Pulsar Search (Perseus Arm Survey) tasks for computer 5367

After 200 minutes of solid GTX 670 work on Perseus, I earn the princely sum of ... 15 credits!