Project server code update

jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113030 - Posted: 17 Jun 2014, 17:33:20 UTC - in response to Message 113028.  

I've seen that annotation before, somewhere.
rr_sim I think - can you look at a sample please, to check local boinc log against server values?


Yes, we were there the other day digging out where Whetstone was hiding: sched_version.cpp, the estimate_flops() functions. There's one for non-anon, and another slightly different one for anon. For non-anon, before statistics are gathered, it's BOINC Whetstone for CPU (incidentally SIMD-aware on Android but not x86), and some mystery guesstimate for GPUs.

Those mystery guesstimates for GPUs are one of the major quarries for our quest.

Claggy's ATI is running at 2.95 Teraflops, to put it in simpler numbers.


Yep. Also be aware in that area, just to complicate matters, that there is a scheduler config option David's thrown in, enabling a random multiplier across the project_flops for each app_version, so that app versions get juggled at least before stats are gathered.

I'm getting the distinct impression he's 'lost' the old 0.1 GPU flops scaling there (haven't come across it yet anyway, still looking), meaning that'll probably be using the raw client-supplied marketing flops value, possibly multiplied by some random number...
On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - C Babbage
Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113031 - Posted: 17 Jun 2014, 17:34:11 UTC - in response to Message 113029.  

Unfortunately I missed the server log for a fetch - just got a 'report only' RPC instead. Could you grab a log if it does another work_fetch, please?

I did another request, and suspended network:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy

[version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113032 - Posted: 17 Jun 2014, 17:37:24 UTC - in response to Message 113031.  
Last modified: 17 Jun 2014, 17:40:19 UTC

Unfortunately I missed the server log for a fetch - just got a 'report only' RPC instead. Could you grab a log if it does another work_fetch, please?

I did another request, and suspended network:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy

[version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G


That's not teraflops (speed); that's peak flop count, as in number of operations.

(verifying in code now)

*scratch that* looks broken, walking the lot with beer
Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113033 - Posted: 17 Jun 2014, 17:41:22 UTC - in response to Message 113032.  

Unfortunately I missed the server log for a fetch - just got a 'report only' RPC instead. Could you grab a log if it does another work_fetch, please?

I did another request, and suspended network:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy

[version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G


That's not TeraFlops (speed), That's peak flop count, as in # of operations.

(verifying in code now)

*scratch that* looks broken, walking the lot with beer

The server is using it as a speed for estimation purposes. Maybe that's our problem.
Eyrie
Joined: 20 Feb 14
Posts: 47
Credit: 2,410
RAC: 0
Message 113034 - Posted: 17 Jun 2014, 17:42:19 UTC - in response to Message 113032.  



*scratch that* looks broken, walking the lot with beer


peanut gallery: that's like saying that water is wet after falling in and getting soaked...
Enjoy the beer. Valium might be the better choice.
Queen of Aliasses, wielder of the SETI rolling pin, Mistress of the red shoes, Guardian of the orange tree, Slayer of very small dragons.
Claggy
Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 113035 - Posted: 17 Jun 2014, 17:47:40 UTC - in response to Message 113032.  
Last modified: 17 Jun 2014, 18:35:27 UTC

Unfortunately I missed the server log for a fetch - just got a 'report only' RPC instead. Could you grab a log if it does another work_fetch, please?

I did another request, and suspended network:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy

[version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G


That's not TeraFlops (speed), That's peak flop count, as in # of operations.

(verifying in code now)

*scratch that* looks broken, walking the lot with beer


Boinc startup says:

17/06/2014 18:17:17 | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1848, 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version 1348.5 (VM), device version OpenCL 1.2 AMD-APP (1348.5), 1024MB, 984MB available, 3584 GFLOPS peak)
17/06/2014 18:17:17 | | OpenCL CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.5 (sse2,avx), device version OpenCL 1.2 AMD-APP (1348.5))

The GTX460 always had a much lower GFLOPS peak value, but was a lot more effective at SETI v6, v7 and AP v6. The exception is here, with the OpenCL Gamma-ray pulsar search #3 1.07 app, where the HD7770 was a little faster:

https://albert.phys.uwm.edu/host_app_versions.php?hostid=8143

Gamma-ray pulsar search #3 1.07 windows_x86_64 (FGRPopencl-ati)
Number of tasks completed: 13
Max tasks per day: 45
Number of tasks today: 0
Consecutive valid tasks: 13
Average processing rate: 3.55 GFLOPS
Average turnaround time: 0.37 days

Gamma-ray pulsar search #3 1.07 windows_x86_64 (FGRPopencl-nvidia)
Number of tasks completed: 12
Max tasks per day: 44
Number of tasks today: 0
Consecutive valid tasks: 12
Average processing rate: 2.87 GFLOPS
Average turnaround time: 0.88 days


http://boinc.berkeley.edu/dev/forum_thread.php?id=8767&postid=51659

04/12/2013 21:25:07 | | CUDA: NVIDIA GPU 0: GeForce GTX 460 (driver version 331.58, CUDA version 6.0, compute capability 2.1, 1024MB, 854MB available, 1075 GFLOPS peak)
04/12/2013 21:25:07 | | CAL: ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (CAL version 1.4.1848, 1024MB, 984MB available, 3584 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL: NVIDIA GPU 0: GeForce GTX 460 (driver version 331.58, device version OpenCL 1.1 CUDA, 1024MB, 854MB available, 1075 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 7700 series (Capeverde) (driver version 1348.4 (VM), device version OpenCL 1.2 AMD-APP (1348.4), 1024MB, 984MB available, 3584 GFLOPS peak)
04/12/2013 21:25:07 | | OpenCL CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (OpenCL driver vendor: Advanced Micro Devices, Inc., driver version 1348.4 (sse2,avx), device version OpenCL 1.2 AMD-APP (1348.4))

Claggy
Eyrie
Joined: 20 Feb 14
Posts: 47
Credit: 2,410
RAC: 0
Message 113036 - Posted: 17 Jun 2014, 17:49:29 UTC - in response to Message 113033.  
Last modified: 17 Jun 2014, 17:51:04 UTC

Unfortunately I missed the server log for a fetch - just got a 'report only' RPC instead. Could you grab a log if it does another work_fetch, please?

I did another request, and suspended network:

https://albert.phys.uwm.edu/host_sched_logs/8/8143

Claggy

[version] [AV#911] (FGRPopencl-ati) adjusting projected flops based on PFC avg: 2950.33G


That's not TeraFlops (speed), That's peak flop count, as in # of operations.

(verifying in code now)

*scratch that* looks broken, walking the lot with beer

The server is using it as a speed for estimation purposes. Maybe that's our problem.

of course it's speed, it's APR later - 'based on' is our problem - something is being factored in incorrectly. AFAIK on SETI there's no such gross overestimation of GPU speed.

@ Claggy what is the peak flop count for that card? (sorry if you posted that already)

edit: ta.

peak flops x pfc_ave ? the latter being <1 ?
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113037 - Posted: 17 Jun 2014, 17:58:00 UTC - in response to Message 113036.  
Last modified: 17 Jun 2014, 17:59:16 UTC

yes, this is bizarre:

once stats are gathered:
if (av.pfc.n > MIN_VERSION_SAMPLES) {
    hu.projected_flops = hu.peak_flops/av.pfc.get_avg();
    if (config.debug_version_select) {
        log_messages.printf(MSG_NORMAL,
            "[version] [AV#%d] (%s) adjusting projected flops based on PFC avg: %.2fG\n",
            av.id, av.plan_class, hu.projected_flops/1e9
        );
    }
}


Dodgy average aside (we know all about the problems of sampled averages there, particularly with very few samples), it looks like the ratio of the marketing flops estimate (from the client) to operations (effective claimed).

Going to check if he's tweaked the definition of pfc here, because a flops rate over an average operation count would give an inverse time (1/seconds) to me, not a speed... checking that pfc with that beer...


[Edit:] no sign of our 0.1x scaling for GPU either, at least in albert code.
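
For anyone following along, here is a minimal sketch of the arithmetic in that snippet, plugging in the numbers Claggy posted (3584 GFLOPS peak, 2950.33G in the scheduler log, and an average processing rate of 3.55 GFLOPS for the same app version). The function names are made up for illustration and do not exist in the BOINC source; only the division mirrors the quoted line.

```cpp
#include <cassert>

// Hypothetical helper mirroring the quoted expression:
//   hu.projected_flops = hu.peak_flops / av.pfc.get_avg();
double projected_flops(double peak_flops, double pfc_avg) {
    assert(pfc_avg > 0.0);  // guard against division by zero
    return peak_flops / pfc_avg;
}

// Inverts the same expression to recover the PFC average
// implied by a "[version] ... PFC avg" log line.
double implied_pfc_avg(double peak_flops, double logged_projected_flops) {
    assert(logged_projected_flops > 0.0);
    return peak_flops / logged_projected_flops;
}

// Ratio of the server's projected speed to the measured
// average processing rate (APR); values above 1 are overestimates.
double overestimate_factor(double projected, double apr) {
    assert(apr > 0.0);
    return projected / apr;
}
```

With those inputs, implied_pfc_avg(3584e9, 2950.33e9) comes out at roughly 1.21, and overestimate_factor(2950.33e9, 3.55e9) at roughly 830. So, if the helper really does reflect what the server does, the projected speed is the marketing figure divided by a number close to 1, which would explain why it lands nowhere near the measured rate.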
Richard Haselgrove
Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113038 - Posted: 17 Jun 2014, 18:31:02 UTC

Jason, with the high-scoring late validations, your average is now above par, at 1003.97

And your median is higher still, at 1168.97
Eyrie
Joined: 20 Feb 14
Posts: 47
Credit: 2,410
RAC: 0
Message 113039 - Posted: 17 Jun 2014, 19:09:51 UTC
Last modified: 17 Jun 2014, 19:11:13 UTC

Ok, so it is effectively using a scaled (marketing) peak flops value - iow a totally unrealistic estimate.

We do need something as a starting point though. Those peak flops are as inadequate as using 10X CPU speed was.

Eve comes in at 91e9 peak flops. From SETI (too small to run here) her GPU is slightly faster than her CPU. CPU needs ~2h for BRP. So roughly the GPU tasks would take 32 hours. That makes her about 32x slower than a 780 - that's the span we are dealing with and it will only grow larger as GPUs get ever faster.

91*32 = 2912 - which is about the figure we saw earlier for fast GPUs - so the slope of the peak flops is not too bad, but the offset is. With an APR of 33 for the 780 and about 1 for Eve we are looking at a ~90x overestimate. For BRP at least.

that scaling value that is being applied must bring the estimates into the correct magnitude over on seti...
any chance to get that number from Eric?

I don't know. If you underestimate the speed, you cache too few tasks - more frequent top up - only a problem if you really can't connect for longer periods of time as you'd run dry (not really a problem either ;) ).

It's the overestimation that runs afoul of the built-in safety-checks.

So how about using 1/100 of peak flops as a GPU starting point? I mean you have to start _somewhere_ ...

Any problems with underestimating I've failed to consider?
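
A rough sketch of the comparison above, assuming the starting estimate is simply peak flops times a fixed scale. The helper names are hypothetical, the 2912 GFLOPS peak for the 780 is the 91*32 figure from this post rather than a logged value, and the APR figures (about 1 for Eve, 33 for the 780) are taken to be in GFLOPS.

```cpp
#include <cassert>

// Hypothetical starting estimate for a GPU app version that has
// no statistics yet: a fixed fraction of the client-reported peak.
double initial_gpu_flops(double peak_flops, double scale) {
    assert(peak_flops > 0.0 && scale > 0.0);
    return peak_flops * scale;
}

// How far an estimate sits from the measured APR:
// above 1 is an overestimate, below 1 an underestimate.
double estimate_to_apr_ratio(double estimate, double apr) {
    assert(apr > 0.0);
    return estimate / apr;
}
```

At a scale of 0.01 the ratio is about 0.91 for Eve and about 0.88 for the 780, so both start slightly underestimated. At 0.1 the same hosts start at roughly 9x their measured rate.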
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113040 - Posted: 17 Jun 2014, 19:12:44 UTC - in response to Message 113038.  

Jason, with the high-scoring late validations, your average is now above par, at 1003.97

And your median is higher still, at 1168.97


good. better late than never :D

Yes, we'll definitely need to stabilise CPU here first. GPU is going to take a bit more digging yet, and whether or not there is any connection at estimate, scheduler or validation needs to be determined before that one's tackled in detail.

There are definitely those dicey averages in play (everywhere) to start with, and I'm also surprised to be finding reliance on those (nearly useless) GPU marketing flops figures embedded even after stats are gathered. Until the primary CPU scales are fixed, and averages of all kinds are replaced with damped values, any particular odd logic choice in there is likely to be obliterated in the noise anyway. (Paraphrasing the comments about chaos burying the noise, lol )
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113041 - Posted: 17 Jun 2014, 19:16:47 UTC - in response to Message 113039.  
Last modified: 17 Jun 2014, 19:22:28 UTC

Ok, so it is effectively using a scaled (marketing) peak flops value - iow a totally unrealistic estimate.

We do need something as a starting point though. Those peak flops are as inadequate as using 10X CPU speed was.
...


I agree, though 'true' averages can be fine and established quickly. 10% of the marketing flops should be near enough ballpark for a new host to get it going... Which scaling, or combination of scalings, is breaking the initial GPU estimate is a mystery to me at the moment, though I have no doubt it'll be much easier to spot with new hostIds in phase 2, when all the averages get replaced with actively controlled dampers.

Pass1 (starting point)
CPU coarse scaling correction
-- look for unexpected effects (e.g. are the GPU apps completely unconnected as expected here)
Pass2 (replace sampled averages with controllers, actively damped)
-- look for GPU scaling errors, particularly new hostids / apps
Pass3
-- GPU scaling logic refinement if needed (probably is)

Got enough to draw up something for passes one and two, will get a coffee & a break, then get to some documenting and coding
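
To give a feel for what "damped" could mean in pass 2, here is a minimal sketch that uses plain exponential smoothing as a stand-in. That choice is an assumption for illustration: the actual controllers may well look different, and neither the DampedAverage type nor the alpha parameter comes from the BOINC code.

```cpp
#include <cassert>

// Hypothetical damped estimator: exponential smoothing in place of a
// plain running mean. Not taken from BOINC; alpha is illustrative.
struct DampedAverage {
    double alpha;         // smoothing factor in (0, 1]; smaller means heavier damping
    double value = 0.0;   // current estimate
    bool primed = false;  // false until the first sample arrives

    explicit DampedAverage(double a) : alpha(a) {
        assert(a > 0.0 && a <= 1.0);
    }

    // Fold in one new sample and return the updated estimate.
    double update(double sample) {
        if (!primed) {
            value = sample;  // the first sample seeds the estimate
            primed = true;
        } else {
            value += alpha * (sample - value);  // move a fraction of the way
        }
        return value;
    }
};
```

With alpha = 0.1, an estimate sitting at 1.0 that sees an outlier of 11.0 moves to 2.0, whereas a two-sample mean would jump to 6.0. That is the kind of behaviour wanted while a host has only a handful of results.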
Eyrie
Joined: 20 Feb 14
Posts: 47
Credit: 2,410
RAC: 0
Message 113042 - Posted: 17 Jun 2014, 19:22:03 UTC - in response to Message 113041.  

Ok, so it is effectively using a scaled (marketing) peak flops value - iow a totally unrealistic estimate.

We do need something as a starting point though. Those peak flops are as inadequate as using 10X CPU speed was.
...


I agree, though 'true' averages can be fine and established quickly. 10% of the marketing flops should be near enough ballpark for a new host to get it going...
...

Didn't I just extensively calculate that 1% is more like it?!
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113043 - Posted: 17 Jun 2014, 19:24:25 UTC - in response to Message 113042.  
Last modified: 17 Jun 2014, 19:32:12 UTC

Ok, so it is effectively using a scaled (marketing) peak flops value - iow a totally unrealistic estimate.

We do need something as a starting point though. Those peak flops are as inadequate as using 10X CPU speed was.
...


I agree, though 'true' averages can be fine and established quickly. 10% of the marketing flops should be near enough ballpark for a new host to get it going...
...

Didn't I just extensively calculate that 1% is more like it?!



Yes, I'm talking from the intent written in code and comments at this point, not what it's actually achieving. If I were to comment on what it's actually achieving, I would have to invent some more words

[Edit:] something like "Bandaids on top of fudge factors applied to magic numbers" comes to mind, though doesn't quite capture it.
Eyrie
Joined: 20 Feb 14
Posts: 47
Credit: 2,410
RAC: 0
Message 113044 - Posted: 17 Jun 2014, 19:35:09 UTC

um, no.

It's achieving chaos. :D

Chaos theory tells us that that means that at least 3 coupled differential equations are in play :) 'three is chaos'.
Getting the system into a steady state means either uncoupling or stabilising sub-equations.
From a mathematical pov this is quite fascinating.
I doubt you'd as easily produce a chaotic system if you were actually trying to get one. :D
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113045 - Posted: 17 Jun 2014, 19:51:47 UTC - in response to Message 113044.  
Last modified: 17 Jun 2014, 19:52:06 UTC

...
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113046 - Posted: 17 Jun 2014, 19:51:49 UTC - in response to Message 113044.  
Last modified: 17 Jun 2014, 19:55:22 UTC

um, no.

It's achieving chaos. :D

Chaos theory tells us that that means that at least 3 coupled differential equations are in play :) 'three is chaos'.
To get the system into a steady-state, means either uncoupling or stabilising sub-equations.
From a mathematical pov this is quite fascinating.
I doubt you'd as easily produce a chaotic system if you were actually trying to get one. :D


Yes, reminds me of a tongue in cheek comment I made suggesting the climate people might be interested in this... oh well

Yes, we can: after poking the CPU app scale in pass 1, in pass 2 we place the two scaling equations (scheduler & validation) into separate time domains so they stop interacting in weird ways, and damp the third, which is stochastic, non-linear and non-deterministic (elapsed-time-based samples), then look for more logic issues.

I'm pretty convinced that there is a logic breakage there for new GPU hosts, but can't put my finger on it yet. It'll fall out during the first 2 passes I reckon.

[Edit:] I see the boinc messageboard echo in here works fine :)
Eyrie
Joined: 20 Feb 14
Posts: 47
Credit: 2,410
RAC: 0
Message 113047 - Posted: 17 Jun 2014, 19:55:13 UTC - in response to Message 113046.  

[Edit:] I see the boinc messageboard echo in here works fine :)

Beg your pardon?
jason_gee
Joined: 4 Jun 14
Posts: 109
Credit: 1,043,639
RAC: 0
Message 113048 - Posted: 17 Jun 2014, 19:56:06 UTC - in response to Message 113047.  
Last modified: 17 Jun 2014, 19:56:21 UTC

[Edit:] I see the boinc messageboard echo in here works fine :)

Beg your pardon?


Double posts seem to happen a lot (to me anyway) [not this time]
Eyrie
Joined: 20 Feb 14
Posts: 47
Credit: 2,410
RAC: 0
Message 113049 - Posted: 17 Jun 2014, 19:58:07 UTC - in response to Message 113048.  

[Edit:] I see the boinc messageboard echo in here works fine :)

Beg your pardon?


Double posts seem to happen a lot (to me anyway) [not this time]

Your resident moderator(s) will probably be pleased if you red-x them for hiding. That's tongue in cheek. For once it's not me getting those reports :D

This material is based upon work supported by the National Science Foundation (NSF) under Grant PHY-0555655 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2024 Bruce Allen for the LIGO Scientific Collaboration