[New release] BRP app v1.22 feedback thread

Nikolay

Send message
Joined: 13 Jan 12
Posts: 4
Credit: 6,500
RAC: 0
Message 111900 - Posted: 3 Mar 2012, 12:16:19 UTC

Since v1.21/v1.22 update, I have got a lot of validation errors.
Please check my stats/failed units, maybe that will help you to find a bug.
My system is BOINC 7.18 + Mac OS X 10.7.3 + AMD HD6750M 1GB
ID: 111900 · Report as offensive     Reply Quote
[AF>Le_Pommier] McRoger

Send message
Joined: 9 Feb 08
Posts: 3
Credit: 216,378
RAC: 0
Message 111901 - Posted: 3 Mar 2012, 16:35:51 UTC - in response to Message 111900.  

Same for me, my results

Boinc Manager 7.0.12
OS X Lion 10.7.3
ATI Radeon 5770 (Apple) 1 Gb
ID: 111901 · Report as offensive     Reply Quote
[AF>Le_Pommier] McRoger

Send message
Joined: 9 Feb 08
Posts: 3
Credit: 216,378
RAC: 0
Message 111902 - Posted: 4 Mar 2012, 6:59:44 UTC - in response to Message 111901.  

Same for me, my results

Boinc Manager 7.0.12
OS X Lion 10.7.3
ATI Radeon 5770 (Apple) 1 Gb


Besides, there is this error in the log:

[19:14:19][1216][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).

Might it explain why the calculation time is so huge compared to other platforms?
ID: 111902 · Report as offensive     Reply Quote
Alex

Send message
Joined: 1 Mar 05
Posts: 88
Credit: 398,734
RAC: 0
Message 111903 - Posted: 4 Mar 2012, 22:23:43 UTC - in response to Message 111895.  


Could running vbox be a problem?


Don't know but could be if vbox acquires the GPU somehow...

Oliver


Not on my machine.
vbox (test4theory, 2 CPUs) and three Albert BRP tasks (two ATI, one NVIDIA) are running fine together. Some are waiting for validation, some are validated, no errors or invalids.

win7 x64 8GB 7.0.12
ID: 111903 · Report as offensive     Reply Quote
Profile Oliver Behnke
Volunteer moderator
Project administrator
Project developer

Send message
Joined: 4 Sep 07
Posts: 130
Credit: 8,545,955
RAC: 0
Message 111904 - Posted: 5 Mar 2012, 8:27:38 UTC - in response to Message 111897.  

Hi Jord,

So, pretty please, can the fpops estimate be adjusted enough that they don't come in thinking to take 200+ hours?


I'll forward this to Bernd but he's pretty overwhelmed with more important topics right now and the BOINC devs are of little help analyzing this right now. Please bear with us.

Cheers,
Oliver
ID: 111904 · Report as offensive     Reply Quote
Profile Oliver Behnke
Volunteer moderator
Project administrator
Project developer

Send message
Joined: 4 Sep 07
Posts: 130
Credit: 8,545,955
RAC: 0
Message 111905 - Posted: 5 Mar 2012, 8:33:48 UTC - in response to Message 111902.  
Last modified: 5 Mar 2012, 8:36:29 UTC

Same for me, my results

Boinc Manager 7.0.12
OS X Lion 10.7.3
ATI Radeon 5770 (Apple) 1 Gb


Besides, there is this error in the log:

[19:14:19][1216][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).

Might it explain why the calculation time is so huge compared to other platforms?


1) Please read my intro post of this thread. The Mac version is known to produce invalid results. We already disabled it.
2) Your GPU is simply not that efficient; that's why it takes so long. It's not about the platform but the GPU.
3) The message you quote is an "INFO" message, so no, it's not the reason. It's normal when a fresh dataset is being analyzed for the first time - there can't be any checkpoint then.
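
To illustrate point 3, here is a minimal sketch of the usual checkpoint pattern. It is not the BRP app's actual code: the JSON format, the `load_checkpoint`, `save_checkpoint` and `run` helpers and the step counter are all assumptions made for illustration. Only the file name `status.cpt` and the wording of the INFO message come from the log quoted above.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or None when no checkpoint exists yet."""
    try:
        with open(path) as fh:
            return json.load(fh)
    except FileNotFoundError:
        # Expected on the first run of a fresh work unit: informational only.
        print(f"[INFO ] Checkpoint file unavailable: {os.path.basename(path)}")
        return None


def save_checkpoint(path, state):
    """Write to a temporary file first, then rename it over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)


def run(path, total_steps):
    """Resume from the checkpoint if there is one; return the step we started at."""
    state = load_checkpoint(path) or {"step": 0}
    first_step = state["step"]
    for step in range(first_step, total_steps):
        # The real computation for this step would go here (hypothetical).
        state["step"] = step + 1
        save_checkpoint(path, state)
    return first_step
```

On the first call the file does not exist, the INFO line is printed and work starts at step 0. A later call with the same path resumes where the previous one stopped, so the message by itself says nothing about run time.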

HTH,
Oliver
ID: 111905 · Report as offensive     Reply Quote
Profile Oliver Behnke
Volunteer moderator
Project administrator
Project developer

Send message
Joined: 4 Sep 07
Posts: 130
Credit: 8,545,955
RAC: 0
Message 111906 - Posted: 5 Mar 2012, 8:38:01 UTC - in response to Message 111900.  

Since v1.21/v1.22 update, I have got a lot of validation errors.
Please check my stats/failed units, maybe that will help you to find a bug.
My system is BOINC 7.18 + Mac OS X 10.7.3 + AMD HD6750M 1GB


Same as for "[AF>Le_Pommier] McRoger": no working OS X OpenCL app for the time being...

Oliver
ID: 111906 · Report as offensive     Reply Quote
Profile pragmatic prancing periodic problem child, left
Avatar

Send message
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111907 - Posted: 5 Mar 2012, 16:58:49 UTC - in response to Message 111904.  

Hi Jord,

So, pretty please, can the fpops estimate be adjusted enough that they don't come in thinking to take 200+ hours?


I'll forward this to Bernd but he's pretty overwhelmed with more important topics right now and the BOINC devs are of little help analyzing this right now. Please bear with us.

It may be quite easy.

I changed <rsc_fpops_est>300000000000000.000000</rsc_fpops_est> to <rsc_fpops_est>30000000000000.000000</rsc_fpops_est> (one zero less) and restarted BOINC. The estimated time on a new task is now 15 hours, which is more realistic than the original 208 hours.
Jord.
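
A rough sketch of the arithmetic behind this edit, assuming the simplest possible model in which the estimate is just `rsc_fpops_est` divided by the speed the client assumes for the device. The `estimated_hours` helper and `assumed_flops` are hypothetical: the speed is back-computed from the 208 hours quoted above and was not read from any client file.

```python
def estimated_hours(rsc_fpops_est, device_flops):
    """Naive runtime estimate: floating point operations divided by speed."""
    return rsc_fpops_est / device_flops / 3600.0


ORIGINAL_EST = 300000000000000.0  # value from the work unit
EDITED_EST = 30000000000000.0     # value after removing one zero

# Hypothetical device speed, chosen so the original estimate is 208 hours.
assumed_flops = ORIGINAL_EST / (208 * 3600.0)

print(round(estimated_hours(ORIGINAL_EST, assumed_flops), 1))
print(round(estimated_hours(EDITED_EST, assumed_flops), 1))
```

Under this model a tenfold smaller `rsc_fpops_est` gives about 20.8 hours rather than the 15 hours reported, so the client presumably applies further corrections that this sketch does not capture.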

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.
ID: 111907 · Report as offensive     Reply Quote
Profile Oliver Behnke
Volunteer moderator
Project administrator
Project developer

Send message
Joined: 4 Sep 07
Posts: 130
Credit: 8,545,955
RAC: 0
Message 111909 - Posted: 6 Mar 2012, 13:46:33 UTC - in response to Message 111907.  
Last modified: 7 Mar 2012, 10:47:51 UTC

It's not. The reason isn't a flaw in our runtime estimation (stored in the work unit definition) but BOINC's new automatic runtime estimation system (a.k.a. new credit system) we're also testing here on albert...

Oliver
ID: 111909 · Report as offensive     Reply Quote
Alex

Send message
Joined: 1 Mar 05
Posts: 88
Credit: 398,734
RAC: 0
Message 111910 - Posted: 7 Mar 2012, 15:24:21 UTC - in response to Message 111909.  

It's not. The reason isn't a flaw in our runtime estimation (stored in the work unit definition) but BOINC's new automatic runtime estimation system (a.k.a. new credit system) we're also testing here on albert...

Oliver


... and it looks like it's faulty?
ID: 111910 · Report as offensive     Reply Quote
skildude

Send message
Joined: 15 Nov 11
Posts: 9
Credit: 103,497
RAC: 0
Message 111911 - Posted: 7 Mar 2012, 22:44:25 UTC

I've gotten a great many invalids and inconclusives. Something's wrong and I don't think it's my GPU.
ID: 111911 · Report as offensive     Reply Quote
spingadus[MM]

Send message
Joined: 15 Oct 06
Posts: 4
Credit: 250,000
RAC: 0
Message 111912 - Posted: 8 Mar 2012, 9:01:29 UTC

Just wanted to post that I finally got some Albert tasks!

I followed Skildude's and Ageless's advice. I uninstalled the ATI drivers, then ran driversweep in safe mode. I found a lot of old Nvidia stuff as well from previous cards. I also updated to 7.0.20 just for the heck of it. The BOINC startup messages did show OpenCL for the GPU this time.

Thanks!
ID: 111912 · Report as offensive     Reply Quote
Profile Oliver Behnke
Volunteer moderator
Project administrator
Project developer

Send message
Joined: 4 Sep 07
Posts: 130
Credit: 8,545,955
RAC: 0
Message 111913 - Posted: 8 Mar 2012, 13:53:05 UTC - in response to Message 111910.  
Last modified: 8 Mar 2012, 13:54:21 UTC


... and it looks like it's faulty?


Well, let's say it's non-optimal, in particular for GPU apps. The runtime estimates are determined for every application version independently. Thus after each newly released version BOINC needs some time to gather statistics to come up with a valid/reasonable runtime estimate. Don't worry, we won't be using this new system over on einstein until it proves reliable, but we need to test it here in order to improve (fix) it at all - as soon as time permits.


Best,
Oliver
ID: 111913 · Report as offensive     Reply Quote
oz

Send message
Joined: 28 Feb 05
Posts: 10
Credit: 1,285,478
RAC: 0
Message 111914 - Posted: 8 Mar 2012, 17:19:54 UTC

Today I had a lot of atiOpenCL tasks aborted after exactly 24:14 min.

Task    Work unit  Sent                       Reported                   Status                 Run time (s)  CPU time (s)  Credit  Application
133328  43214      7 Mar 2012 | 16:58:05 UTC  8 Mar 2012 | 6:11:00 UTC   Error while computing  1,454.57      580.95        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133327  39414      7 Mar 2012 | 16:59:13 UTC  8 Mar 2012 | 6:11:00 UTC   Error while computing  1,453.71      578.01        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133326  39395      7 Mar 2012 | 16:59:13 UTC  8 Mar 2012 | 6:11:00 UTC   Error while computing  1,454.23      582.54        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133325  39432      7 Mar 2012 | 16:59:13 UTC  8 Mar 2012 | 8:30:35 UTC   Error while computing  1,453.80      582.52        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133324  39441      7 Mar 2012 | 16:59:13 UTC  8 Mar 2012 | 8:30:35 UTC   Error while computing  1,453.70      586.05        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133323  43314      7 Mar 2012 | 16:58:05 UTC  8 Mar 2012 | 6:11:00 UTC   Error while computing  1,454.00      584.04        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133321  39279      7 Mar 2012 | 16:55:50 UTC  8 Mar 2012 | 6:11:00 UTC   Error while computing  1,453.96      582.06        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133320  37403      7 Mar 2012 | 17:00:24 UTC  8 Mar 2012 | 8:30:35 UTC   Error while computing  1,454.57      614.77        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133319  36932      7 Mar 2012 | 17:00:24 UTC  8 Mar 2012 | 8:30:35 UTC   Error while computing  1,453.83      607.69        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133318  44053      7 Mar 2012 | 17:00:25 UTC  8 Mar 2012 | 10:38:10 UTC  Error while computing  1,453.83      662.62        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133317  38006      7 Mar 2012 | 17:01:34 UTC  8 Mar 2012 | 12:57:42 UTC  Error while computing  1,454.31      665.01        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133316  43437      7 Mar 2012 | 16:58:05 UTC  8 Mar 2012 | 6:11:00 UTC   Error while computing  1,454.19      582.07        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)


Bikeman's:
Task    Work unit  Sent                       Reported                   Status                 Run time (s)  CPU time (s)  Credit  Application
133387  44311      7 Mar 2012 | 17:42:19 UTC  8 Mar 2012 | 9:42:23 UTC   Error while computing  947.54        846.04        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133348  44229      7 Mar 2012 | 17:42:19 UTC  8 Mar 2012 | 9:42:23 UTC   Error while computing  946.95        840.90        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133346  44226      7 Mar 2012 | 17:43:26 UTC  8 Mar 2012 | 10:07:15 UTC  Error while computing  947.58        838.51        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133314  39550      7 Mar 2012 | 17:41:09 UTC  8 Mar 2012 | 4:31:47 UTC   Error while computing  946.85        838.29        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
133274  44093      7 Mar 2012 | 17:41:09 UTC  8 Mar 2012 | 4:31:47 UTC   Error while computing  947.08        842.30        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
130790  44395      7 Mar 2012 | 17:43:27 UTC  8 Mar 2012 | 10:43:45 UTC  Error while computing  946.86        844.07        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)
130749  44374      7 Mar 2012 | 17:42:19 UTC  8 Mar 2012 | 9:42:23 UTC   Error while computing  947.30        837.06        ---     Binary Radio Pulsar Search v1.22 (atiOpenCL)


PS:
Bikeman's tasks end earlier because of better hardware.
ID: 111914 · Report as offensive     Reply Quote
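
A small sketch for checking the "exactly 24:14 min" observation against the run times listed in the post above. The numbers are copied from the two task lists; the `mmss` and `spread` helpers are hypothetical names used only for this illustration.

```python
def mmss(seconds):
    """Format a duration in seconds as minutes:seconds, dropping fractions."""
    whole = int(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


def spread(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)


# Run times in seconds, copied from the task lists in the post.
oz_run_times = [1454.57, 1453.71, 1454.23, 1453.80, 1453.70, 1454.00,
                1453.96, 1454.57, 1453.83, 1453.83, 1454.31, 1454.19]
bikeman_run_times = [947.54, 946.95, 947.58, 946.85, 947.08, 946.86, 947.30]

print(mmss(max(oz_run_times)), spread(oz_run_times))
print(mmss(max(bikeman_run_times)), spread(bikeman_run_times))
```

On each host the run times differ by less than a second while the CPU times vary by far more, which is what one would expect if the tasks hit a fixed time limit rather than failing at some point in the computation.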
Profile Oliver Behnke
Volunteer moderator
Project administrator
Project developer

Send message
Joined: 4 Sep 07
Posts: 130
Credit: 8,545,955
RAC: 0
Message 111915 - Posted: 9 Mar 2012, 9:28:18 UTC - in response to Message 111914.  

Now that's strange. Looks like BOINC's borked runtime estimation again. Thanks for reporting...

Oliver
ID: 111915 · Report as offensive     Reply Quote
choks

Send message
Joined: 24 Feb 05
Posts: 5
Credit: 1,110,845
RAC: 0
Message 111920 - Posted: 12 Mar 2012, 8:05:58 UTC - in response to Message 111915.  

Hi,

I also had my tasks ended after 702 seconds (tasks 130935,130925,130906 for example). I had to divide <flops> by 10 in client_state.xml to allow tasks to finish.
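
A sketch of why dividing <flops> by 10 helps, assuming the client aborts a task once its elapsed time exceeds the work unit's floating point operations bound divided by the app version's <flops> value. That rule is my understanding of the client's behaviour and is not confirmed in this thread. The `max_elapsed_seconds` helper and `assumed_flops` are hypothetical; only the 702 seconds comes from the post.

```python
def max_elapsed_seconds(rsc_fpops_bound, flops):
    """Assumed abort limit: operations bound divided by the assumed speed."""
    return rsc_fpops_bound / flops


observed_limit = 702.0  # seconds after which the tasks ended (from the post)
assumed_flops = 1.0e12  # hypothetical <flops> value before the edit

# Bound implied by the observed limit under the assumed rule.
implied_bound = observed_limit * assumed_flops

# Dividing <flops> by 10 stretches the limit by the same factor.
new_limit = max_elapsed_seconds(implied_bound, assumed_flops / 10)
print(new_limit)
```

Whatever the real <flops> value is, the ratio is what matters: a tenfold smaller value would move a 702 second limit to roughly 7020 seconds.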

I just upgraded to catalyst 12.12 (7/3/2012) and the good news for Linux users is that the CPU usage was significantly reduced: 1300 seconds of CPU time per work unit, instead of about 3600 with 12.11. Average CPU usage is now about 33%.

Christophe
ID: 111920 · Report as offensive     Reply Quote
Profile Trog Dog
Avatar

Send message
Joined: 25 Nov 05
Posts: 204
Credit: 64,008
RAC: 0
Message 111924 - Posted: 13 Mar 2012, 11:01:12 UTC

All 1.22 WUs are erroring out with "max time elapsed": http://albert.phys.uwm.edu/results.php?userid=128605&offset=0&show_names=0&state=5&appid=

running on boinc 7.0.20 ati drivers 12.2
ID: 111924 · Report as offensive     Reply Quote
Profile pragmatic prancing periodic problem child, left
Avatar

Send message
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111925 - Posted: 13 Mar 2012, 20:03:40 UTC

I don't know if this affects the OpenCL in any way, but the Catalysts 12.2 do cause Anti Aliasing problems in some games. I noticed it after upgrading to these drivers, that all fine mist like graphics in Skyrim would become lots of square pixels. This can only be fixed by disabling AA and enabling FSAA instead.

ATI says it's a game problem, not their drivers, but heck if something works before and doesn't after changing the drivers, then how can that be the game's problem when that one hasn't changed literally a bit?
Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.
ID: 111925 · Report as offensive     Reply Quote
Profile pragmatic prancing periodic problem child, left
Avatar

Send message
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111926 - Posted: 13 Mar 2012, 22:45:05 UTC

And again...
The normal average run time of OpenCL tasks on my ATI HD6850 is around 6200 seconds when not interrupted.

When interrupted (by exiting BOINC, suspending BOINC or suspending the task (exclusive_app or a switch between applications)), the task run time increases to 31,000 - 36,000 seconds (!!). (task list)


Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.
ID: 111926 · Report as offensive     Reply Quote
Alex

Send message
Joined: 1 Mar 05
Posts: 88
Credit: 398,734
RAC: 0
Message 111928 - Posted: 14 Mar 2012, 21:10:49 UTC

As Tullio posted in another thread, the Albert WUs are slower than the Einstein WUs.
I checked it twice with BRP3cuda32 wu's, running all of them with the same setting (GPU 0.5) on the same hardware.

@ Jord: I don't see this behaviour on my machine, I turn it off sometimes, put tasks on hold, start them on my HD5830 and let it finish on the APU. I have no tasks running longer than 12800 sec.

ID: 111928 · Report as offensive     Reply Quote



This material is based upon work supported by the National Science Foundation (NSF) under Grant PHY-0555655 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2024 Bruce Allen for the LIGO Scientific Collaboration