[New release] BRP app v1.22 feedback thread

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111876 - Posted: 27 Feb 2012 | 15:45:46 UTC
Last modified: 2 Mar 2012 | 9:20:57 UTC

Hi,

We just released BRP4 v1.22 which adds a number of improvements.

Notes:
* OpenCL GPU memory requirements reduced by 128 MB (to ~360 MB)
* Dropped support for OpenCL 1.0 GPUs (in favor of the above, OpenCL 1.0 only GPUs like the Radeon 4xxx are too slow anyway)
* More graceful OpenCL memory error handling
* Known issue: no OpenCL support for Mac OS X for the time being (we're looking into a potential Apple bug, no pun intended)
* CUDA apps are enabled again for the time being
* Input data download volume (per work unit) can be reduced by a factor of two (soon)
* Please use the latest Catalyst driver (>=12.1) and BOINC client (>=7.0.12).

Let's try to collect your feedback on this specific release (and this one only) in this thread.


Thanks,
Oliver

Profile mickydl*
Joined: 8 Dec 11
Posts: 6
Credit: 6,000
RAC: 0
Message 111878 - Posted: 27 Feb 2012 | 20:12:35 UTC - in response to Message 111876.
Last modified: 27 Feb 2012 | 20:14:50 UTC

Just to be sure: By "latest Catalyst driver (>=12.1)" you mean "AMD Catalyst™ 11.11 - Revision number 12.1"?

I can't find any Catalyst 12.1 for Linux on the AMD site.

mickydl*

Profile Trog Dog
Joined: 25 Nov 05
Posts: 204
Credit: 64,008
RAC: 0
Message 111879 - Posted: 28 Feb 2012 | 0:29:12 UTC - in response to Message 111878.

the one available from here http://www2.ati.com/drivers/linux/amd-driver-installer-12.1-x86.x86_64.run
____________

skildude
Joined: 15 Nov 11
Posts: 9
Credit: 103,497
RAC: 0
Message 111880 - Posted: 28 Feb 2012 | 2:37:36 UTC

Should we dump WUs still using the 1.21 app? I've noticed a few inconclusives and errors on that app. I'm not seeing any problems with other GPU projects.

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111881 - Posted: 28 Feb 2012 | 9:03:25 UTC - in response to Message 111878.

Just to be sure: By "latest Catalyst driver (>=12.1)" you mean "AMD Catalyst™ 11.11 - Revision number 12.1"?

I can't find any Catalyst 12.1 for Linux on the AMD site.


That's just because of AMD's sloppy web editing. The one you found is the 12.1 driver.

Oliver

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111882 - Posted: 28 Feb 2012 | 9:05:38 UTC - in response to Message 111880.
Last modified: 28 Feb 2012 | 9:06:03 UTC

Should we dump WUs still using the 1.21 app? I've noticed a few inconclusives and errors on that app. I'm not seeing any problems with other GPU projects.


Depends on what errors you saw. If they are memory-related, you probably want to reset your project. You'll be resent the same tasks, but they'll be crunched with the latest app version, which requires less memory (OpenCL only).

Cheers,
Oliver

Profile Ageless
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111883 - Posted: 28 Feb 2012 | 23:47:36 UTC

Task comes in with <rsc_fpops_est>300000000000000.000000</rsc_fpops_est> which tells BOINC the task is going to take 210 hours and a bit, so BOINC will run it for a long time in panic mode. Can we please get a reasonable fpops estimate, one that doesn't immediately throw Albert tasks in High Priority?
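
As a rough illustration of the numbers in this post: a runtime estimate of this kind can be read as the task's estimated operation count divided by the speed the client assumes for the app version. The sketch below only inverts that relation for the figures quoted above. The function names are made up for illustration, and the real BOINC client may apply further correction factors that are not modelled here.

```python
# Sketch only: relate <rsc_fpops_est> to a runtime estimate.
# Assumption: estimated runtime = estimated operations / assumed speed (FLOPS).

SECONDS_PER_HOUR = 3600.0


def estimated_hours(rsc_fpops_est: float, flops: float) -> float:
    """Runtime estimate in hours for a given operation count and speed."""
    return rsc_fpops_est / flops / SECONDS_PER_HOUR


def implied_flops(rsc_fpops_est: float, hours: float) -> float:
    """Speed the client would have to assume to arrive at `hours`."""
    return rsc_fpops_est / (hours * SECONDS_PER_HOUR)


RSC_FPOPS_EST = 3e14      # value quoted in the post
REPORTED_HOURS = 210.0    # "210 hours and a bit"

speed = implied_flops(RSC_FPOPS_EST, REPORTED_HOURS)
print(f"implied speed: {speed / 1e9:.2f} GFLOPS")
print(f"round trip: {estimated_hours(RSC_FPOPS_EST, speed):.1f} h")
```

If the implied speed looks implausibly low for a GPU, the assumed speed rather than the operation count is one possible culprit, but that is only a reading of the arithmetic, not a diagnosis.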
____________
Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.

spingadus[MM]
Joined: 15 Oct 06
Posts: 4
Credit: 250,000
RAC: 0
Message 111884 - Posted: 29 Feb 2012 | 2:18:48 UTC
Last modified: 29 Feb 2012 | 2:19:26 UTC

Are there actually tasks available? I'm not receiving anything for my GPU.

HD 6970
BOINC 7.0.18
ATI Driver 12.1


16459 Albert@Home 2/28/2012 6:15:47 PM update requested by user
16460 Albert@Home 2/28/2012 6:15:48 PM Sending scheduler request: Requested by user.
16461 Albert@Home 2/28/2012 6:15:48 PM Requesting new tasks for ATI
16462 Albert@Home 2/28/2012 6:15:51 PM Scheduler request completed: got 0 new tasks
16463 Albert@Home 2/28/2012 6:15:51 PM No tasks sent

I've even suspended all tasks before trying to update.

skildude
Joined: 15 Nov 11
Posts: 9
Credit: 103,497
RAC: 0
Message 111885 - Posted: 29 Feb 2012 | 2:22:12 UTC - in response to Message 111884.

You have to leave a CPU core free; otherwise you're unlikely to get work.

spingadus[MM]
Joined: 15 Oct 06
Posts: 4
Credit: 250,000
RAC: 0
Message 111887 - Posted: 29 Feb 2012 | 3:06:38 UTC - in response to Message 111885.

Freed up 2 threads out of 8 and still nothing. Still no work. I'm currently running Moo! so I suspended the project as well. No tasks.

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111888 - Posted: 29 Feb 2012 | 9:48:09 UTC - in response to Message 111884.
Last modified: 29 Feb 2012 | 9:54:46 UTC

Are there actually tasks available? I'm not receiving anything for my GPU.


Yes, there are, and your config looks fine so far. Please have a look at the BOINC event log: did BOINC recognize your AMD GPU as an OpenCL device? According to our logs that doesn't seem to be the case. You might need to reinstall the Catalyst driver. Also, remember to start the X server and make sure that BOINC can access the X display. If "clinfo" exists on your system, you may use it to verify that your GPU is properly enumerated by OpenCL.

Cheers,
Oliver

spingadus[MM]
Send message
Joined: 15 Oct 06
Posts: 4
Credit: 250,000
RAC: 0
Message 111889 - Posted: 29 Feb 2012 | 10:15:52 UTC - in response to Message 111888.

No X windows here, I'm running win7.

Is there somewhere in the logs that will show me if the card is enumerated correctly?

I re-installed the 12.1 driver yesterday and did a custom install instead of express. Everything appeared to be checked.

It's 2 am here and I'm sleep typing, so I'll check when I wake up. :P

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111890 - Posted: 29 Feb 2012 | 10:50:17 UTC - in response to Message 111889.

No X windows here, I'm running win7


Oops, sorry :-) When you start the BOINC client it'll list all GPUs in the event log (advanced view). For AMD/ATI devices it might talk about CAL and OpenCL - we're interested only in the latter. You should find the list of GPUs more or less at the top of the event log, right before the registered projects are mentioned.
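
For anyone who saves the event log to a text file, the check described above can be sketched as a simple filter. This is a minimal sketch, not a BOINC tool: the sample lines are hypothetical placeholders rather than real client output, and the only assumption made about the log is that the relevant lines mention "OpenCL" somewhere.

```python
# Sketch only: pick out the lines of a saved event log that mention OpenCL.
# The sample text is a hypothetical placeholder, not real BOINC output.

def opencl_lines(event_log_text: str) -> list[str]:
    """Return the log lines that mention OpenCL (case-insensitive)."""
    return [
        line
        for line in event_log_text.splitlines()
        if "opencl" in line.lower()
    ]


SAMPLE_LOG = """\
(hypothetical) client start-up line
(hypothetical) ATI GPU 0: CAL device line
(hypothetical) OpenCL: ATI GPU 0: device line
(hypothetical) project line for Albert@Home
"""

matches = opencl_lines(SAMPLE_LOG)
if matches:
    for line in matches:
        print(line)
else:
    print("No OpenCL device lines found; the driver may need reinstalling.")
```

An empty result would correspond to the situation described earlier in the thread, where the GPU shows up for CAL but not for OpenCL.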

Oliver

Profile mickydl*
Joined: 8 Dec 11
Posts: 6
Credit: 6,000
RAC: 0
Message 111891 - Posted: 29 Feb 2012 | 14:25:12 UTC

My first OpenCL WU with the v1.22 app validated against a CUDA result. No problem (see this Task). However, I believe that I am using an earlier version of Catalyst (< 12.1). I'll have to check that when I'm home from work.

mickydl*

oz
Joined: 28 Feb 05
Posts: 10
Credit: 1,060,681
RAC: 0
Message 111892 - Posted: 1 Mar 2012 | 8:01:51 UTC

This happens in 7.0.18:
The scheduler requests new jobs, and 3 seconds later it starts S6LV1. I can't imagine that the download of the task has completed by then. Normally we see something like this


Started download of p2030.20111110.G39.19-00.79.N.b2s0g0.00000_3648.binary
Finished download of p2030.20111110.G39.19-00.79.N.b2s0g0.00000_3648.binary

first.


01-Mar-2012 08:39:39 [Albert@Home] Sending scheduler request: To fetch work.
01-Mar-2012 08:39:39 [Albert@Home] Reporting 4 completed tasks, requesting new tasks for CPU
01-Mar-2012 08:39:48 [Albert@Home] Scheduler request completed: got 1 new tasks
01-Mar-2012 08:39:48 [Albert@Home] Resent lost task h1_0059.95_S6GC1__39_S6LV1A_1
01-Mar-2012 08:39:51 [Albert@Home] Starting task h1_0059.95_S6GC1__39_S6LV1A_1 using einstein_S6LV1 version 110 (SSE2) in slot 10
01-Mar-2012 08:39:52 [Albert@Home] Computation for task h1_0059.95_S6GC1__39_S6LV1A_1 finished
01-Mar-2012 08:39:52 [Albert@Home] Output file h1_0059.95_S6GC1__39_S6LV1A_1_0 for task h1_0059.95_S6GC1__39_S6LV1A_1 absent
01-Mar-2012 08:41:39 [Albert@Home] Sending scheduler request: To fetch work.
01-Mar-2012 08:41:39 [Albert@Home] Reporting 1 completed tasks, requesting new tasks for CPU
01-Mar-2012 08:41:41 [Albert@Home] Scheduler request completed: got 4 new tasks
01-Mar-2012 08:41:43 [Albert@Home] Starting task h1_0059.95_S6GC1__35_S6LV1A_1 using einstein_S6LV1 version 110 (SSE2) in slot 10
01-Mar-2012 08:41:43 [Albert@Home] Starting task h1_0059.95_S6GC1__33_S6LV1A_1 using einstein_S6LV1 version 110 (SSE2) in slot 11
01-Mar-2012 08:41:43 [Albert@Home] Starting task h1_0059.95_S6GC1__34_S6LV1A_1 using einstein_S6LV1 version 110 (SSE2) in slot 12
01-Mar-2012 08:41:44 [Albert@Home] Computation for task h1_0059.95_S6GC1__35_S6LV1A_1 finished
01-Mar-2012 08:41:44 [Albert@Home] Output file h1_0059.95_S6GC1__35_S6LV1A_1_0 for task h1_0059.95_S6GC1__35_S6LV1A_1 absent
01-Mar-2012 08:41:44 [Albert@Home] Starting task h1_0059.95_S6GC1__36_S6LV1A_1 using einstein_S6LV1 version 110 (SSE2) in slot 10
01-Mar-2012 08:41:45 [Albert@Home] Computation for task h1_0059.95_S6GC1__33_S6LV1A_1 finished
01-Mar-2012 08:41:45 [Albert@Home] Output file h1_0059.95_S6GC1__33_S6LV1A_1_0 for task h1_0059.95_S6GC1__33_S6LV1A_1 absent
01-Mar-2012 08:41:46 [Albert@Home] Computation for task h1_0059.95_S6GC1__34_S6LV1A_1 finished
01-Mar-2012 08:41:46 [Albert@Home] Output file h1_0059.95_S6GC1__34_S6LV1A_1_0 for task h1_0059.95_S6GC1__34_S6LV1A_1 absent
01-Mar-2012 08:41:47 [Albert@Home] Computation for task h1_0059.95_S6GC1__36_S6LV1A_1 finished
01-Mar-2012 08:41:47 [Albert@Home] Output file h1_0059.95_S6GC1__36_S6LV1A_1_0 for task h1_0059.95_S6GC1__36_S6LV1A_1 absent


Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111893 - Posted: 1 Mar 2012 | 9:40:42 UTC - in response to Message 111892.


The scheduler requests new jobs, and 3 seconds later it starts S6LV1.


Well, S6LV1 tasks are not the same as BRP tasks. S6LV1 tasks re-use data already present on your host while BRP data is only used once, for a single WU. Looking at the error output of the failed S6LV1 task should tell us what happened.

Please open another thread for that problem if it persists. This thread is meant to discuss BRP v1.22 only.

Cheers,
Oliver

skildude
Joined: 15 Nov 11
Posts: 9
Credit: 103,497
RAC: 0
Message 111894 - Posted: 2 Mar 2012 | 5:13:13 UTC - in response to Message 111890.
Last modified: 2 Mar 2012 | 5:14:45 UTC

No X windows here, I'm running win7


Oops, sorry :-) When you start the BOINC client it'll list all GPUs in the event log (advanced view). For AMD/ATI devices it might talk about CAL and OpenCL - we're interested only in the latter. You should find the list of GPUs more or less at the top of the event log, right before the registered projects are mentioned.

Oliver

He may need to uninstall the drivers, run driver sweep, then reinstall the 12.1 drivers. Leaving old drivers wreaks havoc on the OpenCL apps at Seti. Probably the same here. His Card is recognized as a 6900 series and he is running the 7.0.18 BOINC so that isn't a problem.

Could running vbox be a problem?

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111895 - Posted: 2 Mar 2012 | 9:22:51 UTC - in response to Message 111894.
Last modified: 2 Mar 2012 | 9:23:37 UTC


Could running vbox be a problem?


Don't know but could be if vbox acquires the GPU somehow...

Oliver

Profile Ageless
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111896 - Posted: 2 Mar 2012 | 15:15:01 UTC - in response to Message 111895.

No it doesn't. No GPUs are being used on T4T or the Vboxwrapper test project (the only two projects at this time where VBox is being used), other than for showing graphics of sorts. And then these projects require Vbox 4.1.4 or higher, as far as I know.

I'll go with driver corruption as well. It certainly never hurts to completely clean out previous drivers and then reinstall any later as new.
____________
Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.

Profile Ageless
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111897 - Posted: 3 Mar 2012 | 1:56:53 UTC - in response to Message 111883.

Tasks still come in expecting to run for 205 hours.
So I still have Albert tasks running in a panic.

03/03/2012 02:47:11 | Albert@Home | [rr_sim] Result p2030.20111110.G39.19-00.79.N.b3s0g0.00100_864_3 projected to miss deadline.
03/03/2012 02:47:11 | Albert@Home | [rr_sim] Project has 1 projected ATI deadline misses
03/03/2012 02:47:31 | Albert@Home | [rr_sim] p2030.20111110.G39.19-00.79.N.b3s0g0.00100_864_3 misses deadline by 614511.77

<time_stats>
<on_frac>0.939516</on_frac>
<connected_frac>0.783900</connected_frac>
<active_frac>0.392607</active_frac>
<gpu_active_frac>0.392447</gpu_active_frac>
<last_update>1330725382.604116</last_update>
</time_stats>


Of course, that's because BOINC treats the estimated 205 hours as 205h / (39 / 100) = 525h of wall-clock time (almost 22 days), since the client is only active about 39% of the time. That's a tad difficult to do in 14 days, so the task will run from start to finish in high priority. And as we can see here, DCF is no longer really used with BOINC 7. Not that it matters: DCF is 7.5, way too high to use reliably.
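
The arithmetic in this post can be sketched as follows. This is an illustration under the same simplification the post uses (estimate divided by the active fraction); how the client's scheduling simulation actually combines the time_stats fractions is not stated in the thread, and the function name is made up.

```python
# Sketch only: scale a runtime estimate by the fraction of time the
# client is active, as done by hand in the post above.

def wall_clock_hours(estimated_hours: float, active_frac: float) -> float:
    """Wall-clock hours needed if work only progresses `active_frac` of the time."""
    return estimated_hours / active_frac


ESTIMATED_HOURS = 205.0   # estimate quoted in the post
ACTIVE_FRAC = 0.392607    # <active_frac> from the time_stats block
DEADLINE_DAYS = 14.0      # deadline mentioned in the post

hours = wall_clock_hours(ESTIMATED_HOURS, ACTIVE_FRAC)
days = hours / 24.0

print(f"wall-clock estimate: {hours:.0f} h ({days:.1f} days)")
print(f"misses a {DEADLINE_DAYS:.0f}-day deadline: {days > DEADLINE_DAYS}")
```

Using the unrounded active_frac gives a slightly smaller figure than the rounded 39% used in the post, but the conclusion is the same.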

So, pretty please, can the fpops estimate be adjusted enough that they don't come in thinking to take 200+ hours?
____________
Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.

Nikolay
Joined: 13 Jan 12
Posts: 4
Credit: 6,500
RAC: 0
Message 111900 - Posted: 3 Mar 2012 | 12:16:19 UTC

Since v1.21/v1.22 update, I have got a lot of validation errors.
Please check my stats/failed units, maybe that will help you to find a bug.
My system is BOINC 7.18 + Mac OS X 10.7.3 + AMD HD6750M 1GB

[AF>Le_Pommier] McRoger
Joined: 9 Feb 08
Posts: 3
Credit: 179,128
RAC: 2,176
Message 111901 - Posted: 3 Mar 2012 | 16:35:51 UTC - in response to Message 111900.

Same for me, my results

Boinc Manager 7.0.12
OS X Lion 10.7.3
ATI Radeon 5770 (Apple) 1 Gb

[AF>Le_Pommier] McRoger
Joined: 9 Feb 08
Posts: 3
Credit: 179,128
RAC: 2,176
Message 111902 - Posted: 4 Mar 2012 | 6:59:44 UTC - in response to Message 111901.

Same for me, my results

Boinc Manager 7.0.12
OS X Lion 10.7.3
ATI Radeon 5770 (Apple) 1 Gb


Besides there is this error in the log

[19:14:19][1216][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).

Might it explain why calculation time is so huge compared to other platforms ?

Alex
Joined: 1 Mar 05
Posts: 60
Credit: 315,126
RAC: 298
Message 111903 - Posted: 4 Mar 2012 | 22:23:43 UTC - in response to Message 111895.


Could running vbox be a problem?


Don't know but could be if vbox acquires the GPU somehow...

Oliver


Not on my machine.
vbox (test4theory, 2 cpu's) and three albert BRP's (2 ati, one nvidia) are running fine together. Some are waiting for validation, some are validated, no errors or invalids.

win7 x64 8GB 7.0.12
____________

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111904 - Posted: 5 Mar 2012 | 8:27:38 UTC - in response to Message 111897.

Hi Jord,

So, pretty please, can the fpops estimate be adjusted enough that they don't come in thinking to take 200+ hours?


I'll forward this to Bernd but he's pretty overwhelmed with more important topics right now and the BOINC devs are of little help analyzing this right now. Please bear with us.

Cheers,
Oliver

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111905 - Posted: 5 Mar 2012 | 8:33:48 UTC - in response to Message 111902.
Last modified: 5 Mar 2012 | 8:36:29 UTC

Same for me, my results

Boinc Manager 7.0.12
OS X Lion 10.7.3
ATI Radeon 5770 (Apple) 1 Gb


Besides there is this error in the log

[19:14:19][1216][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).

Might it explain why calculation time is so huge compared to other platforms ?


1) Please read my intro post of this thread. The Mac version is known to produce invalid results. We already disabled it.
2) Your GPU is simply not that efficient; that's why it takes so long. It's not about the platform but the GPU.
3) The message you quote is an "INFO" message, so no, it's not the reason. It's normal when a fresh dataset is being analyzed for the first time - there can't be any checkpoint then.

HTH,
Oliver

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111906 - Posted: 5 Mar 2012 | 8:38:01 UTC - in response to Message 111900.

Since v1.21/v1.22 update, I have got a lot of validation errors.
Please check my stats/failed units, maybe that will help you to find a bug.
My system is BOINC 7.18 + Mac OS X 10.7.3 + AMD HD6750M 1GB


Same as for "[AF>Le_Pommier] McRoger": no working OS X OpenCL app for the time being...

Oliver

Profile Ageless
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111907 - Posted: 5 Mar 2012 | 16:58:49 UTC - in response to Message 111904.

Hi Jord,

So, pretty please, can the fpops estimate be adjusted enough that they don't come in thinking to take 200+ hours?


I'll forward this to Bernd but he's pretty overwhelmed with more important topics right now and the BOINC devs are of little help analyzing this right now. Please bear with us.

It may be quite easy.

I changed <rsc_fpops_est>300000000000000.000000</rsc_fpops_est> to <rsc_fpops_est>30000000000000.000000</rsc_fpops_est> (one zero less) and restarted BOINC. Estimated time on a new task is now 15 hours, which is more in line than the original 208 hours.
____________
Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111909 - Posted: 6 Mar 2012 | 13:46:33 UTC - in response to Message 111907.
Last modified: 7 Mar 2012 | 10:47:51 UTC

It's not. The reason isn't a flaw in our runtime estimation (stored in the work unit definition) but BOINC's new automatic runtime estimation system (a.k.a. new credit system) we're also testing here on albert...

Oliver

Alex
Joined: 1 Mar 05
Posts: 60
Credit: 315,126
RAC: 298
Message 111910 - Posted: 7 Mar 2012 | 15:24:21 UTC - in response to Message 111909.

It's not. The reason isn't a flaw in our runtime estimation (stored in the work unit definition) but BOINC's new automatic runtime estimation system (a.k.a. new credit system) we're also testing here on albert...

Oliver


... and it looks like it's faulty?
____________

skildude
Joined: 15 Nov 11
Posts: 9
Credit: 103,497
RAC: 0
Message 111911 - Posted: 7 Mar 2012 | 22:44:25 UTC

I've gotten a great many invalids and inconclusives. Something's wrong and I don't think it's my GPU.

spingadus[MM]
Joined: 15 Oct 06
Posts: 4
Credit: 250,000
RAC: 0
Message 111912 - Posted: 8 Mar 2012 | 9:01:29 UTC

Just wanted to post that I finally got some Albert tasks!

I followed Skildude and Ageless's advice. I uninstalled the ATI drivers, then ran driversweep in safe mode. I found a lot of old Nvidia stuff as well from previous cards. I also updated to 7.0.20 just for the heck of it. The Boinc startup messages did show opencl for the gpu this time.

Thanks!

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111913 - Posted: 8 Mar 2012 | 13:53:05 UTC - in response to Message 111910.
Last modified: 8 Mar 2012 | 13:54:21 UTC


... and it looks like it's faulty?


Well, let's say it's non-optimal, in particular for GPU apps. The runtime estimates are determined for every application version independently. Thus after each newly released version BOINC needs some time to gather statistics to come up with a valid/reasonable runtime estimate. Don't worry, we won't be using this new system over on einstein until it proves reliable, but we need to test it here in order to improve (fix) it at all - as soon as time permits.


Best,
Oliver

oz
Joined: 28 Feb 05
Posts: 10
Credit: 1,060,681
RAC: 0
Message 111914 - Posted: 8 Mar 2012 | 17:19:54 UTC

Today I had a lot of atiOpenCL tasks aborted after exactly 24:14 min.


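Columns: task ID, work unit ID, sent, time reported, status, run time (sec), CPU time (sec), credit, application
133328 43214 7 Mar 2012 | 16:58:05 UTC 8 Mar 2012 | 6:11:00 UTC Error while computing 1,454.57 580.95 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)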
133327 39414 7 Mar 2012 | 16:59:13 UTC 8 Mar 2012 | 6:11:00 UTC Error while computing 1,453.71 578.01 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133326 39395 7 Mar 2012 | 16:59:13 UTC 8 Mar 2012 | 6:11:00 UTC Error while computing 1,454.23 582.54 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133325 39432 7 Mar 2012 | 16:59:13 UTC 8 Mar 2012 | 8:30:35 UTC Error while computing 1,453.80 582.52 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133324 39441 7 Mar 2012 | 16:59:13 UTC 8 Mar 2012 | 8:30:35 UTC Error while computing 1,453.70 586.05 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133323 43314 7 Mar 2012 | 16:58:05 UTC 8 Mar 2012 | 6:11:00 UTC Error while computing 1,454.00 584.04 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133321 39279 7 Mar 2012 | 16:55:50 UTC 8 Mar 2012 | 6:11:00 UTC Error while computing 1,453.96 582.06 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133320 37403 7 Mar 2012 | 17:00:24 UTC 8 Mar 2012 | 8:30:35 UTC Error while computing 1,454.57 614.77 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133319 36932 7 Mar 2012 | 17:00:24 UTC 8 Mar 2012 | 8:30:35 UTC Error while computing 1,453.83 607.69 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133318 44053 7 Mar 2012 | 17:00:25 UTC 8 Mar 2012 | 10:38:10 UTC Error while computing 1,453.83 662.62 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133317 38006 7 Mar 2012 | 17:01:34 UTC 8 Mar 2012 | 12:57:42 UTC Error while computing 1,454.31 665.01 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133316 43437 7 Mar 2012 | 16:58:05 UTC 8 Mar 2012 | 6:11:00 UTC Error while computing 1,454.19 582.07 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)


Bikeman's:
133387 44311 7 Mar 2012 | 17:42:19 UTC 8 Mar 2012 | 9:42:23 UTC Error while computing 947.54 846.04 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133348 44229 7 Mar 2012 | 17:42:19 UTC 8 Mar 2012 | 9:42:23 UTC Error while computing 946.95 840.90 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133346 44226 7 Mar 2012 | 17:43:26 UTC 8 Mar 2012 | 10:07:15 UTC Error while computing 947.58 838.51 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133314 39550 7 Mar 2012 | 17:41:09 UTC 8 Mar 2012 | 4:31:47 UTC Error while computing 946.85 838.29 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
133274 44093 7 Mar 2012 | 17:41:09 UTC 8 Mar 2012 | 4:31:47 UTC Error while computing 947.08 842.30 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
130790 44395 7 Mar 2012 | 17:43:27 UTC 8 Mar 2012 | 10:43:45 UTC Error while computing 946.86 844.07 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)
130749 44374 7 Mar 2012 | 17:42:19 UTC 8 Mar 2012 | 9:42:23 UTC Error while computing 947.30 837.06 --- Binary Radio Pulsar Search v1.22 (atiOpenCL)


PS.:
Bikeman's tasks end earlier due to better hardware.

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111915 - Posted: 9 Mar 2012 | 9:28:18 UTC - in response to Message 111914.

Now that's strange. Looks like BOINC's borked runtime estimation again. Thanks for reporting...

Oliver

choks
Joined: 24 Feb 05
Posts: 5
Credit: 345,604
RAC: 139
Message 111920 - Posted: 12 Mar 2012 | 8:05:58 UTC - in response to Message 111915.

Hi,

I also had my tasks ended after 702 seconds (tasks 130935,130925,130906 for example). I had to divide <flops> by 10 in client_state.xml to allow tasks to finish.

I just upgraded to Catalyst 12.12 (7/3/2012) and the good news for Linux users is that the CPU usage is significantly reduced: about 1300 seconds of CPU time per work unit, instead of about 3600 with 12.11. Average CPU usage is now about 33%.

Christophe
____________

Profile Trog Dog
Joined: 25 Nov 05
Posts: 204
Credit: 64,008
RAC: 0
Message 111924 - Posted: 13 Mar 2012 | 11:01:12 UTC

All 1.22 wu's are erroring out with max time elapsed http://albert.phys.uwm.edu/results.php?userid=128605&offset=0&show_names=0&state=5&appid=

running on boinc 7.0.20 ati drivers 12.2
____________

Profile Ageless
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111925 - Posted: 13 Mar 2012 | 20:03:40 UTC

I don't know if this affects the OpenCL in any way, but the Catalysts 12.2 do cause Anti Aliasing problems in some games. I noticed it after upgrading to these drivers, that all fine mist like graphics in Skyrim would become lots of square pixels. This can only be fixed by disabling AA and enabling FSAA instead.

ATI says it's a game problem, not their drivers, but heck if something works before and doesn't after changing the drivers, then how can that be the game's problem when that one hasn't changed literally a bit?
____________
Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.

Profile Ageless
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111926 - Posted: 13 Mar 2012 | 22:45:05 UTC

And again...
Normal average run time of OpenCL tasks on my ATI HD6850 is around 6200 seconds when not interrupted.

When interrupted (due to exit BOINC, suspend BOINC or suspend task (exclusive_app or switch between applications)), task run time length increases to 31,000 - 36,000 seconds (!!). (task list)


____________
Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.

Alex
Joined: 1 Mar 05
Posts: 60
Credit: 315,126
RAC: 298
Message 111928 - Posted: 14 Mar 2012 | 21:10:49 UTC

As Tullio posted in another thread, the Albert WUs are slower than the Einstein WUs.
I checked it twice with BRP3cuda32 wu's, running all of them with the same setting (GPU 0.5) on the same hardware.

@ Jord: I don't see this behaviour on my machine, I turn it off sometimes, put tasks on hold, start them on my HD5830 and let it finish on the APU. I have no tasks running longer than 12800 sec.

____________

oz
Joined: 28 Feb 05
Posts: 10
Credit: 1,060,681
RAC: 0
Message 111929 - Posted: 15 Mar 2012 | 8:43:00 UTC - in response to Message 111920.

Hi,
what exactly did you do in client_state.xml? When I change the <flops> entry in the atiOpenCL application section, it is automatically reset by the application after a while and the tasks end before finishing.

<app_version>
    <app_name>einsteinbinary_BRP4</app_name>
    <version_num>122</version_num>
    <platform>i686-pc-linux-gnu</platform>
    <avg_ncpus>0.150000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <flops>4127438621653.708496</flops>
    <plan_class>atiOpenCL</plan_class>
    <api_version>7.0.18</api_version>
    <file_ref>
        <file_name>einsteinbinary_BRP4_1.22_i686-pc-linux-gnu__atiOpenCL</file_name>
        <main_program/>
    </file_ref>
    <file_ref>
        <file_name>einsteinbinary_BRP4_1.00_graphics_i686-pc-linux-gnu</file_name>
        <open_name>graphics_app</open_name>
    </file_ref>
    <coproc>
        <type>ATI</type>
        <count>1.000000</count>
    </coproc>
    <gpu_ram>377487360.000000</gpu_ram>
</app_version>
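
To read the two interesting numbers out of a fragment like the one above, something along these lines should work. It is a sketch that parses a trimmed copy of the fragment from a string, using Python's standard xml.etree.ElementTree module; it does not touch a real client_state.xml, and editing that file while the client is running is not covered here.

```python
# Sketch only: read <flops> and <gpu_ram> from an <app_version> fragment.
# The fragment is a trimmed copy of the one posted above.
import xml.etree.ElementTree as ET

APP_VERSION_XML = """
<app_version>
    <app_name>einsteinbinary_BRP4</app_name>
    <version_num>122</version_num>
    <flops>4127438621653.708496</flops>
    <plan_class>atiOpenCL</plan_class>
    <gpu_ram>377487360.000000</gpu_ram>
</app_version>
"""

app_version = ET.fromstring(APP_VERSION_XML)

flops = float(app_version.findtext("flops"))
gpu_ram_bytes = float(app_version.findtext("gpu_ram"))
gpu_ram_mib = gpu_ram_bytes / (1024 * 1024)

print(f"plan class: {app_version.findtext('plan_class')}")
print(f"assumed speed: {flops / 1e12:.3f} TFLOPS")
print(f"GPU memory requirement: {gpu_ram_mib:.0f} MiB")
```

The memory figure works out to the "~360 MB" mentioned in the release notes at the top of the thread.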

choks
Joined: 24 Feb 05
Posts: 5
Credit: 345,604
RAC: 139
Message 111930 - Posted: 16 Mar 2012 | 14:02:23 UTC - in response to Message 111929.

Hi

Once the jobs had been downloaded and a couple had been aborted, I disabled requesting new jobs, changed <flops> and waited for the remaining jobs to complete.

It looks like this is no longer required, because the jobs I got today are processing OK, so it seems fixed.
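
Why dividing <flops> by 10 helps can be sketched with numbers already posted in this thread: the <rsc_fpops_est> value quoted by Jord, the <flops> value from oz's app_version, and the roughly 1,454-second run time of oz's aborted tasks. Two things are assumptions here: that oz's tasks carried the same <rsc_fpops_est>, and that the client's time limit scales with operations divided by <flops> in the same way the estimate does. The thread confirms neither.

```python
# Sketch only: effect of the <flops> value on a runtime estimate.
# All inputs are numbers posted in this thread; combining them is an assumption.

RSC_FPOPS_EST = 3e14                 # from Jord's post
FLOPS = 4127438621653.708496         # from oz's <app_version>
OBSERVED_ABORT_SECONDS = 1454.0      # approx. run time of oz's aborted tasks


def estimated_seconds(fpops_est: float, flops: float) -> float:
    """Estimated runtime in seconds: operations divided by assumed speed."""
    return fpops_est / flops


base = estimated_seconds(RSC_FPOPS_EST, FLOPS)
reduced = estimated_seconds(RSC_FPOPS_EST, FLOPS / 10.0)
ratio = OBSERVED_ABORT_SECONDS / base

print(f"estimate with posted <flops>:     {base:.1f} s")
print(f"estimate with <flops> / 10:       {reduced:.1f} s")
print(f"observed abort time / estimate:   {ratio:.1f}")
```

An inflated <flops> value shrinks any limit derived from it, so lowering it by a factor of ten gives tasks ten times as long. The abort time landing at a near-round multiple of the estimate is consistent with a fixed safety factor on top of the estimate, but that is an inference, not something stated in the thread.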

Christophe
____________

Profile Ageless
Joined: 26 Jan 05
Posts: 1639
Credit: 70,000
RAC: 0
Message 111933 - Posted: 18 Mar 2012 | 1:07:17 UTC - in response to Message 111926.

WUID 47277, run time: 29,286.40 seconds.
WUID 46805, run time: 39,079.57 seconds.
WUID 46559, run time: 4,538.00 seconds.

47277 has this:

[00:38:31][368][INFO ] Checkpoint committed!
Activated exception handling...
[02:14:57]

46805 has this:
[04:51:58][3600][INFO ] Checkpoint committed!
Activated exception handling...
[21:55:34]

And from there on in, they slow down. 46559 ran from start to finish without exception handling (aka a break), and as such it ran in 'normal' time.

Now, the troubling thing is that it doesn't do this with all tasks. WUID 47791 has a run time of 6,306.80 seconds, yet it also has this:
[00:47:25][4336][INFO ] Checkpoint committed!
Activated exception handling...
[00:48:18]

That was a BOINC exit & restart. The other two were stops of the task itself while BOINC continued running.
____________
Jord.

BOINC FAQ Service

They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake.

Christoph
Joined: 25 Aug 05
Posts: 48
Credit: 148,613
RAC: 11
Message 111934 - Posted: 19 Mar 2012 | 19:45:34 UTC
Last modified: 19 Mar 2012 | 19:47:19 UTC

I have an invalid.
http://albert.phys.uwm.edu/result.php?resultid=140891

Oh, and the long runtimes which Ageless has are normal to me. I will set NNT to other projects to see if the times go down when my tasks run in one go.
____________
Christoph

Profile Trog Dog
Joined: 25 Nov 05
Posts: 204
Credit: 64,008
RAC: 0
Message 111937 - Posted: 24 Mar 2012 | 20:00:56 UTC - in response to Message 111924.

All 1.22 wu's are erroring out with max time elapsed http://albert.phys.uwm.edu/results.php?userid=128605&offset=0&show_names=0&state=5&appid=

running on boinc 7.0.20 ati drivers 12.2


Looks like it was the client at fault, upgraded to 7.0.23 & I have a wu in progress
____________

Infusioned
Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 111958 - Posted: 15 Apr 2012 | 22:12:48 UTC - in response to Message 111937.

I have been away for a bit due to my motherboard dying, and when I got back up and running with the rebuild I was waiting for 7.0.25 to go live so I could run Milkyway with Albert without using a beta version for a live project.


So, poking through my WU times, I am hovering at ~ 5900 GPU seconds, and ~ 3,200 CPU seconds per WU.

AMD Phenom II x4 975 (couldn't find an 1100T for non-ripoff prices and am waiting for Piledriver [not happy with Bulldozer])
AMD HD 6950
8G DDR3 1600
Win 7 x64
Boinc 7.0.25

This WU vs. an i7-2600K Sandy Bridge/GTX 550 Ti shows the 2600K coming in at 1/3 the time of my CPU. However, AnandTech Bench does not show the 2600K as 66% faster. Also, Wikipedia shows the AMD HD 6950 SP GFLOPS at 2253 and the NVIDIA GTX 550 Ti SP GFLOPS at 691.2, but the 550 Ti time is 2/3 of mine.

So, my question is, what gives? Is the OpenCL app that unoptimized compared to the CUDA app?
____________

robertmiles
Joined: 16 Nov 11
Posts: 17
Credit: 601,915
RAC: 2,153
Message 111959 - Posted: 16 Apr 2012 | 3:39:42 UTC - in response to Message 111958.
Last modified: 16 Apr 2012 | 3:42:21 UTC

I've seen a message elsewhere saying that OpenCL workunits tend to need much more CPU use than running similar workunits using CUDA. This implies that slow CPUs will slow down OpenCL workunits much more than they slow down CUDA workunits.

Infusioned
Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 111960 - Posted: 16 Apr 2012 | 13:17:59 UTC - in response to Message 111959.
Last modified: 16 Apr 2012 | 13:19:36 UTC

This implies that slow CPUs will slow down OpenCL workunits much more than they slow down CUDA workunits.


Understood. However, that's why I checked Anandtech's benchmarks to see just how much faster the 2600k was than my cpu. The benchmarks do not reflect a 66% performance difference so there is something else going on.

Also, unless I read the charts wrong, comparing the GFLOPS between the two video cards, theoretically the 6950 should smoke the 550Ti in SP output (2253 vs. 691.2).

So, back to my original question, is the OpenCL app that unoptimized compared to the CUDA app?
____________

Infusioned
Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 111961 - Posted: 20 Apr 2012 | 14:26:02 UTC - in response to Message 111960.

Here is a WU from Seti@Home Beta's OpenCL application:

http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=3973426

I am 58327 and someone with a GTX 590 GPU, Intel 2600k CPU, Cuda OpenCL client is 56759.

My CPU seconds are 1463 and theirs are 2061.
My GPU seconds are 3244 and theirs are 2198.


My CPU time is actually lower (about 71% of the 2600k's), but my GPU time is ~150% of the GTX 590's (which, again, is curious given the GFLOP numbers).
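
The percentages can be recomputed directly from the CPU and GPU seconds quoted above. This is just a small Python sketch of the arithmetic; `ratio_percent` is a helper name made up for illustration.

```python
def ratio_percent(mine: float, theirs: float) -> float:
    """Return my time as a percentage of the other host's time."""
    return 100.0 * mine / theirs

# CPU and GPU seconds quoted in the post (host 58327 vs. host 56759).
cpu_pct = ratio_percent(1463, 2061)
gpu_pct = ratio_percent(3244, 2198)

print(f"CPU time: {cpu_pct:.0f}% of the 2600k host")
print(f"GPU time: {gpu_pct:.0f}% of the GTX 590 host")
```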


My conclusion from all this, then, is that the Albert AMD OpenCL application isn't quite as optimized as the Albert CUDA application. Can anyone confirm/deny?
____________

Falconet
Joined: 20 Jan 12
Posts: 1
Credit: 0
RAC: 0
Message 111962 - Posted: 22 Apr 2012 | 17:03:27 UTC

Hmm, the OpenCL app uses a full CPU core to work.
Is there any way to lower that usage?

Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1454
Credit: 1,763,743
RAC: 1,744
Message 111964 - Posted: 24 Apr 2012 | 9:25:11 UTC - in response to Message 111962.

Hmm, the OpenCL app uses a full CPU core to work.
Is there any way to lower that usage?


Hi!

In terms of CPU usage, the OpenCL app should in theory be comparable to the NVIDIA/CUDA app, but we have seen huge differences in CPU usage with different driver versions from ATI. So the only advice I can give now is to try different drivers, sorry. Please let us know any results for your card (e.g. which driver worked better wrt CPU usage).



From the previous message:
My conclusion from all this, then, is that the Albert AMD OpenCL application isn't quite as optimized as the Albert CUDA application. Can anyone confirm/deny?


It's fair to say that the CUDA app is more optimized to NVIDIA cards than the OpenCL app is optimized to ATI cards, yes. This has several reasons:

* OpenCL is a multi-vendor platform while CUDA is NVIDIA only. If you write OpenCL code you want to keep the vendor-independence. It would be great if we could have just one code base, but it remains to be seen whether this is realistic without too much impact on performance on either platform.

* The OpenCL app for the pulsar search is a port of the CUDA app which came out first of course, so it's not specifically tuned to the strengths of ATI cards...yet

* The first priority is, needless to say, to get the app to a point where it runs on all our target platforms (OSX, Linux, Windows) and produces scientifically sound results that cross-validate with the CUDA and CPU apps. As has been mentioned elsewhere, the level of support (tools, libraries, bugfixing, drivers...) is certainly more mature for CUDA/NVIDIA than for OpenCL/ATI, so almost all our efforts currently have to be directed into "making it work at all" and less can be spent on "optimizing".

On the other hand, the ATI cards are, without question, fine pieces of hardware! So I'm quite optimistic that even the first OpenCL app that goes into production on E@H will have a decent performance/Watt ratio.

Stay tuned and thanks for helping us test the thing here on Albert@Home!

HBE
____________

Christoph
Joined: 25 Aug 05
Posts: 48
Credit: 148,613
RAC: 11
Message 111969 - Posted: 25 Apr 2012 | 12:28:56 UTC
Last modified: 25 Apr 2012 | 12:38:03 UTC

I only just realised that my card is not supported because you demand a minimum workgroup size of 256.

I have 128. HD 5450.

Can you lower that? Otherwise I will stop crunching with my GPU here for now. Raistmer is waiting for more results over at SETI Beta.
____________
Christoph

Oliver Bock
Volunteer moderator
Project administrator
Project developer
Joined: 4 Sep 07
Posts: 116
Credit: 5,965,020
RAC: 1
Message 111973 - Posted: 27 Apr 2012 | 9:13:54 UTC - in response to Message 111969.

I only just realised that my card is not supported because you demand a minimum workgroup size of 256.


We don't, we just set a preferred value. If your GPU doesn't support it, the value is dynamically adjusted accordingly.
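
To illustrate what "preferred value, adjusted if unsupported" can look like, here is a minimal Python sketch. It is not the BRP app's actual logic: the function name and the rounding down to a power of two are assumptions for illustration. In a real OpenCL host program the device limit would typically come from querying `CL_DEVICE_MAX_WORK_GROUP_SIZE` (or `CL_KERNEL_WORK_GROUP_SIZE` for a specific kernel).

```python
def choose_work_group_size(preferred: int, device_max: int) -> int:
    """Pick a work-group size no larger than the preferred value or the device limit.

    Hypothetical helper: rounds down to a power of two, which is an
    assumption made here, not something stated in the thread.
    """
    if preferred < 1 or device_max < 1:
        raise ValueError("sizes must be positive")
    size = min(preferred, device_max)
    # Largest power of two that is <= size.
    return 1 << (size.bit_length() - 1)

# A device limited to 128 (as reported for the HD 5450) falls back to 128.
print(choose_work_group_size(256, 128))
# A device that supports more than the preferred value keeps the preferred 256.
print(choose_work_group_size(256, 1024))
```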


Cheers,
Oliver

Christoph
Joined: 25 Aug 05
Posts: 48
Credit: 148,613
RAC: 11
Message 111977 - Posted: 27 Apr 2012 | 12:10:46 UTC - in response to Message 111973.

I only just realised that my card is not supported because you demand a minimum workgroup size of 256.


We don't, we just set a preferred value. If your GPU doesn't support it, the value is dynamically adjusted accordingly.


Cheers,
Oliver


Ah, that sounds good. But please have a look at this result, because I don't see any indication that the work group size is adjusted.

Another point is this: [04:23:10][4764][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
This appears not only at the beginning of the WU, but also after nearly 2 hours of runtime. Is that the app or is it BOINC?

http://albert.phys.uwm.edu/result.php?resultid=189038
____________
Christoph

Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1454
Credit: 1,763,743
RAC: 1,744
Message 111978 - Posted: 27 Apr 2012 | 13:36:53 UTC - in response to Message 111977.

Hi!

Each task that you download is actually a bundle of 8 independent sub-tasks. When one sub task is finished, processing of the next one begins, using its own checkpoints. So it is normal that there will be exactly 8 instances of the "starting from scratch" message in the logs per task.
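
A toy Python sketch of the behaviour described above, assuming each sub-task keeps its own checkpoint and logs a "starting from scratch" line when none exists yet. The names (`run_task`, `checkpoints`) and the step counts are hypothetical and do not reflect the app's real checkpoint format.

```python
SUBTASKS_PER_TASK = 8  # bundle size stated in the post

def run_task(checkpoints: dict, steps_per_subtask: int = 3) -> list:
    """Process all sub-tasks in order and return the log lines produced.

    `checkpoints` maps sub-task index -> last completed step (hypothetical format).
    """
    log = []
    for sub in range(SUBTASKS_PER_TASK):
        if sub in checkpoints:
            start = checkpoints[sub]
            log.append(f"sub-task {sub}: resuming from step {start}")
        else:
            start = 0
            log.append(f"sub-task {sub}: checkpoint unavailable, starting from scratch")
        for step in range(start, steps_per_subtask):
            checkpoints[sub] = step + 1  # write this sub-task's own checkpoint
    return log

# A fresh task: every sub-task starts without a checkpoint of its own.
fresh_log = run_task({})
scratch_count = sum("starting from scratch" in line for line in fresh_log)
print(scratch_count)
```

Restarting in the middle of a sub-task would resume that sub-task from its checkpoint, but each later sub-task would still log its own "starting from scratch" line when it begins.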

CU
Heinz-Bernd

____________

Christoph
Send message
Joined: 25 Aug 05
Posts: 48
Credit: 148,613
RAC: 11
Message 111980 - Posted: 27 Apr 2012 | 19:50:52 UTC - in response to Message 111978.

Ah, these tiny bits of info... now I remember that I read about that somewhere.
Thank you for reminding me of that.
____________
Christoph


This material is based upon work supported by the National Science Foundation (NSF) under Grant PHY-0555655 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2013 Bruce Allen for the LIGO Scientific Collaboration