[New release] BRP app v1.23/1.24 (OpenCL) feedback thread

Profile Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1483
Credit: 1,864,017
RAC: 0
Message 111974 - Posted: 27 Apr 2012, 9:52:51 UTC

Hi,

We just released BRP4 v1.23 for ATI OpenCL under Linux and Windows, which adds a number of improvements.

Notes:
* Now handles work units compatible with those on Einstein@Home (previously workunits on Albert were tweaked to work around a limitation in the OpenCL code)
* OpenCL GPU memory usage reduced
* modest performance improvement
* minor bug fixes
* better selection of work group size for kernels

* Known issue: no OpenCL support for Mac OS X for the time being (we're still looking into a potential Apple bug)

* Please use the latest Catalyst driver (>=12.1) and BOINC client (>=7.0.26). Note that this BOINC version is still a development version (though it fixes some OpenGL-related problems). It can be downloaded from here:

http://boinc.berkeley.edu/dl/

Without updating to this BOINC version, you will not be able to get OpenCL work on Albert!
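
For anyone who wants to check a setup against the minimum versions listed above, here is a minimal sketch in Python. The helper names `parse_version` and `meets_minimum` are made up for illustration and are not part of BOINC or Catalyst. The point is only that dotted version strings should be compared numerically, component by component, because a plain string comparison would rank "7.0.3" above "7.0.26".

```python
def parse_version(text):
    """Turn a dotted version string such as '7.0.26' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions taken from the release notes above.
MIN_BOINC = "7.0.26"
MIN_CATALYST = "12.1"

# Example inputs; substitute the versions reported by your own system.
print(meets_minimum("7.0.26", MIN_BOINC))    # BOINC client check
print(meets_minimum("11.11", MIN_CATALYST))  # Catalyst driver check
```

Tuples compare element by element, so a Catalyst 11.11 install is treated as older than 12.1 even though "11.11" looks larger at first glance.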

Let's collect your feedback on this specific release (and this one only) in this thread.


Thanks,
Heinz-Bernd
ID: 111974
TRuEQ & TuVaLu

Joined: 11 Sep 06
Posts: 75
Credit: 615,315
RAC: 0
Message 111979 - Posted: 27 Apr 2012, 16:09:20 UTC

I posted a reply in another thread that might also belong here.

http://albert.phys.uwm.edu/forum_thread.php?id=8883&nowrap=true#111976
ID: 111979
Infusioned

Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 111985 - Posted: 28 Apr 2012, 22:40:43 UTC - in response to Message 111979.  

GPU load is steady at 20-21%, and CPU load bounces around: 5%, 15%, 6%, 14%, 4%, 17%, etc., with 17% being the highest I've seen.
ID: 111985
Profile Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1483
Credit: 1,864,017
RAC: 0
Message 111986 - Posted: 29 Apr 2012, 7:11:19 UTC - in response to Message 111985.  

Thanks for the feedback.

We would also be interested to hear about graphics RAM usage, especially when crunching workunits that were generated beginning from 28th of May.

Cheers
HBE
ID: 111986
Infusioned

Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 111987 - Posted: 29 Apr 2012, 16:42:36 UTC - in response to Message 111986.  
Last modified: 29 Apr 2012, 16:47:22 UTC


Thanks for the feedback.

We would also be interested to hear about graphics RAM usage, especially when crunching workunits that were generated beginning from 28th of May.

Cheers
HBE


March? April? Or May of last year?



wu 4/29/2012 9:29:59 AM | Albert@Home | Starting task p2030.20110421.G41.06+00.53.N.b6s0g0.00000_3728_0 using einsteinbinary_BRP4 version 123 (atiOpenCL) in slot 0




ID: 111987
Infusioned

Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 111988 - Posted: 29 Apr 2012, 16:46:21 UTC - in response to Message 111987.  

Also, is there a way to make them thumbnails in my post and when you click them they link to larger images (just to not annoy people with really large images)?
ID: 111988
Profile steffen_moeller

Joined: 9 Feb 05
Posts: 13
Credit: 397,892
RAC: 0
Message 111989 - Posted: 29 Apr 2012, 17:38:25 UTC - in response to Message 111986.  

HD 5670, 1GB RAM, Windows 7 Home, Catalyst version 12.4
uses 521 MB, 50 MB dynamic, load 90%, temperature 71.5 deg. Celsius
http://albert.phys.uwm.edu/result.php?resultid=198490
ID: 111989
Profile Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1483
Credit: 1,864,017
RAC: 0
Message 111990 - Posted: 29 Apr 2012, 18:14:22 UTC - in response to Message 111989.  

Hi

Thanks again for the feedback.

Oops...I meant workunits generated on 28th of April, not May :-)

Cheers
HB
ID: 111990
Profile Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1483
Credit: 1,864,017
RAC: 0
Message 111991 - Posted: 29 Apr 2012, 18:48:18 UTC - in response to Message 111990.  

One more thing: while the workunit mentioned above was sent only recently, it was already generated on the 23rd of April, so it is still one of the "tweaked" workunits. Once the newly generated workunits are sent out, we should see reduced memory usage and a modest performance increase.

Cheers
HB
ID: 111991
Profile steffen_moeller

Joined: 9 Feb 05
Posts: 13
Credit: 397,892
RAC: 0
Message 111992 - Posted: 29 Apr 2012, 19:14:27 UTC - in response to Message 111991.  

One more thing: while the workunit mentioned above was sent only recently, it was already generated on the 23rd of April, so it is still one of the "tweaked" workunits. Once the newly generated workunits are sent out, we should see reduced memory usage and a modest performance increase.


Does this mean we are 5 community-days late with processing? If so, I suggest simply stopping the sending of anything that does not bring additional insights. Hm, thinking again, you have certainly done that and I was just too quick when I read the announcement. Ah, wait, you expect the tweaking to affect performance as well, so you need the same new app to run on both tweaked and regular workunits?!?

Steffen

ID: 111992
Infusioned

Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 111993 - Posted: 29 Apr 2012, 22:11:58 UTC - in response to Message 111991.  
Last modified: 29 Apr 2012, 22:12:29 UTC

One more thing: while the workunit mentioned above was sent only recently, it was already generated on the 23rd of April, so it is still one of the "tweaked" workunits. Once the newly generated workunits are sent out, we should see reduced memory usage and a modest performance increase.

Cheers
HB


I was afraid of that. However, I didn't know how to decipher what date p2030.20110421.G41.06+00.53.N.b6s0g0.00000_3728_0 ... Never mind, I just realized that 20110421 means April 21, 2011.
ID: 111993
Profile Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1483
Credit: 1,864,017
RAC: 0
Message 111994 - Posted: 30 Apr 2012, 9:49:23 UTC - in response to Message 111993.  

Hi

It's OK that the new app version is first crunching through some of the old workunits, to make sure we didn't break anything or significantly degrade performance, even for the code paths that are used only with those old workunits. Support for the old, "tweaked" workunits will stay in the code in case we need it again later.

The timestamps of 2011 that you might see in the logs or workunit file names refer to the time when the raw data for the workunit was recorded at the radio telescope. This is not crucial for the question we are discussing here.
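
To make the naming concrete, here is a minimal sketch in Python that pulls the recording date out of a task name. It assumes, based only on the names quoted in this thread, that the second dot-separated field is a date in YYYYMMDD form; the function name `observation_date` is made up for illustration.

```python
from datetime import date, datetime


def observation_date(task_name: str) -> date:
    """Return the date encoded in the second dot-separated field of a task name.

    Assumption: the field has the form YYYYMMDD, as in the names seen in this thread.
    """
    fields = task_name.split(".")
    return datetime.strptime(fields[1], "%Y%m%d").date()


name = "p2030.20110421.G41.06+00.53.N.b6s0g0.00000_3728_0"
print(observation_date(name).isoformat())
```

This date says when the telescope data was recorded, not when the workunit was generated, so it cannot be used to tell old workunits from new ones.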

Cheers
HB
ID: 111994
Profile Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1483
Credit: 1,864,017
RAC: 0
Message 111996 - Posted: 1 May 2012, 0:03:26 UTC - in response to Message 111994.  
Last modified: 1 May 2012, 0:04:58 UTC

Hi!

I've seen the first "new" workunits being completed now, e.g. this one:

http://albert.phys.uwm.edu/workunit.php?wuid=68037

This should give a rough idea what to expect:

good:
* this one validated against a CUDA task
* compared to older OpenCL tasks on the same host, the new app with the new workunits seems to show a 10-20% performance increase.

still needs improvement:
* CPU usage seems to be higher than for the CUDA app. I'm not sure how much of this is caused by the driver rather than the app itself
* overall performance is in the right ballpark as compared to the CUDA app, but there should be a bit more room for improvement.


Still, I think if this trend is confirmed by more results and validation is successful, we have a release candidate for Einstein@Home. We will have to upgrade the server-side BOINC software to a version that supports OpenCL (as here on Albert@Home), though.

So with your continued help as beta testers for the OpenCL app here, we are now closing in on going into production with the ATI app.


Cheers
HB
ID: 111996
Infusioned

Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 111997 - Posted: 1 May 2012, 0:47:43 UTC - in response to Message 111996.  
Last modified: 1 May 2012, 0:51:02 UTC

I'm not sure why, but I've thrown 3 errors recently:

http://albert.phys.uwm.edu/workunit.php?wuid=67888
http://albert.phys.uwm.edu/workunit.php?wuid=66586
http://albert.phys.uwm.edu/workunit.php?wuid=66147


edit:

Upon examination, all the wu's that errored start with this:

<core_client_version>7.0.26</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
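
For comparing several failed results, a small Python sketch like the following can pull the exit code out of a message line such as the one above. The regular expression is an assumption based on this single sample and is not a documented BOINC format; `parse_exit_code` is a made-up helper name.

```python
import re

# Matches e.g. "exit code 1 (0x1)"; the pattern is inferred from one sample only.
EXIT_CODE_RE = re.compile(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)")


def parse_exit_code(message):
    """Return (decimal_code, hex_code_as_int), or None if no exit code is found."""
    match = EXIT_CODE_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2), 16)


print(parse_exit_code("Incorrect function. (0x1) - exit code 1 (0x1)"))
```

On Windows, "Incorrect function." is simply the system's text for error code 1, so the message by itself does not say much about what went wrong inside the app.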
ID: 111997
Profile Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1483
Credit: 1,864,017
RAC: 0
Message 111999 - Posted: 1 May 2012, 13:49:32 UTC - in response to Message 111997.  

Thanks for the feedback. I think we have seen this particular error with other apps as well, and it might even be a general BOINC issue... it definitely needs some investigation.

I see your host has now a mix of old and new WUs and I understand that the speedup is about 20%. If you can provide any numbers for the Video RAM usage, that would be cool.

Cheers
HB
ID: 111999
terencewee*

Joined: 2 Feb 12
Posts: 5
Credit: 4,500
RAC: 0
Message 112001 - Posted: 1 May 2012, 21:46:55 UTC

Using this host.

It's a mobile workstation, i7-820qm, FirePro Mobility 7820 (Juniper-based).
Driver Package: 8.911.3.3-120309a-136336C
Catalyst version: 11.11

I was running POEM++ OpenCL x3 WU on it.
Pause all running WU.
Exit BOINC.

Re-launch BOINC.
Select Albert WU.
Resume

@ ~0.018%, the screen starts to have multi-color square dots
But it continues running.
Pause Albert WU.
Move mouse/window, dots disappear.
Resume Albert WU.
Driver restarts/recover @ ~0.320%.

Pause Albert WU.
Exit BOINC

Restart machine.

Login, launch BOINC.
Resume Albert WU.

No dots; it continues to run to completion (I hope).

Hope this can be rectified before release.

clinfo dump:
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.1 AMD-APP (831.4)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices cl_khr_d3d10_sharing
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4098
Board name: ATI FirePro M7820
Max compute units: 10
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 0
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 4
Native vector width double: 0
Max clock frequency: 700Mhz
Address bits: 32
Max memory allocation: 536870912
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 1073741824
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 000007FEF1FBC9C8
Name: Juniper
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 1.1
Driver version: CAL 1.4.1607 (VM)
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (831.4)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing
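
The memory figures in the dump are in bytes. As a rough illustration, the Python sketch below parses "Key: value" lines like those above and converts the two memory fields to MiB. The helper names are made up for this example, and the parser is only a guess at the layout based on this one dump, not a general clinfo parser.

```python
def parse_clinfo(text):
    """Collect 'Key: value' lines into a dict (later duplicates overwrite earlier ones)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def bytes_to_mib(n):
    """Convert a byte count to mebibytes."""
    return n / (1024 * 1024)


# Two lines copied from the dump above.
excerpt = """\
Max memory allocation: 536870912
Global memory size: 1073741824
"""

info = parse_clinfo(excerpt)
max_alloc_mib = bytes_to_mib(int(info["Max memory allocation"]))
global_mem_mib = bytes_to_mib(int(info["Global memory size"]))
print(max_alloc_mib, global_mem_mib)
```

For this device the values work out to 512 MiB for the largest single allocation and 1024 MiB of global memory.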



--
terencewee*
Sicituradastra.
ID: 112001
Infusioned

Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 112002 - Posted: 2 May 2012, 0:19:05 UTC - in response to Message 111999.  
Last modified: 2 May 2012, 0:21:40 UTC

I see your host has now a mix of old and new WUs



? I poked through my history and all my wu's have 20110421 in them. I started aborting batches to try and get some new ones, but no dice so far. Unless I am mistaken, the 20110421 is the datestamp for when the data was recorded? Or is that the datestamp from when it was split?

I have the day off tomorrow so I will abort/babysit Boinc to try and get some newer ones.
ID: 112002
Infusioned

Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 112003 - Posted: 2 May 2012, 2:32:20 UTC - in response to Message 112002.  
Last modified: 2 May 2012, 2:35:20 UTC

p2030.20110421.G41.29-00.40.S.b0s0g0.00000_744_0 using einsteinbinary_BRP4 version 123 (atiOpenCL)


GPU-Z & Task Manager:
http://img7.imageshack.us/img7/7159/p203020110421g41290040s.jpg
ID: 112003
Profile Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Joined: 28 Aug 06
Posts: 1483
Credit: 1,864,017
RAC: 0
Message 112004 - Posted: 2 May 2012, 11:18:20 UTC - in response to Message 112002.  

I see your host has now a mix of old and new WUs



? I poked through my history and all my wu's have 20110421 in them.


This is not the WU creation date; you can see that one by following the WU link in the results list. It seems that the first "new" WUs were already generated around 13:00 UTC on 27th of April. When looking at your results, you will notice that they fall into one of two narrow ranges of runtime, where the newer results (newer by WU creation time) run about 20% faster.
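
One way to check this on your own results is sketched below in Python: split the results by WU creation time and compare the mean runtimes of the two groups. The cutoff is the approximate time mentioned above; the function name and the sample numbers are hypothetical and are not measurements from this thread.

```python
from datetime import datetime, timezone
from statistics import mean

# Approximate time the first "new" WUs were generated, as stated above.
CUTOFF = datetime(2012, 4, 27, 13, 0, tzinfo=timezone.utc)


def runtime_reduction_percent(results, cutoff=CUTOFF):
    """results: iterable of (wu_creation_time, runtime_seconds) pairs.

    Returns how much shorter the mean runtime of the new WUs is, in percent
    of the mean runtime of the old WUs.
    """
    results = list(results)
    old = [runtime for created, runtime in results if created < cutoff]
    new = [runtime for created, runtime in results if created >= cutoff]
    if not old or not new:
        raise ValueError("need at least one old and one new result")
    return 100.0 * (mean(old) - mean(new)) / mean(old)


# Hypothetical runtimes for illustration only.
sample = [
    (datetime(2012, 4, 23, 10, 0, tzinfo=timezone.utc), 5000.0),
    (datetime(2012, 4, 25, 8, 30, tzinfo=timezone.utc), 5100.0),
    (datetime(2012, 4, 26, 21, 15, tzinfo=timezone.utc), 4900.0),
    (datetime(2012, 4, 27, 14, 0, tzinfo=timezone.utc), 4000.0),
    (datetime(2012, 4, 28, 9, 45, tzinfo=timezone.utc), 4050.0),
    (datetime(2012, 4, 29, 16, 20, tzinfo=timezone.utc), 3950.0),
]
print(round(runtime_reduction_percent(sample), 1))
```

Note that this measures the reduction in runtime; whether "20% faster" is meant as shorter runtime or higher throughput is a matter of interpretation.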

Cheers
HB





ID: 112004
Infusioned

Joined: 11 Feb 05
Posts: 45
Credit: 149,000
RAC: 0
Message 112006 - Posted: 2 May 2012, 13:05:06 UTC - in response to Message 112004.  
Last modified: 2 May 2012, 13:05:16 UTC

p2030.20110421.G41.29-00.40.S.b0s0g0.00000_1264_1


http://img809.imageshack.us/img809/154/b0s0g00000012641.jpg
ID: 112006



This material is based upon work supported by the National Science Foundation (NSF) under Grant PHY-0555655 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2019 Bruce Allen for the LIGO Scientific Collaboration