[New release] BRP app v1.23/1.24 (OpenCL) feedback thread |
Author | Message |
---|---|
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Hi, We just released BRP4 v1.23 for ATI OpenCL under Linux and Windows, which adds a number of improvements. Notes: * Now handles workunits compatible with those on Einstein@Home (previously, workunits on Albert were tweaked to work around a limitation in the OpenCL code) * OpenCL GPU memory usage reduced * Modest performance improvement * Minor bug fixes * Better selection of work group size for kernels * Known issue: no OpenCL support for Mac OS X for the time being (we're still looking into a potential Apple bug) * Please use the latest Catalyst driver (>=12.1) and BOINC client (>=7.0.26). Note that this BOINC version is still a development version (but fixes some OpenGL related problems); it can be downloaded from here: http://boinc.berkeley.edu/dl/ Without updating to this BOINC version, you will not be able to get OpenCL work on Albert! Let's try to collect your feedback on this specific release (and this one only) in this thread. Thanks, Heinz-Bernd |
TRuEQ & TuVaLu Joined: 11 Sep 06 Posts: 75 Credit: 615,315 RAC: 0 |
I posted a reply in another thread that might also belong here: http://albert.phys.uwm.edu/forum_thread.php?id=8883&nowrap=true#111976 |
Infusioned Joined: 11 Feb 05 Posts: 45 Credit: 149,000 RAC: 0 |
GPU load is steady at 20-21%, and CPU load literally bounces: 5%, 15%, 6%, 14%, 4%, 17%, etc., with 17% being the highest I've seen. |
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Thanks for the feedback. We would also be interested to hear about graphics RAM usage, especially when crunching workunits that were generated from the 28th of May onward. Cheers HBE |
Infusioned Joined: 11 Feb 05 Posts: 45 Credit: 149,000 RAC: 0 |
March? April? Or May of last year? wu 4/29/2012 9:29:59 AM | Albert@Home | Starting task p2030.20110421.G41.06+00.53.N.b6s0g0.00000_3728_0 using einsteinbinary_BRP4 version 123 (atiOpenCL) in slot 0 |
Infusioned Joined: 11 Feb 05 Posts: 45 Credit: 149,000 RAC: 0 |
Also, is there a way to show images as thumbnails in my post that link to the larger images when clicked (just to avoid annoying people with really large images)? |
steffen_moeller Joined: 9 Feb 05 Posts: 13 Credit: 397,892 RAC: 0 |
HD 5670, 1GB RAM, Windows 7 Home, Catalyst version 12.4 uses 521 MB, 50 MB dynamic, load 90%, temperature 71.5 deg. Celsius http://albert.phys.uwm.edu/result.php?resultid=198490 |
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Hi, Thanks again for the feedback. Oops... I meant workunits generated on the 28th of April, not May :-) Cheers HB |
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
One more thing: while the workunit mentioned above was sent only recently, it had already been generated on the 23rd of April, so it is still one of the "tweaked" workunits. Once the newly generated workunits are sent out, we should see reduced memory usage and a modest performance increase. Cheers HB |
steffen_moeller Joined: 9 Feb 05 Posts: 13 Credit: 397,892 RAC: 0 |
"One more thing: while the workunit mentioned above was sent only recently, it had already been generated on the 23rd of April, so it is still one of the 'tweaked' workunits. Once the newly generated workunits are sent out, we should see reduced memory usage and a modest performance increase." Does this mean we are 5 community-days late with processing? If so, I suggest simply not sending anything that does not bring additional insights. Hm, thinking again, you have certainly done that and I was just too quick when I read the announcement. Ah, wait, you expect an impact on performance from the tweaking as well, so you need to have the same new app run on both tweaked and regular workunits?!? Steffen |
Infusioned Joined: 11 Feb 05 Posts: 45 Credit: 149,000 RAC: 0 |
"One more thing: while the workunit mentioned above was sent only recently, it had already been generated on the 23rd of April, so it is still one of the 'tweaked' workunits. Once the newly generated workunits are sent out, we should see reduced memory usage and a modest performance increase." I was afraid of that. However, I didn't know how to decipher the date in p2030.20110421.G41.06+00.53.N.b6s0g0.00000_3728_0 ... Never mind, I just realized that 20110421 means April 21, 2011. |
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Hi, It's OK that the new app version is first crunching through some of the old workunits, to make sure we didn't break anything or significantly degrade performance, even for the code paths that are used only with those old workunits. The support for the old, "tweaked" workunits will stay in the code in case we need it again later. The 2011 timestamps that you might see in the logs or workunit file names refer to the time when the raw data for the workunit was recorded at the radio telescope. This is not crucial for the question we are discussing here. Cheers HB |
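For anyone who wants to read the recording date out of a task name without squinting, here is a minimal Python sketch. It assumes the second dot-separated field of the name is always an 8-digit YYYYMMDD date, which matches the names quoted in this thread but is not an official format description; the function name `recording_date` is made up for illustration.

```python
from datetime import date, datetime


def recording_date(task_name: str) -> date:
    """Return the date embedded in a task name.

    Assumption (not confirmed by the project): the second dot-separated
    field is the recording date in YYYYMMDD form.
    """
    fields = task_name.split(".")
    return datetime.strptime(fields[1], "%Y%m%d").date()


# Task name copied from the log line quoted earlier in the thread.
print(recording_date("p2030.20110421.G41.06+00.53.N.b6s0g0.00000_3728_0"))
```

Note that this only gives the date the raw data was recorded, not the workunit creation date.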
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Hi! I've seen the first "new" workunits being completed now, e.g. this one: http://albert.phys.uwm.edu/workunit.php?wuid=68037 This should give a rough idea of what to expect: Good: * This one validated against a CUDA task * Compared to older OpenCL tasks on the same host, the new app with the new workunits seems to show a 10-20% performance increase. Still needs improvement: * CPU usage seems to be higher than for the CUDA app. I'm not sure how much of this is caused by the driver rather than the app itself * Overall performance is in the right ballpark compared to the CUDA app, but there should be a bit more room for improvement. Still, I think if this trend is confirmed by more results and validation is successful, we have a release candidate for Einstein@Home. We will have to upgrade the server-side BOINC software to a version that supports OpenCL (as here on Albert@Home), though. So with your continued help as beta testers for the OpenCL app here, we are now closing in on going into production with the ATI app. Cheers HB |
Infusioned Joined: 11 Feb 05 Posts: 45 Credit: 149,000 RAC: 0 |
I'm not sure why, but I've thrown 3 errors recently: http://albert.phys.uwm.edu/workunit.php?wuid=67888 http://albert.phys.uwm.edu/workunit.php?wuid=66586 http://albert.phys.uwm.edu/workunit.php?wuid=66147 Edit: Upon examination, all the WUs that errored start with this: <core_client_version>7.0.26</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> |
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Thanks for the feedback. I think we have seen this particular error with other apps as well, and it might even be a general BOINC issue... definitely needs some investigation. I see your host has now a mix of old and new WUs and I understand that the speedup is about 20%. If you can provide any numbers for the video RAM usage, that would be cool. Cheers HB |
terencewee* Joined: 2 Feb 12 Posts: 5 Credit: 4,500 RAC: 0 |
Using this host. It's a mobile workstation, i7-820qm, FirePro Mobility 7820 (Juniper-based). Driver Package: 8.911.3.3-120309a-136336C Catalyst version: 11.11 I was running 3 POEM++ OpenCL WUs on it. Pause all running WUs. Exit BOINC. Re-launch BOINC. Select Albert WU. Resume: at ~0.018% the screen starts to show multi-colored square dots, but it continues running. Pause Albert WU. Move mouse/window, dots disappear. Resume Albert WU. Driver restarts/recovers at ~0.320%. Pause Albert WU. Exit BOINC. Restart machine. Log in, launch BOINC. Resume Albert WU. No dots, continues running to completion (I hope). Hope this can be rectified before release. clinfo dump: Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 1.1 AMD-APP (831.4) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices cl_khr_d3d10_sharing Platform Name: AMD Accelerated Parallel Processing Number of devices: 2 Device Type: CL_DEVICE_TYPE_GPU Device ID: 4098 Board name: ATI FirePro M7820 Max compute units: 10 Max work items dimensions: 3 Max work items[0]: 256 Max work items[1]: 256 Max work items[2]: 256 Max work group size: 256 Preferred vector width char: 16 Preferred vector width short: 8 Preferred vector width int: 4 Preferred vector width long: 2 Preferred vector width float: 4 Preferred vector width double: 0 Native vector width char: 16 Native vector width short: 8 Native vector width int: 4 Native vector width long: 2 Native vector width float: 4 Native vector width double: 0 Max clock frequency: 700Mhz Address bits: 32 Max memory allocation: 536870912 Image support: Yes Max number of images read arguments: 128 Max number of images write arguments: 8 Max image 2D width: 8192 Max image 2D height: 8192 Max image 3D width: 2048 Max image 3D height: 2048 Max image 3D depth: 2048 Max samplers within kernel: 16 Max size of kernel argument: 1024 Alignment (bits) of base address: 2048 Minimum alignment (bytes) for any datatype: 128 Single precision floating point capability Denorms: No Quiet NaNs: Yes Round to nearest even: Yes Round to zero: Yes Round to +ve and infinity: Yes IEEE754-2008 fused multiply-add: Yes Cache type: None Cache line size: 0 Cache size: 0 Global memory size: 1073741824 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Kernel Preferred work group size multiple: 64 Error correction support: 0 Unified memory for Host and Device: 0 Profiling timer resolution: 1 Device endianess: Little Available: Yes Compiler available: Yes Execution capabilities: Execute OpenCL kernels: Yes Execute native function: No Queue properties: Out-of-Order: No Profiling : Yes Platform ID: 000007FEF1FBC9C8 Name: Juniper Vendor: Advanced Micro Devices, Inc. Device OpenCL C version: OpenCL C 1.1 Driver version: CAL 1.4.1607 (VM) Profile: FULL_PROFILE Version: OpenCL 1.1 AMD-APP (831.4) Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing -- terencewee* Sicituradastra. |
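Since video RAM usage was asked about earlier in the thread, the memory figures in the clinfo dump above may be easier to compare once converted from bytes. The following Python sketch does only that conversion; the dictionary and the helper name `to_mib` are illustrative, and the values are copied from the dump rather than queried from the device.

```python
# Memory-related values copied from the clinfo dump above, in bytes.
clinfo_bytes = {
    "Max memory allocation": 536870912,
    "Global memory size": 1073741824,
    "Constant buffer size": 65536,
    "Local memory size": 32768,
}


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


for name, value in clinfo_bytes.items():
    print(f"{name}: {to_mib(value):g} MiB")
```

On these numbers the device reports 1024 MiB of global memory, with a single allocation limited to 512 MiB.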
Infusioned Joined: 11 Feb 05 Posts: 45 Credit: 149,000 RAC: 0 |
"I see your host has now a mix of old and new WUs"? I poked through my history and all my WUs have 20110421 in them. I started aborting batches to try to get some new ones, but no dice so far. Unless I am mistaken, the 20110421 is the datestamp for when the data was recorded? Or is that the datestamp from when it was split? I have the day off tomorrow, so I will abort/babysit BOINC to try to get some newer ones. |
Infusioned Joined: 11 Feb 05 Posts: 45 Credit: 149,000 RAC: 0 |
p2030.20110421.G41.29-00.40.S.b0s0g0.00000_744_0 using einsteinbinary_BRP4 version 123 (atiOpenCL) GPU-Z & Task Manager: http://img7.imageshack.us/img7/7159/p203020110421g41290040s.jpg |
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
"I see your host has now a mix of old and new WUs" The 20110421 in the task name is not the WU creation date; you can see the creation date by following the WU link in the results list. It seems that the first "new" WUs were already generated around 13:00 UTC on the 27th of April. When looking at your results, you will notice that they fall into one of two narrow ranges of runtime, where the newer results (newer by WU creation time) run about 20% faster. Cheers HB |
Infusioned Joined: 11 Feb 05 Posts: 45 Credit: 149,000 RAC: 0 |
p2030.20110421.G41.29-00.40.S.b0s0g0.00000_1264_1 http://img809.imageshack.us/img809/154/b0s0g00000012641.jpg |