it's time to ask for help |
Author | Message |
---|---|
Stephan Goll Joined: 13 Dec 05 Posts: 19 Credit: 1,874,367 RAC: 0 |
It's a 64-bit Debian, mixed stable and testing. The X server is running but not used, because the system runs headless. PrimeGrid is running well, so CAL / OpenCL is working. The card is an AMD Radeon HD 5570; lspci reports:

04:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Redwood PRO [Radeon HD 5500 Series]

It's this one: http://albert.phys.uwm.edu/results.php?hostid=2756

boinc@celeron:~$ ldd projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.22_i686-pc-linux-gnu__atiOpenCL
linux-gate.so.1 => (0xf773a000)
libOpenCL.so.1 => /usr/lib32/libOpenCL.so.1 (0xf772e000)
libpthread.so.0 => /lib32/libpthread.so.0 (0xf7715000)
libm.so.6 => /lib32/libm.so.6 (0xf76ef000)
libstdc++.so.6 => /usr/lib32/libstdc++.so.6 (0xf75fa000)
libc.so.6 => /lib32/libc.so.6 (0xf749e000)
/lib/ld-linux.so.2 (0xf773b000)
libdl.so.2 => /lib32/libdl.so.2 (0xf749a000)
libgcc_s.so.1 => /usr/lib32/libgcc_s.so.1 (0xf747c000)

boinc@celeron:~$ ldd projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.00_graphics_i686-pc-linux-gnu
linux-gate.so.1 => (0xf7701000)
libpthread.so.0 => /lib32/libpthread.so.0 (0xf76e2000)
libm.so.6 => /lib32/libm.so.6 (0xf76bc000)
libdl.so.2 => /lib32/libdl.so.2 (0xf76b8000)
libX11.so.6 => not found
libXext.so.6 => not found
libGL.so.1 => /usr/lib32/libGL.so.1 (0xf75ca000)
libGLU.so.1 => not found
libc.so.6 => /lib32/libc.so.6 (0xf746e000)
/lib/ld-linux.so.2 (0xf7702000)
libXext.so.6 => not found
libgcc_s.so.1 => /usr/lib32/libgcc_s.so.1 (0xf7451000)

Okay, some libs are missing for the graphics application, but the OpenCL app seems to be happy. It should not matter, because I do not use the X11 system. I installed only the amd-driver-installer-12-3-x86.x86_64.run, not the SDK.
23-Apr-2012 16:28:32 [---] Starting BOINC client version 7.0.26 for x86_64-pc-linux-gnu
23-Apr-2012 16:28:32 [---] log flags: file_xfer, sched_ops, task
23-Apr-2012 16:28:32 [---] Libraries: libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.15 libssh2/1.2.6
23-Apr-2012 16:28:32 [---] Running as a daemon
23-Apr-2012 16:28:32 [---] Data directory: /home/boinc
23-Apr-2012 16:28:32 [---] Processor: 2 GenuineIntel Intel(R) Celeron(R) CPU E3400 @ 2.60GHz [Family 6 Model 23 Stepping 10]
23-Apr-2012 16:28:32 [---] Processor: 1.00 MB cache
23-Apr-2012 16:28:32 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm tpr_shadow vnmi flexpriority
23-Apr-2012 16:28:32 [---] OS: Linux: 2.6.32-5-amd64
23-Apr-2012 16:28:32 [---] Memory: 7.79 GB physical, 512.36 MB virtual
23-Apr-2012 16:28:32 [---] Disk: 19.22 GB total, 17.70 GB free
23-Apr-2012 16:28:32 [---] Local time is UTC +2 hours
23-Apr-2012 16:28:32 [---] ATI GPU 0: Redwood (CAL version 1.4.1703, 1024MB, 1000MB available, 50 GFLOPS peak)
23-Apr-2012 16:28:32 [---] OpenCL: ATI GPU 0: Redwood (driver version CAL 1.4.1703, device version OpenCL 1.1 AMD-APP (898.1), 1024MB, 1000MB available)
...

BOINC seems to be happy. And I'm getting:

<core_client_version>7.0.26</core_client_version>
<![CDATA[
<message>
process exited with code 255 (0xff, -1)
</message>
<stderr_txt>
[17:37:18][1626][INFO ] Application startup - thank you for supporting Einstein@Home!
[17:37:18][1626][INFO ] Starting data processing...
[17:37:18][1626][ERROR] Failed to get OpenCL platform/device info from BOINC (error: -1)!
[17:37:18][1626][ERROR] Demodulation failed (error: -1)!
17:37:18 (1626): called boinc_finish
</stderr_txt>
]]>

I'm out of ideas and will give up for today. Any help is welcome. Stephan |
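A minimal sketch for pulling the missing libraries out of output like the ldd listings in this post. It assumes the ldd format shown here (`name => not found`); `missing_libs` is a hypothetical helper name, not part of BOINC or the Albert app.

```shell
# Hypothetical helper: print each library that ldd reports as "not found",
# once, in sorted order. Reads ldd output on stdin.
missing_libs() {
    awk '/=> not found/ { print $1 }' | LC_ALL=C sort -u
}

# Usage (binary path as quoted in the post):
#   ldd projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.00_graphics_i686-pc-linux-gnu | missing_libs
```

An empty result only means that ldd resolved every direct dependency of that one binary; it says nothing about libraries loaded at run time.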
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Hmm... this kind of error happens when the BOINC library that is linked into the app cannot "see" the GPU that the BOINC core client assigned to the app. This is a rather tricky part of the BOINC code, and I would not be surprised if some problems remain. Could you please run the following in a console on that machine:

export DISPLAY=":0.0"
clinfo

Is the GPU listed in the output? Is just one GPU installed or several, and if several, which one(s) is/are detected? Thanks HBE |
Stephan Goll Joined: 13 Dec 05 Posts: 19 Credit: 1,874,367 RAC: 0 |
Hi Bikeman, sure I can. Btw. clinfo runs on the console, no need to export the DISPLAY variable. Here is the complete output (with some CPU-related parts snipped):

boinc@celeron:~$ clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.1 AMD-APP (898.1)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4098
Board name: ATI Radeon HD 5570
Device Topology: PCI[ B#4, D#0, F#0 ]
Max compute units: 5
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 0
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 4
Native vector width double: 0
Max clock frequency: 0Mhz
Address bits: 32
Max memory allocation: 134217728
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 536870912
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 0x7fe58a9c2480
Name: Redwood
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 1.1
Driver version: CAL 1.4.1703
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (898.1)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt

<edit by myself - CPU>
Device Type: CL_DEVICE_TYPE_CPU
Device ID: 4098
Board name:
Max compute units: 2
........... <snipped>.......
Platform ID: 0x7fe58a9c2480
Name: Intel(R) Celeron(R) CPU E3400 @ 2.60GHz
Vendor: GenuineIntel
Device OpenCL C version: OpenCL C 1.1
Driver version: 2.0
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (898.1)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt

Regards, Stephan |
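The question of whether clinfo lists the GPU can be answered with a one-liner rather than by reading the full dump. This is only a sketch: it assumes the `CL_DEVICE_TYPE_GPU` spelling printed by the clinfo version quoted in this thread (other clinfo builds may format the device type differently), and `gpu_count` is a hypothetical helper name.

```shell
# Hypothetical helper: count GPU devices in clinfo output read from stdin.
# Matches the "Device Type: CL_DEVICE_TYPE_GPU" lines seen in this thread.
gpu_count() {
    # grep -c prints 0 but exits non-zero when nothing matches; mask that.
    grep -c 'CL_DEVICE_TYPE_GPU' || true
}

# Usage, run as the account that the BOINC client runs under:
#   clinfo | gpu_count
```

A count of 0 for the boinc account while another account sees the GPU would point at a permissions or X access problem rather than at the application.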
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Hi! Thanks for the feedback. Looks normal ... hmm ...
Yup, exporting DISPLAY was just to make sure the display is not forwarded, because then clinfo won't show the GPU. As you wrote, the X server needs to be running even for headless GPU operation. Cheers HBE |
steffen_moeller Joined: 9 Feb 05 Posts: 13 Credit: 397,892 RAC: 0 |
This is a bit weird, indeed. If BOINC is happy, the app should be too. What version of Debian are you using? Have you tried an OpenCL app other than Albert? Just to exclude things a bit:

* from the user running the X session, try "xhost +"
* add the boinc user to the video group
* restart the boinc client after X was started

Good luck Steffen |
Stephan Goll Joined: 13 Dec 05 Posts: 19 Credit: 1,874,367 RAC: 0 |
It is Debian, mostly from the stable release. Some parts like the libc, libssl and some of the 32-bit i686 compatibility libs are from the testing release. Not all libs are available in a 32-bit version, and that's why some of the libs are missing. About the xhost thing ... well, I can try this, but honestly: this is to allow boinc to write to the screen. I do not think it will help, because this box is running headless. Adding the boinc account to the video group ... well, there is no group with this name. But the device file (/dev/ati/card0) is set to 666 and PrimeGrid does not have any problems (see http://www.primegrid.com/results.php?hostid=252937). But there is a hint inside your answer ... I will check if I can start the X server as the boinc user. Maybe this helps ... and I will report back later. Thanks, Stephan PS: Okay, I modified the /etc/X11/Xwrapper.config file and was able to start the X server from the boinc account. Let's wait ... |
Stephan Goll Joined: 13 Dec 05 Posts: 19 Credit: 1,874,367 RAC: 0 |
Nothing changed ... still the same error. I'm clueless. Are there any devs here who can explain the error message? It seems to come from the application, not from BOINC itself. Anyway: I will continue to fetch some work every week to see if something changes. Stephan |
Stephan Goll Joined: 13 Dec 05 Posts: 19 Credit: 1,874,367 RAC: 0 |
I just found that POEM do have an OpenCL application. I subscribed at POEM and got my first WU. I will report when it has finished ... Stephan PS: http://boinc.fzk.de/poem/results.php?userid=46736 |
Trog Dog Joined: 25 Nov 05 Posts: 204 Credit: 64,008 RAC: 0 |
I just found that POEM do have an OpenCL application. As do Milkyway and Collatz. |
Stephan Goll Joined: 13 Dec 05 Posts: 19 Credit: 1,874,367 RAC: 0 |
I just found that POEM do have an OpenCL application. Milkyway requires double precision ... my 5570 does not have this feature. Collatz ... ATI only for Windows. I'm running only Linux. But the good thing is: the POEM app generates valid results just like PrimeGrid does. So I'm sure my card is fine, the box is fine ... but then there must be something wrong with the Albert application. I went back from the testing to the stable release for libc6, libc6-i386 and lib32gcc1 and restarted BOINC. I will give Albert another try. But then ... I don't know. Stephan |
Trog Dog Joined: 25 Nov 05 Posts: 204 Credit: 64,008 RAC: 0 |
From memory, the Collatz Linux app is available under the optimised apps link on their front page. |
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Hi Stephan, I just talked to the rest of the dev team (and David Anderson, who happens to be here on a visit) about your problem with using the OpenCL card. We'd like to narrow it down to either the app (and its BOINC API library) or the BOINC client, so we suggest the following to debug the problem: We would like to have a look at the init_data.xml file from the slot directory of a failed OpenCL task. DO NOT post that file here in its entirety, because it contains confidential info! Anyway, the app_info.xml file and the rest of the slot files will be deleted shortly after the app fails, so the way to get this file is to use a special flag in cc_config.xml that tells the BOINC core client to exit before starting a new task (which lets you inspect the init_data.xml file of a new OpenCL task). The flag is The tags that are interesting in the init_data.xml and the whole stuff between the tags and The app (or more specifically the BOINC lib that gets linked to this app) will try to detect the device specified in the XML, and this is where it currently fails. Thanks for your help, Heinz-Bernd |
Stephan Goll Joined: 13 Dec 05 Posts: 19 Credit: 1,874,367 RAC: 0 |
Okay ... I solved this puzzle. The thing is: the OpenCL Albert application is _not_ only the einsteinbinary_BRP4_1.23_i686-pc-linux-gnu__atiOpenCL. This binary is only one half of the OpenCL application and the other half is ... einsteinbinary_BRP4_1.00_graphics_i686-pc-linux-gnu. This is a bit strange and confused me for a long time, but in the end one must provide this binary with the proper libraries. In my case the libs in question were available ... but in the 64-bit version only. So I copied the missing libs from a 32-bit system to /usr/local/lib/ and linked them to /usr/lib32/:

libGLU.so.1
libX11.so.6
libXau.so.6
libXdmcp.so.6
libXext.so.6
libxcb.so.1

After this I asked the Albert server for work again, and now the OpenCL application is running. 13 minutes and still crunching. Really ... I can understand that the application wants some essential libraries like the ones provided by the ATI driver, but the additional libraries ... So I think the devs might reconsider the use of these libs for future releases, but for now I can say: nice work. It's running. Thanks for all the help, ideas and inspiration. Maybe this could become part of the documentation / FAQ for the OpenCL app, to help others avoid falling into this trap. Stephan |
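The copy-and-link step described above could be sketched roughly as follows. This is an illustration under assumptions, not the exact commands used in the thread: `link_libs` is a hypothetical helper, and it expects the 32-bit libraries to have been copied into the source directory already.

```shell
# Hypothetical helper: link_libs SRC_DIR DEST_DIR LIB...
# Symlinks each named library from SRC_DIR into DEST_DIR. Libraries that
# are missing in SRC_DIR are skipped, and existing entries in DEST_DIR
# are never overwritten.
link_libs() {
    src=$1
    dest=$2
    shift 2
    for lib in "$@"; do
        if [ ! -e "$src/$lib" ]; then
            echo "missing in $src: $lib" >&2
            continue
        fi
        if [ -e "$dest/$lib" ] || [ -L "$dest/$lib" ]; then
            echo "already present in $dest: $lib" >&2
            continue
        fi
        ln -s "$src/$lib" "$dest/$lib"
    done
}

# Usage with the directories and library names from the post (needs root):
#   link_libs /usr/local/lib /usr/lib32 libGLU.so.1 libX11.so.6 libXau.so.6 \
#       libXdmcp.so.6 libXext.so.6 libxcb.so.1
```

Refusing to overwrite existing entries keeps the helper from clobbering libraries that the distribution already manages in the destination directory.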
Bikeman (Heinz-Bernd Eggenstein) Volunteer moderator Project administrator Project developer Joined: 28 Aug 06 Posts: 1483 Credit: 1,864,017 RAC: 0 |
Hi! Good to hear that ... even though I do not fully understand it :-) . This other program you mentioned is the graphics app (the screensaver equivalent), which is only executed when you ask for it by pressing a button in BOINC Manager. It is not required to run the science app. I suspect that it was perhaps a reboot that fixed this, or maybe the OpenCL driver indirectly needs one of those libs that you installed additionally. Anyway, this is interesting and good to know in case the problem occurs with other volunteers; thanks for the feedback. Cheers HBE |
Stephan Goll Joined: 13 Dec 05 Posts: 19 Credit: 1,874,367 RAC: 0 |
Hi!

boinc@celeron:~$ uptime
02:32:07 up 13 days, 10:12, 2 users, load average: 2.40, 2.37, 2.36

No reboot since ... erm ... well, no reboot in at least nearly two weeks. You may ask the devs if the OpenCL app depends somehow on the graphics application. They should know ... and I would not be surprised if it does. Or you may try to unlink the libraries I mentioned, or move them out of the library directory, and restart boinc. I think I remember that you wrote that you installed the libGLU library to get the app running, and this library is only needed by the graphics app, not the OpenCL app (see the ldd stuff above). Anyway: I hope to see the OpenCL app sooner or later in the Einstein project. Happy crunching. :) Stephan

PS:

boinc@celeron:~$ strings projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.23_i686-pc-linux-gnu__atiOpenCL |grep -i graphics
_ZN12GRAPHICS_APPC1Eb
_ZN12GRAPHICS_APP10is_runningEv
_ZN12GRAPHICS_APP3runEPc
_ZN12GRAPHICS_MSGC2Ev
xml_graphics_modes
_ZN12GRAPHICS_MSGC1Ev
_Z25boinc_graphics_make_shmemPKci
_ZN12GRAPHICS_APP4killEv
boinc_web_graphics_url
send_web_graphics_url
_Z24boinc_graphics_get_shmemPKc
_ZN14APP_CLIENT_SHM19decode_graphics_msgEPcR12GRAPHICS_MSG
boinc_init_graphics_diagnostics
graphics_info
graphics_app
<web_graphics_url>%s</web_graphics_url>
<mode_hide_graphics/>

A lot of references to graphics in the OpenCL app ... |
michael17qs Joined: 17 Jul 12 Posts: 2 Credit: 0 RAC: 0 |
no problemo |