it's time to ask for help

Stephan Goll
Message 111963 - Posted: 23 Apr 2012, 16:14:06 UTC
Last modified: 23 Apr 2012, 16:18:13 UTC

The machine runs 64-bit Debian (a mix of stable and testing). The X server is running but unused, because the system is headless.
PrimeGrid is running well, so CAL / OpenCL is working. The card is an AMD Radeon HD 5570; lspci reports:
04:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Redwood PRO [Radeon HD 5500 Series]
The host in question: http://albert.phys.uwm.edu/results.php?hostid=2756

boinc@celeron:~$ ldd projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.22_i686-pc-linux-gnu__atiOpenCL
linux-gate.so.1 => (0xf773a000)
libOpenCL.so.1 => /usr/lib32/libOpenCL.so.1 (0xf772e000)
libpthread.so.0 => /lib32/libpthread.so.0 (0xf7715000)
libm.so.6 => /lib32/libm.so.6 (0xf76ef000)
libstdc++.so.6 => /usr/lib32/libstdc++.so.6 (0xf75fa000)
libc.so.6 => /lib32/libc.so.6 (0xf749e000)
/lib/ld-linux.so.2 (0xf773b000)
libdl.so.2 => /lib32/libdl.so.2 (0xf749a000)
libgcc_s.so.1 => /usr/lib32/libgcc_s.so.1 (0xf747c000)

boinc@celeron:~$ ldd projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.00_graphics_i686-pc-linux-gnu
linux-gate.so.1 => (0xf7701000)
libpthread.so.0 => /lib32/libpthread.so.0 (0xf76e2000)
libm.so.6 => /lib32/libm.so.6 (0xf76bc000)
libdl.so.2 => /lib32/libdl.so.2 (0xf76b8000)
libX11.so.6 => not found
libXext.so.6 => not found
libGL.so.1 => /usr/lib32/libGL.so.1 (0xf75ca000)
libGLU.so.1 => not found
libc.so.6 => /lib32/libc.so.6 (0xf746e000)
/lib/ld-linux.so.2 (0xf7702000)
libXext.so.6 => not found
libgcc_s.so.1 => /usr/lib32/libgcc_s.so.1 (0xf7451000)

Okay, some libs are missing for the graphics application, but the OpenCL app seems to be happy. The missing libs should not matter because I do not use X11.
I installed only the amd-driver-installer-12-3-x86.x86_64.run, not the SDK.
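
To pull just the unresolved names out of ldd output like the listings above, a small filter can help. This is only a sketch: `missing_libs` is a name made up for illustration, and it does nothing more than filter the text that ldd prints.

```shell
# Hypothetical helper: print the unique library names that ldd reports
# as "not found". Reads ldd output on stdin.
missing_libs() {
    awk '/=> not found/ { print $1 }' | sort -u
}

# Example:
# ldd projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.00_graphics_i686-pc-linux-gnu | missing_libs
```

For the graphics binary above this would list libGLU.so.1, libX11.so.6 and libXext.so.6, each once.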

23-Apr-2012 16:28:32 [---] Starting BOINC client version 7.0.26 for x86_64-pc-linux-gnu
23-Apr-2012 16:28:32 [---] log flags: file_xfer, sched_ops, task
23-Apr-2012 16:28:32 [---] Libraries: libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.15 libssh2/1.2.6
23-Apr-2012 16:28:32 [---] Running as a daemon
23-Apr-2012 16:28:32 [---] Data directory: /home/boinc
23-Apr-2012 16:28:32 [---] Processor: 2 GenuineIntel Intel(R) Celeron(R) CPU E3400 @ 2.60GHz [Family 6 Model 23 Stepping 10]
23-Apr-2012 16:28:32 [---] Processor: 1.00 MB cache
23-Apr-2012 16:28:32 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm tpr_shadow vnmi flexpriority
23-Apr-2012 16:28:32 [---] OS: Linux: 2.6.32-5-amd64
23-Apr-2012 16:28:32 [---] Memory: 7.79 GB physical, 512.36 MB virtual
23-Apr-2012 16:28:32 [---] Disk: 19.22 GB total, 17.70 GB free
23-Apr-2012 16:28:32 [---] Local time is UTC +2 hours
23-Apr-2012 16:28:32 [---] ATI GPU 0: Redwood (CAL version 1.4.1703, 1024MB, 1000MB available, 50 GFLOPS peak)
23-Apr-2012 16:28:32 [---] OpenCL: ATI GPU 0: Redwood (driver version CAL 1.4.1703, device version OpenCL 1.1 AMD-APP (898.1), 1024MB, 1000MB available)
...

BOINC seems to be happy. And yet this is what I'm getting:

<core_client_version>7.0.26</core_client_version>
<![CDATA[
<message>
process exited with code 255 (0xff, -1)
</message>
<stderr_txt>
[17:37:18][1626][INFO ] Application startup - thank you for supporting Einstein@Home!
[17:37:18][1626][INFO ] Starting data processing...
[17:37:18][1626][ERROR] Failed to get OpenCL platform/device info from BOINC (error: -1)!
[17:37:18][1626][ERROR] Demodulation failed (error: -1)!
17:37:18 (1626): called boinc_finish

</stderr_txt>
]]>

I have no idea what is wrong and I give up for today. Any help is welcome.
Stephan

Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Message 111965 - Posted: 24 Apr 2012, 9:32:40 UTC - in response to Message 111963.  
Last modified: 24 Apr 2012, 9:33:37 UTC

Hmm... this kind of error happens when the BOINC library that is linked into the app cannot "see" the GPU that the BOINC core client assigned to the app. This is a rather tricky part of the BOINC code and I would not be surprised if some problems remained.

Could you please run the following in a console on that machine:

export DISPLAY=":0.0"
clinfo

Is the GPU listed in the output? Is there just one GPU installed or several, and if several, which one(s) is/are detected?
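
For a quick yes/no answer to that question, the clinfo output can be filtered. This is a minimal sketch, assuming the clinfo build prints one `CL_DEVICE_TYPE_GPU` line per GPU device, as AMD's does in the reply further down the thread. Other clinfo implementations may format this differently. `count_gpus` is a name made up here.

```shell
# Hypothetical helper: count the GPU devices in clinfo output read from stdin.
# grep -c prints 0 and returns non-zero when nothing matches, so the exit
# status is normalised with "|| true".
count_gpus() {
    grep -c 'CL_DEVICE_TYPE_GPU' || true
}

# Example:
# DISPLAY=":0.0" clinfo | count_gpus
```

A result of 0 would mean that clinfo itself does not see any GPU from this session.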

Thanks
HBE

Stephan Goll
Message 111966 - Posted: 24 Apr 2012, 13:12:03 UTC - in response to Message 111965.  
Last modified: 24 Apr 2012, 13:18:53 UTC

Hi Bikeman,
Sure I can. Btw. clinfo runs on the console, no need to export the DISPLAY variable. Here is the complete output (I snipped some CPU-related parts):

boinc@celeron:~$ clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.1 AMD-APP (898.1)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices


Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4098
Board name: ATI Radeon HD 5570
Device Topology: PCI[ B#4, D#0, F#0 ]
Max compute units: 5
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 0
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 4
Native vector width double: 0
Max clock frequency: 0Mhz
Address bits: 32
Max memory allocation: 134217728
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 536870912
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 0x7fe58a9c2480
Name: Redwood
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 1.1
Driver version: CAL 1.4.1703
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (898.1)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_amd_meminfo

<edit by myself - CPU>
Device Type: CL_DEVICE_TYPE_CPU
Device ID: 4098
Board name:
Max compute units: 2
........... <snipped>.......
Platform ID: 0x7fe58a9c2480
Name: Intel(R) Celeron(R) CPU E3400 @ 2.60GHz
Vendor: GenuineIntel
Device OpenCL C version: OpenCL C 1.1
Driver version: 2.0
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (898.1)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt

Regards,
Stephan

Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Message 111967 - Posted: 24 Apr 2012, 17:14:32 UTC - in response to Message 111966.  

Hi!

Thanks for the feedback. Looks normal...hmm.....



> Btw. clinfo runs on the console, no need to export the DISPLAY variable.

Yup, that was just to make sure the display is not forwarded, because then clinfo won't show the GPU. As you wrote, the X server needs to be running even for headless GPU operation.

Cheers
HBE

steffen_moeller
Message 111970 - Posted: 25 Apr 2012, 20:02:48 UTC - in response to Message 111967.  

This is a bit weird, indeed. If BOINC is happy, the app should be too. What version of Debian are you using? Have you tried an OpenCL app other than Albert's?

Just to rule a few things out:
* as the user running the X session, try "xhost +"
* add the boinc user to the video group
* restart the BOINC client after X has started
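
The group check in the second point can be scripted. This is only a sketch, assuming the account is called boinc as in this thread and that the distribution actually has a video group, which not every setup does. `has_group` is a hypothetical helper that only parses a space-separated group list such as the one printed by `id -nG`.

```shell
# Hypothetical helper: succeed if the group list on stdin (space-separated,
# as printed by "id -nG USER") contains exactly the group named in $1.
has_group() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Example (usermod needs root):
# id -nG boinc | has_group video || usermod -a -G video boinc
```

The BOINC client would need a restart afterwards, since group membership is only picked up by newly started processes.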

Good luck

Steffen

Stephan Goll
Message 111971 - Posted: 25 Apr 2012, 21:09:04 UTC - in response to Message 111970.  
Last modified: 25 Apr 2012, 21:24:35 UTC

It is Debian, mostly from the stable release. Some parts, like libc, libssl and some of the 32-bit i686 compatibility libs, are from the testing release. Not all libs are available in a 32-bit version, and that's why some of them are missing.
About the xhost thing ... well, I can try this, but honestly: it only allows BOINC to write to the screen. I do not think it will help because this box is running headless. Adding the boinc account to the video group ... well, there is no group with this name. But the device file (/dev/ati/card0) is set to mode 666 and PrimeGrid does not have any problems (see http://www.primegrid.com/results.php?hostid=252937).
But there is a hint inside your answer ... I will check if I can start the X server as the boinc user. Maybe this helps ... and I will report back later.
Thanks,
Stephan
PS: Okay, I modified the /etc/X11/Xwrapper.config file and was able to start the X server from the boinc account. Let's wait ...

Stephan Goll
Message 112012 - Posted: 2 May 2012, 19:24:04 UTC - in response to Message 111971.  

Nothing changed ... still the same error. I'm clueless. Are there any devs here who can explain the error message? It seems to come from the application, not from BOINC itself. Anyway: I will continue to fetch some work every week to see if something changes.
Stephan

Stephan Goll
Message 112019 - Posted: 3 May 2012, 14:44:45 UTC - in response to Message 112012.  

I just found that POEM do have an OpenCL application. I subscribed at POEM and got my first WU. I will report when it has finished ...
Stephan
PS: http://boinc.fzk.de/poem/results.php?userid=46736

Trog Dog
Message 112025 - Posted: 3 May 2012, 21:08:01 UTC - in response to Message 112019.  

> I just found that POEM do have an OpenCL application.


As does milkyway and collatz

Stephan Goll
Message 112028 - Posted: 3 May 2012, 22:06:25 UTC - in response to Message 112025.  

>> I just found that POEM do have an OpenCL application.
>
> As does milkyway and collatz

Milkyway requires double precision ... my 5570 does not have this feature. Collatz ... ATI only for Windows. I'm running only Linux.
But the good thing is: the POEM app generates valid results just like PrimeGrid does. So I'm sure my card is fine, the box is fine ... but then there must be something wrong with the Albert application.

I reverted libc6, libc6-i386 and lib32gcc1 from the testing to the stable release and restarted BOINC. I will give Albert another try. But then ... I don't know.
Stephan

Trog Dog
Message 112029 - Posted: 4 May 2012, 0:46:10 UTC - in response to Message 112028.  



> Milkyway requires double precision ... my 5570 does not have this feature. Collatz ... ATI only for Windows. I'm running only Linux.
> But the good thing is: the POEM app generates valid results just like PrimeGrid does. So I'm sure my card is fine, the box is fine ... but then there must be something wrong with the Albert application.
>
> I reverted libc6, libc6-i386 and lib32gcc1 from the testing to the stable release and restarted BOINC. I will give Albert another try. But then ... I don't know.
> Stephan

From memory, the Collatz Linux app is available under the optimised apps link on their front page.

Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Message 112033 - Posted: 4 May 2012, 14:31:46 UTC - in response to Message 112028.  

Hi Stephan,

Just talked to the rest of the dev team (and David Anderson, who happens to be here on a visit) about your problem with using the OpenCL card. We'd like to narrow it down to either the app (and its BOINC API library) or the BOINC client, so we suggest the following to debug the problem:


We would like to have a look at the init_data.xml file from the slot directory of a failed OpenCL task. DO NOT post that file here entirely because it contains confidential info!

Since the init_data.xml file and the rest of the slot files are deleted shortly after the app fails, the way to get this file is to use a special flag in cc_config.xml that tells the BOINC core client to exit before starting a new task (which allows you to inspect the init_data.xml file of a new OpenCL task).

The flag (set to 1) is described here: http://boinc.berkeley.edu/wiki/Client_configuration


The tags that are interesting in the init_data.xml

xxx
yyy
zzz

and the whole stuff between the tags



and




The app (or more specifically the BOINC lib that gets linked to this app) will try to detect the device specified in the XML and this is where it fails currently.

Thanks for your help,

Heinz-Bernd

Stephan Goll
Message 112040 - Posted: 5 May 2012, 16:34:42 UTC - in response to Message 112033.  
Last modified: 5 May 2012, 16:42:16 UTC

Okay ... I solved this puzzle.

The thing is: the OpenCL Albert application is _not_ only einsteinbinary_BRP4_1.23_i686-pc-linux-gnu__atiOpenCL. This binary is only one half of the OpenCL application, and the other half is ... einsteinbinary_BRP4_1.00_graphics_i686-pc-linux-gnu. This is a bit strange and confused me for a long time, but in the end one must provide this binary with the proper libraries too. In my case the libs in question were available ... but only in their 64-bit versions. So I copied the missing libs from a 32-bit system to /usr/local/lib/ and linked them into /usr/lib32/:

libGLU.so.1
libX11.so.6
libXau.so.6
libXdmcp.so.6
libXext.so.6
libxcb.so.1
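
The linking step can be sketched roughly as follows, once the 32-bit libs have been copied over. `link_libs` is a hypothetical helper, not something used in this thread. The directories are the ones named above, and it must run as root to write to /usr/lib32.

```shell
# Hypothetical helper: symlink each named library from SRC_DIR into DST_DIR.
# Libraries that are not present in SRC_DIR are reported on stderr and skipped.
# Usage: link_libs SRC_DIR DST_DIR lib1 [lib2 ...]
link_libs() {
    src=$1
    dst=$2
    shift 2
    for lib in "$@"; do
        if [ -e "$src/$lib" ]; then
            ln -sf "$src/$lib" "$dst/$lib"
        else
            echo "missing: $src/$lib" >&2
        fi
    done
}

# Example:
# link_libs /usr/local/lib /usr/lib32 libGLU.so.1 libX11.so.6 libXau.so.6 \
#     libXdmcp.so.6 libXext.so.6 libxcb.so.1
```

Running ldd on the graphics binary again afterwards shows whether any "not found" entries remain.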

After this I asked the Albert server for work again, and now the OpenCL application is running: 13 minutes and still crunching.
Really ... I can understand that the application wants some essential libraries like the ones provided by the ATI driver, but the additional libraries ...
So I think the devs might reconsider the use of these libs for future releases, but for now I can say: nice work. It's running.

Thanks for all the help, ideas and inspiration. Maybe this can become part of the documentation / FAQ for the OpenCL app, to keep others from falling into the same trap.
Stephan

Bikeman (Heinz-Bernd Eggenstein)
Volunteer moderator
Project administrator
Project developer
Message 112051 - Posted: 6 May 2012, 19:26:37 UTC - in response to Message 112040.  

Hi!

Good to hear that ... even though I do not fully understand it :-) . The other program you mentioned is the graphics app (the screensaver equivalent), which is only executed when you ask for it by pressing a button in BOINC Manager. It is not required to run the science app.

I suspect that it was more a reboot perhaps that fixed this, or maybe the OpenCL driver indirectly needs one of those libs that you installed additionally. Anyway, this is interesting and good to know in case the problem occurs with other volunteers. Thanks for the feedback.

Cheers
HBE

Stephan Goll
Message 112052 - Posted: 7 May 2012, 0:53:04 UTC - in response to Message 112051.  
Last modified: 7 May 2012, 1:01:05 UTC

> Hi!
> ...
>
> I suspect that it was more a reboot perhaps that fixed this ...
>
> HBE


boinc@celeron:~$ uptime
02:32:07 up 13 days, 10:12, 2 users, load average: 2.40, 2.37, 2.36

No reboot since ... erm ... well, no reboot in nearly two weeks at least. You may ask the devs if the OpenCL app depends somehow on the graphics application. They should know ... and I would not be surprised if it does.
Or you may try to unlink the libraries I mentioned, or move them out of the library directory, and restart BOINC. I think I remember that you wrote that you installed the libGLU library to get the app running, and this library is only needed by the graphics app, not the OpenCL app (see the ldd output above).
Anyway: I hope to see the OpenCL app sooner or later in the einstein project. Happy crunching.
:)
Stephan
PS:
boinc@celeron:~$ strings projects/albert.phys.uwm.edu/einsteinbinary_BRP4_1.23_i686-pc-linux-gnu__atiOpenCL |grep -i graphics
_ZN12GRAPHICS_APPC1Eb
_ZN12GRAPHICS_APP10is_runningEv
_ZN12GRAPHICS_APP3runEPc
_ZN12GRAPHICS_MSGC2Ev
xml_graphics_modes
_ZN12GRAPHICS_MSGC1Ev
_Z25boinc_graphics_make_shmemPKci
_ZN12GRAPHICS_APP4killEv
boinc_web_graphics_url
send_web_graphics_url
_Z24boinc_graphics_get_shmemPKc
_ZN14APP_CLIENT_SHM19decode_graphics_msgEPcR12GRAPHICS_MSG
boinc_init_graphics_diagnostics
graphics_info
graphics_app
<web_graphics_url>%s</web_graphics_url>
<mode_hide_graphics/>

A lot of references to graphics in the OpenCL app ...

michael17qs
Message 112131 - Posted: 17 Jul 2012, 10:00:49 UTC - in response to Message 112052.  

no problemo
This material is based upon work supported by the National Science Foundation (NSF) under Grant PHY-0555655 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2024 Bruce Allen for the LIGO Scientific Collaboration