Intel GPU tasks error out

Profile zombie67 [MM]
Joined: 10 Oct 06
Posts: 130
Credit: 30,924,459
RAC: 0
Message 113273 - Posted: 26 Jul 2014, 3:28:32 UTC

I have 5 Mac minis with HD 4000 GPUs. Their tasks are all erroring out. Here is a sample:

https://albert.phys.uwm.edu/result.php?resultid=1526988
Dublin, California
Team: SETI.USA

Richard Haselgrove

Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113275 - Posted: 26 Jul 2014, 8:31:39 UTC - in response to Message 113273.  
Last modified: 26 Jul 2014, 9:03:15 UTC

Ah, the infamous

Exit status 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED

You could try the workround I suggested in the thread of that name, at message 113253.
Use the name einsteinbinary_BRP4 in your app_config.xml file. No, scrub that - it's a hsgamma_FGRP3 task, same as the example: I hadn't realised we had FGRP for intel_gpu.
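For readers who have not seen the referenced message: it is not reproduced in this thread, so the following is only a rough sketch of what an app_config.xml for the hsgamma_FGRP3 application might look like, generated from Python. The element layout follows the general BOINC app_config.xml format, but the gpu_usage and cpu_usage values (1.0 each) and the helper name build_app_config are assumptions for illustration, not the settings from message 113253.

```python
import xml.etree.ElementTree as ET

# Template following the general BOINC app_config.xml layout.
# The usage values filled in below are placeholders, not the
# settings recommended in the referenced message.
APP_CONFIG_TEMPLATE = """<app_config>
  <app>
    <name>{app_name}</name>
    <gpu_versions>
      <gpu_usage>{gpu_usage}</gpu_usage>
      <cpu_usage>{cpu_usage}</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Hypothetical helper: fill in the template and check it is well-formed XML."""
    text = APP_CONFIG_TEMPLATE.format(
        app_name=app_name, gpu_usage=gpu_usage, cpu_usage=cpu_usage
    )
    ET.fromstring(text)  # raises ParseError if the result is not valid XML
    return text


# hsgamma_FGRP3 is the application name given in the post above.
config_text = build_app_config("hsgamma_FGRP3", gpu_usage=1.0, cpu_usage=1.0)
print(config_text)
```

The resulting text would be saved as app_config.xml in the project's folder inside the BOINC data directory, and the client told to re-read its config files (or restarted) for it to take effect.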


Afterthought - not sure if the FGRP utilisation factor control on the preferences page is applied to intel_gpu apps - you'll have to experiment and find out for yourself. If the workround doesn't work, you'll have to do it the old-fashioned way - increasing <rsc_fpops_bound> in client_state.xml by a factor of 100 or so for each task.
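The "old-fashioned way" can be sketched as a small text transformation. This is a minimal illustration, assuming each task's limit appears as a plain `<rsc_fpops_bound>number</rsc_fpops_bound>` element; the sample workunit, its values, and the function name scale_fpops_bound are made up for the example. Since the client rewrites client_state.xml while it runs, an edit like this would normally be made with BOINC fully shut down and a backup copy of the file kept.

```python
import re

# Matches one <rsc_fpops_bound> element and captures its numeric value.
FPOPS_BOUND = re.compile(r"(<rsc_fpops_bound>)([^<]+)(</rsc_fpops_bound>)")


def scale_fpops_bound(state_xml: str, factor: float = 100.0) -> str:
    """Return the text with every <rsc_fpops_bound> value multiplied by factor."""

    def _scale(match: re.Match) -> str:
        new_value = float(match.group(2)) * factor
        return f"{match.group(1)}{new_value:.6f}{match.group(3)}"

    return FPOPS_BOUND.sub(_scale, state_xml)


# Invented sample data, only to show the effect of the substitution.
sample = """<workunit>
    <name>example_wu</name>
    <rsc_fpops_est>1000000000000.000000</rsc_fpops_est>
    <rsc_fpops_bound>20000000000000.000000</rsc_fpops_bound>
</workunit>"""

patched = scale_fpops_bound(sample, factor=100.0)
print(patched)
```

Only the bound is touched; the estimate (`<rsc_fpops_est>`) is left alone, so the runtime estimates shown by the client are not affected.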
Profile zombie67 [MM]
Joined: 10 Oct 06
Posts: 130
Credit: 30,924,459
RAC: 0
Message 113277 - Posted: 27 Jul 2014, 13:09:39 UTC

It worked. Thanks!

Do I have to leave this setting indefinitely? If not, at what point can I remove the app_config and change the settings back to normal?
Dublin, California
Team: SETI.USA

Richard Haselgrove

Joined: 10 Dec 05
Posts: 450
Credit: 5,409,572
RAC: 0
Message 113278 - Posted: 27 Jul 2014, 14:07:34 UTC - in response to Message 113277.  

It worked. Thanks!

Do I have to leave this setting indefinitely? If not, at what point can I remove the app_config and change the settings back to normal?

Look at the application details page for each of your hosts - e.g. https://albert.phys.uwm.edu/host_app_versions.php?hostid=8551 for the lowest numbered.

When the 'Number of tasks completed' for the Gamma-ray pulsar search #3 application reaches 11, that host will be safe to leave to its own devices.

But don't remove the app_config.xml file from any of them until you're able to set the utilisation factor in preferences back to 1.0, or all hell will break loose!

Note that 'completed' requires that the tasks be validated, not just returned successfully. I see that you've been paired with that dreadful batch of 40-odd intel-gpu hosts with bad OpenCL 1.1 drivers (ID numbers in the low 9xxx range) - that will delay your completions until well-managed hosts come along to sort out the inconclusive validations.
Profile zombie67 [MM]
Joined: 10 Oct 06
Posts: 130
Credit: 30,924,459
RAC: 0
Message 113279 - Posted: 27 Jul 2014, 16:40:39 UTC
Last modified: 27 Jul 2014, 17:13:59 UTC

Thanks! All are crunching away on the GPU now, except for this one:

https://albert.phys.uwm.edu/show_host_detail.php?hostid=8551

It is identical to all the rest, same app_config.xml (copy/pasted the same file from dropbox, so no editing differences), same location, everything. It asks for work, but gets none. Any ideas?

Here is one that did get work, for comparison:

https://albert.phys.uwm.edu/show_host_detail.php?hostid=8552

Edit: Never mind. Half an hour later, it got some work. Odd. But no longer an issue.
Dublin, California
Team: SETI.USA

Profile zombie67 [MM]
Joined: 10 Oct 06
Posts: 130
Credit: 30,924,459
RAC: 0
Message 113280 - Posted: 28 Jul 2014, 4:29:36 UTC - in response to Message 113279.  

Edit: Never mind. Half an hour later, it got some work. Odd. But no longer an issue.


Maybe not a one-off quirk after all. My TITAN cannot get any work for half a day now. It asks, and the server says it gets nothing. No explanation why though.
Dublin, California
Team: SETI.USA

Claggy

Joined: 29 Dec 06
Posts: 78
Credit: 4,040,969
RAC: 0
Message 113281 - Posted: 28 Jul 2014, 7:28:53 UTC - in response to Message 113280.  

Edit: Never mind. Half an hour later, it got some work. Odd. But no longer an issue.


Maybe not a one-off quirk after all. My TITAN cannot get any work for half a day now. It asks, and the server says it gets nothing. No explanation why though.

If you look at the task list, you'll see the reason why: all its tasks were aborted via the GUI yesterday. It now has a small allowance for a few of the app versions, and it has managed to pick up 5 tasks today.

Claggy



This material is based upon work supported by the National Science Foundation (NSF) under Grant PHY-0555655 and by the Max Planck Gesellschaft (MPG). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the investigators and do not necessarily reflect the views of the NSF or the MPG.

Copyright © 2024 Bruce Allen for the LIGO Scientific Collaboration