Error -5 and 2021 on all GPU workunits |
Author | Message |
---|---|
Mikie Tim T Send message Joined: 22 Jan 05 Posts: 99 Credit: 17,235 RAC: 0 |
I am getting the following on all of the GPU workunits downloaded so far: Actually, this is a 1GB card, so it greatly exceeds the 512MB requirement for the application. Is there any other reason that this error could occur? |
Oliver Behnke Volunteer moderator Project administrator Project developer Send message Joined: 4 Sep 07 Posts: 130 Credit: 8,545,955 RAC: 0 |
Well, certainly, but it's hard to tell since OpenCL doesn't yet provide the means to query the memory status of a given GPU. Do you run any other GPU tasks in parallel? Do you use that GPU in a multi-head setup? Maybe some GPU resources didn't get freed by something that ran earlier, like a game (unlikely but possible)... If you're running Windows and your GPU isn't too new, you can check what's going on using this free tool: http://www.techpowerup.com/gpuz It'd be interesting for us to see whether our app is indeed the culprit. Just check the available memory before you start the app, then start it and check again. Best, Oliver |
Doktor z CNT Send message Joined: 22 Nov 11 Posts: 3 Credit: 1,245 RAC: 0 |
Dear Oliver, I have the same problem as others here. All tasks finish with "Error while computing", with errors -5 and 2021. I have Windows XP 32-bit, BOINC 7.0.2, Catalyst 11.12 (the all-in-one package with OpenCL 1.1 and SDK 2.5) on a C2D with 2x6GB RAM and an HD5870 with more than 950 MB of free memory. As you suggested here, I stopped the other GPU projects and watched GPU-Z while the Albert WU was processing. GPU usage is 0% before the start, and once the unit is being computed the GPU still isn't used (0%). The task crashes after about 30 seconds. More details are probably here: http://albert.phys.uwm.edu/results.php?userid=333329 This is from the BOINC Manager log (for example): 14.12.2011 13:53:28 | Albert@Home | Starting task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 using einsteinbinary_BRP4 version 119 (atiOpenCL) 14.12.2011 13:53:46 | Albert@Home | Computation for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 finished 14.12.2011 13:53:46 | Albert@Home | Output file p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1_0 for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 absent 14.12.2011 13:53:46 | Albert@Home | Output file p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1_1 for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 absent 14.12.2011 13:53:46 | Albert@Home | Output file p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1_2 for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 absent 14.12.2011 13:53:46 | Albert@Home | Output file p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1_3 for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 absent 14.12.2011 13:53:46 | Albert@Home | Output file p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1_4 for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 absent 14.12.2011 13:53:46 | Albert@Home | Output file p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1_5 for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 absent 14.12.2011 13:53:46 | Albert@Home | Output file p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1_6 for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 absent 14.12.2011 13:53:46 | Albert@Home | Output file p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1_7 for task p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 absent Is there anything else I can do to get the Albert app running? Doktor z CNT |
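The timestamps in the log above show how long the task actually ran before it failed. The following is a minimal Python sketch, assuming the `DD.MM.YYYY HH:MM:SS` timestamp layout seen in this excerpt (BOINC prints dates according to the system locale, so other machines will differ). `elapsed_seconds` is a hypothetical helper for illustration, not part of BOINC.

```python
from datetime import datetime

# Timestamp layout taken from the log excerpt above; it is locale-dependent.
STAMP_FORMAT = "%d.%m.%Y %H:%M:%S"


def elapsed_seconds(start_line: str, end_line: str) -> float:
    """Return the seconds between two BOINC log lines (hypothetical helper)."""
    def stamp(line: str) -> datetime:
        # The timestamp is the text before the first " | " separator.
        return datetime.strptime(line.split(" | ", 1)[0].strip(), STAMP_FORMAT)

    return (stamp(end_line) - stamp(start_line)).total_seconds()


start = ("14.12.2011 13:53:28 | Albert@Home | Starting task "
         "p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1")
end = ("14.12.2011 13:53:46 | Albert@Home | Computation for task "
       "p2030.20100913.G44.54-00.26.S.b3s0g0.00000_848_1 finished")

print(elapsed_seconds(start, end))  # 18.0
```

For the two lines quoted in the post this gives 18 seconds, somewhat less than the "about 30 seconds" estimated by eye.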
TRuEQ & TuVaLu Send message Joined: 11 Sep 06 Posts: 75 Credit: 615,315 RAC: 0 |
Dear Oliver, Have you tried downgrading the driver to 11.9? |
Doktor z CNT Send message Joined: 22 Nov 11 Posts: 3 Credit: 1,245 RAC: 0 |
Thank you for answering. I'll try it and hope for success. |
Doktor z CNT Send message Joined: 22 Nov 11 Posts: 3 Credit: 1,245 RAC: 0 |
Still no progress. Does anybody have any other advice? |
Oliver Behnke Volunteer moderator Project administrator Project developer Send message Joined: 4 Sep 07 Posts: 130 Credit: 8,545,955 RAC: 0 |
Dear Oliver, Thanks for checking. I'll have a look! Cheers, Oliver (back from holiday) |
Mikie Tim T Send message Joined: 22 Jan 05 Posts: 99 Credit: 17,235 RAC: 0 |
Oliver, I have cleaned all drivers, the SDK, etc. off my machine and done a clean install of the 11.12 drivers from AMD's website. I also updated BOINC to 7.0.8 as the Notices recommended. I suspended the other GPU project, exited and restarted the BOINC client, and took screenshots with GPU-Z as you recommended. At no point did memory usage exceed even 100MB, but the task just terminated after exactly 1 minute of processing. Below are the screenshots as well as the pertinent error logging: Stderr output <core_client_version>7.0.8</core_client_version> <![CDATA[ <message> The specified transform does not match the bitmap's color space. (0x7e5) - exit code 2021 (0x7e5) </message> <stderr_txt> Activated exception handling... [21:37:50][4244][INFO ] Starting data processing... [21:37:50][4244][INFO ] Using OpenCL platform provided by: Advanced Micro Devices, Inc. [21:37:50][4244][INFO ] Using OpenCL device "ATI RV730" by: Advanced Micro Devices, Inc. [21:37:51][4244][WARN ] Kernel "kernelTimeSeriesModulation" exceeds device-specific maximum work group size (requested: 256)! ------> Reducing kernel's work group size to allowed maximum of: 128 work items [21:37:51][4244][WARN ] Kernel "kernelTimeSeriesResampling" exceeds device-specific maximum work group size (requested: 256)! ------> Reducing kernel's work group size to allowed maximum of: 128 work items [21:37:51][4244][WARN ] Kernel "kernelTimeSeriesMeanReduction" exceeds device-specific maximum work group size (requested: 256)! ------> Reducing kernel's work group size to allowed maximum of: 32 work items [21:37:51][4244][WARN ] Kernel "kernelTimeSeriesPadding" exceeds device-specific maximum work group size (requested: 256)! ------> Reducing kernel's work group size to allowed maximum of: 128 work items [21:37:51][4244][WARN ] Kernel "kernelPowerSpectrum" exceeds device-specific maximum work group size (requested: 256)!
------> Reducing kernel's work group size to allowed maximum of: 32 work items [21:37:51][4244][WARN ] Kernel "kernelHarmonicSumming" exceeds device-specific maximum work group size (requested: 256)! ------> Reducing kernel's work group size to allowed maximum of: 32 work items [21:37:51][4244][WARN ] Kernel "kernelFillFloatBuffer" exceeds device-specific maximum work group size (requested: 256)! ------> Reducing kernel's work group size to allowed maximum of: 128 work items [21:37:52][4244][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory). ------> Starting from scratch... [21:37:52][4244][INFO ] Header contents: ------> Original WAPP file: ./p2030.20111110.G39.19-00.79.N.b0s0g0.00100_DM417.60 ------> Sample time in microseconds: 65.4762 ------> Observation time in seconds: 274.62705 ------> Time stamp (MJD): 55875.841895339749 ------> Number of samples/record: 0 ------> Center freq in MHz: 1214.289551 ------> Channel band in MHz: 0.33605957 ------> Number of channels/record: 960 ------> Nifs: 1 ------> RA (J2000): 190544.269901 ------> DEC (J2000): 51556.4934006 ------> Galactic l: 0 ------> Galactic b: 0 ------> Name: G39.19-00.79.N ------> Lagformat: 0 ------> Sum: 1 ------> Level: 3 ------> AZ at start: 0 ------> ZA at start: 0 ------> AST at start: 0 ------> LST at start: 0 ------> Project ID: -- ------> Observers: -- ------> File size (bytes): 0 ------> Data size (bytes): 0 ------> Number of samples: 4194304 ------> Trial dispersion measure: 417.6 cm^-3 pc ------> Scale factor: 0.105215 [21:38:03][4244][INFO ] Seed for random number generator is -1007215342. 
[21:38:42][4244][INFO ] Derived global search parameters: ------> f_A probability = 0.08 ------> single bin prob(P_noise > P_thr) = 9.93986e-009 ------> thr1 = 18.4267 ------> thr2 = 21.5421 ------> thr4 = 26.5915 ------> thr8 = 35.0049 ------> thr16 = 49.3672 [21:38:43][4244][ERROR] Error during OpenCL FFT setup (error: -5) [21:38:43][4244][ERROR] Demodulation failed (error: 2021)! 21:38:43 (4244): called boinc_finish </stderr_txt> ]]> |
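The two numbers in the stderr output above come from different layers. In the Khronos OpenCL headers, status -5 is `CL_OUT_OF_RESOURCES`, while 2021 (0x7e5) is the exit code returned by the application. The "bitmap's color space" text appears to be only the Windows system message that happens to share the number 2021, not a description of the real failure. That reading is an assumption and is not confirmed anywhere in this thread. A minimal Python sketch with a partial lookup table follows (`describe_cl_error` is a hypothetical helper).

```python
# Partial table of OpenCL status codes, as defined in the Khronos cl.h header.
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe_cl_error(code: int) -> str:
    """Map an OpenCL status code to its symbolic name (hypothetical helper)."""
    return CL_ERRORS.get(code, f"unknown OpenCL status {code}")


print(describe_cl_error(-5))  # CL_OUT_OF_RESOURCES
print(hex(2021))              # 0x7e5, the exit code shown in the <message> block
```

`CL_OUT_OF_RESOURCES` is a generic failure to allocate device resources, which fits an FFT setup that cannot get the buffers it needs, but it does not by itself say which resource ran out.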
Oliver Behnke Volunteer moderator Project administrator Project developer Send message Joined: 4 Sep 07 Posts: 130 Credit: 8,545,955 RAC: 0 |
Thanks for the update! I didn't find the time to investigate this particular problem as I had to focus on BOINC internals (e.g. OpenCL device scheduling) over the past weeks. I'll dig deeper into this as soon as I can. FYI, it's hard to trust those numbers as the ATI driver still has a number of known issues in that area. Stay tuned, Oliver |
Mikie Tim T Send message Joined: 22 Jan 05 Posts: 99 Credit: 17,235 RAC: 0 |
Well, 12.1 just came out, so I can save a System Restore point and attempt to upgrade from 11.12 if that would help resolve any issues. |
Mikie Tim T Send message Joined: 22 Jan 05 Posts: 99 Credit: 17,235 RAC: 0 |
Upgraded BOINC to 7.0.14 and Catalyst 12.1, but get a similar error: http://albert.phys.uwm.edu/result.php?resultid=114508 |
Barraud Denis Send message Joined: 5 Feb 12 Posts: 1 Credit: 56,894 RAC: 0 |
same problem / Binary Radio Pulsar Search v1.20 (atiOpenCL) http://albert.phys.uwm.edu/result.php?resultid=115783 http://albert.phys.uwm.edu/result.php?resultid=115738 Boinc 7.0.14 (x64) & Catalyst 12.1 AMD ATI Radeon HD 4700/4800 (RV740/RV770) (512MB) driver: 1.4.1664 |
pragmatic prancing periodic problem child, left Send message Joined: 26 Jan 05 Posts: 1639 Credit: 70,000 RAC: 0 |
AMD ATI Radeon HD 4700/4800 (RV740/RV770) (512MB) driver: 1.4.1664 The ATIOpenCL application needs at least 490MB of free memory on the video card. At startup, BOINC states how much memory the card has and how much of it is detected as free. If the free value is under 490MB, tasks will fail because the FFT setup cannot continue. (source post by Oliver). Jord. BOINC FAQ Service They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake. |
Mikie Tim T Send message Joined: 22 Jan 05 Posts: 99 Credit: 17,235 RAC: 0 |
OK, but that doesn't seem to be my problem according to the BOINC Messages tab: 2/3/2012 10:33:22 PM||ATI GPU 0: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1664, 1024MB, 992MB available, 1024 GFLOPS peak) 2/3/2012 10:33:22 PM||OpenCL: ATI GPU 0: ATI RV730 (driver version CAL 1.4.1664, device version OpenCL 1.0 AMD-APP (851.4), 512MB, 992MB available) 2/3/2012 10:33:22 PM||ATI GPU 0 is OpenCL-capable Are there any other reasons that this message could come about? |
pragmatic prancing periodic problem child, left Send message Joined: 26 Jan 05 Posts: 1639 Credit: 70,000 RAC: 0 |
Are there any other reasons that this message could come about? Try a reboot of the system. It's possible something else is stuck in the GPU's memory. Only a full power cycle can fix that. BOINC only checks how much memory is available at start-up; it can't do that at any time afterwards. I'm not sure if the science app can do it either. Oliver? You do seem to be hitting the driver bug where BOINC shows half the memory of what's available. Apparently you have a 1024MB GPU, but it shows only 512MB with 992MB available. That's a bug in the drivers. There is nothing we can do to fix that; it's something ATI should fix. Jord. BOINC FAQ Service They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake. |
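The two checks discussed here (enough free memory, and a total that is smaller than the free amount) can be read straight off the BOINC startup messages. This is a minimal Python sketch, assuming the `<total>MB, <free>MB available` wording of the 7.0.x lines quoted above; the 490MB threshold is the figure given in this thread, and `check_gpu_line` is a hypothetical helper.

```python
import re

REQUIRED_FREE_MB = 490  # threshold quoted in this thread for the atiOpenCL app

# Matches e.g. "512MB, 992MB available" in a BOINC startup message.
MEMORY_PATTERN = re.compile(r"(\d+)MB, (\d+)MB available")


def check_gpu_line(line: str):
    """Return (total_mb, available_mb, enough, consistent), or None if no match."""
    match = MEMORY_PATTERN.search(line)
    if match is None:
        return None
    total_mb, available_mb = int(match.group(1)), int(match.group(2))
    enough = available_mb >= REQUIRED_FREE_MB
    # More free memory than total memory points at a misreported total.
    consistent = available_mb <= total_mb
    return total_mb, available_mb, enough, consistent


cal_line = ("ATI GPU 0: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1664, "
            "1024MB, 992MB available, 1024 GFLOPS peak)")
opencl_line = ("OpenCL: ATI GPU 0: ATI RV730 (driver version CAL 1.4.1664, "
               "device version OpenCL 1.0 AMD-APP (851.4), 512MB, 992MB available)")

print(check_gpu_line(cal_line))     # (1024, 992, True, True)
print(check_gpu_line(opencl_line))  # (512, 992, True, False)
```

Both lines pass the 490MB check, but the OpenCL line is flagged as inconsistent, which matches the halved total described in the post.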
Mikie Tim T Send message Joined: 22 Jan 05 Posts: 99 Credit: 17,235 RAC: 0 |
I tried running a 1.21 workunit as well to see if anything was different, but to no avail. http://albert.phys.uwm.edu/result.php?resultid=115623 |
Mikie Tim T Send message Joined: 22 Jan 05 Posts: 99 Credit: 17,235 RAC: 0 |
I tried rebooting as well with all of the projects on this machine paused. I then resumed Albert, so nothing else had a chance to take up memory beforehand. I got the same results, so apparently that wasn't it. Is there anything else that I could possibly try to get it rolling? |
terencewee* Send message Joined: 2 Feb 12 Posts: 5 Credit: 4,500 RAC: 0 |
@Mikie: Some things to check to get BRP-OpenCL working for you: 0] Select all A@H tasks and suspend them. 1] Stop BOINC and don't run anything else. 2] Check your environment variable TEMP (TMP) and clean up that folder (sort by date, delete old files, etc.). 3] CHKDSK your HDD. 4] Turn off Aero. 5] Restart and run BOINC. 6] Click on an A@H task that wasn't running previously and resume it (to ensure it's a fresh slot). I'm running Cat-12.1, a 5850 (standard, 1GB), Win7-64, with a free core dedicated to the running A@H task. Hope it works for you. -- terencewee* Sic itur ad astra. |
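Step 2 above (looking through the TEMP folder by date) can be sketched in Python. The example only lists candidates and deletes nothing. The 30-day cutoff and the name `old_temp_files` are assumptions for illustration, and whether stale temp files actually affect the OpenCL application is not established in this thread.

```python
import os
import tempfile
import time
from pathlib import Path


def old_temp_files(folder: str, older_than_days: float = 30.0):
    """List (path, age_in_days) for files older than the cutoff, oldest first."""
    now = time.time()
    found = []
    for entry in Path(folder).iterdir():
        try:
            if not entry.is_file():
                continue
            age_days = (now - entry.stat().st_mtime) / 86400
        except OSError:
            continue  # the file vanished or is not accessible
        if age_days > older_than_days:
            found.append((entry, age_days))
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # TEMP and TMP are the variables named in the post; fall back to Python's default.
    temp_dir = os.environ.get("TEMP") or os.environ.get("TMP") or tempfile.gettempdir()
    for path, age in old_temp_files(temp_dir):
        print(f"{age:6.0f} days  {path}")
```

Reviewing the list before deleting anything by hand keeps the cleanup reversible, since files in TEMP may still be in use by running programs.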