New App S6LV1
Author | Message |
---|---|
Bernd Machenschalk Volunteer moderator Project administrator Project developer Send message Joined: 15 Oct 04 Posts: 1956 Credit: 6,218,130 RAC: 0 |
Along with the ongoing test related to OpenCL we will soon begin to test the setup for the next Einstein@Home Gravitational Wave search "S6LV1" (S6 data with "LineVeto", run #1). For now this will be a pure CPU App. |
robertmiles Send message Joined: 16 Nov 11 Posts: 19 Credit: 4,468,368 RAC: 0 |
So far, both of the workunits I've had gave a computation error near the end of the predicted runtime. Server status suggests that no one else has completed one of those workunits successfully, either. |
Bernd Machenschalk Volunteer moderator Project administrator Project developer Send message Joined: 15 Oct 04 Posts: 1956 Credit: 6,218,130 RAC: 0 |
Thanks. Apparently the checkpointing is broken, i.e. the "toplist" structure is broken after the App resumes from a checkpoint, and the next "insert" then crashes the App. Until this is fixed I have suspended sending out more S6LV1 work. If you want to finish the tasks already out there, avoid an App restart (set "leave App in Memory while suspended" to "yes", and avoid quitting BOINC or shutting down your computer). BM |
robertmiles Send message Joined: 16 Nov 11 Posts: 19 Credit: 4,468,368 RAC: 0 |
I'll have to disable S6LV1 on one of my computers, then. A certain part of its backup software needs to run every night with BOINC not running. Upgrading to BOINC 7.0.2 might have fixed the problem requiring frequent reboots on a second computer. It will probably take a few more days to tell. For now, I'll let it keep trying to get S6LV1 workunits. |
Gaurav Khanna Send message Joined: 8 Nov 04 Posts: 12 Credit: 2,818,895 RAC: 0 |
Hmm. All the S6 LineVeto work units are crashing immediately for me: http://albert.phys.uwm.edu/result.php?resultid=65949 http://albert.phys.uwm.edu/result.php?resultid=65791 |
Bernd Machenschalk Volunteer moderator Project administrator Project developer Send message Joined: 15 Oct 04 Posts: 1956 Credit: 6,218,130 RAC: 0 |
Hi Gaurav! Could you stop BOINC and send me an init_data.xml file from a slot directory (e.g. by email, as a plain file; don't just copy & paste the text)? BM |
Bernd Machenschalk Volunteer moderator Project administrator Project developer Send message Joined: 15 Oct 04 Posts: 1956 Credit: 6,218,130 RAC: 0 |
Checkpointing should work in the Apps version 1.01 published minutes ago. Testing of S6LV1 resumed. BM |
robertmiles Send message Joined: 16 Nov 11 Posts: 19 Credit: 4,468,368 RAC: 0 |
I'll have to disable S6LV1 on one of my computers, then. A certain part of its backup software needs to run every night with BOINC not running. I've now found that BOINC 7.0.2 has made the problem requiring frequent reboots on the second computer less frequent, but has not fully eliminated it. S6LV1 is still enabled there. I've found a way to reduce the problem on the first computer to a few minutes every 24 hours without BOINC running and without it in memory, without rebooting Windows, but it requires staying up until 1 AM to start the backups for that computer manually. I'll check if that is good enough for the newest S6LV1. |
Bernd Machenschalk Volunteer moderator Project administrator Project developer Send message Joined: 15 Oct 04 Posts: 1956 Credit: 6,218,130 RAC: 0 |
Right now it's more important for us to learn whether the App now checkpoints and resumes correctly than to get completed results. You don't need to stay awake until 1 AM to find this out. Just stop BOINC (after the task has been running for >5 min) and start it again. If the App resumes without crashing, it will continue to do so even after a reboot or whatever else may happen. If it crashes again when resuming, it's better to know that early than to waste more computing time. BM |
robertmiles Send message Joined: 16 Nov 11 Posts: 19 Credit: 4,468,368 RAC: 0 |
OK, I'll try that when I get another S6LV1 workunit if there's no other workunit from another BOINC project with an especially long CPU time since the last checkpoint. Currently, the XXL workunits from RNA World are about the worst for long times between checkpoints. |
robertmiles Send message Joined: 16 Nov 11 Posts: 19 Credit: 4,468,368 RAC: 0 |
Gravitational Wave S6 LineVeto serch 1.01 (SSE2) h1_0052.00_S6GC1__50_S6LV1A Appeared to resume from checkpoint properly after I shut down BOINC for a minute (not left in memory). Still running now. Looks like time for a check of whether this had an effect on getting the right answers. You might want to check the spelling of "serch", though. |
Bernd Machenschalk Volunteer moderator Project administrator Project developer Send message Joined: 15 Oct 04 Posts: 1956 Credit: 6,218,130 RAC: 0 |
FWIW we just got our first pair of results for WU #22235. Both matched and were found valid. None of these tasks was interrupted and resumed from checkpoint, though. BM |
pragmatic prancing periodic problem child, left Send message Joined: 26 Jan 05 Posts: 1639 Credit: 70,000 RAC: 0 |
May I point out a small typo, both in the apps page and the name of the application when showing in BOINC Manager? It says "Gravitational Wave S6 LineVeto serch" and "Gravitational Wave S6 LineVeto serch 1.01 (SSE2)". "Search" is misspelled. Am now running two of these beasts. Hopefully their <rsc_fpops_est> will be adjusted at some point? As they sure don't run for the 6 hours and 45 minutes that they're estimated at. It's been almost 2 hours and progress is only at 7%. Jord. BOINC FAQ Service They say most of your brain shuts down in cryo-sleep. All but the primitive side, the animal side. No wonder I'm still awake. |
tullio Send message Joined: 22 Jan 05 Posts: 796 Credit: 137,342 RAC: 0 |
After 1 hour it is at 3.550%. But it does not run in high priority, unlike the binary pulsar search. Tullio |
Bernd Machenschalk Volunteer moderator Project administrator Project developer Send message Joined: 15 Oct 04 Posts: 1956 Credit: 6,218,130 RAC: 0 |
"May I point out a small typo, both in the apps page and the name of the application when showing in BOINC Manager?" Thanks. Fixed in the DB; I don't know whether or when this propagates to the Client and then the Manager. "Hopefully their <rsc_fpops_est> will be adjusted at some point?" Actually we hope to get the App to live up to the speed / runtime we designed the workunits for. An important optimization that is in the S6Bucket App still doesn't work with code changes we had to make for S6LV1. We're working on that. The new server & client code should be able to adjust the runtime estimates with time, though. BM |
pragmatic prancing periodic problem child, left Send message Joined: 26 Jan 05 Posts: 1639 Credit: 70,000 RAC: 0 |
"Actually we hope to get the App to live up to the speed / runtime we designed the workunits for. An important optimization that is in the S6Bucket App still doesn't work with code changes we had to make for S6LV1. We're working on that. The new server & client code should be able to adjust the runtime estimates with time, though." OK, that's fair. In the meantime, it sped up a little: 20.562% for the one at 5h 20m 35s and 16.863% for the other at 4h 41m 38s. Hopefully they survive the trip, as they have been suspended and resumed multiple times now. Jord. |
tullio Send message Joined: 22 Jan 05 Posts: 796 Credit: 137,342 RAC: 0 |
Mine is now at 40.532% after 12:30:42 hours and running OK. Tullio |
Bernd Machenschalk Volunteer moderator Project administrator Project developer Send message Joined: 15 Oct 04 Posts: 1956 Credit: 6,218,130 RAC: 0 |
"Hopefully they survive the trip as they have been suspended and resumed multiple times now." The previous error would make the app crash soon after resuming from a checkpoint. If this task was successfully resumed multiple times, there is nothing to worry about. BM |
pragmatic prancing periodic problem child, left Send message Joined: 26 Jan 05 Posts: 1639 Credit: 70,000 RAC: 0 |
The first one ended in 77K seconds run time, 70K CPU time. http://albert.phys.uwm.edu/result.php?resultid=72454 Jord. |
Neil Polson Send message Joined: 17 Dec 05 Posts: 3 Credit: 1,011 RAC: 0 |
Is there any reason why all the S6 tasks have been cancelled? EDIT: Just noticed you've released a new app! Problem with validating? |
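
For anyone wanting to compare the progress figures posted above with the 6 h 45 min estimate, here is a rough sketch of the arithmetic. It assumes progress grows linearly with elapsed time, which the thread suggests is only approximately true (the tasks sped up later on). The function and variable names are made up for illustration and are not part of BOINC or the S6LV1 App.

```python
def projected_runtime_seconds(elapsed_seconds: float, progress_percent: float) -> float:
    """Linearly extrapolate the total runtime from elapsed time and percent done.

    Assumption: the task progresses at a constant rate, which is only a
    rough approximation for real workunits.
    """
    if not 0 < progress_percent <= 100:
        raise ValueError("progress_percent must be in (0, 100]")
    return elapsed_seconds * 100.0 / progress_percent


def hms_to_seconds(hours: int, minutes: int = 0, seconds: int = 0) -> int:
    """Convert an h:m:s reading, as shown in BOINC Manager, to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Progress reports quoted in this thread: (label, elapsed seconds, percent done).
reports = [
    ("Jord, task at 5h 20m 35s", hms_to_seconds(5, 20, 35), 20.562),
    ("Jord, task at 4h 41m 38s", hms_to_seconds(4, 41, 38), 16.863),
    ("Tullio, task at 12:30:42", hms_to_seconds(12, 30, 42), 40.532),
]

# The 6 hours 45 minutes initial estimate mentioned in the thread.
initial_estimate = hms_to_seconds(6, 45)

for label, elapsed, percent in reports:
    total = projected_runtime_seconds(elapsed, percent)
    print(f"{label}: roughly {total / 3600:.1f} h projected, "
          f"{total / initial_estimate:.1f}x the initial estimate")
```

A linear projection is only a ballpark: the first task reported as finished took 77K seconds (about 21 hours) of run time, so the rate evidently changed during the run.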