
9. How does the Einstein@Home S3 search work?

The Einstein@Home search in the LIGO S3 data set starts with the 600 'best' hours of S3 data. The most sensitive instrument operating during S3 was the LIGO Hanford Observatory 4-km detector, so we use that data set. It is divided into 60 segments, each containing ten hours of data. Because the instrument's operation is not continuous, but is often interrupted by loss of lock and other events, each of these sixty 10-hour data segments can span more than ten hours of wall-clock time. In all sixty cases, the 10 hours of science mode data were taken in less than 13 hours of wall-clock time.

Each data segment is then prepared as follows. The data are calibrated in the time domain, using a method described in [34]. This produces a time-domain data stream sampled 16384 times per second. Then 30-minute chunks of contiguous data are windowed and Fourier transformed, producing 'Short Fourier Transform' (SFT) data sets. Known line features in the instrument are removed. The end result is 2901 SFT files, each of which covers an (overlapping) band of about 0.8 Hz. Each file contains 1200 Fourier Transforms (60 ten-hour segments * 20 SFTs/ten-hour segment). The frequency range covered is from 50.0 Hz to 1500.5 Hz.
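The counts quoted above can be checked with a few lines of arithmetic. The following is only a sketch: the constant and function names are made up for illustration, and it assumes that the usable 0.5 Hz portions of the SFT files (the figure given for the downloaded files) tile the 50.0 Hz to 1500.5 Hz range without gaps.

```python
from fractions import Fraction

# Figures quoted in the text; exact fractions avoid floating-point round-off.
TOTAL_HOURS = 600                  # 'best' hours of S3 data
SEGMENT_HOURS = 10                 # science-mode data per segment
SFT_MINUTES = 30                   # length of one Short Fourier Transform
F_MIN_HZ = Fraction("50.0")
F_MAX_HZ = Fraction("1500.5")
USABLE_BAND_HZ = Fraction("0.5")   # usable width per SFT file (assumed to tile the range)


def sft_bookkeeping():
    """Return (segments, SFTs per segment, SFTs per file, number of files)."""
    n_segments = TOTAL_HOURS // SEGMENT_HOURS
    sfts_per_segment = SEGMENT_HOURS * 60 // SFT_MINUTES
    sfts_per_file = n_segments * sfts_per_segment
    n_files = (F_MAX_HZ - F_MIN_HZ) / USABLE_BAND_HZ
    return n_segments, sfts_per_segment, sfts_per_file, int(n_files)


print(sft_bookkeeping())  # (60, 20, 1200, 2901)
```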

An Einstein@Home host (for example, YOUR home PC) downloads one of the 2901 SFT files. Each of these contains 0.5 Hz of usable data, plus 'wings' on either side.

In overview, the host then steps twice through a grid of approximately 30,000 points on the sky (once for each of two different ten-hour datasets) searching for a pulsar at each of those locations, and then carries out a coincidence step designed to reduce the number of 'false positive' candidate events reported back to the server. The host does this for a 0.1 Hz wide band. Candidate events that appear in both ten-hour stretches are reported back to the Einstein@Home server. These files are compressed before they are returned, and have an average size of about 110 kB. However, in some cases the files are much larger (a few MB) or much smaller, depending upon instrumental features that appear in the data.

A somewhat more detailed explanation of the search is that at each grid point, and for each frequency in a 0.1 Hz band, the host computes the 'F Statistic' [35], which is the output of an optimal filter for a pulsar at that point in the grid for one of the ten-hour segments of data. If the noise in the instrument is Gaussian, then 2F has a $\chi^2$ distribution with four degrees of freedom [36]; the non-centrality parameter is proportional to the square of the source's gravitational wave amplitude.
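To get a feel for what this distribution implies, the sketch below evaluates the probability that pure Gaussian noise exceeds a given threshold. It assumes that the quantity following the central $\chi^2$ distribution with four degrees of freedom is 2F (the form in which the threshold is quoted in this section), and it uses the closed-form tail probability that holds for four degrees of freedom. The function name is hypothetical, and the value 25 is used simply as an example threshold.

```python
import math


def chi2_4dof_survival(x):
    """P(X > x) for a central chi-square variable X with 4 degrees of freedom.

    For 4 degrees of freedom the tail probability has the closed form
    exp(-x/2) * (1 + x/2).
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    return math.exp(-half) * (1.0 + half)


# Per-template probability that noise alone gives 2F > 25 (assumption: X = 2F).
p_false_alarm = chi2_4dof_survival(25.0)
print(f"{p_false_alarm:.1e}")  # about 5.0e-05
```

Under these assumptions a single template crosses the threshold by chance roughly five times in 100,000 trials, so a search over many sky points and frequencies will collect noise candidates that later stages have to weed out.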

The candidate sources collected by Einstein@Home are those for which the value of the F statistic satisfies 2F>25. They are 'clustered together' in frequency space and then written to a file on the host disk. The host then repeats this procedure for a DIFFERENT ten-hour segment of data. Each of these two steps takes from three to twelve hours on a modern computer. The exact time depends mainly upon the speed of the processor and memory system, and weakly upon the data itself. Then, in a third step, the candidate events found in each of the two ten-hour segments are compared, and only those which are coincident within 1 mHz in frequency and 0.02 radian in sky position are retained. On average, an Einstein@Home host needs 11.1 hours of CPU time to carry out this entire process.
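The third (coincidence) step can be sketched as follows. This is an illustration, not the Einstein@Home code: the Candidate record, the function names, and the use of right ascension and declination are assumptions, and 'sky position within 0.02 radian' is interpreted here as great-circle angular separation. Both input lists are assumed to have already passed the 2F>25 threshold.

```python
import math
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate record; field names are illustrative only."""
    freq_hz: float
    ra_rad: float    # right ascension in radians
    dec_rad: float   # declination in radians
    two_f: float     # value of 2F


FREQ_TOL_HZ = 1e-3   # 1 mHz, from the text
SKY_TOL_RAD = 0.02   # 0.02 radian, from the text


def angular_separation(a, b):
    """Great-circle separation in radians (haversine formula)."""
    d_dec = a.dec_rad - b.dec_rad
    d_ra = a.ra_rad - b.ra_rad
    h = (math.sin(d_dec / 2.0) ** 2
         + math.cos(a.dec_rad) * math.cos(b.dec_rad) * math.sin(d_ra / 2.0) ** 2)
    return 2.0 * math.asin(math.sqrt(min(1.0, h)))


def coincident(a, b):
    """True if two candidates agree in frequency and sky position."""
    return (abs(a.freq_hz - b.freq_hz) <= FREQ_TOL_HZ
            and angular_separation(a, b) <= SKY_TOL_RAD)


def coincidence_step(first, second):
    """Return all coincident pairs (brute force over both candidate lists)."""
    return [(a, b) for a in first for b in second if coincident(a, b)]
```

The double loop is quadratic in the number of candidates; sorting by frequency first would be an obvious refinement for long candidate lists, but it is omitted here to keep the comparison rule visible.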

The resulting list of potential pulsar candidates, which have appeared with consistent parameters in two different ten-hour LIGO data stretches, is returned to the server and stored there. There are 30 (pairs of ten-hour segments) * (1500.5 Hz - 50.0 Hz) * (1 job/0.1 Hz) = 435150 separate analysis jobs run in this way.
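The job count can be reproduced exactly, and combined with the 11.1-hour average quoted above to give a rough figure for the total computing effort. This is only a back-of-the-envelope sketch: the names are illustrative, and the CPU total assumes that the average applies to every job and ignores any job that is run more than once.

```python
from fractions import Fraction

N_SEGMENT_PAIRS = 30                 # pairs of ten-hour segments
F_MIN_HZ = Fraction("50.0")
F_MAX_HZ = Fraction("1500.5")
JOB_BAND_HZ = Fraction("0.1")        # frequency band searched per job
AVG_CPU_HOURS = Fraction("11.1")     # average CPU time per job, from the text


def count_jobs():
    """Number of separate analysis jobs, using exact rational arithmetic."""
    bands = (F_MAX_HZ - F_MIN_HZ) / JOB_BAND_HZ
    if bands.denominator != 1:
        raise ValueError("frequency range is not a whole number of job bands")
    return N_SEGMENT_PAIRS * int(bands)


def total_cpu_hours():
    """Rough total CPU time, assuming every job takes the average time."""
    return count_jobs() * AVG_CPU_HOURS


print(count_jobs())              # 435150
print(float(total_cpu_hours()))  # 4830165.0
```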

The results are then post-processed. The basic idea is simple. If a source is present somewhere on the sky and is strong enough to be confidently detected by the LIGO instrument, then it should produce a large value of the F Statistic in many (or all) of the 60 different ten-hour data segments. Moreover, the values of the sky position and frequency should agree to within a precision (determined by Monte-Carlo simulation) of 0.02 radians and 1 mHz respectively.

The post processing code works by loading all 60 result files (each one covers 0.1 Hz of one 10-hour data segment) into memory and then finding 'cells' in parameter space (sky position and frequency) where a large number of 10-hour data segments all contain a value of the F statistic above the 2F=25 threshold.
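A minimal sketch of this counting idea follows. It is not the actual post-processing code: the cell sizes are assumed to equal the 1 mHz and 0.02 radian coincidence tolerances, the cells form a simple fixed rectangular grid in frequency, right ascension and declination, and the function names and the min_segments parameter are hypothetical.

```python
import math
from collections import defaultdict

FREQ_CELL_HZ = 1e-3    # assumed cell width in frequency (1 mHz)
SKY_CELL_RAD = 0.02    # assumed cell width in each sky coordinate
THRESHOLD_2F = 25.0    # threshold on 2F, from the text


def cell_index(freq_hz, ra_rad, dec_rad):
    """Map a candidate to an integer cell on a fixed rectangular grid."""
    return (math.floor(freq_hz / FREQ_CELL_HZ),
            math.floor(ra_rad / SKY_CELL_RAD),
            math.floor(dec_rad / SKY_CELL_RAD))


def count_segments_per_cell(candidates):
    """Count how many distinct data segments have a candidate in each cell.

    `candidates` is an iterable of (segment_id, freq_hz, ra_rad, dec_rad, two_f).
    """
    segments_in_cell = defaultdict(set)
    for segment_id, freq_hz, ra_rad, dec_rad, two_f in candidates:
        if two_f > THRESHOLD_2F:
            segments_in_cell[cell_index(freq_hz, ra_rad, dec_rad)].add(segment_id)
    return {cell: len(segs) for cell, segs in segments_in_cell.items()}


def significant_cells(candidates, min_segments):
    """Cells in which at least `min_segments` segments exceed the threshold."""
    counts = count_segments_per_cell(candidates)
    return {cell: n for cell, n in counts.items() if n >= min_segments}
```

A fixed grid like this can split a real source across neighbouring cells when it falls near a cell edge, and equal steps in right ascension do not correspond to equal angles on the sky away from the equator. A production analysis would need to handle both effects; the sketch leaves them out.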

Einstein@Home S3 Analysis Summary
Last Revised: 2005.09.11 16:22:17 UTC
Copyright © 2005 Bruce Allen for the LIGO Scientific Collaboration
Document version: 1.97