Opened 12 years ago
Closed 12 years ago
#164 closed defect (fixed)
Segfault with reading large reconstructed array
| Reported by: | MatthewWhiting | Owned by: | MatthewWhiting |
|---|---|---|---|
| Priority: | normal | Milestone: | Release-1.2.2 |
| Component: | Wavelet reconstruction | Version: | 1.2 |
| Severity: | normal | Keywords: | |
| Cc: | | | |
Description
From BiQing For:
Hi, Matt,

I am working on the big data cube that I sent to you as test cube with the new Duchamp. When I switch on the "flagReconExists" flag and use the created recon file from previous run, it gives me segmentation fault. See the below parameters. Using this option should shorten the processing time (ie, without recreating the reconstruction cube) and get into the source finding with new threshold value.

    WARNING <Reading parameters> : Changing minVoxels to 14 given minPix=10 and minChannels=5
    Opening image: ms_p.fits
    Dimensions of FITS file: 3835x6074x160x1
    Reading data ... About to allocate 27.8118GB of which 13.8842GB is for the image
    Done. Data array has dimensions: 3835x6074x160
    Opened successfully.
    Reading reconstructed array: Segmentation fault

Thanks,
BiQing

    imageFile ms_p.fits
    #flagSubsection true
    #Subsection [*,*,*]
    flaglog true
    logFile duchamp-Logfile_MSp_th012_gr005.txt
    outFile duchamp-Results_MSp_th012_gr005.txt
    flagOutputMomentMap false
    fileOutputMomentMap duchamp_moment_MSp.fits
    flagPlotSpectra true
    spectraFile duchamp-Spectra_MSp_th012_gr005.ps
    flagOutputMask true
    fileOutputMask MSp_th012_gr005.MASK.fits
    flagMaskWithObjectNum true
    flagKarma true
    karmaFile duchamp-Results_MSp_th012_gr005.ann
    precFlux 3
    precVel 3
    precSNR 2
    flagTrim false
    flagMW false
    #minMW 168
    #maxMW 397
    flagReconExists true
    reconFile MSp_ATCA.recon.fits
    flagOutputRecon false
    fileOutputRecon MSp_ATCA.recon.fits
    flagBaseline false
    flagRobustStats 1
    flagNegative 0
    threshold 0.12
    flagGrowth 1
    growthThreshold 0.05
    flagATrous true
    reconDim 3
    scaleMin 3
    snrRecon 2.
    filterCode 1
    flagSmooth false
    smoothType spatial
    hanningWidth 4
    kernMaj 5.
    kernMin 1.
    kernPA 0.
    flagFDR false
    flagAdjacent true
    threshSpatial 3
    threshVelocity 7
    flagRejectBeforeMerge true
    flagTwoStageMerging true
    minChannels 5
    minPix 10
    verbose 1
    drawBorders 1
    drawBlankEdges 1
    spectralMethod peak
    spectralUnits km/s
    pixelCentre centroid
    sortingParam vel
Change History (4)
comment:1 Changed 12 years ago by
comment:2 Changed 12 years ago by
Actually, if this was a general problem, why can we read the original image?
The difference is that the original input cube is read via fits_read_subset_flt, while the recon array is read with fits_read_pix.
Since rebuilding the cfitsio library doesn't seem feasible (the long is hard-coded in for that function), perhaps re-writing the reading function to use the same procedure is the way to go.
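To make the proposal concrete, here is a minimal C++ sketch of the corner and increment arrays that a whole-array subset read would need. `SubsetBounds`, `fullArrayBounds` and `subsetSize` are hypothetical names for illustration, not Duchamp code. The cfitsio call in the trailing comment is an assumption about how the read would be issued, with the argument order as recalled from the cfitsio documentation; check it against the installed version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Corner and stride arrays in the form a cfitsio subset read takes.
// (Hypothetical helper type, not part of Duchamp or cfitsio.)
struct SubsetBounds {
    std::vector<long> blc;  // first pixel on each axis (FITS pixels are 1-based)
    std::vector<long> trc;  // last pixel on each axis (inclusive)
    std::vector<long> inc;  // sampling increment on each axis
};

// Bounds that select every pixel of an array with the given axis lengths.
SubsetBounds fullArrayBounds(const std::vector<long>& naxes) {
    SubsetBounds b;
    for (long n : naxes) {
        assert(n > 0);
        b.blc.push_back(1);
        b.trc.push_back(n);
        b.inc.push_back(1);
    }
    return b;
}

// Number of pixels the subset selects, accumulated in 64 bits so the
// count for a cube of this size cannot wrap.
std::int64_t subsetSize(const SubsetBounds& b) {
    std::int64_t total = 1;
    for (std::size_t i = 0; i < b.blc.size(); ++i) {
        total *= (b.trc[i] - b.blc[i]) / b.inc[i] + 1;
    }
    return total;
}

// Assumed usage once a file is open with cfitsio (not compiled here):
//   fits_read_subset_flt(fptr, 0, naxis, naxes, b.blc.data(), b.trc.data(),
//                        b.inc.data(), nulval, array, &anynul, &status);
```

For the 3835x6074x140 test array, `fullArrayBounds` gives corners (1,1,1) and (3835,6074,140) with unit increments, and `subsetSize` returns the full pixel count.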
comment:3 Changed 12 years ago by
| Milestone: | → Release-1.2.2 |
|---|---|
comment:4 Changed 12 years ago by
| Resolution: | → fixed |
|---|---|
| Status: | new → closed |
Have implemented the change to the new class structure described in #166, and made use of the alternative cfitsio reading function fits_read_subset_flt as described above.
It seems that later versions of cfitsio get around this problem, but the new function will allow older versions to work fine as well.
Closing ticket, as this seems to be fixed.

BiQing is using 1.2.
I was able to reproduce the problem using the large cube ms_n.fits that she sent me earlier. In order to do so, I had to fake up a saved reconstructed array by copying the input (and using CASA to remove the degenerate Stokes axis). Reading the full array (of size 3835x6074x140) generates the segfault.
Using gdb isolates the problem to the cfitsio library:
    Starting program: /work/whi550/Duchamp-working/Duchamp-1.2.1 -p biqing-duchamp.in
    WARNING <Reading parameters> : Changing minVoxels to 14 given minPix=10 and minChannels=5
    Opening image: ms_n.fits
    Dimensions of FITS file: 3835x6074x140x1
    Reading data ... About to allocate 24.3407GB of which 12.1487GB is for the image
    Done. Data array has dimensions: 3835x6074x140
    Opened successfully.
    Reading reconstructed array:
    Program received signal SIGSEGV, Segmentation fault.
    memcpy () at ../sysdeps/x86_64/memcpy.S:392
    392 ../sysdeps/x86_64/memcpy.S: No such file or directory.
    in ../sysdeps/x86_64/memcpy.S
    Current language: auto
    The current source language is "auto; currently asm".
    (gdb) bt
    #0 memcpy () at ../sysdeps/x86_64/memcpy.S:392
    #1 0x0000000000619324 in ffgbyt (fptr=0x9df050, nbytes=-4135346784, buffer=<value optimized out>, status=0x7fffffffcf28) at buffers.c:346
    #2 0x000000000061956c in ffgr4b (fptr=0x9df050, byteloc=<value optimized out>, nvals=-1033836696, incre=<value optimized out>, values=0x7ff9e0363010, status=0x7fffffffcf28) at buffers.c:1010
    #3 0x0000000000584d91 in ffgcle (fptr=0x9df050, colnum=<value optimized out>, firstrow=<value optimized out>, firstelem=<value optimized out>, nelem=3261130600, elemincre=1, nultyp=1, nulval=<value optimized out>, array=0x7ff9e0363010, nularray=0x7fffffffc75f "", anynul=0x7fffffffcf18, status=0x7fffffffcf28) at getcole.c:853
    #4 0x000000000057b2f3 in ffgpxvll (fptr=0x9df050, datatype=0, firstpix=0x7fffffffc7d0, nelem=3261130600, nulval=<value optimized out>, array=0x7ff9e0363010, anynul=0x7fffffffcf18, status=0x7fffffffcf28) at getcol.c:220
    #5 0x000000000057b57d in ffgpxv (fptr=0x9df050, datatype=42, firstpix=0x9df9a0, nelem=<value optimized out>, nulval=<value optimized out>, array=<value optimized out>, anynul=0x7fffffffcf18, status=0x7fffffffcf28) at getcol.c:43
    #6 0x000000000047ef33 in duchamp::Cube::readReconCube (this=0x9dcf90) at src/Cubes/readRecon.cc:209
    #7 0x0000000000469658 in duchamp::Cube::readSavedArrays (this=0x9dcf90) at src/Cubes/cubes_extended.cc:100
    #8 0x000000000040c6e8 in main (argc=<value optimized out>, argv=0x7fffffffd9b8) at src/mainDuchamp.cc:80

Notice that the number of elements goes negative for ffgr4b - this is because it requires it to be long, rather than LONGLONG. Need 64-bit build of cfitsio?
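
The negative counts in the backtrace are what a 32-bit signed wraparound of the true pixel count would produce. The C++ sketch below reproduces that arithmetic. The function names are hypothetical, and the 32-bit width is an assumption inferred from the numbers in the backtrace rather than from the cfitsio source.

```cpp
#include <cassert>
#include <cstdint>

// Total pixel count of an nx * ny * nz cube, computed in 64 bits.
std::int64_t totalPixels(std::int64_t nx, std::int64_t ny, std::int64_t nz) {
    return nx * ny * nz;
}

// Value obtained when a 64-bit count is squeezed into a 32-bit signed
// integer (two's-complement wraparound). Done arithmetically so the
// result does not rely on implementation-defined conversions.
std::int32_t wrapToInt32(std::int64_t value) {
    const std::uint64_t low32 = static_cast<std::uint64_t>(value) & 0xFFFFFFFFULL;
    std::int64_t wrapped = static_cast<std::int64_t>(low32);
    if (wrapped >= 2147483648LL) {
        wrapped -= 4294967296LL;
    }
    return static_cast<std::int32_t>(wrapped);
}

// Compare the arithmetic against the values printed in the backtrace.
void checkAgainstBacktrace() {
    const std::int64_t nelem = totalPixels(3835, 6074, 140);
    assert(nelem == 3261130600LL);                       // nelem in ffgcle
    const std::int32_t nvals = wrapToInt32(nelem);
    assert(nvals == -1033836696);                        // nvals in ffgr4b
    assert(4LL * nvals == -4135346784LL);                // nbytes in ffgbyt
}
```

The element count 3835 x 6074 x 140 = 3261130600 exceeds the 32-bit signed maximum of 2147483647. Wrapping it gives the `nvals` seen in `ffgr4b`, and multiplying that by 4 bytes per float gives the `nbytes` passed on to `ffgbyt`.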