Opened 16 years ago
Closed 15 years ago
#67 closed defect (fixed)
Solve large file issues in miriad tasks
Reported by: | MarkWieringa | Owned by: | MarkWieringa |
---|---|---|---|
Priority: | major | Milestone: | 10. Stage 3 - available for testing |
Component: | MIRIAD - CABB branch | Version: | |
Keywords: | large file support | Cc: | |
Estimated Number of Hours: | 20 | Add Hours to Ticket: | 5 |
Billable?: | yes | Total Hours: | 0 |
Description
Issue reported by email - we may need to split this up by task
On Thu 2008/07/31 17:11:45 MST, Juergen Ott wrote in a message to: Mark Calabretta <mcalabre@atnf.csiro.au>, Phil Edwards <Philip.Edwards@csiro.au> and copied to: Adrienne Stilp <adrienne@astro.washington.edu>, Steven Warren <warren@astro.umn.edu>
Hi Juergen,

> I was just wondering what the status is of upgrading MIRIAD to work with large datasets that will be delivered by CABB. In fact, right now I am using MIRIAD to reduce/image VLA data and I find that there are many issues that have to do with handling large volumes of data. E.g., the fits task is not able to convert all data in miriad format, but stops after ~3h worth of data, flagging and plotting tasks (blflag, uvplt) do

The low-level Miriad IO routines can handle large files but unfortunately the tasks themselves are limited by 4-byte, signed Fortran INTEGER variables and that causes task-specific problems of the sort you are seeing.

This is touched on briefly in the installation notes, ftp://ftp.atnf.csiro.au/pub/software/miriad/INSTALL.html, where task fits is mentioned by name.

> not show all data, etc., and invert sometimes fails, too, because of the large datasets. I think that those things will be fixed by making MIRIAD ready for CABB data, so has there been any progress? Are there any beta versions of MIRIAD to do so?

Mark Wieringa is the one to ask.

Regards, Mark
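The limit described in the reply can be illustrated with a small sketch. It is written in Python and emulates a 4-byte signed integer with ctypes.c_int32; it is not MIRIAD code, and the helper name as_int32 is hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a 4-byte signed INTEGER can hold


def as_int32(value):
    """Return value as a 4-byte signed integer would store it (wraps on overflow)."""
    return ctypes.c_int32(value).value


# A byte offset at the limit is stored unchanged.
print(as_int32(INT32_MAX))      # 2147483647
# One byte further and the stored value wraps to a large negative number.
print(as_int32(INT32_MAX + 1))  # -2147483648
```

Any task that keeps a byte offset or record count in such a variable will misbehave once the file grows past 2 GB, even if the underlying I/O layer accepts larger offsets.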
see also: ticket:85
Change History (8)
comment:1 Changed 16 years ago by
Status: | new → assigned |
---|
comment:2 follow-up: 4 Changed 15 years ago by
comment:3 Changed 15 years ago by
- 2009-04-29
Another issue that came up was the handling of flag tables larger than 2^31 - 1 bits (about 256 MB).
This required changes to the low-level I/O code (maskio.c).
Date: Fri, 17 Apr 2009 11:38:43 +1000
From: Mark Wieringa <Mark.Wieringa@csiro.au>
...
We've just encountered an integer overflow problem in miriad. The size of the flag table is limited to ~256MB because it uses an int to calculate offsets in bits into the file. This means we can't keep a 12h run of CABB data in one uv file. I'm working my way through the c code layers to see if we can fix this. Similar issues seem fixed elsewhere in the io routines (by using off_t), but not in the maskio routines.
There were separate issues to do with scratch files (srcio.c). I think these were found by Bob Sault.
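The ~256 MB flag-table figure quoted above follows directly from holding a bit offset in a signed 32-bit int. A minimal sketch of the arithmetic, assuming one bit per flag as the email implies (the function name max_flag_bytes is hypothetical and not part of maskio.c):

```python
INT32_MAX = 2**31 - 1  # largest bit offset a signed 32-bit int can hold
INT64_MAX = 2**63 - 1  # largest bit offset a signed 64-bit type can hold


def max_flag_bytes(max_bit_offset):
    """Largest flag-table size in bytes addressable with the given bit offset."""
    return (max_bit_offset + 1) // 8


limit_32 = max_flag_bytes(INT32_MAX)
print(limit_32, limit_32 // 2**20)  # 268435456 bytes, i.e. 256 MiB
print(max_flag_bytes(INT64_MAX))    # 1152921504606846976 bytes with a 64-bit offset
```

Switching the offset calculation to a 64-bit type such as off_t, as the email says was already done elsewhere in the io routines, lifts the ceiling far beyond any practical file size.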
comment:4 Changed 15 years ago by
Replying to MarkWieringa:
Tested fits output of a file > 2GB, this also fails.
Tested on 64-bit, also fails:
delphinus-111% fits in=/DATA/DELPHINUS_3/len067/CABB/pictor-a.9000.uvaver out=test.fits op=uvout
Fits: version 1.1 09-Apr-09
Polarisations copied: XX,YY,XY,YX.
### Fatal Error: Invalid argument
delphinus-112% echo $?
1
Repeating with strace
delphinus-114% strace -o ./fits.strace fits in=/DATA/DELPHINUS_3/len067/CABB/pictor-a.9000.uvaver out=test.fits op=uvout
Fits: version 1.1 09-Apr-09
Polarisations copied: XX,YY,XY,YX.
### Fatal Error: Invalid argument
shows that there is a problem during an lseek() call:
...blahblahblah...
lseek(4, 2149679104, SEEK_SET) = 2149679104
read(4, "\32\275\256A5\275\305?\246V:\276\32\275\256A\321\177\237"..., 16384) = 16384
lseek(6, 2147569344, SEEK_SET) = 2147569344
write(6, ">\243\367\340\276O\317^A\344\r\215?\231\'\213\277\n\210"..., 9948) = 9948
lseek(6, 18446744071562163612, SEEK_SET) = -1 EINVAL (Invalid argument)
write(2, "### Fatal Error: Invalid argume"..., 35) = 35
close(6) = 0
unlink("test.fits") = 0
close(4) = 0
close(5) = 0
close(3) = 0
lseek(2, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fstat(2, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
lseek(2, 0, SEEK_END) = -1 ESPIPE (Illegal seek)
lseek(2, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
lseek(1, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
lseek(1, 0, SEEK_END) = -1 ESPIPE (Illegal seek)
lseek(1, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
munmap(0x2b66d4c9d000, 4096) = 0
exit_group(1) = ?
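i.e. we hit a problem at 2^31 bytes (2 GB): once the output offset passes 2^31 - 1, the value handed to lseek() wraps to a negative number, which strace prints as a huge unsigned offset.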
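The failing offset in the trace can be reproduced from the two calls that precede it. The sketch below is Python using ctypes to emulate the integer widths; the variable names are illustrative, and the assumption that the offset passes through a signed 32-bit variable before reaching lseek() is inferred from the numbers, not taken from the fits source.

```python
import ctypes

last_seek_offset = 2147569344  # lseek(6, 2147569344, SEEK_SET) in the trace
bytes_written = 9948           # return value of the write() that follows it

# Where the next write should start.
next_offset = last_seek_offset + bytes_written

# Assumed: the offset is held in a 4-byte signed integer, so it wraps.
wrapped = ctypes.c_int32(next_offset).value

# Sign-extended to 64 bits and shown as unsigned, as strace printed it.
as_printed = ctypes.c_uint64(wrapped).value

print(next_offset)  # 2147579292
print(wrapped)      # -2147388004
print(as_printed)   # 18446744071562163612
```

The last value matches the offset in the failing lseek() call exactly, which is consistent with a 32-bit signed overflow just past 2^31 - 1 = 2147483647.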
comment:5 Changed 15 years ago by
Description: | modified (diff) |
---|
comment:6 Changed 15 years ago by
The workaround for this particular case is to break the UV data into a few time ranges to stay below the size limit.
Apparently Bob Sault is looking into reworking the code to avoid the silly seek() from the start to the end of the file.
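A rough way to plan the time-range workaround is sketched below in Python. It assumes that output size grows roughly uniformly with observing time and that the ceiling is 2^31 - 1 bytes per output file; both are assumptions, and the helper names are hypothetical. The resulting ranges would still have to be passed to the task by hand.

```python
LIMIT_BYTES = 2**31 - 1  # assumed per-file ceiling (largest signed 32-bit offset)


def ranges_needed(total_bytes, limit=LIMIT_BYTES):
    """Smallest number of equal pieces that keeps each piece at or below limit."""
    return -(-total_bytes // limit)  # integer ceiling division


def split_time_range(start_hours, end_hours, pieces):
    """Split [start_hours, end_hours] into equal consecutive time ranges."""
    step = (end_hours - start_hours) / pieces
    return [(start_hours + i * step, start_hours + (i + 1) * step)
            for i in range(pieces)]


# Hypothetical example: a 5 GiB export covering a 12 h run.
pieces = ranges_needed(5 * 2**30)
for lo, hi in split_time_range(0, 12, pieces):
    print(f"{lo:.1f} h to {hi:.1f} h")
```

If the data rate varies strongly during the run, equal time ranges will not give equal file sizes, so some margin below the limit is advisable.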
comment:7 Changed 15 years ago by
Add Hours to Ticket: | 0 → 5 |
---|
From Bob Sault:
I have just installed a number of changes into the RCS system to allow the "fits" task to handle FITS files that are larger than 2 Gbytes in size. There are a large number of small changes. The changes are invisible to the user. Below is a sketch of the changes.
Best regards Bob
16jul09 rjs mp.for - Added mpSign routine and better comments.
16jul09 rjs hio3.f2c - New routines to handle large file offsets from FORTRAN.
16jul09 rjs fitsio.for,fitsio.h - Changes to handle FITS files larger than 2 Gbytes.
16jul09 rjs wrap.f2c - Add a caste operation in htell (pedantry).
20jul09 rjs fits.for - Some cosmetic changes to messages to users.
comment:8 Changed 15 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
A further change by Bob to implement a PtrDiff type, which is basically a Fortran integer*8, has now solved the problem for large memory allocations in invert (and in other tasks where needed).
Tested fits output of a file > 2GB, this also fails. Atlod can read a large file (2.8 GB) and produce a large (2.2GB) uv file, which is read fine by other uv programs (like uvindex). However fits fails to export a file of this size (it can export a subset of the file, e.g. single source).