Core Minutes 1/17/2012
Power outage: (Richard) Building 50 [the one with all the computers in it] lost power around 10 PM last night. Things were mostly back together by about 3 AM, but recovery, including that of some of our xrootd servers, is still in progress.
ScienceTools: (Jim) added model functions, not yet tagged. He also promoted last week's Likelihood tag (see Science Tools Development Notes for details) but the new HEAD build failed to start. [See RM discussion below]
(Richard) Good news from LLR: Emilia Becheva will be helping with ST in the very near future.
Disk acquisition: (Richard) Not much news on this front. The last of the delivery and hook-up is in progress.
FSSC: (Eric W.) Alex Reustle is now working with us [welcome, Alex!].
We have been stuck with an old version of cfitsio because some code, specifically in pulsar tools, requires an undocumented feature of that version which is explicitly not supported in newer versions: simultaneous access by more than one client to memory-resident files. Attempts to add this support to tip itself have so far been unsuccessful. It's clear that, at the very least, it would involve a substantial reworking of the package, so Eric took a different tack: he modified the pulsar code to use a disk file. This 2-line change appears to have done the trick. He has so far tested with SL5, 32- and 64-bit, and Snow Leopard.
Pass7 reprocessing (Leon) New calibration files (ultimately expected to replace ones currently in use which are very old, pre-launch in some cases) are nearly all ready.
(Tom G.) The process for reprocessing has been revamped and is ready to go. When the last calibrations get installed, he will run validation to make sure the procedure is ok. High statistics runs will come later. (Richard) When will the real reprocessing get going? Other batch users should be warned. (Tom G.) Depends on the validation runs and how long it takes to analyze results from them, probably a couple of weeks.
GR/Pass8 (Tracy) Our last GR build dates from November and apparently was not exercised that much. Recently we discovered some things didn't get included. Now they are, along a new branch (which still uses old — v18 — gaudi; main branch uses v21).
Attention will be shifting to upcoming face-to-face, the week before the collaboration meeting.
Plans are being made for an engineering run to study truncations. Luca Baldini will be the point person for C & A.
Tracy has been looking into performance improvements in anticipation of the occasional very large events the new truncation schemes will produce. In particular he's trying out a pre-filter using the Hough transform which could either just throw out large events which it determines to be junk, or cut down on the linking step, which takes most of the time. It looks very promising so far.
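To illustrate the idea of a Hough-transform pre-filter, here is a minimal sketch in Python. It is not Tracy's code: the function names, the 2D hit representation, the bin sizes and the thresholds are all assumptions made for the example. Each hit votes for every (theta, rho) line bin passing through it; a large event in which no single line collects a reasonable share of the hits is flagged as likely junk.

```python
import math
from collections import Counter


def hough_peak_fraction(hits, n_theta=90, rho_bin=1.0):
    """Fraction of hits that vote for the most popular (theta, rho) line bin.

    hits is a list of (x, y) pairs; n_theta and rho_bin set the accumulator
    granularity (illustrative values, not tuned for any real detector).
    """
    if not hits:
        return 0.0
    votes = Counter()
    for x, y in hits:
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            # Normal form of a line: rho = x cos(theta) + y sin(theta)
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(i, round(rho / rho_bin))] += 1
    return max(votes.values()) / len(hits)


def looks_like_junk(hits, min_hits=50, min_fraction=0.2):
    """Hypothetical pre-filter: a large event with no dominant straight line."""
    if len(hits) < min_hits:
        return False  # small events go through the normal path untouched
    return hough_peak_fraction(hits) < min_fraction
```

The same accumulator could instead be used to keep only the hits near the strongest lines before the linking step; which of the two uses is preferable is exactly the question left open above.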
(Leon) The truncation code
1. runs properly.
2. had a problem in that it wasn't distinguishing between planes which were empty of hits because conversion hadn't happened yet and planes which were empty because of truncation. It's possible to make such a distinction by using trigger bits.
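One way the trigger-bit distinction might look, as a sketch only: the minutes do not say how the bits are used, so the rule below (an empty plane whose trigger bit is set is taken to have lost its hits to truncation) is an assumption, as are all the names.

```python
from enum import Enum


class PlaneState(Enum):
    HAS_HITS = "has hits"
    EMPTY_NO_SIGNAL = "empty, nothing fired (e.g. above the conversion)"
    EMPTY_TRUNCATED = "empty because hits were truncated"


def classify_plane(n_hits, trigger_bit_set):
    """Classify one plane from its hit count and its trigger bit.

    Assumed rule (not taken from the actual code): a plane with no hits
    but with its trigger bit set did fire, so its hits were truncated.
    """
    if n_hits > 0:
        return PlaneState.HAS_HITS
    if trigger_bit_set:
        return PlaneState.EMPTY_TRUNCATED
    return PlaneState.EMPTY_NO_SIGNAL
```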
Next on the agenda: implement Bill's idea to (optionally) truncate in software data taken with one of the new schemes, so that L1 processing need not change right away.
(Tracy) Tomorrow Rob will have a meeting for all concerned parties on the proposed engineering run. (Richard) Rob says it could be that if the hardware truncation limit is reached, a power reset will be required to get going again. (Anders) I think that is correct. Restarting a run is apparently not enough.
GR gaudi upgrade progress (Heather) has made many updates to get things working on Windows with CMT. (Joanne) has made some for SCons, tested so far on rhel4 and rhel5. Most were minor, but getting Overlay to create a rootmap without seg faulting took several days of tediously narrowing the problem down to a single source file, then to a couple of lines within that file (which is, in most respects, quite similar to other files in the same directory). A fix has been committed and tagged, if not entirely understood. (Heather) However, the rhel4 build with CMT is still failing.
(Heather) Still on the to-do list from last week: M.E. has requested a RHEL5 trial for L1proc, which would also require the Gaudi upgrade.
GR and gcc44 (Heather) Johann has been working on building GR with gcc44. Many minor patches are needed to explicitly #include system headers. There is also a problem with our python dist. It doesn't include tcl because tcl isn't on the SLAC boxes used to make the rhel6 externals.
System stuff (Heather) The glastrm home directory was getting full so she deleted .core files and old mail. It seemed like it might be a good idea to delete old mail for the glast account as well, but she can't access its mail. This should probably be followed up or it will run out of space as well.
u35 also bears watching: it's at 92%.
SCons RM (Tom S.) Just before break, we requested that the glastrm account be given (back) a password because it's needed to change the trscrontab. This was necessary because of a problem with the MySQL connection. It times out after 8 hours, but we were restarting the daemon only every 24 hours. We changed that to 6 hours and all was well — until Jan. 4th. Starting then, the RM daemon could no longer get to CVS, presumably because of a problem with afs tokens. There were no changes on our end and no known changes originating with the Computer Center in the relevant time frame.
For all OSes, CVS access is via the server, hence requires a remote login, even though it's only truly necessary for Windows, which doesn't have direct access to the nfs space containing the repository. This has been working fine from time immemorial but, in the face of the current failure, last week Tom changed the db entry for CVSROOT (for rhel4 32-bit only) to just /nfs/.. (i.e., direct access rather than via server), expecting that only rhel4 32-bit builds would be affected. But even though the db has slots for values of CVSROOT for each combination of (OS, build package,..), apparently only the rhel4 32-bit ones are actually used, so now all access is being done with this value and, predictably, Windows fails. He'll look into this further.
Since he doesn't anticipate making more changes to trscrontab any time soon, we can probably return to a passwordless glastrm account.
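On the MySQL timeout: the fix adopted was to restart the daemon every 6 hours, inside the 8-hour connection timeout. For reference, the same constraint can also be enforced inside a long-running process by reopening the connection once it reaches a maximum age. The sketch below shows that alternative technique, not what the RM daemon does; the class name, the connect callable and the clock parameter are all hypothetical.

```python
import time


class ConnectionKeeper:
    """Hand out a connection that is never older than max_age_seconds."""

    def __init__(self, connect, max_age_seconds, clock=time.monotonic):
        self._connect = connect          # hypothetical factory returning a new connection
        self._max_age = max_age_seconds  # choose this below the server-side timeout
        self._clock = clock              # injectable so the logic can be tested
        self._conn = None
        self._opened_at = None

    def get(self):
        now = self._clock()
        if self._conn is None or now - self._opened_at >= self._max_age:
            old = self._conn
            if old is not None and hasattr(old, "close"):
                old.close()              # drop the aging connection before replacing it
            self._conn = self._connect()
            self._opened_at = now
        return self._conn
```

With a 6-hour maximum age and an 8-hour server timeout, a connection returned by get() has at least 2 hours of margin left when it is handed out.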
AOB