dfos = Data Flow Operations System, the common tool set for DFO
topics: description | workflow | technical implementation | log | output | usage | OCA and data pool | release | QC procedures | history scores, qualityChecker, autoCertifier | job execution | QC job concentration | ingestion | installation | config | PGI's | workflow | data directories | qcweb | notification | DON'T
Note: In this documentation, IDPs means Internal Data Products and stands for science data products. MCALIBs is short for master calibrations, as created by the PHOENIX process.
In some parts this documentation splits into parts applicable to the production of IDPs or of MCALIBs. The IDP parts are then shaded light blue (like this cell), while the MCALIB parts are shaded light yellow (like here). You can then ignore the respective other part.
PHOENIX
phoenix: workflow tool for automatic science processing |
While the supervised, quality-checked, PI-package oriented processing of science data was terminated in October 2011, an unsupervised (automatic) processing, without quality statement and aimed at the general scientific community, was proposed by R. Hanuschik in 2012. It was implemented in May 2013 on muc08, with the UVES IDP processing being the pathfinder in a series of similar projects. IDP stands for 'Internal Data Products' and is the acronym for science data products produced in-house (i.e. by QC), in contrast to the science data products produced externally by PIs (EDPs).
The phoenix tool is the workflow tool to support the automatic processing of large homogeneous science datasets.
With version 2.x, phoenix can also be used to create large homogeneous batches of master calibrations. This can be necessary e.g. for historical periods of an instrument when the pipeline was not yet mature, when no historical master calibrations were stored in the archive, or when the existing master calibrations are not compatible with the current pipeline. The main goal remains the creation of science-grade data products; hence a master calibration phoenix project is normally followed by an IDP project.
The goal of the PHOENIX process is:
The processing is based on
This means in turn that science processing "the phoenix way" is only possible if
The benefits of phoenix processing, in comparison to the existing science products from the PI packages provided by QC in the past (the "S. products"), are:
phoenix and autoDaily in comparison:

| tool | CAL | SCI | characteristics |
|---|---|---|---|
| autoDaily | CAL | SCI: AB creation only | establishing the QC loop; catch issues, follow up trending, use scoring; product: certified and archived master calibrations; interactive and manpower-intensive; to be done in near-realtime |
| phoenix 1.x | CAL: downloaded | SCI | using certified master calibrations; no issues to be followed up, no decisions to be taken, no certification or rejection; capitalizing on previous work on CALibs; automatic, no human intervention (other than checking the process health); no time commitment, can be done anytime |
| phoenix 2.x | CAL | SCI | CAL: creating certified master calibrations, based on a selection of the masters required for science reduction, using existing scoring/certification information; to a large extent automatic, score-based decisions, minimal human intervention (but more than for IDPs). SCI: as above |
Workflow
description (see also 'Operational workflow' here)
The IDP process has two (for MUSE: three) components:
The phoenix (re)processing workflow has many similarities with the autoDaily workflow.
A) For IDPs it has the following components:
The basic workflow per date, phoenix -d <date>, followed by the ingestion step ingestProducts -m SCIENCE -d <date>, has these components:

| step | calling ... | notes |
|---|---|---|
| 1. download ABs from $ORIG_AB_DIR; filter by raw_types and setups (as configured); edit ABs (remove outdated content; call the configured pgi) | [within phoenix] | There is no association module in phoenix for IDPs; all ABs are downloaded from a central source. Editing of ABs (by a PGI) is done to remove obsolete content (like RB_CONTENT) and to add e.g. new processing parameters. If major changes to ABs have to be done, there is the tool phoenixPrepare_<instr>, which needs to be customized for the specific instrument and called before phoenix. |
| 2. prepare execution jobs: download raw, download mcalib, AB and QC queues | createJob | The call of createJob follows the current scheme on the mucs; download job files are essential since muc08 processes massively in parallel (up to 30 jobs). Both the pipeline and the QC procedures are queued under condor. A "job concentrator" for QC jobs can be configured, for better efficiency. Non-standard job execution patterns are available (like 2-stream processing for MUSE). |
| 3. execute job files | ngasClient, vultur, processAB, processQC, scoreQC, QC procedures | After the execution of the download jobs, two queues are executed: first the processAB queue, then the processQC queue. |
| 4. review | [within phoenix] | In general not formalized; in most cases a quick check for red scores is sufficient; if needed, a CERTIF_PGI can be used; automatic comments are provided by pgi's. |
| 5. distribute products | moveProducts, renameProducts | This is the standard moveProducts call, with all products, logs and plots in the final DFO directories and the logs+plots exported to qcweb. |
| 6. finish that date | finishNight; phoenixMonitor | The phoenixMonitor is called to provide the processing overview, in the histoMonitor style. An ingestion call is written into JOBS_INGEST. |
| 7. convert products into proper IDP format (if configured); ingest as IDP into archive | idpConvert and idp2sdp (if configured); call_IT and ingestionTool | The files are ingested in a separate workflow step (ingestProducts -m SCIENCE -d <date>). |
While this is the basic processing step, the typical processing pattern is by month (looping over all dates).
B) For MCALIBs, it has the following components:
The basic workflow per month, phoenix -m <month>, followed by the ingestion step ingestProducts -m CALIB -d <date>, has these components:

| step | calling ... | notes |
|---|---|---|
| 1. create ABs for the data pool; filter for hidden or previously rejected files; filter by setups (as configured); edit ABs | createAB | Headers are downloaded if not existing in the system. They can be filtered and even edited (with DATA_PGI). Editing of ABs is done to edit processing parameters. |
| 2. prepare execution jobs: download raw, AB and QC queues | createJob | The call of createJob follows the current scheme on the mucs; download job files are essential for raw data. Both the pipeline and the QC procedures are queued under condor. |
| 3. execute job files | ngasClient, vultur, processAB, processQC, scoreQC, QC procedures | After the execution of the download jobs, two queues are executed: first the processAB queue, then the processQC queue. |
| 4. review | [within phoenix: qualityChecker, autoCertifier] | qualityChecker is a component that combines new (score-based) and historical information; autoCertifier is a simple version of certifyProducts. |
| 5. distribute products | moveProducts, renameProducts | This is the standard moveProducts call, with all products, logs and plots in the final DFO directories and the logs+plots exported to qcweb. For the renaming, versioned config files are supported. phoenix does a check on the final MCALIB names. |
| 6. finish that date | finishNight; phoenixMonitor | The phoenixMonitor is called to provide the processing overview, in the histoMonitor style. An ingestion call is written into JOBS_INGEST. |
| 7. detect and delete pre-existing products; ingest as normal MCALIB into archive | ingestProducts | The products are ingested in a separate workflow step (ingestProducts -m CALIB -d <date>). |
A flow diagram may also help to understand the processing workflow. Here is a sketch that combines the IDP ("SCIENCE") branch (supported by phoenix v1.x) and the MCALIB branch (available with phoenix v2.x) [click for full-size]:
[flow diagram: SCIENCE (IDP) branch and MCALIB branch]
Technical implementation

A) Workspace.
IDPs: Each PHOENIX process needs its own workspace, so as not to interfere with the normal workspaces for day-to-day operations or with other PHOENIX accounts. This is currently established by having PHOENIX accounts on dedicated PHOENIX servers. There are sciproc (for UVES), xshooter_ph, giraffe_ph etc. on the server muc08. There is also a PHOENIX account muse_ph on muc09, and the PHOENIX account muse_ph2 on muc10. For more details and an up-to-date list, check out phoenix_instances. The workspace is defined in the resource file .dfosrc which is called upon login from .qcrc.
MCALIBs: For MCALIB production, one could use the same simple setup if there is no danger of confusion with a classical PHOENIX account for IDP production (stream). However, normally there is such danger, since the main motivation for MCALIB production is the follow-up IDP production. Hence there is a special mechanism to use the same account for PHOENIX projects of both flavours:
For instance, you need to keep the $DFO_CAL_DIR for IDPs (which is just a storage area for downloads) separate from the one for MCALIBs (where it is the main product area from which you want to ingest into the archive). See below for more details.
B) Data directories.
IDPs: The data directories are split by PHOENIX instrument. As per the standard IDP installation, all products go to $DFO_SCI_DIR, all logs to $DFO_LOG_DIR and all QC reports to $DFO_PLT_DIR. There is also $DFO_MON_DIR, hosting the reprocessing history (in $DFO_MON_DIR/FINISHED).
MCALIBs: All products go to $DFO_CAL_DIR, which needs to be different from the one for IDP production. Also, $DFO_LOG_DIR, $DFO_PLT_DIR and $DFO_MON_DIR should be separated from the respective IDP directories. $HOME, $DFO_HDR_DIR and $DFO_LST_DIR can be the same.
C) Tools.
For the pipeline, you need to check that $HOME/.esorex/esorex.rc contains the proper pipeline version in the key esorex.caller.recipe-dir.
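For illustration, such an entry could look like this (the recipe directory value is an invented example; point it to the pipeline installation of your instrument):

```
# $HOME/.esorex/esorex.rc (excerpt; the path is a hypothetical example)
esorex.caller.recipe-dir=/opt/pipelines/uves/uves-5.7.0/lib/esopipes-plugins
```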
The PHOENIX process requires a tool set which consists of a stripped-off version of the DFOS tools, and of some tools dedicated to the PHOENIX workflow. The most important key in the .dfosrc file of a PHOENIX installation is the key THIS_IS_PHOENIX which is set to YES for a PHOENIX installation, and to NO (or does not exist) for a DFOS installation. For a PHOENIX MCALIB installation, you also need the key THIS_IS_MCAL set to YES (in order to tell the cleanupProducts tool to behave like in a normal DFOS installation).
The installation is monitored by the standard tool dfosExplorer which is reading THIS_IS_PHOENIX and, if YES, downloads a reference file appropriate for PHOENIX. The DFOS tools required for a PHOENIX installation are all evaluating the key THIS_IS_PHOENIX.
Only a subset of DFOS tools (blue in the following table) is needed for PHOENIX, plus some special tools (red):
| tool | config file | comments |
|---|---|---|
| phoenix | config.phoenix | |
| pgi | | optional, instrument-specific PGIs to control and fine-tune the workflow. Examples for UVES: |
| pgi_phoenixUVES_AB_760 | | configured as AB_PGI in config.phoenix, to edit ABs |
| pgi_phoenix_UVES_MCAL | | configured as HDR_PGI in config.phoenix, to update downloaded MCALs for header issues |
| pgi_phoenix_GEN_ERR | | configured as FINAL_PLUGIN in config.processAB, to suppress error messages from the QC procedures spoiling the processing log |
| pgi_phoenix_UVES_ABSCELL | | configured as COMMON_PLUGIN in config.processAB, to manage the PRO.CATG of ABSORPTION_CELL data properly |
| pgi_phoenix_UVES_AIRM | | configured as PRE_PLUGIN in config.processAB, to fix missing TEL.AIRM.START/END keys in raw files |
[createAB | standard; OCA can be versioned | for MCALIB production only] |
[filterRaw | standard | for MCALIB production only] |
getStatusAB | standard | for PHOENIX: the ATAB files are created by phoenix |
createJob | standard | |
ngasClient | none | |
processAB | standard | configure COMMON_PLUGIN and FINAL_PLUGIN; make sure to configure all SCIENCE RAW_TYPEs |
processQC | standard | configure QCBQS_TYPE=PAR; make sure to configure all SCIENCE RAW_TYPEs |
moveProducts | standard | make sure to configure all SCIENCE RAW_TYPEs |
renameProducts | standard | for MCALIBs: config files can be versioned |
ingestProducts | standard | to use idpConvert and call_IT for the ingestion of the IDPs; configure PATH_TO_IT and PATH_TO_CONVERTER |
[idpConvert | instrument specific conversion tool, for IDPs only] | |
[call_IT | wrapper for phase3 ingestion tool, for IDPs only] | |
finishNight | ||
cleanupProducts | | for IDPs: cleans $DFO_SCI_DIR/<date>/conv (fits --> hdr; symlinks for graphics) and empties $DFO_SCI_DIR/<date>; for MCALIBs: like for DFOS |
getObInfo | for OB-related nightlog information | |
phoenixMonitor | config.phoenixMonitor | non-dfos tool, adapted from histoMonitor |
cascadeMonitor | ||
mucMonitor | ||
dfoMonitor | standard | for PHOENIX: autoDaily etc. checks removed |
createReport | standard | called only if no data report found on $DFO_WEB_SERVER |
dataLoader | standard | same |
D) Dual-instance processing (MASTER/SLAVE model).
To speed up the processing of historical data, and to have some contingency for nights with large amounts of data, the MUSE phoenix process is set up on two instances, muse_ph@muc09 and muse_ph2@muc10. Both can run independently. After moveProducts it is important to join bookkeeping information on one account, called the MASTER. The other account is called the SLAVE.
To define the SLAVE, use the keys IS_SLAVE, MASTER_ACCOUNT, SLV_ROOT and MST_ROOT in config.phoenix for the SLAVE account. For the MASTER account, no additional phoenix configuration is necessary (but this is needed then for config.phoenixMonitor, see there).
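For a SLAVE, the corresponding excerpt of config.phoenix could look like this (values taken from the MUSE examples in the configuration section below):

```
# config.phoenix on the SLAVE account (excerpt)
IS_SLAVE        YES
MASTER_ACCOUNT  muse_ph2@muc10      # logs/plots are copied there
SLV_ROOT        /fujisan3/qc/muse_ph
MST_ROOT        /qc/muse_ph2
```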
The process log is created per standard execution (-d <date>). It goes to the command line and is stored under $DFO_MON_DIR/AUTO_DAILY/phoenix_<date>.log (in close analogy to autoDaily logs).
IDPs:

MCALIBs:
Type phoenix -h for on-line help, and phoenix -v for the version number.

IDPs:

| call | purpose |
|---|---|
| phoenix -d <date> | process all SCIENCE ABs for a specified date |
| phoenix -m <month> | process all SCIENCE ABs for a specified month (in a loop over all dates), steps 1-3 (not available for MUSE) |
| phoenix -d <date> -C | download & filter all ABs (step 1 only) |
| phoenix -d <date> -P | download & filter & process all ABs (steps 1 and 2) |
| phoenix -d <date> -M | call moveProducts (step 3 only; requires steps 1 and 2 executed before) |
| phoenix -d <date> -M -n | same, but no cleanup of mcalibs done (useful in dry runs) |
| phoenix -d <date> -p | process all ABs as filtered with config.phoenix; no update of processing statistics (aiming at partial reprocessing of data after an issue has been discovered and fixed) |
| finishNight -d <date> -c | clean up all remnants of a date (only needed in manual use, if options -C or -P have been called before) |

Note: all three options -C/-P/-M also work with -m <month>.

MCALIBs:

| call | purpose |
|---|---|
| phoenix -X -m <month> | create and process all CALIB ABs for a specified month (for a given pool), steps 1-3 |
| phoenix -X -m <month> -C | create all CALIB ABs (step 1 only) |
| phoenix -X -m <month> -P | create & process all CALIB ABs (steps 1 and 2) |
| phoenix -X -m <month> -M | call moveProducts (step 3 only; requires steps 1 and 2 executed before) |
| finishNight -d <date> -c | same as above |
| phoenix -X -d <date> [-C / -P / -M] | exceptionally you can also call the tool in X mode by date, but then you might end up with incomplete ABs because, in mode -X, the VCAL directory always needs to be erased before a new call |
Use the various options on the command line. In mass production, when you use phoenix for the reprocessing of historical data, you will typically use it like this:
IDPs:
b) more careful: chance to review products, fine-tune scoring etc.
c) even more careful: check the content of ABs, explore the data history etc.
In either case, the JOBS_INGEST file needs to be executed ultimately, off-line.
MCALIBs: ... and all other cases just like above, always adding '-X'. |
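A possible driving pattern for such a bulk reprocessing campaign is sketched below (the month list and the loop are illustrative; the phoenix calls are those from the usage table above):

```bash
# illustrative bulk reprocessing, month by month
for month in 2005-01 2005-02 2005-03; do
    phoenix -m "$month"        # IDPs: steps 1-3 for all dates of the month
    # phoenix -X -m "$month"   # MCALIBs: same pattern, always adding -X
done
# ultimately, execute the collected ingestion jobs (JOBS_INGEST) off-line
```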
OCA rules and data pool (MCALIBs only)
This part is relevant for MCALIBs only.

An important concept required for MCALIB production is the data pool. For MCALIBs, the ABs for processing need to be created using OCA rules, defining the calibration cascade with its data types, dependencies and validities. These OCA rules are derived from the operational rules, but they can be stripped-down versions, focusing on the instrument mode of the project. Also, only the data types relevant for the (follow-up IDP) science reduction need to be included. It is possible (and often needed) to maintain several versions of those rules, depending on the data history. The OCA rule versioning is configured in the config file.

In general, calibrations are not taken every day, and some are not even taken at regular intervals but may be driven by science observations, by scheduling constraints or by ambient conditions (like twilight flats). Therefore, working on a day-by-day basis is the wrong strategy for MCALIBs. Instead, phoenix works on a data pool which is considerably larger than a day, namely a month. In general this data pool size is a good compromise between validity issues (we want to avoid too small pool sizes which by chance have important calibrations missing) and performance issues (number of ABs and products, disk space, which all grow with pool size). To address the latter issue, the user can decide to choose smaller slices of a month, namely POOL_SIZE=10/15/30 days. The tool is then still called by month, but 3 times/2 times/once for the appropriate time range.

In general there is no overlap between different pools: calibrations for month 2005-01 are processed independently from the ones in 2005-02. This is mainly done for simplicity, and is also true for the pool slices. There is no VCAL memory in phoenix beyond the execution of a month. All VCALs are erased before createAB is called; otherwise the VCAL management would become very difficult and error-prone: one could e.g. not work on month 2005-01 and immediately afterwards on 2008-08, which is however often needed for MCALIB production.
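As an illustration of the pool slicing (date ranges shown for January 2005; the mapping follows the POOL_SIZE description above):

```
POOL_SIZE=30 : one call,    pool 2005-01-01 .. 2005-01-31
POOL_SIZE=15 : two calls,   pools 2005-01-01 .. 2005-01-15 and 2005-01-16 .. 2005-01-31
POOL_SIZE=10 : three calls, pools of roughly ten days each
```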
Release

The configuration key RELEASE is used to define the name of the release. While the concepts of $DFO_INSTRUMENT and $PROC_INSTRUMENT relate to the instrument-specific aspects of the processing (like the definition of RAW_TYPEs), the RELEASE defines the logic of the (re)processing. It describes aspects like "this is reprocessing version 2 of all UVES ECHELLE data", where the current reprocessing scheme includes certain pipeline parameter choices or improved reduction strategies, as opposed to e.g. version 1.
The RELEASE tag could be e.g. 'UVESR_2' where 'UVESR' stands for 'UVES reprocessing' and '2' for the current version.
For MCALIB projects, a tag like 'GIRAF_M' would be reasonable, where the M stands for 'master calibrations'. This tag is of course not a real release but is only used to define the workspace uniquely and to distinguish it from the IDP releases. (The corresponding IDP project has the release name 'GIRAF_R'.)
QC procedures

While the original concept of phoenix did not include certification and scoring, it turned out that it is extremely useful, if not indispensable, to create QC reports, mainly for quick-look purposes. This is true both for IDP and for MCALIB projects. QC reports can also be used to e.g. quickly check that the reduction strategy is the correct one, that the flux calibration looks reasonable etc. These QC reports can likely be derived easily from existing, historical QC procedures for science data.
It is also extremely useful to maintain a set of QC1 parameters and store them in a QC1 database table. This is needed for process control (for instance for proving statistically that the SNR is correctly computed) and also for the scoring. To run a PHOENIX project without QC1 parameters, reports and scores means essentially flying blind.
The existing UVES, XSHOOTER and GIRAFFE IDP streams all have scoring; refer to these examples for more information. Generally, the scoring for IDPs, and also for MCALIBs, should focus on key parameters relevant for the science reduction. It should check for saturation, negative fluxes etc. It is also a good idea to measure and score the association quality, i.e. the proximity in time of an IDP to a key calibration.
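As a toy illustration of such a score-based check (the QC1 parameter name, file name and threshold are invented for this example; real scoring is done by the QC procedures and scoreQC):

```bash
# toy check of one QC1 parameter against a threshold (all names illustrative)
snr=$(awk -F'=' '$1=="QC.SNR"{print $2}' qc1_params.dat)
if awk -v s="$snr" 'BEGIN{exit !(s < 5.0)}'; then
    echo "score: red  (SNR=$snr below threshold 5.0)"   # would flag the AB for review
else
    echo "score: green"
fi
```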
Technical hints:
History scores, qualityChecker, autoCertifier (MCALIBs; for IDPs see below)
This part is relevant for MCALIBs only.

phoenix v2 has a component (an embedded procedure) called qualityChecker that establishes and evaluates historical processing information. It is based on the assumption that historical processing information might be useful for the quality assessment of the MCALIB products. The historical information is automatically retrieved from a configured data source (the exported $DFO_LOG_DIR on qcweb).

In the following table, all conditions apply to previous (historical) processing with autoDaily for the daily QC loop. Of course this requires that historical ABs exist on qcweb, and that the information is stored in those ABs. Not finding any such information does not imply an error condition but effectively means more effort for the current scoring. Without such current scoring, those products would need to be reviewed one by one. The following conditions apply:
| condition | processed | scored | certified |
|---|---|---|---|
| no info about previous processing | 0 | . | . |
| previous processing failed | 1 | . | . |
| previous processing successful | 2 | . | . |
| previous scoring: none | . | 0 | . |
| previous scoring ylw/red | . | 1 | . |
| previous scoring green | . | 2 | . |
| previous certification: none | . | . | 0 |
| previous rejection | . | . | 1 |
| previous certification | . | . | 2 |
These are the possible combinations:
| total history score | conditions | support* by new scores? | auto-certification possible? |
|---|---|---|---|
| 222 | previous processing OK, scored green, certified | useful | yes (even w/o scores? TBD) |
| 022 | no information about previous processing in the AB, but scored green and certified: likely a glitch in the AB content, handled like 222 | useful | yes (even w/o scores? TBD) |
| 220 | previous processing OK, scored green, no info about certification: handled like 222 | useful | yes (even w/o new scores? TBD) |
| 200 | previous processing OK, no info about scores or certification: handled like 222 | mandatory | yes if new scores green |
| 212 | previous score ylw/red, certified | very useful | yes if new scores green |
| 211 | previous score ylw/red, rejected: decide (put into REJ list or accept) | very useful | no (review) |
| 012/011 | same but without processed info: glitch, same handling as 211 | very useful | no (review) |
| 202 | processed, no scores, certified | mandatory | yes if new scores green |
| 201 | processed, no scores, rejected | mandatory | no (review) |
| 100 | previous processing failed | mandatory | no (review) |
| 000 | no previous information | mandatory | yes |
* in order to be auto-certified, otherwise one-by-one assessment needed
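Schematically, the three digits are combined into one history score string (a sketch only; the function is illustrative and the actual logic is embedded in qualityChecker):

```bash
# sketch: compose the history score from the three digits defined above
# (processed / scored / certified, each 0, 1 or 2)
history_score () {
    local processed=$1 scored=$2 certified=$3
    echo "${processed}${scored}${certified}"
}
history_score 2 2 2   # -> 222: processed OK, scored green, certified
history_score 1 0 0   # -> 100: previous processing failed
```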
The auto-certifier is another embedded procedure of phoenix v2 that uses the history score, plus the current score if available, to automatically decide about certification.
These are the concepts behind the auto-certifier:
| history score | new score | conditions | auto-certified? |
|---|---|---|---|
| 222, 022, 220 | green | all flags green: auto-certification safe | yes |
| 222, 022, 220 | ylw/red | current red scores: always review! | no, review |
| 222, 022, 220 | none | auto-certification a bit risky; current scoring safer (and easy, since there was historical scoring!) | yes? |
| 200, 202 | green | previous processing OK, no info about scores or certification: handled like 222 | yes |
| 200, 202 | ylw/red | current red scores: always review! | no, review |
| 200, 202 | none | auto-certification unsafe | no, review |
| 000 | green | current scores green: auto-certification safe | yes |
| 000 | ylw/red | current red scores: always review! | no, review |
| 000 | none | auto-certification impossible | no, review |
| 212/012 | green | previous score ylw/red but certified; could be risky | review? (TBD) |
| 212/012 | ylw/red, none | current red scores, or no scores: review! | no, review |
| 211, 011, 201, 100 | any score or none | decide (put into REJ list or accept) | no, review |
In essence:
Obviously the fraction of ABs to be reviewed can be reduced drastically with a meaningful current scoring system. Once the current scoring system is enabled and fine-tuned, it is realistic to expect that many months auto-certify completely. To assist with the auto-certification, the AB monitor under phoenix displays the history scores in the 'comment' column. The flag '222' is suppressed because it is in many circumstances the standard value.
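The decision table above can be condensed into a small decision function; the following is a sketch derived from that table (function name and score encodings are invented, this is not the actual phoenix code):

```bash
# auto-certification decision sketch (new score encoded as green|ylwred|none)
auto_certify () {
    local hist=$1 new=$2
    case "${hist}:${new}" in
        *:ylwred)                       echo "no, review" ;;  # current ylw/red: always review
        222:*|022:*|220:*)              [ "$new" = "none" ] && echo "yes?" || echo "yes" ;;
        200:green|202:green|000:green)  echo "yes" ;;
        212:green|012:green)            echo "review? (TBD)" ;;
        *)                              echo "no, review" ;;  # incl. 211, 011, 201, 100
    esac
}
auto_certify 222 green   # -> yes
auto_certify 100 green   # -> no, review
```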
For IDPs, there is an optional step for certification, just before calling 'moveProducts'.
In a situation with cascaded SCIENCE ABs (like for MUSE), one might want a certification step that goes beyond the automatic scoring and includes e.g. a visual inspection of the QC plots. The classical DFOS tool certifyProducts cannot easily be used for this task since it was designed for the CALIB case. Instead, you can call a pgi (configured under CERTIF_PGI) that you have to provide yourself, e.g. as a customized version of certifyProducts. For MUSE this exists as 'phoenixCertify_MUSE' (see here).
Job execution

Normally the DRS_TYPE to be chosen is CON (for condor), which can use up to 30 cores in parallel on muc08 (48 on muc10), both for AB processing and for QC reports. The execution pattern is then identical to the one known from the autoDaily DFOS processing.
The following special cases exist (because standard condor processing would be too inefficient):
1. QCJOB_CONCENTRATION (for QC jobs and IDP processing): this is a special queue optimization that collects the QC jobs of a full month (instead of daily QC jobs). It is used for efficiency if an individual QC job takes long and only a few exist in daily processing: then the standard queue mechanism would be inefficient since the processing machine cannot be saturated. Efficiency gains can be as high as a factor of 30 (running 30 jobs in parallel instead of just 1).
2. 2-stream processing (for MUSE AB jobs): this is special for MUSE. The MUSE pipeline uses internal parallelization with 24 cores during certain phases of a recipe. With a custom JOB_PGI, a 2-stream processing is set up, meaning that 2 ABs can run in parallel, using (up to) 48 cores (available on muc09 and muc10), for efficiency gains of up to a factor of 2.
3. Dual-instance processing (for MUSE only): a phoenix process can be implemented on two different accounts, for gaining efficiency. While both instances can execute independently, at the end one wants to bring all relevant information to a central account, which is then the MASTER, while the other account is called the SLAVE. For more see here. A SLAVE account needs to be marked as such in the config.phoenix file; a MASTER account does not.
QCJOB_CONCENTRATION (IDPs only)
[This section is currently only relevant for IDPs on the giraffe_ph@muc08 account.] In normal cases the execution time for pipeline jobs (ABs) is longer than for the QC jobs, and the number of daily jobs is high enough to saturate the multi-core machine:
| core | jobs | |
|---|---|---|
| 1 ... 30 | AB, then QC | First the AB and then the QC jobs are executed; condor manages the proper distribution; the processing system is saturated, the scheduling is optimal. |
| 31 ... n | queue | Once a job is finished, condor schedules the next waiting job, until the queue is empty. |
In that regime all phoenix jobs are executed day after day, and within a day batch there is first the AB call and then the QC call. Let's call this regime the "normal condor regime".
For giraffe_ph, the situation is different: typically there are only a few jobs (say 2) per day, meaning the muc machine cannot saturate, most cores are idle and the processing is not optimal. Furthermore, the QC jobs take much longer than the AB jobs (because a single QC job actually loops over up to 130 MOS products). For this regime (let's call it the "MOS condor regime") a special operational setup is needed to optimize the processing. It is possible only for monthly processing (because then a sufficient number of AB/QC jobs exists to saturate the muc machine).

In a first step all AB jobs are processed in the "normal" way (day by day, without saturation). This is suboptimal but, because of the short execution time, not critical for optimization. The daily QC jobs are collected but not executed. Only after all AB jobs of the month are done, the QC jobs are executed in one big call, thereby saturating the muc machine and optimizing the processing scheme. Let's call this regime the "QC job concentration regime".
| core | day1 | day2 | day3 | ... | last day | all days of the month |
|---|---|---|---|---|---|---|
| 1 | AB#1 | AB#4 | AB#5 | AB#9 | AB#43 | QC#1 |
| 2 | AB#2 | AB#6 | | | AB#44 | QC#2 |
| 3 | AB#3 | AB#7 | | | AB#45 | QC#3 |
| 4 | | AB#8 | | | | QC#4 |
| ... | | | | | | ... |
| n | | | | | | QC#n |

First all ABs are executed, day by day; the QC jobs are collected and executed at the end, in one batch for the whole month; condor manages the proper distribution; the processing system is saturated by the QC jobs, and their scheduling is optimal.
In some test runs, the effective processing time for a month's worth of GIRAFFE IDP processing with phoenix went down from more than 6 hours ("normal condor processing") to about 2 hours ("QC job concentration processing").
The switch to this processing scheme is made via configuration (key QCJOB_CONCENTRATION). It can be used for any PHOENIX account, but currently makes sense for giraffe_ph only.
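In config.phoenix this is controlled by a single key (semantics as described in the configuration section below):

```
# config.phoenix (excerpt)
QCJOB_CONCENTRATION  YES   # collect all QC jobs of a month into one batch call
```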
Ingestion

Each phoenix call results in an entry in the $DFO_JOB_DIR/JOBS_INGEST file, in the traditional DFOS way. The standard dfos tool ingestProducts is aware of running in the phoenix environment if
If INGEST_ENABLED=YES, the tool is enabled to call the ingestionTool and the converter (optional).
Find more about the IDP ingestion process here.
After ingestion, don't forget to call the tool cleanupProducts to replace the fits files with their extracted headers, in the standard DFOS way. For SLAVE accounts, the headers are transferred to the MASTER account, for central bookkeeping.
Installation and configuration

phoenix is not part of the standard dfos tool set for normal installations, but it is the central tool of the PHOENIX process.

The PHOENIX process requires a customized file .dfosrc and a standard file .qcrc. The following keys are needed in .dfosrc in addition to the ones used in DFOS:
| key | values | meaning |
|---|---|---|
| export THIS_IS_PHOENIX=YES | YES/NO | marks this account as a PHOENIX account; many dfos tools then behave differently than in a DFOS environment (optional; default NO, meaning normal DFOS) |
| export THIS_IS_MCAL=YES | YES/NO | for PHOENIX accounts: marks the environment for MCAL processing (YES) or IDP processing (default: NO). Since for MCAL processing a dedicated file .dfosrc_X is expected, this key can be set to YES only in .dfosrc_X. Don't forget to enter 'NO' in .dfosrc (important if you switch between MCAL and IDP processing). |
The PHOENIX process requires the dfos tools listed above. That list might evolve. Please always use dfosExplorer to control your current PHOENIX installation.
The workspace defined in the .dfosrc file is determined by $PROC_INSTRUMENT.

phoenix v2 can use two configuration files: the standard one for IDPs (always needed), config.phoenix, and a second one (config.phoenix_mcal, only needed for MCALIBs) that replaces the standard one. Both have the same structure:
| | config.phoenix | config.phoenix_mcal |
|---|---|---|
| used for | IDPs, or to point to the other config file | MCALIBs |
| pointer | MCAL_CONFIG undefined --> IDP production; MCAL_CONFIG config.phoenix_mcal --> MCALIB production | |
| rc file | | MCAL_RESOURCE .dfosrc_X |
The name of the MCALIB config file is not hard-coded, you might use another name but it needs to be registered in the main config file, under MCAL_CONFIG. In MCALIB mode, the 'main' config file (let's call it IDP config file) is only read to find the MCALIB config file, no other information is used. The MCALIB config file has the same structure as the IDP config file. This has the effect that it is very easy to use the same account for either MCALIB or IDP production, by simply switching on or off the MCAL_CONFIG key in the IDP config file.
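Schematically, the pointer mechanism could be pictured like this (a sketch under the assumption of whitespace-separated config keys; the real parsing is internal to phoenix):

```bash
# sketch of the IDP/MCALIB config switch (illustrative, not tool code)
cfg=$DFO_CONFIG_DIR/config.phoenix
mcal_cfg=$(awk '$1=="MCAL_CONFIG"{print $2}' "$cfg")
if [ -n "$mcal_cfg" ]; then
    cfg=$DFO_CONFIG_DIR/$mcal_cfg                      # MCALIB mode: use that config
    rc=$(awk '$1=="MCAL_RESOURCE"{print $2}' "$cfg")   # e.g. .dfosrc_X
    . "$HOME/$rc"                                      # source the MCALIB workspace
fi
```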
Both configuration files go to the standard $DFO_CONFIG_DIR. The following configuration keys exist (a few keys are specific for either the IDP or the MCALIB mode, they are marked below in the corresponding colours):
Section 1: GENERAL
This section controls the general tool behaviour.

| key | example | meaning |
|---|---|---|
| MCAL_CONFIG (only in config.phoenix!) | <commented out> | this is IDP production |
| MCAL_CONFIG | config.phoenix_mcal | go to that config file and use it for MCALIB production |
| MCAL_RESOURCE (MCALIBs only) | .dfosrc_X | source this file to define the workspace for MCALIB production |
| CHECK_VCAL (MCALIBs only) | YES | YES/NO (YES): send a notification if no VCAL exists after running createAB (which would be indicative of an issue with the /opt/dfs tools) |
| MAIL_YN | NO | YES/NO (NO): send a notification when finished |
| UPDATE_YN | YES | YES/NO (YES): moveProducts handling of a pre-existing $DFO_SCI_DIR: update (YES) or create from scratch |
| INGEST_ENABLED | YES | YES/NO (NO): enabled for IDP ingestion (requires the ingestionTool and probably the conversion tool) or for MCALIB ingestion |
| ACCEPT_060 | NO | YES/NO (NO): YES --> accept PROG_IDs starting with 60. or 060. (for testing only!) |
| CHECK_LOCAL | NO | YES/NO (NO): YES --> check for mcalibs first in $CAL_DIR_LOCAL (below), then try from NGAS; more see here (effective for IDPs only) |
| CAL_DIR_LOCAL | /fujisan3/qc/muse/calib | if CHECK_LOCAL=YES: root of the operational $DFO_CAL_DIR where mcalibs are checked for first; more see here |
| CERTIF_REQUIRED | NO | YES/NO (NO): YES --> certification required before IDPs can be moved; then a custom certification tool must be defined under 2.3 (CERTIF_PGI) and provided |
| QCJOB_CONCENTRATION | NO | YES/NO (NO): YES --> collect all QC jobs of a month in one single vultur_exec call, to be executed after all daily AB jobs have been executed; applicable if QC jobs dominate the total execution time and if their daily number is small; more see here |

For dual-instance installations, the following four keys need to be defined for the SLAVE only:

| IS_SLAVE | YES | YES/NO (NO): if YES, this is the second installation; all logs and plots information is added to qcweb, but also transferred to the MASTER account, for central bookkeeping |
| MASTER_ACCOUNT | muse_ph2@muc10 | logs/plots are copied to that account (but not fits files!); after ingestion, also the IDP headers are collected here |
| SLV_ROOT | /fujisan3/qc/muse_ph | root directory name for data on the slave account |
| MST_ROOT | /qc/muse_ph2 | same, on the master account |

For MCALIB projects:

| POOL_SIZE | 30 | 10/15/30: pre-defined values for MCALIB processing, which always comes per month |
| OCA_SCHEME | VERSIONED | VERSIONED/STANDARD: VERSIONED means you have OCA rules per epoch, to be defined in Sect. 2.6 (default: STANDARD) |
| RAWDOWN_SLEEP | 600 | optional: forced sleep time between the call of the RAWDOWN file (in background) and the start of processing; default: 60 [sec]. If too long, the system waits idle; if too short, ABs might fail because the system cannot download fast enough |
| REPLACE_LOGIC | SELECTIVE | SELECTIVE/COMPLETE: SELECTIVE means that already archived MCALIBs are replaced under the same name (because then historical ABs can be used for IDP production); COMPLETE means complete creation from scratch (because there are no pre-existing MCALIBs, or all have been deleted) |
Section 2: SPECIFIC for RELEASE

2.1 Define RELEASE and PROC_INSTRUMENT

| #TAG | #PROC_INSTR | #VALUE | #comment |
|---|---|---|---|
| PROC_INSTRUMENT | | UVES | |
| RELEASE | UVES | UVESR_2 | uves reprocessing v2 (2000-2013+) |
| INSTR_MODE | UVES | UVES_ECHELLE | instrument mode, for ingestProducts statistics |

| #TAG (multiple entries possible) | #PROC_INSTR | #source | #validity until | #comment |
|---|---|---|---|---|
| ORIG_AB_DIR | UVES | public_html/UVESR_1/logs | 2007-03-31 | source of ABs on qc@qcweb, up to 2007-03-31 |
| ORIG_AB_DIR | UVES | public_html/UVES/logs | 2099-99-99 | until CURRENT period |

2.2 RAW_TYPEs

Needed for IDPs only; for MCALIBs this part is ignored and replaced by the RAW_TYPEs in the OCA rules. Multiple entries are possible; in case it matters, ABs get processed in the sequence of this configuration.

| RAW_TYPE | UVES | SCI_POINT_ECH_BLUE |
| RAW_TYPE | UVES | SCI_POINT_ECH_RED |
2.3 Definition of pgi's and SUPPRESS_DATE (all optional)

| #TAG | #PROC_INSTR | #VALUE | #comment |
|---|---|---|---|
| DATA_PGI | VIMOS | pgi_phoenix_VIMOS_fixhdr | optional pgi for ins-specific header manipulation (under $DFO_BIN_DIR); for MCAL mode only |
| AB_PGI | UVES | pgi_phoenix_UVES_AB_760 | optional pgi for ins-specific AB manipulation (under $DFO_BIN_DIR) |
| HDR_PGI | UVES | pgi_phoenix_UVES_MCAL | optional pgi for ins-specific header manipulation, called in MCALDOWNFILE (under $DFO_BIN_DIR) |
| CERTIF_PGI | MUSE | phoenix_certifyP_MUSE | optional pgi for ins-specific certification, to be provided under $DFO_BIN_DIR |
| SUPPRESS_DATE | UVES | 2008-03-25 | optional; list of dates to be avoided (because of e.g. a very high number of ABs) in monthly execution mode |
2.4 SETUP filter

SETUP filter (multiple entries possible): a positive list; only the listed setups get processed (useful only for selective reprocessing of certain setups).

| SETUP | UVES | _346 | string, evaluated from the SETUP keys in ABs |
| etc. | | | |
2.5 RENAMING scheme (MCALIBs only)
In order to match the historical naming scheme, you may need to support versioning here; not needed for IDPs.

| #RENAME | #RELEASE | #config filename | #valid until | #comment |
|---|---|---|---|---|
| RENAME | GIRAF_M | config.renameProducts_2009 | 2009-12-31 | old schema, with one read mode only |
| RENAME | GIRAF_M | config.renameProducts_CURRENT | 2099-99-99 | current schema |
2.6 OCA scheme (MCALIBs only)
In order to match the evolution of header keys or of the cascade, you may need to support versioning here; not needed for IDPs. Note the versioning for organization and association files!

| #OCA | #RELEASE | #organization filename | #association filename | #valid until | #comment |
|---|---|---|---|---|---|
| OCA | GIRAF_M | GIRAFFE_organization.h_first | GIRAFFE_association.h_first | 2004-04-01 | first half year had no INS.EXP.MODE |
| OCA | GIRAF_M | GIRAFFE_organization.h_second | GIRAFFE_association.h_second | 2008-05-01 | old CCD: some old static tables don't work with the PREVIOUS rule |
| OCA | GIRAF_M | GIRAFFE_organization.h_curr | GIRAFFE_association.h_curr | 2099-99-99 | new CCD: current OCA rules, edited for the applicable raw_types |
phoenix is a standard workflow tool. Usually instruments have a certain range of peculiarities which need to be handled via plug-ins (PGI's). This concept has proven useful for the daily workflow with autoDaily, and it is also used here. Some PGIs are configured directly in config.phoenix (or config.phoenix_mcal), others are PGIs for a specific dfos tool and are configured in their respective config files. All PGI's are expected under $DFO_BIN_DIR. Here is a list of PGIs directly related to phoenix:
phoenix PGI's:

| pgi | name | purpose |
|---|---|---|
DATA_PGI | pgi_phoenix_VIMOS_fixhdr | ins-specific header manipulation; for MCAL mode only (example: replace indexed INS.FILT<n>.NAME by INS.FILT.NAME etc.); rarely needed, for cases that are in conflict with data flow rules |
AB_PGI | pgi_phoenix_GIRAFFE_AB | for ins-specific AB manipulation; very important for IDP mode: it handles the historical evolution of ABs and makes them uniform and compatible with the current pipeline; it might even generate ABs (from existing information but it cannot replace createAB) |
HDR_PGI | pgi_phoenix_UVES_MCAL | for ins-specific header manipulation of downloaded mcalib file, called in MCALDOWNFILE |
CERTIF_PGI | phoenixCertif_MUSE | for ins-specific IDP certification (if CERTIF_REQUIRED=YES) |
A) IDPs (basic operational mode: per date)

0. Prepare: source .dfosrc
1. Get ABs (for -C, -P, and full execution)
2. Prepare job file (for -C, -P, and full execution)
   [end of step 1; stop here for -C]
3. Call JOBS_FILE (for -P and for full execution)
   [end of step 2; stop here for -P; begin of step 3; enter for -M and for full execution]
4. Optional: call CERTIF_PGI for certification
5. Call moveProducts (for -M and for full execution)
6. Cleanup (finishNight) (done for -M and for full execution)
B) MCALIBs (basic operational mode: per month)

0. Prepare
1. Create ABs (for -C, -P, and full execution)
2. Prepare job file (for -C, -P, and full execution)
   [end of step 1; stop here for -C]
3. Call JOBS_FILE (for -P and for full execution)
   [end of step 2; stop here for -P; begin of step 3; enter for -M and for full execution]
4. Call autoCertifier (in automatic mode, continue only if no issue occurred; in -M mode, offer issues for review); end of monthly mode
5. Daily loop: call moveProducts (for -M and for full execution)
6. Call finishNight (done for -M and for full execution)
7. Clean up products
Self-management of $DFO_CAL_DIR and $DFO_RAW_DIR, and the other DFO directories with permanent data
Although the data disk of muc08 is big (7 TB), it is reasonable to constrain the size of these data directories. Here is a description of the applied strategies (same strategy for IDPs and MCALIBs unless otherwise noted):
$DFO_HDR_DIR: Filled for MCALIBs, otherwise not needed routinely.
$DFO_RAW_DIR: The <date> directory is always deleted after processing.
$DFO_CAL_DIR for MCALIBs: This is the final product directory for MCALIBs (hence logically like the $DFO_SCI_DIR for IDPs). It is not managed by phoenix, hence the products will pile up. Once ingested, they can be removed using cleanupProducts. If you run an MCALIB and an IDP project on the same account, make sure that the two $DFO_CAL_DIRs do not overlap, otherwise your MCALIB products might eventually disappear!
$DFO_SCI_DIR: All products are stored here. The ingestion process takes them from here and ingests the products as IDPs. Once ingested, they are replaced by headers using cleanupProducts. On SLAVE accounts, the headers are transferred to the MASTER account.
$DFO_LOG_DIR and $DFO_PLT_DIR: The usual content of these directories is preserved and copied to qc@qcweb. All logs and plots are accessible from the exported phoenixMonitor. On MASTER accounts, logs and plots from the SLAVE account are also collected.
$DFO_LST_DIR for IDPs: Usually the content here is just copied from the operational accounts.
$DFO_LST_DIR for MCALIBs: The content is filled from the downloaded headers (data report).
AB source (IDPs only): The DFOS qc accounts on qcweb are the central source of ABs for phoenix. The tool normally does not create its own ABs (but there are exceptions for complex science cascades like for MUSE!). Also, the processed ABs (which might be modified) are stored after phoenix processing. The scheme is the following, for the example of UVES:
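Schematically, for UVES the flow could look like this (paths assembled from the ORIG_AB_DIR examples in the configuration section; illustrative):

```
input  (operational ABs):  qc@qcweb:public_html/UVES/logs/<date>/
output (release ABs):      qc@qcweb:public_html/UVESR_2/logs/<date>/
```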
In this way the input and output ABs are kept separate.
The phoenixMonitor tool creates an overview of the phoenix processing history, in the same way as histoMonitor does for DFOS operations.
Special notes
Notification about phoenix on the operational account
In the following we assume that each PHOENIX account has an operational counterpart on muc01...muc07. The case '(re)processing of data of a non-operational (retired) instrument' could be supported but currently is not.

For the stream part of a PHOENIX process (after the bulk processing has been done), it is desirable to get a signal from the operational account that a certain set of master calibrations has been finished and is available for phoenix. This 'batch unit' for phoenix is one month, for no other than pragmatic reasons.

For 'PHOENIX-enabled' instruments, the dfos tools dfoMonitor and histoMonitor on the operational account can be configured to become aware of the respective PHOENIX account. If the configuration keys PHOENIX_ENABLED and PHOENIX_ACCOUNT are set in $DFO_CONFIG_DIR/config.dfoMonitor, then the following happens:

a) histoMonitor, when encountering a new month upon being called from 'finishNight', sends a signal (email) to the QC scientist that a new month has started, meaning that a set of certified master calibrations is available for the previous month. At the same time, a new status flag 'phoenix_Ready' is written into DFO_STATUS, along with the previous month (format YYYY-MM).

b) This flag is caught by dfoMonitor and used to flag that month on the main output page:
c) It is left to the QC scientist when and how this new PHOENIX task is launched. There is no automatic job launch, and there is no communication between the operational machine and the PHOENIX machine (muc08/09/10). The standard way is to launch 'phoenix -m 2013-06' as sciproc@muc08 (or xshooter_ph@muc08 etc.). Depending on the data themselves, and on the data volume, this might take a few hours. Since in principle there might in the future be several concurrent phoenix jobs, or an operational account in addition to sciproc, there is no automatic mechanism to launch PHOENIX jobs.

d) When this step has been done, the QC scientist can confirm this PHOENIX job to be done, by pushing the button 'done'. This triggers a dialogue where the user is asked to confirm the execution, and then this month is removed from the DFO_STATUS file. If there is more than one month, all months will be offered for confirmation, one after the other.
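The enabling keys on the operational account could look like this (key names from the text; the account value is illustrative):

```
# $DFO_CONFIG_DIR/config.dfoMonitor (excerpt)
PHOENIX_ENABLED  YES
PHOENIX_ACCOUNT  giraffe_ph@muc08
```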