We give here a short description of the run control software, how it is designed and translated into code as well as how it works internally when the user issues a certain command (e.g. when the user starts a run).
It is useful to keep in mind that this project started using an OO methodology (Shlaer-Mellor). Although it evolved quite a bit away from being a pure Shlaer-Mellor project, it nevertheless still has a lot of concepts that survived the course of time. Therefore we follow this OO methodology in describing in the first section the layout of the software (OOA of the run control). We first describe the classes and how they interrelate (using an information model or ERD [entity relationship diagram]). After that we say a few words about the state models of the few active objects in the run control.
In the second section we briefly describe how the model has been translated into C++ code and how it relates to the rest of the ONCS software.
In the following three sections we briefly describe what really happens inside the run control when it is started, when the user issues the download command, and when a run is started.
Finally it should be kept in mind that the PHENIX ONCS project is far from being completed.
Many things will be changed as the detector is completed and all four arms become
operational....
Currently the active objects (objects with a state machine) in the run control
are: partition, process stage and run. Their state machines are described
next:
The figure above shows the state model of the partition object, which
is created in the configuring state when the run control starts up.
A configure command can be a command such as
"allocate granule XYZ" or
"GTM.DC.W modebitfile ./GTM.DC.W.gtm", the latter of which sets the
configurable parameter modebitfile for the process unit GTM.DC.W.
Once the user is finished configuring, he/she issues the download command.
All the partition does at that point is to inform all process
stages that are active to start downloading. The partition then
waits until all process stages have reported a successful download
(or the occurrence of an error). A timeout mechanism guarantees
that the error state is reached if one or more process stages do
not reply within a given time window.
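The download handling described above can be sketched as a small state machine. This is only an illustration under our own assumptions, not the ONCS code: the names PartitionSketch, on_stage_reply and on_timeout are hypothetical, and the timeout is modelled as an explicit event instead of a real timer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of the partition state model; all names are illustrative.
enum class PartitionState { Configuring, Downloading, Ready, Error };

class PartitionSketch {
public:
    explicit PartitionSketch(std::size_t active_stages)
        : active_stages_(active_stages) {}

    PartitionState state() const { return state_; }

    // A configure command is accepted while configuring or when ready.
    void configure() {
        if (state_ == PartitionState::Configuring || state_ == PartitionState::Ready)
            state_ = PartitionState::Configuring;
    }

    // The download command: all active process stages are told to start,
    // then the partition waits for their replies.
    void download() {
        if (state_ != PartitionState::Configuring) return;
        pending_ = active_stages_;
        state_ = (pending_ == 0) ? PartitionState::Ready : PartitionState::Downloading;
    }

    // Called once per process stage when it reports success or an error.
    void on_stage_reply(bool success) {
        if (state_ != PartitionState::Downloading) return;
        if (!success) { state_ = PartitionState::Error; return; }
        assert(pending_ > 0);
        if (--pending_ == 0) state_ = PartitionState::Ready;
    }

    // Called by a timer (not shown) when the time window has expired.
    void on_timeout() {
        if (state_ == PartitionState::Downloading) state_ = PartitionState::Error;
    }

private:
    std::size_t active_stages_;
    std::size_t pending_ = 0;
    PartitionState state_ = PartitionState::Configuring;
};
```

Note that the sketch only counts replies. It knows nothing about what a process stage actually does, which matches the division of labour described in this section.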
A subtlety should be mentioned here: we normally talk about the
download that brings the data acquisition chain into the ready state.
As a matter of fact this is done in two steps: The first step is
the "initialise" procedure. This is normally only done once during
the lifetime of a partition. Then there is the proper download,
which can be done several times during the lifetime of the partition.
As an example: the GTM objects in VxWorks get created and reset at the
beginning, when the run control starts up (initialisation phase).
However the modebitfile can be loaded many times over the lifetime
of a partition (e.g. download the modebitfile, take 1000 events,
end the run, download another modebitfile, take another 1000 events, etc).
Therefore loading the modebitfile is done during the download
phase of the run control.
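The two-step behaviour (initialise once, download many times) can be illustrated with a few lines of C++. This is a sketch under our own assumptions: the class DownloadSequencer is hypothetical and simply records which phases a download command would trigger.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: records which phases a "download" command triggers.
class DownloadSequencer {
public:
    // Returns the phases executed for this download command, in order.
    std::vector<std::string> download() {
        std::vector<std::string> phases;
        if (!initialised_) {
            phases.push_back("initialise");  // done once per partition lifetime
            initialised_ = true;
        }
        phases.push_back("download");        // repeated for every download command
        ++downloads_;
        assert(initialised_);                // always true after the first call
        return phases;
    }

    int downloads() const { return downloads_; }

private:
    bool initialised_ = false;
    int downloads_ = 0;
};
```

The first call returns both phases, every later call only the download phase. From the outside both look like the same command.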
Finally it should be noted that the partition has no information about the
process stages or the process units. All it does is pass the
user commands to the process stages and wait for their replies.
The process stage normally waits for a command of the form
"initiate state transition to ready" (which really means
"bring all your process units into the desired state X",
with X = ready in this case). Upon
arrival of such a command, the process stage informs all its
process units that they should start the process which will
eventually get them into the ready state.
Note that the process stage has no clue about the detailed
operations that the process units have to perform. All that information
is encapsulated in the process units themselves. All the process stage
does is to call a virtual member function of the process unit
(e.g. initialise(), download(), start_run() or stop_run() ).
The process stage then waits until it has received the information from
all process units that they have arrived in the desired state.
The partition object gets notified if that is the case within
a given timeout window. Otherwise the process stage drops into the
error state.
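The following sketch shows how a container can drive its process units purely through virtual member functions. It is written in present-day C++ for brevity and is an illustration only: the enumerations, the DummyUnit subtype and the member names start_download, update and on_timeout are our own assumptions, not the ONCS interfaces.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

enum class UnitState { Idle, Busy, Ready, Error };
enum class StageState { Waiting, Ready, Error };

// Abstract process unit: subtypes hide what a download actually involves.
class ProcessUnit {
public:
    virtual ~ProcessUnit() = default;
    virtual void download() = 0;
    UnitState state() const { return state_; }
protected:
    UnitState state_ = UnitState::Idle;
};

// Hypothetical subtype that jumps straight to a preset state.
class DummyUnit : public ProcessUnit {
public:
    explicit DummyUnit(UnitState after) : after_(after) {}
    void download() override { state_ = after_; }
private:
    UnitState after_;
};

class ProcessStage {
public:
    void add(std::unique_ptr<ProcessUnit> pu) {
        assert(pu != nullptr);
        units_.push_back(std::move(pu));
    }

    // Ask every unit to start; the stage never looks inside a unit.
    void start_download() {
        for (auto& pu : units_) pu->download();  // virtual dispatch
        update();
    }

    // Recompute the summary state from what the units report.
    void update() {
        bool all_ready = true;
        for (const auto& pu : units_) {
            if (pu->state() == UnitState::Error) { state_ = StageState::Error; return; }
            if (pu->state() != UnitState::Ready) all_ready = false;
        }
        state_ = all_ready ? StageState::Ready : StageState::Waiting;
    }

    // A timer (not shown) would call this when the timeout window expires.
    void on_timeout() {
        if (state_ == StageState::Waiting) state_ = StageState::Error;
    }

    StageState state() const { return state_; }

private:
    std::vector<std::unique_ptr<ProcessUnit>> units_;
    StageState state_ = StageState::Waiting;
};
```

Adding a new kind of hardware then only means adding a new subtype of ProcessUnit. The stage code stays untouched.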
There is the possibility that a process unit could report
a state change ("pu done" in the figure above) outside the above
described sequence. That situation
is handled in the state "update process unit state".
Finally a word about the relation between the process stage and
the process unit. The process stage is the active object, the
process units are not, but the grand summary of their state
is an attribute of the process stage. However the state of
the process units does not play a role in the state
transitions of the process stage. The process stage per se is
only interested in bringing the process units from an initial
state X to a final state Y.
OOA of the Run Control
Information Model
The figure below shows the information model of the run control
(i.e. the main objects of the run control and how they relate
to each other). The figure is followed by a brief description
of the main objects.
State Models
State Model for the Partition Object
Once all process stages have reported a success, the partition
arrives in the Ready state, from where the user can either issue a new
configure command or the start_run command.
However the user typically never sees that there is the additional
step of the initialisation done the first time "download" is
executed.
State Model for the Process Stage Object
The process stage object is really only a container object for
a bunch of process units. As such it has no clue what the process
units really are. This enables a uniform state model for all instances
of the process stage, i.e. the behaviour of this object is independent
of what it controls.
The subtypes of the process units (e.g. GTM or DCM objects) hold all
the relevant information about what exactly has to be done, and when,
during the download/start/stop run sequence.
State Model for the Run Object
The state model of the run object is very simple: it is created,
waits until it gets the "started" command from the run control,
and drops into the running state. Once in the running state, it
performs a number of checks at regular intervals, such as
updating how many events have been taken and checking whether
the event limit has been reached (if the user has specified a limit).
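A minimal sketch of such a run object is given here. The names RunSketch, check and limit_reached are hypothetical, and we assume that an event limit of 0 stands for "no limit specified". What the run control does once the limit is reached is deliberately left out.

```cpp
#include <cassert>

enum class RunState { Created, Running };

// Hypothetical sketch of the run object; all names are illustrative.
class RunSketch {
public:
    // event_limit == 0 is taken to mean "no limit specified by the user".
    explicit RunSketch(unsigned long event_limit = 0) : limit_(event_limit) {}

    // The "started" command from the run control.
    void started() {
        if (state_ == RunState::Created) state_ = RunState::Running;
    }

    // Periodic check: refresh the event count (ignored unless running).
    void check(unsigned long events_taken) {
        if (state_ != RunState::Running) return;
        assert(events_taken >= events_);  // the event count never decreases
        events_ = events_taken;
    }

    bool limit_reached() const { return limit_ != 0 && events_ >= limit_; }
    unsigned long events() const { return events_; }
    RunState state() const { return state_; }

private:
    unsigned long limit_;
    unsigned long events_ = 0;
    RunState state_ = RunState::Created;
};
```

In a real system check() would be driven by a timer. Here the caller passes in the current event count by hand.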
OOD of the Run Control
How is the above described run control model mapped into C++ code?
The following rules have been applied:
Relationships between objects are translated into a pointer
to the corresponding instance (for a to-one relationship)
or an STL list of pointers (for a to-many relationship). [Note that this
will change as we move some of the objects into the database].
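As an illustration of this rule, the relationship between a process stage and its process units could be written as follows. The class names ProcessStageNode and ProcessUnitNode are hypothetical and the real ONCS classes certainly carry more attributes. The sketch only shows the pointer and the STL list of pointers.

```cpp
#include <cassert>
#include <list>
#include <string>

class ProcessStageNode;

class ProcessUnitNode {
public:
    explicit ProcessUnitNode(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }

    // To-one relationship: a plain pointer to the related instance.
    ProcessStageNode* stage() const { return stage_; }

private:
    friend class ProcessStageNode;
    std::string name_;
    ProcessStageNode* stage_ = nullptr;
};

class ProcessStageNode {
public:
    // To-many relationship: an STL list of pointers.
    // The stage does not own the units in this sketch.
    void attach(ProcessUnitNode* pu) {
        assert(pu != nullptr);
        units_.push_back(pu);
        pu->stage_ = this;  // keep both directions of the relationship in sync
    }

    const std::list<ProcessUnitNode*>& units() const { return units_; }

private:
    std::list<ProcessUnitNode*> units_;
};
```

Both directions of the relationship are set in one place (attach), so they cannot drift apart.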
What Happens at the Startup Time of the Run Control?
The following operations take place when the user starts the run control
(e.g. by typing daq_boot.sh Id DC.W+DC.E+PBSC.W).
What Happens at Download Time of the Run Control?
The following operations take place when the user issues the
download command.
The process stages in turn invoke the virtual member function
on their process units, which send the CORBA commands to the event
notifier, which in turn passes them to the CORBA objects in the
outside world. As these objects execute the commands, they report the
status back to the proxy object in the run control. If an error is
reported, the process stage drops into the error state. If success
is reported, update_pu() is called on the process unit until the
process unit reports that it has reached the initialised state.
The process stage reports back to the partition that the initialised
state has been reached once all process units have reached the
initialised state.
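The command/reply cycle can be sketched as a process unit that works through a list of commands, one per reported success. This is an assumption-laden illustration: the class SequencedUnit is hypothetical, the boolean parameter of update_pu() is our own guess at how success or failure might be passed in, and the CORBA call is replaced by appending to a log.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

enum class PuState { Idle, Initialising, Initialised, Error };

// Hypothetical process unit that sends its commands one after another.
class SequencedUnit {
public:
    explicit SequencedUnit(const std::vector<std::string>& commands)
        : todo_(commands.begin(), commands.end()) {}

    void initialise() {
        state_ = PuState::Initialising;
        send_next();
    }

    // Called when the remote object reports the status of the last command.
    // (The bool parameter is an assumption made for this sketch.)
    void update_pu(bool success) {
        if (state_ != PuState::Initialising) return;
        if (!success) { state_ = PuState::Error; return; }
        send_next();
    }

    PuState state() const { return state_; }
    const std::vector<std::string>& sent() const { return sent_; }

private:
    // Send the next command, or declare the unit initialised if none is left.
    void send_next() {
        assert(state_ == PuState::Initialising);
        if (todo_.empty()) { state_ = PuState::Initialised; return; }
        sent_.push_back(todo_.front());  // stand-in for the real CORBA call
        todo_.pop_front();
    }

    std::deque<std::string> todo_;
    std::vector<std::string> sent_;
    PuState state_ = PuState::Idle;
};
```

The process stage only has to watch state(). It does not need to know how many commands a unit sends.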
The user can also issue another configuration
command, which will bring the partition object back into the
configuring state. Note that a download command has to be issued
before a new run can be started. The download command has a parameter
which restricts the download to those components whose configuration
has been changed. This significantly reduces the download
time.
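The idea of downloading only what has changed can be sketched with a simple "changed" flag per component. The struct Component, the free function download and its only_changed parameter are hypothetical names chosen for this illustration. How the run control actually tracks configuration changes is not described here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical component record; "changed" is set by a configure command.
struct Component {
    std::string name;
    bool changed;
};

// Downloads either everything or only the changed components and
// returns the names of the components that were downloaded.
std::vector<std::string> download(std::vector<Component>& parts, bool only_changed) {
    std::vector<std::string> done;
    for (auto& c : parts) {
        if (only_changed && !c.changed) continue;  // skip untouched components
        done.push_back(c.name);                    // stand-in for the real download
        c.changed = false;                         // configuration is now in sync
    }
    assert(done.size() <= parts.size());
    return done;
}
```

A second partial download right after the first one finds nothing to do, which is where the time saving comes from.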
Very simple: the initialise member function of the FEM object
only sets the state attribute to initialising. When the GTM object
has performed the reset and resetGLINK, it starts the
actual Arcnet initialise command.
What Happens at StartRun/StopRun Time?
The following operations take place when the user issues the
startRun/stopRun command:
The end run sequence is almost identical. Note also that all the sequences
(initialisation, download, start/stop run) are pretty much the same!
done event back to their run control proxies,
which will either send another command or, when all commands have been sent,
inform the process stage that they are now running.