Fix:
Make sure you are using a mainframe with the proper EPROMS. The
affected chips are U19 and U22. They should be Version 3.13 or
later. Also make sure you have up-to-date versions of the files
listed above.
Detailed Description:
The mainframe command LS, which returns a series of hexadecimal numbers
corresponding to the summary of changes to each logical unit within a
mainframe, contained a bug. This bug, coded into the mainframe's EPROM,
issued an error statement in place of the logical unit summary in
mainframes containing more than 25 logical units. This was not a
problem before, because only mainframes using 1469s could possibly
contain that many logical units (every LeCroy module is considered to
be one logical unit, except for 1469s, which count as two). The error
message was sent because Arcnet could not handle a character string as
long as the response for more than 25 logical units would require.
Since our boot sequence and asynchronous tasks depend on the LS
command, this meant we could not control a mainframe of 13 or more
1469 modules via Arcnet.
Our Solution:
We contacted LeCroy and informed them of the bug. They sent out some
updated EPROMS, which split the response into two lines, allowing it to
be sent over Arcnet. The first line begins with the flag character 'C',
which tells other software that a second, continuation line follows.
The second line contains neither a flag character nor the standard
command label (all other responses from the mainframe are prefixed with
the command that caused the response).
The support software that handles the responses from the mainframes was
modified to handle this special case. Currently, it assumes that the very
next line of response from that mainframe will be the continuation line.
Also note: LeCroy introduced another bug in the new EPROM software: for
the case of 24 logical units, the LS response is repeated. They have promised
to repair this bug.
Fix:
You need to get an updated version of the above files and recompile.
Detailed Description:
The original hiv EPICS record was based on 1461 modules. The 1471s contained
three properties not present in the 1461 -- Measured Peak Current, Peak Current
Trip, and Ramp Trip Enable. These properties had to be reflected in the hiv
EPICS record, and be included in the support software.
Our Solution:
The three properties were added to the software. The hiv record was expanded
to include these properties; those modules without them still have the
same fields, but they are initialized to zero and never accessed or changed.
Of special note is the Measured Peak Current. As it is a measured
quantity, it must be checked in the seq task (located in seqArcnet.c).
This involves adding a code segment that compares the current checksum
against the last checksum retrieved and, if a difference is detected,
issues a command to retrieve the new value of the measured property.
In general, it is best to duplicate the format used for the existing
properties when coding new properties. We advise anyone adding new
properties to search the above files for the places where changes were
made for the 1471 (use 1471 as a keyword) and then make similar changes
for the properties being added. It may be necessary, for memory
reasons, to create a new version of the hiv record, but we have thus
far avoided doing this.
The other possibility is that the program you are using is issuing
commands to the IOC so fast that a race condition develops.
Fix:
If using your own code, try to slow down the rate at which commands are
issued to the IOC. A good way of doing this is to change one field at a time,
then verify that it has been changed before changing the next field.
Detailed Description:
After issuing a command that changed a large group of HV properties, the
IOC would report Access Faults in one or more of the 'scan' tasks that
sweep the EPICS records periodically (note that the hiv records are
processed both periodically and passively, depending upon whether the
fields in question are being read or set).
We uncovered this problem when testing 1469 modules for the first time. At
that point, we were unaware of the 'Too Many Records' problem, and believed
that the trouble was based solely on a race condition developed by the
HV test_stand code. Now it seems likely that the 'Too Many Records'
problem may have contributed as well.
Our Solution:
We added a section of code to those subroutines that change the HV properties
in large groups. The new section of code forces the program to verify
that each change is made before changing the next one. Unfortunately, this
slows down the code by quite a bit. The code sections can be disabled
by undefining the variable _GROUPVERIFY_ in hv_group.cc. Now that the
'Too Many Records' problem has been discovered and remedied, it may be possible to
remove/disable these verification routines.
Fix:
One of the defined time outs in HiV.h, ASYNC_TIME_OUT, needs to be
increased. This will prevent records from being given an Undefined
status. The other TIME_OUTs should be examined to see if they are
causing similar problems.
Detailed Description:
Previously, this condition did not yield any sort of warning or
error message. When a program tries to access uninitialized fields,
it can react strangely. A common symptom is 'scan' tasks taking
Access Faults and being suspended. Also, MEDM will not
be able to display some fields.
Our Solution:
We increased ASYNC_TIME_OUT from 60 to 300. This seemed to
be enough for the 700+ records we were testing with. We have noticed
no adverse effects from the increase. We have also added an
error message to the code, should the timeouts occur again.
Fix:
Update your version of hv2db, or comment out the lines:
# if ($nrec == 501) {
# print OUTPUT "}\n";
# close(OUTPUT);
# print "Database continued in $ARGV[0]_2.db!!!!\n";
# open(OUTPUT,">$ARGV[0]_2.db");
# print OUTPUT "database($ARGV[0]_2) { nowhere() {\n";
# print OUTPUT "}\n";
# }
in your hv2db file. Rebuild your .db file with the new hv2db script.
You could also simply make sure that the second .db file, phhv_2.db,
was loaded onto the IOC along with the original phhv.db.
Detailed Description:
The hv2db Perl script reads .dat files and outputs an EPICS .db file.
In an older version of EPICS, the DB was loaded with a binary file, and
took so long that spawned tasks on the IOC would time out. The people at
CEBAF thus limited their .db files to 500 records or less. The remaining
records were placed in a file called phhv_2.db. A problem arose for large
databases when this second file was not loaded onto the IOC and thus, many
records simply weren't present in EPICS.
Our Solution:
The code that split the records into two files was obsolete and therefore
was removed.
Fix:
There are 3 general ways to deal with this:
Detailed Description:
The MEDM csh functions, and others like them, send a series of commands
to the IOC to be executed; these commands are buffered. If the IOC runs
out of buffer space, this problem occurs.
Our Solution:
We are currently looking into purchasing more memory for the IOC. For now, the
dbPvdTableSize has been increased to 1024.
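For reference, dbPvdTableSize is a global set in the IOC startup
script before iocInit is called; a sketch of the relevant st.cmd line
(the exact placement in your startup file may differ):

```
# st.cmd fragment: enlarge the process-variable directory table
# before iocInit so more records can be resolved.
dbPvdTableSize = 1024
```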