Rotator Tape replacement - 8-2021

Over summer downtime in 2021, the rotator tape encoder was replaced. I was asked to make the necessary software changes to run with the new tape.

This ought to simply be editing two values in the file "config.h" in the mount software. The trick is determining what these two values ought to be. In addition, there is bound to be extra testing and fiddling if the tape heads are not aligned as they should be. Another concern is that if the software is fired up with the old offset values, the rotator may take off to hell and gone to try to position itself at what it thinks to be position zero.

It turned out that absolutely none of the following was required! I had not expected this, but apparently Heidenhain makes each tape identical to the last (they don't just chop so many feet off of supply on a reel like I assumed). So a new tape can be installed without any recalibration. Will also has records that a tape was installed in 2007 - an event I have no memory of, and this would make sense if nothing in software needed to be changed.

Where is the source code and how is it compiled

First here is the short version: The "make clean" is absolutely required. The Makefile for the mount code does not handle the dependencies as it should.

But before you do just that, read and understand all that follows:

I do all of my building on the machine "hacksaw". This is only expected to work for user "tom", so these instructions may mention macros or scripts that will need to be sought in /home/tom/bin -- but there ought to be very little of that. I have a "cdm" alias that takes me to the directory with the mount code. One thing that is important is a script "vxmake" that is in /home/tom/bin and runs make after defining environment variables needed by the cross compiler.

Someone, at some time, apparently checked all of this code into a git repository, but I don't actually make use of that and was not included in whatever was done in that way, so caveat emptor.

I have a macro "cdm" that takes me to the directory "/mmt/vxsource/mmt/mount/src". I edit files and type "make" to rebuild the code in this directory. Because I am lazy and careless, my Makefile does not have a proper full set of dependencies, so what I actually type is:

make clean ; make

This is my approach to any and all projects -- have a directory with the source files and typing "make" should rebuild it all. Building in this directory in no way affects the running code, so I feel free to edit, compile and fool around in this directory. When I am ready to actually deploy the code, something else is required.

Some notes about files in /mmt

The source files and build environment are kept in /mmt/vxsource. The /mmt directory is NFS mounted from our buggy NAS server box. For reasons I don't understand, the timestamps on files in this directory are totally haywire. For example, I just rebuilt the mount code (in August of 2021) and it shows:
ls -l rot*
-rw-rw-r-- 1 tom mmtsoft 34250 Jan 16  2001 rotator.c
Once again, caveat emptor.
This is what you get when you buy NAS boxes from the lowest bidder.

Deploying a new version of the mount code

First I build a version of the mount code I am satisfied with in the "src" directory, as discussed above. Then I "cd .." which places me in the directory "/mmt/vxsource/mmt/mount". The Makefile in this directory is what is used to put the compiled mount code where it will be picked up by the next mount reboot.
All that is required is:
make mount
The mount crate boots from hacksaw and looks for stuff in:
/home/vwuser/Vwstuff/Mount
Typing "make mount" will do 3 important things: For example, after typing "make mount", I see:
cd /home/vwuser/Vwstuff
ls -lrt
Aug 18 14:26  Mount_devel -> /home/vwuser/Vwstuff/Mount_versions/v1.94.210818_142646
Aug 18 14:26  Mount
Note that this in no way affects the Mount_stable link. See below for that.

Backup versions

Every version of the mount code that has ever been built and deployed is kept in the Mount_versions directory (which is actually a symbolic link to "/mmt/vxsource/mmt/mount_versions/Mount_versions"). This directory contains a bunch of timestamped "snapshot" directories. Any one of these could be copied to Mount. So the "Mount" directory is ephemeral -- its contents can change at any time and it is expected that it will be deleted and replaced with other contents at various times.

If you start wondering just what version is in the Mount directory, cat the file "TIMESTAMP". If you are running the most recent code, you will see something like:

v1.94.201022_134412
This will match (within a few seconds) the name of some directory in Mount_versions. The TIMESTAMP file in that directory will match exactly.
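
If you would rather let the computer do the "within a few seconds" comparison, here is a minimal sketch in C. It assumes the stamp layout is v<major>.<minor>.<YYMMDD>_<HHMMSS>, which is what the examples on this page look like. The function names are made up for illustration and are not part of the mount code, and a build that straddles midnight is not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Split a stamp like "v1.94.201022_134412" into its date field and
 * seconds since midnight. Returns 0 on success, -1 if the string does
 * not look like a stamp. (Assumed layout, see the text above.) */
static int parse_stamp(const char *s, long *date, long *secs)
{
    int major, minor, hh, mm, ss;
    long ymd;

    assert(s != NULL && date != NULL && secs != NULL);
    if (sscanf(s, "v%d.%d.%6ld_%2d%2d%2d",
               &major, &minor, &ymd, &hh, &mm, &ss) != 6)
        return -1;
    *date = ymd;
    *secs = hh * 3600L + mm * 60L + ss;
    return 0;
}

/* Returns 1 if both stamps carry the same date and their times differ
 * by at most slop seconds, 0 otherwise (including unparseable input). */
int stamps_match(const char *a, const char *b, long slop)
{
    long da, db, sa, sb;

    if (parse_stamp(a, &da, &sa) != 0 || parse_stamp(b, &db, &sb) != 0)
        return 0;
    return da == db && labs(sa - sb) <= slop;
}
```

For example, stamps_match("v1.94.201022_134412", "v1.94.201022_134414", 5) reports a match, since the two differ by 2 seconds.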

"retro" versions

We have a script named "retro" in /mmt/scripts. The idea behind this script is to make it easy (trivial) for the operators to revert to some proven version of the mount code if a new version is causing trouble. This allows them a fallback in the event the cognizant engineer cannot be reached, or whatever. The operator simply types "retro mount", then reboots the mount crate and gets on with things.

The retro scheme is based on two links in the /home/vwuser/Vwstuff directory, namely:

Mount_devel -> /home/vwuser/Vwstuff/Mount_versions/v1.94.201022_134414
Mount_stable -> /home/vwuser/Vwstuff/Mount_versions/v1.94.200903_151348
Typing "retro mount" will delete everything in "Mount" and replace it with the contents of "Mount_stable".

Mount_stable must be maintained by hand, i.e. by doing something like:

cd /home/vwuser/Vwstuff
rm Mount_stable
ln -s /home/vwuser/Vwstuff/Mount_versions/v1.94.201022_134414 Mount_stable
ls -lrt
The idea here is that the stable version may remain stable for some time while multiple development versions come and go. Nothing is ever designated as a stable version automatically, it is done by hand after long deliberation and careful consideration.

How does the darn rotator work?

The observant reader may have noticed that up to now we have said nothing whatsoever that pertains specifically to the rotator. However, without being in firm command of what has been discussed so far, we dare not make any changes to the mount software.

The rotator has two heads that read the same tape at locations that are a nominal 180 degrees apart. We can run with either or both heads and have long been running with just one (I believe the NE head, as the SW head had long refused to allow us to calibrate using it). We use a pair of lm628 chips to read the tape. One is for the NE head and one is for the SW head.

The tape has nice evenly spaced lines to give relative position. The lm628 starts a counter at zero at whatever position it is at when the system starts up, and counts up or down as the rotator moves. Marks are detected by an "index" input to the lm628. This could interrupt the processor, but we don't do things that way. We call a function at 100 Hz and check a status register in the lm628 to find out if we are currently on a mark. If we are, we record the count position. Search the code for "INDEX_BIT" to see where we detect marks.

A person might wonder how reliable this is, and if we might miss marks. Do we have to move slowly so as not to miss marks? The lm628 actually has a "capture" facility. It sets a status bit when a mark is detected and latches the counter value so it can be read out later. So the only way to miss a mark is to encounter a second one before the first has been read out.
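
To make the capture idea concrete, here is a small stand-alone model of it. This is a sketch and not the real driver: struct fake_lm628, fake_index_pulse() and poll_for_mark() are invented for illustration, and the INDEX_BIT value here is a placeholder (look in the code for the real one). The model assumes that reading out the latched value is what re-arms the capture.

```c
#include <assert.h>
#include <stddef.h>

#define INDEX_BIT  0x08u  /* placeholder mask, not copied from the driver */
#define MAX_MARKS  32

/* Stand-in for the chip: a running counter, a latched copy, a status word. */
struct fake_lm628 {
    long counter;
    long latched;
    unsigned status;
};

struct mark_log {
    long pos[MAX_MARKS];
    size_t n;
};

/* "Hardware" side: an index pulse latches the counter and raises the bit.
 * A second pulse before readout overwrites the latch, which is the one
 * way a mark can be lost. */
void fake_index_pulse(struct fake_lm628 *chip)
{
    assert(chip != NULL);
    chip->latched = chip->counter;
    chip->status |= INDEX_BIT;
}

/* Software side, called once per 100 Hz tick.
 * Returns 1 if the index bit was found set on this tick. */
int poll_for_mark(struct fake_lm628 *chip, struct mark_log *log)
{
    assert(chip != NULL && log != NULL);
    if (!(chip->status & INDEX_BIT))
        return 0;
    if (log->n < MAX_MARKS)
        log->pos[log->n++] = chip->latched;  /* latched value, not the live count */
    chip->status &= ~INDEX_BIT;              /* readout re-arms the capture */
    return 1;
}
```

The point is that the recorded position is the latched one, so it does not matter how far the counter has moved on by the time the 100 Hz poll gets around to looking.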

We only care about marks during initialization when we jog the tape back and forth in order to harvest enough marks to calculate absolute position. Once we have done that, we ignore the marks and just let the lm628 counter register keep track of position.

Show me

Here is a tip to anyone who comes along someday and needs to deal with my code for VxWorks. I like to set up "show" routines for each subsystem I am working with. These are C functions with no arguments that display interesting and important aspects of whatever we happen to be working with. So a "grep" for show in rotator.c serves to remind me of what is available:
show_marks -- displays all marks harvested thus far.
tape_show -- show current readings from the two tape heads.
rot_show -- some overlap with the above and current position.
Once you get to the VxWorks shell, you type in any of the above whenever you care to. They should always be harmless and yield information.

tape_show

This looks invaluable for what we are doing. It displays data from both the NE and SW heads. A sample of its output appears further down.

Do we (or should we) count marks separately for the NE and SW heads? Good question. We gather them all up at once in the same scan, but flag each captured mark as to which head it came from. Then later when we calibrate one head or the other, we consider only the marks that came from that head.
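
The "flag each mark, filter later" scheme could be sketched like this. TAPE_NE is a name from the real code; TAPE_SW, struct tape_mark and marks_for_head() are my guesses for the sake of the example, so do not go looking for them in rotator.c.

```c
#include <assert.h>
#include <stddef.h>

enum tape_head { TAPE_NE, TAPE_SW };   /* TAPE_SW is an assumed name */

/* One harvested mark: which head saw it and the captured count. */
struct tape_mark {
    enum tape_head who;
    long count;
};

/* Copy the counts of the marks that came from head `who` into out[],
 * keeping scan order. Returns how many were copied (at most out_max). */
size_t marks_for_head(const struct tape_mark *all, size_t n_all,
                      enum tape_head who, long *out, size_t out_max)
{
    size_t n = 0;

    assert(all != NULL && out != NULL);
    for (size_t i = 0; i < n_all && n < out_max; i++) {
        if (all[i].who == who)
            out[n++] = all[i].count;
    }
    return n;
}
```

Calibrating the NE head would then start from marks_for_head(all, n, TAPE_NE, ...) and never see an SW mark.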

What I see in the code for position reporting is this:

    printf ( "NE tape position: %d + %d --> %8.3f\n", rpos, tape_off_ne, dpos );
    printf ( "NE published tape position: %8.3f\n", tcs.rot_tape_ne_deg );
I added a line to this as follows. The "abs" value would be the new tape offset if the rotator was positioned at 0.0.
    printf ( "NE tape position: %d + %d --> %8.3f\n", rpos, tape_off_ne, dpos );
    printf ( "NE tape abs, off: %d , %d\n", apos, TAPE_ZERO_NE );
    printf ( "NE published tape position: %8.3f\n", tcs.rot_tape_ne_deg );
Running the old code with the new tape 8-20-2021, this yields:
-> tape_show
Rot-tape lm628 NE status: a08
Rot-tape lm628 SW status: c1
NE tape position: -231648 + 2176 -->   27.087
NE published tape position:   27.075
SW tape position: 0 + 2257 -->   -0.266
SW published tape position:   -0.266
4 marks captured
-> rot_show
Rot lm_rpos: -256335
tNE lm_rpos: -256335
tSW lm_rpos: 0
Rot ipos: 254160
Rot incr_offset(zpos): 2175
Rot curp: 254159
Rot comp: 254150
Rot com_derot: 30.000
Rot cur_derot: 30.001
Rot closed on tape encoder
So, we display 4 different values for the tape position, let's look at each of these.

rpos - This is the raw count from the lm628, so it is the distance in tape counts from wherever the system started up.

dpos - This is a position given as a floating point number in degrees.

Here is the code to calculate dpos, note the sign change.

        if ( who == TAPE_NE ) {
            pos = -(lm_rpos ( tape_lm628_ne ) + tape_off_ne);
        } else {
            pos = -(lm_rpos ( tape_lm628_sw ) + tape_off_sw);
        }

        dpos =  pos / TAPE_CPD_LM628;
        if ( dpos > 180.0 ) dpos -= 360.0;
        if ( dpos < -180.0 ) dpos += 360.0;
        return dpos;
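
As a sanity check, the same arithmetic can be run against the tape_show numbers above. This is a stand-alone sketch, not the real routine. The value of TAPE_CPD_LM628 is not given on this page, so CPD_GUESS below is an assumption, back-computed from the rot_show output (254150 counts commanded at 30.000 degrees), and tape_degrees() is a made-up name.

```c
#include <assert.h>

/* Assumed counts per degree: 254150 / 30.000 from the rot_show output.
 * The real TAPE_CPD_LM628 may differ. */
#define CPD_GUESS 8471.67

/* Raw lm628 count plus offset, converted to degrees in -180..180. */
double tape_degrees(long raw, long offset, double cpd)
{
    double dpos;

    assert(cpd > 0.0);
    dpos = -(double)(raw + offset) / cpd;   /* note the sign change */
    if (dpos > 180.0)  dpos -= 360.0;
    if (dpos < -180.0) dpos += 360.0;
    return dpos;
}
```

With that guess, tape_degrees(-231648, 2176, CPD_GUESS) comes out within a thousandth of a degree of the 27.087 shown for the NE head, and tape_degrees(0, 2257, CPD_GUESS) lands on the -0.266 shown for the SW head.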
The variable "tape_off_ne" incorporates two things: the adjustment that converts the arbitrary count (zero being wherever we happened to start up) into an absolute tape count (we get this from the tape marks), and the value in the config file that specifies what absolute tape value we want to call zero. The current values in the config.h file are:
#define TAPE_ZERO_NE    (  -594471 )
#define TAPE_ZERO_SW    ( -2119488 )
These are used in tape_cal() as follows:
tape_off_ne = ne_pos - ne_index - TAPE_ZERO_NE;
The two variables, ne_pos and ne_index are returned by the routine tape_calpos() and give the raw reading and the absolute reading that corresponds to that same raw reading. This is for the "far" (i.e. most negative) mark of the pair used for calibration. The routine uses the first suitable pair it finds and calls it good.

Given this, suppose we now get a raw position count from the lm628. For a moment, let TAPE_ZERO_NE be zero (just ignore it). We convert the raw count to an absolute value by calculating:

    abs = raw - (ne_pos - ne_index);
    abs = raw - tape_off_ne;
Either of the two expressions above yields the same value. Now let us put the rotator at the position we would like to call zero, record the "abs" value, change its sign, and call it TAPE_ZERO_NE. Now we can calculate a proper zero referenced absolute value as:
    abs = raw - (ne_pos - ne_index - TAPE_ZERO_NE);
    abs = raw - tape_off_ne;
You might quibble about the sign we attach to this "zero offset", but since we lump it into tape_off_ne and have already decided to subtract it, it all works out. If it bugs you, you could change the sign in the config.h file and in the expression above, but I chose to just leave it be.
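
Here is the zero-offset recipe worked through as code, for anyone who (like me) has to redo the algebra every time. It is a sketch using the "raw - offset" convention of the prose just above, not the "raw + offset" form that appears in the dpos code. All of the function names are invented, and any numbers you feed it are yours, not values from the telescope.

```c
#include <assert.h>

/* tape_off = pos - index - zero, same shape as the tape_cal() line above. */
long tape_offset(long mark_raw, long mark_abs, long tape_zero)
{
    return mark_raw - mark_abs - tape_zero;
}

/* Absolute count for a raw reading, given an offset. */
long tape_abs(long raw, long off)
{
    return raw - off;
}

/* The value to put in config.h: the absolute count read at the desired
 * zero position (computed with a zero of 0), with its sign changed. */
long pick_tape_zero(long raw_at_zero, long mark_raw, long mark_abs)
{
    return -tape_abs(raw_at_zero, tape_offset(mark_raw, mark_abs, 0));
}

/* Self-check: with the chosen zero, the desired position must read 0. */
void check_zero(long raw_at_zero, long mark_raw, long mark_abs)
{
    long zero = pick_tape_zero(raw_at_zero, mark_raw, mark_abs);
    long off  = tape_offset(mark_raw, mark_abs, zero);

    assert(tape_abs(raw_at_zero, off) == 0);
}
```

With made-up numbers (a mark seen at raw -1500 whose absolute value is 600000, and the rotator parked at raw 4000 where we want zero), pick_tape_zero() gives -605500, and a raw reading of 4100 then reports 100 counts from zero.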

Rotator startup

Some initialization gets done when the mount code boots, but the interesting action happens later when the "drive" gets turned on.

Much of this is handled in the file axis.c, and a key routine is turn_derot_drive_on(). This is called within the routine start_motor() in axis.c. The code there looks like:

	printf ( "start motor: %s drive on\n", ap->name );
	turn_derot_drive_on();
	printf ( "start motor: %s calibrate\n", ap->name );
	calibrate_rotator_first ();

The first call (turn_derot_drive_on()) simply sets the bit in the 26 volt interlock rack to enable power to the rotator servo amplifiers. The second routine takes us, as expected, to the file rotator.c. There is code to take a short cut on subsequent rotator startups and use (and trust) previous calibrations, as well as ways to override that, although those are tricky and virtually never used.

The action after that is driven by a state machine in the routine tape_func() (which gets called at 100 Hz by the servo task). It causes motion by setting values in the variable axis[DEROT].cur_slew_rad:

#define TAPE_CAL_DEL		( 2.50 * DEGTORAD )

    axis[DEROT].cur_slew_rad = tape_cal_start + TAPE_CAL_DEL;
    axis[DEROT].cur_slew_rad = tape_cal_start - TAPE_CAL_DEL;
    axis[DEROT].cur_slew_rad = tape_cal_start;
The code above is how the tape "jiggle" is produced. The starting position is read, then moves are done relative to that, and finally we return to where we started. The amount of the wiggle is 2.5 degrees plus and minus from the starting position.

Importantly, no large motion of the rotator should ever be expected at startup. The drive should start and hold position, then the jiggle motion should be 2.5 degrees around that starting position. Any large motion would be a runaway and would indicate lack of encoder feedback. Note importantly that we currently close the servo around the tape encoder.

Runaway detection and handling

There are two levels of checks, dubbed the E check and F check.

Some important background first. We have two lm628 chips for the rotator. These are in an IP-servo module on our PCI IP carrier board. CHIP0 is used for the NE head, and CHIP1 is used for the SW head. CHIP0 is also used to run the servo (i.e. provide the signal to the motor). We have run for a long time without the SW head. It would only be used for calibration if operational.

E check is done entirely by our software. If we have had a non-zero servo velocity and have seen no motion for some amount of time, we call it a runaway and this is the E check.

The E check time is 1 second (SERVO_HZ ticks of the servo clock) and the threshold for considering the velocity non-zero is ap->echeck_vthresh. This variable is set to 400. (For the Alt and Az axes it is set to 100000 so the E check never happens).
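
In outline, the E check amounts to a counter that runs while we are commanding motion and seeing none. The sketch below is my paraphrase of that logic, not a copy of the real code. struct echeck and echeck_tick() are invented names, SERVO_HZ is assumed to be 100 (the servo task rate mentioned earlier), and "no motion" is taken to mean the position count did not change at all from one tick to the next.

```c
#include <assert.h>
#include <stdlib.h>

#define SERVO_HZ 100   /* assumed: 100 Hz servo tick, so 100 ticks = 1 second */

struct echeck {
    long vthresh;    /* velocity magnitude treated as non-zero (400 for the rotator) */
    long last_pos;   /* position seen on the previous tick */
    int  stalled;    /* consecutive ticks commanded but not moving */
};

/* Call once per servo tick with the commanded velocity and the current
 * position. Returns 1 when a runaway should be declared. */
int echeck_tick(struct echeck *e, long cmd_vel, long pos)
{
    assert(e != NULL);
    if (labs(cmd_vel) > e->vthresh && pos == e->last_pos)
        e->stalled++;
    else
        e->stalled = 0;     /* any motion, or no command, resets the clock */
    e->last_pos = pos;
    return e->stalled >= SERVO_HZ;
}
```

Setting vthresh to 100000, as is done for Alt and Az, means the velocity test is never satisfied at ordinary commanded velocities and the counter never gets going.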

F check is done by the lm628. The lm628 has two parameters that can be set, LPEI and LPES. We set LPES. The difference is that LPES will "stop" the motor (by setting the DAC to what it thinks is a zero value, i.e. mid range). LPEI only sets a status bit. LPES sets the same bit. So they work exactly the same, but LPES does the extra thing of zeroing the motor command. It is up to us to check the bit and do anything else we want (such as killing power to the amplifier and reporting the runaway).

The lm_lpes() routine in the lm628 driver sets the parameter. It gets set to ap->epos. This variable is set to 2000 in axis.c (an old comment says it is set to 0x3ff, which would be 1023 counts). This value was selected in 2004 and the comment says "aggressive".

According to a comment in tcs_loop.c we see the bit LMS_OFF get set on an EPOS event, not the EPOS or CERR bit. We watch for this and report an F check if we see it set.

Note that both echeck and fcheck can be enabled or disabled via the network protocol. The command would be "fcheckrot on/off". This is done via the bit AF_FCHECK in ap->flags. It would be interesting to inspect this flag to verify it is active.