DEAD_SUN (Nov96)       furnace

NAME

dead_sun --- procedure for switching to the backup workstation

DESCRIPTION

This help file describes the simple procedure for switching to the backup sun workstation (dorado) if the primary workstation (crater) should fail.

WARNING NOTES

Do not panic and switch too hastily. Switching workstations makes a mess of the oven data by breaking image files into several pieces. If you make a mistake, the database files could also be damaged.

Neither Pilot nor Pilot2 should be running oven on the other workstation (crater or dorado) at the same time. This is not impossible, but it is rather dangerous --- you could destroy the other database if you make a mistake.

Make sure the other workstation is functional before you get too involved in this process.

Note that pilot's home directory is physically on crater!/d0. So that if crater dies, pilot cannot login to any machine anywhere. Similarly pilot2's home directory is physically on dorado!/d0, so pilot2 is able to login and run the oven from dorado if crater dies. (Also remember that pilot has a non-home directory on dorado!/d0/ and vice-versa that pilot2 has a non-home directory on crater!/d0/.)

STEP 0: Call for help

If a workstation has died, or even appears to have died, you need lots of help. Start with J. Hill and S. Schaller.

STEP 1: logoff of crater (as pilot)

This is assuming that crater is not already dead. Try to kill the oven daemons if you can; otherwise they may continue to read the error messages.

STEP 2: login to dorado (as pilot2)

Kill whatever is running on dorado; then follow the normal instructions in pilot_login or pilot_short. Be careful with the oven parameters because they might be set to "readonly". This login will possibly involve creating shared memory and possibly killing old daemons. Be careful not to load database from disk until step 4.

STEP 3: save the database backup

The last updated backup database file (with stale clocks) should be located in "dorado!/d0/pilot/". In the blue window on dorado, "copy /d0/pilot/database /home/pilot2/database". Also in the blue window on dorado, "copy database database_save" to make sure you don't lose the old database file when reading database from oven.

STEP 4: read database from oven

This should give you the latest database file with current clocks. Do NOT do this step unless you are sure that the problem was with crater and not with the on-board computers.

FINISHED

You should now be piloting the oven again. If not, call for more help. Some additional supplementary information is given below, but it should not be needed under normal circumstances.

DATABASE AUTOMATIC BACKUP

If the IRAF environment variable "backup" contains a valid path, the oven program should write a backup copy of database in that directory whenever a change to the database is made. For pilot running on crater, the command to set the backup path is "set backup = dorado!/d0/pilot/" (usually found in the loginuser.cl file). This provides a copy of the database on another machine in case of a disk or CPU problem on crater. Copies of the last 9 changes to the database are always maintained in the pilot's working directory. These automatic backups can get overwritten very quickly, so the pilot should always know the location of a "safe copy" of the database. The logbook should also carry sufficient information for the pilot to be able to reconstruct the clocks at any time (but don't change the clock time manually unless you have qualified assistance).

ON-BOARD

Usually we run the oven VxWorks software from EPROM. The EPROM version of the software can reboot without the Sun, and can get oven database information from other on-board computers or from non-volatile RAM. The typical boot setup looks like this:

> bootChange
 
'.' = clear field;  '-' = go to previous field;  ^D = quit
 
boot device          : ln 
processor number     : 0 
host name            : crater 
file name            : /opt/vxworks/oven0v1/vxWorks 
inet on ethernet (e) : 128.196.32.11:ffffff00 t (g)     : 
user (u)             : vwuser 
ftp password (pw) (blank = use rsh): 
flags (f)            : 0x0 
target name (tn)     : oven0v1 
startup script (s)   : /opt/vxworks/oven0v1/startup.cmd 
other (o)            : 
 
value = 0 = 0x0

But, if you are using VxWorks which boots from disk, you will also need to change the boot parameters of the on-board computers before they can successfully reboot. To change the boot parameters, interrupt the auto-boot sequence and type "c" for change. If you are changing the network address of the VME computer, be sure to reboot it again and verify the name before connecting it to the network.

The typical network boot parameters look like:

-> bootChange
 
'.' = clear field;  '-' = go to previous field;  ^D = quit
 
boot device          : ln 
processor number     : 0 
host name            : dorado 
file name            : /opt/vxworks/oven1v0/vxWorks 
inet on ethernet (e) : 128.196.32.13:ffffff00 
inet on backplane (b): 0 
host inet (h)        : 128.196.32.1 
gateway inet (g)     : 
user (u)             : vwuser 
ftp password (pw) (blank = use rsh): 
flags (f)            : 0x0 
target name (tn)     : oven1v0 
startup script (s)   : /opt/vxworks/oven1v0/startup.cmd 
other (o)            : 
 
value = 0 = 0x0

The relevant network computers are: (from /etc/hosts)

128.196.32.1    dorado.as.arizona.edu dorado
128.196.32.2    libra.as.arizona.edu libra
128.196.32.3    crater.as.arizona.edu crater
128.196.32.4    ncdxtb4.as.arizona.edu ncdxtb4
128.196.32.5    ncdxte3.as.arizona.edu ncdxte3
128.196.32.6    vwlog4.as.arizona.edu vwlog4
128.196.32.7    vw5.as.arizona.edu vw5
128.196.32.8    afone.as.arizona.edu afone
128.196.32.9    aftwo.as.arizona.edu aftwo
128.196.32.10   oven0v0.as.arizona.edu oven0v0
128.196.32.11   oven0v1.as.arizona.edu oven0v1
128.196.32.12   oven0v2.as.arizona.edu oven0v2
128.196.32.13   oven1v0.as.arizona.edu oven1v0
128.196.32.14   mmtcell.as.arizona.edu mmtcell
128.196.32.15   xcell.as.arizona.edu xcell
128.196.32.16   lw4.as.arizona.edu lw4
128.196.32.19   mlpc10.as.arizona.edu mlpc10
128.196.32.20   mlpc11.as.arizona.edu mlpc11
128.196.32.21   mlpc12.as.arizona.edu mlpc12
128.196.32.22   mlpc13.as.arizona.edu mlpc13
128.196.32.23   mlpc14.as.arizona.edu mlpc14
128.196.32.24   mlpc15.as.arizona.edu mlpc15

BUGS

RELATED MIRROR HELP TASKS

oven, ovenb, ovenp, ovend, ovene, oveng, ovenw, ovenc, ovenr

RELATED FURNACE HELP TASKS

pilot_login, pilot_short, simul_login

RELATED IRAF HELP TASKS

jobs, kill, spy, lpar, unlearn, flpr, prc, cursors, help

RELATED UNIX MAN PAGES

xterm, ps, grep, ipcs, ipcrm, openwin, kill, w, netstat, X11, pstat, saoimage