DEAD_SUN (Nov96)       furnace


dead_sun --- procedure for switching to the backup workstation


This help file describes the simple procedure for switching to the backup Sun workstation (dorado) if the primary workstation (crater) should fail.


Do not panic and switch too hastily. Switching workstations makes a mess of the oven data by breaking image files into several pieces. If you make a mistake, the database files could also be damaged.

Neither pilot nor pilot2 should be running oven on the other workstation (crater or dorado) at the same time. This is not impossible, but it is rather dangerous: you could destroy the other database if you make a mistake.

Make sure the other workstation is functional before you get too involved in this process.

Note that pilot's home directory is physically on crater!/d0, so if crater dies, pilot cannot log in to any machine anywhere. Similarly, pilot2's home directory is physically on dorado!/d0, so pilot2 is still able to log in and run the oven from dorado if crater dies. (Also remember that pilot has a non-home directory on dorado!/d0/ and, vice versa, pilot2 has a non-home directory on crater!/d0/.)

STEP 0: Call for help

If a workstation has died, or even appears to have died, you need lots of help. Start with J. Hill and S. Schaller.

STEP 1: log off crater (as pilot)

This step assumes that crater is not already dead. Try to kill the oven daemons if you can; otherwise they may continue to read the error messages.
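
To find the daemons before killing them, something like the following sketch may help. It is only an illustration: it assumes the daemon process names begin with "oven" (as the task names in this help file do), and the function name oven_procs is hypothetical. It filters "ps" output read from standard input and kills nothing itself.

```shell
# oven_procs: print "PID NAME" for every process whose command name
# starts with "oven" (an assumed naming convention, not confirmed here).
# Input is the two-column output of ps, including its header line.
oven_procs() {
    awk 'NR > 1 && $2 ~ /^oven/ { print $1, $2 }'
}

# Typical use; inspect the list, then kill by hand:
#   ps -e -o pid,comm | oven_procs
```

The "ps -e -o pid,comm" form is the POSIX spelling; an older SunOS "ps" may need different options, so check the columns before trusting the filter.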

STEP 2: log in to dorado (as pilot2)

Kill whatever is running on dorado; then follow the normal instructions in pilot_login or pilot_short. Be careful with the oven parameters because they might be set to "readonly". This login will possibly involve creating shared memory and possibly killing old daemons. Be careful not to load database from disk until step 4.

STEP 3: save the database backup

The last updated backup database file (with stale clocks) should be located in "dorado!/d0/pilot/". In the blue window on dorado, "copy /d0/pilot/database /home/pilot2/database". Also in the blue window on dorado, "copy database database_save" to make sure you don't lose the old database file when reading database from oven.
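
For reference, the same two copies can be sketched as a Unix shell function, in case the blue window is not yet available. This is an assumption-laden illustration, not part of the normal procedure: the function name save_backup_database is hypothetical, and the directories are passed as arguments rather than hard-coded.

```shell
# save_backup_database BACKUP_DIR WORK_DIR
# Copy BACKUP_DIR/database into WORK_DIR, then make the database_save
# copy, and confirm that both copies match the backup byte for byte.
save_backup_database() {
    src=$1/database
    work=$2
    if [ ! -f "$src" ]; then
        echo "no backup database found at $src" >&2
        return 1
    fi
    cp -p "$src" "$work/database" || return 1
    cp -p "$work/database" "$work/database_save" || return 1
    cmp -s "$src" "$work/database" && cmp -s "$src" "$work/database_save"
}

# Example, using the paths named in this step:
#   save_backup_database /d0/pilot /home/pilot2
```

The function returns non-zero if the backup file is missing or if either copy differs from it, so a failure is noticed before step 4 overwrites anything.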

STEP 4: read database from oven

This should give you the latest database file with current clocks. Do NOT do this step unless you are sure that the problem was with crater and not with the on-board computers.


You should now be piloting the oven again. If not, call for more help. Some supplementary information is given below, but it should not be needed under normal circumstances.


If the IRAF environment variable "backup" contains a valid path, the oven program should write a backup copy of database in that directory whenever a change to the database is made. For pilot running on crater, the command to set the backup path is "set backup = dorado!/d0/pilot/" (usually found in the file). This provides a copy of the database on another machine in case of a disk or CPU problem on crater. Copies of the last 9 changes to the database are always maintained in the pilot's working directory. These automatic backups can get overwritten very quickly, so the pilot should always know the location of a "safe copy" of the database. The logbook should also carry sufficient information for the pilot to be able to reconstruct the clocks at any time (but don't change the clock time manually unless you have qualified assistance).
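
To see why the automatic backups are overwritten so quickly, consider a simple nine-deep rotation. The sketch below is not the oven program's actual code; the file names database.1 through database.9 and the function rotate_backups are hypothetical, chosen only to show the behavior.

```shell
# rotate_backups FILE [N]
# Keep the N most recent copies of FILE as FILE.1 (newest) through
# FILE.N (oldest). Call it once after every change to FILE.
rotate_backups() {
    file=$1
    n=${2:-9}
    [ -f "$file" ] || return 1
    i=$n
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -f "$file.$prev" ]; then
            # Shift each copy one slot older; the old FILE.N is lost.
            mv -f "$file.$prev" "$file.$i"
        fi
        i=$prev
    done
    cp -p "$file" "$file.1"
}
```

With a scheme like this, the tenth change discards the copy made at the first change, which is why a separate "safe copy" is still needed.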


Usually we run the oven VxWorks software from EPROM. The EPROM version of the software can reboot without the Sun, and can get oven database information from other on-board computers or from non-volatile RAM. The typical boot setup looks like this:

-> bootChange
'.' = clear field;  '-' = go to previous field;  ^D = quit
boot device          : ln 
processor number     : 0 
host name            : crater 
file name            : /opt/vxworks/oven0v1/vxWorks 
inet on ethernet (e) : 
inet on backplane (b): 
host inet (h)        : 
gateway inet (g)     : 
user (u)             : vwuser 
ftp password (pw) (blank = use rsh): 
flags (f)            : 0x0 
target name (tn)     : oven0v1 
startup script (s)   : /opt/vxworks/oven0v1/startup.cmd 
other (o)            : 
value = 0 = 0x0

However, if you are using a version of VxWorks that boots from disk, you will also need to change the boot parameters of the on-board computers before they can successfully reboot. To change the boot parameters, interrupt the auto-boot sequence and type "c" for change. If you are changing the network address of the VME computer, be sure to reboot it again and verify the name before connecting it to the network.

The typical network boot parameters look like:

-> bootChange
'.' = clear field;  '-' = go to previous field;  ^D = quit
boot device          : ln 
processor number     : 0 
host name            : dorado 
file name            : /opt/vxworks/oven1v0/vxWorks 
inet on ethernet (e) : 
inet on backplane (b): 0 
host inet (h)        : 
gateway inet (g)     : 
user (u)             : vwuser 
ftp password (pw) (blank = use rsh): 
flags (f)            : 0x0 
target name (tn)     : oven1v0 
startup script (s)   : /opt/vxworks/oven1v0/startup.cmd 
other (o)            : 
value = 0 = 0x0

The relevant network computers (from /etc/hosts) are:

    dorado libra crater ncdxtb4 ncdxte3 vwlog4 vw5 afone aftwo
    oven0v0 oven0v1 oven0v2 oven1v0 mmtcell xcell lw4
    mlpc10 mlpc11 mlpc12 mlpc13 mlpc14 mlpc15
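
When filling in the "host inet (h)" field for a new boot host, the address can be looked up in /etc/hosts. The following is a minimal sketch; the function name host_addr is hypothetical, and it assumes the usual hosts-file layout of an address followed by one or more names.

```shell
# host_addr NAME [HOSTS_FILE]
# Print the address listed for NAME in a hosts-format file
# (default /etc/hosts). Returns non-zero if NAME is not listed.
host_addr() {
    awk -v name="$1" '
        { sub(/#.*/, "") }      # ignore comments
        { for (i = 2; i <= NF; i++)
              if ($i == name) { print $1; found = 1; exit } }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}

# Example:
#   host_addr dorado
```

Only the first matching entry is printed, which matches the usual first-match behavior of hosts-file lookups.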



oven, ovenb, ovenp, ovend, ovene, oveng, ovenw, ovenc, ovenr


pilot_login, pilot_short, simul_login


jobs, kill, spy, lpar, unlearn, flpr, prc, cursors, help


xterm, ps, grep, ipcs, ipcrm, openwin, kill, w, netstat, X11, pstat, saoimage