Setting up a server to host VxWorks

Introduction

March 4, 2013

The best thing about VxWorks is that it does NOT try to be a self-sufficient and self-contained operating system. The typical VxWorks system is diskless and boots over the network from some server. This short note is a checklist of items that need to be set up for this scheme to work.

We have been using Linux machines as servers to boot VxWorks systems for many years with complete success. These notes have recently been updated for Fedora 18. We still run VxWorks 5.3.1, but these instructions may have relevance for more modern VxWorks installations.

dhcp

If you are running an Intel-based VxWorks client, the first thing you will probably want to do is to get dhcp running on the server. For Motorola clients, you can skip this section (and the one on tftp).

Our preference for a network card to use with Intel clients is the Intel eepro100 card (we favor it because it is the only 100 Mbit card supported under our old VxWorks release for the pc486 target). Many eepro100 cards have a pxeboot prom on the card, and with some fiddling you can arrange for this to boot VxWorks. This is nice, because it avoids any need to burn new boot proms.

To install the dhcpd software and get it running, do this:

yum install dhcp
systemctl enable dhcpd.service
systemctl restart dhcpd.service

You need to open up UDP port 68 on your firewall. I find that this is the only port I need to open up, although if you do some reading you will find that clients send to the server on port 67 and the server replies to clients on port 68. Also note that these port assignments pertain to BOOTP (which we are not using) as well as DHCP; perhaps this explains the situation.

You won't have much luck with this unless you set up the dhcpd.conf file first. This used to be /etc/dhcpd.conf, but it is now located at /etc/dhcp/dhcpd.conf (just to make life interesting I guess).

The first thing to do is to get the MAC address for the eepro100 and make an entry in dhcpd.conf on the server.
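
A minimal sketch of what such an entry could look like follows. It is not the configuration actually used here: the subnet, addresses, host name and MAC address are all placeholders, and the test on the vendor class identifier is just one common way to hand eepro100.lzpxe to the PXE prom and the vxboot file to etherboot on its second request. The function only prints text; nothing is written to /etc.

```shell
#!/bin/sh
# Print a small dhcpd.conf for one PXE-booting client.
# Every value passed in below is a made-up placeholder.
make_dhcpd_conf() {
    server_ip=$1     # boot server (also the tftp server)
    client_name=$2   # label for the host entry
    client_mac=$3    # MAC address of the eepro100 card
    client_ip=$4     # address handed to the client
    boot_file=$5     # VxWorks boot image in the tftp directory
    cat <<EOF
# Placeholder subnet; dhcpd wants a subnet declaration for the network it serves.
subnet 192.168.1.0 netmask 255.255.255.0 {
}

host $client_name {
    hardware ethernet $client_mac;
    fixed-address $client_ip;
    next-server $server_ip;
    # The PXE prom identifies itself as "PXEClient"; anything else
    # (assumed here to be etherboot) gets the VxWorks boot image.
    if substring (option vendor-class-identifier, 0, 9) = "PXEClient" {
        filename "eepro100.lzpxe";
    } else {
        filename "$boot_file";
    }
}
EOF
}

make_dhcpd_conf 192.168.1.1 vxclient 00:a0:c9:00:00:01 192.168.1.50 vxboot.system_name
```

If you redirect the output to a scratch file, "dhcpd -t -cf scratchfile" will check the syntax without starting the server.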

The dhcpd software is quite unhelpful when things are wrong with the configuration file. You can at least run it in the foreground using the following command and hope to get console error messages:

dhcpd -d

tftp

You can skip this section for Motorola clients.

yum install tftp-server
systemctl enable xinetd.service
systemctl restart xinetd.service

Open up UDP port 69 in your firewall.

Installing the tftp-server package will also pull in the xinetd package. Edit /etc/xinetd.d/tftp, changing the disable line from yes to no. I also change the directory to the traditional /tftpboot rather than /var/lib/tftpboot. After editing the file, restart xinetd so the change takes effect:

systemctl enable xinetd.service
systemctl restart xinetd.service
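
The edit to the xinetd file can also be done with sed. The sketch below assumes the stock file has a "disable = yes" line and a "server_args = -s /var/lib/tftpboot" line; the sample file it builds is my reconstruction of a typical one, not a copy of the real thing. The helper (enable_tftp is a made-up name) prints the edited text and never changes a file in place.

```shell
#!/bin/sh
# Print an edited copy of an xinetd service file:
#   $1 = file to read, $2 = directory to put after the -s option.
enable_tftp() {
    sed -e 's/^\([[:space:]]*disable[[:space:]]*=[[:space:]]*\)yes/\1no/' \
        -e "s|^\([[:space:]]*server_args[[:space:]]*=.*-s[[:space:]]*\)[^[:space:]]*|\1$2|" \
        "$1"
}

# Try it on a scratch file that imitates /etc/xinetd.d/tftp.
sample=$(mktemp)
cat > "$sample" <<'EOF'
service tftp
{
        socket_type     = dgram
        protocol        = udp
        wait            = yes
        user            = root
        server          = /usr/sbin/in.tftpd
        server_args     = -s /var/lib/tftpboot
        disable         = yes
}
EOF
enable_tftp "$sample" /tftpboot
rm -f "$sample"
```

To use it for real, run it against /etc/xinetd.d/tftp, send the output to a scratch file, look it over, and then copy it into place as root.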

What I do is to have pxeboot bounce the boot through etherboot, i.e. I let the eepro100 pxe agent pick up an etherboot image (namely eepro100.lzpxe) which I place in /tftpboot. As an aside, I have actually burned etherboot into other network cards (like the 3com etherlink 3) and then used etherboot to boot VxWorks.

I place the vxworks boot image (the thing that normally goes into boot proms for a VME system) into a file I call vxboot.system_name. This file needs to be recompiled with the proper bootline for each system, so you may end up with several vxboot.* files if your server boots several systems. These go into the /tftpboot directory also.

To rebuild the vxboot image, I go to the directory /home/vxworks/VW531/target/config/pc486 and edit the file config.h, looking for the definition of DEFAULT_BOOT_LINE. I make the relevant changes there, then issue the following command in the same directory:

vxmake vxboot.nbi

This builds the bootrom_uncmp image and combines it with the nbstart image using the mknbi_vx utility, which I threw together to put all this into a format that etherboot recognizes.
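
For reference, a VxWorks boot line generally has the form device(unit,proc)host:file followed by h= (host address), e= (target address), u= (user) and tn= (target name) fields. The sketch below just prints a #define in that form so it can be compared with what is in config.h. Every value shown is a placeholder: DEV stands in for whatever device name your BSP uses for the network card, and the kernel path is assumed, not taken from a real setup.

```shell
#!/bin/sh
# Print a DEFAULT_BOOT_LINE definition.
#   $1 device  $2 host name  $3 kernel path on the host
#   $4 host IP $5 target IP  $6 user  $7 target name
make_boot_line() {
    printf '#define DEFAULT_BOOT_LINE "%s(0,0)%s:%s h=%s e=%s u=%s tn=%s"\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

# All placeholder values; substitute your own.
make_boot_line DEV server /home/vwuser/vxWorks 192.168.1.1 192.168.1.50 vwuser vxclient
```

The u= field is the user name the target presents to rsh on the server, so it should match the special VxWorks user.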

If anyone out in cyberspace ends up reading this and gets hot to do this, they will have to get in touch with me to get these things.

Once this is all set up, the /tftpboot directory will contain two files:

eepro100.lzpxe
vxboot.system_name
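
A quick way to confirm that both files are in place is sketched below. The helper name check_tftpboot is made up for this note; it only checks that each file exists and is not empty.

```shell
#!/bin/sh
# Report whether the two boot files are present in the tftp directory.
#   $1 = tftp directory, $2 = system name used in the vxboot file name.
check_tftpboot() {
    dir=$1
    system=$2
    status=0
    for f in eepro100.lzpxe "vxboot.$system"; do
        if [ -s "$dir/$f" ]; then
            echo "ok       $dir/$f"
        else
            echo "missing  $dir/$f"
            status=1
        fi
    done
    return $status
}

check_tftpboot /tftpboot system_name || echo "fix the items marked missing before booting the client"
```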

The VxWorks user

The VxWorks boot image wants to use the "rsh" protocol to obtain the kernel. Running the rsh protocol can be a security issue, and we limit our exposure by setting up a special user for the sole purpose of providing the files needed to boot VxWorks. We use the fictitious user "vwuser", and give him virtually no privileges whatsoever. He just owns the small set of files needed to boot the VxWorks image. It is not a good idea to use an existing real user as the VxWorks user. You can ponder the various security issues.

This user must have appropriate entries in the /etc/passwd and /etc/shadow files.

The .rhosts file

Because we use the rsh/rlogin protocols to boot VxWorks, there needs to be a .rhosts file in /home/vwuser with a very specific set of permissions and ownership. In particular this file must be owned by the user "vwuser", and not be writeable by any other user.
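
As a sketch, a .rhosts file holds one "host user" pair per line, naming the client machine and the user it connects as. The helper below (write_rhosts is a made-up name, and "vxclient" is a placeholder host name) writes such a file with mode 600. It is demonstrated on a scratch directory; for the real thing you would point it at /home/vwuser and then, as root, chown the file to vwuser.

```shell
#!/bin/sh
# Write a one-line .rhosts file readable and writable by its owner only.
#   $1 = home directory, $2 = client host name, $3 = user name the client sends.
write_rhosts() {
    home_dir=$1
    client=$2
    user=$3
    (
        # Create the file with a tight umask so it is never group or world accessible.
        umask 077
        printf '%s %s\n' "$client" "$user" > "$home_dir/.rhosts"
    )
    chmod 600 "$home_dir/.rhosts"
}

# Demonstration in a scratch directory.
scratch=$(mktemp -d)
write_rhosts "$scratch" vxclient vwuser
ls -l "$scratch/.rhosts"
cat "$scratch/.rhosts"
rm -rf "$scratch"
```

The client host name normally has to be one the server can resolve (an /etc/hosts entry will do), since rshd looks up the name of the connecting address.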

rsh

To install the rsh server, do this:

yum install rsh-server

This also requires xinetd (and will pull it in if needed), but you may have already installed it along with the tftp server.

You enable the rsh service by editing the file /etc/xinetd.d/rsh. Find the line disable=yes and change it to disable=no. After this:

systemctl restart xinetd.service

You will also need to open up TCP port 514, as well as the entire range of TCP ports 1011-1023 to use rsh.

At one time I thought that you also needed to enable the rlogin and rexec services to boot a VxWorks system, but this is not the case.

Weird linux rsh bug

This did not give me any troubles when setting up a server using Fedora 18, but here it is for the record.

I have pulled my hair out over what I can only call an intermittent bug in the Linux rsh software. The overall symptom is that rsh to the affected machine does not work. This can be tested from some other machine, or by just attempting to rsh to localhost. A peek at /var/log/messages will show something like this:

xinetd[30223]: START: shell pid=777 from=::ffff:128.199.101.197
Jul 10 21:21:12 server rshd[777]: Could not allocate space for cmdbuf
Jul 10 21:21:12 hacksaw xinetd[30223]: EXIT: shell status=1 pid=777 duration=0(sec)

The key string in all of this is "Could not allocate space for cmdbuf". This message actually comes from rshd itself, not from xinetd. At one point I obtained the source code for rsh and was looking into this, but managed to perform a "rain dance" that solved the problem. ("Rain dance" is a folklore term for trying random things that shouldn't really solve the problem and finding one that works.)

Years later this same issue cropped up (just after we disabled the rlogin and rexec services, which is sort of a red herring, but did seem to trigger the bug).

The solution was to reboot the server; restarting xinetd did not seem to fix the problem. What truly does not make sense about this is that restarting xinetd did not start us from a clean slate. Somehow and somewhere, "state" is being held that affects the rsh process. A dark mystery.

Feel free to look into this more deeply and let me know what you find.

Firewalls and ports

As mentioned, each service above has one or more network ports that need to be unblocked if you are using a firewall (as you should be). I do this by hand editing the /etc/sysconfig/iptables file, but you may prefer to use some GUI firewall manager.

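The ports used by the rsh family of services (as assigned in /etc/services) are:

rexec (exec)    TCP port 512
rlogin (login)  TCP port 513
rsh (shell)     TCP port 514

We have found that we also need to open up a range of ports (we open up 1011-1023) for rsh to connect back to. Typically it uses just port 1022.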
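
Pulling together the ports mentioned in these notes, the lines added to /etc/sysconfig/iptables might look something like the output of the sketch below. This is an illustration rather than a copy of a working rule set: it assumes the usual layout with an INPUT chain, and the lines must go ahead of any final REJECT rule in that chain. The function only prints the rules.

```shell
#!/bin/sh
# Print iptables-save style rules for the services described in these notes.
print_vxworks_rules() {
    cat <<'EOF'
-A INPUT -p udp -m udp --dport 68 -j ACCEPT
-A INPUT -p udp -m udp --dport 69 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 514 -j ACCEPT
-A INPUT -p tcp -m state --state NEW -m tcp --dport 1011:1023 -j ACCEPT
EOF
}

print_vxworks_rules
```

The first two lines cover dhcp and tftp and can be left out for Motorola clients. Reload the firewall after editing the file.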

Troubleshooting

Take a look at /var/log/messages and perhaps also /var/log/secure.

Have any comments? Questions? Drop me a line!
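
To cut down on the noise, something like the following sketch pulls out just the lines from the daemons involved in booting. It assumes the daemons log under the names dhcpd, in.tftpd, rshd and xinetd; scan_boot_logs is a made-up helper name. Log files that are missing or unreadable are skipped, so you will probably want to run it as root.

```shell
#!/bin/sh
# Show the most recent log lines from the boot-related daemons.
# Arguments are the log files to search.
scan_boot_logs() {
    for log in "$@"; do
        [ -r "$log" ] || continue
        grep -E 'dhcpd|in\.tftpd|rshd|xinetd' "$log" | tail -n 20 || true
    done
}

scan_boot_logs /var/log/messages /var/log/secure
```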

Adventures in Computing / tom@mmto.org