running on the server accepts mouse tracking and button data from the client and returns to the client a few
frames per second of software compressed video. There are two client processes. One process decom-
presses and displays the video, the other tracks a mouse in the video window and forwards that information
to the server. The video is sent using UDP packets; the mouse tracking information is sent using TCP
packets.
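This split matches the reliability needs of the two streams: lost video frames are simply superseded by later ones, while a lost mouse event could leave the car driving uncontrolled. The sketch below illustrates the idea; the port numbers and the 5-byte event layout are assumptions for illustration, not the actual protocol.

```python
import struct

# Hypothetical ports; the paper does not give the actual numbers.
VIDEO_PORT = 5000   # UDP: compressed video frames, loss-tolerant
MOUSE_PORT = 5001   # TCP: mouse tracking and button data, must arrive reliably

def encode_mouse_event(x, y, buttons):
    """Pack a mouse event as the client process might send it over TCP.
    Network byte order: two signed 16-bit coordinates, one button byte.
    This 5-byte layout is illustrative, not the paper's actual format."""
    return struct.pack("!hhB", x, y, buttons)

def decode_mouse_event(data):
    """Inverse of encode_mouse_event, as applied on the server side."""
    x, y, buttons = struct.unpack("!hhB", data)
    return x, y, buttons
```

A fixed-size binary encoding like this keeps the TCP stream trivially framed: the server reads exactly five bytes per event.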
The software used to send the video was derived from the network video program nv
[2, 3], which is widely used for videoconferences on the Internet, particularly on the multicast back-
bone (MBONE) [MBONE]. A Tk/Tcl [4] interface was added to standard nv to allow only an authorized
user to control the car. The authentication for this user is via a challenge-response password. If the car’s
control computer is on a multicast backbone, others may watch the video while the authorized user drives
the car. This feature comes free by using nv as the basis of the video transmission software.
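The advantage of a challenge-response scheme is that the password itself never crosses the network. A minimal sketch of the idea follows; the hash function and message format are assumptions, as the paper does not specify how the modified nv implements the exchange.

```python
import hashlib
import os

def make_challenge():
    """Server generates a fresh random challenge for each login attempt."""
    return os.urandom(16)

def make_response(challenge, password):
    """Client proves knowledge of the password by hashing it together
    with the challenge. SHA-256 is an illustrative choice, not nv's."""
    return hashlib.sha256(challenge + password.encode()).hexdigest()

def verify(challenge, password, reply):
    """Server recomputes the expected reply and compares. Only the
    hashed reply, never the password, is sent over the network."""
    return reply == make_response(challenge, password)
```

Because each challenge is random and used once, a recorded reply cannot be replayed against a later login attempt.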
Several modifications were made to the nv program in order to improve its video compression rate,
while still retaining interoperability with standard nv programs. First, the change from YUV format video
to the internal representation used by nv is now done only for blocks that need to be updated, rather than for
all blocks as in the original program. Second, the central loop used for the conversion has been unrolled.
Third, this loop was optimized for our RISC workstation by using word rather than byte operations wherever possible. With these speedups, given unlimited network bandwidth, the compression speed for scenes with comparable motion increases from about 1.5 frames/sec to 15 frames/sec.
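The first optimization, converting only the blocks that changed, can be sketched as below. The 8x8 block size and the flat row-major frame layout are assumptions for illustration; nv's actual block structure and internal representation differ in detail.

```python
BLOCK = 8  # assumed block size; nv's actual blocking may differ

def changed_blocks(prev, cur, width, height):
    """Yield (bx, by) for each block whose pixels differ from the
    previous frame. Only these blocks need YUV-to-internal conversion,
    so a mostly static scene costs far less than converting every block."""
    for by in range(0, height, BLOCK):
        for bx in range(0, width, BLOCK):
            for y in range(by, min(by + BLOCK, height)):
                row = y * width
                if prev[row + bx : row + bx + BLOCK] != cur[row + bx : row + bx + BLOCK]:
                    yield (bx, by)
                    break  # one differing row is enough to mark the block
```

The second and third optimizations (unrolling this comparison/conversion loop and operating on machine words instead of single bytes) are lower-level and best done in C, where four or eight pixels can be compared per load.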
5. Experience Driving The Car
The difference between driving the car over the network and driving the car directly from the control
computer is that the video changes from full NTSC video to a few frames per second of compressed video.
Surprisingly, at least for driving the car around the narrow corridors of our facility, this is of little practical
consequence. The reason is that the car has a high power to weight ratio and so one needs to pulse the
throttle by pulsing the mouse button. Thus, the best way to drive the car is to point the wheels, pulse the
throttle for a fraction of a second and then see what progress has been made. For this operation one really
only uses a few frames per second of information: all of the intermediate frames are a waste. This is quite
unlike video conferencing over the Internet where the slow frame rate is very disconcerting and, in fact, the
video is sometimes more of an annoyance than a help.
We have found that pointing the mouse in the video window is an effective way to drive the car. We
have calibrated the steering so that to drive to an object, the driver merely has to point to the object and
click on it. This has made it possible for users to gain a feel for driving the car within minutes: on several
occasions, we have invited people walking down the hall to play with the car, and in every case, they were
able to do so. Using a mouse to drive does not seem to be much of a problem. We can, alternatively, point
by looking at the window with a "head mouse" that we have developed: a Polhemus sensor mounted on a
bicycle helmet lets the computer move the mouse to wherever the user's head points.
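The click-to-drive calibration amounts to a mapping from window coordinates to a steering command. A minimal sketch of one such mapping follows; the window width, maximum wheel deflection, and linear form are all assumptions, since the paper does not describe the actual calibration function.

```python
WINDOW_WIDTH = 320     # assumed video window width in pixels
MAX_STEER_DEG = 30.0   # assumed maximum wheel deflection in degrees

def click_to_steering(x):
    """Map a click's x coordinate to a steering angle in degrees.
    Center of the window steers straight ahead; the edges give full
    lock. Clamping keeps out-of-window coordinates legal."""
    offset = (x - WINDOW_WIDTH / 2) / (WINDOW_WIDTH / 2)  # -1 .. 1
    return max(-1.0, min(1.0, offset)) * MAX_STEER_DEG
```

In practice the calibration would be fit from measurements of where the car actually goes for a given click, but a linear map suffices to convey the point-and-click scheme.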
While it is in some ways more natural to drive the car with a driver’s eye view of the world, it is actu-
ally more difficult to drive it this way than it is to drive the car in the conventional manner of looking down
on the car and its environment while controlling it with a hand held radio with joystick controls. This is
true even when we operate the car locally and have full motion video available. There are, we believe,
three reasons for this, though the last one is dominant.
First, the radio control’s joystick is more natural to use than a mouse for car control applications,
because it provides proportional tactile (force) feedback that a mouse does not.
Second, the view when driving the car is monocular rather than stereoscopic. For a moving vehicle in an
ordinarily scaled world this makes little difference: we get depth cues from the moving viewpoint and
from the scale of objects in the scene. Stereoscopic vision is easy to achieve via an optical lens attachment
and an optical viewer in front of the terminal.
Third, the camera is in a fixed position relative to the car. (We are in the process of rectifying this.
The camera is small enough to be turned by mounting it directly on standard model servos. If one con-
nects two servos at right angles to each other and mounts the camera on them, panning and tilting through
about 90 degrees is easily achieved. One does need a four-channel radio for this, and the control board
and software must be modified to handle the four-channel setup.) Because one cannot look around at will