Plash supports open() on directories. It supports the use of
fchdir() and close() on the resulting directory file descriptor.
However, it doesn't support dup() on directory FDs, and execve()
won't preserve them.
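The supported subset is enough for the common "remember where I was and come back" idiom. The following is a minimal sketch of that idiom using only standard POSIX calls; the helper names save_cwd, restore_cwd and visit_root_and_return are hypothetical and are not part of Plash.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Open the current directory and return a directory FD for it.
 * Under Plash this is the open()-on-a-directory case described above. */
int save_cwd(void)
{
    return open(".", O_RDONLY);
}

/* Return to the directory referred to by dir_fd, then release the FD.
 * Only fchdir() and close() are used on the directory FD. */
int restore_cwd(int dir_fd)
{
    int rc = fchdir(dir_fd);
    close(dir_fd);
    return rc;
}

/* Example: change to "/" and come back.
 * Returns 0 if the working directory was restored, -1 otherwise. */
int visit_root_and_return(void)
{
    char before[4096], after[4096];
    int fd;

    if (getcwd(before, sizeof before) == NULL)
        return -1;
    fd = save_cwd();
    if (fd < 0)
        return -1;
    if (chdir("/") < 0) {
        close(fd);
        return -1;
    }
    if (restore_cwd(fd) < 0)
        return -1;
    if (getcwd(after, sizeof after) == NULL)
        return -1;
    return strcmp(before, after) == 0 ? 0 : -1;
}
```

Code that instead duplicates the directory FD with dup(), or expects it to survive execve(), falls outside what is described as supported.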
Directory file descriptors require special handling. Under Plash,
when open() is called on a file, it returns a real, kernel-level
file descriptor for that file. The file server passes the client
this file descriptor across a socket. But it's not safe to do this
with kernel-level directory file descriptors, because if the client
obtained one of these it could use it to break out of its chroot
jail (using the kernel-level fchdir() system call).
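The text does not show Plash's own protocol code. As a general illustration of how one process can hand a kernel file descriptor to another over a Unix-domain socket, here is a sketch of the standard SCM_RIGHTS idiom. The function names send_fd and recv_fd are hypothetical, and the one-byte payload is an arbitrary choice for this example.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send the descriptor `fd` over the Unix-domain socket `sock`.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char byte = 0;                        /* one byte of ordinary data */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;             /* forces suitable alignment */
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof msg);
    memset(&ctrl, 0, sizeof ctrl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    /* The descriptor travels as ancillary data of type SCM_RIGHTS. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`.
 * Returns the new FD number in this process, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a descriptor that refers to the same open file as the sender's, with the full kernel-level capabilities of that file. This is why handing over a real directory descriptor would be unsafe here.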
A complete solution would be to virtualise file descriptors fully, so
that every libc call involving file descriptors is intercepted and
replaced. This would be a lot of work, because there are quite a few
FD-related calls. It raises some tricky questions, such as which parts
of the code use real kernel FDs and which use virtualised FDs. It might
impact performance. And it's potentially dangerous: if the changes to
libc failed to replace one FD-related call, it could lead to the wrong
file descriptors being used in some operation, because in this case a
virtual FD number would be treated as a real, kernel FD number.
(There is no similar danger with virtualising the system calls that
use the file namespace, because the use of chroot() means that the
process's kernel file namespace is almost entirely empty.)
However, a complete solution is complete overkill. There are probably
no programs that pass a directory file descriptor to select(), and no
programs that expect to keep a directory file descriptor across a call
to execve() or in the child process after fork().
So I have adopted a partial solution to virtualising file descriptors.
When open() needs to return a virtualised file descriptor -- in this
case, for a directory -- the server returns two parts to the client:
the real, kernel-level file descriptor that it gets from opening
/dev/null (a "dummy" file descriptor), and a reference to a dir_stack
object (representing the directory).
Plash's libc open() function returns the kernel-level /dev/null file
descriptor to the client program, but it stores the dir_stack object
in a table maintained by libc. Plash's fchdir() function in libc
consults this table; it can only work if there is an entry for the
given file descriptor number in the table.
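The table-based scheme described above can be sketched as follows. This is an illustration under several assumptions, not Plash's actual code: struct dir_stack is a stand-in for the real type, MAX_FDS and the virt_* names are hypothetical, the ENOTDIR error code is a guess, and the dummy FD is opened locally here, whereas in the text the server opens /dev/null and sends the FD to the client. The "change directory" step is reduced to recording a pointer in current_dir.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FDS 1024                  /* assumed table size */

/* Stand-in for Plash's dir_stack; the real type is not shown in the text. */
struct dir_stack {
    const char *path;
};

/* Table indexed by FD number; NULL means "not a virtual directory FD". */
static struct dir_stack *dir_table[MAX_FDS];

/* Stand-in for the process's virtual working directory. */
struct dir_stack *current_dir;

/* open() on a directory: allocate a dummy kernel FD so the number stays
 * reserved, and remember which directory that number stands for. */
int virt_open_dir(struct dir_stack *dir)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
        return -1;
    if (fd >= MAX_FDS) {
        close(fd);
        errno = EMFILE;
        return -1;
    }
    dir_table[fd] = dir;
    return fd;
}

/* fchdir() replacement: succeeds only if fd has an entry in the table. */
int virt_fchdir(int fd)
{
    if (fd < 0 || fd >= MAX_FDS || dir_table[fd] == NULL) {
        errno = ENOTDIR;              /* assumed error code */
        return -1;
    }
    current_dir = dir_table[fd];
    return 0;
}

/* close() replacement: forget the entry, then release the dummy FD. */
int virt_close(int fd)
{
    if (fd >= 0 && fd < MAX_FDS)
        dir_table[fd] = NULL;
    return close(fd);
}
```

Because the table is keyed by FD number, clearing the entry in virt_close() matters: the kernel may hand the same number out again for an ordinary file, and a stale entry would make fchdir() succeed on it.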
Creating a "dummy" kernel-level file descriptor ensures that the file
descriptor number stays allocated from the kernel's point of view. It
provides an FD that can be used in any context where an FD can be used,
without -- as far as I know -- any harmful effects. The client
program will get a more appropriate error than EBADF if it passes the
file descriptor to functions which aren't useful for directory file
descriptors, such as select() or write().
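The reservation property can be checked with a small experiment on an ordinary Linux system, outside Plash. The function name dummy_fd_reserves_number is hypothetical. The sketch only checks that the number is not reused and that the kernel accepts the FD; it does not check which error codes particular calls return.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 if a dummy /dev/null FD keeps its number reserved while it
 * is open and is accepted by the kernel as a valid FD, -1 otherwise. */
int dummy_fd_reserves_number(void)
{
    struct stat st;
    int dummy, other, ok;

    dummy = open("/dev/null", O_RDONLY);
    if (dummy < 0)
        return -1;

    /* A later open() must choose a different number. */
    other = open("/dev/null", O_RDONLY);
    if (other < 0) {
        close(dummy);
        return -1;
    }

    ok = other != dummy
      && fstat(dummy, &st) == 0       /* valid from the kernel's view */
      && S_ISCHR(st.st_mode);         /* /dev/null is a character device */

    close(other);
    close(dummy);
    return ok ? 0 : -1;
}
```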