I ask this in interviews sometimes. Not as a gotcha — as a conversation starter. “Implement ls.” Four words. The answer tells you a lot about where someone’s mental model of the OS ends.

Most candidates say opendir and readdir. That is the correct answer for production code. If I am writing an application I will use readdir too, because I am not trying to be clever, I am trying to list a directory. But I usually follow up: what does readdir actually do? That is where things get interesting.

What ls Actually Does

At the syscall level, listing a directory means calling getdents(2). opendir and readdir are libc wrappers — readdir calls getdents, interprets the kernel buffer, and hands you one struct dirent at a time. The kernel populates that buffer by reading the on-disk directory structure through the filesystem driver.

The part worth knowing: getdents reads from the on-disk directory entry directly. It does not go through VFS name lookup. lstat(2), open(2), stat(2) — those all go through namei, the VFS path resolver. getdents does not. That distinction is irrelevant in normal operation and very relevant when a filesystem is misbehaving. More on that in a moment.

The On-Disk Entry

Each entry in the getdents buffer is a struct direct:

struct direct {
    uint32_t d_fileno;                   /* inode number */
    uint16_t d_reclen;                   /* length of this record */
    uint8_t  d_type;                     /* file type */
    uint8_t  d_namlen;                   /* length of d_name */
    char     d_name[FFS_MAXNAMLEN + 1];  /* name, null-terminated */
};

d_reclen is the length of the entire record including padding. Entries are variable-length — you walk the buffer by advancing i += dir->d_reclen, not by sizeof. This trips people up. Iterating by size will work on a clean filesystem and silently corrupt your walk on one with unusual alignment. Use d_reclen.

d_type maps to the usual file type constants: DT_REG (8), DT_DIR (4), DT_LNK (10), DT_CHR (2), DT_BLK (6). DT_UNKNOWN (0) means the type was not determined — some filesystems cannot know the type without additional I/O (a stat(2) call), so returning DT_UNKNOWN is architecturally valid, not a bug. Never assume d_type is populated.

d_fileno is the inode number. Inode 2 is always the filesystem root — UFS_ROOTINO is hardcoded to 2 in FFS. If you see a directory entry pointing at inode 2, you are looking at the root of a mounted filesystem.

The Implementation

#include <fcntl.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <dirent.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/dirent.h>

#define FFS_MAXNAMLEN   255
#define BUF_SIZE        4096

struct direct {
    uint32_t d_fileno;
    uint16_t d_reclen;
    uint8_t  d_type;
    uint8_t  d_namlen;
    char     d_name[FFS_MAXNAMLEN + 1];
};

int main(int argc, char *argv[])
{
    char buf[BUF_SIZE];
    struct dirent *dir;
    int fd, size = 0;

    if (argc < 2) {
        printf("usage: getdents <directory>\n");
        return -1;
    }

    fd = open(argv[1], O_RDONLY | O_DIRECTORY);
    if (fd == -1) {
        printf("open failed\n");
        return -1;
    }

    if (lseek(fd, 0, SEEK_SET) < 0) {
        printf("lseek failed\n");
        return -1;
    }

    while (1) {
        int ssize = getdents(fd, buf, BUF_SIZE);
        if (ssize <= 0) break;
        size += ssize;
    }

    printf("|inode_nr|rec_len|file_type|name_len(name)|\n");
    for (int i = 0; i < size; ) {
        dir = (struct dirent *)(buf + i);
        printf("#: %6lu, %5u, %5u, %u (%.*s)\n",
               (unsigned long)dir->d_fileno,
               dir->d_reclen,
               dir->d_type,
               dir->d_namlen,
               dir->d_namlen, dir->d_name);
        i += dir->d_reclen;
    }

    return 0;
}

A few notes. O_DIRECTORY makes open fail if the path is not a directory — worth having. getdents is called in a loop because a 4 KB buffer may not hold the whole directory; for most directories this runs once, but correctness requires the loop. The buffer is only 4 KB here, which means this truncates large directories silently. Acceptable for a debugging tool; not for ls -la /usr/lib.

Running it against a root partition:

|inode_nr|rec_len|file_type|name_len(name)|
#:      2,    16,     4,  1 (.)
#:      2,    16,     4,  2 (..)
#:      5,    24,     8,  6 (.cshrc)
#:      6,    24,     8,  8 (.profile)
#: 3872128,   24,     4,  3 (mnt)
#: 5315584,   24,     4,  4 (mnt1)

Why This Matters Beyond the Interview

I keep this tool around for a specific reason: when you are working with a corrupted filesystem, getdents and lstat can disagree. lstat goes through VFS name resolution — if something in that path is broken, it returns ENOENT. getdents reads the raw directory entry and does not care about any of that.

# ls -alh /mnt1
ls: /mnt1: No such file or directory

# stat /mnt1
stat: /mnt1: lstat: No such file or directory

# ./getdents /
...
#: 5315584,   24,     4,  4 (mnt1)

mnt1 is right there on disk. VFS just cannot see it because the root inode of the mounted filesystem was zeroed out by a fuzzer. That disagreement was the starting point for a longer debugging session on FFS mount failures.

Back to the interview: candidates who get here — who know that readdir calls getdents, know what struct dirent looks like, and can reason about when the VFS abstraction breaks down — are the ones I want to keep talking to. Candidates who implement ls by calling system("ls") also teach me something useful, just not about filesystems.