Understanding the Linux "fortune" Utility and How to Read Its .dat Files in Python

Mar 13, 2025

$ fortune
Those who have had no share in the good fortunes of the mighty
Often have a share in their misfortunes.
		-- Bertolt Brecht, "The Caucasian Chalk Circle"

The Linux fortune utility is a classic command-line tool that displays random humorous, insightful, or philosophical messages when executed. It has been a staple of Unix and Linux systems for decades, offering users a touch of humor, wisdom, or inspiration with each command.

In this article, we'll explore how the fortune utility works, how its data is stored, and how you can use Python to read and extract random fortunes from the .dat files that fortune relies on.

What is the `fortune` Command?

Source : Distrotech/fortune-mod | manual: fortune

The fortune command prints a random quotation or aphorism from a predefined database. It is often run for amusement at the start of a terminal session, for example from a login shell's startup file. Run it with no arguments:

$ fortune
Those who have had no share in the good fortunes of the mighty
Often have a share in their misfortunes.
		-- Bertolt Brecht, "The Caucasian Chalk Circle"

Each run prints a randomly selected fortune from the available databases.

How Does `fortune` Work?

Source code: Distrotech/fortune-mod

The fortune utility selects random text snippets ("fortunes") from a database file. The fortunes are stored in two files per category:

- A plain text file (no extension, e.g. art) containing multiple fortune messages, separated by a delimiter (usually %).
- A binary index file (same name plus .dat, e.g. art.dat) that holds metadata, including offsets to the different fortune strings within the text file. This allows fortune to quickly retrieve and print a random fortune instead of parsing the entire file.

You can find the data files used by fortune under `/usr/share` or `/usr/share/games` on Linux, depending on the distribution:

[advaeta@vbox tmp]$ ls /usr/share/games/fortune/
art               drugs           humorists            linux.u8           osfortune.dat   pratchett.dat             tao
art.dat           drugs.dat       humorists.dat        literature         paradoxum       pratchett.u8              tao.dat
art.u8            drugs.u8        humorists.u8         literature.dat     paradoxum.dat   riddles                   tao.u8
ascii-art         education       humorix-misc         literature.u8      paradoxum.u8    riddles.dat               translate-me
ascii-art.dat     education.dat   humorix-misc.dat     love               people          riddles.u8                translate-me.dat
ascii-art.u8 ....

Did you notice anything? Each database comes as a pair of files with the same name, e.g. “art” and “art.dat”.
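The pairing is easy to check from Python. The sketch below uses a hypothetical helper, `find_databases` (not part of fortune), that lists every name in a directory that has both a text file and a matching .dat index. The directory path is taken from the listing above and is an assumption; it varies between systems.

```python
from pathlib import Path


def find_databases(directory):
    """Return sorted names that have both a text file and a `.dat` index."""
    root = Path(directory)
    if not root.is_dir():
        return []
    names = []
    for dat in root.glob("*.dat"):
        text_file = dat.with_suffix("")  # "art.dat" -> "art"
        if text_file.is_file():
            names.append(text_file.name)
    return sorted(names)


# Path assumed from the listing above; adjust it for your system.
print(find_databases("/usr/share/games/fortune"))
```

An index without its text file (or the reverse) is skipped, since fortune needs both halves of the pair.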

In the source code you won’t see any *.dat files. So where do these .dat files, the index files, come from? (shushhh, it’s strfile)

Understanding `strfile` and `.dat` Files

To optimize lookup speed, fortune does not scan the whole text file each time it is executed. Instead, it relies on an associated binary index file (with a .dat extension). This index file is generated using the strfile utility.

The `strfile` Utility

man: strfile

strfile is the command-line tool used to create the .dat index files required by fortune. It scans the text file, identifies the positions of each individual fortune (demarcated by a delimiter, usually %), and generates an index file with metadata and offsets.

Using `strfile` to Generate a `.dat` File

To create a .dat file from a text file of fortunes:

strfile fortunes fortunes.dat

The fortunes file contains the actual text entries, each separated by a % symbol.
Running strfile on fortunes generates fortunes.dat, which contains the metadata and offsets.

Once you have both fortunes and fortunes.dat, you can use the fortune command like this:

$ fortune fortunes

This will randomly pick and print a fortune from your custom file.

The generated *.dat file can be customized further: strfile can store the offsets in random order or ordered alphabetically, with an option to ignore case when ordering. Read the manual for more details.

Now, let's understand the structure of a *.dat file.

Structure of a `fortunes.dat` File

The .dat file is a binary file that contains a header followed by a list of offsets. Each offset marks the start of a fortune in the corresponding plain text file (the file with the same name and no extension).

Header Structure (21 bytes of fields, 24 bytes on disk)

The format of the header is:

#define VERSION 1
unsigned long str_version; /* version number */
unsigned long str_numstr; /* # of strings in the file */
unsigned long str_longlen; /* length of longest string */
unsigned long str_shortlen; /* shortest string length */
#define STR_RANDOM 0x1 /* randomized pointers */
#define STR_ORDERED 0x2 /* ordered pointers */
#define STR_ROTATED 0x4 /* rot-13'd text */
unsigned long str_flags; /* bit field for flags */
char str_delim; /* delimiting character */

Important: 3 bytes of padding follow str_delim.

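Breakdown of str_flags:
- STR_RANDOM (0x1): randomized pointers
- STR_ORDERED (0x2): ordered pointers
- STR_ROTATED (0x4): rot-13'd text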
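Because str_flags is a bit field, several flags can be set at once. A minimal sketch of decoding it, using the three constants from the header listing above (`decode_flags` is a hypothetical helper name, not something fortune provides):

```python
STR_RANDOM = 0x1   # randomized pointers
STR_ORDERED = 0x2  # ordered pointers
STR_ROTATED = 0x4  # rot-13'd text


def decode_flags(flags):
    """Return the names of the flag bits that are set in `flags`."""
    names = []
    if flags & STR_RANDOM:
        names.append("STR_RANDOM")
    if flags & STR_ORDERED:
        names.append("STR_ORDERED")
    if flags & STR_ROTATED:
        names.append("STR_ROTATED")
    return names


print(decode_flags(0x5))  # ['STR_RANDOM', 'STR_ROTATED']
```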

Can you guess the header size? 21 or ??

The header size is 24 bytes: five integer fields of 4 bytes each in the file (20 bytes), the 1-byte delimiter, and 3 bytes of padding.

All fields are written in big-endian byte order.
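To make the layout concrete, here is a rough Python approximation of what an index writer has to produce. It is a sketch, not a replacement for strfile: `build_index` is a hypothetical name, the version value 1 is copied from the header listing above, the 4-byte offsets match what the reader code in this article expects, and the extra final offset (the end of the last fortune) is an assumption modeled on common strfile implementations.

```python
import struct


def build_index(text, delim=b"%"):
    """Build header + offset table for fortune text given as bytes."""
    offsets = [0]   # the first fortune starts at byte 0
    lengths = []    # byte length of each fortune, for longlen/shortlen
    pos = 0         # byte position just after the current line
    start = 0       # byte position where the current fortune began
    for line in text.splitlines(keepends=True):
        pos += len(line)
        if line.rstrip(b"\r\n") == delim:
            length = pos - len(line) - start
            if length > 0:
                lengths.append(length)
                offsets.append(pos)   # next fortune starts after the delimiter line
            else:
                offsets[-1] = pos     # empty entry: just move the start forward
            start = pos
    if pos > start:                   # last fortune has no closing delimiter
        lengths.append(pos - start)
        offsets.append(pos)

    # 5 big-endian uint32 fields, then delimiter + 3 padding bytes = 24 bytes
    header = struct.pack(">5I", 1, len(lengths),
                         max(lengths, default=0), min(lengths, default=0), 0)
    header += delim + b"\x00" * 3
    return header + struct.pack(f">{len(offsets)}I", *offsets)


index = build_index(b"a\n%\nbb\n%\n")
print(len(index))                          # 36
print(struct.unpack(">3I", index[24:]))    # (0, 4, 9)
```

The two fortunes start at bytes 0 and 4, and the table begins right after the 24-byte header, which is why a reader can find entry i at byte 24 + 4 * i.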

Reading the `fortune` `.dat` File in Python

To extract fortunes from a .dat file in Python, we need to read both the header information and the list of offsets, then use those offsets to locate and print random fortunes from the corresponding plain text file.

How to print header info?

full source code : fortuneHeadersReader.py

#######################
# This program is intentionally made ugly to make it easy to understand
########################
import struct
import sys

# Assumed usage: python fortuneHeadersReader.py /path/to/some.dat
with open(sys.argv[1], 'rb') as datfile:
    # printing headers: five 4-byte big-endian unsigned integers
    headers = ["version", "numstr", "longlen", "shortlen", "flags"]

    for header in headers:
        chunk = datfile.read(4)
        header_int = struct.unpack('>I', chunk)[0] # reading unsigned Integers

        if header == "flags":
            # each flag is a single bit of the same 4-byte field
            print(f"{'flags (str_random)':<20} : {bool(header_int & 0x00000001)}")
            print(f"{'flags (str_ordered)':<20} : {bool(header_int & 0x00000002)}")
            print(f"{'flags (str_rotated)':<20} : {bool(header_int & 0x00000004)}")
            continue

        print(f"{header:<20} : {header_int:<10}")

    # delimiter: 1 byte (the 3 padding bytes after it are not read here)
    delim_byte = struct.unpack('>B', datfile.read(1))[0] # reading Byte
    print(f"{'Delimiter':<20} : {chr(delim_byte)}")
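The field-by-field loop can also be collapsed into a single struct call. In the sketch below, the format string `>5IB3x` describes five big-endian unsigned integers, one delimiter byte, and three padding bytes that struct skips. `parse_header` and `HEADER_FORMAT` are illustrative names, and the sample bytes are built in place rather than read from a real index.

```python
import struct

HEADER_FORMAT = ">5IB3x"  # 5 big-endian uint32, 1 delimiter byte, 3 pad bytes


def parse_header(raw):
    """Parse the first 24 bytes of a .dat file into a dict."""
    if len(raw) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("Invalid .dat file: header too short.")
    version, numstr, longlen, shortlen, flags, delim = struct.unpack_from(
        HEADER_FORMAT, raw
    )
    return {
        "version": version,
        "numstr": numstr,
        "longlen": longlen,
        "shortlen": shortlen,
        "flags": flags,
        "delim": chr(delim),
    }


# Hand-made sample header: version 1, 2 strings, longest 3, shortest 2, no flags
sample = struct.pack(HEADER_FORMAT, 1, 2, 3, 2, 0, ord("%"))
print(struct.calcsize(HEADER_FORMAT))  # 24
print(parse_header(sample)["delim"])   # %
```

Letting the format string carry the padding means the 24-byte header size no longer has to be remembered as a separate constant.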

Python Script to Read the `.dat` File and Print a Random Fortune

Source: fortuneReader.py

import random
import struct


def read_fortune_file(dat_file, fortune_file):
    # Both files are opened in binary mode: the index stores byte offsets
    with open(dat_file, 'rb') as df, open(fortune_file, 'rb') as ff:
        # Read header (5 unsigned integers + 1 byte delimiter)
        header_format = ">5I"  # Big-endian: 5 unsigned ints (20 bytes total)
        header_size = struct.calcsize(header_format)
        header_data = df.read(header_size)

        if len(header_data) != header_size:
            raise ValueError("Invalid .dat file: header size mismatch.")

        version, num_str, longlen, shortlen, flags = struct.unpack(header_format, header_data)
        if num_str == 0:
            raise ValueError("Invalid .dat file: it lists no fortunes.")

        # Read the delimiter character
        delim_byte = df.read(1)
        delim_char = chr(struct.unpack('>B', delim_byte)[0])

        print("Header Information:")
        print(f"Version: {version}")
        print(f"Number of Fortunes: {num_str}")
        print(f"Longest Fortune: {longlen} characters")
        print(f"Shortest Fortune: {shortlen} characters")
        print(f"Flags: {flags}")
        print(f"Delimiter: {delim_char} (ASCII: {ord(delim_char)})")
        # Remember: 3 bytes of padding follow, so len(header) == 24

        # Pick a random fortune
        random_index = random.randint(0, num_str - 1)
        next_offset = 24 + (random_index * 4)  # offset table starts at byte 24

        df.seek(next_offset)
        random_offset = struct.unpack(">I", df.read(4))[0]  # Read as unsigned int

        print("\n---------")
        print("-- Random Quote@", random_offset)

        ff.seek(random_offset)
        # Print lines until one holds only the delimiter (or the file ends).
        # A '%' inside the fortune text itself does not end the fortune.
        for line in ff:
            if line.rstrip(b"\r\n") == delim_byte:
                break
            print(line.decode("utf-8", errors="replace"), end='')

Comment below if you don’t understand the code. Happy to help (:

1️⃣ Scalable Design and The Concept of `.dat` Files for Efficient Parsing

A .dat file stores structured offset information about a separate text file; the fortunes themselves are not in it.
This enables the program to quickly jump to any fortune without scanning the entire file.
The index typically includes:
- A header (metadata: version, count, longest/shortest string, flags)
- The delimiter character (marks the end of each fortune in the text file)
- A list of offsets (pointers to where each fortune starts)
The actual fortune texts stay in the plain text file next to it.
Instead of searching for the nth fortune line by line, we can:
- Read a random offset from the offset table (the *.dat file).
- Seek directly to that byte position in the text file (the fortune file).
- Read until we hit the delimiter.

This is much faster than iterating over every line in a large text file!
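The difference between the two approaches can be sketched side by side. Both functions below are hypothetical illustrations rather than code from fortune: `fortune_by_scan` walks the text file from the top and counts delimiter lines, while `fortune_by_seek` jumps straight to a byte offset such as one read from the index. The offsets in the demo are worked out by hand for the small sample text.

```python
import os
import tempfile


def fortune_by_scan(path, n, delim=b"%"):
    """Find fortune number n by reading every line before it."""
    current, lines = 0, []
    with open(path, "rb") as f:
        for line in f:
            if line.rstrip(b"\r\n") == delim:
                if current == n:
                    break
                current += 1
            elif current == n:
                lines.append(line)
    return b"".join(lines).decode("utf-8")


def fortune_by_seek(path, offset, delim=b"%"):
    """Jump to a known byte offset and read up to the next delimiter line."""
    lines = []
    with open(path, "rb") as f:
        f.seek(offset)
        for line in f:
            if line.rstrip(b"\r\n") == delim:
                break
            lines.append(line)
    return b"".join(lines).decode("utf-8")


# Demo on a throwaway file; the fortunes start at bytes 0, 8 and 17.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"first\n%\nsecond\n%\nthird\n%\n")
print(fortune_by_scan(tmp.name, 2), end="")   # third
print(fortune_by_seek(tmp.name, 17), end="")  # third
os.remove(tmp.name)
```

Both calls return the same text, but the scan touches every earlier line while the seek reads only the fortune it was asked for.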

Final Thoughts

The fortune utility uses a .dat file with precomputed offsets for fast random access, making it efficient for large datasets. Understanding the strfile format helps us design smarter and faster programs by enabling direct access to stored text without loading entire files into memory. 🚀
