In Part 2, we mapped the major functions, string references, and 16 CLI commands using nothing but strings, pattern matching, and a Python script. That left us with a suspicious route-engine string:

WARNING: no routes programmed for stack

Part 2 could show that the string exists and that it belongs to the route subsystem. It could not prove when that path runs, or whether it is the exact runtime symptom. For that we need structure. In this post, we will decompile the firmware, trace the execution path through the route engine, find why two boot routes silently miss the forwarding table, patch the binary, and recalculate the firmware image’s CRC. Four bytes is enough to restore forwarding in this sample.


From Disassembly to C: What Decompilers Give You

Reading MIPS assembly works for short functions, but once you are staring at a 60-instruction function with nested loops and multiple branches, you want C. Decompilers like Ghidra take machine code and produce readable (if imperfect) C.

Here is what Ghidra produces for our iterate_active_routes function. This is the raw, unannotated output – no cleanup, exactly what the tool generates:

void FUN_80000efc(int param_1)
{
    int *piVar1;
    int iVar2;
    int iVar3;
    int iVar4;

    iVar3 = 0;
    iVar4 = 0;
    piVar1 = (int *)0x80040008;
    do {
        iVar2 = *piVar1;
        if (iVar2 != 0) {
            if ((iVar2 & 4) != 0) {
                FUN_80000b74(4, s_route_bridge_flag_set_skipping_80001f70);
                iVar3 = iVar3 + 1;
            }
            else if ((iVar2 & 1) != 0) {
                FUN_80000b74(4, s_route_programming_forwarding_entry_80001f94);
                iVar4 = iVar4 + 1;
            }
        }
        piVar1 = piVar1 + 4;
    } while (piVar1 != (int *)0x80040208);
    if (iVar4 == 0) {
        FUN_80000b74(4, s_WARNING_no_routes_programmed_80001fb8);
    }
    if (0 < iVar3) {
        FUN_80000b74(4, s_bridge_routes_deferred_80001fe0);
    }
    return;
}

Ghidra got the structure right. You can see the loop, the flag checks, the log messages. But everything has auto-generated names: FUN_80000efc, param_1, piVar1, iVar2. The string references are the one lifeline – they tell you what each branch does, even when the variable names are meaningless.

This is what decompilation gives you: structure without semantics. The control flow and data flow are right, but understanding what they mean requires annotation.
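The raw constants in the loop already pin down the table geometry before any names are attached. Here is a minimal sketch of that arithmetic, assuming a 4-byte int (usual for 32-bit MIPS) and assuming the table base is aligned to the entry size. The variable names are illustrative and do not come from the firmware.

```python
INT_SIZE = 4                       # assumption: sizeof(int) on 32-bit MIPS

first_flags_ptr = 0x80040008       # initial value of piVar1 in Ghidra's output
end_ptr = 0x80040208               # value piVar1 is compared against at loop exit
stride = 4 * INT_SIZE              # "piVar1 + 4" on an int* advances 16 bytes

# Number of loop iterations, i.e. number of table slots scanned
entry_count = (end_ptr - first_flags_ptr) // stride

# Offset of the flags word inside one entry (assumes an entry-aligned table base)
flags_offset = first_flags_ptr % stride

print(f"entry size:   {stride} bytes")
print(f"entry count:  {entry_count}")
print(f"flags offset: {flags_offset}")
```

Under those assumptions the loop walks 32 slots of 16 bytes each and reads the word at offset 8 of every slot, which is the shape the annotation step has to explain.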


Annotation: From piVar1 to entry->flags

Let us clean up Ghidra’s output. Using the strings, the function name we discovered in Part 2, and our knowledge of the route table structure, we can annotate everything:

/* iterate_active_routes() at 0x80000EFC
 * Walks the route table and programs active routes into the
 * hardware forwarding table. Bridge-flagged routes are deferred.
 */
void iterate_active_routes(int stack_idx)
{
    route_entry_t *entry;
    int flags;
    int skipped = 0;       /* bridge-deferred routes */
    int programmed = 0;    /* routes sent to forwarding table */

    entry = &g_route_table_entries[0];   /* Ghidra's raw piVar1 points at entry[0].flags */
    do {
        flags = entry->flags;
        if (flags != 0) {                    /* skip empty slots */
            if ((flags & 0x04) != 0) {       /* ROUTE_FLAG_BRIDGE = 0x04 */
                log_msg(MOD_ROUTE, "route: bridge flag set, skipping");
                skipped++;
            }
            else if ((flags & 0x01) != 0) {  /* ROUTE_FLAG_ACTIVE = 0x01 */
                log_msg(MOD_ROUTE, "route: programming forwarding entry");
                programmed++;
            }
        }
        entry++;
    } while (entry != &g_route_table_entries[32]);

    if (programmed == 0) {
        log_msg(MOD_ROUTE, "WARNING: no routes programmed for stack");
    }
    if (skipped > 0) {
        log_msg(MOD_ROUTE, "bridge routes deferred to bridge handler");
    }
}

Now it reads like source code. And now the bug jumps out.


The Bug: Priority Inversion

Look at the if/else chain in the loop (log calls omitted):

if ((flags & 0x04) != 0) {       /* check BRIDGE flag first */
    skipped++;
}
else if ((flags & 0x01) != 0) {  /* check ACTIVE flag second */
    programmed++;
}

The bridge flag (0x04) is checked before the active flag (0x01). This creates an if/else where a route that has both flags set hits the bridge check first and gets diverted – the active check never runs.

Let us trace what happens during boot. The firmware adds four routes:

route_add(0x0A01, 0x01, ACTIVE | STATIC);           /* 10.1.x.x → port 0 */
route_add(0x0A02, 0x02, ACTIVE | STATIC);           /* 10.2.x.x → port 1 */
route_add(0xC0A8, 0x04, ACTIVE | BRIDGE);           /* 192.168.x → port 2 */
route_add(0xAC10, 0x08, ACTIVE | STATIC | BRIDGE);  /* 172.16.x → port 3 */

Routes 1 and 2 have flags 0x03 (ACTIVE + STATIC). The BRIDGE check fails (0x03 & 0x04 = 0), so they fall through to the ACTIVE check (0x03 & 0x01 = 1) and get programmed. Good.

Routes 3 and 4 have flags 0x05 and 0x07 respectively – both include BRIDGE (0x04). The bridge check succeeds first: 0x05 & 0x04 = 4, which is non-zero, so the bridge path wins. The route’s ACTIVE bit exists, but the programming path never gets to use it.

The result? Only 2 of 4 routes get programmed into the forwarding table. Traffic to 192.168.x.x and 172.16.x.x silently blackholes. What the final boot scan actually produces:

[MOD:4] route: programming forwarding entry    ← route 1 (10.1.x)
[MOD:4] route: programming forwarding entry    ← route 2 (10.2.x)
[MOD:4] route: bridge flag set, skipping       ← route 3 (192.168.x) SKIPPED!
[MOD:4] route: bridge flag set, skipping       ← route 4 (172.16.x) SKIPPED!
[MOD:4] bridge routes deferred to bridge handler

iterate_active_routes is called by route_add after every insertion, so the scan runs four times during boot – not once at the end. Calls 1 and 2 only see ACTIVE+STATIC entries and program them cleanly. Call 3 hits route 3’s BRIDGE flag for the first time. Call 4 does the same for route 4.

Routes 1 and 2 get re-programmed on every rescan (idempotent), so programmed is never zero – the WARNING: no routes programmed guard never fires for this route set. It is a safety net for a completely empty forwarding table, not for a partially-broken one. That is the correction decompilation gives us over the string-only view from Part 2: the warning string was a clue to inspect this function, while the actual bug is the quieter “bridge flag set, skipping” path.
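The rescan behaviour can be replayed in a few lines. This is a model of the annotated C, not the firmware itself. It assumes route_add fills consecutive slots in a 32-entry table, and the flag constants mirror the values used in this post.

```python
ROUTE_FLAG_ACTIVE = 0x01
ROUTE_FLAG_STATIC = 0x02
ROUTE_FLAG_BRIDGE = 0x04

def iterate_active_routes(table):
    """Model of the buggy scan: BRIDGE is tested before ACTIVE."""
    programmed = skipped = 0
    for flags in table:
        if flags == 0:
            continue                       # empty slot
        if flags & ROUTE_FLAG_BRIDGE:
            skipped += 1                   # diverted; ACTIVE is never tested
        elif flags & ROUTE_FLAG_ACTIVE:
            programmed += 1
    return programmed, skipped

boot_routes = [
    ROUTE_FLAG_ACTIVE | ROUTE_FLAG_STATIC,                      # 10.1.x.x
    ROUTE_FLAG_ACTIVE | ROUTE_FLAG_STATIC,                      # 10.2.x.x
    ROUTE_FLAG_ACTIVE | ROUTE_FLAG_BRIDGE,                      # 192.168.x
    ROUTE_FLAG_ACTIVE | ROUTE_FLAG_STATIC | ROUTE_FLAG_BRIDGE,  # 172.16.x
]

table = [0] * 32
history = []
for slot, flags in enumerate(boot_routes):
    table[slot] = flags                                # insert (assumed slot order)
    programmed, skipped = iterate_active_routes(table) # rescan after every insert
    history.append((programmed, skipped))
    print(f"call {slot + 1}: programmed={programmed} skipped={skipped} "
          f"warning={'yes' if programmed == 0 else 'no'}")
```

The programmed count is at least 1 on every call, so the warning column stays at "no" for this route set, while the skipped count climbs from 0 to 2 over the last two calls.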

This is a priority inversion in the ordering sense (not the RTOS scheduling problem of the same name): the bridge flag check has higher priority than the active flag check, but the developer intended them to be independent. A route should be programmed if it is ACTIVE, regardless of whether it is also BRIDGE. The bridge handler should get a copy of bridge-flagged routes, not steal them from the forwarding table entirely.

In the real vendor firmware project, we found exactly this pattern: a flag check at address 0xC012936C tested a P2P flag before the active-capability flag, causing an entire subsystem to silently fail whenever a specific hardware configuration was present. The symptom was identical – “no rules programmed” – and it took weeks of trace analysis to find those four bytes.


The Disassembly: Four Instructions

Let us look at the exact machine code. The bug lives in four instructions at file offsets 0x0F6C-0x0F78 (addresses 0x80000F6C-0x80000F78 in the listing):

80000f64:   8e020000    lw      v0, 0(s0)       # v0 = entry->flags
80000f68:   1040fffb    beqz    v0, next_entry  # if flags == 0, skip

80000f6c:   30430004    andi    v1, v0, 0x4     # v1 = flags & BRIDGE
80000f70:   1460fff5    bnez    v1, bridge_skip # if BRIDGE set → branch to bridge_skip  ← THE BUG
80000f74:   30420001    andi    v0, v0, 0x1     # delay slot: executes even when branch is taken
80000f78:   1040fff7    beqz    v0, next_entry  # ← never reached when branch taken

The instruction at 0x80000F70 is the problem: bnez v1, bridge_skip. When the BRIDGE bit is set this branch fires – and because 0x80000F74 is the branch delay slot, the andi there executes either way. What never executes is 0x80000F78: the branch that tests the ACTIVE result before the programming path. Routes with both BRIDGE and ACTIVE flags are treated as bridge-only, not as active routes that also participate in forwarding.
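The encodings in the listing can be cross-checked by hand. Below is a minimal sketch of a decoder for just the four MIPS I-type opcodes that appear there (lw, beq, bne, andi). It is a checking aid, not a general disassembler, and the function name is made up for this post. Note that objdump prints beqz/bnez as shorthand for beq/bne against the zero register.

```python
REG = ("zero at v0 v1 a0 a1 a2 a3 t0 t1 t2 t3 t4 t5 t6 t7 "
       "s0 s1 s2 s3 s4 s5 s6 s7 t8 t9 k0 k1 gp sp fp ra").split()

OPCODES = {0x04: "beq", 0x05: "bne", 0x0C: "andi", 0x23: "lw"}

def decode_itype(addr, word):
    """Decode one 32-bit MIPS I-type word located at address addr."""
    opcode = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    simm = imm - 0x10000 if imm & 0x8000 else imm   # sign-extend 16 bits
    name = OPCODES.get(opcode)
    if name in ("beq", "bne"):
        # Branch offsets count instructions, relative to the delay slot address
        target = (addr + 4 + simm * 4) & 0xFFFFFFFF
        return f"{name} {REG[rs]}, {REG[rt]}, 0x{target:08x}"
    if name == "andi":
        return f"andi {REG[rt]}, {REG[rs]}, 0x{imm:x}"
    if name == "lw":
        return f"lw {REG[rt]}, {simm}({REG[rs]})"
    return f".word 0x{word:08x}"                    # anything else: leave raw

listing = [
    (0x80000F64, 0x8E020000),
    (0x80000F68, 0x1040FFFB),
    (0x80000F6C, 0x30430004),
    (0x80000F70, 0x1460FFF5),
    (0x80000F74, 0x30420001),
    (0x80000F78, 0x1040FFF7),
]
for addr, word in listing:
    print(f"{addr:08x}: {word:08x}  {decode_itype(addr, word)}")
```

Both beq words resolve to the same target, which is what you would expect for two branches that share the next_entry label, and the bne resolves to a different, earlier address for bridge_skip.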


Binary Patching: The Four-Byte Fix

The simplest forwarding fix: NOP out the branch at 0x0F70. Replace bnez v1, bridge_skip with nop, so the bridge flag check still runs but never diverts execution. All non-empty routes reach the ACTIVE check.

import struct

with open('firmware.bin', 'rb') as f:
    data = bytearray(f.read())

# The bug: bnez instruction at file offset 0x0F70 (address 0x80000F70)
bug_offset = 0x0F70
original = struct.unpack_from('<I', data, bug_offset)[0]   # '<I' = little-endian 32-bit word
print(f"Original: 0x{original:08X}  (bnez v1, bridge_skip)")

# The fix: NOP (0x00000000)
NOP = 0x00000000
struct.pack_into('<I', data, bug_offset, NOP)
print(f"Patched:  0x{NOP:08X}  (nop)")

with open('firmware_patched.bin', 'wb') as f:
    f.write(data)

Four bytes. That is the entire forwarding patch. The bridge flag still gets tested by the andi instruction, but the result is never acted on – all non-empty routes proceed to the ACTIVE check.

After patching:
80000f6c:   30430004    andi    v1, v0, 0x4     # v1 = flags & BRIDGE (still runs)
80000f70:   00000000    nop                      # ← PATCHED (was: bnez v1, bridge_skip)
80000f74:   30420001    andi    v0, v0, 0x1     # former delay slot; already executed before patch
80000f78:   1040fff7    beqz    v0, next_entry  # ← now always reached

Now routes with ACTIVE | BRIDGE (flags 0x05 or 0x07) hit the NOP, the former delay-slot andi runs as before, but execution falls through to 0x80000F78 – the beqz that was previously unreachable. The ACTIVE flag is set, the branch is not taken, and the route gets programmed. The forwarding table gets all four boot routes instead of two.
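A blind write at a fixed offset will corrupt the image if you point it at a different firmware build. One defensive variant is to verify the word you are about to replace. This is a sketch, with patch_word as a hypothetical helper, the load base 0x80000000 inferred from the address/offset pairs in this post, and little-endian byte order assumed to match the script above. The demo runs on a synthetic buffer rather than the real firmware.bin.

```python
import struct

LOAD_BASE = 0x80000000              # assumption: flat binary is mapped at this address
BNEZ_V1_BRIDGE_SKIP = 0x1460FFF5    # encoding shown in the disassembly listing
NOP = 0x00000000

def patch_word(image, address, expected, replacement, fmt="<I"):
    """Overwrite one 32-bit word, but only if the current value matches expected."""
    offset = address - LOAD_BASE
    current = struct.unpack_from(fmt, image, offset)[0]
    if current != expected:
        raise ValueError(
            f"0x{address:08X}: found 0x{current:08X}, expected 0x{expected:08X}"
        )
    struct.pack_into(fmt, image, offset, replacement)
    return offset

# Synthetic stand-in for the firmware image: zeros plus the one word we care about
image = bytearray(0x1000)
struct.pack_into("<I", image, 0x0F70, BNEZ_V1_BRIDGE_SKIP)

offset = patch_word(image, 0x80000F70, BNEZ_V1_BRIDGE_SKIP, NOP)
print(f"patched file offset 0x{offset:04X}: {image[offset:offset + 4].hex()}")
```

Running the same call a second time raises ValueError, because the word is already a NOP. That makes the patcher safe to re-run and loud when the target does not look like the build you analysed.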


CRC Recalculation: You Break It, You Fix It

Remember the firmware image from Part 1? Each partition has a CRC-32 checksum, and the image header has a global checksum. Patching even a single byte invalidates both:

Original firmware.bin CRC-32:  0x9F25E0DC
Patched firmware.bin CRC-32:   0x156FE139   ← completely different!

If we packed the patched binary into a firmware image without updating the CRC, the device’s bootloader would reject it. We need to repack:

$ python3 fwpack.py firmware_patched.bin firmware_patched.img

Packed firmware image: firmware_patched.img
  Total size:    9152 bytes
  Image CRC-32:  0xC5E88C64
  Partitions:    3

  [bootloader]
    Offset:  0x0120
    Size:    32 bytes
    CRC-32:  0x4F7A6ACA
  [main_fw]
    Offset:  0x0160
    Size:    8496 bytes
    CRC-32:  0x156FE139    ← updated automatically
  [config]
    Offset:  0x22B0
    Size:    256 bytes
    CRC-32:  0x2683AC5B

The packager recalculates all checksums from the payload data. In practice, reverse engineering the vendor’s CRC algorithm (or identifying it as standard CRC-32) is a necessary step before you can deploy any binary patch. Some vendors use non-standard polynomials, or CRC the payload with a salt, or layer RSA signatures on top. Each additional layer makes patching harder – but the fundamental workflow is the same: patch the code, recalculate the integrity checks, repack.
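To identify a vendor checksum, it helps to have a CRC routine whose parameters you can vary and a known-good reference to compare against. Here is a minimal sketch: crc32_bitwise is an illustrative helper, zlib.crc32 is the standard-library reference, and the payload is synthetic data rather than the firmware from this series.

```python
import zlib

def crc32_bitwise(data, poly=0xEDB88320, init=0xFFFFFFFF, xor_out=0xFFFFFFFF):
    """Reflected CRC-32, one bit at a time, with tunable parameters."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ xor_out

# With the default parameters this matches the standard CRC-32 used by zlib
check = b"123456789"
print(f"bitwise: 0x{crc32_bitwise(check):08X}")
print(f"zlib:    0x{zlib.crc32(check):08X}")

# Changing four bytes of a payload changes the checksum
payload = bytes(range(256)) * 4
patched = bytearray(payload)
patched[0x70:0x74] = b"\x00\x00\x00\x00"
print(f"before:  0x{zlib.crc32(payload):08X}")
print(f"after:   0x{zlib.crc32(bytes(patched)):08X}")

# A different polynomial (here the CRC-32C one) gives a different result
print(f"crc32c:  0x{crc32_bitwise(check, poly=0x82F63B78):08X}")
```

If the checksum stored in a vendor image matches neither the default parameters nor a common alternative, the next things to vary are the initial value, the final XOR, and which bytes are covered.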


What We Changed (And What We Didn’t)

Let us be precise about what this patch does:

Route flags                    Before patch              After patch
Routes with ACTIVE only        Programmed                Programmed
Routes with BRIDGE only        Skipped (deferred log)    Skipped (silent)
Routes with ACTIVE + BRIDGE    Skipped (bug!)            Programmed (fixed!)
Routes with no flags           Skipped (empty)           Skipped (empty)

The ACTIVE + BRIDGE path is fixed. The ACTIVE-only path is unchanged. But the BRIDGE-only path changes: before the patch, a BRIDGE-only route incremented skipped and triggered the “bridge routes deferred” log. After the NOP, the branch that drove that path is gone. The route falls through to the ACTIVE check, which fails because (flags & 0x01) == 0, and exits via next_entry with no counter and no log.

In this firmware, that is fine. The “bridge handler” is a log message; there is no real deferred queue behind it. If there were, NOPping the branch would trade one bug for another. In a real project you would replace the branch with proper independent checks – test BRIDGE and ACTIVE separately rather than as a chain. The NOP is the minimal patch that fixes the symptom we can measure; the right fix is a code restructure.
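The three behaviours can be compared side by side: the shipped logic, the NOP patch, and a restructure with independent checks. This is a sketch in Python rather than a drop-in replacement for the MIPS code. The independent function is a hypothetical design, not something recovered from the firmware.

```python
ACTIVE, STATIC, BRIDGE = 0x01, 0x02, 0x04

def original(flags):
    """Shipped logic: BRIDGE wins over ACTIVE."""
    if flags == 0:
        return "empty"
    if flags & BRIDGE:
        return "skipped"
    return "programmed" if flags & ACTIVE else "ignored"

def nop_patched(flags):
    """After the NOP: only the ACTIVE test decides."""
    if flags == 0:
        return "empty"
    return "programmed" if flags & ACTIVE else "ignored"

def independent(flags):
    """Hypothetical restructure: each flag is tested on its own."""
    actions = []
    if flags & ACTIVE:
        actions.append("program")
    if flags & BRIDGE:
        actions.append("bridge")
    return "+".join(actions) or "none"

cases = {
    "ACTIVE only": ACTIVE,
    "BRIDGE only": BRIDGE,
    "ACTIVE + BRIDGE": ACTIVE | BRIDGE,
    "no flags": 0,
}
print(f"{'flags':16} {'original':11} {'nop patch':11} independent")
for label, flags in cases.items():
    print(f"{label:16} {original(flags):11} {nop_patched(flags):11} {independent(flags)}")
```

Only the independent version both programs an ACTIVE + BRIDGE route and still hands BRIDGE routes to the bridge path, which is the behaviour the NOP patch gives up.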


The Bigger Picture

We went from suspicious route strings to a root cause and a binary patch in one post. Let us trace back the full path:

  1. Part 1: Built the flat binary – 8,496 bytes, no symbols, no sections
  2. Part 2: strings found suspicious route messages; xref scanning pointed us at the route subsystem
  3. Part 3: Decompilation revealed the flag priority inversion in iterate_active_routes at 0x80000EFC
  4. Binary patch: NOP at offset 0x0F70 (4 bytes)
  5. CRC update: Repack firmware image with new checksums

That is the real-world firmware RE workflow. Find a clue in the strings. Trace it to a function. Decompile the function. Understand the logic. Patch the binary. Update the checksums.

In the real project, this same workflow – from a warning message in a UART log to a binary patch at a specific address – took weeks. Most of that time was spent building the tools and understanding the firmware’s architecture. The actual bug was four bytes, just like this one.

But here is the thing: we patched four bytes. What if the fix were more complex: restructuring a function, adding new code, changing data structures? You cannot do that with a hex editor. You would need to recompile from source.

And that raises a question: we have Ghidra’s decompiled C for every function. We have the source filenames from the assert strings. Can’t we just… compile it?

In Part 4: “Symbols, Scripts, and other linking nightmares”, we will try. And we will discover why that simple question has a very complicated answer.

