So, I decided to write a BUFR decoder... Part 2

Kelton Halbert

Published: 2025-07-07



Introduction

This is part 2 of my deep dive into writing a BUFR decoder. If you are jumping in for the first time and have questions such as:

  • Why do you hate yourself?
  • What is BUFR, anyway?
  • What is the end goal of this endeavor?

Then I encourage you to catch up and read Part 1. It’s not too terribly long of a read! I can’t say the same for this one. I promised in Part 1 that I would get into the weeds, so I hope you brought some bug spray.

Here in Part 2, I want to dive into the structure of BUFR messages, what language I chose (Zig) and why it is particularly well suited for this kind of problem space, and how I use really cool features of the language to define compile-time templates for decoding BUFR messages. This was my first project diving into things like byte-streams, bit manipulation, and compile-time reflection — and I am doing it all without a single external library dependency.

Without further ado, let’s descend into madness together. The work is mysterious and important…

It may not be pretty, but somebody's gotta do it...

The Structure of BUFR files and BUFR Messages

BUFR Files

Usage of the phrase “BUFR file” is a bit tricky, because there is technically no standard “BUFR file” that acts as a container for a series of BUFR messages. A “BUFR file” you’ve encountered in the wild could be a series of complete BUFR messages, one after the other (possibly with additional bytes between them), or just a single message. For example, observed U.S. Radiosonde BUFR data for a single site received over NOAAPORT has all balloon observations during the ascent contained within a single BUFR message. If you’ve ever opened up a model point forecast sounding BUFR file on AWS or GCP, you’ll see that it is actually a series of BUFR messages in one file, with some extra bytes in between.

Your outie has BUFR Table B memorized...

The general structure is to have the first 64 bits (8 bytes) of the file encode the number of bytes in the first message, followed by the BUFR message itself, followed by 64 bits repeating the byte count of the message just read, and then 64 bits giving the byte count of the next message, and so on until there are no more BUFR messages. Presumably, this is to make navigating forward and backward between BUFR messages easier, but the main thing I want to point out is that none of this is part of the official BUFR standard. It doesn’t break the standard either… it’s just a choice NCEP made, and is one of the many oddities of “BUFR files”.
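
To make that layout concrete, here is a rough sketch in Zig (the language this project lands on) of stepping through messages arranged this way. The function name is mine, and the big-endian interpretation of the 8-byte counts is an assumption for illustration, not documented NCEP behavior:

```zig
const std = @import("std");

/// Step through a buffer laid out as: [count][message][count][count][message][count]...
/// where each count is an 8-byte integer giving the byte length of the
/// adjacent message. Returns the message slice and the offset of the
/// next leading count, or null when the buffer is exhausted.
pub fn nextMessage(buf: []const u8, offset: usize) ?struct { msg: []const u8, next: usize } {
    if (offset + 8 > buf.len) return null;
    const len: usize = @intCast(std.mem.readInt(u64, buf[offset..][0..8], .big));
    const start = offset + 8;
    const end = start + len;
    // the trailing count for this message must also fit
    if (end + 8 > buf.len) return null;
    return .{ .msg = buf[start..end], .next = end + 8 };
}
```

Calling this in a loop until it returns null would visit every message in the file, which is exactly the iteration the web decoder discussed below does not do.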

Why am I pointing this out? Well, if you were to take an NCEP model point forecast sounding “BUFR file” from one of the cloud providers and, say, provide it to the pybufrkit web decoder for inspection, you would only get the first message. To actually decode the file, you would have to use the Python API and directly iterate over the messages in the file. I only figured this out through trial and error, along with a brief discussion with the author of pybufrkit. Hopefully this saves you some trouble in the future!

Another “fun” fact I learned in this process is that, at least based on appearances, the first BUFR message of the point forecast sounding data appears to be a BUFR-encoded BUFR table! So, presumably, you can use that first message to get the local NCEP table used for encoding, and then use that to decode the subsequent messages… which is already giving me a headache when thinking about tackling that later on down the road. Ugh…

BUFR Messages

With some of the nuances about “BUFR files” out of the way, let’s talk about the structure and formatting of a BUFR message. At a high level, each message is split into sections that contain metadata needed for decoding, and the actual data to be decoded.

Each section is split into “octets” (8-bit chunks) used to store information about the BUFR version, the length of sections, the presence of optional sections, and more. These sections can differ between BUFR versions, but are expected to be the same for all messages of a particular version. For example, Section 1 has a different octet layout and size between BUFR versions 3 and 4, but all BUFR v4 messages have the same Section 1 structure. Some other important properties of BUFR sections:

  • Section 0 has a fixed size of 8 octets, and Section 5 has a fixed size of 4 octets.
  • All other sections have a variable length that is encoded in their first 3 octets.
  • Each section shall be an integer multiple of 8 bits. Zero-bits are appended to the end of a section if needed.
  • Data within the BUFR sections are not always byte aligned.

That last point is really important, because it is also a characteristic of decoding the actual data within a BUFR message too. Data can be encoded as an arbitrary number of bits that are not multiples of 8. This means you have to do some funky bit manipulation to extract data from BUFR messages, especially since most system libraries read data from storage as byte streams (streams of 8-bit chunks). To illustrate, I’m going to further break down the structure of BUFR Section 1, since it has a variable length, flags that indicate the presence of optional sections, and sub-byte fields in the form of bit flags. It’s a reasonable representation of the expected complexity.

The above diagram is a “packet diagram” of the kind used for displaying the layout of network packets. It also happens to be useful for the layout of a BUFR message! Each field in the diagram represents a number of bits and their interpretation. Bits 0-23 (24 bits, or 3 bytes) encode the total length of Section 1, including those bits, followed by 8 bits that encode the BUFR Master Table number, and so on… There is also a single-bit flag labeled “S2?” that encodes whether or not an optional Section 2 follows, with the remaining 7 bits reserved and set to zero. The final field is optional, and I chose an arbitrary length for it to make the diagram complete. The actual length would be determined by the number of bits read up to that point, subtracted from the value encoded in the first 3 bytes. As per the rules above, if the section does not end on a multiple of 8 bits, it is zero-padded.
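
To show what reading these fields actually involves, here is a minimal bit reader in Zig. This is a toy sketch (not the decoder’s actual BitReader) that pulls arbitrary-bit-width values out of a byte buffer, most significant bit first, the way the diagram is laid out:

```zig
const std = @import("std");

/// A toy bit reader: reads values of arbitrary bit width that need not
/// start or end on a byte boundary, most significant bit first.
pub const MiniBitReader = struct {
    bytes: []const u8,
    bit_pos: usize = 0,

    pub fn read(self: *MiniBitReader, comptime T: type) !T {
        const width = @bitSizeOf(T);
        if (self.bit_pos + width > self.bytes.len * 8) return error.EndOfStream;
        var acc: u64 = 0;
        var i: usize = 0;
        while (i < width) : (i += 1) {
            // grab one bit from the current byte, MSB first
            const byte = self.bytes[self.bit_pos / 8];
            const shift: u3 = @intCast(7 - (self.bit_pos % 8));
            acc = (acc << 1) | ((byte >> shift) & 1);
            self.bit_pos += 1;
        }
        return @intCast(acc);
    }
};
```

With something like this in hand, the start of the Section 1 layout reads as `try reader.read(u24)` for the length, `try reader.read(u8)` for the Master Table number, and, further along, a `u1` read for the “S2?” flag followed by a `u7` read for the reserved bits.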

Knowing the structured bit-layout of this and the other BUFR sections, as well as some of the other facts (such as arbitrary bit-width decoding), and the knowledge that decoding of full data will be done using table-encoded metadata, we can start to piece together the puzzle of how exactly we are going to solve this decoding problem… and with what language.


Choosing Zig for BUFR Decoding

Surveying the Landscape

Rust is still busy compiling the gun.

There really aren’t many base-level requirements a language needs in order to decode BUFR data other than the capacity for bit manipulation. As mentioned in Part 1, NCEP uses FORTRAN, there’s a Python-based decoder, and there’s also C and C++ code for this task floating around on GitHub, though I am not as well versed in those implementations. All of these implementations require runtime loading of tables, which, as mentioned in Part 1, is exactly what I want to avoid. The Python implementation, pybufrkit, is rather intriguing, though: it uses JSON to encode the templates for the BUFR sections, as well as the various table data, and uses that information to define the structure of and decode BUFR data. What if, instead of JSON, this information could be encoded at compile time? It would solve the problem of needing to load tables and configuration at runtime, and would allow a lot of work to be done at compile time instead, making things a little faster and more memory efficient.

I’m no stranger to advocating for C++ when it makes sense, but after some initial prototyping, it became clear that the template meta-programming and compile-time logic available in C++17 (the current version available on NWS/NCEP systems) would be difficult and messy at best. Python wasn’t a solution, because this decoder needs to be portable, usable from compiled C++ visualization code, and able to run in WebAssembly browser environments. A very intelligent and capable colleague suggested Rust, and it very well may be the correct route long-term.

You should totally build a BUFR decoder in Rust... HMU and let's chat about it

— Daniel Rothenberg (@danielrothenberg.com) May 26, 2025 at 8:14 PM

Rust would be a strong choice: it offers memory safety, has many of the strengths and some of the performance benefits of C++, can compile to WebAssembly, and is arguably a more production-ready language. Still, I’ll admit I already had some internal bias against using it, due to some ideas that had been rattling around in my head about how to solve BUFR decoding. Rust, in its current state, does not support something called “compile-time reflection”, and ongoing efforts to bring it to the language died off due to internal drama within the Rust Foundation. Rust does have macros, much like C and C++, but there is strong division within programming communities about the code obfuscation and debugging difficulty that macros bring. Compile-time reflection is officially coming to C++26 as a new core feature, but it will be many years before that trickles down into the operational environments I am able to write for.

Compile-Time Reflection and Zig

So what is compile-time reflection anyway, and why would the lack of it push me away from otherwise solid solutions?

Reflective programming or reflection is the ability of a process to examine, introspect, and modify its own structure and behavior — Wikipedia

It is ill advised to stare at your code reflection for too long.

The compiler has access to a lot of information when compiling your code, and it is possible to use that information to perform logic and generate code at compile-time, inspect the types of your data, or even iterate over the named fields within your struct/data structure. In fact, reflection is a form of generic programming analogous to templates in C++, but with stronger introspection than is typically available.

There’s a newer language on the block called Zig, and it is really awesome. Zig is a systems programming language in the same class as C (if not outright intended to one day replace C). It aims to be minimalistic and portable to embedded platforms, with no hidden control flow, no hidden memory allocations, and no preprocessor or macros. Additionally, Zig has native support for compile-time reflection and compile-time execution of code, and treats types as first-class citizens of the language. Finally, it is a zero-dependency, drop-in replacement for a C and C++ compiler, can interface with C and C++ code, and cross-compiles out-of-the-box… which is sick! While newer and still stabilizing (primarily in the standard library), it is already being used for large, performance-critical applications. Bun, a performance-focused drop-in replacement for Node.js, is written in Zig and already powers major websites such as Anthropic, X, Midjourney, and the website you are reading right now! Zig was also used to make Ghostty, a very powerful terminal emulator that I highly recommend checking out. So while the lack of maturity could be cause for concern for some, there are big players writing big projects and getting great results out of Zig. Zig is also developed and funded through a nonprofit and has a strong open-source community behind it.

Some other awesome functionality:

  • While not garbage collected, you can easily check for memory leaks by using the Debug Allocator to track memory usage.
  • Supports integers of arbitrary bit-widths.
  • Built-in functions for compile-time reflection and execution of code.
  • Tagged unions for runtime and compile-time function dispatch.

Sometimes, things don’t make a whole lot of sense until you see code… so let’s take a look at some (admittedly contrived) examples of what some of this looks like in Zig, and how you can use it. This may seem like a bit of a detour, but the magic of using reflection for BUFR templates cannot be fully appreciated if you don’t have at least a high-level understanding of what it is doing!

// A struct that stores some data of various types, 
// and has a print method used to describe the data 
// using reflection and compile-time operations
const Tornado = struct {
    name: []const u8,
    rating: u3,
    max_wind: u9,
    length_miles: f16,
    width_miles: f16,
    duration_minutes: u8,

    pub fn print(self: @This()) void {
        const fields = @typeInfo(@This()).@"struct".fields;

        // inline for loops are loops executed at
        // compile-time, and must be done using
        // data known at compile-time
        inline for (fields) |field| {
            const value = @field(self, field.name);
            const field_type_info = @typeInfo(field.type);

            switch (field_type_info) {
                .pointer => {
                    std.debug.print("{s}: {s}\n", .{ field.name, value });
                },
                inline else => {
                    std.debug.print("{s}: {}\n", .{ field.name, value });
                },
            }
        }
        std.debug.print("\n", .{});
    }
};

What I have done here is create a struct that contains some data and has a method that prints some output. Pretty simple in principle, but if you take a closer look, there is some powerful stuff happening. First, I have defined some fields with arbitrary bit-widths, which is pretty uncommon for a language to support. I chose a u3 for the EF-scale rating because I know the maximum value is 5, and an unsigned integer of width 3 can represent 0-7, making it the smallest possible size for representing the data. I also have a string, represented as a slice of bytes ([]const u8). Rather than writing code that manually accesses each variable within the struct, or having branching if-statements, I am able to use reflection to iterate over the struct fields and switch on the field type to define the print behavior.

Functions prefixed with “@” are compiler builtins, several of which can be used for reflection. I used the @This() builtin to get the containing struct type and passed it to the @typeInfo() builtin to get the list of fields the struct contains. I then iterate over those fields in a compile-time loop, extract the value of each field, and use @typeInfo() on the field’s type to determine how it should be printed (strings need different formatting than numbers). I then switch on the field type, performing one print operation for strings and printing all other cases as numbers. It should be noted that I am taking a shortcut for the sake of example; a fully general implementation would not lump everything else in as a number quite like this.

If it still has not sunk in why this is incredible, think about the scenario where I add another struct field called fatalities. What happens? Well, I can tell you what doesn’t happen: I don’t have to add another condition or hard-coded struct field access to print the number of fatalities associated with that tornado. Which is AWESOME. In order to demonstrate this with a fully executable program, all we have to do is define an array of Tornado, iterate over it, and call the print function.

pub fn main() !void {
    // a stack allocated array with
    // an inferred length
    const tornadoes = [_]Tornado{
        .{
            .name = "El Reno",
            .max_wind = 313,
            .duration_minutes = 40,
            .length_miles = 16.2,
            .width_miles = 2.6,
            .rating = 3,
        },
        .{
            .name = "Bridge Creek/Moore",
            .max_wind = 321,
            .duration_minutes = 85,
            .length_miles = 38,
            .width_miles = 1,
            .rating = 5,
        },
        .{
            .name = "Joplin",
            .max_wind = 225,
            .duration_minutes = 38,
            .length_miles = 21.62,
            .width_miles = 1.0,
            .rating = 5,
        },
    };

    std.debug.print("List of Notable Tornadoes\n", .{});
    inline for (tornadoes) |tornado| {
        tornado.print();
    }
}

// import the standard library
const std = @import("std");

Output:

List of Notable Tornadoes
name: El Reno
rating: 3
max_wind: 313
length_miles: 1.62e1
width_miles: 2.6e0
duration_minutes: 40

name: Bridge Creek/Moore
rating: 5
max_wind: 321
length_miles: 3.8e1
width_miles: 1e0
duration_minutes: 85

name: Joplin
rating: 5
max_wind: 225
length_miles: 2.162e1
width_miles: 1e0
duration_minutes: 38

If the example above hasn’t already made it apparent, this is incredibly useful functionality to have when decoding BUFR messages, and not just some weird side quest into a niche new language. BUFR data are heavily templated, and those templates can be leveraged at compile time to tell a program how to interpret and decode a BUFR message. As far as I am aware, no other BUFR decoder does this, and it is precisely what enables the desired use case: performing decoding in a sandboxed WebAssembly environment without runtime tables.


Compile-Time BUFR Template Definition

If you still are not convinced that this is perfect for BUFR, or perhaps are a little confused, I am going to show you the logic for decoding one of the simpler BUFR message sections. The general design strategy I came up with is to have section templates defined as structs, where each struct contains a slice of BUFRSectionParam detailing the name of each field and the type/bit-width of the field to be decoded. These types can then be pipelined and read sequentially using a bit reader that has an array buffer and a pointer/index to the current bit. The details of the bit reader are not necessary here, but the general strategy can be represented with the following code:

pub const BUFRSectionParam = struct {
    name: []const u8,
    dtype: type,
    expected: ?[]const u8 = null,
};

pub const BUFRSection = struct {
    index: u8,
    description: []const u8,
    default: bool,
    optional: bool,
    parameters: []const BUFRSectionParam,
};

Each BUFR Message section to be decoded can be represented with an index (Section 0, 1, 2, etc), a description string, booleans that determine whether it is expected by default or optional, and an array of BUFRSectionParam. Each BUFRSectionParam can be represented by a string name, a type, and I include an optional value that can be used to validate the BUFR start and end signatures. To take this from generalized to a specific instantiation, I created the following code for Section 4, the data section:

// This is the actual encoding of how 
// Section 4 is structured, including 
// the bit-widths/types of the fields 
// to be decoded
pub const Template = BUFRSection{
    .index = 4,
    .description = "Data section",
    .optional = false,
    .default = true,
    .parameters = &[_]BUFRSectionParam{
        BUFRSectionParam{
            .name = "section_length",
            .dtype = u24,
        },
        BUFRSectionParam{
            .name = "reserved_bits",
            .dtype = u8,
        },
        BUFRSectionParam{
            .name = "template_data",
            .dtype = []const u8,
        },
    },
};

// More on this later, but it is used to create a 
// return type specific to the values of Template.parameters
pub const Data = bufr_section.make_section_return_type(Template);

// The decoder function takes a pointer to a bit reader.
// The bit reader has a function called 'read' that takes 
// in a type, determines the width of the type using 
// reflection, and then reads and returns the value having 
// read that number of bits.
pub fn decode(bit_reader: *BitReader) !SectionData {
    var result: Data = undefined;

    comptime var bits_read: usize = 0;
    // the last field is for the encoded data
    // described using BUFR tables...
    // that will be read using read_slice.
    // Also, remember that 'inline for' is a 
    // compile-time loop. It does not "read"
    // the BUFR file at compile-time, since 
    // that is runtime data... but it does 
    // effectively unroll/pipeline the 
    // decoding instructions. At least, 
    // that is how I understand it.
    inline for (0..Template.parameters.len - 1) |idx| {
        const param = Template.parameters[idx];
        const value = try bit_reader.read(param.dtype);
        // use the @field accessor to store 
        // the decoded value in our return type
        @field(result, param.name) = value;
        bits_read += @bitSizeOf(param.dtype);
    }

    // the remaining data is the data to be decoded
    // by the BUFR tables...
    const bytes_remaining = result.section_length - (bits_read / 8);
    // read bytes_remaining in 8-bit chunks and store the slice
    @field(result, "template_data") = try bit_reader.read_slice(u8, bytes_remaining);

    // SectionData is just a tagged union that 
    // we can use to return a common type for 
    // all section results.
    return SectionData{ .section4 = result };
}

There is a decent amount going on here, and it may be hard to truly comprehend unless you’ve spent some time with Zig or take the time to digest it, but it should feel somewhat magical. I tried to leave some insightful comments, so be sure to give them a close read. As previously mentioned, the overall workflow is to iterate over the compile-time template definition, get the bit-width of each type, read that number of bits, and store the result. A small snippet of decoding the raw bytes follows; it assumes Sections 0 through 3 have already been successfully decoded.

// fb is just the import alias for the library,
// short for 'fastbufr' and subject to change...
// The use of 'comptime' tells the compiler to 
// evaluate this statement during compile-time. 
// Perhaps not entirely necessary, but it makes 
// it clear to me when this gets evaluated.
const sec4 = comptime fb.sections.BUFRSectionTemplate{
    .section4 = fb.definitions.Section4.Template,
};

// at runtime, use the template definitions to 
// decode each set of bytes required for the Section4 
// fields.
const sec4_data = try fb.sections.decode(sec4, &bit_reader);
std.debug.print("{any}\n", .{sec4_data});
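
For completeness: SectionData itself never appears in the snippets here. Below is a minimal sketch of what such a tagged union could look like. This is my guess at the shape, with stand-in payload types, not the actual fastbufr code:

```zig
const std = @import("std");

// Stand-in payload structs; the real ones are generated per section.
const Section0Data = struct { total_length: u24 };
const Section4Data = struct { section_length: u24 };

/// A tagged union lets every section decoder return one common type,
/// while callers can switch on which section they actually received.
pub const SectionData = union(enum) {
    section0: Section0Data,
    section4: Section4Data,

    pub fn length(self: SectionData) u24 {
        return switch (self) {
            .section0 => |s| s.total_length,
            .section4 => |s| s.section_length,
        };
    }
};
```

The switch over the active tag is exactly the kind of runtime/compile-time dispatch listed among Zig’s strengths earlier.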

The last little bit of magic that makes all of this work is the make_section_return_type function I glossed over. This function heavily uses reflection: it tells the compiler to create a brand-new struct type from the array of parameters in the BUFRSection. It uses the compiler builtins to describe what the internal representation of the struct should look like, including using the string names of the parameters as field names, and returns that type.

pub fn make_section_return_type(comptime section: BUFRSection) type {
    comptime var fields: [section.parameters.len]std.builtin.Type.StructField = undefined;

    inline for (section.parameters, 0..) |param, i| {
        fields[i] = .{
            .name = @ptrCast(param.name),
            .type = param.dtype,
            .default_value_ptr = null,
            .is_comptime = false,
            .alignment = @alignOf(param.dtype),
        };
    }

    return @Type(.{ .@"struct" = .{
        .layout = .auto,
        .fields = &fields,
        .decls = &.{},
        .is_tuple = false,
    } });
}
// In the case of Section 4, the return type 
// looks something like...
// .{
//     .section_length: u24,
//     .reserved_bits: u8,
//     .template_data: []const u8,
// }

All of this happens while the compiler is generating the code; none of it happens while the code is decoding BUFR messages. This function is fully reusable and can create return types for Section 0 through Section 5! The heavily structured nature of BUFR data, combined with reflection, turns otherwise tedious and complicated code into something relatively simple and heavily reusable. It also means it is straightforward to encode the logic of previous BUFR versions (BUFR 3), or even future versions if they come around. There is some code still not being shown here, but this is the meat and potatoes of how it gets done… and the success of getting this to work gives me hope that the same can be done for the BUFR Tables needed to decode the remaining data.

Oh, and by the way… everything I just demonstrated to you performs the decoding of the BUFR sections with zero dynamic memory allocation. Yeah, go ahead and let that sink in. This would, in theory, fulfill at least one of the NASA requirements for safety-critical code!
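
As an aside, the Debug Allocator mentioned earlier is what makes claims like this easy to audit. A minimal leak-check harness (a hypothetical demo, not code from the decoder) looks like:

```zig
const std = @import("std");

/// A minimal leak-check harness: the general-purpose debug allocator
/// reports any allocation that was never freed when it is deinitialized.
pub fn leakCheckDemo() bool {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();

    const bytes = allocator.alloc(u8, 64) catch return false;
    allocator.free(bytes); // comment this out and deinit() reports a leak

    return gpa.deinit() == .ok;
}
```

Running the section decoders under an allocator like this, and seeing nothing tracked at all, is a satisfying way to confirm the zero-allocation claim.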


Conclusions and Part 3…

This was a lot to digest and cover, and hopefully now you can understand why I separated this into parts. My goal for these articles has been to motivate this task, give some insight into the technical details of BUFR, and paint a mental picture of how a language like Zig is extremely well suited for the structure and complexity of BUFR data. My worry is that, even after breaking things up into separate parts, this material is so niche and so dense that I’ve likely lost most people by the end. However, I have found the process of tackling this challenge deeply rewarding, and I have learned a lot of new tools that I look forward to applying to other challenges in the future. I hope I have done an okay-enough job that at least some of this makes sense, but if not, don’t hesitate to reach out and ask questions. Clearly, I like talking about it.

This also is not the end of the BUFR decoding saga. While this was a monumental leap forward and enabled me to fully parse all of the sections within a BUFR message, I am still deep in the weeds of writing the logic to use the BUFR tables to actually decode data. Using a very similar strategy, and still allowing no runtime reading of table files, I have been parsing and formatting the tables into code that can be compiled. More accurately, I have been writing code that downloads and parses the BUFR tables from the WMO and turns them into Zig code. In the process, I even found a (very minor) table formatting error and let the WMO know! I have yet to discuss how BUFR descriptors and tables work, but the general idea of pipelining reads remains the overall end-goal.
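
At its core, the table-to-code step amounts to ordinary string formatting: parse a table row, emit a line of Zig source. A toy version follows; the row fields and names here are invented for illustration, and the real generator is more involved:

```zig
const std = @import("std");

/// Render one (hypothetical) table entry as a line of Zig source code.
/// The generated lines would be collected into a file that gets
/// compiled into the decoder, so no table is read at runtime.
pub fn emitEntry(writer: anytype, fxy: []const u8, name: []const u8, bit_width: u8) !void {
    try writer.print(
        ".{{ .fxy = \"{s}\", .name = \"{s}\", .bit_width = {d} }},\n",
        .{ fxy, name, bit_width },
    );
}
```

Looping a function like this over every parsed table row produces a Zig array literal that the compiler can then treat exactly like the hand-written Template definitions shown earlier.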

It will likely be a little further in the future, but there is room for a Part 3 that discusses:

  • The structure of BUFR Descriptors
  • The structure of BUFR Tables
  • How BUFR Descriptors and Tables can be used to decode the raw bytes of data.

It has taken a not insignificant amount of time and effort to put this all together, so I welcome a little bit of a break while I solve the remaining problems… but if this was useful and insightful, or even if you have suggestions for how to improve this sort of content in the future, please let me know. Hearing from you helps motivate me to write stuff like this more often.

Until then, I descend back into madness on my quest for glorious purpose, attempting to turn BUFR Sequence Descriptors into a tree of decoding operations…

BUFR is literally going to be the death of me...