Core primitive types for a blockchain in Rust

Why primitive types are the foundation

A blockchain is, at its core, a pile of bytes with strict rules about what each pile means. A 32-byte array might be a block hash, a transaction hash, or a public key digest. A 20-byte array might be an account address. A u128 might be a token balance, a nonce, or a gas price. If you pass these around as bare [u8; 32] and u128, nothing stops you from hashing an address, using a balance as a nonce, or adding two amounts that are denominated differently.

The fix is the newtype pattern: a tuple struct with a single field that wraps the raw representation and gives it a distinct, named identity. Newtypes are a zero-cost abstraction — they compile down to the inner type with no runtime overhead — but at compile time they are completely incompatible with each other and with the underlying primitive.
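A minimal sketch of both claims, using two hypothetical wrappers (BlockHash, TxHash, and first_byte are illustrative names, not the types built later in this lesson): each wrapper is the same size as the raw array, yet a function that asks for one will not accept the other.

```rust
use std::mem::size_of;

// Hypothetical wrappers, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BlockHash([u8; 32]);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TxHash([u8; 32]);

// Accepts a BlockHash and nothing else.
fn first_byte(h: BlockHash) -> u8 {
    h.0[0]
}

fn main() {
    let block = BlockHash([7u8; 32]);
    let tx = TxHash([7u8; 32]);

    // The wrapper adds no bytes on top of the raw array.
    assert_eq!(size_of::<BlockHash>(), size_of::<[u8; 32]>());
    assert_eq!(size_of::<TxHash>(), 32);

    assert_eq!(first_byte(block), 7);
    // first_byte(tx);        // would not compile: expected `BlockHash`, found `TxHash`
    // first_byte([7u8; 32]); // would not compile: expected `BlockHash`, found an array

    // The inner bytes are identical, but `block == tx` would not compile either.
    assert_eq!(block.0, tx.0);
}
```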

We’ll build three: Hash, Address, and Amount. By the end you’ll have types that the compiler defends for you, with the derives and serialization a real node needs.

The newtype pattern, applied to a Hash

Start with the simplest case: a cryptographic hash is exactly 32 bytes. We use a fixed-size array [u8; 32] rather than a Vec<u8> because the size is an invariant: it is known at compile time, the bytes are stored inline with no heap allocation, and the array is Copy. A Vec would heap-allocate and force you to handle a “wrong length” error path that should never exist.

hash.rs

/// A 32-byte cryptographic hash. The inner array is private so callers
/// cannot construct an arbitrary `Hash` without going through our API.
#[derive(Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Hash([u8; 32]);

impl Hash {
    pub const ZERO: Hash = Hash([0u8; 32]);

    /// Wrap raw bytes. Construction is explicit so the type means something.
    pub const fn new(bytes: [u8; 32]) -> Self {
        Hash(bytes)
    }

    /// Borrow the inner bytes for hashing, serialization, comparison, etc.
    pub fn as_bytes(&self) -> &[u8; 32] {
        &self.0
    }
}

Two things deserve attention.

The field is private (Hash([u8; 32]) with no pub). Outside this module you cannot write Hash(some_array); you must call Hash::new. That gives you a single chokepoint to add validation later without breaking callers.

The derives are deliberate. Clone and Copy because 32 bytes is cheap to copy and value semantics are convenient. PartialEq and Eq so you can compare hashes. Hash (the std trait, which confusingly shares our type's name) so a Hash can be a HashMap key, which is essential for indexing blocks and transactions. PartialOrd and Ord so hashes sort deterministically, which matters when you canonicalize sets of items before hashing them.
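To make the role of those derives concrete, here is a small sketch. The Hash type is repeated from above, with a derived Debug added only so assertions can print it; the block heights and the canonical_order helper are made up for illustration.

```rust
use std::collections::{BTreeSet, HashMap};

// `Debug` is derived here only so the assertions below can print values.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Hash([u8; 32]);

impl Hash {
    pub const fn new(bytes: [u8; 32]) -> Self {
        Hash(bytes)
    }
}

/// Hypothetical helper: sort hashes into a canonical order and drop duplicates.
/// Relies on `Ord`, which for byte arrays compares lexicographically.
pub fn canonical_order(hashes: &[Hash]) -> Vec<Hash> {
    let set: BTreeSet<Hash> = hashes.iter().copied().collect();
    set.into_iter().collect()
}

fn main() {
    let a = Hash::new([0x01; 32]);
    let b = Hash::new([0x02; 32]);

    // `Eq` + `Hash` let the type act as a HashMap key.
    let mut height_by_hash: HashMap<Hash, u64> = HashMap::new();
    height_by_hash.insert(a, 10);
    height_by_hash.insert(b, 11);
    assert_eq!(height_by_hash.get(&a), Some(&10));

    // `Ord` yields the same order however the input was arranged.
    assert_eq!(canonical_order(&[b, a, b]), vec![a, b]);
}
```

Because Copy is derived, a and b remain usable after being inserted into the map.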

Display and parsing: hex is the lingua franca

Raw bytes are unreadable in logs and JSON. Hashes are conventionally rendered as hex. We implement Display/Debug for output and a from_hex for input, using the hex crate.

Cargo.toml

[dependencies]
hex = "0.4"

hash_hex.rs

use std::fmt;
// Minimal stand-in for the Hash from hash.rs so this snippet compiles on its own.
pub struct Hash([u8; 32]);

impl Hash {
    pub fn new(b: [u8; 32]) -> Self {
        Hash(b)
    }

    pub fn as_bytes(&self) -> &[u8; 32] {
        &self.0
    }
}

impl fmt::Display for Hash {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "0x{}", hex::encode(self.0))
    }
}

impl fmt::Debug for Hash {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Reuse Display so debug output is the same readable hex.
        write!(f, "Hash({self})")
    }
}

impl Hash {
    /// Parse from a hex string, with or without a leading "0x".
    pub fn from_hex(s: &str) -> Result<Self, hex::FromHexError> {
        let s = s.strip_prefix("0x").unwrap_or(s);
        let mut out = [0u8; 32];
        hex::decode_to_slice(s, &mut out)?;
        Ok(Hash::new(out))
    }
}

Note that hex::decode_to_slice decodes directly into our fixed array and fails unless the input is exactly 64 hex characters (32 bytes). The length invariant is enforced at the boundary, so internal code never deals with a wrong-sized hash.
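To see what that boundary check amounts to, here is a rough std-only sketch of the same idea. It is a stand-in for illustration, not how the hex crate is implemented; parse_hash_hex, nibble, and ParseError are hypothetical names.

```rust
/// Hypothetical error type for this sketch (the `hex` crate has its own).
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    WrongLength { expected: usize, found: usize },
    /// Index counts from the first digit after any "0x" prefix.
    InvalidDigit { index: usize },
}

/// Decode exactly 32 bytes from hex, with or without a leading "0x".
pub fn parse_hash_hex(s: &str) -> Result<[u8; 32], ParseError> {
    let s = s.strip_prefix("0x").unwrap_or(s);
    let digits = s.as_bytes();
    // Check the length first, so nothing downstream sees a short or long hash.
    if digits.len() != 64 {
        return Err(ParseError::WrongLength { expected: 64, found: digits.len() });
    }
    let mut out = [0u8; 32];
    for (i, pair) in digits.chunks_exact(2).enumerate() {
        let hi = nibble(pair[0]).ok_or(ParseError::InvalidDigit { index: 2 * i })?;
        let lo = nibble(pair[1]).ok_or(ParseError::InvalidDigit { index: 2 * i + 1 })?;
        out[i] = (hi << 4) | lo;
    }
    Ok(out)
}

/// Value of one ASCII hex digit, or None if the byte is not a hex digit.
fn nibble(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

fn main() {
    let ok = format!("0x{}", "ab".repeat(32));
    assert_eq!(parse_hash_hex(&ok), Ok([0xab; 32]));

    // Too short: rejected before any decoding happens.
    assert_eq!(
        parse_hash_hex("0xabcd"),
        Err(ParseError::WrongLength { expected: 64, found: 4 })
    );

    // Right length, but not hex.
    let bad = "zz".repeat(32);
    assert_eq!(parse_hash_hex(&bad), Err(ParseError::InvalidDigit { index: 0 }));
}
```

In real code the hex crate remains the better choice; the sketch only shows that the wrong-length case is settled once, at the edge.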

Address: a different size, a different meaning

An address is structurally similar (a fixed byte array) but semantically distinct. Making it a separate newtype is the whole point: the compiler will reject fn transfer(to: Address) called with a Hash. We’ll use 20 bytes here, a common account-address width. The size difference is incidental, though: even if Address were also 32 bytes, the two newtypes would still be different types, and Hash and Address could not be silently confused.

address.rs

#[derive(Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Address([u8; 20]);

impl Address {
    pub const fn new(bytes: [u8; 20]) -> Self {
        Address(bytes)
    }
    pub fn as_bytes(&self) -> &[u8; 20] {
        &self.0
    }
    /// Derive an address from the trailing 20 bytes of a 32-byte key hash.
    pub fn from_key_hash(hash: &[u8; 32]) -> Self {
        let mut out = [0u8; 20];
        out.copy_from_slice(&hash[12..32]);
        Address(out)
    }
}

from_key_hash is the kind of domain logic newtypes let you centralize: the rule “an address is the last 20 bytes of a public-key hash” lives in exactly one place.
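A quick check of which bytes survive, as a sketch: the Address type is repeated from above with a derived Debug added, and the counting key hash (byte i holds the value i) is a made-up stand-in for a real public-key hash.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Address([u8; 20]);

impl Address {
    pub fn as_bytes(&self) -> &[u8; 20] {
        &self.0
    }

    /// Derive an address from the trailing 20 bytes of a 32-byte key hash.
    pub fn from_key_hash(hash: &[u8; 32]) -> Self {
        let mut out = [0u8; 20];
        out.copy_from_slice(&hash[12..32]);
        Address(out)
    }
}

fn main() {
    // Made-up input: byte i holds the value i, so positions are easy to read off.
    let key_hash: [u8; 32] = std::array::from_fn(|i| i as u8);

    let addr = Address::from_key_hash(&key_hash);

    // The leading 12 bytes are dropped; the address is bytes 12 through 31.
    assert_eq!(addr.as_bytes()[0], 12);
    assert_eq!(addr.as_bytes()[19], 31);
    assert_eq!(addr.as_bytes().len(), 20);
}
```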

Amount: wrapping an integer, not bytes

Money is the type that most rewards the newtype treatment, because integer arithmetic silently does the wrong thing. A balance must never overflow into wraparound, and you must not accidentally mix raw counts with denominated amounts. We wrap a u128 (wide enough for large supplies with many decimal places) and expose only checked arithmetic.

amount.rs

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
pub struct Amount(u128);

impl Amount {
    pub const ZERO: Amount = Amount(0);

    pub const fn from_raw(v: u128) -> Self {
        Amount(v)
    }
    pub const fn raw(&self) -> u128 {
        self.0
    }

    /// Returns None on overflow rather than panicking or wrapping.
    pub fn checked_add(self, rhs: Amount) -> Option<Amount> {
        self.0.checked_add(rhs.0).map(Amount)
    }

    /// Returns None if the result would be negative (insufficient funds).
    pub fn checked_sub(self, rhs: Amount) -> Option<Amount> {
        self.0.checked_sub(rhs.0).map(Amount)
    }
}

Crucially, we do not implement std::ops::Add/Sub. With the conventional Output = Amount, the + operator has no way to report failure, so an implementation would have to panic or wrap on overflow, and neither belongs in consensus-critical code. (Add could technically declare Output = Option<Amount>, but the results would not chain and would surprise readers.) Forcing every caller to write a.checked_add(b)? makes the overflow case impossible to ignore. This is the newtype pattern earning its keep: we removed a footgun that a bare u128 hands you by default. (See Effective Rust, Item 6 for more on encoding invariants this way.)
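Here is how the checked API might read at a call site. This is a sketch under assumptions: the transfer function and the TransferError enum are hypothetical and not part of the lesson's types, and Amount is repeated from above with a derived Debug added so results can be compared in assertions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Default)]
pub struct Amount(u128);

impl Amount {
    pub const fn from_raw(v: u128) -> Self {
        Amount(v)
    }
    pub fn checked_add(self, rhs: Amount) -> Option<Amount> {
        self.0.checked_add(rhs.0).map(Amount)
    }
    pub fn checked_sub(self, rhs: Amount) -> Option<Amount> {
        self.0.checked_sub(rhs.0).map(Amount)
    }
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum TransferError {
    InsufficientFunds,
    BalanceOverflow,
}

/// Hypothetical helper: compute the new (sender, recipient) balances.
/// Nothing is mutated, so a failure leaves the caller's balances as they were.
pub fn transfer(
    from: Amount,
    to: Amount,
    value: Amount,
) -> Result<(Amount, Amount), TransferError> {
    let new_from = from.checked_sub(value).ok_or(TransferError::InsufficientFunds)?;
    let new_to = to.checked_add(value).ok_or(TransferError::BalanceOverflow)?;
    Ok((new_from, new_to))
}

fn main() {
    let r = transfer(Amount::from_raw(100), Amount::from_raw(5), Amount::from_raw(30));
    assert_eq!(r, Ok((Amount::from_raw(70), Amount::from_raw(35))));

    // Spending more than the balance is an error value, not a wrapped number.
    let r = transfer(Amount::from_raw(10), Amount::from_raw(0), Amount::from_raw(11));
    assert_eq!(r, Err(TransferError::InsufficientFunds));
}
```

Mapping each None to a named error with ok_or keeps the ? operator usable while telling the caller which side of the transfer failed.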

Serialization: the network and disk boundary

Nodes serialize these types constantly: to gossip blocks, to persist state, to answer RPC. serde is the standard way to make a type serializable without hand-writing the logic. We derive Serialize and Deserialize on each newtype and mark it #[serde(transparent)], which makes the wrapper serialize exactly as its inner field: an Amount as a bare number and a Hash as a bare array of bytes. Many formats, serde_json included, already unwrap a single-field tuple struct, but the attribute states the intent explicitly and guarantees that no format adds a wrapper layer.

Cargo.toml

[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"

serialize.rs

use serde::{Serialize, Deserialize};

#[derive(Clone, Copy, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Amount(u128);

#[derive(Clone, Copy, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Hash([u8; 32]);

For on-disk and wire formats you would typically pair this with a compact binary codec (e.g. bincode), but the derive is identical — serde decouples the what from the how.

Proving it works

A test ties it together: arithmetic respects overflow, hex round-trips, and — the headline guarantee — the types don’t unify. The last point is enforced by the compiler, so we demonstrate it in a comment rather than a failing assertion.

tests.rs

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn amount_arithmetic_is_safe() {
        let a = Amount::from_raw(100);
        let b = Amount::from_raw(u128::MAX);
        assert_eq!(a.checked_add(Amount::from_raw(1)).unwrap().raw(), 101);
        assert!(a.checked_add(b).is_none()); // overflow -> None, no panic
        assert!(Amount::ZERO.checked_sub(a).is_none()); // underflow -> None
    }

    #[test]
    fn hash_hex_roundtrips() {
        let h = Hash::new([0xab; 32]);
        let s = h.to_string();
        assert_eq!(s, format!("0x{}", "ab".repeat(32)));
        assert_eq!(Hash::from_hex(&s).unwrap(), h);
    }

    // The following would NOT compile, which is the entire point:
    //   fn needs_addr(_: Address) {}
    //   needs_addr(Hash::ZERO); // error: expected `Address`, found `Hash`
}

Takeaways

Wrap every domain concept — hashes, addresses, amounts — in a newtype with a private field. The compiler then refuses to confuse them, at zero runtime cost.
Use fixed-size arrays ([u8; N]) for hashes and addresses so the length invariant is structural, and validate it once at the parsing boundary.
Choose derives intentionally: Eq/Hash for map keys, Ord for deterministic ordering, Copy only when the type is genuinely small and value-semantic.
Expose checked arithmetic for amounts and omit the Add/Sub operators so overflow can’t slip silently into consensus code.
Derive serde with #[serde(transparent)] so wrappers don’t pollute your wire format.

In the next lesson we put these to work: hashing real data into a Hash and assembling those hashes into a Merkle tree with verifiable inclusion proofs.