Chapter 2: Mathematical Foundations of Fractals¶

This work is licensed under a Creative Commons Attribution 4.0 International License.

This chapter develops the mathematical framework for understanding and measuring fractals. We begin with the fundamental question of dimension, then explore increasingly sophisticated tools for quantifying fractal geometry.

2.1 Topological vs Fractal Dimension¶

Integer Dimensions in Topology¶

In classical topology, dimension is an integer that describes an object's fundamental structure:

A point has dimension 0
A line has dimension 1
A plane has dimension 2
A volume has dimension 3

The topological dimension \(D_T\) of a set is formally defined through covering properties. A set has topological dimension \(n\) if every point has arbitrarily small neighborhoods whose boundaries have dimension \(n-1\), and \(n\) is the smallest such integer.

For a line segment, any small neighborhood is an interval, and its boundary consists of two points (dimension 0). Hence, the line has topological dimension 1. This recursive definition grounds our intuitive understanding of dimension.

Why Fractals Have Non-Integer Dimensions¶

Fractals challenge this framework. Consider the Koch curve: it is constructed from a line segment, so topologically it remains one-dimensional. Yet it exhibits properties that suggest something more complex:

Infinite length in finite space: The Koch curve has infinite length contained within a bounded region
Space-filling tendency: It occupies more "space" than a simple curve
Anomalous scaling: When magnified by factor 3, it contains 4 self-similar copies

These observations suggest that the Koch curve is "more than" one-dimensional but "less than" two-dimensional. This intuition leads us to fractal dimension.

Dimension as a Scaling Exponent¶

The key insight is to redefine dimension through scaling behavior. For regular geometric objects:

A line segment scaled by factor \(r\) produces \(r^1\) copies
A square scaled by factor \(r\) produces \(r^2\) copies
A cube scaled by factor \(r\) produces \(r^3\) copies

In general, scaling by factor \(r\) produces \(N = r^D\) copies, where \(D\) is the dimension. Solving for \(D\):

\[ D = \frac{\log N}{\log r} \]

For the Koch curve, scaling by \(r = 3\) produces \(N = 4\) copies, yielding:

\[ D = \frac{\log 4}{\log 3} \approx 1.262 \]

This non-integer value captures the Koch curve's intermediate complexity between a line and a plane.

2.2 Hausdorff-Besicovitch Dimension¶

The Hausdorff dimension provides the most rigorous mathematical foundation for fractal dimension. It builds on measure theory to assign a dimension to any set.

Hausdorff Measure¶

For a set \(S \subset \mathbb{R}^n\) and a positive real number \(d\), the \(d\)-dimensional Hausdorff measure is constructed as follows.

First, define the \(d\)-dimensional Hausdorff content at scale \(\delta\):

\[ H^d_\delta(S) = \inf \left\{ \sum_{i=1}^{\infty} (\text{diam } U_i)^d : S \subseteq \bigcup_{i=1}^{\infty} U_i, \; \text{diam } U_i < \delta \right\} \]

Here we cover \(S\) with countably many sets \(U_i\) of diameter less than \(\delta\), then sum the \(d\)-th powers of their diameters. The infimum is taken over all such coverings.

The Hausdorff measure is then:

\[ H^d(S) = \lim_{\delta \to 0} H^d_\delta(S) \]

Definition of Hausdorff Dimension¶

The crucial observation is that \(H^d(S)\) exhibits a threshold behavior:

For \(d < D_H\): \(H^d(S) = \infty\)
For \(d > D_H\): \(H^d(S) = 0\)

The Hausdorff dimension is this critical value:

\[ D_H = \inf\{d \geq 0 : H^d(S) = 0\} = \sup\{d \geq 0 : H^d(S) = \infty\} \]

At \(d = D_H\), the measure \(H^{D_H}(S)\) may be zero, positive and finite, or infinite.

Example: The Cantor Set¶

The middle-thirds Cantor set \(C\) is constructed by:

Start with \([0,1]\)
Remove the middle third \((1/3, 2/3)\)
Repeat for each remaining interval

At stage \(n\), we have \(2^n\) intervals, each of length \(3^{-n}\). The Cantor set is what remains after infinitely many iterations.

To compute \(D_H\): scaling by factor 3 produces 2 copies, so:

\[ D_H = \frac{\log 2}{\log 3} \approx 0.631 \]

This value reflects that the Cantor set is "more than" a finite collection of points (dimension 0) but "less than" a line segment (dimension 1).

Example: The Koch Curve¶

For the Koch curve, each iteration replaces one segment with four segments, each scaled by factor ⅓:

\[ D_H = \frac{\log 4}{\log 3} \approx 1.262 \]

When to Use Hausdorff Dimension¶

Hausdorff dimension is the gold standard for theoretical work because:

It is defined for any subset of \(\mathbb{R}^n\)
It satisfies desirable mathematical properties (monotonicity, countable stability)
It agrees with topological dimension for smooth manifolds

However, computing \(H^d(S)\) directly is often impractical. For applications, we turn to more computationally tractable alternatives.

2.3 Box-Counting (Minkowski-Bouligand) Dimension¶

The box-counting dimension provides a practical method for estimating fractal dimension from data.

Definition¶

Cover the set \(S\) with a grid of \(n\)-dimensional boxes (hypercubes) of side length \(\varepsilon\). Let \(N(\varepsilon)\) denote the number of boxes that intersect \(S\). The box-counting dimension is:

\[ D_B = \lim_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log(1/\varepsilon)} \]

When this limit exists, it equals the box-counting dimension. More generally, we define upper and lower box dimensions using \(\limsup\) and \(\liminf\).

Computational Procedure¶

Discretize: Overlay a grid of boxes with side length \(\varepsilon\) on the set
Count: Determine \(N(\varepsilon)\), the number of boxes containing part of the set
Repeat: Vary \(\varepsilon\) across multiple scales (typically powers of 2)
Regress: Plot \(\log N(\varepsilon)\) vs \(\log(1/\varepsilon)\) and fit a line
Extract: The slope estimates \(D_B\)

Example Calculation¶

For a filled square of side 1:

\(\varepsilon\)	\(1/\varepsilon\)	\(N(\varepsilon)\)	\(\log N\)	\(\log(1/\varepsilon)\)
½	2	4	0.602	0.301
¼	4	16	1.204	0.602
⅛	8	64	1.806	0.903
1/16	16	256	2.408	1.204

The slope is \(2.408/1.204 = 2\), confirming the expected dimension.

Advantages¶

Computationally tractable: Easy to implement for digital images and point sets
Scale-independent: Works across multiple orders of magnitude
Widely applicable: Can be applied to empirical data without knowing the generating process

Limitations¶

Boundary effects: Boxes at boundaries may skew counts
Scale range: Results depend on the range of \(\varepsilon\) values chosen
Resolution limits: Digital images have inherent pixelation that limits minimum \(\varepsilon\)
Not equivalent to Hausdorff: For some sets, \(D_B \neq D_H\)

Relationship to Hausdorff Dimension¶

For any set \(S\):

\[ D_H \leq \underline{D}_B \leq \overline{D}_B \]

where \(\underline{D}_B\) and \(\overline{D}_B\) are the lower and upper box dimensions. For self-similar sets satisfying the open set condition, all three are equal.

2.4 Differential Box-Counting for Grayscale and Continuous Data¶

Standard box-counting applies to binary (presence/absence) data. The differential box-counting (DBC) method extends this to grayscale images and continuous surfaces.

Extension to Grayscale Images¶

For an \(M \times M\) grayscale image with intensity values in \([0, G-1]\):

Partition: Divide the image into non-overlapping \(s \times s\) pixel blocks
Scale intensity: Map intensity to a third dimension, creating a 3D surface
Cover with boxes: Use 3D boxes of size \(s \times s \times s'\) where \(s' = s \cdot G/M\)
Count contribution: For each \((i,j)\) block, let \(n_r(i,j)\) be the number of boxes needed to cover the intensity column:

\[ n_r(i,j) = \left\lceil \frac{\max(I) - \min(I) + 1}{s'} \right\rceil \]

where \(\max(I)\) and \(\min(I)\) are the maximum and minimum intensities in block \((i,j)\).

Sum: \(N_r = \sum_{i,j} n_r(i,j)\)
Regress: Compute \(D_B\) from the slope of \(\log N_r\) vs \(\log(1/r)\)

Methodology for 3D Data¶

For volumetric data such as canopy height models (CHMs) or LiDAR point clouds:

Voxelize: Partition 3D space into cubic voxels of side \(\varepsilon\)
Mark occupied voxels: A voxel is occupied if it contains data points or intersects the surface
Count: \(N(\varepsilon)\) = number of occupied voxels
Multi-scale analysis: Repeat for multiple \(\varepsilon\) values
Estimate dimension: Linear regression of \(\log N(\varepsilon)\) vs \(\log(1/\varepsilon)\)

Applications in Remote Sensing¶

Differential box-counting is valuable for characterizing:

Forest canopy structure: Higher \(D_B\) indicates more complex, multi-layered canopies
Urban morphology: City skylines exhibit fractal scaling across scales
Terrain roughness: Mountainous regions show higher fractal dimensions than plains
Coral reef complexity: Reef structure correlates with biodiversity

For a canopy height model, typical fractal dimensions range from 2.0 (flat surface) to 2.9 (highly complex structure), providing a single metric that captures structural complexity across scales.

2.5 Power Laws and Scaling Relations¶

The fundamental relationship underlying fractal analysis is the power law relating count to scale.

The Fundamental Scaling Relationship¶

For a fractal set, the number of features \(N\) at scale \(\varepsilon\) follows:

\[ N(\varepsilon) \propto \varepsilon^{-D} \]

or equivalently:

\[ N(\varepsilon) = C \cdot \varepsilon^{-D} \]

where \(C\) is a constant and \(D\) is the fractal dimension.

Taking logarithms:

\[ \log N(\varepsilon) = \log C - D \log \varepsilon = \log C + D \log(1/\varepsilon) \]

This is a linear equation with slope \(D\).

Log-Log Plots and Linear Regression¶

The practical procedure:

Collect data: Measure \(N(\varepsilon_i)\) for multiple scales \(\varepsilon_i\)
Transform: Compute \(x_i = \log(1/\varepsilon_i)\) and \(y_i = \log N(\varepsilon_i)\)
Fit line: Use least-squares regression: \(y = mx + b\)
Extract dimension: \(D = m\) (the slope)

Example: Box-counting data for a coastline

\(\varepsilon\) (km)	\(N(\varepsilon)\)	\(\log(1/\varepsilon)\)	\(\log N\)
100	12	-2.00	1.08
50	28	-1.70	1.45
25	68	-1.40	1.83
12.5	162	-1.10	2.21
6.25	390	-0.80	2.59

Linear regression yields slope \(\approx 1.26\), suggesting \(D \approx 1.26\).

Statistical Considerations¶

Rigorous fractal analysis requires attention to:

Scale range: The power law may hold only over a limited range. Plot residuals to identify deviations at extreme scales.

Goodness of fit: Report \(R^2\) values. Fractal behavior requires \(R^2 > 0.99\) over at least one order of magnitude.

Confidence intervals: Bootstrap resampling provides uncertainty estimates for \(D\).

Sample size: More scale values improve estimates. Use at least 10-15 scales spanning 2+ orders of magnitude.

Binning effects: Different box sizes or bin widths can affect results. Test sensitivity to discretization choices.

2.6 The Hurst Exponent and Fractional Brownian Motion¶

The Hurst exponent characterizes the long-range dependence structure of time series and spatial processes.

Definition of the Hurst Exponent¶

For a time series \(X(t)\), the Hurst exponent \(H\) describes how the range of cumulative deviations scales with time:

\[ \mathbb{E}[R(n)/S(n)] \propto n^H \]

where: - \(R(n)\) is the range of cumulative deviations over \(n\) observations - \(S(n)\) is the standard deviation over \(n\) observations - \(H \in [0, 1]\)

This is the rescaled range (R/S) analysis introduced by Harold Edwin Hurst in his studies of Nile River floods.

Fractional Brownian Motion¶

Fractional Brownian motion (fBm) \(B_H(t)\) is a continuous-time stochastic process that generalizes standard Brownian motion:

\(B_H(0) = 0\)
\(B_H(t)\) has stationary increments
\(\mathbb{E}[B_H(t)] = 0\)
\(\mathbb{E}[(B_H(t) - B_H(s))^2] = |t-s|^{2H}\)

For \(H = 0.5\), fBm reduces to standard Brownian motion with independent increments.

Relationship to Fractal Dimension¶

For a one-dimensional fBm trace (the graph of \(B_H(t)\) vs \(t\)), the fractal dimension is:

\[ D = 2 - H \]

For a surface generated by 2D fBm:

\[ D = 3 - H \]

This relationship connects temporal correlation structure (Hurst) with geometric complexity (fractal dimension).

Persistence vs Anti-Persistence¶

The Hurst exponent reveals the correlation structure of increments:

\(H = 0.5\): Random walk (Brownian motion) - Increments are uncorrelated - No memory of past behavior - \(D = 1.5\) for the trace

\(H > 0.5\): Persistent (trending) behavior - Positive correlation: increases tend to follow increases - Long-range positive dependence - Smoother paths (\(D < 1.5\)) - Examples: climate records, stock market momentum

\(H < 0.5\): Anti-persistent (mean-reverting) behavior - Negative correlation: increases tend to follow decreases - Long-range negative dependence - Rougher paths (\(D > 1.5\)) - Examples: regulated systems, some heartbeat intervals

Applications in Time Series Analysis¶

The Hurst exponent is used to:

Detect long-range dependence in hydrological, financial, and physiological time series
Characterize surface roughness from profile measurements
Distinguish noise types: White noise (\(H = 0.5\)), pink noise (\(H \approx 1\)), Brownian noise (\(H = 1\))
Forecast volatility in financial markets

Estimation Methods¶

Several methods estimate \(H\):

R/S analysis: Original method; prone to bias for short series
Detrended Fluctuation Analysis (DFA): Robust to trends and non-stationarity
Wavelet-based estimation: Efficient for long series
Periodogram regression: Based on spectral density slope

For a power spectrum \(S(f) \propto f^{-\beta}\), the relationship is:

\[ H = \frac{\beta - 1}{2} \]

2.7 Lacunarity: Measuring Texture and Gaps¶

Fractal dimension alone does not fully characterize a fractal's appearance. Lacunarity quantifies the distribution of gaps (lacunae) and complements fractal dimension.

Definition and Intuition¶

Two fractals can have identical fractal dimensions but look quite different due to their gap structure. Lacunarity measures this "gappiness" or texture:

High lacunarity: Large, heterogeneous gaps; clumped or clustered patterns
Low lacunarity: Small, uniform gaps; homogeneous, translationally invariant patterns

Formally, lacunarity \(\Lambda\) at scale \(r\) is defined through the coefficient of variation of box masses:

\[ \Lambda(r) = \frac{\text{Var}[M(r)]}{(\mathbb{E}[M(r)])^2} + 1 = \frac{\mathbb{E}[M(r)^2]}{(\mathbb{E}[M(r)])^2} \]

where \(M(r)\) is the mass (count) in a box of size \(r\).

The Gliding Box Algorithm¶

The standard method for computing lacunarity:

Choose box size \(r\)
Place box at position \((i,j)\) in the image
Count mass \(M_{ij}(r)\) = number of occupied pixels in the box
Slide box across all valid positions (gliding window)
Compute statistics:
\(Z^{(1)} = \sum M / N_{boxes}\) (mean mass)
\(Z^{(2)} = \sum M^2 / N_{boxes}\) (mean squared mass)
\(\Lambda(r) = Z^{(2)} / (Z^{(1)})^2\)
Repeat for multiple box sizes \(r\)

Multi-Scale Lacunarity¶

Lacunarity is scale-dependent. A complete characterization requires:

\[ \Lambda(r) \text{ for } r \in [r_{min}, r_{max}] \]

For fractal sets, lacunarity often follows a power law:

\[ \Lambda(r) \propto r^{-\alpha} \]

The exponent \(\alpha\) and the overall lacunarity curve shape provide additional texture information beyond fractal dimension.

Lacunarity vs Fractal Dimension¶

Property	Fractal Dimension	Lacunarity
Measures	Complexity, space-filling	Texture, gappiness
Value range	Typically 1-3	\(\geq 1\) (higher = more heterogeneous)
Scale dependence	Single value (ideally)	Function of scale
Translational invariance	Insensitive	Sensitive
Rotational invariance	Typically insensitive	Can be directional

Ecological Applications¶

Lacunarity analysis is widely used in landscape ecology:

Habitat fragmentation: High lacunarity indicates patchy, fragmented habitats
Forest structure: Gaps in canopy cover affect light availability and species composition
Urban sprawl: Lacunarity tracks changes in development patterns over time
Species distribution: Clustered vs dispersed populations have different lacunarity signatures

Example: Two forests may have the same canopy fractal dimension (say, \(D = 1.8\)), but: - Forest A: Uniform gap distribution, low lacunarity - Forest B: Clustered gaps (clearings), high lacunarity

These forests would support different ecological communities despite identical fractal dimensions.

Combined Analysis¶

Best practice combines fractal dimension and lacunarity:

Compute \(D_B\) to quantify overall complexity
Compute \(\Lambda(r)\) across scales to characterize texture
Report both metrics for complete characterization
Use \(D_B\)-\(\Lambda\) phase space to classify different pattern types

This dual approach distinguishes patterns that fractal dimension alone cannot separate.

Summary¶

This chapter introduced the mathematical toolkit for fractal analysis:

Concept	Symbol	Key Formula	Application
Hausdorff dimension	\(D_H\)	\(\inf\{d : H^d(S) = 0\}\)	Theoretical foundation
Box-counting dimension	\(D_B\)	\(\lim \frac{\log N(\varepsilon)}{\log(1/\varepsilon)}\)	Practical measurement
Power law scaling	-	\(N \propto \varepsilon^{-D}\)	Fundamental relationship
Hurst exponent	\(H\)	\(D = 2 - H\) (1D)	Time series, roughness
Lacunarity	\(\Lambda\)	\(\frac{\mathbb{E}[M^2]}{(\mathbb{E}[M])^2}\)	Texture, gaps

These tools enable rigorous quantification of fractal properties in both theoretical constructs and empirical data. The following chapters apply these foundations to specific fractal types and real-world applications.

Exercises¶

Cantor set variations: Compute the Hausdorff dimension of a Cantor set where the middle ¼ (instead of ⅓) is removed at each stage.
Box-counting practice: Given box counts \(N = \{10, 35, 130, 480, 1800\}\) at scales \(\varepsilon = \{1, 0.5, 0.25, 0.125, 0.0625\}\), estimate the fractal dimension using linear regression.
Hurst exponent interpretation: A time series has estimated \(H = 0.73\). Describe the expected behavior and calculate the fractal dimension of its trace.
Lacunarity comparison: Sketch two binary patterns with \(D_B \approx 1.5\) but different lacunarities. Explain what features distinguish them.
Scaling limits: A dataset shows \(D_B = 1.8\) for \(\varepsilon \in [0.01, 0.1]\) but \(D_B = 1.2\) for \(\varepsilon \in [0.001, 0.01]\). What might explain this scale-dependent behavior?