Skip to content

Chapter 2: Mathematical Foundations of Fractals

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

This chapter develops the mathematical framework for understanding and measuring fractals. We begin with the fundamental question of dimension, then explore increasingly sophisticated tools for quantifying fractal geometry.


2.1 Topological vs Fractal Dimension

Integer Dimensions in Topology

In classical topology, dimension is an integer that describes an object's fundamental structure:

  • A point has dimension 0
  • A line has dimension 1
  • A plane has dimension 2
  • A volume has dimension 3

The topological dimension \(D_T\) of a set is formally defined through covering properties. A set has topological dimension \(n\) if every point has arbitrarily small neighborhoods whose boundaries have dimension \(n-1\), and \(n\) is the smallest such integer.

For a line segment, any small neighborhood is an interval, and its boundary consists of two points (dimension 0). Hence, the line has topological dimension 1. This recursive definition grounds our intuitive understanding of dimension.

Why Fractals Have Non-Integer Dimensions

Fractals challenge this framework. Consider the Koch curve: it is constructed from a line segment, so topologically it remains one-dimensional. Yet it exhibits properties that suggest something more complex:

  1. Infinite length in finite space: The Koch curve has infinite length contained within a bounded region
  2. Space-filling tendency: It occupies more "space" than a simple curve
  3. Anomalous scaling: When magnified by factor 3, it contains 4 self-similar copies

These observations suggest that the Koch curve is "more than" one-dimensional but "less than" two-dimensional. This intuition leads us to fractal dimension.

Dimension as a Scaling Exponent

The key insight is to redefine dimension through scaling behavior. For regular geometric objects:

  • A line segment scaled by factor \(r\) produces \(r^1\) copies
  • A square scaled by factor \(r\) produces \(r^2\) copies
  • A cube scaled by factor \(r\) produces \(r^3\) copies

In general, scaling by factor \(r\) produces \(N = r^D\) copies, where \(D\) is the dimension. Solving for \(D\):

\[ D = \frac{\log N}{\log r} \]

For the Koch curve, scaling by \(r = 3\) produces \(N = 4\) copies, yielding:

\[ D = \frac{\log 4}{\log 3} \approx 1.262 \]

This non-integer value captures the Koch curve's intermediate complexity between a line and a plane.


2.2 Hausdorff-Besicovitch Dimension

The Hausdorff dimension provides the most rigorous mathematical foundation for fractal dimension. It builds on measure theory to assign a dimension to any set.

Hausdorff Measure

For a set \(S \subset \mathbb{R}^n\) and a positive real number \(d\), the \(d\)-dimensional Hausdorff measure is constructed as follows.

First, define the \(d\)-dimensional Hausdorff content at scale \(\delta\):

\[ H^d_\delta(S) = \inf \left\{ \sum_{i=1}^{\infty} (\text{diam } U_i)^d : S \subseteq \bigcup_{i=1}^{\infty} U_i, \; \text{diam } U_i < \delta \right\} \]

Here we cover \(S\) with countably many sets \(U_i\) of diameter less than \(\delta\), then sum the \(d\)-th powers of their diameters. The infimum is taken over all such coverings.

The Hausdorff measure is then:

\[ H^d(S) = \lim_{\delta \to 0} H^d_\delta(S) \]

Definition of Hausdorff Dimension

The crucial observation is that \(H^d(S)\) exhibits a threshold behavior:

  • For \(d < D_H\): \(H^d(S) = \infty\)
  • For \(d > D_H\): \(H^d(S) = 0\)

The Hausdorff dimension is this critical value:

\[ D_H = \inf\{d \geq 0 : H^d(S) = 0\} = \sup\{d \geq 0 : H^d(S) = \infty\} \]

At \(d = D_H\), the measure \(H^{D_H}(S)\) may be zero, positive and finite, or infinite.

Example: The Cantor Set

The middle-thirds Cantor set \(C\) is constructed by:

  1. Start with \([0,1]\)
  2. Remove the middle third \((1/3, 2/3)\)
  3. Repeat for each remaining interval

At stage \(n\), we have \(2^n\) intervals, each of length \(3^{-n}\). The Cantor set is what remains after infinitely many iterations.

To compute \(D_H\): scaling by factor 3 produces 2 copies, so:

\[ D_H = \frac{\log 2}{\log 3} \approx 0.631 \]

This value reflects that the Cantor set is "more than" a finite collection of points (dimension 0) but "less than" a line segment (dimension 1).

Example: The Koch Curve

For the Koch curve, each iteration replaces one segment with four segments, each scaled by factor ⅓:

\[ D_H = \frac{\log 4}{\log 3} \approx 1.262 \]

When to Use Hausdorff Dimension

Hausdorff dimension is the gold standard for theoretical work because:

  • It is defined for any subset of \(\mathbb{R}^n\)
  • It satisfies desirable mathematical properties (monotonicity, countable stability)
  • It agrees with topological dimension for smooth manifolds

However, computing \(H^d(S)\) directly is often impractical. For applications, we turn to more computationally tractable alternatives.


2.3 Box-Counting (Minkowski-Bouligand) Dimension

The box-counting dimension provides a practical method for estimating fractal dimension from data.

Definition

Cover the set \(S\) with a grid of \(n\)-dimensional boxes (hypercubes) of side length \(\varepsilon\). Let \(N(\varepsilon)\) denote the number of boxes that intersect \(S\). The box-counting dimension is:

\[ D_B = \lim_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log(1/\varepsilon)} \]

When this limit exists, it equals the box-counting dimension. More generally, we define upper and lower box dimensions using \(\limsup\) and \(\liminf\).

Computational Procedure

  1. Discretize: Overlay a grid of boxes with side length \(\varepsilon\) on the set
  2. Count: Determine \(N(\varepsilon)\), the number of boxes containing part of the set
  3. Repeat: Vary \(\varepsilon\) across multiple scales (typically powers of 2)
  4. Regress: Plot \(\log N(\varepsilon)\) vs \(\log(1/\varepsilon)\) and fit a line
  5. Extract: The slope estimates \(D_B\)

Example Calculation

For a filled square of side 1:

\(\varepsilon\) \(1/\varepsilon\) \(N(\varepsilon)\) \(\log N\) \(\log(1/\varepsilon)\)
½ 2 4 0.602 0.301
¼ 4 16 1.204 0.602
8 64 1.806 0.903
1/16 16 256 2.408 1.204

The slope is \(2.408/1.204 = 2\), confirming the expected dimension.

Advantages

  • Computationally tractable: Easy to implement for digital images and point sets
  • Scale-independent: Works across multiple orders of magnitude
  • Widely applicable: Can be applied to empirical data without knowing the generating process

Limitations

  • Boundary effects: Boxes at boundaries may skew counts
  • Scale range: Results depend on the range of \(\varepsilon\) values chosen
  • Resolution limits: Digital images have inherent pixelation that limits minimum \(\varepsilon\)
  • Not equivalent to Hausdorff: For some sets, \(D_B \neq D_H\)

Relationship to Hausdorff Dimension

For any set \(S\):

\[ D_H \leq \underline{D}_B \leq \overline{D}_B \]

where \(\underline{D}_B\) and \(\overline{D}_B\) are the lower and upper box dimensions. For self-similar sets satisfying the open set condition, all three are equal.


2.4 Differential Box-Counting for Grayscale and Continuous Data

Standard box-counting applies to binary (presence/absence) data. The differential box-counting (DBC) method extends this to grayscale images and continuous surfaces.

Extension to Grayscale Images

For an \(M \times M\) grayscale image with intensity values in \([0, G-1]\):

  1. Partition: Divide the image into non-overlapping \(s \times s\) pixel blocks
  2. Scale intensity: Map intensity to a third dimension, creating a 3D surface
  3. Cover with boxes: Use 3D boxes of size \(s \times s \times s'\) where \(s' = s \cdot G/M\)
  4. Count contribution: For each \((i,j)\) block, let \(n_r(i,j)\) be the number of boxes needed to cover the intensity column:
\[ n_r(i,j) = \left\lceil \frac{\max(I) - \min(I) + 1}{s'} \right\rceil \]

where \(\max(I)\) and \(\min(I)\) are the maximum and minimum intensities in block \((i,j)\).

  1. Sum: \(N_r = \sum_{i,j} n_r(i,j)\)

  2. Regress: Compute \(D_B\) from the slope of \(\log N_r\) vs \(\log(1/r)\)

Methodology for 3D Data

For volumetric data such as canopy height models (CHMs) or LiDAR point clouds:

  1. Voxelize: Partition 3D space into cubic voxels of side \(\varepsilon\)
  2. Mark occupied voxels: A voxel is occupied if it contains data points or intersects the surface
  3. Count: \(N(\varepsilon)\) = number of occupied voxels
  4. Multi-scale analysis: Repeat for multiple \(\varepsilon\) values
  5. Estimate dimension: Linear regression of \(\log N(\varepsilon)\) vs \(\log(1/\varepsilon)\)

Applications in Remote Sensing

Differential box-counting is valuable for characterizing:

  • Forest canopy structure: Higher \(D_B\) indicates more complex, multi-layered canopies
  • Urban morphology: City skylines exhibit fractal scaling across scales
  • Terrain roughness: Mountainous regions show higher fractal dimensions than plains
  • Coral reef complexity: Reef structure correlates with biodiversity

For a canopy height model, typical fractal dimensions range from 2.0 (flat surface) to 2.9 (highly complex structure), providing a single metric that captures structural complexity across scales.


2.5 Power Laws and Scaling Relations

The fundamental relationship underlying fractal analysis is the power law relating count to scale.

The Fundamental Scaling Relationship

For a fractal set, the number of features \(N\) at scale \(\varepsilon\) follows:

\[ N(\varepsilon) \propto \varepsilon^{-D} \]

or equivalently:

\[ N(\varepsilon) = C \cdot \varepsilon^{-D} \]

where \(C\) is a constant and \(D\) is the fractal dimension.

Taking logarithms:

\[ \log N(\varepsilon) = \log C - D \log \varepsilon = \log C + D \log(1/\varepsilon) \]

This is a linear equation with slope \(D\).

Log-Log Plots and Linear Regression

The practical procedure:

  1. Collect data: Measure \(N(\varepsilon_i)\) for multiple scales \(\varepsilon_i\)
  2. Transform: Compute \(x_i = \log(1/\varepsilon_i)\) and \(y_i = \log N(\varepsilon_i)\)
  3. Fit line: Use least-squares regression: \(y = mx + b\)
  4. Extract dimension: \(D = m\) (the slope)

Example: Box-counting data for a coastline

\(\varepsilon\) (km) \(N(\varepsilon)\) \(\log(1/\varepsilon)\) \(\log N\)
100 12 -2.00 1.08
50 28 -1.70 1.45
25 68 -1.40 1.83
12.5 162 -1.10 2.21
6.25 390 -0.80 2.59

Linear regression yields slope \(\approx 1.26\), suggesting \(D \approx 1.26\).

Statistical Considerations

Rigorous fractal analysis requires attention to:

Scale range: The power law may hold only over a limited range. Plot residuals to identify deviations at extreme scales.

Goodness of fit: Report \(R^2\) values. Fractal behavior requires \(R^2 > 0.99\) over at least one order of magnitude.

Confidence intervals: Bootstrap resampling provides uncertainty estimates for \(D\).

Sample size: More scale values improve estimates. Use at least 10-15 scales spanning 2+ orders of magnitude.

Binning effects: Different box sizes or bin widths can affect results. Test sensitivity to discretization choices.


2.6 The Hurst Exponent and Fractional Brownian Motion

The Hurst exponent characterizes the long-range dependence structure of time series and spatial processes.

Definition of the Hurst Exponent

For a time series \(X(t)\), the Hurst exponent \(H\) describes how the range of cumulative deviations scales with time:

\[ \mathbb{E}[R(n)/S(n)] \propto n^H \]

where: - \(R(n)\) is the range of cumulative deviations over \(n\) observations - \(S(n)\) is the standard deviation over \(n\) observations - \(H \in [0, 1]\)

This is the rescaled range (R/S) analysis introduced by Harold Edwin Hurst in his studies of Nile River floods.

Fractional Brownian Motion

Fractional Brownian motion (fBm) \(B_H(t)\) is a continuous-time stochastic process that generalizes standard Brownian motion:

  1. \(B_H(0) = 0\)
  2. \(B_H(t)\) has stationary increments
  3. \(\mathbb{E}[B_H(t)] = 0\)
  4. \(\mathbb{E}[(B_H(t) - B_H(s))^2] = |t-s|^{2H}\)

For \(H = 0.5\), fBm reduces to standard Brownian motion with independent increments.

Relationship to Fractal Dimension

For a one-dimensional fBm trace (the graph of \(B_H(t)\) vs \(t\)), the fractal dimension is:

\[ D = 2 - H \]

For a surface generated by 2D fBm:

\[ D = 3 - H \]

This relationship connects temporal correlation structure (Hurst) with geometric complexity (fractal dimension).

Persistence vs Anti-Persistence

The Hurst exponent reveals the correlation structure of increments:

\(H = 0.5\): Random walk (Brownian motion) - Increments are uncorrelated - No memory of past behavior - \(D = 1.5\) for the trace

\(H > 0.5\): Persistent (trending) behavior - Positive correlation: increases tend to follow increases - Long-range positive dependence - Smoother paths (\(D < 1.5\)) - Examples: climate records, stock market momentum

\(H < 0.5\): Anti-persistent (mean-reverting) behavior - Negative correlation: increases tend to follow decreases - Long-range negative dependence - Rougher paths (\(D > 1.5\)) - Examples: regulated systems, some heartbeat intervals

Applications in Time Series Analysis

The Hurst exponent is used to:

  • Detect long-range dependence in hydrological, financial, and physiological time series
  • Characterize surface roughness from profile measurements
  • Distinguish noise types: White noise (\(H = 0.5\)), pink noise (\(H \approx 1\)), Brownian noise (\(H = 1\))
  • Forecast volatility in financial markets

Estimation Methods

Several methods estimate \(H\):

  1. R/S analysis: Original method; prone to bias for short series
  2. Detrended Fluctuation Analysis (DFA): Robust to trends and non-stationarity
  3. Wavelet-based estimation: Efficient for long series
  4. Periodogram regression: Based on spectral density slope

For a power spectrum \(S(f) \propto f^{-\beta}\), the relationship is:

\[ H = \frac{\beta - 1}{2} \]

2.7 Lacunarity: Measuring Texture and Gaps

Fractal dimension alone does not fully characterize a fractal's appearance. Lacunarity quantifies the distribution of gaps (lacunae) and complements fractal dimension.

Definition and Intuition

Two fractals can have identical fractal dimensions but look quite different due to their gap structure. Lacunarity measures this "gappiness" or texture:

  • High lacunarity: Large, heterogeneous gaps; clumped or clustered patterns
  • Low lacunarity: Small, uniform gaps; homogeneous, translationally invariant patterns

Formally, lacunarity \(\Lambda\) at scale \(r\) is defined through the coefficient of variation of box masses:

\[ \Lambda(r) = \frac{\text{Var}[M(r)]}{(\mathbb{E}[M(r)])^2} + 1 = \frac{\mathbb{E}[M(r)^2]}{(\mathbb{E}[M(r)])^2} \]

where \(M(r)\) is the mass (count) in a box of size \(r\).

The Gliding Box Algorithm

The standard method for computing lacunarity:

  1. Choose box size \(r\)
  2. Place box at position \((i,j)\) in the image
  3. Count mass \(M_{ij}(r)\) = number of occupied pixels in the box
  4. Slide box across all valid positions (gliding window)
  5. Compute statistics:
  6. \(Z^{(1)} = \sum M / N_{boxes}\) (mean mass)
  7. \(Z^{(2)} = \sum M^2 / N_{boxes}\) (mean squared mass)
  8. \(\Lambda(r) = Z^{(2)} / (Z^{(1)})^2\)
  9. Repeat for multiple box sizes \(r\)

Multi-Scale Lacunarity

Lacunarity is scale-dependent. A complete characterization requires:

\[ \Lambda(r) \text{ for } r \in [r_{min}, r_{max}] \]

For fractal sets, lacunarity often follows a power law:

\[ \Lambda(r) \propto r^{-\alpha} \]

The exponent \(\alpha\) and the overall lacunarity curve shape provide additional texture information beyond fractal dimension.

Lacunarity vs Fractal Dimension

Property Fractal Dimension Lacunarity
Measures Complexity, space-filling Texture, gappiness
Value range Typically 1-3 \(\geq 1\) (higher = more heterogeneous)
Scale dependence Single value (ideally) Function of scale
Translational invariance Insensitive Sensitive
Rotational invariance Typically insensitive Can be directional

Ecological Applications

Lacunarity analysis is widely used in landscape ecology:

  • Habitat fragmentation: High lacunarity indicates patchy, fragmented habitats
  • Forest structure: Gaps in canopy cover affect light availability and species composition
  • Urban sprawl: Lacunarity tracks changes in development patterns over time
  • Species distribution: Clustered vs dispersed populations have different lacunarity signatures

Example: Two forests may have the same canopy fractal dimension (say, \(D = 1.8\)), but: - Forest A: Uniform gap distribution, low lacunarity - Forest B: Clustered gaps (clearings), high lacunarity

These forests would support different ecological communities despite identical fractal dimensions.

Combined Analysis

Best practice combines fractal dimension and lacunarity:

  1. Compute \(D_B\) to quantify overall complexity
  2. Compute \(\Lambda(r)\) across scales to characterize texture
  3. Report both metrics for complete characterization
  4. Use \(D_B\)-\(\Lambda\) phase space to classify different pattern types

This dual approach distinguishes patterns that fractal dimension alone cannot separate.


Summary

This chapter introduced the mathematical toolkit for fractal analysis:

Concept Symbol Key Formula Application
Hausdorff dimension \(D_H\) \(\inf\{d : H^d(S) = 0\}\) Theoretical foundation
Box-counting dimension \(D_B\) \(\lim \frac{\log N(\varepsilon)}{\log(1/\varepsilon)}\) Practical measurement
Power law scaling - \(N \propto \varepsilon^{-D}\) Fundamental relationship
Hurst exponent \(H\) \(D = 2 - H\) (1D) Time series, roughness
Lacunarity \(\Lambda\) \(\frac{\mathbb{E}[M^2]}{(\mathbb{E}[M])^2}\) Texture, gaps

These tools enable rigorous quantification of fractal properties in both theoretical constructs and empirical data. The following chapters apply these foundations to specific fractal types and real-world applications.


Exercises

  1. Cantor set variations: Compute the Hausdorff dimension of a Cantor set where the middle ¼ (instead of ⅓) is removed at each stage.

  2. Box-counting practice: Given box counts \(N = \{10, 35, 130, 480, 1800\}\) at scales \(\varepsilon = \{1, 0.5, 0.25, 0.125, 0.0625\}\), estimate the fractal dimension using linear regression.

  3. Hurst exponent interpretation: A time series has estimated \(H = 0.73\). Describe the expected behavior and calculate the fractal dimension of its trace.

  4. Lacunarity comparison: Sketch two binary patterns with \(D_B \approx 1.5\) but different lacunarities. Explain what features distinguish them.

  5. Scaling limits: A dataset shows \(D_B = 1.8\) for \(\varepsilon \in [0.01, 0.1]\) but \(D_B = 1.2\) for \(\varepsilon \in [0.001, 0.01]\). What might explain this scale-dependent behavior?


Further Reading

  • Falconer, Kenneth. Fractal Geometry: Mathematical Foundations and Applications. 3rd ed., Wiley, 2014.
  • Mandelbrot, Benoit B. The Fractal Geometry of Nature. W.H. Freeman, 1982.
  • Plotnick, Roy E., et al. "Lacunarity Analysis: A General Technique for the Analysis of Spatial Patterns." Physical Review E, vol. 53, no. 5, 1996, pp. 5461-5468.
  • Hurst, H.E. "Long-Term Storage Capacity of Reservoirs." Transactions of the American Society of Civil Engineers, vol. 116, 1951, pp. 770-799.