Interactive companion for fitting splines to load trajectories, projecting samples, and reading windowed Fourier transforms
A load carried up a curved tree leaves a trajectory whose shape is dominated by the tree itself.
To detect anything behavioral riding on top of that shape — pre-fork hesitation, side-to-side
deliberation, drift — one first needs to subtract the trend. This widget illustrates two complementary
ways to do it: (1) fit a smooth spline to the trajectory and study the perpendicular residuals
d(s), or (2) compute a windowed Fourier transform of lateral position
x(z) along the trunk and treat low spatial frequencies as the trend.
The trajectory is modeled as an autocorrelated random walk along the centerline, clipped to the trunk surface — every sample sits on the tree, mimicking samples from a single video of a load being cooperatively transported.
t is the normalized trajectory parameter:
t = 0 at the base of the trunk, t = 1 at the
tip of the right branch, and t = 0.5 at the fork.
The lateral coordinate follows an AR(1) random walk, with an elevated wobble in a window
around t = 0.5 (the “deliberation” signal).
Green curve is the cubic B-spline fitted to the samples via least squares.
Red lines are perpendicular deviations di from each sample to its closest point on the spline.
What is DF? Degrees of freedom = number of B-spline basis functions. A single cubic (degree 3) polynomial has 4 free parameters (the coefficients of 1, t, t², t³), so DF = 4 corresponds to zero interior knots — one unbroken polynomial spanning the whole trajectory. Interior knots = DF − 4 (i.e. DF − (degree + 1)), so DF = 5 has 1 knot; DF = 6 has 2; DF = 14 has 10. Toggle Spline knots to see them marked.
di against arc length si
of the foot on the spline. For a Y-tree, the vertical orange line marks the arc-length location of the
tree's fork projected onto the fitted spline. The tree's fork is at a fixed geometric
point, but its arc-length coordinate along the spline shifts slightly with each refit because
the spline itself shifts. Dashed gray lines indicate ±1σ
of the residual estimated away from the deliberation window.
If the spline captures the tree's shape — the smooth path the trajectory would
follow without behavior — the residuals d(s) contain everything else:
autocorrelated noise, drift, hesitation. Pre-fork wobble shows up as elevated residual variance
around the fork.
The smoothing is a Goldilocks choice: too few degrees of freedom and the spline cuts across the bend, dumping tree shape into the residual; too many and it threads through the deliberation, hiding the signal. See the Choosing the DF tab.
On the smoothing parameter. For an ordinary least-squares fit to a cubic B-spline basis
(what you get from splines::bs(t, df = k, degree = 3) followed by lm()),
DF is the only knob — it sets the number of basis functions. Smoothing splines
(e.g. mgcv::gam(y ∼ s(t))) take a different route: a large fixed basis
plus a wiggliness penalty λ chosen by GCV or REML.
DF and λ are two parameterizations of the same trade-off.
library(splines) # `ants` columns: t (parameter), x, y (load position). # Fit cubic B-splines to x and y independently as functions of t. df <- 6 fit_x <- lm(x ~ bs(t, df = df, degree = 3), data = ants) fit_y <- lm(y ~ bs(t, df = df, degree = 3), data = ants) # Dense evaluation of the curve r(t) = (x_hat, y_hat) t_grid <- seq(min(ants$t), max(ants$t), length.out = 600) curve <- data.frame( t = t_grid, x = predict(fit_x, newdata = data.frame(t = t_grid)), y = predict(fit_y, newdata = data.frame(t = t_grid)) ) # Cumulative arc length along the curve ds <- sqrt(diff(curve$x)^2 + diff(curve$y)^2) curve$s <- c(0, cumsum(ds))
library(splines)
library(dplyr)
library(tibble)
df <- 6
fit_x <- lm(x ~ bs(t, df = df, degree = 3), data = ants)
fit_y <- lm(y ~ bs(t, df = df, degree = 3), data = ants)
curve <- tibble(t = seq(min(ants$t), max(ants$t), length.out = 600)) |>
mutate(
x = predict(fit_x, newdata = pick(t)),
y = predict(fit_y, newdata = pick(t))
) |>
mutate(
dx = x - lag(x, default = first(x)),
dy = y - lag(y, default = first(y)),
s = cumsum(sqrt(dx^2 + dy^2))
) |>
select(-dx, -dy)
Each sample pi = (xi, yi) is mapped to two scalars:
a signed deviation di from the fitted curve and an arc length
si giving its position along the curve.
1. Find the foot. The nearest point on the curve is
ti∗ = argmint∈[0,1] ‖ pi − r(t) ‖where ‖ · ‖ is the Euclidean distance — the ordinary straight-line distance √[(x1−x2)² + (y1−y2)²].
In practice we evaluate r(t) at a few hundred equally spaced parameter
values and pick the index where the squared distance is minimized.
2. Compute the signed scalar distance. Let τ(ti∗) be the unit tangent at the foot and θi the angle from τ(ti∗) to (pi − r(ti∗)), measured counterclockwise as shown in the figure. Define the sign factor sgn(θi) by embedding both vectors in 3D (zero z-components) and extracting the z-component of their cross product:
sgn(θi) = (0,0,1) · τ(ti∗) × (pi − r(ti∗)) ‖τ(ti∗)‖ · ‖pi − r(ti∗)‖Because r(ti∗) is the nearest point on the curve to pi, the displacement (pi − r(ti∗)) is perpendicular to τ(ti∗), so sgn(θi) = ±1 exactly. The signed scalar distance is then:
di = ‖pi − r(ti∗)‖ · sgn(θi)Positive on the left of the oriented curve, negative on the right — though this sign convention is arbitrary. Reversing the cross-product order to (pi − r(ti∗)) × τ(ti∗) flips the sign and labels the right side positive instead.
3. Tag with arc length. The companion coordinate is
si = ∫0ti∗ ‖ r′(u) ‖ du
computed numerically as a cumulative sum of chord lengths over the same dense sequence of parameter values.
Plotting di against si gives the residual signal
used in the first tab.
# Project samples (x, y) onto a dense evaluation of the fitted
# spline (curve$x, curve$y) and compute the SCALAR d_i and s_i.
K <- nrow(curve)
# Unit tangent tau at each curve point (central difference)
i_prev <- pmax(seq_len(K) - 1, 1)
i_next <- pmin(seq_len(K) + 1, K)
tx <- curve$x[i_next] - curve$x[i_prev]
ty <- curve$y[i_next] - curve$y[i_prev]
mag <- sqrt(tx^2 + ty^2)
tx <- tx / mag; ty <- ty / mag
# For each sample, find nearest curve point (foot)
project_one <- function(px, py) which.min((curve$x - px)^2 + (curve$y - py)^2)
i_foot <- mapply(project_one, ants$x, ants$y)
# Signed scalar distance: 2D cross product of unit tangent with gap vector.
# Equals ||p_i - r(t_i*)|| * sin(theta_i) where theta_i is the signed angle
# from tau to the displacement. Positive on the left of the oriented curve.
d_i <- tx[i_foot] * (ants$y - curve$y[i_foot]) -
ty[i_foot] * (ants$x - curve$x[i_foot])
s_i <- curve$s[i_foot]
resid <- data.frame(s = s_i, d = d_i)library(dplyr)
library(purrr)
# Add unit tangent columns to the curve tibble
curve <- curve |>
mutate(
tx = lead(x, default = last(x)) - lag(x, default = first(x)),
ty = lead(y, default = last(y)) - lag(y, default = first(y)),
mag = sqrt(tx^2 + ty^2),
tx = tx / mag,
ty = ty / mag
)
# Project each sample; d_i = ||p_i - r(t_i*)|| * sin(theta_i)
# computed directly as the 2D cross product tau x delta.
resid <- ants |>
mutate(
i_foot = map2_int(x, y, ~ which.min((curve$x - .x)^2 + (curve$y - .y)^2)),
d = curve$tx[i_foot] * (y - curve$y[i_foot]) -
curve$ty[i_foot] * (x - curve$x[i_foot]),
s = curve$s[i_foot]
)
DF and smoothness λ — two sides of the same coin.
Ordinary least-squares fitting (used here and in splines::bs() + lm())
specifies DF, the number of basis functions, which directly fixes the number of
interior knots and the flexibility of the curve.
Penalized splines (e.g. mgcv::gam) instead start from a large fixed basis
and add a wiggliness penalty λ · ∫|f′′(t)|² dt.
Large λ suppresses oscillations (low effective DF); small λ allows high
flexibility (high effective DF). Increasing DF ⇔ decreasing λ. The rest
of this tab discusses DF-based selection; identical logic applies to λ.
Define the residual sum of squares between two parametric curves sampled at N common parameter values as
RSS(A, B) ≜ ∑k=1N ‖ Ak − Bk ‖2Both quantities in the plot below compare the fitted spline to a reference set of N points evaluated at the same parameter values:
• RSS(spline, data) — spline points vs. sample positions. Decreases monotonically with DF (more flexibility always reduces in-sample error, reaching zero when DF ≥ N). It has no minimum, so it cannot choose DF on its own. RSS(spline, data) is shown not because it picks DF, but as a foil: its relentless decrease illustrates why a monotone fit criterion is insufficient and a complexity penalty is needed.
• RSS(spline, tree) — spline points vs. true tree-centerline positions (oracle). For a Y-shaped tree it is U-shaped: at low DF the spline cannot follow the fork geometry (underfitting); at high DF it chases trajectory noise away from the true shape (overfitting), and the minimum df* identifies the ideal smoothing level. For a straight (I-shaped) tree the geometry is linear — fully captured at df = 4 — so there is no underfitting regime and RSS(spline, tree) rises monotonically: higher DF only lets the spline drift further from the straight backbone.
How to use both curves together. The two curves diverge as DF grows: RSS(spline, data) keeps falling while RSS(spline, tree) passes its minimum and starts to rise. The onset of this divergence marks the overfitting threshold — at that DF, the spline begins fitting behavioral noise rather than tree geometry. Practically, choose DF in the neighborhood of df* (or slightly to its left for a conservative smoothing). The widening gap between the two curves tells you how much of the fit improvement is noise-fitting rather than tree-shape recovery: a narrow gap means most of the fit gain is genuinely structural; a wide gap means the spline is mostly chasing noise.
Can you get the tree skeleton from real video? Yes — the trunk centerline can be extracted by image segmentation or frame-to-tree registration. When that is possible, RSS(spline, tree) is directly usable, not just an oracle. When it is not available, AIC, BIC, and GCV replace the oracle: instead of measuring how far the spline is from the true skeleton, they penalize the in-sample RSS by the number of parameters, discouraging the spline from chasing noise.
• AIC = N log(RSS/N) + 2 DF
(Akaike — lighter penalty, slightly favors richer models).
• BIC = N log(RSS/N) + log(N) DF
(heavier penalty, sparser; usually preferable for large N).
Both AIC and BIC are minimized over a grid of DF values.
• GCV from mgcv::gam avoids that grid search entirely by
working in the λ domain: it fits a penalized spline and automatically chooses the
smoothness λ to minimize an approximation of leave-one-out prediction error. Because
λ and DF are equivalent parameterizations of the same flexibility/smoothness
trade-off, GCV is solving the same model-selection problem — just without the sweep.
All three criteria enforce complexity penalties for the same reason: a spline with enough DF can pass through every sample (RSS = 0), but that spline is useless — it captures noise, not tree shape. The penalty mimics the U-shape of RSS(spline, tree) that arises for non-straight trees, without needing the true skeleton.
df overlaid on the same trajectory. The
current df (from Tab 1's slider) is drawn in blue; the fixed reference values
are gray. The dark-brown centerline running through the tree is the true skeleton backbone
used as ground truth for RSS(spline, tree).
df; the dotted green line
marks df* = argmin RSS(spline, tree). Hit
Average over 30 seeds to smooth the noise realizations and see the underlying
U-shape clearly.
library(splines) # provides bs()
library(mgcv)
# bs(t, df = k, degree = 3) builds a cubic B-spline design matrix: an n x k
# matrix whose columns are B-spline basis functions evaluated at each t value.
# Passing it to lm() fits by OLS over that k-dimensional function space —
# the same least-squares cubic B-spline with k free parameters used throughout
# this widget. Interior knots are placed at equally spaced quantiles of t;
# df = k = (degree + 1) + number_of_interior_knots = 4 + interior knots.
# Option 1: compare RSS(spline, data) vs RSS(spline, tree) across a DF grid.
# `skel` holds the tree-centerline (x, y) at each sample's t value.
# RSS(spline, data) always decreases; RSS(spline, tree) is U-shaped for Y-trees.
rss_df <- function(df) {
fit_x <- lm(x ~ bs(t, df = df, degree = 3), data = ants)
fit_y <- lm(y ~ bs(t, df = df, degree = 3), data = ants)
xh <- predict(fit_x, newdata = ants)
yh <- predict(fit_y, newdata = ants)
c(data = sum((ants$x - xh)^2 + (ants$y - yh)^2),
tree = sum((skel$x - xh)^2 + (skel$y - yh)^2))
}
rss_mat <- sapply(4:20, rss_df)
colnames(rss_mat) <- 4:20
df_star <- as.integer(colnames(rss_mat)[which.min(rss_mat["tree", ])])
# Option 2a: sweep DF, pick by AIC or BIC
ic <- sapply(4:20, function(df) {
fx <- lm(x ~ bs(t, df = df, degree = 3), data = ants)
fy <- lm(y ~ bs(t, df = df, degree = 3), data = ants)
c(aic = AIC(fx) + AIC(fy),
bic = BIC(fx) + BIC(fy))
})
colnames(ic) <- 4:20
df_aic <- as.integer(colnames(ic)[which.min(ic["aic", ])])
df_bic <- as.integer(colnames(ic)[which.min(ic["bic", ])])
# Option 2b: penalized cubic regression spline, smoothness by GCV
gx <- gam(x ~ s(t, bs = "cr"), data = ants, method = "GCV.Cp")
gy <- gam(y ~ s(t, bs = "cr"), data = ants, method = "GCV.Cp")
edf <- gx$edf + gy$edf # effective DF from GCVlibrary(splines) # provides bs()
library(mgcv)
library(dplyr)
library(purrr)
library(tibble)
# bs(t, df = k, degree = 3) constructs a cubic B-spline design matrix: k basis
# functions evaluated at each t, used by lm() as predictors for OLS fitting.
# This produces the same least-squares cubic B-spline (k free parameters) as
# the widget's interactive spline. Interior knots fall at t quantiles;
# df = k = 4 + number_of_interior_knots.
# Option 1: compare RSS(spline, data) vs RSS(spline, tree) across a DF grid.
rss_sweep <- tibble(df = 4:20) |>
mutate(
fit_x = map(df, ~ lm(x ~ bs(t, df = .x, degree = 3), data = ants)),
fit_y = map(df, ~ lm(y ~ bs(t, df = .x, degree = 3), data = ants)),
rss_data = map2_dbl(fit_x, fit_y, ~ {
xh <- predict(.x, newdata = ants)
yh <- predict(.y, newdata = ants)
sum((ants$x - xh)^2 + (ants$y - yh)^2)
}),
rss_tree = map2_dbl(fit_x, fit_y, ~ {
xh <- predict(.x, newdata = ants)
yh <- predict(.y, newdata = ants)
sum((skel$x - xh)^2 + (skel$y - yh)^2)
})
)
df_star <- rss_sweep |> slice_min(rss_tree) |> pull(df)
# Option 2a: sweep DF, pick by AIC / BIC
ic <- rss_sweep |>
mutate(
aic = map2_dbl(fit_x, fit_y, ~ AIC(.x) + AIC(.y)),
bic = map2_dbl(fit_x, fit_y, ~ BIC(.x) + BIC(.y))
)
df_aic <- ic |> slice_min(aic) |> pull(df)
df_bic <- ic |> slice_min(bic) |> pull(df)
# Option 2b: GCV-chosen smoothness via mgcv::gam
gx <- gam(x ~ s(t, bs = "cr"), data = ants, method = "GCV.Cp")
gy <- gam(y ~ s(t, bs = "cr"), data = ants, method = "GCV.Cp")
edf <- gx$edf + gy$edf
Instead of fitting a spline and subtracting it, we can move directly into the spatial-frequency
domain. Let d⊥(z) be the perpendicular deviation from the tree centerline
as a function of arc-length height z, resampled onto a uniform grid.
(Using the raw x coordinate instead would mix in the lateral trend of the branch,
creating a large DC component at branch heights — the perpendicular deviation is
zero-mean by construction.)
The trunk’s gentle curvature and the trajectory’s slow drift contribute power
at low spatial frequencies; pre-fork deliberation contributes power at
higher frequencies localized near the fork. A windowed Fourier transform of
d⊥(z) exposes both at once.
The slow trajectory shape has some characteristic length scale Lbend.
A cutoff at spatial frequency fcut = 1 / Lbend
plays the same role as a spline detrend: power below the cutoff is “trend”
and is ignored; power above the cutoff is candidate behavioral structure. Drag the
Lbend slider to move the cutoff.
Why vertical = z. The spectrogram is plotted with the same vertical axis as the tree on the left, so you can read frequencies at any height by sliding your eye up the column.
x(z) used
by the spectrogram on the right is the horizontal coordinate of these samples.
d⊥(z) from the tree centerline, as a function of arc-length height z.
Using the perpendicular deviation (rather than raw x) removes the systematic
lateral trend from the angled branch, leaving only the behavioral wobble.
Vertical axis is height z (aligned to the tree on the left); horizontal axis
is spatial frequency. The horizontal orange line marks z at the fork.
The dashed white vertical line is the detrending cutoff
fcut = 1/Lbend; power to its left is treated as trend.
Window length W — number of samples in each STFT slice. Larger W
gives better frequency resolution but coarser localization in z; a Heisenberg-style
trade-off. With K = 256 uniform z samples and
W = 32, each slice spans roughly 12% of the trunk.
Hop — shift between consecutive windows. Smaller hop means more overlap (smoother
spectrogram, more compute). Default hop = 4 with
W = 32 gives 87.5% overlap.
Window type — Hann and Hamming taper smoothly to (near) zero at the edges, suppressing spectral leakage from window boundaries; their side-lobe behavior differs slightly. Rectangular applies no taper — sharp edges leak energy across frequencies. Gaussian tapers smoothly and is optimal in the joint time-frequency uncertainty sense (Gabor's bound).
Lbend — characteristic length scale of slow trajectory shape (the trunk's
bend, an ant's drift toward one side, anything you want to call “trend”).
The cutoff fcut = 1/Lbend partitions the spectrum:
below is trend, above is candidate signal.
log magnitude — switch from linear to log10(1+9v) scaling
of the spectrogram so weaker components become visible.
Per-slice normalization — rescale each z-slice to its own max,
useful when overall power varies along the trunk.
library(signal)
# Resample x as a function of height z onto a uniform grid.
# (y from the trajectory plays the role of trunk-aligned height.)
z_grid <- seq(min(ants$y), max(ants$y), length.out = 256)
xz <- approx(ants$y, ants$x, xout = z_grid)$y
dz <- diff(z_grid)[1]
# Windowed Fourier transform: Hann window, hop = W - overlap
W <- 32
hop <- 4
spec <- specgram(xz, n = W, Fs = 1/dz,
window = hanning(W),
overlap = W - hop)
# Mark the fork height and the spatial-scale cutoff
z_fork <- 1 # height of the geometric fork
L_bend <- 0.15 # estimated bend length scale
f_cut <- 1 / L_bend # detrending cutoff in 1/length
# Plot with VERTICAL = z (transpose so z runs up the y-axis):
image(x = spec$f, y = spec$t, z = abs(spec$S),
col = hcl.colors(64, "viridis"),
xlab = "spatial frequency (1/length)",
ylab = "height z")
abline(h = z_fork, col = "darkorange", lwd = 2)
abline(v = f_cut, col = "white", lty = 2, lwd = 1.5)library(signal)
library(tibble)
library(dplyr)
library(tidyr)
library(ggplot2)
z_grid <- seq(min(ants$y), max(ants$y), length.out = 256)
xz <- approx(ants$y, ants$x, xout = z_grid)$y
dz <- diff(z_grid)[1]
spec <- specgram(xz, n = 32, Fs = 1/dz,
window = hanning(32), overlap = 28)
spec_df <- expand_grid(
z = spec$t,
f = spec$f
) |>
mutate(magnitude = as.numeric(abs(spec$S)))
# Vertical axis = z, horizontal axis = spatial frequency
ggplot(spec_df, aes(x = f, y = z, fill = magnitude)) +
geom_raster() +
geom_hline(yintercept = 1, color = "darkorange", linewidth = 1.2) +
geom_vline(xintercept = 1 / 0.15, color = "white", linetype = "dashed") +
scale_fill_viridis_c() +
labs(x = "spatial freq (1/length)", y = "height z") +
theme_minimal()