Distance Sampling (DS)
Overview
An observer traverses a fixed straight transect. Animals within a strip around the transect are detected with a probability that declines with perpendicular distance. The recorded distances are then used to estimate detectability and, from that, density.
The simulation shows two concurrent views of the same scene:
- the omniscient view: all animal positions visible on screen, detected animals in red, missed animals in grey;
- the method’s view: only the perpendicular distances of detected animals are available to the estimator.
Glossary
Controls
| Control | Description |
|---|---|
| Speed | Playback speed: Slow / Normal / Fast. Affects only how fast the observer traverses the transect; does not change what gets detected. |
| Seed | RNG seed for animal placement (0–99999). Reset with the same seed reproduces the exact same population. |
| 🎲 New | Draws a new random seed and regenerates the population. |
| Preset | Named parameter set. Two groups: Cetacean scenarios (harbour porpoise, bottlenose dolphin, common dolphin, minke whale, bowhead whale) and Terrestrial surveys (songbird, shorebird, raptor, bird nest survey, snake). Applying a preset overwrites all Complications sliders and selectors. |
| Field truth | Detection function used to simulate detections: half-normal or hazard-rate. Updates live; affects future detection draws only (existing stored detections are not changed). |
| Model | Detection function used for MLE fitting. Mismatching Field truth and Model produces biased D̂ (warning shown). Updates live. |
| σ | Detection scale (km). For half-normal: distance at which detection probability = \(e^{-1/2} \approx 0.607\). For hazard-rate: scale of the decay. Updates live. Default: 0.25 km. |
| W | Truncation distance (km). Animals beyond W are never detected; stored detections beyond W are excluded from analysis if W is later reduced. Updates live. Default: 0.50 km. |
| b (shape) | Hazard-rate shape parameter. Visible only when hazard-rate is selected. Higher b = wider flat shoulder near the transect before a sharper falloff. Default: 2.5. |
| D | True animal density (animals per km²). Takes effect on Reset. Default: 50 /km². |
| L | Transect length (km). Sets the width of the arena. Takes effect on Reset. Default: 4.0 km. |
| Distribution | Spatial distribution of animals: Uniform, Clustered, or Regular. Takes effect on Reset. |
| s_clump | Cluster spread as a fraction of transect length. Visible when Distribution = Clustered. Default: 0.10. |
| ρ (regularity) | Grid jitter: 0 = perfect grid, 1 = maximum jitter. Visible when Distribution = Regular. Default: 0.50. |
| Keep previous runs | Overlays the previous runs’ fitted curves and D̂ traces as faded ghost lines (up to 8 retained). |
Estimates readout
| Symbol | Meaning |
|---|---|
| \(n\) | Count of within-\(W\) detections in the current run. |
| Effort (\(L\)) | Observer x-position (km) at the time of the most recent detection. |
| ESW | Effective strip width (km): \(\int_0^W g(x)\,dx\). Computed from \(\hat\sigma\) (or true \(\sigma\) if fewer than 3 detections). |
| \(\hat\sigma\) | MLE estimate of \(\sigma\) (requires \(n \geq 3\); shown as — otherwise). |
| True \(\sigma\) | Current slider value of \(\sigma\). |
| \(\hat{D}\) | Estimated density: \(n / (2L \cdot \mathrm{ESW})\). |
| True \(D\) | Realised density \(= \text{areaN} / (L \times H)\), which may differ slightly from the slider due to rounding. |
Key symbols
| Symbol | Meaning |
|---|---|
| \(g(x)\) | Detection function: probability of detecting an animal at perpendicular distance \(x\) from the transect. |
| \(\sigma\) | Detection scale parameter (km). |
| \(W\) | Truncation distance (km). |
| \(\mathrm{ESW}\) | Effective strip width (km). |
| \(\hat\sigma\) | MLE estimate of \(\sigma\). |
| \(\hat{D}\) | Estimated density (animals per km²). |
| \(D\) | True density (animals per km²). |
| \(L\) | Transect length / effort (km). |
| \(n\) | Number of within-\(W\) detections. |
| \(b\) | Hazard-rate shape parameter (dimensionless). |
Arena geometry
The simulation canvas maps directly to a rectangle in kilometre coordinates. All positions used by the engine are stored and computed in km; pixel positions are computed only for rendering.
- Transect length (\(L\), km) is set by the transect slider. This is the x-dimension of the arena: arenaWKm = sliderValue.
- Arena height (\(H\), km) is derived from the canvas dimensions so that one pixel represents the same distance in both axes: arenaH = canvasHeight / (canvasWidth / arenaWKm).
- Transect runs horizontally along the centreline of the canvas: \(y = H/2\) in km coordinates. It is a fixed line; the observer moves along it.
- Pixel scale: PX_PER_KM = canvasWidth / arenaWKm. Recomputed on every reset (canvas width is fixed; a different transect length changes the scale).
- Observer x-position (boatX) runs from 0 to arenaWKm in km. The observer icon is drawn at boatX * PX_PER_KM pixels from the left edge.
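The geometry rules above can be sketched as a pair of small helpers. This is a minimal illustration, not the simulation's own code: the function names arenaGeometry and kmToPx and the 800 × 400 canvas are assumptions made for the example.

```javascript
// Sketch: derive arena geometry (km) from canvas size and transect length.
// Function names and the canvas size below are illustrative assumptions.
function arenaGeometry(canvasWidth, canvasHeight, arenaWKm) {
  const pxPerKm = canvasWidth / arenaWKm; // same scale on both axes
  const arenaH = canvasHeight / pxPerKm;  // arena height in km
  return { pxPerKm, arenaH, transectY: arenaH / 2 };
}

// Convert a position in km to pixel coordinates (rendering only).
function kmToPx(pointKm, pxPerKm) {
  return { x: pointKm.x * pxPerKm, y: pointKm.y * pxPerKm };
}

const geom = arenaGeometry(800, 400, 4.0);
// geom.pxPerKm is 200, geom.arenaH is 2, geom.transectY is 1
```

Keeping all engine state in km and converting only at draw time means a change of transect length alters the pixel scale without touching stored positions.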
Animal placement
All three distribution modes call into src/ds-engine.js. All use the seeded placement RNG (see §RNG below). All coordinates are in km.
Animal count
areaN = Math.round(density * arenaWKm * arenaH)
After rounding, the realised true density fed to the analytics is:
trueD = areaN / (arenaWKm * arenaH)
This will differ slightly from the slider value when rounding is non-trivial.
Uniform distribution (placeAnimals)
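Each of the \(n\) animals (here \(n\) is the population size areaN, not the detection count) is placed by two independent draws:
x = rng() * arenaW
y = rng() * arenaH
where rng() returns a value in \([0, 1)\) from the seeded placement RNG. Because the count is fixed at areaN, the result is a binomial point process, that is, a homogeneous Poisson process conditioned on its total count, with no spatial structure beyond the arena boundary.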
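A minimal sketch of the count rounding and uniform placement described above. The mulberry32 generator is a stand-in chosen for brevity (the source uses an Alea PRNG from src/rng.js), and the function name placeAnimalsUniform and its return shape are assumptions.

```javascript
// Stand-in seeded PRNG (mulberry32). The source uses Alea; this substitute
// only illustrates that a fixed seed gives a fixed sequence in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: rounds the expected count, places animals uniformly,
// and reports the realised density.
function placeAnimalsUniform(density, arenaW, arenaH, rng) {
  const areaN = Math.round(density * arenaW * arenaH);
  const animals = [];
  for (let i = 0; i < areaN; i++) {
    const x = rng() * arenaW; // x is drawn before y
    const y = rng() * arenaH;
    animals.push({ x, y });
  }
  return { animals, trueD: areaN / (arenaW * arenaH) };
}
```

For example, a density of 7 /km² in a 4 × 1.3 km arena has an expected count of 36.4, which rounds to 36 animals and a realised density of about 6.92 /km².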
Clustered distribution (placeAnimalsClumped)
A two-level process. Parameters: clumpScale (slider default 0.10).
- Number of clusters: nClusters = Math.max(3, Math.round(n / 6)).
- Cluster centres: each placed at (rng()*arenaW, rng()*arenaH) using the seeded RNG, uniform independently in each axis.
- Animal scatter: for each animal, a cluster is chosen uniformly at random (seeded), then a displacement is drawn using the Box–Muller transform:
    u1 = rng(), u2 = rng()
    mag = sigmaCluster * sqrt(-2 * log(u1))
    angle = 2π * u2
    dx = mag * cos(angle), dy = mag * sin(angle)
  where sigmaCluster = arenaW * clumpScale. The angular direction is isotropic (circular Gaussian scatter around the cluster centre).
- Wrap-around: animals that scatter beyond the arena edge are wrapped using modular arithmetic:
    x = ((cx + dx) % arenaW + arenaW) % arenaW
    y = ((cy + dy) % arenaH + arenaH) % arenaH
  This preserves the total count and keeps all animals inside the arena.
The clumpScale slider controls sigmaCluster as a fraction of transect length. At clumpScale = 0.10 (default), the cluster spread is 10% of the transect length.
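The clustered placement can be sketched as follows. This is an illustration under assumptions: mulberry32 stands in for the seeded Alea PRNG, the order of RNG draws is a guess, and u1 is taken as 1 - rng() so that log(0) cannot occur (the description above uses rng() directly).

```javascript
// Stand-in seeded PRNG (mulberry32); the source uses Alea from src/rng.js.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Sketch of a two-level clustered placement with toroidal wrap-around.
function placeAnimalsClumped(n, arenaW, arenaH, clumpScale, rng) {
  const nClusters = Math.max(3, Math.round(n / 6));
  const centres = [];
  for (let c = 0; c < nClusters; c++) {
    const cx = rng() * arenaW;
    const cy = rng() * arenaH;
    centres.push({ cx, cy });
  }
  const sigmaCluster = arenaW * clumpScale;
  const wrap = (v, size) => ((v % size) + size) % size; // always in [0, size)
  const animals = [];
  for (let i = 0; i < n; i++) {
    const { cx, cy } = centres[Math.floor(rng() * nClusters)];
    const u1 = 1 - rng(); // in (0, 1]: assumption made here to avoid log(0)
    const u2 = rng();
    const mag = sigmaCluster * Math.sqrt(-2 * Math.log(u1));
    const angle = 2 * Math.PI * u2;
    animals.push({
      x: wrap(cx + mag * Math.cos(angle), arenaW),
      y: wrap(cy + mag * Math.sin(angle), arenaH),
    });
  }
  return animals;
}
```

Wrapping rather than rejecting out-of-bounds points keeps the count exact, at the cost of occasionally splitting a cluster across opposite arena edges.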
Regular distribution (placeAnimalsRegular)
A perturbed grid. Parameters: regularity (slider default 0.50).
- Grid dimensions: nCols = Math.round(sqrt(n * arenaW / arenaH)), nRows = Math.ceil(n / nCols). The aspect ratio of the grid matches the arena.
- Cell size: dx = arenaW / nCols, dy = arenaH / nRows.
- Animal position: for the \(i\)-th animal, col = i % nCols, row = floor(i / nCols):
    x = (col + 0.5 + (rng() - 0.5) * regularity) * dx
    y = (row + 0.5 + (rng() - 0.5) * regularity) * dy
  The jitter term (rng() - 0.5) * regularity is drawn from the seeded RNG.
The regularity parameter spans two extremes:
- regularity = 0: no jitter; animals sit exactly at grid cell centres.
- regularity = 1: maximum jitter; each animal is displaced by up to ±0.5 cell widths in each axis independently. This makes spacing variable but still non-Poisson (the grid skeleton persists in expectation).
Values in between produce intermediate under-dispersion, that is, spacing that is more even than a Poisson process would give.
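A minimal sketch of the perturbed-grid placement, following the formulas above. The Math.max(1, ...) guard on nCols is an assumption added so that very small populations do not divide by zero; the rng argument is any function returning values in [0, 1).

```javascript
// Sketch of the jittered-grid placement; the nCols guard is an assumption.
function placeAnimalsRegular(n, arenaW, arenaH, regularity, rng) {
  const nCols = Math.max(1, Math.round(Math.sqrt((n * arenaW) / arenaH)));
  const nRows = Math.ceil(n / nCols);
  const dx = arenaW / nCols; // cell width (km)
  const dy = arenaH / nRows; // cell height (km)
  const animals = [];
  for (let i = 0; i < n; i++) {
    const col = i % nCols;
    const row = Math.floor(i / nCols);
    const jx = (rng() - 0.5) * regularity; // jitter in units of cell width
    const jy = (rng() - 0.5) * regularity;
    animals.push({ x: (col + 0.5 + jx) * dx, y: (row + 0.5 + jy) * dy });
  }
  return animals;
}

// With regularity = 0 the RNG values are irrelevant: 8 animals in a 4 x 2 km
// arena form a 4 x 2 grid of 1 km cells, each animal at its cell centre.
const grid = placeAnimalsRegular(8, 4, 2, 0, Math.random);
```

Because the jitter never exceeds half a cell in either direction, each animal stays inside its own cell and therefore inside the arena.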
Seed and RNG architecture
The simulation uses two separate random number generators with different seeding strategies. This is a deliberate design choice.
Placement RNG (seeded)
placementRng = createRng(seed); // src/rng.js (Alea PRNG)
All animal positions (uniform draws, cluster centres, cluster assignments, scatter displacements, grid jitter) are generated by this seeded generator. Fixing the seed and pressing Reset reproduces the exact same spatial layout of animals.
Detection RNG (unseeded)
detectionRng = new Math.seedrandom(); // fresh random seed each initSim() call
All Bernoulli detection draws (whether a given animal is detected when the observer passes) use this fresh, unseeded generator. Consequence: the same population can yield different detected sets on each replay, reflecting genuine sampling variability rather than just seed choice.
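The split between the two generators can be illustrated with stand-ins. In this sketch mulberry32 replaces the seeded Alea PRNG and Math.random replaces the unseeded seedrandom instance; the helper name initSimRngs is hypothetical.

```javascript
// Stand-in seeded PRNG (mulberry32); the source uses Alea from src/rng.js.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: one seeded stream for placement, one unseeded stream
// for detection draws.
function initSimRngs(seed) {
  return {
    placementRng: mulberry32(seed), // reproducible from the seed
    detectionRng: Math.random,      // stand-in for the unseeded generator
  };
}

const runA = initSimRngs(12345);
const runB = initSimRngs(12345);
const layoutA = [runA.placementRng(), runA.placementRng()];
const layoutB = [runB.placementRng(), runB.placementRng()];
// layoutA and layoutB hold identical values; detection draws from the two
// runs will generally differ.
```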
Seed value
- Range: 0–99999 (five-digit integer).
- On page load: seed = Math.floor(Math.random() * 100000), random for each browser session.
- Seed input field: type a value, press Enter or click away. The field strips non-digit characters and clamps to [0, 99999]. Any invalid entry reverts to the current seed.
- New Population button: draws a completely new random seed, then calls initSim().
- Reset button: reads the current seed-input value and calls initSim().
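The seed-input rules above can be sketched as a small sanitiser. The function names sanitizeSeed and randomSeed are assumptions, and treating an entry with no digits as the only invalid case is an interpretation of the description.

```javascript
// Sketch of the seed-field behaviour: strip non-digits, clamp to [0, 99999],
// and fall back to the current seed when nothing usable remains.
function sanitizeSeed(rawInput, currentSeed) {
  const digits = String(rawInput).replace(/\D/g, "");
  if (digits === "") return currentSeed; // invalid entry: keep current seed
  const value = parseInt(digits, 10);
  return Math.min(99999, Math.max(0, value));
}

// Page-load and New Population seed draw, as described above.
function randomSeed() {
  return Math.floor(Math.random() * 100000);
}
```

Under these assumptions, sanitizeSeed("12a34", 7) gives 1234, sanitizeSeed("123456", 7) clamps to 99999, and sanitizeSeed("abc", 7) keeps 7.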
Observer mechanics
The observer moves at constant speed along the transect (y = arenaH / 2) from boatX = 0 to boatX = arenaWKm.
Speed settings
| Label | km per frame |
|---|---|
| Slow | 0.0025 |
| Normal | 0.005 |
| Fast | 0.015 |
At 60 fps, Normal speed traverses a 4 km transect in approximately 13 seconds of real time. The speed setting affects only playback rate; it does not change what gets detected (because of the one-shot detection rule; see §Detection below).
Each frame when running = true, advanceBoat() adds the current speed increment to boatX. When boatX >= arenaWKm, the run stops and the play button switches to “↺ Run again”.
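A minimal sketch of the per-frame advance and the traversal-time arithmetic. The state shape and the helper traversalSeconds are assumptions for illustration; only the speed values and the stop condition come from the description above.

```javascript
// Speed table from the section above (km per frame).
const SPEEDS = { Slow: 0.0025, Normal: 0.005, Fast: 0.015 };

// Sketch of one frame of observer movement; returns a new state object.
function advanceBoat(state, kmPerFrame) {
  if (!state.running) return state;
  const boatX = state.boatX + kmPerFrame;
  // The run stops once the observer reaches the end of the transect.
  return { ...state, boatX, running: boatX < state.arenaWKm };
}

// Real-time duration of a full traversal at a given frame rate.
function traversalSeconds(arenaWKm, kmPerFrame, fps = 60) {
  return arenaWKm / kmPerFrame / fps;
}

// traversalSeconds(4.0, SPEEDS.Normal) is about 13.3 s (800 frames at 60 fps).
```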
Detection mechanism
All detection logic is in src/ds-engine.js (tryDetect). The key properties are:
One-shot rule
Detection is attempted exactly once per animal per run. The trigger condition is:
boatX - animal.x ∈ [0, boatSpeed)
In words: the observer’s current x-position has just crossed the animal’s x-position this frame. If boatX - animal.x < 0 (not yet reached) or >= boatSpeed (already passed), the function returns false immediately and no draw is made. This ensures each animal gets precisely one Bernoulli trial per traversal, regardless of playback speed.
Perpendicular distance
perpDist = |animal.y - arenaH / 2| (km)
If perpDist > W, the animal is outside the truncation strip and tryDetect returns false without a draw.
Bernoulli draw
If the one-shot condition is met and perpDist <= W, a single uniform draw u = detectionRng() is compared to the detection probability:
detected = u < g(perpDist, sigma)
where g is either halfNormal or hazardRate depending on the field truth function setting.
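The three steps above (one-shot trigger, truncation check, Bernoulli draw) can be sketched together. This illustration hard-codes the half-normal as field truth and passes the RNG in as an argument; the parameter layout is an assumption, not the signature used in src/ds-engine.js.

```javascript
// Half-normal detection function (hazard-rate omitted to keep the sketch short).
function halfNormal(x, sigma) {
  return Math.exp(-(x * x) / (2 * sigma * sigma));
}

// Sketch of a one-shot detection attempt. params = { arenaH, W, sigma }.
function tryDetect(animal, boatX, boatSpeed, params, rng) {
  const gap = boatX - animal.x;
  // One-shot rule: only the frame in which the observer crosses animal.x.
  if (gap < 0 || gap >= boatSpeed) return false;
  const perpDist = Math.abs(animal.y - params.arenaH / 2);
  // Hard truncation: no draw is made beyond W.
  if (perpDist > params.W) return false;
  // Single Bernoulli trial against the detection probability.
  return rng() < halfNormal(perpDist, params.sigma);
}
```

Because the trigger window has the same width as the per-frame step, exactly one frame satisfies it for each animal whatever the playback speed, provided the speed is not changed mid-run.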
Detection functions
Both functions are implemented in src/stats.js.
Half-normal
\[ g(x,\sigma) = \exp\!\left(-\frac{x^2}{2\sigma^2}\right) \]
- \(g(0,\sigma) = 1\) for all \(\sigma\): perfect detectability on the transect line.
- \(\sigma\) (km) is the distance at which detection probability falls to \(e^{-1/2} \approx 0.607\).
- Slider range: 0.05–0.50 km, step 0.01, default 0.25 km.
Hazard-rate
\[ g(x,\sigma,b) = 1 - \exp\!\left[-\left(\frac{\sigma}{x}\right)^b\right] \]
- At \(x = 0\): hazardRate returns 1 (special-cased to avoid division by zero).
- \(b\) controls the shape of the “shoulder” near the transect. Higher \(b\) gives a wider flat shoulder followed by a sharper drop.
- \(b\) slider range: 1.0–10.0, step 0.1, default 2.5.
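Both detection functions follow directly from the formulas above. The sketch below is a plain transcription, assuming the same argument order as the formulas; it is not copied from src/stats.js.

```javascript
// Half-normal: g(x, sigma) = exp(-x^2 / (2 sigma^2)).
function halfNormal(x, sigma) {
  return Math.exp(-(x * x) / (2 * sigma * sigma));
}

// Hazard-rate: g(x, sigma, b) = 1 - exp(-(sigma / x)^b), with g(0) = 1.
function hazardRate(x, sigma, b) {
  if (x === 0) return 1; // special case: avoids division by zero
  return 1 - Math.exp(-Math.pow(sigma / x, b));
}

// At x = sigma the half-normal gives exp(-1/2), about 0.607, and the
// hazard-rate gives 1 - exp(-1), about 0.632, for any b.
```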
State recording
When a detection fires, recordDetection(perpDist, boatX) is called:
- perpDist: perpendicular distance in km (raw, not yet filtered by W).
- boatX: observer x-position in km at the moment of detection.
Both values are appended to parallel arrays in the shared state singleton (ds/state.js):
state.detectedDistances[] // perpendicular distances, all raw detections
state.detectedEfforts[] // boatX at each detection
These arrays grow monotonically during a run. They are cleared on reset (resetState()).
Truncation and W post-hoc filtering
The truncation distance \(W\) has two roles:
- Hard truncation during detection: tryDetect returns false when perpDist > W. No draw is made; the animal is never recorded.
- Post-hoc filtering in analytics: the analytics layer refilters stored distances on every state change:
    filteredDists = detectedDistances.filter((d) => d <= W);
  This means if \(W\) is decreased after some detections have been stored, those beyond-W distances are excluded from the histogram, MLE, and D̂ history, even though they were recorded. If \(W\) is increased, detections from earlier in the run that were beyond the old W (but within the new W) will not appear, because they were never recorded in the first place.
The consequence is asymmetric: lowering \(W\) live re-analyses with a smaller sample; raising \(W\) live has no effect on data already collected.
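The asymmetry can be seen in a small sketch of the recording and refiltering steps. The state object here is a local stand-in for the shared singleton, the helper withinW is hypothetical, and the three detections are made-up values assumed to have been recorded with W = 0.50 km.

```javascript
// Local stand-in for the shared state singleton.
const state = { detectedDistances: [], detectedEfforts: [] };

// Append one detection to the parallel arrays.
function recordDetection(perpDist, boatX) {
  state.detectedDistances.push(perpDist);
  state.detectedEfforts.push(boatX);
}

// Hypothetical helper: refilter stored detections, keeping pairs aligned.
function withinW(W) {
  const dists = [];
  const efforts = [];
  state.detectedDistances.forEach((d, i) => {
    if (d <= W) {
      dists.push(d);
      efforts.push(state.detectedEfforts[i]);
    }
  });
  return { dists, efforts };
}

// Illustrative detections, all recorded while W was 0.50 km.
recordDetection(0.10, 0.8);
recordDetection(0.45, 1.9);
recordDetection(0.30, 2.6);

const lowered = withinW(0.20); // one detection survives
const raised = withinW(0.60);  // still three: nothing beyond 0.50 was stored
```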
ESW computation
Half-normal (analytical)
\[ \mathrm{ESW}(\sigma, W) = \sigma\sqrt{\frac{\pi}{2}}\;\mathrm{erf}\!\left(\frac{W}{\sigma\sqrt{2}}\right) \]
The error function is computed via the Abramowitz & Stegun polynomial approximation (max error \(1.5 \times 10^{-7}\)).
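A minimal sketch of the analytical ESW, using the Abramowitz & Stegun formula 7.1.26 for erf. The coefficients are the standard published ones; whether src/stats.js uses exactly this variant is an assumption.

```javascript
// Abramowitz & Stegun 7.1.26 approximation of erf (max error about 1.5e-7).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  // Horner evaluation of a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5.
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// Analytical half-normal ESW over [0, W].
function eswHalfNormal(sigma, W) {
  return sigma * Math.sqrt(Math.PI / 2) * erf(W / (sigma * Math.SQRT2));
}

// With the defaults sigma = 0.25 km and W = 0.50 km, ESW is about 0.299 km.
```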
Hazard-rate (numerical)
200-step trapezoid rule over \([0, W]\):
\[ \mathrm{ESW}(\sigma, b, W) \approx \sum_{i=0}^{199} \frac{g(x_i) + g(x_{i+1})}{2}\,\Delta x, \quad \Delta x = W/200 \]
where \(g\) is hazardRate(x, sigma, b).
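The trapezoid sum above translates into a short loop. This sketch assumes the hazardRate definition given earlier in the document; the function name eswHazardRate and the optional steps argument are illustrative.

```javascript
// Hazard-rate detection function with the x = 0 special case.
function hazardRate(x, sigma, b) {
  if (x === 0) return 1;
  return 1 - Math.exp(-Math.pow(sigma / x, b));
}

// Trapezoid-rule ESW over [0, W]; 200 steps by default as described above.
function eswHazardRate(sigma, b, W, steps = 200) {
  const dx = W / steps;
  let sum = 0;
  for (let i = 0; i < steps; i++) {
    const g0 = hazardRate(i * dx, sigma, b);
    const g1 = hazardRate((i + 1) * dx, sigma, b);
    sum += 0.5 * (g0 + g1) * dx; // area of one trapezoid
  }
  return sum;
}
```

A quick sanity check is that the result can never exceed W, and approaches W when sigma is very large relative to W (detection is then near-certain across the whole strip).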
MLE fitting
fitSigmaMLE in src/stats.js finds the \(\hat{\sigma}\) that minimises the negative log-likelihood. The analytics layer calls it when there are at least 3 within-W detections; fewer returns null and σ̂ is shown as “—”.
The search uses golden-section over a fixed interval \([\sigma_{\min}, \sigma_{\max}]\). The shape parameter \(b\) is not optimised; it is held at the current slider value throughout.
NLL for half-normal
\[ \mathrm{NLL}(\sigma) = n\log\bigl[\mathrm{ESW}(\sigma,W)\bigr] + \sum_{i=1}^{n}\frac{d_i^2}{2\sigma^2} \]
NLL for hazard-rate
\[ \mathrm{NLL}(\sigma) = n\log\bigl[\mathrm{ESW}(\sigma,b,W)\bigr] - \sum_{i=1}^{n}\log\bigl[g_{\mathrm{HR}}(d_i,\sigma,b)\bigr] \]
In both cases, \(d_i\) are the within-W perpendicular distances and ESW is computed by the respective method above.
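A sketch of the half-normal fit, combining the NLL above with a golden-section search. The search bounds of 0.01 and 2.0 km and the tolerance are hypothetical (the source only states that the interval is fixed), and the hazard-rate branch is omitted for brevity.

```javascript
// Abramowitz & Stegun 7.1.26 approximation of erf.
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function eswHalfNormal(sigma, W) {
  return sigma * Math.sqrt(Math.PI / 2) * erf(W / (sigma * Math.SQRT2));
}

// NLL(sigma) = n log ESW(sigma, W) + sum(d_i^2) / (2 sigma^2).
function nllHalfNormal(sigma, dists, W) {
  let sumSq = 0;
  for (const d of dists) sumSq += d * d;
  return dists.length * Math.log(eswHalfNormal(sigma, W)) + sumSq / (2 * sigma * sigma);
}

// Golden-section search for the minimiser of a unimodal f on [lo, hi].
function goldenSection(f, lo, hi, tol = 1e-6) {
  const invPhi = (Math.sqrt(5) - 1) / 2;
  let a = lo;
  let b = hi;
  let c = b - invPhi * (b - a);
  let d = a + invPhi * (b - a);
  let fc = f(c);
  let fd = f(d);
  while (b - a > tol) {
    if (fc < fd) {
      b = d; d = c; fd = fc;
      c = b - invPhi * (b - a); fc = f(c);
    } else {
      a = c; c = d; fc = fd;
      d = a + invPhi * (b - a); fd = f(d);
    }
  }
  return (a + b) / 2;
}

// Bounds are hypothetical placeholders for the fixed search interval.
function fitSigmaMLE(dists, W, sigmaMin = 0.01, sigmaMax = 2.0) {
  const within = dists.filter((d) => d <= W);
  if (within.length < 3) return null; // too few detections to fit
  return goldenSection((s) => nllHalfNormal(s, within, W), sigmaMin, sigmaMax);
}
```

Golden-section search needs only function values, which suits an NLL whose ESW term comes from an approximation or a numerical integral rather than a closed-form derivative.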
Density estimator
estimateDensity(n, L, sigma, W, modelFn, b) in src/stats.js returns:
\[ \hat{D} = \frac{n}{2L\,\mathrm{ESW}(\hat\sigma, W)} \]
where \(L\) is effort in km, \(n\) is the count of within-W detections, and ESW uses \(\hat\sigma\) if fitted, or the true \(\sigma\) slider value if fewer than 3 detections are available.
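As an illustrative check of the estimator (the detection count is chosen for the example and is not simulation output): with the default \(\sigma = 0.25\) km and \(W = 0.50\) km, the half-normal ESW is approximately 0.299 km. Supposing \(n = 120\) detections over a full transect of \(L = 4.0\) km:

\[ \hat{D} = \frac{120}{2 \times 4.0 \times 0.299} = \frac{120}{2.392} \approx 50.2 \ \text{animals per km}^2 \]

This is close to the default true density of 50 /km², as expected: a density of 50 /km² implies about \(50 \times 2 \times 4.0 \times 0.299 \approx 120\) detections on average.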
Per-detection effort in the convergence chart
The D̂ convergence chart does not use the final transect length as effort. Instead, for each successive within-W detection, the chart plots the density estimate using effort = boatX at the moment that detection was made. For the \(k\)-th within-W detection, \(L_k\) = observer x-position (km) when the \(k\)-th animal was detected.
This means the convergence trace shows how \(\hat{D}\) evolves as data accumulates during the survey, including the effect of early noise when \(n\) is small.
The full D̂ history array is rebuilt from scratch on every state-change event (each detection, and whenever W, σ, b, or model function changes).
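The per-detection history can be sketched as a single pass over the stored arrays. In this illustration the ESW is passed in as one fixed value; whether the source refits \(\hat\sigma\) at each step or applies the latest fit to the whole history is not stated, so that choice is an assumption, as is the function name buildDhatHistory.

```javascript
// Sketch: rebuild the D-hat history from stored detections.
// distances[i] and efforts[i] describe the i-th raw detection.
function buildDhatHistory(distances, efforts, W, esw) {
  const history = [];
  let k = 0; // running count of within-W detections
  for (let i = 0; i < distances.length; i++) {
    if (distances[i] > W) continue; // post-hoc truncation
    k += 1;
    const effort = efforts[i]; // boatX when the k-th animal was detected
    history.push({ k, effort, dhat: k / (2 * effort * esw) });
  }
  return history;
}

// Example: the second detection lies beyond W = 0.5 and is skipped.
const dhatHistory = buildDhatHistory([0.1, 0.6, 0.2], [1, 2, 3], 0.5, 0.25);
// dhatHistory[0].dhat is 2 (k = 1, effort 1 km)
// dhatHistory[1].dhat is about 1.33 (k = 2, effort 3 km)
```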
Analytics panels
Three panels are drawn by ds/analytics.js using D3. They subscribe to state and re-render on every change.
1. Detection function chart
Blue solid curve: the field truth detection function \(g(x, \sigma_{\rm true})\), drawn with the currently selected field truth function and the true \(\sigma\) value.
Pink dashed curve: the MLE-fitted function \(g(x, \hat\sigma)\), drawn with the model function when \(\hat\sigma\) is available (\(n \geq 3\)).
Histogram bars (light blue, 10 equal bins over \([0, W]\)): observed within-W distances. Bar heights are normalised to the \(g(x)\) scale using the formula:
scale = n * binWidth / ESW_truth
barHeight[i] = binCount[i] / scale
so that distances drawn exactly from the truth detection function (for example, a perfect half-normal) would make the histogram lie on the true curve.
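The normalisation above can be sketched as follows. The function name and the choice to place a distance exactly equal to W in the last bin are assumptions; the scaling itself follows the formula in the text.

```javascript
// Sketch: histogram bar heights on the g(x) scale.
function histogramBarHeights(distances, W, eswTruth, nBins = 10) {
  const binWidth = W / nBins;
  const counts = new Array(nBins).fill(0);
  for (const d of distances) {
    if (d > W) continue; // only within-W distances are binned
    const idx = Math.min(nBins - 1, Math.floor(d / binWidth));
    counts[idx] += 1;
  }
  const n = counts.reduce((acc, c) => acc + c, 0);
  if (n === 0) return counts; // nothing to scale
  const scale = (n * binWidth) / eswTruth;
  return counts.map((c) => c / scale);
}
```

A useful property of this scaling is that the total bar area (sum of heights times bin width) always equals ESW_truth, which is also the area under the truth curve on \([0, W]\).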
Ghost fitted curves: if “Keep previous runs” is checked, past fitted curves are shown as progressively more transparent pink dashed lines (up to 8 past runs retained, oldest most faded).
The x-axis domain updates to the current W value; the y-axis domain expands if bar heights exceed 1.1.
2. Running estimates strip
Displays seven values updated live:
| Field | Value |
|---|---|
| \(n\) | count of within-W detections |
| Effort | boatX progress (km); updates state.transectLength on each detection from detectedEfforts[last] (see note below) |
| ESW | computed from \(\hat\sigma\) (or true \(\sigma\) if no fit) and current W |
| \(\hat\sigma\) | MLE estimate (km), or “—” if \(n < 3\) |
| True \(\sigma\) | slider value |
| \(\hat{D}\) | last entry in the D̂ history |
| True \(D\) | trueD = areaN / (arenaWKm * arenaH) |
The “Effort” readout is the boatX at the time of the last detection, not the total traversed distance. During a run, it thus reflects distance surveyed up to the most recent animal found.
3. D̂ convergence chart
- Green line: D̂ vs cumulative within-W detections for the current run.
- Red dashed horizontal line: true D.
- Ghost traces: previous runs overlaid as faded green lines (same cap and fade logic as detection function ghosts).
- Axes use expand-only domain tracking; they grow when data goes outside the current range but never contract mid-run. On hard reset (no data, no ghosts) the axes snap back to their initial defaults.
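Expand-only domain tracking can be sketched in a few lines. The initial domain [0, 100] is a hypothetical placeholder (the actual defaults are not given here), and the function names are illustrative.

```javascript
// Hypothetical initial axis domain; the real defaults are not stated.
const INITIAL_DOMAIN = [0, 100];

// Grow the domain to cover new values; never shrink it.
function expandDomain(domain, values) {
  let [lo, hi] = domain;
  for (const v of values) {
    if (v < lo) lo = v;
    if (v > hi) hi = v;
  }
  return [lo, hi];
}

// On a hard reset (no data and no ghosts) snap back to the defaults.
function nextDomain(domain, values, hasGhosts) {
  if (values.length === 0 && !hasGhosts) return INITIAL_DOMAIN.slice();
  return expandDomain(domain, values);
}
```

Never contracting mid-run avoids the axis jumping back and forth while early, noisy estimates settle.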
Ghost / keep-runs mechanic
The “Keep previous runs” checkbox controls ghost retention. Ghosts are managed entirely within analytics.js; they are not stored in the shared state singleton.
When a reset is detected (state goes from \(n > 0\) to \(n = 0\)):
- If the checkbox is checked, the current run’s D̂ history and fitted \(\hat\sigma\) are saved as a ghost object: { dhatHistory, sigmaHat, trueD, modelFn, b }.
- The ghost list is capped at 8 entries (oldest removed when limit is exceeded).
- Ghost curves on the detection function chart record the model function active at the time of that run, so a run fitted with hazard-rate is always displayed with hazard-rate even if the model is later changed.
“Clear runs” empties the ghost list immediately and forces a redraw.
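The retention rule can be sketched as a capped list. The helper name pushGhost and the immutable-update style are assumptions; the cap of 8 and the ghost object fields come from the description above.

```javascript
const MAX_GHOSTS = 8;

// Sketch: save a finished run as a ghost, dropping the oldest beyond the cap.
// run = { dhatHistory, sigmaHat, trueD, modelFn, b }
function pushGhost(ghosts, run, keepPrevious) {
  if (!keepPrevious) return ghosts; // checkbox unchecked: nothing retained
  const next = [...ghosts, run];
  return next.length > MAX_GHOSTS ? next.slice(next.length - MAX_GHOSTS) : next;
}

// "Clear runs": start again from an empty list.
function clearGhosts() {
  return [];
}
```

Storing modelFn and b inside each ghost is what lets an old curve be redrawn with the function it was fitted under, independent of the current control settings.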
Control taxonomy
Live update (takes effect without reset)
These controls change simulation parameters immediately. Analytics are rebuilt from stored detections on the next state notification.
| Control | Variable | Effect |
|---|---|---|
| \(\sigma\) slider | sigma | Changes both the truth curve and MLE fitting |
| \(W\) slider | W | Restricts or expands the truncation strip; all stored detections are refiltered |
| Field truth function | truthFn | Changes which \(g(x)\) governs future detection draws and the blue truth curve |
| Model function | modelFn | Changes which \(g(x)\) is fitted by MLE and shown as the pink curve |
| Shape \(b\) | b | Affects hazard-rate detection and MLE (only visible when truth or model is hazard-rate) |
Note: changing sigma or truthFn live affects future detections for the current run (if still in progress), not past ones that are already stored.
Reset required (takes effect on next Reset or New Population)
| Control | Variable |
|---|---|
| Density \(D\) | density |
| Transect length \(L\) | arenaWKm |
| Distribution type | distribution |
| Clump scale \(s_{\rm clump}\) | clumpScale |
| Regularity \(\rho\) | regularity |
| Seed | seed |
Changing these updates the displayed readout immediately but the population is only regenerated when initSim() is called (via Reset, New Population, or Run Again after completion).
Reset Defaults
Resets all controls to their factory defaults, then calls Reset:
| Parameter | Default |
|---|---|
| \(\sigma\) | 0.25 km |
| \(W\) | 0.50 km |
| Density | 50 /km² |
| Transect length | 4.0 km |
| \(b\) | 2.5 |
| Field truth function | Half-normal |
| Model function | Half-normal |
| Distribution | Uniform |
| Clump scale | 0.10 |
| Regularity | 0.50 |
Assumptions and limitations
- Animals are static throughout a run. There is no movement model.
- The transect is a fixed straight line through the centre of the arena.
- Each animal receives exactly one Bernoulli trial per traversal. Multiple passes are not modelled.
- \(g(0) = 1\) holds in both half-normal and hazard-rate modes.
- The MLE holds \(b\) fixed; joint optimisation of \((\sigma, b)\) is not implemented.
- The hazard-rate ESW uses a fixed 200-step numerical integration; for very small \(b\) or extreme parameter combinations, this approximation may be imprecise.
- The detector sees perpendicular distance only: no forward detection angle or acoustic range model.
- There is no measurement error; recorded distances are exact.