Calculating Catch Per Unit Effort In R

Comprehensive Guide to Calculating Catch Per Unit Effort (CPUE) in R

Catch Per Unit Effort (CPUE) is a fundamental metric in fisheries science that standardizes catch data to account for varying levels of fishing effort. This guide provides a complete walkthrough of CPUE calculation methods using R, including statistical considerations, data visualization techniques, and practical applications for fisheries management.

1. Understanding CPUE Fundamentals

CPUE represents the amount of fish caught per unit of fishing effort, typically expressed as:

CPUE = Total Catch / Total Effort

Where:

  • Total Catch: Number or weight of fish captured
  • Total Effort: Quantified fishing activity (hours, nets, hooks, etc.)

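CPUE serves as a relative index of fish abundance, assuming that catchability (the probability of catching a fish given it’s present) remains constant. This assumption is critical: if catchability changes over time, for example through gear improvements or shifts in fish behavior, a trend in CPUE no longer reflects a trend in abundance.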
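
The effect of this assumption can be illustrated with a small simulation. The sketch below assumes the standard relation CPUE = q × N, where q is catchability and N is abundance. The abundance series, the value of q, and the 3% annual increase in q are hypothetical numbers chosen only for illustration.

```r
# Hypothetical abundance series: a stock declining 10% per year for 10 years
years <- 1:10
abundance <- 100000 * 0.9^(years - 1)

# Case 1: constant catchability (assumed value)
q_constant <- 1e-4
cpue_constant <- q_constant * abundance

# Case 2: catchability creeping up 3% per year (e.g. better gear), also assumed
q_creep <- 1e-4 * 1.03^(years - 1)
cpue_creep <- q_creep * abundance

# Relative indices, each scaled to its first year
index_true <- abundance / abundance[1]
index_constant <- cpue_constant / cpue_constant[1]
index_creep <- cpue_creep / cpue_creep[1]

round(data.frame(years, index_true, index_constant, index_creep), 3)
```

With constant q the CPUE index tracks abundance exactly. With creeping q the index declines more slowly than the stock, so the decline is understated.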

2. Data Requirements for CPUE Analysis

To calculate CPUE effectively, you’ll need:

  1. Catch Data: Species-specific counts or weights
  2. Effort Data: Quantified fishing activity metrics
  3. Environmental Covariates (optional): Depth, temperature, time of day
  4. Spatial Data (optional): GPS coordinates, fishing zones
  5. Temporal Data: Date/time stamps for time-series analysis

| Data Type | Example Metrics | Collection Method |
|---|---|---|
| Catch Data | Number of fish, total weight, size distribution | Observer programs, logbooks, electronic monitoring |
| Effort Data | Fishing hours, number of nets, hook counts | Vessel monitoring systems, logbook entries |
| Environmental | Water temperature, salinity, oxygen levels | CTD sensors, satellite data, buoy networks |
| Spatial | Latitude/longitude, depth, habitat type | GPS devices, echo sounders, GIS mapping |
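
Before any CPUE is computed, it can help to screen the raw records for values that cannot produce a valid rate. The following is a minimal sketch, assuming a data frame with columns named catch and effort_hours. The trips data and the validate_cpue_data() helper are hypothetical and exist only for this illustration.

```r
# Hypothetical raw trip records with some typical problems
trips <- data.frame(
  trip_id = 1:6,
  catch = c(45, 0, 38, NA, 49, 12),
  effort_hours = c(12, 8, 0, 14, 11, -3)
)

# Hypothetical helper: flag rows that can yield a valid CPUE
# (non-missing values, non-negative catch, strictly positive effort)
validate_cpue_data <- function(df) {
  df$valid <- !is.na(df$catch) & !is.na(df$effort_hours) &
    df$catch >= 0 & df$effort_hours > 0
  df
}

checked <- validate_cpue_data(trips)
checked

# Keep valid rows only; zero catches are retained because they carry information
clean <- checked[checked$valid, ]
clean$cpue <- clean$catch / clean$effort_hours
clean
```

Rows with missing values or non-positive effort are dropped, while the zero-catch trip stays in the data with a CPUE of zero.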

3. Basic CPUE Calculation in R

The simplest CPUE calculation in R uses base functions:

# Sample data
total_catch <- 450  # number of fish
total_effort <- 12   # fishing hours

# Basic CPUE calculation
basic_cpue <- total_catch / total_effort
basic_cpue  # Returns 37.5 fish per hour
        

For more robust analysis with real-world data:

# Create a data frame with fishing trip data
fishing_data <- data.frame(
  trip_id = 1:10,
  catch = c(45, 62, 38, 55, 49, 67, 52, 41, 58, 63),
  effort_hours = c(12, 15, 10, 14, 11, 16, 13, 9, 14, 15),
  depth = c(45, 60, 35, 50, 40, 65, 55, 30, 52, 60),
  temperature = c(12.5, 11.8, 13.2, 12.0, 12.7, 11.5, 12.3, 13.0, 11.9, 11.7)
)

# Calculate CPUE for each trip
fishing_data$cpue <- fishing_data$catch / fishing_data$effort_hours

# View results
head(fishing_data)
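
Per-trip CPUE values can be summarized in more than one way, and the choice affects the result. The sketch below compares the ratio of sums (total catch over total effort) with the mean of per-trip ratios, and attaches a percentile bootstrap interval to the former. The catch and effort values are copied from the trip data above; the seed and the 2000 resamples are arbitrary choices.

```r
# Catch and effort values copied from the trip data above
catch <- c(45, 62, 38, 55, 49, 67, 52, 41, 58, 63)
effort_hours <- c(12, 15, 10, 14, 11, 16, 13, 9, 14, 15)

# Ratio of sums: total catch over total effort (weights trips by effort)
cpue_ratio <- sum(catch) / sum(effort_hours)

# Mean of ratios: average of per-trip CPUE (weights trips equally)
cpue_mean <- mean(catch / effort_hours)

# Percentile bootstrap interval for the ratio estimator, resampling whole trips
set.seed(42)
n_boot <- 2000
boot_ratio <- replicate(n_boot, {
  idx <- sample(seq_along(catch), replace = TRUE)
  sum(catch[idx]) / sum(effort_hours[idx])
})
ci_95 <- quantile(boot_ratio, probs = c(0.025, 0.975))

c(ratio_of_sums = cpue_ratio, mean_of_ratios = cpue_mean)
ci_95
```

The two estimators agree closely here because effort varies little between trips; they can diverge when a few long trips dominate total effort.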
        

4. Advanced CPUE Analysis Techniques

For more sophisticated analyses, consider these approaches:

4.1 Generalized Linear Models (GLMs)

GLMs account for non-normal distributions common in fisheries data:

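# GLM for count data; the quasi-Poisson family allows for overdispersion.
# log(effort) enters as an offset so the covariates act on catch rate (CPUE)
glm_model <- glm(catch ~ depth + temperature + offset(log(effort_hours)),
                  data = fishing_data,
                  family = quasipoisson())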

# Or for zero-inflated data:
# library(pscl)
# zeroinfl_model <- zeroinfl(catch ~ effort_hours + depth | temperature,
#                            data = fishing_data, dist = "negbin")

summary(glm_model)
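
Whether a Poisson, quasi-Poisson, or negative binomial family is appropriate depends largely on overdispersion. One rough check, sketched below, is the Pearson dispersion statistic from a Poisson fit. The data are copied from the trip example above, and the use of log(effort) as an offset is an assumption of this sketch.

```r
# Trip data copied from the example above
trip_data <- data.frame(
  catch = c(45, 62, 38, 55, 49, 67, 52, 41, 58, 63),
  effort_hours = c(12, 15, 10, 14, 11, 16, 13, 9, 14, 15),
  depth = c(45, 60, 35, 50, 40, 65, 55, 30, 52, 60)
)

# Poisson GLM with log(effort) as an offset, so coefficients act on catch rate
pois_model <- glm(catch ~ depth + offset(log(effort_hours)),
                  data = trip_data, family = poisson())

# Pearson dispersion statistic: sum of squared Pearson residuals
# divided by the residual degrees of freedom
dispersion <- sum(residuals(pois_model, type = "pearson")^2) /
  df.residual(pois_model)
dispersion
```

Values near 1 are consistent with a Poisson model. Values well above 1 suggest overdispersion, in which case a quasi-Poisson or negative binomial family (for example via MASS::glm.nb) is usually a better choice.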
        

4.2 Standardization Methods

Standardization accounts for variables affecting catchability:

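# FLQuant() is provided by the FLCore package of the FLR project (not on CRAN)
# install.packages("FLCore", repos = "https://flr-project.org/R")
library(FLCore)

# Store catch and effort as FLQuant objects, one value per trip.
# fishing_data has no year or quarter column, so trip_id stands in
# for the year dimension in this small example.
catch_flq <- FLQuant(fishing_data$catch, dim = c(1, 10),
                      dimnames = list(year = fishing_data$trip_id))

effort_flq <- FLQuant(fishing_data$effort_hours, dim = c(1, 10),
                       dimnames = list(year = fishing_data$trip_id))

# Element-wise ratio gives nominal CPUE as an FLQuant; standardizing it
# still requires a model that adjusts for covariates
cpue_flq <- catch_flq / effort_flq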
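
A common way to standardize CPUE is to fit a GLM with a year effect plus the factors that influence catchability, then predict the catch rate for each year at fixed reference conditions. The sketch below uses base R only and simulated data; the gear types, effect sizes, and sample size are all assumptions made for illustration.

```r
# Simulated trips over five years with two gear types (all values hypothetical)
set.seed(1)
n <- 300
sim <- data.frame(
  year = factor(sample(2018:2022, n, replace = TRUE)),
  gear = factor(sample(c("gillnet", "trawl"), n, replace = TRUE)),
  effort_hours = runif(n, 5, 20)
)

# Assumed true effects: abundance falls 10% per year,
# and trawls catch twice as much per hour as gillnets
year_effect <- log(0.9) * (as.integer(sim$year) - 1)
gear_effect <- ifelse(sim$gear == "trawl", log(2), 0)
sim$catch <- rpois(n, lambda = sim$effort_hours *
                     exp(log(3) + year_effect + gear_effect))

# Standardization model: year and gear effects on catch rate
std_model <- glm(catch ~ year + gear + offset(log(effort_hours)),
                 data = sim, family = poisson())

# Predict catch per hour for each year at a reference gear and one hour of effort
ref_grid <- data.frame(
  year = factor(levels(sim$year), levels = levels(sim$year)),
  gear = factor("gillnet", levels = levels(sim$gear)),
  effort_hours = 1
)
ref_grid$std_cpue <- predict(std_model, newdata = ref_grid, type = "response")

# Relative index scaled to the first year
ref_grid$index <- ref_grid$std_cpue / ref_grid$std_cpue[1]
ref_grid
```

Because gear is held at a single reference level, changes in the index reflect the estimated year effect rather than shifts in the mix of gear used each year.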
        

4.3 Delta Models

For data with many zeros (common in fisheries):

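# Two-part (delta) model: presence/absence and positive catches.
# Both parts use base R glm(), so no extra package is needed.
# Note: the example fishing_data contains no zero catches, so Part 1 is
# degenerate here (glm() will warn); real data should include zeros.

# Part 1: Probability of a non-zero catch (logistic)
part1 <- glm(I(catch > 0) ~ depth + temperature,
             data = fishing_data,
             family = binomial)

# Part 2: Positive catches (gamma GLM)
part2 <- glm(catch ~ effort_hours + depth,
             data = subset(fishing_data, catch > 0),
             family = Gamma(link = "log"))

# Combine predictions: expected catch = P(catch > 0) * E[catch | catch > 0]
p_positive <- predict(part1, newdata = fishing_data, type = "response")
mu_positive <- predict(part2, newdata = fishing_data, type = "response")

# Expected CPUE is the expected catch divided by effort
fishing_data$delta_cpue <- p_positive * mu_positive / fishing_data$effort_hours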
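
The two-part logic is easier to see on data that actually contain zeros. The following sketch simulates zero-heavy CPUE records and fits both parts with base R. The data-generating process (encounter probability rising with depth, gamma-distributed positive rates with a mean of 4 fish/hour) is entirely hypothetical.

```r
# Simulated records with many zero catches (all values hypothetical)
set.seed(7)
n <- 200
zero_data <- data.frame(depth = runif(n, 20, 80))

# Assumed process: the chance of encountering fish rises with depth,
# and positive catch rates follow a gamma distribution with mean 4
p_encounter <- plogis(-2 + 0.05 * zero_data$depth)
encountered <- rbinom(n, 1, p_encounter)
positive_cpue <- rgamma(n, shape = 2, rate = 0.5)
zero_data$cpue <- encountered * positive_cpue

# Part 1: probability of a non-zero catch
occ_model <- glm(I(cpue > 0) ~ depth, data = zero_data, family = binomial())

# Part 2: CPUE given a non-zero catch
pos_model <- glm(cpue ~ depth, data = subset(zero_data, cpue > 0),
                 family = Gamma(link = "log"))

# Expected CPUE = P(catch > 0) * E[CPUE | catch > 0]
zero_data$expected_cpue <-
  predict(occ_model, newdata = zero_data, type = "response") *
  predict(pos_model, newdata = zero_data, type = "response")

mean(zero_data$cpue == 0)          # share of zero records
summary(zero_data$expected_cpue)
```

Multiplying the two predictions gives an expected CPUE that accounts for both how often fish are caught and how many are caught when they are.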
        

5. Visualizing CPUE Data

Effective visualization is crucial for interpreting CPUE trends:

library(ggplot2)

# Basic CPUE vs Effort plot
ggplot(fishing_data, aes(x = effort_hours, y = cpue)) +
  geom_point(size = 3, color = "#2563eb") +
  geom_smooth(method = "lm", color = "#ef4444", se = FALSE) +
  labs(title = "Relationship Between Effort and CPUE",
       x = "Fishing Hours",
       y = "CPUE (fish/hour)") +
  theme_minimal()

# CPUE across successive trips (nominal CPUE, not yet standardized)
# linewidth replaces size for lines in ggplot2 3.4.0 and later
ggplot(fishing_data, aes(x = trip_id, y = cpue, group = 1)) +
  geom_line(color = "#2563eb", linewidth = 1) +
  geom_point(color = "#2563eb", size = 3) +
  geom_smooth(method = "loess", color = "#ef4444", span = 0.75) +
  labs(title = "CPUE Trends Across Fishing Trips",
       x = "Trip Number",
       y = "CPUE (fish/hour)") +
  theme_minimal()
        

6. Interpreting CPUE Results

Proper interpretation requires understanding:

  • Temporal Patterns: Seasonal or annual variations
  • Spatial Variations: Differences between fishing zones
  • Fishing Gear Effects: Impact of different gear types
  • Environmental Influences: Temperature, depth, currents
  • Biological Factors: Fish behavior, migration patterns

Example CPUE Interpretation Guide

| CPUE Value | Effort Level | Likely Interpretation | Management Implications |
|---|---|---|---|
| > 50 fish/hour | Low (5-10 hours) | High abundance, easily catchable | Potential for increased quotas |
| 20-50 fish/hour | Moderate (10-20 hours) | Stable population, sustainable levels | Maintain current regulations |
| 5-20 fish/hour | High (20+ hours) | Declining population or low catchability | Consider reduced effort or gear restrictions |
| < 5 fish/hour | Very High | Severely depleted or highly dispersed | Immediate conservation measures needed |

7. Common Pitfalls and Solutions

Avoid these frequent mistakes in CPUE analysis:

  1. Ignoring Zero Catch Data: Use delta models or zero-inflated distributions
  2. Assuming Constant Catchability: Incorporate environmental covariates
  3. Pooling Heterogeneous Data: Stratify by gear type, area, or season
  4. Neglecting Effort Misreporting: Validate with observer data
  5. Overlooking Size Composition: Analyze length-frequency data
  6. Disregarding Spatial Autocorrelation: Use geostatistical methods
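
The pooling pitfall can be made concrete with a few lines of base R. The sketch below compares a pooled CPUE with CPUE stratified by gear type; the gear_data records are hypothetical and chosen so that the two gears have very different catch rates.

```r
# Hypothetical trip records from two gear types with different catch rates
gear_data <- data.frame(
  gear = c("gillnet", "gillnet", "gillnet", "trawl", "trawl", "trawl"),
  catch = c(20, 25, 18, 120, 150, 135),
  effort_hours = c(10, 12, 9, 10, 12, 11)
)

# Pooled CPUE ignores gear type
pooled_cpue <- sum(gear_data$catch) / sum(gear_data$effort_hours)

# Stratified CPUE: total catch over total effort within each gear type
totals <- aggregate(cbind(catch, effort_hours) ~ gear,
                    data = gear_data, FUN = sum)
totals$cpue <- totals$catch / totals$effort_hours

pooled_cpue
totals
```

The pooled value falls between the two gear-specific rates and describes neither fleet, so a change in the gear mix alone would shift it even if abundance stayed the same.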

8. CPUE in Fisheries Management

CPUE data informs critical management decisions:

  • Stock Assessment: Estimating population size and trends
  • Quota Setting: Determining sustainable catch limits
  • Season Timing: Optimizing fishing periods
  • Gear Regulations: Evaluating gear efficiency and selectivity
  • Closed Areas: Identifying essential fish habitats
  • Bycatch Monitoring: Assessing non-target species impact

International organizations like the FAO and regional fisheries management organizations (RFMOs) rely heavily on standardized CPUE data for transboundary stock assessments.

9. Advanced Topics in CPUE Analysis

9.1 Bayesian Approaches

Bayesian methods incorporate prior knowledge and quantify uncertainty:

library(rstanarm)

bayes_model <- stan_glm(catch ~ effort_hours + depth + temperature,
                        data = fishing_data,
                        family = poisson,
                        prior_intercept = normal(0, 2.5, autoscale = TRUE),
                        chains = 4, iter = 5000)

summary(bayes_model)
plot(bayes_model)
        

9.2 Machine Learning Applications

For complex, non-linear relationships:

library(xgboost)

# Prepare data
dtrain <- xgb.DMatrix(
  data = as.matrix(fishing_data[, c("effort_hours", "depth", "temperature")]),
  label = fishing_data$catch
)

# Train model
xgb_params <- list(
  objective = "reg:squarederror",
  eval_metric = "rmse",
  eta = 0.1,
  max_depth = 6,
  subsample = 0.8,
  colsample_bytree = 0.8
)

xgb_model <- xgb.train(
  params = xgb_params,
  data = dtrain,
  nrounds = 100,
  verbose = 0
)

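# The model is trained on catch, so predict catch first,
# then divide by effort to obtain predicted CPUE
fishing_data$catch_pred <- predict(xgb_model, dtrain)
fishing_data$cpue_pred <- fishing_data$catch_pred / fishing_data$effort_hours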
        

9.3 Spatio-Temporal Models

For data with spatial and temporal components:

library(INLA)

# Example using R-INLA for spatio-temporal modeling
# Requires proper spatial and temporal indexing
# formula <- catch ~ effort_hours +
#           f(space, model = "iid") +
#           f(time, model = "ar1") +
#           f(space_time, model = "iid")

# result <- inla(formula, family = "poisson", data = spatio_temp_data)
        

10. Software and Package Recommendations

Essential R packages for CPUE analysis:

| Package | Purpose | Key Functions |
|---|---|---|
| FLCore | Fisheries stock assessment (FLR project) | FLStock(), FLIndex(), FLQuant() |
| ggplot2 | Data visualization | ggplot(), geom_point(), facet_wrap() |
| mgcv | GAMs for non-linear relationships | gam(), s(), te() |
| pscl | Zero-inflated models | zeroinfl(), hurdle() |
| brms | Bayesian regression | brm(), pp_check(), plot() |
| sf | Spatial data handling | st_read(), st_transform(), st_intersects() |
| lubridate | Date/time handling | ymd(), year(), month() |

11. Case Study: Atlantic Cod CPUE Analysis

This practical example walks through a complete CPUE analysis for Atlantic cod in the Gulf of Maine, using simulated data in place of real survey records:

# Load required packages
library(tidyverse)
library(lubridate)
library(mgcv)

# Simulated Atlantic cod data (2010-2022)
set.seed(123)
cod_data <- data.frame(
  date = seq(ymd("2010-01-15"), ymd("2022-12-15"), by = "3 months"),
  latitude = runif(52, 41, 44),
  longitude = runif(52, -71, -67),
  depth = sample(50:200, 52, replace = TRUE),
  temperature = rnorm(52, 8, 2),
  effort_hours = sample(8:24, 52, replace = TRUE),
  cod_catch = rpois(52, lambda = 20)
)

# Calculate basic CPUE
cod_data$cpue <- cod_data$cod_catch / cod_data$effort_hours

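# Add temporal variables (year as a factor so it can act as a random intercept)
cod_data <- cod_data %>%
  mutate(
    year = year(date),
    year_f = factor(year),
    quarter = factor(quarter(date)),
    month = month(date)
  )

# GAM with spatial and temporal components.
# Basis sizes (k) are kept small because there are only 52 observations;
# the default sizes would give more coefficients than data.
# The data are quarterly, so season enters as a quarter factor: a cyclic
# smooth of month cannot be fitted with only four distinct months.
cod_gam <- gam(cod_catch ~ s(effort_hours, k = 5) + s(depth, k = 5) +
                 s(temperature, k = 5) +
                 s(latitude, longitude, k = 10) + s(year_f, bs = "re") +
                 quarter,
               data = cod_data,
               family = nb(),
               method = "REML")

# Summary and diagnostics
summary(cod_gam)
plot(cod_gam, pages = 1, scheme = 2)

# Predict standardized CPUE (fitted catch per hour of effort)
cod_data$std_cpue <- predict(cod_gam, type = "response") / cod_data$effort_hours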

# Visualize trends
ggplot(cod_data, aes(x = date, y = std_cpue)) +
  geom_line(color = "#2563eb") +
  geom_point(color = "#2563eb") +
  geom_smooth(method = "loess", color = "#ef4444", span = 0.75) +
  labs(title = "Standardized CPUE for Atlantic Cod (2010-2022)",
       x = "Date",
       y = "Standardized CPUE (fish/hour)") +
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5))
        

12. Future Directions in CPUE Research

Emerging technologies and methods are enhancing CPUE analysis:

  • Electronic Monitoring: Camera systems for accurate effort documentation
  • Machine Learning: Improved pattern recognition in complex datasets
  • Genetic Stock ID: Linking CPUE to specific populations
  • Real-time Reporting: Mobile apps for immediate data collection
  • Integration with Ecosystem Models: Holistic assessment approaches
  • Autonomous Vehicles: Unmanned systems for data collection
  • Blockchain Technology: Secure, transparent data sharing

The field continues to evolve with increasing computational power and interdisciplinary collaboration between fisheries scientists, statisticians, and data scientists.
