Predictive Models


Swim advisories or closings are issued by beach managers on the basis of standards for concentrations of bacterial indicators—Escherichia coli (E. coli) or enterococci for freshwaters and enterococci for marine waters. The analytical methods for these organisms, however, take at least 18–24 hours to complete. Recreational water-quality conditions may change during this time, leading to erroneous assessments of public-health risk. As a result, some agencies have turned to predictive modeling to obtain near-real-time estimates of recreational water quality. Predictive models, developed through statistical techniques such as multiple linear regression (MLR), use easily measured environmental and water-quality variables to estimate bacterial-indicator concentrations or the probability of exceeding target concentrations.
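As a rough illustration of the MLR approach described above, the sketch below fits a least-squares model that estimates log10 E. coli concentration from two explanatory variables. The variable choices (log10 turbidity and 24-hour rainfall), the data values, and the function names are all hypothetical and are not taken from the Ohio Nowcast models. A production model would normally be built with a statistics package rather than this hand-written solver.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_mlr(X, y):
    """Return least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend a column of ones for the intercept
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(XtX, Xty)


def predict(coef, x):
    """Apply fitted coefficients to one observation of explanatory variables."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


# Hypothetical training data: [log10 turbidity, 24-hour rainfall in inches]
X = [[0.5, 0.0], [1.0, 0.2], [1.5, 0.0], [2.0, 1.0], [1.2, 0.5], [0.8, 0.1]]
y = [1.2, 1.9, 2.1, 3.2, 2.3, 1.6]  # made-up log10 E. coli concentrations

coef = fit_mlr(X, y)
estimate_log10 = predict(coef, [1.3, 0.4])
estimate_cfu = 10 ** estimate_log10  # back-transform to CFU/100 mL
```

The solver assumes the explanatory variables are not perfectly collinear; with collinear inputs the normal equations become singular and the division fails.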

Ohio Nowcast
At eight Lake Erie beaches and one recreational river, model-based predictions are available to the public during the recreational season (May through August) through an Internet-based system called the Ohio Nowcast.

The nowcast is like a weather forecast, in that it provides the probability (in percent) that the bathing-water standard for E. coli will be exceeded. (The Ohio single-sample bathing-water standard for E. coli is 235 colony-forming units per 100 milliliters.) So on any given morning, there could be a 1- to 100-percent probability that the standard will be exceeded. How does one know when the probability presents too great a risk to go swimming? Would you go swimming if there were an 80-percent probability that the standard would be exceeded? What about a 25-percent chance? To help out, beach managers established threshold probabilities for their beaches based on historical data. If the probability is greater than or equal to the threshold, then the beach is posted with an advisory on the Ohio Nowcast.
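The posting rule above can be sketched in a few lines. The source does not say how the nowcast derives its probabilities, so the first function below rests on an assumption: that a regression predicts log10 E. coli with normally distributed errors of known standard deviation. The function names and the example threshold are hypothetical. Only the 235 CFU/100 mL standard and the "greater than or equal to" comparison come from the text.

```python
import math

STANDARD_CFU = 235.0  # Ohio single-sample bathing-water standard, CFU/100 mL


def exceedance_probability(predicted_log10, residual_sd):
    """Probability (0 to 1) that the concentration exceeds the standard.

    Assumes the prediction error in log10 units is normally distributed
    with standard deviation residual_sd (an assumption, not the documented method).
    """
    z = (math.log10(STANDARD_CFU) - predicted_log10) / residual_sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


def post_advisory(probability_pct, threshold_pct):
    """Post an advisory when the probability meets or exceeds the beach threshold."""
    return probability_pct >= threshold_pct


# Hypothetical morning: predicted log10 E. coli of 2.1, residual SD of 0.5,
# and a beach-specific threshold of 30 percent.
probability_pct = 100 * exceedance_probability(2.1, 0.5)
advisory = post_advisory(probability_pct, 30.0)
```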

How did the nowcast system perform in past years? To find out, refer to the Ohio Nowcast website (click the About icon).

The USGS and its partners will continue to work to improve the predictive abilities of the Ohio Nowcast models.

How can we develop models for our beaches?
To learn how to develop predictive models for your beaches step by step, see the techniques report. The steps are data collection; exploratory data analysis; model development, selection, and diagnosis; determination of model output values; and model validation and refinement.

The U.S. Environmental Protection Agency developed a free software program, called Virtual Beach, that enables beach managers and others to develop or update models using statistical techniques. The software is user friendly and can be used by those without a strong statistics background.

The USGS designed a spreadsheet to automate the compilation of lake-level data retrieved from the nearest offshore buoy operated by the National Oceanic and Atmospheric Administration. The spreadsheet organizes hourly lake-level data and calculates the change in lake level over 24 hours. It is available as Appendix 1 of a USGS Scientific Investigations Report. Contact Amie Brady for more information.
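The 24-hour change calculation can be illustrated outside a spreadsheet as well. The sketch below is not the USGS spreadsheet logic. It assumes hourly readings keyed by timestamp, with made-up level values in meters, and it skips any hour that has no reading exactly 24 hours earlier.

```python
from datetime import datetime, timedelta


def lake_level_change_24h(readings):
    """Return {timestamp: level minus the level 24 hours earlier}.

    readings maps hourly datetime objects to lake level. Hours without a
    matching reading 24 hours earlier are omitted from the result.
    """
    changes = {}
    for timestamp, level in readings.items():
        earlier = readings.get(timestamp - timedelta(hours=24))
        if earlier is not None:
            changes[timestamp] = level - earlier
    return changes


# Hypothetical hourly record: 25 readings rising 0.01 m per hour
start = datetime(2020, 6, 1, 0, 0)
readings = {start + timedelta(hours=h): 174.00 + 0.01 * h for h in range(25)}
changes = lake_level_change_24h(readings)
```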

The USGS designed a software routine, called PROCESSNOAA, to automate the compilation of weather data from the nearest National Weather Service airport site. The software processes hourly rainfall, wind direction and speed, and barometric pressure, and provides lagged and weighted rainfall variables. It is available as Appendix 2 of a USGS Scientific Investigations Report. Contact Donna Francy for more information.
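To make "lagged and weighted rainfall variables" concrete, here is a minimal sketch. It does not reproduce PROCESSNOAA. The function names, the input layout (a list of 24-hour rainfall totals, oldest first), and the weights are hypothetical choices for illustration only.

```python
def lagged_rainfall(daily_totals, lag_days):
    """Rainfall total for the period lag_days before the most recent one.

    daily_totals lists 24-hour totals with the oldest first, so
    lag_days=0 returns the most recent total.
    """
    return daily_totals[-1 - lag_days]


def weighted_rainfall(daily_totals, weights):
    """Weighted sum of recent totals; weights[0] applies to the most recent period."""
    most_recent_first = daily_totals[::-1][:len(weights)]
    return sum(w * r for w, r in zip(weights, most_recent_first))


# Hypothetical totals in inches for the past three 24-hour periods, oldest first
daily_totals = [0.0, 0.5, 1.0]
rain_lag1 = lagged_rainfall(daily_totals, 1)            # total from one period back
rain_w48 = weighted_rainfall(daily_totals, (2.0, 1.0))  # recent rain counts double
```

Weighting recent rainfall more heavily reflects the idea that runoff from the latest storm is more likely to affect the current sample than older rainfall. The weights that work best for a given beach would have to come from exploratory data analysis.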

Collecting better data for predictive models
Predictive modeling is a dynamic process meant to augment existing beach-monitoring programs. Models should be continuously validated and refined to improve predictions.
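One common way to validate a model of this kind is to compare its advisory decisions with the exceedances later confirmed by laboratory results. The sketch below computes the overall fraction of correct responses, sensitivity, and specificity. Whether the Ohio Nowcast is evaluated with exactly these measures is an assumption, and the function name and sample data are hypothetical.

```python
def validation_summary(predicted_exceed, measured_exceed):
    """Compare predicted and measured exceedances (parallel lists of booleans)."""
    pairs = list(zip(predicted_exceed, measured_exceed))
    true_pos = sum(1 for p, m in pairs if p and m)
    true_neg = sum(1 for p, m in pairs if not p and not m)
    false_pos = sum(1 for p, m in pairs if p and not m)
    false_neg = sum(1 for p, m in pairs if not p and m)
    exceed_days = true_pos + false_neg
    clean_days = true_neg + false_pos
    return {
        # fraction of all days on which the model matched the measurement
        "correct": (true_pos + true_neg) / len(pairs),
        # fraction of measured exceedances that the model predicted
        "sensitivity": true_pos / exceed_days if exceed_days else None,
        # fraction of measured non-exceedances that the model predicted
        "specificity": true_neg / clean_days if clean_days else None,
    }


# Hypothetical season of four sampling days
predicted = [True, False, True, False]
measured = [True, False, False, True]
summary = validation_summary(predicted, measured)
```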

The USGS Ohio Water Science Center is working with the Lake County General Health District to collect local weather data at Mentor Headlands State Park, Ohio. A USGS-operated weather station measures wind speed, wind direction, barometric pressure, air temperature, net solar radiation, incident light, and rainfall. Data are available in real time for USGS station number 414514081174400.

We are also working to identify additional explanatory variables to include in the models. For example, a sensor to measure photosynthetically active radiation (PAR) was installed at Huntington. Increased sunlight, as measured by PAR, has been shown to result in decreased levels of E. coli. At Edgewater, a temporary piezometer (shallow water well) was installed 20 feet inland from the edge of the water during each recreational season. It is equipped with a pressure transducer and data logger that measure and record water levels every 30 minutes. Edgewater is a gently sloping beach with reservoirs of E. coli (presumably from bird populations) in the sand, and shallow groundwater there is a potential source of contamination to the lake. Including a variable that quantifies the interaction of shallow groundwater with lake water, such as the water level measured in the piezometer, may help to improve model performance at Edgewater.