# The importance of duplicates and triplicates

**Pond replicates: duplicates, triplicates and robust experimental results**

Owning replicate algae raceways saves time and money by decreasing uncertainty in your data. Our experience is that triplicates are the optimum given the natural variability in algae cultures.

Algae raceway ponds are complex biological systems supporting diverse cultures of algae, bacteria, and other organisms. The complexity of this environment introduces significant variability, especially when coupled with weather variations such as wind, temperature, precipitation, and solar insolation.

Ponds run in triplicate offer several advantages over ponds run in duplicate. The first advantage is that triplicate ponds allow clear identification of a pond operating outside the expected range. For example, if three ponds are run in exactly the same way, and two show a productivity of 25 ± 2 g/m²-day while the third measures 85 g/m²-day, the aberrant pond is easy to identify. This pond can then be carefully evaluated to determine what conditions (if any) are leading to the much higher productivity. Triplicate ponds are therefore an important evaluative tool for identifying measurements that fall outside the expected range.
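The check described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `flag_aberrant`, the use of the median as the reference value, and the tolerance of 4 g/m2-day are all assumptions made for the example.

```python
from statistics import median

def flag_aberrant(productivities, tolerance):
    """Return the indices of ponds whose productivity differs from the
    median of all ponds by more than `tolerance` (same units as the data)."""
    mid = median(productivities)
    return [i for i, p in enumerate(productivities) if abs(p - mid) > tolerance]

# Illustrative values in g/m2-day, echoing the example in the text.
# The tolerance is a hypothetical choice, not a recommended threshold.
triplicate = [24.0, 26.0, 85.0]
duplicate = [24.0, 85.0]

print(flag_aberrant(triplicate, tolerance=4.0))  # only the third pond is flagged
print(flag_aberrant(duplicate, tolerance=4.0))   # both ponds are flagged
```

With three ponds, the two that agree anchor the median, so only the third is flagged. With two ponds, the median falls halfway between them and both are flagged, which mirrors the difficulty of deciding which duplicate is the aberrant one.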

The second advantage of having ponds in triplicate is statistical. Increasing the number of ponds running under the same conditions increases the sample size (n) and therefore the precision of the measurements (Equations 1 and 2). In our experience, three replicate ponds are required to have confidence in average measured values and to determine a standard deviation or standard error that allows for statistical differentiation between experiments. Realistically, statisticians would probably recommend 5, 7, or even 10 pond replicates!
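The two equations referred to above are not reproduced in this text. Assuming they are the usual sample standard deviation and standard error of the mean (an assumption, not confirmed by the source), they would take the following form, where $x_i$ is the productivity measured in pond $i$, $\bar{x}$ is the mean across ponds, and $n$ is the number of replicate ponds:

$$
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
\qquad (1)
$$

$$
SE = \frac{s}{\sqrt{n}}
\qquad (2)
$$

Under these definitions, the standard error scales with $1/\sqrt{n}$, so for the same $s$, going from two ponds to three shrinks the standard error by roughly 18%.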

Table 1: In Case 2, Pond 2 has a much higher value than the other ponds, so it should be evaluated for accuracy. In Case 1, it may be difficult or impossible to determine which pond is providing the more accurate value. Experience may allow the operator to identify the high value in Pond 2, but this is not certain in all cases. Identifying errors in sampling or technique allows analysts to remove inaccurate data (due to human error, electronic error, culture crash, etc.) from the data set, resulting in more accurate means and less variability. Additionally, these errors can be quickly identified, evaluated, and corrected. If, after investigation, the value for Pond 2 is verified as accurate, having the third pond significantly decreases the standard deviation of the data set.

In Case 1, Pond 2 may still be identified as a bad value due to operator error or some other identifiable error. The data could then be removed from the data set, but only a single value would remain, so no meaningful average or standard deviation could be determined for the experimental condition. This would make it impossible to evaluate variability or to determine statistical differences between mean values.
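A short sketch makes the consequence concrete. The values below are hypothetical, and `summarize` is an illustrative helper; it relies only on Python's standard `statistics` module, whose `stdev` raises `StatisticsError` when given fewer than two data points.

```python
from statistics import StatisticsError, mean, stdev

def summarize(values):
    """Return (mean, sample standard deviation).
    The standard deviation is None when fewer than two values remain."""
    avg = mean(values)
    try:
        sd = stdev(values)
    except StatisticsError:
        sd = None  # a single value carries no estimate of spread
    return avg, sd

# Hypothetical productivities (g/m2-day) after one bad value is discarded.
duplicate_after_removal = [25.0]         # started with two ponds
triplicate_after_removal = [24.0, 26.0]  # started with three ponds

print(summarize(duplicate_after_removal))   # spread cannot be estimated
print(summarize(triplicate_after_removal))  # mean and spread both available
```

Starting from duplicates, discarding one pond leaves a lone number with no measure of variability. Starting from triplicates, the same loss still leaves enough data for a mean and a standard deviation.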

Table 2, Figure 2: In this experiment the goal is to determine if the operational differences between Case 1 and Case 2 have any effect on pond productivity. In Case 1a, ponds are being run in duplicate, and in Case 1b the same ponds under the same conditions are being run in triplicate. Case 2 is run in triplicate under different conditions than Case 1. The graph shows error bars of plus and minus one standard deviation from the mean.

Based on the mean productivity, it appears that the operational conditions in Case 2 substantially decrease productivity; however, this is difficult to say with statistical certainty when the error bars for Case 1a and Case 2 are compared. The significant overlap between the error bars for Case 1a and Case 2 makes it hard to say that the difference in mean productivity is due to operational changes and not natural variability. When Case 1b and Case 2 are compared, there is no overlap of the standard deviation error bars, indicating that the means for Case 1b and Case 2 are likely different. This difference in means can then likely be attributed to the experimental differences between the ponds.
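The error-bar comparison can be sketched as follows. The productivity values are hypothetical and chosen only to reproduce the pattern described; they are not the data from the table or figure. The helper names `error_bar` and `bars_overlap` are likewise illustrative. Note that this is the visual overlap check described in the text, not a formal significance test.

```python
from statistics import mean, stdev

def error_bar(values):
    """Return (low, high) for mean minus and plus one sample standard deviation."""
    avg = mean(values)
    sd = stdev(values)
    return avg - sd, avg + sd

def bars_overlap(a, b):
    """True if the +/- 1 SD error bars of two replicate sets overlap."""
    low_a, high_a = error_bar(a)
    low_b, high_b = error_bar(b)
    return low_a <= high_b and low_b <= high_a

# Hypothetical productivities in g/m2-day (not the article's data).
case_1a = [20.0, 28.0]        # Case 1 run in duplicate
case_1b = [20.0, 28.0, 24.0]  # same condition run in triplicate
case_2 = [17.0, 19.0, 18.0]   # different condition, triplicate

print(bars_overlap(case_1a, case_2))  # duplicate bars are wide and overlap Case 2
print(bars_overlap(case_1b, case_2))  # triplicate bars are tighter and do not
```

In this illustration the third pond does not change the Case 1 mean, but it narrows the standard deviation enough that the error bars separate from those of Case 2.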

To summarize: if you're considering purchasing equipment for the purpose of determining algal productivities, wastewater treatment, biofuels potential, etc., it is essential that you closely evaluate your experimental plan, including the number of replicate ponds. In our experience, running an experiment in a single pond will give you almost no useful information. Duplicate ponds may provide some confidence in your results if your team is experienced in both the analysis and operation of algal systems and nothing unexpected goes wrong. However, the best option is to use triplicate ponds, as this will reduce the need for repeating experiments and allow you and others to have more confidence in your results. Robust, unimpeachable results are key to a successful algae program.

### Author:

Ruth Spierling, M.S.

Research Engineer