The Timber Species Identifier (TSI) attempts to segment and identify each tree in a target area, then performs a discrete analysis of the LiDAR points to determine tree species. Overall accuracy is typically measured against cruise and scale reports or against individual stem tests. Each of these measures is useful, but none is perfect.
Cruise estimates carry their own error bars, which tend to grow as the area gets smaller. Cruise estimates are also extremely sensitive to plot placement. Scale or harvest information can be skewed by timber left on site or by volumes that were not recorded accurately against the correct source block.
Individual stem tests would seem a better solution, but it is difficult and expensive to get a good mix of testable stems across large areas of interest. Nature is messy, and subtle differences in canopy shape and reflectivity are part of what makes species identification so challenging. As a result, accuracies are heavily influenced by the trees selected for testing. With small sample sizes, say under 50 samples per species, accuracy in one area can exceed 90% while another zone might record 60%.
Object Raku Technology implemented a probability score in TSI version 1.6.15 to help clients identify high-accuracy areas and to better leverage operational harvest requirements. The probability score represents the strength of signal: how well a particular tree matches the species characteristics in the project species database.
In TSI's analysis, each segmented tree is examined and evaluated against the species library. In the table above, we can see the probability scores calculated for each tree. For example, TSI calculates that Tree #4 has a 71.5% probability score for Hw. This does not mean that there is a 71.5% likelihood the tree is Hw; rather, it means that of all the species in the TSI species library, the tree matches most strongly with Hw.
At the other end of the spectrum, Tree #2 has a leading probability score of only 36.7%, again for Hw. This time the strength of signal is less than twice that of Fd, the second-place species. By contrast, in the first example the descriptor match for Tree #4 to western hemlock (71.5%) is roughly seven times that for Fd, the next highest species at 10.6%.
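The strength-of-signal multiple described here is just the ratio of the leading score to the second-place score. The following is a minimal sketch of that arithmetic, not TSI's implementation: the function names and the dictionary layout are assumptions made for illustration, and only the two Tree #4 scores quoted in the text are used (the other species in the library are omitted).

```python
def top_two(scores):
    """Return (best_species, best_score, second_species, second_score).

    `scores` is a hypothetical mapping of species code -> probability score (%).
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_score), (second, second_score) = ranked[0], ranked[1]
    return best, best_score, second, second_score


def signal_ratio(scores):
    """Ratio of the leading score to the second-place score."""
    _, best_score, _, second_score = top_two(scores)
    if second_score == 0:
        return float("inf")  # no competing species at all
    return best_score / second_score


# Tree #4, using only the two scores quoted in the text.
tree_4 = {"Hw": 71.5, "Fd": 10.6}
print(top_two(tree_4)[0])               # Hw
print(round(signal_ratio(tree_4), 1))   # 6.7, i.e. roughly 7x
```

A ratio near 1 signals a close call between two species, as with Tree #2, while a large ratio signals a clear match.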
For the stem accuracy test, the TSI project model was run against 636 trees in the project area of interest that were not in the model. These are "alien" trees: trees that lie outside TSI's known universe. The model scored 73.6% accuracy, with strong scores for Fd, Ss, and Mb and weak scores for Cw and Yc.
The model confuses Cw and Yc in this test, calling 12 red cedar as yellow cedar and 10 yellow cedar as red cedar. However, it is western hemlock (Hw) that accounted for the largest problems in this model, with both a low correct % and a low precision % across a large test sample population (122). Precision is calculated by dividing the total correct (82) by the total predicted by TSI (143). On the other side of that coin, both Fd and Ss were strongly identified with a minimum of false positives.
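The Hw figures quoted above can be worked through directly. This is a small sketch of the arithmetic only; the function names are illustrative, and it assumes the 122 figure is the number of true Hw stems in the test sample, so that correct % is the total correct divided by 122.

```python
def precision(correct, predicted):
    """Share of TSI's calls for a species that were right."""
    return correct / predicted


def correct_rate(correct, actual):
    """Share of the true stems of a species that TSI identified (correct %)."""
    return correct / actual


hw_correct = 82      # Hw stems TSI called correctly
hw_predicted = 143   # all stems TSI called Hw
hw_actual = 122      # assumed: true Hw stems in the test sample

print(f"Hw precision: {precision(hw_correct, hw_predicted):.1%}")   # 57.3%
print(f"Hw correct %: {correct_rate(hw_correct, hw_actual):.1%}")   # 67.2%
```

The gap between the two numbers reflects the 61 stems of other species that the model called Hw (143 predicted minus 82 correct).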
In the graph (Probability of Correctness per Score or Score Difference), we can see how well the TSI probability scores correlate with correct decisions in a stem accuracy test. In the stem accuracy test above, the model was run against a sample of ground truth trees (field and stereo collects) that were not in the TSI model and were therefore completely unknown to TSI.
The solid line represents the function for the actual scores against the correct samples in the stem test. The dashed line represents the difference between the highest probability score and the second-place score. For this TSI model, a score difference of 23% equated to a 50% probability of correctness.
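The score difference plotted on the dashed line is simply the leading score minus the second-place score. As a hedged illustration, it could be computed and compared against the 23-point mark as follows. The function names and default threshold are assumptions for this sketch, the 23-point figure applies only to this particular model, and the example reuses the two Tree #4 scores quoted earlier.

```python
def score_difference(scores):
    """Leading probability score minus the second-place score (in points)."""
    ranked = sorted(scores.values(), reverse=True)
    return ranked[0] - ranked[1]


def passes_margin(scores, min_difference=23.0):
    """True when the score difference meets the chosen threshold.

    23.0 is the difference that equated to a 50% probability of
    correctness for the model discussed here; other models will differ.
    """
    return score_difference(scores) >= min_difference


tree_4 = {"Hw": 71.5, "Fd": 10.6}   # scores quoted in the text
print(round(score_difference(tree_4), 1))   # 60.9
print(passes_margin(tree_4))                # True
```

A planner could raise or lower `min_difference` to trade the number of trees retained against the expected correctness of the calls.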
Below: For each tree, the highest probability score across species is called the winning probability. In the table and chart above we can see that, in general, the higher the probability score, the more likely the call is to be correct. For model #1715, the sample is distributed across the bins, and 45% of the values are 80% or higher (bins 0.85 & up). This makes intuitive sense given the 73.6% overall stem accuracy seen above.
Our regression analysis showed that a high proportion of TSI probabilities at 75% or higher is correlated with higher accuracy in cruise report comparisons. In the screen capture of Block 1, we see the segmented TSI trees coloured by whether they were above or below 75% probability. The example block is mature and situated in coastal British Columbia.
In Table 1 above, the cruise called for Hw and Ba to comprise 89% of the block volume. TSI records similar proportions, but with the two leading species making up a higher share, 97% of the composition overall. TSI identified a few large Cw and did not find any Yc.
In Table 2 below, we summarize results for the 7,761 stems segmented in Block 1. TSI calculates a probability score for each species in the project species library. Of the 9 species in the species library for this project area, TSI had probability scores above 75% primarily for Hw and Ba. TSI called for a majority of Hw and Ba in the block, with 4,787 trees above 75%.
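To make the 75% cut-off concrete, the share of stems above the threshold can be computed as below. This is an illustrative sketch only: the function name and the short list of winning scores are hypothetical, and the block-level figure assumes the 4,787 count refers to stems out of all 7,761 segmented in the block.

```python
def share_above(winning_scores, threshold=75.0):
    """Fraction of trees whose winning probability exceeds the threshold."""
    if not winning_scores:
        return 0.0
    above = sum(1 for score in winning_scores if score > threshold)
    return above / len(winning_scores)


# Hypothetical winning scores for four trees, for illustration only.
sample = [91.2, 80.4, 62.0, 77.5]
print(share_above(sample))   # 0.75

# Block-level share from the counts quoted in the text
# (assumes 4,787 is counted out of all 7,761 stems).
block_share = 4787 / 7761
print(f"{block_share:.1%}")  # 61.7%
```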
This information can guide operational planners in optimizing return on investment and product mix.