Prediction of structured noisy data using boosted regression trees
When: Thursday, 14 April 2016 11:30 AM-12:30 PM
Where: GP-B Block, Level 5, Room 507
- Prof Kerrie Mengersen (Principal)
- Dr Alan Woodley
- Kerrie Mengersen (Chair) (Mathematical Sciences)
- Michael Schmidt
- Alan Woodley (Electrical Engineering, Computer Science, Data Science)
- Gentry White (Mathematical Sciences)
Spatio-temporal time series analyses are important for gaining deeper knowledge of past processes, and observing the past helps us anticipate the future. Estimating future biomass relies on knowledge of the past to understand what triggers change in vegetation and what impacts that change has. Because vegetation cover is a dynamic process that varies strongly with climate variability, it is important to understand where change happened (a spatial question), what happened and when (a spatio-temporal question), and why (a spectral question, answered from the individual grey values that indicate reflectance). Estimating grass biomass is challenging because the phenological growing cycle of naturally occurring grass is a dynamic process influenced by many complex parameters. Climate variability and the resulting land use change, monitored through a time series, therefore need to be accounted for in this investigation. In addition, the biomass cannot be measured directly because animals graze on the paddocks and eat the grass. Quantifying biomass therefore requires a retrospective approach. We will use a surrogate, called Animal Equivalent (AE), together with existing grazing information as an intermediary step.
One challenge will be to combine all data sources while addressing the spatial, spectral and spatio-temporal aspects as well as the provided stocking data. This project will attempt to develop statistical machine learning models to meet these challenges.
Regression tree approaches, and boosted regression trees (BRT) in particular, show good results in ecological time series studies based on remotely sensed imagery. To date, little statistical modelling has been done on estimating naturally occurring biomass via a surrogate or on investigating the related environmental and stocking data. No current statistical method combines data that pose the challenges described above.
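For readers unfamiliar with boosted regression trees, the following minimal sketch illustrates the general idea: many small trees are fitted one after another, each to the residuals left by the previous ones, and their shrunken predictions are summed. It is not the project's model. The data are synthetic (a noisy seasonal signal over 60 time steps), each tree is reduced to a single-split stump on one predictor, and all function names and parameter values (`fit_brt`, `n_trees`, `learning_rate`, and so on) are hypothetical choices made for illustration only.

```python
import math
import random


def fit_stump(x, residuals):
    """Find the single split on x that minimises squared error of the residuals."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def fit_brt(x, y, n_trees=100, learning_rate=0.1):
    """Boosting for squared loss: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)  # start from the mean response
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Shrink each stump's contribution by the learning rate.
        preds = [pi + learning_rate * predict_stump(stump, xi)
                 for xi, pi in zip(x, preds)]
    return base, learning_rate, stumps


def predict_brt(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


# Synthetic example (an assumption, not project data): a 12-step seasonal
# cycle with Gaussian noise, observed at 60 consecutive time steps.
random.seed(0)
x = list(range(60))
y = [math.sin(2 * math.pi * t / 12) + random.gauss(0, 0.2) for t in x]

model = fit_brt(x, y)
fitted = [predict_brt(model, xi) for xi in x]

mean_y = sum(y) / len(y)
mse_baseline = sum((yi - mean_y) ** 2 for yi in y) / len(y)
mse_model = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y)
print("training MSE, mean only:", mse_baseline)
print("training MSE, boosted stumps:", mse_model)
```

In practice, BRT models use deeper trees over many predictors (for example spectral bands, climate variables and stocking records) and are assessed on held-out data, not on training error as here.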