Sim2Real-Fire: A Multi-modal Simulation Dataset for Forecast and Backtracking of Real-world Forest Fire
Highlights
- The dataset is novel both for cleanly combining and aligning multi-modal information and for enabling study of how well models trained on simulated data generalize to real-world wildfire events. I will likely use this in my current research.
- They also released a machine learning model, but honestly, that's not nearly as cool as this dataset.
Summary
Research into wildfire forecasting and backtracking relies increasingly on AI, which requires large multi-modal datasets. To address this need, the paper introduces Sim2Real-Fire, a benchmark dataset comprising over 1 million simulated wildfire scenarios with aligned topography, vegetation, fuel, weather, and satellite data for training, as well as 1000 real-world wildfire scenarios for testing. The dataset is also intended to characterize how well models trained on simulated data generalize to real wildfire events. The authors also propose S2R-FireTR, a deep transformer-based model that leverages the multi-modal data to achieve state-of-the-art results.
Key Contributions
- Massive multi-modal wildfire forecasting and backtracking dataset comprising 1 million wildfire scenarios for training and 1000 real-world wildfire scenarios for testing.
- The multi-modal dataset also contains aligned topography, vegetation, fuel, weather, and satellite data, which is incredibly helpful.
- A deep transformer-based machine learning model, but honestly that's much less impressive than the dataset they created.
Strengths
- Extremely helpful and well put-together dataset paper
- The paper was easy to follow
Weaknesses / Questions
- With regard to "we eliminate the images without clearly observing the fire regions due to dense clouds or smoke occlusion": I observed that extremely large and devastating wildfires in certain regions have heavily obscured fire regions, so this filter may exclude exactly those events. I wish the authors would state more clearly what kinds of data they threw away.
Related Work
- Sim2Real-Fire dataset (1M simulated scenarios)