
Sim2Real-Fire: A Multi-modal Simulation Dataset for Forecast and Backtracking of Real-world Forest Fire

Venue
NeurIPS
Year
2024
Authors
Yanzhi Li, Keqiu Li, Guohui Li, Zumin Wang, Changqing Ji, Lubo Wang, Die Zuo, Qing Guo, Feng Zhang, Manyu Wang, Di Lin
Topic
ML

🌟 Highlights

  • The dataset is novel in that it cleanly combines and aligns multi-modal information and makes it possible to study how well models trained on simulated data generalize to real-world wildfire events. I will very likely need to use this in my current research.

  • They also released a machine learning model, but honestly, that's not nearly as cool as this dataset.

📝 Summary

Research into wildfire forecasting and backtracking increasingly relies on AI, which requires large multi-modal datasets. To meet this need, the paper introduces Sim2Real-Fire, a benchmark dataset comprising over 1 million simulated wildfire scenarios for training, with aligned topography, vegetation, fuel, weather, and satellite data, plus 1000 real-world wildfire scenarios for testing. An intended side effect of this split is that the dataset characterizes how well models trained on simulated data generalize to real wildfire events. The authors also propose S2R-FireTR, a deep transformer-based model that leverages the multi-modal data to achieve state-of-the-art results.
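To make the "train on simulated, test on real" idea concrete, here is a minimal sketch of how one might quantify the generalization gap. Everything in it is my own assumption rather than the paper's protocol: the binary burned-area mask format, the use of intersection-over-union (IoU) as the score, and the helper names `iou` and `sim2real_gap` are all hypothetical.

```python
from statistics import mean
from typing import Iterable, List, Tuple

# Hypothetical format: a 2D grid where 1 = burned cell, 0 = unburned cell.
Mask = List[List[int]]


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection-over-union of two equally sized binary masks."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += p & t
            union += p | t
    # Two empty masks agree perfectly, so define IoU as 1.0 in that case.
    return inter / union if union else 1.0


def sim2real_gap(
    sim_pairs: Iterable[Tuple[Mask, Mask]],
    real_pairs: Iterable[Tuple[Mask, Mask]],
) -> float:
    """Mean IoU on simulated scenarios minus mean IoU on real scenarios.

    Each pair is (predicted mask, ground-truth mask). A large positive
    value means the model does much better on simulation than on reality.
    """
    sim_score = mean(iou(p, t) for p, t in sim_pairs)
    real_score = mean(iou(p, t) for p, t in real_pairs)
    return sim_score - real_score


if __name__ == "__main__":
    # Toy data only; not taken from the dataset.
    perfect = [[1, 0], [0, 0]]
    sim = [(perfect, perfect)]                   # model matches simulation
    real = [([[1, 1], [0, 0]], [[1, 0], [0, 0]])]  # one false-positive cell
    print(sim2real_gap(sim, real))
```

A gap near zero would suggest the simulated training data transfers well; the metrics the authors actually report may differ from the IoU used here.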

🧩 Key Contributions

  • Massive multi-modal wildfire forecasting and backtracking dataset comprising over 1 million simulated wildfire scenarios for training and 1000 real-world wildfire scenarios for testing

  • The multi-modal dataset also contains aligned topography, vegetation, fuel, weather, and satellite data, which is incredibly helpful.

  • A deep transformer-based machine learning model (S2R-FireTR), but honestly that's much less impressive than the dataset they created.

Strengths

  • Extremely helpful and well put-together dataset paper

  • The paper was easy to follow

⚠️ Weaknesses / Questions

  • With regard to "we eliminate the images without clearly observing the fire regions due to dense clouds or smoke occlusion": I have observed that extremely large and devastating wildfires in certain regions have heavily obscured fire regions, so this filter may disproportionately remove exactly those events. I wish the authors would state more precisely what kinds of data they discarded.

🔍 Related Work

  • Sim2Real-Fire dataset (1M simulated scenarios)

📄 Attachments

PDF
📄 View PDF
Poster
🎤 View Poster
Code
🧑‍💻 GitHub Repository
Paper Link
🔗 External Page