This study developed a comprehensive framework in Google Earth Engine for efficiently generating forest fire inventory datasets, making such data accessible to users without specialized knowledge or access to private datasets. The framework is applicable globally, and the datasets it generates are freely accessible and shareable. Implementing the framework in Peninsular Malaysia successfully extracted significant forest fire factors, including the Keetch-Byram Drought Index (KBDI), soil moisture, temperature, wind speed, land surface temperature (LST), Palmer Drought Severity Index (PDSI), Normalized Difference Vegetation Index (NDVI), land cover, and precipitation, among others. The study also used a large language model, specifically GPT-4 with the Noteable plugin, for preliminary data analysis to assess the dataset's validity. Although the plugin performed basic statistical analyses and visualizations effectively, it showed limitations, such as selectively dropping columns or choosing only the columns it treated as relevant for a test, and automatically modifying scales. These behaviors underscore the need for users to check the generated code and confirm that it accurately reflects the intended analyses. Initial findings indicate that KBDI, LST, climate water deficit, and precipitation significantly affect forest fire occurrence in Peninsular Malaysia. Future research should extend the framework to other regions and refine it to accommodate a broader range of factors. Adopting and rigorously validating large language model technologies, alongside developing new tools and plugins, is essential for advancing the field of data analysis.
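The kind of check on generated code described above can be sketched as a small guard that compares the table the analysis actually used against the original dataset. This is a minimal illustration, not the procedure used in the study: the function name `check_analysis_input`, the column names, and all values are hypothetical, and it assumes every column is a non-empty list of numbers.

```python
from typing import Dict, List

Table = Dict[str, List[float]]  # column name -> values (assumed non-empty)


def check_analysis_input(original: Table, analyzed: Table, tol: float = 1e-9) -> List[str]:
    """Report columns that were dropped or whose value range changed."""
    issues: List[str] = []
    for name, values in original.items():
        if name not in analyzed:
            issues.append(f"column dropped: {name}")
            continue
        # A changed minimum or maximum suggests the column was rescaled or filtered.
        for label, fn in (("min", min), ("max", max)):
            before, after = fn(values), fn(analyzed[name])
            if abs(before - after) > tol * max(abs(before), 1.0):
                issues.append(f"{label} changed: {name}")
    return issues


# Made-up illustrative values, not data from the study.
original = {
    "kbdi": [120.0, 340.0, 510.0],
    "lst": [298.5, 305.2, 311.0],
    "ndvi": [0.21, 0.55, 0.78],
}
# Hypothetical table after an automated step dropped "ndvi" and rescaled "lst".
analyzed = {
    "kbdi": [120.0, 340.0, 510.0],
    "lst": [0.0, 0.536, 1.0],
}

issues = check_analysis_input(original, analyzed)
print(issues)
```

An empty list means neither problem was detected. The check compares only column names and value ranges, so it would not catch, for example, a reordering of rows or a transformation that preserves the minimum and maximum.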