Robust Clustering of the Local Milky Way Stellar Kinematic Substructures with Gaia eDR3
Identifying Substructures Using HDBSCAN with Measurement Uncertainties Factored In
In the most widely accepted model of galaxy formation, the Milky Way formed through hierarchical mergers with smaller galaxies, many of which were tidally disrupted and contributed to the stellar halo. Studying these accreted stars provides insight into the Milky Way’s merger history and the properties of the merging galaxies, offering valuable information about galaxy evolution and dark matter distribution. Large-scale surveys like Gaia have been instrumental in identifying these stellar structures and streams in the Milky Way.
This study applies an unsupervised machine learning algorithm, Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), to identify substructures in the Milky Way using data from Gaia eDR3. By incorporating measurement uncertainties into the clustering process for the first time, we robustly identify known structures such as Gaia-Enceladus/Sausage (GSE), Sequoia, and the Helmi Stream, as well as two globular clusters (NGC 3201 and NGC 104). Our approach improves the accuracy with which stellar members of these structures are identified. However, some structures, such as Nyx, Thamnos, and Arjuna, were not recovered, which highlights the limitations of the method when it is constrained by the available data and by the orbital distance cuts. Our results emphasize the importance of accounting for measurement uncertainties in machine learning applications to make substructure identification in the Milky Way more robust.
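To make concrete what "factoring in" measurement uncertainties can mean, here is a minimal sketch of one common approach, Monte Carlo resampling. The abstract does not state how the uncertainties enter the clustering, so this is an assumption and not necessarily the procedure used in this study. Each star's measured quantity is redrawn from a Gaussian whose width is its reported uncertainty, the clustering is repeated on every draw, and each star is scored by how often it stays with its nominal group. The function names `gap_cluster` and `membership_stability`, the one-dimensional toy data, and all parameter values are hypothetical. `gap_cluster` is a deliberately simple stand-in for a density-based clusterer and is not HDBSCAN.

```python
import random
from collections import Counter


def gap_cluster(x, eps, min_size):
    """Toy 1-D density clustering (a stand-in, NOT HDBSCAN).

    Sorted points are chained while consecutive gaps are below eps;
    chains with fewer than min_size points are labelled noise (-1).
    """
    if not x:
        return []
    order = sorted(range(len(x)), key=lambda i: x[i])
    groups, current = [], [order[0]]
    for prev, nxt in zip(order, order[1:]):
        if x[nxt] - x[prev] < eps:
            current.append(nxt)
        else:
            groups.append(current)
            current = [nxt]
    groups.append(current)

    labels = [-1] * len(x)
    next_label = 0
    for group in groups:
        if len(group) >= min_size:
            for i in group:
                labels[i] = next_label
            next_label += 1
    return labels


def membership_stability(x, sigma, n_draws=200, eps=0.5, min_size=3, seed=0):
    """Fraction of Monte Carlo draws in which each star stays with its
    nominal cluster. Stars that are noise in the nominal run score 0."""
    rng = random.Random(seed)
    ref = gap_cluster(x, eps, min_size)  # clustering of the nominal values
    hits = [0] * len(x)
    for _ in range(n_draws):
        # One realisation: perturb each value by its own uncertainty.
        draw = [rng.gauss(v, s) for v, s in zip(x, sigma)]
        labels = gap_cluster(draw, eps, min_size)
        for c in set(ref) - {-1}:
            members = [i for i, r in enumerate(ref) if r == c]
            votes = Counter(labels[i] for i in members if labels[i] != -1)
            if not votes:
                continue  # the nominal cluster dissolved in this draw
            best = votes.most_common(1)[0][0]
            for i in members:
                if labels[i] == best:
                    hits[i] += 1
    return ref, [h / n_draws for h in hits]


# Hypothetical data: two tight groups, one poorly measured star (index 4)
# on the edge of the first group, and one isolated star (index 8).
x = [0.0, 0.1, 0.2, 0.3, 0.6, 5.0, 5.1, 5.2, 9.0]
sigma = [0.01, 0.01, 0.01, 0.01, 0.4, 0.01, 0.01, 0.01, 0.01]
ref, stab = membership_stability(x, sigma)
for i, (label, s) in enumerate(zip(ref, stab)):
    print(f"star {i}: nominal label {label:2d}, stability {s:.2f}")
```

In this toy setup the well-measured stars keep their membership in every draw, while the star with the large uncertainty is a member only part of the time. A real analysis would replace `gap_cluster` with an actual HDBSCAN implementation acting on the full kinematic feature space, and would draw from the full covariance of the astrometric measurements rather than from independent Gaussians.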