Our approach consists of two stages. In the first stage, we feed a large corpus of unstructured news text into GraphRAG; during its clustering and retrieval process, global statistics are dynamically updated so that relationships between entities are clustered accurately. The result is a directed knowledge graph of companies whose edges represent both direct and indirect relationships between companies. In the second stage, inspired by Ibrahim et al. (2022) and Barigozzi & Brownlees (2019), we transform the knowledge graph into a mask matrix using the Leiden algorithm (Traag et al., 2019). This matrix is then used as a regularizer in the multivariate time series model, combined with relevant covariates. The method improves the generalization ability of models on scarce company data and significantly improves prediction performance, as illustrated in Figure 1. The algorithm details are provided in Algorithms 1 and 2.
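To make the second stage concrete, the following is a minimal sketch of one plausible reading of the mask-as-regularizer idea; it is not the procedure of Algorithms 1 and 2. It assumes that community labels for the companies have already been obtained (for example from a Leiden partition of the knowledge graph), that the mask marks pairs of companies in the same community, and that the regularizer is an L1 penalty on cross-company coefficients outside the mask. The names `build_mask`, `masked_l1_penalty`, the toy companies, and the coefficient values are all hypothetical.

```python
from typing import Dict, List

def build_mask(companies: List[str], community: Dict[str, int]) -> List[List[int]]:
    """Return M with M[i][j] = 1 when companies i and j share a community, else 0.

    Assumption: `community` holds labels computed beforehand, e.g. by Leiden.
    """
    return [
        [1 if community[a] == community[b] else 0 for b in companies]
        for a in companies
    ]

def masked_l1_penalty(coef: List[List[float]], mask: List[List[int]], lam: float) -> float:
    """L1 penalty applied only to coefficients that fall outside the mask.

    Assumption: coef[i][j] is the influence of company j on company i in a
    multivariate time series model; pairs inside the mask are left unpenalized.
    """
    total = 0.0
    for coef_row, mask_row in zip(coef, mask):
        for c, m in zip(coef_row, mask_row):
            total += abs(c) * (1 - m)
    return lam * total

# Toy example: A and B share a community, C is on its own.
companies = ["A", "B", "C"]
community = {"A": 0, "B": 0, "C": 1}
mask = build_mask(companies, community)

# Hypothetical coefficient matrix of a three-company model.
coef = [
    [0.5, 0.2, 0.1],
    [0.3, 0.4, -0.2],
    [0.0, 0.6, 0.7],
]
penalty = masked_l1_penalty(coef, mask, lam=1.0)
print(mask)
print(penalty)
```

In this sketch only the coefficients linking C to A or B contribute to the penalty, so the fit is discouraged from using cross-company dependencies that the knowledge graph does not support, while within-community coefficients are left free.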