graph generation model
Generative Graph Models
第八章傳統的圖生成方法>
The previous parts of this book introduced a wide variety of methods for learning representations of graphs. In this final part of the book, we will discuss a distinct but closely related task: the problem of > graph generation.
本書前面的部分介紹了學習圖形表示的各種方法。在本書的最後部分,我們將討論一個截然不同但密切相關的任務:圖形生成問題。
The goal of graph generation is to build models that can generate realistic(逼真的) graph structures. In some ways, we can view this graph generation problem as the mirror image of the graph embedding problem. Instead of assuming that we are given a graph structure G = (V,E) as input to our model, in graph generation we want the output of our model to be a graph G. Of course, simply generating an arbitrary graph is not necessarily that challenging. For instance, it is trivial to generate a fully connected graph or a graph with no edges. The key challenge in graph generation, however, is generating graphs that have certain desirable properties. As we will see in the following chapters, the way in which we define 「desirable properties」—and how we perform graph generation—varies drastically between different approaches.
圖形生成的目標是構建能夠生成逼真圖形結構的模型。在某種程度上,我們可以把這個圖生成問題看作是圖嵌入問題的鏡像。在圖生成過程中,我們希望模型的輸出是圖G,而不是假設我們得到一個圖結構G=(V,E)作為我們模型的輸入。當然,簡單地生成一個任意圖並不一定具有那麼大的挑戰性。例如,生成一個完全連通的圖或一個沒有邊的圖是微不足道的。然而,圖形生成中的關鍵挑戰是生成具有某些所需屬性的圖形。正如我們將在接下來的章節中看到的,我們定義「理想屬性」的方式–以及我們如何執行圖形生成–在不同的方法之間有很大的不同
In this chapter, we begin with a discussion of traditional approaches to graph generation. These tradiational approaches pre-date most research on graph representation learning—and even machine learning research in general. The methods we will discuss in this chapter thus provide the backdrop to motivate the deep learning-based approaches that we will introduce in Chapter 9.
在本章中,我們首先討論傳統的圖形生成方法。這些傳統的方法早於大多數關於圖形表示學習的研究,甚至比一般的機器學習研究更早。因此,我們將在本章中討論的方法為激勵我們將在第9章介紹的基於深度學習的方法提供了背景。
8.1 Overview of Traditional Approaches 傳統方法概述
Traditional approaches to graph generation generally involve specifying some kind of generative process, which defines how the edges in a graph are created. In most cases we can frame this generative process as a way of specifying the probability or likelihood P(A[u, v] = 1) of an edge existing between two nodes u and v. The challenge is crafting some sort of generative process that is both tractable and also able to generate graphs with non-trivial properties or acteristics. Tractability is essential because we want to be able to sample or analyze the graphs that are generated. However, we also want these graphs to have some properties that make them good models for the kinds of graphs we see in the real world.
傳統的圖形生成方法通常涉及指定某種生成過程,該過程定義了圖形中的邊是如何創建的。在大多數情況下,我們可以將這種生成過程框定為指定存在於兩個節點u和v之間的邊的概率或可能性P(A[u,v]=1)的一種方式。挑戰在於設計出一種既易於處理又能夠生成具有非平凡性質或特徵的圖的生成過程。在大多數情況下,我們可以將這種生成過程框定為指定存在於兩個節點u和v之間的邊的概率或可能性P(A[u,v]=1)。易操縱性是必不可少的,因為我們希望能夠對生成的圖表進行取樣或分析。然而,我們也希望這些圖具有一些屬性,使它們成為我們在現實世界中看到的各種圖的良好模型。
The three approaches we review in this subsection represent a small but representative subset of the traditional graph generation approaches that exist in the literature. For a more thorough survey and discussion, we recommend Newman [2018] as a useful resource.
我們在這一小節中回顧的三種方法代表了文獻中存在的傳統圖形生成方法中的一小部分但具有代表性。要進行更深入的調查和討論,我們推薦紐曼[2018]作為一個有用的資源。
8.2 ER模型
Perhaps the simplest and most well-known generative model of graphs is the ER model . In this model we define the likelihood of an edge occurring between any pair of nodes as
也許最簡單和最著名的圖生成模型是ER模型。在此模型中,我們將任意節點對之間出現邊的可能性定義為
where r ∈ [0,1] is parameter controlling the density of the graph. In other words, the ER model simply assumes that the probability of an edge occurring between any pairs of nodes is equal to r.
其中r∈[0,1]是控制圖形密度的參數。換句話說,ER模型簡單地假設在任何節點對之間出現邊的概率等於r。
The ER model is attractive due to its simplicity. To generate a random ER graph, we simply choose (or sample) how many nodes we want, set the density parameter r, and then use Equation (8.1) to generate the adjacency matrix. Since the edge probabilities are all independent, the time complexity to generate a graph is O(|V|2), i.e., linear in the size of the adjacency matrix.
ER模型由於其簡單性而頗具吸引力。要生成隨機ER圖,我們只需選擇(或取樣)需要多少個節點,設置密度參數r,然後使用等式(8.1)生成鄰接矩陣。由於邊的概率都是獨立的,所以生成一個圖的時間複雜度是O(|V|2),即鄰接矩陣的大小是線性的。
The downside of the ER model, however, is that it does not generate very realistic graphs. In particular, the only property that we can control in the ER model is the density of the graph, since the parameter r is equal (in expectation) to the average degree in the graph. Other graph properties—such as the degree distribution, existence of community structures, node clustering coefficients, and the occurrence of structural motifs—are not captured by the ER model. It is well known that graphs generated by the ER model fail to reflect the distribution of these more complex graph properties, which are known to be important in the structure and function of real-world graphs.
然而,ER模型的缺點是它不能生成非常逼真的圖形。特別地,在ER模型中我們可以控制的唯一屬性是圖的密度,因為參數r等於(在期望中)圖中的平均度。ER模型沒有捕獲其他圖屬性,例如度分布、社區結構的存在、節點聚類係數和結構基元的出現。眾所周知,ER模型生成的圖不能反映這些更複雜的圖屬性的分布,而這些屬性在現實世界圖的結構和功能中是非常重要的。
8.3 Stochastic Block Models 隨機塊圖
Many traditional graph generation approaches seek to improve the ER model by better capturing additional properties of real-world graphs, which the ER model ignores. One prominent example is the class of stochastic block models (SBMs), which seek to generate graphs with community structure.
許多傳統的圖形生成方法試圖通過更好地捕捉真實圖形的附加屬性來改進ER模型,而ER模型忽略了這一點。一個突出的例子是隨機區塊模型(SBM),它尋求生成具有社區結構的圖。
In a basic SBM model, we specify a number γ of different blocks: C1, …,Cγ. Every node u ∈ V then has a probability piof belonging to block i, i.e. pi= P(u ∈ Ci),∀u ∈ V, i = 1, …, γ wherePγ i=1pi= 1. Edge probabilities are then specified by a block-to-block probability matrix P ∈ [0,1]γ×γ, where C[i, j] gives the probability of an edge occuring between a node in block Ciand a node in block Cj. The generative process for the basic SBM model is as follows:
在基本的SBM模型中,我們指定不同塊的數量γ:C1,…,Cγ。然後,每個節點u∈V具有屬於塊i的概率pof,即pi=P(u∈ci),∀u∈V,i=1,…,γ,其中Pγi=1pi=1。然後,由塊到塊概率矩陣P∈[0,1]γ×γ指定邊概率,其中C[i,j]給出塊Ci中的節點和塊Cj中的節點之間出現邊的概率。基本SBM模型的生成過程如下: