Part 32: Resolving the invokevirtual Bytecode Instruction

October 28, 2021
Notes

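When the invokevirtual instruction was introduced earlier, we saw that if the bytecode_2 byte (selected by f2_byte) of the _indices field in the ConstantPoolCacheEntry is still empty, the target method is considered not yet linked. That is, no information about the callee has been saved into the ConstantPoolCacheEntry, and InterpreterRuntime::resolve_invoke() must be called to link the method. This function is fairly long, so we will look at it in several parts.

Part 1 of InterpreterRuntime::resolve_invoke():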

Handle receiver(thread, NULL);
if (bytecode == Bytecodes::_invokevirtual || bytecode == Bytecodes::_invokeinterface) {
    ResourceMark rm(thread);
    // Call method() to get the currently executing method from the current stack frame
    Method* m1 = method(thread);
    methodHandle m (thread, m1);

    // Call bci() to get the bytecode index within that method from the current stack frame
    int i1 = bci(thread);
    Bytecode_invoke call(m, i1);

    // Signature of the method to be called
    Symbol* signature = call.signature();

    frame fm = thread->last_frame();
    oop x = fm.interpreter_callee_receiver(signature);
    receiver = Handle(thread,x);
}

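The logic above runs when the bytecode is a dynamically dispatched one, namely invokevirtual or invokeinterface, and it obtains the value of the receiver variable. The implementation continues as follows.

Part 2 of InterpreterRuntime::resolve_invoke():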

CallInfo info;
constantPoolHandle pool(thread, method(thread)->constants());

{
    JvmtiHideSingleStepping jhss(thread);
    int cpcacheindex = get_index_u2_cpcache(thread, bytecode);
    LinkResolver::resolve_invoke(info, receiver, pool,cpcacheindex, bytecode, CHECK);
    ...
} 

// Return immediately if the ConstantPoolCacheEntry has already been updated with the call information
if (already_resolved(thread))
  return;
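
The cpcacheindex passed to LinkResolver::resolve_invoke() above comes from the operand bytes that follow the opcode. As a rough illustration of that fetch, the sketch below skips the one-byte opcode at bcp and reads the next two bytes as an index. The name read_native_u2_operand is hypothetical, and the sketch assumes the index was stored in the host's native byte order; it is not HotSpot's code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a HotSpot function): read the 2-byte operand that
// follows the opcode located at bcp.
// Assumption: the operand is stored in the host's native byte order.
inline std::uint16_t read_native_u2_operand(const std::uint8_t* bcp) {
    assert(bcp != nullptr);
    std::uint16_t index = 0;
    // bcp[0] is the opcode itself, so the operand starts at bcp + 1.
    // memcpy avoids unaligned access problems.
    std::memcpy(&index, bcp + 1, sizeof(index));
    return index;
}
```

Using memcpy rather than a pointer cast keeps the read well defined even when bcp + 1 is not 2-byte aligned.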

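The operand of the bytecode instruction is read through the bcp stored in the current frame; this operand is normally the index of a constant pool cache entry. LinkResolver::resolve_invoke() is then called to link the method. It indirectly calls LinkResolver::resolve_invokevirtual(), which is implemented as follows: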

void LinkResolver::resolve_invokevirtual(
 CallInfo&           result,
 Handle              recv,
 constantPoolHandle  pool,
 int                 index,
 TRAPS
){

  KlassHandle  resolved_klass;
  Symbol*      method_name = NULL;
  Symbol*      method_signature = NULL;
  KlassHandle  current_klass;

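  // When resolving the constant pool, the arguments pool (the constant pool found through the
  // method currently executing in the frame) and index (the index of a constant pool cache entry,
  // which still has to be mapped to the original constant pool index) are already set. From these
  // two values we can resolve resolved_klass as well as the name (method_name) and the
  // signature (method_signature) of the method to look up.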
  resolve_pool(resolved_klass, method_name,  method_signature, current_klass, pool, index, CHECK);

  KlassHandle  recvrKlass(THREAD, recv.is_null() ? (Klass*)NULL : recv->klass());

  resolve_virtual_call(result, recv, recvrKlass, resolved_klass, method_name, method_signature, current_klass, true, true, CHECK);
}

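It calls resolve_pool() and resolve_virtual_call() to resolve the constant pool entry and the method invocation, respectively. The functions involved in the call are shown in the figure below.

The following sections describe resolve_pool() and resolve_virtual_call() and the functions they call.

1. LinkResolver::resolve_pool()

resolve_pool() calls a number of other functions, as shown in the figure below.

A call to LinkResolver::resolve_pool() does not always follow the call chain above. When the class has not been resolved yet, however, SystemDictionary::resolve_or_fail() is normally called to resolve it; this eventually yields a pointer to the Klass instance, which is then stored back into the constant pool.

resolve_pool() is implemented as follows: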

void LinkResolver::resolve_pool(
 KlassHandle& resolved_klass,
 Symbol*&     method_name,
 Symbol*&     method_signature,
 KlassHandle& current_klass,
 constantPoolHandle pool,
 int          index,
 TRAPS
) {
  resolve_klass(resolved_klass, pool, index, CHECK);

  method_name      = pool->name_ref_at(index);
  method_signature = pool->signature_ref_at(index);
  current_klass    = KlassHandle(THREAD, pool->pool_holder());
}

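Here index is the index of the constant pool cache entry. The resolved_klass parameter is the class to be resolved (resolution is part of the linking phase of the class lifecycle, which is why we sometimes called it linking earlier; strictly speaking, resolution is meant), and current_klass is the class that owns the constant pool. Because these parameters are passed by C++ reference, assigning to them inside the function directly changes the corresponding variables in the caller.

resolve_klass() is called to resolve the class. In general, class resolution is performed when the constant pool entry is resolved. This was covered in the book 《深入剖析Java虛擬機：源碼剖析與實例詳解（基礎卷）》, but it is worth going over again here.

resolve_klass() and its related functions are implemented as follows: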

void LinkResolver::resolve_klass(
 KlassHandle&         result,
 constantPoolHandle   pool,
 int                  index,
 TRAPS
) {
  Klass* result_oop = pool->klass_ref_at(index, CHECK);
  // The result is handed back to the caller through the reference parameter
  result = KlassHandle(THREAD, result_oop);
}

Klass* ConstantPool::klass_ref_at(int which, TRAPS) {
  int x = klass_ref_index_at(which);
  return klass_at(x, CHECK_NULL);
}

int klass_ref_index_at(int which) {
  return impl_klass_ref_index_at(which, false);
}

impl_klass_ref_index_at() is implemented as follows:

int ConstantPool::impl_klass_ref_index_at(int which, bool uncached) {
  int i = which;
  if (!uncached && cache() != NULL) {
    // Get the original constant pool index from the ConstantPoolCacheEntry that which refers to
    i = remap_instruction_operand_from_cache(which);
  }

  assert(tag_at(i).is_field_or_method(), "Corrupted constant pool");
  // Read the 32-bit value stored in slot i
  jint ref_index = *int_at_addr(i);
  // The low 16 bits are the class_index
  return extract_low_short_from_int(ref_index);
}

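The assertion tells us that the entry at index i of the original constant pool must be a JVM_CONSTANT_Fieldref, JVM_CONSTANT_Methodref, or JVM_CONSTANT_InterfaceMethodref. These entries have the following formats: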

CONSTANT_Fieldref_info{
  u1 tag;
  u2 class_index;
  u2 name_and_type_index; // must be a field descriptor
}

CONSTANT_InterfaceMethodref_info{
  u1 tag;
  u2 class_index; // must be an interface
  u2 name_and_type_index; // must be a method descriptor
}

CONSTANT_Methodref_info{
  u1 tag;
  u2 class_index; // must be a class
  u2 name_and_type_index; // must be a method descriptor
}

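All three entries share the same layout. The entry at class_index must be a CONSTANT_Class_info structure representing a class or interface, and the field or method in question is a member of that class or interface. The entry at name_and_type_index must be a CONSTANT_NameAndType_info.

The class_index value is obtained by calling int_at_addr() and extract_low_short_from_int(). Once the memory layout of the constant pool is understood, these functions are easy to follow, so their implementations are not listed here.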
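
For readers who want a concrete picture of the bit manipulation, here is a minimal sketch of packing two 16-bit indexes into one 32-bit slot and splitting them again. The helper names are hypothetical and are not HotSpot's functions. The text above only states that class_index sits in the low 16 bits; placing name_and_type_index in the high 16 bits is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not HotSpot's functions.
// Assumption: class_index occupies the low 16 bits of the slot and
// name_and_type_index occupies the high 16 bits.
inline std::uint32_t pack_member_ref(std::uint16_t class_index,
                                     std::uint16_t name_and_type_index) {
    return (static_cast<std::uint32_t>(name_and_type_index) << 16) | class_index;
}

// Low half of the slot: the class_index.
inline std::uint16_t low_short(std::uint32_t slot) {
    return static_cast<std::uint16_t>(slot & 0xFFFFu);
}

// High half of the slot: the name_and_type_index under the stated assumption.
inline std::uint16_t high_short(std::uint32_t slot) {
    return static_cast<std::uint16_t>(slot >> 16);
}

// Usage: a member reference whose class is entry 5 and whose
// name-and-type is entry 12 survives a round trip through the slot.
inline void round_trip_example() {
    const std::uint32_t slot = pack_member_ref(5, 12);
    assert(low_short(slot) == 5);
    assert(high_short(slot) == 12);
}
```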

klass_ref_at() calls klass_at(), which is implemented as follows:

Klass* klass_at(int which, TRAPS) {
    constantPoolHandle h_this(THREAD, this);
    return klass_at_impl(h_this, which, CHECK_NULL);
}

klass_at_impl() is implemented as follows:

Klass* ConstantPool::klass_at_impl(
 constantPoolHandle this_oop,
 int                which,
 TRAPS
) {
  
  CPSlot entry = this_oop->slot_at(which);
  if (entry.is_resolved()) { // already resolved
    return entry.get_klass();
  }

  bool do_resolve = false;
  bool in_error = false;

  Handle  mirror_handle;
  Symbol* name = NULL;
  Handle  loader;
  {
     MonitorLockerEx ml(this_oop->lock());

    if (this_oop->tag_at(which).is_unresolved_klass()) {
      if (this_oop->tag_at(which).is_unresolved_klass_in_error()) {
        in_error = true;
      } else {
        do_resolve = true;
        name   = this_oop->unresolved_klass_at(which);
        loader = Handle(THREAD, this_oop->pool_holder()->class_loader());
      }
    }
  } // unlocking constantPool

  // The handling for in_error == true is omitted
 
  if (do_resolve) {
    oop protection_domain = this_oop->pool_holder()->protection_domain();
    Handle h_prot (THREAD, protection_domain);
    Klass* k_oop = SystemDictionary::resolve_or_fail(name, loader, h_prot, true, THREAD);
    KlassHandle k;
    if (!HAS_PENDING_EXCEPTION) {
      k = KlassHandle(THREAD, k_oop);
      mirror_handle = Handle(THREAD, k_oop->java_mirror());
    }

    if (HAS_PENDING_EXCEPTION) {
      ...
      return 0;
    }

    if (TraceClassResolution && !k()->oop_is_array()) {
      ...      
    } else {
      MonitorLockerEx ml(this_oop->lock());
      do_resolve = this_oop->tag_at(which).is_unresolved_klass();
      if (do_resolve) {
        ClassLoaderData* this_key = this_oop->pool_holder()->class_loader_data();
        this_key->record_dependency(k(), CHECK_NULL); // Can throw OOM
        this_oop->klass_at_put(which, k()); // Note: this updates the constant pool slot, marking the class as resolved so it is not resolved again next time
      }
    }
  }

  entry = this_oop->resolved_klass_at(which);
  assert(entry.is_resolved() && entry.get_klass()->is_klass(), "must be resolved at this point");
  return entry.get_klass();
}

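The function first calls slot_at() to fetch the value stored in one constant pool slot and wraps it in a CPSlot. The slot can hold one of two values: a pointer to a Symbol instance (the class name is a CONSTANT_Utf8_info entry, and the VM represents all such strings internally as Symbol objects) or a pointer to a Klass instance. If the class has already been resolved, the lowest bit of the stored address is 0; if it has not been resolved yet, the lowest bit is 1.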
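
The low-bit convention can be sketched as a small tagged-pointer wrapper. This is an illustrative stand-in rather than the real CPSlot: the Symbol and Klass structs and the Slot class are hypothetical, and the sketch relies on both structs being at least 2-byte aligned so that the lowest address bit is free to act as a tag.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; both contain a pointer, so their addresses are
// aligned and the lowest bit of any valid address is 0.
struct Symbol { const char* text; };
struct Klass  { const char* name; };

class Slot {
    std::uintptr_t bits_;
public:
    // Resolved: store the Klass pointer unchanged (low bit 0).
    explicit Slot(Klass* k) : bits_(reinterpret_cast<std::uintptr_t>(k)) {
        assert((bits_ & 1u) == 0);
    }
    // Unresolved: store the Symbol pointer with the low bit set to 1.
    explicit Slot(Symbol* s) : bits_(reinterpret_cast<std::uintptr_t>(s) | 1u) {
        assert((reinterpret_cast<std::uintptr_t>(s) & 1u) == 0);
    }
    bool is_resolved() const { return (bits_ & 1u) == 0; }
    Klass* klass() const {
        assert(is_resolved());
        return reinterpret_cast<Klass*>(bits_);
    }
    Symbol* symbol() const {
        assert(!is_resolved());
        // Clear the tag bit before turning the value back into a pointer.
        return reinterpret_cast<Symbol*>(bits_ & ~static_cast<std::uintptr_t>(1));
    }
};
```

The appeal of this design is that a single word answers both "is it resolved?" and "what does it point to?" without a separate flag field.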

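When the class is unresolved, SystemDictionary::resolve_or_fail() is called to obtain the Klass instance, and the constant pool is then updated so that the class does not have to be resolved again. Finally, the pointer to the Klass instance is returned.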
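
The resolve-once pattern in klass_at_impl() (check under the lock, load outside the lock, re-check under the lock before publishing) can be sketched as follows. ClassSlot, its members, and the load callback are hypothetical names for this illustration; the callback merely stands in for a class-loading call and is assumed to return a non-null pointer.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

struct Klass { std::string name; };  // hypothetical stand-in

class ClassSlot {
    std::mutex lock_;
    std::string unresolved_name_;
    Klass* resolved_ = nullptr;
public:
    explicit ClassSlot(std::string name) : unresolved_name_(std::move(name)) {}

    // load stands in for the class-loading step (assumed to return non-null).
    Klass* klass_at(const std::function<Klass*(const std::string&)>& load) {
        std::string name;
        {
            std::lock_guard<std::mutex> guard(lock_);
            if (resolved_ != nullptr) return resolved_;  // already resolved
            name = unresolved_name_;
        }
        // Loading may be slow, so it happens without holding the lock.
        Klass* k = load(name);
        assert(k != nullptr);
        std::lock_guard<std::mutex> guard(lock_);
        // Re-check: another thread may have published a result meanwhile.
        if (resolved_ == nullptr) resolved_ = k;
        return resolved_;
    }
};
```

Calling klass_at() a second time returns the cached pointer without invoking the loader again, which mirrors the "no need to resolve again next time" behavior described above.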

Back in LinkResolver::resolve_pool(), the next step is to read the name_and_type_index of the JVM_CONSTANT_Fieldref, JVM_CONSTANT_Methodref, or JVM_CONSTANT_InterfaceMethodref entry. It points to a CONSTANT_NameAndType_info entry with the following format:

CONSTANT_NameAndType_info{
  u1 tag;
  u2 name_index;
  u2 descriptor_index;
}

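The lookup first maps the constant pool cache entry index to the original constant pool index, then locates the CONSTANT_NameAndType_info entry, reads the indexes of the method name and signature from it, and finally obtains the name and signature of the target method. This information is used by resolve_virtual_call(), which is called next.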
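
The chain of lookups just described (cache index, original constant pool index, NameAndType entry, Utf8 strings) can be sketched with a toy constant pool. MiniPool and Entry are hypothetical types invented for this illustration, and the method names only echo name_ref_at() and signature_ref_at() from the listing above; the real ConstantPool stores its entries very differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy entry. Meaning of the fields depends on the entry kind:
//   Methodref:   first = class_index, second = name_and_type_index
//   NameAndType: first = name_index,  second = descriptor_index
//   Utf8:        utf8 holds the string
struct Entry {
    std::uint16_t first;
    std::uint16_t second;
    std::string   utf8;
};

struct MiniPool {
    std::vector<Entry> entries;        // indexed by original constant pool index
    std::vector<int>   cache_to_pool;  // cache entry index -> original index

    const Entry& name_and_type_at(int cache_index) const {
        assert(cache_index >= 0);
        const int cp_index = cache_to_pool.at(static_cast<std::size_t>(cache_index));
        const Entry& ref = entries.at(static_cast<std::size_t>(cp_index));
        return entries.at(ref.second);
    }
    const std::string& name_ref_at(int cache_index) const {
        return entries.at(name_and_type_at(cache_index).first).utf8;
    }
    const std::string& signature_ref_at(int cache_index) const {
        return entries.at(name_and_type_at(cache_index).second).utf8;
    }
};
```

With a Methodref at pool index 4 that refers to a NameAndType at index 3, which in turn refers to Utf8 entries 1 and 2, cache index 0 mapped to pool index 4 yields the strings stored at entries 1 and 2.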

2. LinkResolver::resolve_virtual_call()

The functions called by resolve_virtual_call() are shown in the figure below.

LinkResolver::resolve_virtual_call() is implemented as follows:

void LinkResolver::resolve_virtual_call(
 CallInfo&     result,
 Handle        recv,
 KlassHandle   receiver_klass,
 KlassHandle   resolved_klass,
 Symbol*       method_name,
 Symbol*       method_signature,
 KlassHandle   current_klass,
 bool         check_access,
 bool         check_null_and_abstract,
 TRAPS
) {
  methodHandle resolved_method;

  linktime_resolve_virtual_method(resolved_method, resolved_klass, method_name, method_signature, current_klass, check_access, CHECK);

  runtime_resolve_virtual_method(result, resolved_method, resolved_klass, recv, receiver_klass, check_null_and_abstract, CHECK);
}

LinkResolver::linktime_resolve_virtual_method() is called first, and it in turn calls the following function:

void LinkResolver::resolve_method(
 methodHandle&  resolved_method,
 KlassHandle    resolved_klass,
 Symbol*        method_name,
 Symbol*        method_signature,
 KlassHandle    current_klass,
 bool          check_access,
 bool          require_methodref,
 TRAPS
) {

  // Look up the method in the resolved class and its superclasses
  lookup_method_in_klasses(resolved_method, resolved_klass, method_name, method_signature, true, false, CHECK);

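  // The method was not found in the class hierarchy of the resolved class
  if (resolved_method.is_null()) {
    // Look up the method in all interfaces implemented by the resolved class (including indirectly implemented ones)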
    lookup_method_in_interfaces(resolved_method, resolved_klass, method_name, method_signature, CHECK);
    // ...

    if (resolved_method.is_null()) {
      // No matching method was found
      ...
    }
  }

  // ...
}

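The main job of the function above is to find a suitable method in resolved_klass based on method_name and method_signature; if one is found, it is assigned to resolved_method.

The lookup itself is done by lookup_method_in_klasses(), lookup_method_in_interfaces(), and related functions, which are not covered here for now.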
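
As a rough picture of the search order only (superclass chain first, then interfaces), consider the following sketch. The Type struct and find_declaring_type() are hypothetical and greatly simplified: methods are identified by a single "name plus descriptor" string, and the interfaces list is assumed to already contain every directly and indirectly implemented interface. The real lookup functions apply further rules that are not modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of a class or interface.
struct Type {
    std::string name;
    const Type* super = nullptr;
    std::vector<const Type*> interfaces;  // assumed to include indirect interfaces
    std::vector<std::string> methods;     // "name+descriptor" keys declared here
};

inline bool declares(const Type& t, const std::string& key) {
    return std::find(t.methods.begin(), t.methods.end(), key) != t.methods.end();
}

// Returns the type that declares the method, or nullptr if none is found.
inline const Type* find_declaring_type(const Type& resolved, const std::string& key) {
    // Step 1: the resolved class and its superclasses.
    for (const Type* t = &resolved; t != nullptr; t = t->super) {
        if (declares(*t, key)) return t;
    }
    // Step 2: only if step 1 failed, the implemented interfaces.
    for (const Type* itf : resolved.interfaces) {
        assert(itf != nullptr);
        if (declares(*itf, key)) return itf;
    }
    return nullptr;
}
```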

Next comes runtime_resolve_virtual_method(), which is implemented as follows:

void LinkResolver::runtime_resolve_virtual_method(
 CallInfo&      result,
 methodHandle   resolved_method,
 KlassHandle    resolved_klass,
 Handle         recv,
 KlassHandle    recv_klass,
 bool          check_null_and_abstract,
 TRAPS
) {

  int vtable_index = Method::invalid_vtable_index;
  methodHandle selected_method;

  // A resolved method that is declared in an interface is a miranda method
  if (resolved_method->method_holder()->is_interface()) { 
    vtable_index = vtable_index_of_interface_method(resolved_klass,resolved_method);

    InstanceKlass* inst = InstanceKlass::cast(recv_klass());
    selected_method = methodHandle(THREAD, inst->method_at_vtable(vtable_index));
  } else {
    // On this path resolved_method is not a miranda method; it needs dynamic dispatch and is guaranteed to have a valid vtable index
    vtable_index = resolved_method->vtable_index();

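    // Some methods look as if they need dynamic dispatch, but a method declared final can be statically bound and called directly.
    // A final method is not placed in the vtable unless it overrides a method of a superclass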
    if (vtable_index == Method::nonvirtual_vtable_index) {
      selected_method = resolved_method;
    } else {
      // Dispatch dynamically using the vtable of inst and vtable_index
      InstanceKlass* inst = (InstanceKlass*)recv_klass();
      selected_method = methodHandle(THREAD, inst->method_at_vtable(vtable_index));
    }
  }  
 
  // setup result: result is of type CallInfo and receives the information produced by linking
  result.set_virtual(resolved_klass, recv_klass, resolved_method, selected_method, vtable_index, CHECK);
}

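For a miranda method, LinkResolver::vtable_index_of_interface_method() is called to find the vtable index. For a final method, resolved_method is itself the call target, because a final method cannot be overridden by a subclass. In all remaining cases the method is dispatched dynamically through the vtable and vtable_index.

The function above gathers all the information needed for the call and stores it in the CallInfo variable result.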
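
The selection step can be illustrated with a minimal vtable model. Everything in this sketch is hypothetical: the Method and Klass structs, the select_method() function, and the sentinel kNonvirtualIndex, whose value is chosen for illustration only. It shows just the two non-miranda branches: a statically bound method selects itself, and any other method is looked up in the receiver class's vtable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sentinel meaning "not in the vtable"; the value is illustrative.
constexpr int kNonvirtualIndex = -2;

struct Method {
    std::string holder;   // class that declares this implementation
    std::string name;
    int vtable_index;     // slot in the vtable, or kNonvirtualIndex
};

struct Klass {
    // Slot i holds the implementation this class uses for virtual method i.
    std::vector<const Method*> vtable;
};

inline const Method* select_method(const Method& resolved, const Klass& receiver) {
    if (resolved.vtable_index == kNonvirtualIndex) {
        return &resolved;  // statically bound: the resolved method is the target
    }
    assert(resolved.vtable_index >= 0);
    // Dynamic dispatch: same slot number, but in the receiver's own vtable.
    return receiver.vtable.at(static_cast<std::size_t>(resolved.vtable_index));
}
```

The key point is that the slot number comes from the resolved method while the table comes from the receiver's class, so an overriding subclass supplies its own implementation in the same slot.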

Once all the call information has been stored in the CallInfo, the ConstantPoolCacheEntry can be filled in from info. We now return to the logic of InterpreterRuntime::resolve_invoke().

Part 3 of InterpreterRuntime::resolve_invoke():

switch (info.call_kind()) {
  case CallInfo::direct_call: // direct call
    cache_entry(thread)->set_direct_call(
		  bytecode,
		  info.resolved_method());
    break;
  case CallInfo::vtable_call: // vtable dispatch
    cache_entry(thread)->set_vtable_call(
		  bytecode,
		  info.resolved_method(),
		  info.vtable_index());
    break;
  case CallInfo::itable_call: // itable dispatch
    cache_entry(thread)->set_itable_call(
		  bytecode,
		  info.resolved_method(),
		  info.itable_index());
    break;
  default:  ShouldNotReachHere();
}

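Whether the call is direct or dispatched through the vtable or itable, the relevant information is stored in the constant pool cache entry once method resolution completes. cache_entry() returns the corresponding ConstantPoolCacheEntry, and set_vtable_call() is then called on it. set_vtable_call() calls the following function to update the ConstantPoolCacheEntry: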

void ConstantPoolCacheEntry::set_direct_or_vtable_call(
 Bytecodes::Code  invoke_code,
 methodHandle     method,
 int              vtable_index
) {
  bool is_vtable_call = (vtable_index >= 0);  // FIXME: split this method on this boolean
 
  int byte_no = -1;
  bool change_to_virtual = false;

  switch (invoke_code) {
    case Bytecodes::_invokeinterface:
       change_to_virtual = true;

    // ...
    // As this shows, an _invokevirtual instruction is not always dispatched dynamically; it may also be statically bound
    case Bytecodes::_invokevirtual: // we are already inside the ConstantPoolCacheEntry class
      {
        if (!is_vtable_call) {
          assert(method->can_be_statically_bound(), "");
          // set_f2_as_vfinal_method checks if is_vfinal flag is true.
          set_method_flags(as_TosState(method->result_type()),
                           (                             1      << is_vfinal_shift) |
                           ((method->is_final_method() ? 1 : 0) << is_final_shift)  |
                           ((change_to_virtual         ? 1 : 0) << is_forced_virtual_shift), // a method defined in Object is called through an interface
                           method()->size_of_parameters());
          set_f2_as_vfinal_method(method());
        } else {
          // On this path the method is a non-final method that is not statically bound and needs dynamic dispatch, so vtable_index must be >= 0
          set_method_flags(as_TosState(method->result_type()),
                           ((change_to_virtual ? 1 : 0) << is_forced_virtual_shift),
                           method()->size_of_parameters());
          // For dynamic dispatch, ConstantPoolCacheEntry::_f2 holds the vtable_index
          set_f2(vtable_index);
        }
        byte_no = 2;
        break;
      }
      // ...
  }

  if (byte_no == 1) {
    // invoke_code is neither invokevirtual nor invokeinterface
    set_bytecode_1(invoke_code);
  } else if (byte_no == 2)  {
    if (change_to_virtual) {
      if (method->is_public()) 
         set_bytecode_1(invoke_code);
    } else {
      assert(invoke_code == Bytecodes::_invokevirtual, "");
    }
    // set up for invokevirtual, even if linking for invokeinterface also:
    set_bytecode_2(Bytecodes::_invokevirtual);
  } 
}

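After linking, the fields of the ConstantPoolCacheEntry are as shown in the figure below.

So for invokevirtual, methods are dispatched through the vtable. In the ConstantPoolCacheEntry the _f1 field is unused. As for _f2, if the callee is a non-final virtual method it holds the method's index in the vtable, and if the callee is a virtual final method it points directly to the Method instance of the target.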
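
The dual use of _f2 can be sketched as a single word that is interpreted according to a flag. CacheEntry, make_vfinal(), make_vtable(), and call_target() are hypothetical names for this illustration; the real ConstantPoolCacheEntry packs its flags differently and is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Method { const char* name; };            // hypothetical stand-in
struct Klass  { std::vector<Method*> vtable; }; // hypothetical stand-in

// One word, two meanings, selected by is_vfinal.
struct CacheEntry {
    bool is_vfinal;
    std::intptr_t f2;  // Method* when is_vfinal is true, otherwise a vtable index
};

inline CacheEntry make_vfinal(Method* m) {
    assert(m != nullptr);
    return CacheEntry{true, reinterpret_cast<std::intptr_t>(m)};
}

inline CacheEntry make_vtable(int vtable_index) {
    assert(vtable_index >= 0);
    return CacheEntry{false, static_cast<std::intptr_t>(vtable_index)};
}

// Finds the method to call for a given receiver class.
inline Method* call_target(const CacheEntry& e, const Klass& receiver) {
    if (e.is_vfinal) {
        return reinterpret_cast<Method*>(e.f2);  // static binding, no table lookup
    }
    return receiver.vtable.at(static_cast<std::size_t>(e.f2));
}
```

Storing the Method pointer directly for the final case lets the call skip the vtable lookup entirely, while the index form defers the choice until the receiver's class is known.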