Investigating Android OOM Problems: From Getting Started to Giving Up
- August 20, 2022
- Notes
- Android Programming
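1. Preface
Recently a customer reported several OOM problems. I looked into OOM a long time ago, but so much time has passed that I no longer remember most of it.
Now that I am facing the problem again, I took the opportunity to search for material and dig into OOM, and I am recording what I found here for later reference. This post borrows heavily from classic articles found online: standing on the shoulders of giants to see farther!
Note: the analysis below is based on the Android R source.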
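2. Possible Causes of OOM
Many detailed explanations can be found online. Here is my own brief summary; it may not be complete and is meant for study and reference only.
How does the Android system actually throw OutOfMemoryError? Searching the source code for it shows where it is raised.
Focus on the following two places:
✔️ OOM when a heap allocation fails: /art/runtime/gc/heap.cc
✔️ OOM when thread creation fails: /art/runtime/thread.cc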
3. OOM: Heap Allocation Failure
The source code shows that when a heap allocation fails, ART emits a characteristic log message, built by the following code:
void Heap::ThrowOutOfMemoryError(Thread* self, size_t byte_count, AllocatorType allocator_type) {
...
std::ostringstream oss;
size_t total_bytes_free = GetFreeMemory();
oss << "Failed to allocate a " << byte_count << " byte allocation with " << total_bytes_free
<< " free bytes and " << PrettySize(GetFreeMemoryUntilOOME()) << " until OOM,"
<< " target footprint " << target_footprint_.load(std::memory_order_relaxed)
<< ", growth limit "
<< growth_limit_;
// If the allocation failed due to fragmentation, print out the largest continuous allocation.
...
}
When this kind of OOM occurs, logcat should contain output similar to the following:
08-19 11:34:53.860 28028 28028 E AndroidRuntime: java.lang.OutOfMemoryError: Failed to allocate a 20971536 byte allocation with 6147912 free bytes and 6003KB until OOM, target footprint 134217728, growth limit 134217728
Roughly, this log says: the app tried to allocate 20971536 bytes of heap memory, but only 6147912 bytes of free heap remained, and the maximum heap the app may currently use is 134217728 bytes (128 MB).
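To make the numbers in that message easier to read, here is a minimal sketch that parses it and converts the byte counts. The class name OomMessageParser and its fields are hypothetical helpers invented for illustration; they are not part of ART or the Android SDK, and the regular expression only assumes the message layout shown in the log above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses the heap-allocation OOM message quoted above. */
final class OomMessageParser {
    // Assumes the message layout produced by Heap::ThrowOutOfMemoryError as shown above.
    private static final Pattern PATTERN = Pattern.compile(
            "Failed to allocate a (\\d+) byte allocation with (\\d+) free bytes"
            + " and (\\S+) until OOM, target footprint (\\d+), growth limit (\\d+)");

    final long requestedBytes;
    final long freeBytes;
    final long growthLimitBytes;

    private OomMessageParser(long requestedBytes, long freeBytes, long growthLimitBytes) {
        this.requestedBytes = requestedBytes;
        this.freeBytes = freeBytes;
        this.growthLimitBytes = growthLimitBytes;
    }

    static OomMessageParser parse(String message) {
        Matcher m = PATTERN.matcher(message);
        if (!m.find()) {
            throw new IllegalArgumentException("not a heap OOM message: " + message);
        }
        return new OomMessageParser(
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(5)));
    }

    /** Bytes by which the request exceeds the free heap. */
    long shortfallBytes() {
        return requestedBytes - freeBytes;
    }

    /** Whole mebibytes, rounded down. */
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        OomMessageParser p = parse("java.lang.OutOfMemoryError: Failed to allocate a 20971536 byte"
                + " allocation with 6147912 free bytes and 6003KB until OOM,"
                + " target footprint 134217728, growth limit 134217728");
        System.out.println("requested    = " + toMiB(p.requestedBytes) + " MiB");
        System.out.println("growth limit = " + toMiB(p.growthLimitBytes) + " MiB");
        System.out.println("shortfall    = " + p.shortfallBytes() + " bytes");
    }
}
```

For the log above, the request is about 20 MiB against a 128 MiB growth limit, and it exceeds the free heap by 14823624 bytes.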
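Heap allocation can fail for two reasons: (1) the app process exceeds its heap limit, or (2) there is no contiguous address range large enough.
3.1 Exceeding the app process's heap limit
On Android devices the Java virtual machine caps the maximum heap of each application, and exceeding that cap causes an OOM. Runtime.getRuntime().maxMemory() returns the heap limit the system assigns to the process; when the process's heap usage reaches this limit, an OOM occurs. This is the most common type of OOM on Android.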
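Android follows these conventions:
- /vendor/build.prop defines properties that cap the maximum heap of a single app
dalvik.vm.heapgrowthlimit
The limit applied to regular apps.
dalvik.vm.heapsize
The limit applied once the app sets android:largeHeap="true" in AndroidManifest.xml, which makes it a large-heap app.
- The following APIs return the limits from code
ActivityManager.getMemoryClass()
The maximum heap available to a regular app, in MB; corresponds to dalvik.vm.heapgrowthlimit.
ActivityManager.getLargeMemoryClass()
The maximum heap available, in MB, once the app sets android:largeHeap="true" in AndroidManifest.xml; corresponds to dalvik.vm.heapsize.
Runtime.getRuntime().maxMemory()
The heap limit the system assigns to the current process, in bytes; equal to one of the two values above.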
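As a rough illustration of how Runtime.getRuntime().maxMemory() relates to current usage, the sketch below estimates the remaining headroom before the limit is hit. It uses only java.lang.Runtime, so it also runs on a desktop JVM; the class and method names are hypothetical, and the check deliberately ignores fragmentation and whatever the GC might still reclaim.

```java
/** Hypothetical helper: estimates how much heap is left before maxMemory() is reached. */
final class HeapHeadroom {
    /** Bytes of the current heap occupied by objects (live or not yet collected). */
    static long usedBytes(Runtime rt) {
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Bytes that could still be allocated before the limit, ignoring fragmentation. */
    static long headroomBytes(Runtime rt) {
        return rt.maxMemory() - usedBytes(rt);
    }

    /** True if the request cannot fit under the limit given the current usage. */
    static boolean wouldExceedLimit(long requestBytes, long usedBytes, long maxBytes) {
        return requestBytes > maxBytes - usedBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mib = 1024L * 1024L;
        System.out.println("max      = " + rt.maxMemory() / mib + " MiB");
        System.out.println("used     = " + usedBytes(rt) / mib + " MiB");
        System.out.println("headroom = " + headroomBytes(rt) / mib + " MiB");
        // With 122 MiB used out of 128 MiB, a further 20 MiB request cannot fit.
        System.out.println(wouldExceedLimit(20 * mib, 122 * mib, 128 * mib));
    }
}
```

On Android the two ActivityManager values would additionally need a Context, which is why this sketch sticks to Runtime.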
The following simple code demonstrates this type of OOM:
// Imports needed at the top of the enclosing Activity file:
import android.app.ActivityManager;
import android.content.Context;
import android.util.Log;
import java.util.ArrayList;
import java.util.List;

private void testOOMCreatHeap(Context context) {
    ActivityManager activityManager = (ActivityManager) context.getSystemService(Context.ACTIVITY_SERVICE);
    Log.d("OOM_TEST", "app maxMemory = " + activityManager.getMemoryClass() + "MB");
    Log.d("OOM_TEST", "large app maxMemory = " + activityManager.getLargeMemoryClass() + "MB");
    Log.d("OOM_TEST", "current app maxMemory = " + Runtime.getRuntime().maxMemory() / 1024 / 1024 + "MB");
    // Holding every array in this list keeps it reachable, so the GC cannot reclaim it.
    List<byte[]> bytesList = new ArrayList<>();
    int count = 0;
    while (true) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        Log.e("OOM-TEST", "allocate 20MB heap: " + count++ + ", total " + 20 * count + "MB");
        // Allocate 20 MB on each iteration
        bytesList.add(new byte[1024 * 1024 * 20]);
    }
}
Note: on my test platform, dalvik.vm.heapgrowthlimit=128MB and dalvik.vm.heapsize=384MB.
The test code above allocates 20 MB on each iteration.
Case 1: a regular app that does not set android:largeHeap="true" in AndroidManifest.xml. The app's Dalvik heap limit should then be dalvik.vm.heapgrowthlimit=128MB.
Result:
08-19 11:34:53.555 28028 28028 D OOM_TEST: app maxMemory = 128MB
08-19 11:34:53.556 28028 28028 D OOM_TEST: large app maxMemory = 384MB
08-19 11:34:53.556 28028 28028 D OOM_TEST: current app maxMemory = 128MB
08-19 11:34:53.576 28028 28028 E OOM-TEST: allocate 20MB heap: 0, total 20MB
08-19 11:34:53.596 28028 28028 E OOM-TEST: allocate 20MB heap: 1, total 40MB
08-19 11:34:53.617 28028 28028 E OOM-TEST: allocate 20MB heap: 2, total 60MB
08-19 11:34:53.637 28028 28028 E OOM-TEST: allocate 20MB heap: 3, total 80MB
08-19 11:34:53.658 28028 28028 E OOM-TEST: allocate 20MB heap: 4, total 100MB
08-19 11:34:53.678 28028 28028 E OOM-TEST: allocate 20MB heap: 5, total 120MB
08-19 11:34:53.699 28028 28028 E OOM-TEST: allocate 20MB heap: 6, total 140MB
08-19 11:34:53.699 28028 28028 I com.demo: Waiting for a blocking GC Alloc
08-19 11:34:53.713 28028 28042 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.713 28028 28028 I com.demo: WaitForGcToComplete blocked Alloc on AddRemoveAppImageSpace for 14.279ms
08-19 11:34:53.713 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.713 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.713 28028 28042 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.713 28028 28028 I com.demo: WaitForGcToComplete blocked Alloc on AddRemoveAppImageSpace for 14.279ms
08-19 11:34:53.713 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.713 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.732 28028 28028 I com.demo: Alloc young concurrent copying GC freed 4(31KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 73us total 19.225ms
08-19 11:34:53.733 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.766 28028 28028 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.767 28028 28028 I com.demo: Alloc concurrent copying GC freed 6(16KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 71us total 33.715ms
08-19 11:34:53.767 28028 28028 I com.demo: Forcing collection of SoftReferences for 20MB allocation
08-19 11:34:53.767 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.792 28028 28028 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.792 28028 28028 I com.demo: Alloc concurrent copying GC freed 1046(50KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 57us total 25.120ms
08-19 11:34:53.792 28028 28028 W com.demo: Throwing OutOfMemoryError "Failed to allocate a 20971532 byte allocation with 6147912 free bytes and 6003KB until OOM, target footprint 134217728, growth limit 134217728" (VmSize 1264080 kB)
08-19 11:34:53.793 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.793 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.808 28028 28028 I com.demo: Alloc young concurrent copying GC freed 4(31KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 62us total 15.229ms
08-19 11:34:53.808 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.835 28028 28028 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.836 28028 28028 I com.demo: Alloc concurrent copying GC freed 3(16KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 59us total 27.042ms
08-19 11:34:53.836 28028 28028 I com.demo: Forcing collection of SoftReferences for 20MB allocation
08-19 11:34:53.836 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.857 28028 28028 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.857 28028 28028 I com.demo: Alloc concurrent copying GC freed 6(16KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 50us total 21.249ms
08-19 11:34:53.857 28028 28028 W com.demo: Throwing OutOfMemoryError "Failed to allocate a 20971536 byte allocation with 6147912 free bytes and 6003KB until OOM, target footprint 134217728, growth limit 134217728" (VmSize 1264016 kB)
08-19 11:34:53.858 28028 28028 E InputEventSender: Exception dispatching finished signal.
08-19 11:34:53.858 28028 28028 E MessageQueue-JNI: Exception in MessageQueue callback: handleReceiveCallback
08-19 11:34:53.859 28028 28028 E MessageQueue-JNI: java.lang.OutOfMemoryError: Failed to allocate a 20971536 byte allocation with 6147912 free bytes and 6003KB until OOM, target footprint 134217728, growth limit 134217728
08-19 11:34:53.859 28028 28028 E MessageQueue-JNI: at com.demo.MainActivity.testOOMCreatHeap(MainActivity.java:393)
08-19 11:34:53.859 28028 28028 E MessageQueue-JNI: at com.demo.MainActivity.onClick(MainActivity.java:450)
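Explanation:
At the final allocation request, more than 120 MB had already been allocated, so another 20 MB would clearly exceed the 128 MB limit. The GC passes reclaimed only a few KB, so the allocation failed and OutOfMemoryError was thrown.
Case 2: the same regular app, now with android:largeHeap="true" set in AndroidManifest.xml. The app's Dalvik heap limit should then be dalvik.vm.heapsize=384MB.
Result: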
08-19 11:32:22.660 27539 27539 D OOM_TEST: app maxMemory = 128MB
08-19 11:32:22.660 27539 27539 D OOM_TEST: large app maxMemory = 384MB
08-19 11:32:22.660 27539 27539 D OOM_TEST: current app maxMemory = 384MB
08-19 11:32:23.048 27539 27539 E OOM-TEST: allocate 20MB heap: 18, total 380MB
08-19 11:32:23.061 27539 27553 I com.mediacodec: Clamp target GC heap from 406MB to 384MB
08-19 11:32:23.069 27539 27539 E OOM-TEST: allocate 20MB heap: 19, total 400MB
08-19 11:32:23.069 27539 27539 I com.demo: Starting a blocking GC Alloc
08-19 11:32:23.226 27539 27539 W com.demo: Throwing OutOfMemoryError "Failed to allocate a 20971536 byte allocation with 1900608 free bytes and 1856KB until OOM, target footprint 402653184, growth limit 402653184" (VmSize 2053220 kB)
08-19 11:32:23.226 27539 27539 E InputEventSender: Exception dispatching finished signal.
08-19 11:32:23.226 27539 27539 E MessageQueue-JNI: Exception in MessageQueue callback: handleReceiveCallback
08-19 11:32:23.227 27539 27539 E MessageQueue-JNI: java.lang.OutOfMemoryError: Failed to allocate a 20971536 byte allocation with 1900608 free bytes and 1856KB until OOM, target footprint 402653184, growth limit 402653184
08-19 11:32:23.227 27539 27539 E MessageQueue-JNI: at com.demo.MainActivity.testOOMCreatHeap(MainActivity.java:393)
08-19 11:32:23.227 27539 27539 E MessageQueue-JNI: at com.demo.MainActivity.onClick(MainActivity.java:450)
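Explanation:
At the final allocation request, more than 380 MB had already been allocated, so another 20 MB would clearly exceed the 384 MB limit. The GC could not free enough memory, so the allocation failed and OutOfMemoryError was thrown.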
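The two results can be cross-checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not a model of ART's allocator: it assumes a baseline of about 2 MiB already in use before the loop starts (inferred from the free-bytes figures in the logs, so treat it as an assumption), and it uses the 20971536-byte request size reported in the logs. The class and method names are hypothetical.

```java
/** Hypothetical helper: estimates how many 20 MB arrays fit under a heap limit. */
final class AllocationCountEstimate {
    /** Request size reported in the logs above, slightly more than 20 * 1024 * 1024. */
    static final long BLOCK_BYTES = 20971536L;

    /** Assumed heap usage before the test loop starts (an estimate, not a measured value). */
    static final long ASSUMED_BASELINE_BYTES = 2L * 1024 * 1024;

    /** Number of whole blocks that fit under limitBytes when baselineBytes are already used. */
    static long blocksThatFit(long limitBytes, long baselineBytes, long blockBytes) {
        return (limitBytes - baselineBytes) / blockBytes;
    }

    public static void main(String[] args) {
        long regularLimit = 134217728L; // growth limit from the case 1 log (128 MB)
        long largeLimit = 402653184L;   // growth limit from the case 2 log (384 MB)
        System.out.println("128 MB limit: "
                + blocksThatFit(regularLimit, ASSUMED_BASELINE_BYTES, BLOCK_BYTES) + " blocks fit");
        System.out.println("384 MB limit: "
                + blocksThatFit(largeLimit, ASSUMED_BASELINE_BYTES, BLOCK_BYTES) + " blocks fit");
    }
}
```

Under these assumptions 6 blocks fit under the 128 MB limit and 19 under the 384 MB limit, which matches the logs: the 7th request fails in case 1 and the 20th in case 2.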
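3.2 No contiguous address range large enough
In this case the OOM message carries an extra fragment of the form: failed due to fragmentation (required contiguous free <required_bytes> bytes for a new buffer where largest contiguous free <largest_continuous_free_pages> bytes). It is produced by the following code: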
void RosAlloc::LogFragmentationAllocFailure(std::ostream& os, size_t failed_alloc_bytes) {
...
if (required_bytes > largest_continuous_free_pages) {
os << "; failed due to fragmentation ("
<< "required contiguous free " << required_bytes << " bytes" << new_buffer_msg
<< ", largest contiguous free " << largest_continuous_free_pages << " bytes"
<< ", total free pages " << total_free << " bytes"
<< ", space footprint " << footprint_ << " bytes"
<< ", space max capacity " << max_capacity_ << " bytes"
<< ")" << std::endl;
}
}
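To see why an allocation can fail even though the total free space looks sufficient, here is a toy model of the check above. It is only a sketch: the heap is represented as an array of equally sized pages (true means in use), which is a simplification I am assuming for illustration and not how RosAlloc tracks its free page runs. All names are hypothetical.

```java
/** Toy model of fragmentation: enough free pages in total, but no run long enough. */
final class FragmentationDemo {
    /** Total number of free pages. */
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean u : used) {
            if (!u) {
                free++;
            }
        }
        return free;
    }

    /** Length of the longest run of consecutive free pages. */
    static int largestContiguousFree(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean u : used) {
            run = u ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /**
     * Same comparison as the snippet above (required > largest contiguous free),
     * restricted to requests that the total free space could otherwise satisfy.
     */
    static boolean failsDueToFragmentation(boolean[] used, int requiredPages) {
        return requiredPages <= totalFree(used)
                && requiredPages > largestContiguousFree(used);
    }

    public static void main(String[] args) {
        // Every other page is in use: 4 pages are free, but never 2 in a row.
        boolean[] pages = {true, false, true, false, true, false, true, false};
        System.out.println("total free         = " + totalFree(pages));
        System.out.println("largest contiguous = " + largestContiguousFree(pages));
        System.out.println("2-page request fails due to fragmentation: "
                + failsDueToFragmentation(pages, 2));
    }
}
```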
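4. OOM: Thread Creation Failure
For an analysis of how a thread (Thread) is created and how its memory is allocated on Android, see this article: //blog.csdn.net/u011578734/article/details/109331764
Creating a thread consumes substantial system resources (memory, for example). The creation process involves both the Java layer and the native layer, and the real work is done in the native layer. The relevant code is in /art/runtime/thread.cc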
void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon) {
// (a large amount of code omitted here)
{
std::string msg(child_jni_env_ext.get() == nullptr ?
StringPrintf("Could not allocate JNI Env: %s", error_msg.c_str()) :
StringPrintf("pthread_create (%s stack) failed: %s",
PrettySize(stack_size).c_str(), strerror(pthread_create_result)));
ScopedObjectAccess soa(env);
soa.Self()->ThrowOutOfMemoryError(msg.c_str());
}
}
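A rough summary follows (the original diagram was borrowed from material found online, out of laziness).
4.1 JNI Env creation failure
There are generally two causes:
1. File descriptor (FD) exhaustion makes JNIEnv creation fail. Logcat typically shows Too many open files … Could not allocate JNI Env
The OOM is raised when the number of FDs held by the process (available via ls /proc/pid/fd | wc -l) reaches the Max open files value in /proc/pid/limits.
E/art: ashmem_create_region failed for 'indirect ref table': Too many open files
java.lang.OutOfMemoryError: Could not allocate JNI Env
at java.lang.Thread.nativeCreate(Native Method)
at java.lang.Thread.start(Thread.java:730)
2. Insufficient virtual memory makes JNIEnv creation fail. Logcat typically shows Could not allocate JNI Env: Failed anonymous mmap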
08-19 17:51:50.662 3533 3533 E OOM_TEST: create thread : 1104
08-19 17:51:50.663 3533 3533 W com.demo: Throwing OutOfMemoryError "Could not allocate JNI Env: Failed anonymous mmap(0x0, 8192, 0x3, 0x22, -1, 0): Operation not permitted. See process maps in the log." (VmSize 2865432 kB)
08-19 17:51:50.663 3533 3533 E InputEventSender: Exception dispatching finished signal.
08-19 17:51:50.663 3533 3533 E MessageQueue-JNI: Exception in MessageQueue callback: handleReceiveCallback
08-19 17:51:50.668 3533 3533 E MessageQueue-JNI: java.lang.OutOfMemoryError: Could not allocate JNI Env: Failed anonymous mmap(0x0, 8192, 0x3, 0x22, -1, 0): Operation not permitted. See process maps in the log.
08-19 17:51:50.668 3533 3533 E MessageQueue-JNI: at java.lang.Thread.nativeCreate(Native Method)
08-19 17:51:50.668 3533 3533 E MessageQueue-JNI: at java.lang.Thread.start(Thread.java:887)
08-19 17:51:50.671 3533 3533 E AndroidRuntime: FATAL EXCEPTION: main
08-19 17:51:50.671 3533 3533 E AndroidRuntime: Process: com.demo, PID: 3533
08-19 17:51:50.671 3533 3533 E AndroidRuntime: java.lang.OutOfMemoryError: Could not allocate JNI Env: Failed anonymous mmap(0x0, 8192, 0x3, 0x22, -1, 0): Operation not permitted. See process maps in the log.
08-19 17:51:50.671 3533 3533 E AndroidRuntime: at java.lang.Thread.nativeCreate(Native Method)
08-19 17:51:50.671 3533 3533 E AndroidRuntime: at java.lang.Thread.start(Thread.java:887)
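For the FD case, a process can compare its own open-descriptor count with its limit at runtime. The sketch below reads /proc/self/fd and /proc/self/limits, the in-process equivalents of the /proc/pid paths mentioned above. The class and method names are hypothetical, and the parser assumes the usual column layout of the limits file (name, soft limit, hard limit, units). On Android, java.nio.file requires API level 26 or later.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper: compares the open FD count of this process with its soft limit. */
final class FdUsage {
    private static final String KEY = "Max open files";

    /** Returns the soft "Max open files" limit from the lines of a limits file, or -1 if absent. */
    static long parseMaxOpenFiles(List<String> limitsLines) {
        for (String line : limitsLines) {
            if (line.startsWith(KEY)) {
                // Remaining columns: soft limit, hard limit, units.
                String[] fields = line.substring(KEY.length()).trim().split("\\s+");
                return "unlimited".equals(fields[0]) ? Long.MAX_VALUE : Long.parseLong(fields[0]);
            }
        }
        return -1;
    }

    /** Counts the entries of /proc/self/fd, or returns -1 if the directory is not readable. */
    static int countOpenFds() {
        String[] names = new File("/proc/self/fd").list();
        return names == null ? -1 : names.length;
    }

    public static void main(String[] args) throws IOException {
        Path limits = Paths.get("/proc/self/limits");
        long softLimit = Files.isReadable(limits) ? parseMaxOpenFiles(Files.readAllLines(limits)) : -1;
        System.out.println("open fds = " + countOpenFds() + ", soft limit = " + softLimit);
    }
}
```

A value of -1 simply means the information was unavailable, for example when the code runs on a system without /proc.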
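4.2 Thread creation failure
There are generally two causes:
1. Insufficient virtual memory. Logcat typically shows mapped space: Out of memory … pthread_create (1040KB stack) failed: Out of memory
The native layer sets the thread stack size through FixStackSize. By default, the total memory needed for a thread stack is 1 MB + 8 KB + 8 KB, i.e. 1040 KB.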
// /art/runtime/thread.cc
static size_t FixStackSize(size_t stack_size) {
// Computes and adjusts stack_size here; the default result is typically 1040KB
}
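The 1040 KB figure, and what it implies for how many threads a process can hold, can be checked with a few lines of arithmetic. This sketch reproduces only the sum stated above (1 MB + 8 KB + 8 KB); it does not reimplement FixStackSize, and the constant and method names are hypothetical. The thread estimate is an upper bound that assumes the given address space is used for nothing but thread stacks.

```java
/** Hypothetical helper: the default thread stack arithmetic described above. */
final class ThreadStackMath {
    static final long KB = 1024L;
    static final long BASE_STACK_BYTES = 1024 * KB; // the "1 MB" term
    static final long FIRST_EXTRA_BYTES = 8 * KB;   // the first "8 KB" term
    static final long SECOND_EXTRA_BYTES = 8 * KB;  // the second "8 KB" term

    /** Total bytes reserved per thread stack under the defaults stated in the text. */
    static long totalStackBytes() {
        return BASE_STACK_BYTES + FIRST_EXTRA_BYTES + SECOND_EXTRA_BYTES;
    }

    /** Upper bound on how many such stacks fit in the given free address space. */
    static long maxStacks(long freeAddressSpaceBytes) {
        return freeAddressSpaceBytes / totalStackBytes();
    }

    public static void main(String[] args) {
        System.out.println("stack size = " + totalStackBytes() / KB + " KB");
        long oneGiB = 1024L * 1024L * 1024L;
        System.out.println("stacks per GiB of free address space <= " + maxStacks(oneGiB));
    }
}
```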
A typical logcat when this OOM occurs:
W/libc: pthread_create failed: couldn't allocate 1073152-bytes mapped space: Out of memory
W/tch.crowdsourc: Throwing OutOfMemoryError with VmSize 4191668 kB "pthread_create (1040KB stack) failed: Try again"
java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
at java.lang.Thread.nativeCreate(Native Method)
at java.lang.Thread.start(Thread.java:753)
2. The number of threads exceeds the limit. Logcat typically shows pthread_create failed: clone failed: Try again
08-19 18:55:07.725 22139 22139 E OOM_TEST: create thread : 54
08-19 18:55:07.725 22139 22139 W libc : pthread_create failed: clone failed: Try again
08-19 18:55:07.726 22139 22139 W com.demo: Throwing OutOfMemoryError "pthread_create (1040KB stack) failed: Try again" (VmSize 1715684 kB)
08-19 18:55:07.733 22139 22139 E InputEventSender: Exception dispatching finished signal.
08-19 18:55:07.733 22139 22139 E MessageQueue-JNI: Exception in MessageQueue callback: handleReceiveCallback
08-19 18:55:07.734 22786 22786 W externalstorag: Using default instruction set features for ARM CPU variant (generic) using conservative defaults
08-19 18:55:07.734 22786 22786 W libc : pthread_create failed: clone failed: Try again
08-19 18:55:07.735 22786 22786 F externalstorag: thread_pool.cc:66] pthread_create failed for new thread pool worker thread: Try again
08-19 18:55:07.737 22139 22139 E MessageQueue-JNI: java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
08-19 18:55:07.737 22139 22139 E MessageQueue-JNI: at java.lang.Thread.nativeCreate(Native Method)
08-19 18:55:07.739 22139 22139 E AndroidRuntime: java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
08-19 18:55:07.739 22139 22139 E AndroidRuntime: at java.lang.Thread.nativeCreate(Native Method)
08-19 18:55:07.739 22139 22139 E AndroidRuntime: at java.lang.Thread.start(Thread.java:887)
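The log patterns collected in sections 3 and 4 can be folded into a small triage helper. The sketch below is a hypothetical classifier that looks only for the substrings quoted in this post; real logs may vary between Android versions, so treat the patterns as assumptions rather than a stable contract. Because the OOM message for both thread-creation causes can read "Try again", the helper expects a logcat excerpt that also includes the preceding libc line.

```java
/** Hypothetical helper: maps a logcat excerpt to the OOM causes discussed in this post. */
final class OomClassifier {
    enum Cause {
        HEAP_LIMIT,
        HEAP_FRAGMENTATION,
        JNI_ENV_FD_LIMIT,
        JNI_ENV_VIRTUAL_MEMORY,
        THREAD_VIRTUAL_MEMORY,
        THREAD_COUNT_LIMIT,
        UNKNOWN
    }

    static Cause classify(String log) {
        if (log.contains("Could not allocate JNI Env")) {
            if (log.contains("Too many open files")) {
                return Cause.JNI_ENV_FD_LIMIT;
            }
            if (log.contains("Failed anonymous mmap")) {
                return Cause.JNI_ENV_VIRTUAL_MEMORY;
            }
            return Cause.UNKNOWN;
        }
        if (log.contains("pthread_create")) {
            if (log.contains("mapped space: Out of memory")) {
                return Cause.THREAD_VIRTUAL_MEMORY;
            }
            if (log.contains("clone failed: Try again")) {
                return Cause.THREAD_COUNT_LIMIT;
            }
            return Cause.UNKNOWN;
        }
        if (log.contains("Failed to allocate a") && log.contains("until OOM")) {
            return log.contains("failed due to fragmentation")
                    ? Cause.HEAP_FRAGMENTATION
                    : Cause.HEAP_LIMIT;
        }
        return Cause.UNKNOWN;
    }

    public static void main(String[] args) {
        String excerpt = "W libc : pthread_create failed: clone failed: Try again\n"
                + "java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again";
        System.out.println(classify(excerpt)); // prints THREAD_COUNT_LIMIT
    }
}
```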
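4.3 Debugging tips
- FD limits
Run cat /proc/pid/limits
to see Max open files, the maximum number of files the process may open.
Run ls /proc/pid/fd | wc -l
to see how many files the process currently has open.
- Thread count limits
Run cat /proc/sys/kernel/threads-max
to see the maximum number of threads the system can create.
Run echo 3000 > /proc/sys/kernel/threads-max
to change this value for testing.
Run top -H to see the threads currently running on the system.
Run cat /proc/{pid}/status to see the thread count of the current process (the Threads field).
- Virtual memory usage
Run cat /proc/pid/status | grep Vm
to see VmSize and VmPeak.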
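The same status fields can also be read from inside the app, which is handy for logging them just before a suspected OOM. The following is a minimal sketch: it reads /proc/self/status and pulls out Threads, VmSize and VmPeak. The class and method names are hypothetical, and values are returned as the raw text after the colon (for example "2865432 kB") rather than parsed into numbers.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper: extracts selected fields from the lines of /proc/<pid>/status. */
final class ProcStatus {
    /** Returns the trimmed text after "key:" on the matching line, or null if the key is absent. */
    static String field(List<String> statusLines, String key) {
        String prefix = key + ":";
        for (String line : statusLines) {
            if (line.startsWith(prefix)) {
                return line.substring(prefix.length()).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.out.println("/proc/self/status is not available on this system");
            return;
        }
        List<String> lines = Files.readAllLines(status);
        System.out.println("Threads = " + field(lines, "Threads"));
        System.out.println("VmSize  = " + field(lines, "VmSize"));
        System.out.println("VmPeak  = " + field(lines, "VmPeak"));
    }
}
```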
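4.4 OOM demo
The following code gives a simple demonstration. Because devices are configured differently, the exact OOM error may vary from device to device.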
private void testOOMCreatThread() {
    int count = 0;
    while (true) {
        Log.e("OOM_TEST", "create thread : " + ++count);
        new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(10000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }, "thread-" + count).start();
    }
}
—
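5. References and Recommended Reading
✔️ Android application OutOfMemory: 1. Understanding the OOM mechanism
✔️ Tuning virtual machine parameters: configuring heapgrowthlimit/heapsize
✔️ Analysis of thread (Thread) creation and memory allocation on Android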
—