Investigating Android OOM Problems: From Getting Started to Giving Up
- August 20, 2022
- Notes
- Android Programming
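1. Preface
A customer recently reported some OOM problems. I had looked into OOM briefly a long time ago, but so much time has passed that I no longer remember much of it.
Now that this OOM problem has come up, I took the opportunity to search for material and dig into OOM, and I am recording what I found here for later reference. This article borrows heavily from classic articles found online; it stands on the shoulders of giants.
Note: the analysis below is based on the Android R source.
2. Possible causes of OOM problems
Many detailed explanations can be found online. This is a brief summary of my own; it may not be complete and is intended for study and reference only.
How does the Android system come to throw OutOfMemoryError? Searching the source code for it turns up the places where it is thrown.
Two of them deserve the most attention:
✔️ OOM when a heap allocation fails == /art/runtime/gc/heap.cc
✔️ OOM when thread creation fails == /art/runtime/thread.cc
3. OOM: heap allocation failure
The source code shows that when a heap allocation fails, a characteristic log message is emitted, built by the following code: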
void Heap::ThrowOutOfMemoryError(Thread* self, size_t byte_count, AllocatorType allocator_type) {
  ...
  std::ostringstream oss;
  size_t total_bytes_free = GetFreeMemory();
  oss << "Failed to allocate a " << byte_count << " byte allocation with " << total_bytes_free
      << " free bytes and " << PrettySize(GetFreeMemoryUntilOOME()) << " until OOM,"
      << " target footprint " << target_footprint_.load(std::memory_order_relaxed)
      << ", growth limit "
      << growth_limit_;
  // If the allocation failed due to fragmentation, print out the largest continuous allocation.
  ...
}
When an OOM of this kind occurs, logcat should show output similar to the following:
08-19 11:34:53.860 28028 28028 E AndroidRuntime: java.lang.OutOfMemoryError: Failed to allocate a 20971536 byte allocation with 6147912 free bytes and 6003KB until OOM, target footprint 134217728, growth limit 134217728
Roughly, this logcat line says: the app tried to allocate 20971536 bytes of heap memory, but only 6147912 bytes of free heap remained, and the maximum heap the app may currently allocate is 134217728 bytes (128 MB).
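To make the fields of that message easier to read off, here is a minimal sketch of a parser in plain Java. The class name OomMessageParser and the regular expression are my own assumptions, written against the message format shown above; this is not part of ART or the Android SDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OomMessageParser {
    // Assumed pattern, mirroring the text assembled in Heap::ThrowOutOfMemoryError.
    private static final Pattern HEAP_OOM = Pattern.compile(
        "Failed to allocate a (\\d+) byte allocation with (\\d+) free bytes"
        + " and (\\S+) until OOM, target footprint (\\d+), growth limit (\\d+)");

    /** Returns {requestedBytes, freeBytes, targetFootprint, growthLimit}, or null if the line does not match. */
    public static long[] parse(String logLine) {
        Matcher m = HEAP_OOM.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),  // bytes requested by the failing allocation
            Long.parseLong(m.group(2)),  // free bytes left in the heap
            Long.parseLong(m.group(4)),  // target footprint in bytes
            Long.parseLong(m.group(5))   // growth limit in bytes
        };
    }

    public static void main(String[] args) {
        String line = "java.lang.OutOfMemoryError: Failed to allocate a 20971536 byte allocation"
            + " with 6147912 free bytes and 6003KB until OOM, target footprint 134217728,"
            + " growth limit 134217728";
        long[] f = parse(line);
        // prints: requested=20971536 free=6147912 growthLimitMB=128
        System.out.println("requested=" + f[0] + " free=" + f[1]
            + " growthLimitMB=" + f[3] / (1024 * 1024));
    }
}
```

The request (about 20 MB) is larger than the free bytes (about 6 MB), and the growth limit already equals the target footprint, so the heap has no room left to grow.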
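A heap allocation can fail for two reasons: 1. the app process exceeds its heap limit, or 2. there is no contiguous address range large enough.
3.1 Exceeding the app process's memory limit
On Android devices the Java virtual machine caps the maximum memory a single app may allocate; going over that cap results in an OOM. Runtime.getRuntime().maxMemory() returns the heap limit the system assigns to each process, and an OOM occurs once the process's usage reaches that limit. This is also the most common type of OOM on Android.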
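Android follows these conventions:
- /vendor/build.prop defines properties that cap the maximum heap of a single app:
dalvik.vm.heapgrowthlimit
The value used by regular apps.
dalvik.vm.heapsize
The value used once an app sets android:largeHeap="true" in AndroidManifest.xml, which makes it a "large heap" app.
- The following APIs can also be used in code to query the limits:
ActivityManager.getMemoryClass()
The maximum heap available to a regular app, in MB; corresponds to dalvik.vm.heapgrowthlimit.
ActivityManager.getLargeMemoryClass()
The maximum heap available, in MB, once the app sets android:largeHeap="true" in AndroidManifest.xml; corresponds to dalvik.vm.heapsize.
Runtime.getRuntime().maxMemory()
The heap limit, in bytes, that the system assigns to the current process; it equals one of the two values above.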
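As a rough illustration of how these limits relate, the sketch below uses only java.lang.Runtime, so it also runs off-device (ActivityManager needs an Android Context). The class HeapHeadroom and the helper effectiveLimitMb are hypothetical names of mine; the helper simply encodes the rule stated above.

```java
public class HeapHeadroom {
    /** Bytes the heap can still grow by before reaching Runtime.maxMemory(). */
    public static long headroomBytes() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // bytes currently occupied
        return rt.maxMemory() - used;
    }

    /**
     * Hypothetical helper: which limit applies, given the two property values in MB.
     * Encodes the rule from the text: largeHeap selects heapsize, otherwise heapgrowthlimit.
     */
    public static int effectiveLimitMb(int heapGrowthLimitMb, int heapSizeMb, boolean largeHeap) {
        return largeHeap ? heapSizeMb : heapGrowthLimitMb;
    }

    public static void main(String[] args) {
        System.out.println("maxMemory = " + Runtime.getRuntime().maxMemory() / 1024 / 1024 + "MB");
        System.out.println("headroom  = " + headroomBytes() / 1024 / 1024 + "MB");
        // Values from the author's test platform: 128 MB and 384 MB.
        System.out.println("regular   = " + effectiveLimitMb(128, 384, false) + "MB"); // 128MB
        System.out.println("largeHeap = " + effectiveLimitMb(128, 384, true) + "MB");  // 384MB
    }
}
```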
The following simple code demonstrates this type of OOM:
private void testOOMCreatHeap(Context context) {
    ActivityManager activityManager = (ActivityManager) context.getSystemService(Context.ACTIVITY_SERVICE);
    Log.d("OOM_TEST", "app maxMemory = " + activityManager.getMemoryClass() + "MB");
    Log.d("OOM_TEST", "large app maxMemory = " + activityManager.getLargeMemoryClass() + "MB");
    Log.d("OOM_TEST", "current app maxMemory = " + Runtime.getRuntime().maxMemory()/1024/1024 + "MB");
    List<byte[]> bytesList = new ArrayList<>();
    int count = 0;
    while (true) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        Log.e("OOM-TEST", "allocate 20MB heap: " + count++ + ", total " + 20*count + "MB");
        // Request 20 MB each time; the list keeps every array reachable, so GC cannot free it
        bytesList.add(new byte[1024 * 1024 * 20]);
    }
}
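Note: on my test platform, dalvik.vm.heapgrowthlimit=128MB and dalvik.vm.heapsize=384MB.
The test code above allocates 20 MB at a time.
Case 1: a regular app that does not set android:largeHeap="true" in AndroidManifest.xml. The app's Dalvik heap limit should then be dalvik.vm.heapgrowthlimit=128MB.
Results of the run: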
08-19 11:34:53.555 28028 28028 D OOM_TEST: app maxMemory = 128MB
08-19 11:34:53.556 28028 28028 D OOM_TEST: large app maxMemory = 384MB
08-19 11:34:53.556 28028 28028 D OOM_TEST: current app maxMemory = 128MB
08-19 11:34:53.576 28028 28028 E OOM-TEST: allocate 20MB heap: 0, total 20MB
08-19 11:34:53.596 28028 28028 E OOM-TEST: allocate 20MB heap: 1, total 40MB
08-19 11:34:53.617 28028 28028 E OOM-TEST: allocate 20MB heap: 2, total 60MB
08-19 11:34:53.637 28028 28028 E OOM-TEST: allocate 20MB heap: 3, total 80MB
08-19 11:34:53.658 28028 28028 E OOM-TEST: allocate 20MB heap: 4, total 100MB
08-19 11:34:53.678 28028 28028 E OOM-TEST: allocate 20MB heap: 5, total 120MB
08-19 11:34:53.699 28028 28028 E OOM-TEST: allocate 20MB heap: 6, total 140MB
08-19 11:34:53.699 28028 28028 I com.demo: Waiting for a blocking GC Alloc
08-19 11:34:53.713 28028 28042 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.713 28028 28028 I com.demo: WaitForGcToComplete blocked Alloc on AddRemoveAppImageSpace for 14.279ms
08-19 11:34:53.713 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.713 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.713 28028 28042 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.713 28028 28028 I com.demo: WaitForGcToComplete blocked Alloc on AddRemoveAppImageSpace for 14.279ms
08-19 11:34:53.713 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.713 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.732 28028 28028 I com.demo: Alloc young concurrent copying GC freed 4(31KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 73us total 19.225ms
08-19 11:34:53.733 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.766 28028 28028 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.767 28028 28028 I com.demo: Alloc concurrent copying GC freed 6(16KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 71us total 33.715ms
08-19 11:34:53.767 28028 28028 I com.demo: Forcing collection of SoftReferences for 20MB allocation
08-19 11:34:53.767 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.792 28028 28028 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.792 28028 28028 I com.demo: Alloc concurrent copying GC freed 1046(50KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 57us total 25.120ms
08-19 11:34:53.792 28028 28028 W com.demo: Throwing OutOfMemoryError "Failed to allocate a 20971532 byte allocation with 6147912 free bytes and 6003KB until OOM, target footprint 134217728, growth limit 134217728" (VmSize 1264080 kB)
08-19 11:34:53.793 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.793 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.808 28028 28028 I com.demo: Alloc young concurrent copying GC freed 4(31KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 62us total 15.229ms
08-19 11:34:53.808 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.835 28028 28028 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.836 28028 28028 I com.demo: Alloc concurrent copying GC freed 3(16KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 59us total 27.042ms
08-19 11:34:53.836 28028 28028 I com.demo: Forcing collection of SoftReferences for 20MB allocation
08-19 11:34:53.836 28028 28028 I com.demo: Starting a blocking GC Alloc
08-19 11:34:53.857 28028 28028 I com.demo: Clamp target GC heap from 146MB to 128MB
08-19 11:34:53.857 28028 28028 I com.demo: Alloc concurrent copying GC freed 6(16KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 122MB/128MB, paused 50us total 21.249ms
08-19 11:34:53.857 28028 28028 W com.demo: Throwing OutOfMemoryError "Failed to allocate a 20971536 byte allocation with 6147912 free bytes and 6003KB until OOM, target footprint 134217728, growth limit 134217728" (VmSize 1264016 kB)
08-19 11:34:53.858 28028 28028 E InputEventSender: Exception dispatching finished signal.
08-19 11:34:53.858 28028 28028 E MessageQueue-JNI: Exception in MessageQueue callback: handleReceiveCallback
08-19 11:34:53.859 28028 28028 E MessageQueue-JNI: java.lang.OutOfMemoryError: Failed to allocate a 20971536 byte allocation with 6147912 free bytes and 6003KB until OOM, target footprint 134217728, growth limit 134217728
08-19 11:34:53.859 28028 28028 E MessageQueue-JNI: at com.demo.MainActivity.testOOMCreatHeap(MainActivity.java:393)
08-19 11:34:53.859 28028 28028 E MessageQueue-JNI: at com.demo.MainActivity.onClick(MainActivity.java:450)
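Explanation:
By the time of the last allocation request, more than 120 MB had already been allocated, so allocating another 20 MB would clearly exceed the 128 MB limit. GC was unable to reclaim any significant amount of memory (only a few KB per pass), so the allocation finally failed and OutOfMemoryError was thrown.
Case 2: the same app with android:largeHeap="true" set in AndroidManifest.xml. The app's Dalvik heap limit should then be dalvik.vm.heapsize=384MB.
Results of the run: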
08-19 11:32:22.660 27539 27539 D OOM_TEST: app maxMemory = 128MB
08-19 11:32:22.660 27539 27539 D OOM_TEST: large app maxMemory = 384MB
08-19 11:32:22.660 27539 27539 D OOM_TEST: current app maxMemory = 384MB
08-19 11:32:23.048 27539 27539 E OOM-TEST: allocate 20MB heap: 18, total 380MB
08-19 11:32:23.061 27539 27553 I com.mediacodec: Clamp target GC heap from 406MB to 384MB
08-19 11:32:23.069 27539 27539 E OOM-TEST: allocate 20MB heap: 19, total 400MB
08-19 11:32:23.069 27539 27539 I com.demo: Starting a blocking GC Alloc
08-19 11:32:23.226 27539 27539 W com.demo: Throwing OutOfMemoryError "Failed to allocate a 20971536 byte allocation with 1900608 free bytes and 1856KB until OOM, target footprint 402653184, growth limit 402653184" (VmSize 2053220 kB)
08-19 11:32:23.226 27539 27539 E InputEventSender: Exception dispatching finished signal.
08-19 11:32:23.226 27539 27539 E MessageQueue-JNI: Exception in MessageQueue callback: handleReceiveCallback
08-19 11:32:23.227 27539 27539 E MessageQueue-JNI: java.lang.OutOfMemoryError: Failed to allocate a 20971536 byte allocation with 1900608 free bytes and 1856KB until OOM, target footprint 402653184, growth limit 402653184
08-19 11:32:23.227 27539 27539 E MessageQueue-JNI: at com.demo.MainActivity.testOOMCreatHeap(MainActivity.java:393)
08-19 11:32:23.227 27539 27539 E MessageQueue-JNI: at com.demo.MainActivity.onClick(MainActivity.java:450)
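Explanation:
By the time of the last allocation request, more than 380 MB had already been allocated, so allocating another 20 MB would clearly exceed the 384 MB limit. GC was again unable to reclaim any memory, so the allocation finally failed and OutOfMemoryError was thrown.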
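The arithmetic behind both cases can be checked with a small model. This is a sketch under assumptions: the class and method names are hypothetical, and the 2 MB baseline is only an estimate read from the "122MB/128MB" GC line after six 20 MB arrays had been allocated.

```java
public class AllocationBudget {
    private static final long MB = 1024L * 1024L;

    /**
     * Zero-based index of the first allocation that no longer fits under the limit.
     * A simplified model: it ignores GC behaviour and per-object headers.
     */
    public static int firstFailingIndex(long limitBytes, long chunkBytes, long alreadyUsedBytes) {
        long used = alreadyUsedBytes;
        int index = 0;
        while (used + chunkBytes <= limitBytes) {
            used += chunkBytes; // this allocation succeeds
            index++;
        }
        return index;
    }

    public static void main(String[] args) {
        // Regular app: limit 128 MB, 20 MB chunks, roughly 2 MB already in use.
        System.out.println(firstFailingIndex(128 * MB, 20 * MB, 2 * MB)); // prints 6
        // largeHeap app: limit 384 MB.
        System.out.println(firstFailingIndex(384 * MB, 20 * MB, 2 * MB)); // prints 19
    }
}
```

These indices agree with the logs: the request logged as "allocate 20MB heap: 6" is the one that fails in case 1, and "allocate 20MB heap: 19" is the one that fails in case 2.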
3.2 No contiguous address range large enough
In this case the OOM message carries an extra fragment of the form: failed due to fragmentation (required contiguous free <required_bytes> bytes for a new buffer where largest contiguous free <largest_continuous_free_pages> bytes)
void RosAlloc::LogFragmentationAllocFailure(std::ostream& os, size_t failed_alloc_bytes) {
  ...
  if (required_bytes > largest_continuous_free_pages) {
    os << "; failed due to fragmentation ("
       << "required contiguous free " << required_bytes << " bytes" << new_buffer_msg
       << ", largest contiguous free " << largest_continuous_free_pages << " bytes"
       << ", total free pages " << total_free << " bytes"
       << ", space footprint " << footprint_ << " bytes"
       << ", space max capacity " << max_capacity_ << " bytes"
       << ")" << std::endl;
  }
}
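A toy model may help show why an allocation can fail even though enough memory is free in total. The sketch below is not RosAlloc; the class name and the array-of-free-runs representation are assumptions for illustration. It adds a total-free check of my own to separate fragmentation from plain exhaustion, whereas the ART code above only compares required_bytes with the largest contiguous run.

```java
public class FragmentationCheck {
    /**
     * Returns true when the free runs add up to at least requiredBytes
     * but no single contiguous run is large enough to hold the request.
     */
    public static boolean failsDueToFragmentation(long[] freeRuns, long requiredBytes) {
        long total = 0;
        long largest = 0;
        for (long run : freeRuns) {
            total += run;
            largest = Math.max(largest, run);
        }
        return total >= requiredBytes && largest < requiredBytes;
    }

    public static void main(String[] args) {
        long[] scattered = {4096, 8192, 4096}; // 16384 bytes free, largest run 8192
        System.out.println(failsDueToFragmentation(scattered, 12288));          // true
        System.out.println(failsDueToFragmentation(new long[] {16384}, 12288)); // false
    }
}
```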
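4. OOM: thread creation failure
For an analysis of how a thread (Thread) is created and how its memory is allocated on Android, see this article: //blog.csdn.net/u011578734/article/details/109331764
Creating a thread consumes a considerable amount of system resources (memory, for example), and the process involves both the Java layer and the native layer. The real work is done in the native layer; the relevant code is in /art/runtime/thread.cc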
void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon) {
  // (a great deal of code omitted here)
  {
    std::string msg(child_jni_env_ext.get() == nullptr ?
        StringPrintf("Could not allocate JNI Env: %s", error_msg.c_str()) :
        StringPrintf("pthread_create (%s stack) failed: %s",
                     PrettySize(stack_size).c_str(), strerror(pthread_create_result)));
    ScopedObjectAccess soa(env);
    soa.Self()->ThrowOutOfMemoryError(msg.c_str());
  }
}
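Since each throw site produces a distinct message prefix, a crash report can be sorted into the cases covered in this article with plain substring checks. The following is a minimal sketch; the class OomKind and its enum are hypothetical names, and the substrings are taken from the two ART snippets quoted in this article.

```java
public class OomKind {
    public enum Kind { HEAP_ALLOCATION, JNI_ENV, PTHREAD_CREATE, UNKNOWN }

    /** Classifies an OutOfMemoryError message by the text its throw site produces. */
    public static Kind classify(String message) {
        if (message.contains("Could not allocate JNI Env")) {
            return Kind.JNI_ENV;          // thread.cc, child_jni_env_ext == nullptr branch
        }
        if (message.contains("pthread_create (")) {
            return Kind.PTHREAD_CREATE;   // thread.cc, pthread_create failure branch
        }
        if (message.contains("Failed to allocate a")) {
            return Kind.HEAP_ALLOCATION;  // heap.cc, Heap::ThrowOutOfMemoryError
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify("pthread_create (1040KB stack) failed: Try again"));
        // prints: PTHREAD_CREATE
    }
}
```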
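A rough summary follows. The figure below is borrowed from material found online (I was being lazy).
4.1 Failure to create the JNI Env
There are generally two causes:
1. FD exhaustion makes JNIEnv creation fail. logcat usually shows: Too many open files … Could not allocate JNI Env
An OOM occurs when the number of FDs held by the process (obtainable with ls /proc/pid/fd | wc -l) exceeds the Max open files value in /proc/pid/limits.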
E/art: ashmem_create_region failed for 'indirect ref table': Too many open files
java.lang.OutOfMemoryError: Could not allocate JNI Env
at java.lang.Thread.nativeCreate(Native Method)
at java.lang.Thread.start(Thread.java:730)
2. Insufficient virtual memory makes JNIEnv creation fail. logcat usually shows: Could not allocate JNI Env: Failed anonymous mmap
08-19 17:51:50.662 3533 3533 E OOM_TEST: create thread : 1104
08-19 17:51:50.663 3533 3533 W com.demo: Throwing OutOfMemoryError "Could not allocate JNI Env: Failed anonymous mmap(0x0, 8192, 0x3, 0x22, -1, 0): Operation not permitted. See process maps in the log." (VmSize 2865432 kB)
08-19 17:51:50.663 3533 3533 E InputEventSender: Exception dispatching finished signal.
08-19 17:51:50.663 3533 3533 E MessageQueue-JNI: Exception in MessageQueue callback: handleReceiveCallback
08-19 17:51:50.668 3533 3533 E MessageQueue-JNI: java.lang.OutOfMemoryError: Could not allocate JNI Env: Failed anonymous mmap(0x0, 8192, 0x3, 0x22, -1, 0): Operation not permitted. See process maps in the log.
08-19 17:51:50.668 3533 3533 E MessageQueue-JNI: at java.lang.Thread.nativeCreate(Native Method)
08-19 17:51:50.668 3533 3533 E MessageQueue-JNI: at java.lang.Thread.start(Thread.java:887)
08-19 17:51:50.671 3533 3533 E AndroidRuntime: FATAL EXCEPTION: main
08-19 17:51:50.671 3533 3533 E AndroidRuntime: Process: com.demo, PID: 3533
08-19 17:51:50.671 3533 3533 E AndroidRuntime: java.lang.OutOfMemoryError: Could not allocate JNI Env: Failed anonymous mmap(0x0, 8192, 0x3, 0x22, -1, 0): Operation not permitted. See process maps in the log.
08-19 17:51:50.671 3533 3533 E AndroidRuntime: at java.lang.Thread.nativeCreate(Native Method)
08-19 17:51:50.671 3533 3533 E AndroidRuntime: at java.lang.Thread.start(Thread.java:887)
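For the FD case, the comparison between open descriptors and the limit can also be made from inside the process. The sketch below is an assumption-laden illustration: FdLimit is a hypothetical class, the sample limits text in main is made up for the demo, and /proc/self is only available on Linux-based systems (the method returns -1 elsewhere).

```java
import java.io.File;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FdLimit {
    // Row format of /proc/<pid>/limits: name, soft limit, hard limit, units.
    private static final Pattern MAX_OPEN_FILES =
        Pattern.compile("Max open files\\s+(\\S+)\\s+(\\S+)");

    /** Soft "Max open files" limit from the limits text; -1 if absent or "unlimited". */
    public static long softLimit(String limitsText) {
        Matcher m = MAX_OPEN_FILES.matcher(limitsText);
        if (!m.find() || m.group(1).equals("unlimited")) {
            return -1;
        }
        return Long.parseLong(m.group(1));
    }

    /** Number of entries in /proc/self/fd, or -1 where that directory is unavailable. */
    public static int openFdCount() {
        String[] entries = new File("/proc/self/fd").list();
        return entries == null ? -1 : entries.length;
    }

    public static void main(String[] args) {
        // Illustrative sample text, not captured from a real device.
        String sample = "Limit                     Soft Limit           Hard Limit           Units\n"
            + "Max open files            32768                32768                files\n";
        System.out.println("soft limit = " + softLimit(sample)); // prints: soft limit = 32768
        System.out.println("open fds   = " + openFdCount());
    }
}
```

Logging these two numbers periodically makes an FD leak visible long before the limit is reached.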
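4.2 Failure to create the thread
There are generally two causes:
1. Insufficient virtual memory. logcat usually shows: mapped space: Out of memory … pthread_create (1040KB stack) failed: Out of memory
The native layer sets the thread stack size in FixStackSize. By default the total memory needed for a thread stack is 1 MB + 8 KB + 8 KB, i.e. 1040 KB.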
// /art/runtime/thread.cc
static size_t FixStackSize(size_t stack_size) {
  // Computes and sets stack_size here; the default is typically 1040KB
}
A typical logcat when this OOM occurs:
W/libc: pthread_create failed: couldn't allocate 1073152-bytes mapped space: Out of memory
W/tch.crowdsourc: Throwing OutOfMemoryError with VmSize 4191668 kB "pthread_create (1040KB stack) failed: Try again"
java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
at java.lang.Thread.nativeCreate(Native Method)
at java.lang.Thread.start(Thread.java:753)
2. The number of threads exceeds the limit. logcat usually shows: pthread_create failed: clone failed: Try again
08-19 18:55:07.725 22139 22139 E OOM_TEST: create thread : 54
08-19 18:55:07.725 22139 22139 W libc : pthread_create failed: clone failed: Try again
08-19 18:55:07.726 22139 22139 W com.demo: Throwing OutOfMemoryError "pthread_create (1040KB stack) failed: Try again" (VmSize 1715684 kB)
08-19 18:55:07.733 22139 22139 E InputEventSender: Exception dispatching finished signal.
08-19 18:55:07.733 22139 22139 E MessageQueue-JNI: Exception in MessageQueue callback: handleReceiveCallback
08-19 18:55:07.734 22786 22786 W externalstorag: Using default instruction set features for ARM CPU variant (generic) using conservative defaults
08-19 18:55:07.734 22786 22786 W libc : pthread_create failed: clone failed: Try again
08-19 18:55:07.735 22786 22786 F externalstorag: thread_pool.cc:66] pthread_create failed for new thread pool worker thread: Try again
08-19 18:55:07.737 22139 22139 E MessageQueue-JNI: java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
08-19 18:55:07.737 22139 22139 E MessageQueue-JNI: at java.lang.Thread.nativeCreate(Native Method)
08-19 18:55:07.739 22139 22139 E AndroidRuntime: java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
08-19 18:55:07.739 22139 22139 E AndroidRuntime: at java.lang.Thread.nativeCreate(Native Method)
08-19 18:55:07.739 22139 22139 E AndroidRuntime: at java.lang.Thread.start(Thread.java:887)
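The 1040 KB figure and its effect on virtual memory can be worked through in a few lines. This is a back-of-the-envelope sketch with hypothetical names; the thread-count estimate is only an upper bound, because it assumes the whole free address range is available for stacks and nothing else.

```java
public class ThreadStackBudget {
    private static final long KB = 1024L;

    /** Default stack size from the text: 1 MB + 8 KB + 8 KB. */
    public static long defaultStackBytes() {
        return 1024 * KB + 8 * KB + 8 * KB;
    }

    /** Upper bound on how many default-size stacks fit into the given free virtual address space. */
    public static long maxThreadsByAddressSpace(long freeVirtualBytes) {
        return freeVirtualBytes / defaultStackBytes();
    }

    public static void main(String[] args) {
        System.out.println(defaultStackBytes() / KB + "KB");                  // prints 1040KB
        System.out.println(maxThreadsByAddressSpace(1024L * 1024L * 1024L)); // prints 1008
    }
}
```

Note that the libc line in the first log reports 1073152 bytes for the mapping, which is 8 KB more than the 1064960 bytes computed here, so the real cost per thread is slightly higher than this model assumes.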
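4.3 Debugging tips
- FD limits
Run cat /proc/pid/limits
to see Max open files, the maximum number of files the process may have open.
Run ls /proc/pid/fd | wc -l
to see how many files the process currently has open.
- Thread-count limits
Run cat /proc/sys/kernel/threads-max
to see the maximum number of threads the system can create.
Run echo 3000 > /proc/sys/kernel/threads-max
to change that value for testing.
Run top -H to see the threads currently in the system.
Run cat /proc/{pid}/status to see the current process's thread count (the Threads field).
- Virtual memory usage
Run cat /proc/pid/status | grep Vm
to see VmSize and VmPeak.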
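The status fields mentioned above can also be extracted programmatically, for example from a status dump attached to a bug report. The sketch below is a minimal parser; ProcStatus is a hypothetical class and the sample text in main is illustrative, not captured from a device.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProcStatus {
    /** Parses "Key:<whitespace>value" lines as found in /proc/<pid>/status. */
    public static Map<String, String> parse(String statusText) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : statusText.split("\n")) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon), line.substring(colon + 1).trim());
            }
        }
        return fields;
    }

    /** Converts a value such as "2865432 kB" to a number of kilobytes. */
    public static long kb(String value) {
        return Long.parseLong(value.replace("kB", "").trim());
    }

    public static void main(String[] args) {
        // Illustrative sample; real files contain many more fields.
        String sample = "Name:\tcom.demo\nVmPeak:\t 2865500 kB\nVmSize:\t 2865432 kB\nThreads:\t54\n";
        Map<String, String> f = parse(sample);
        System.out.println("Threads=" + f.get("Threads") + " VmSizeKB=" + kb(f.get("VmSize")));
        // prints: Threads=54 VmSizeKB=2865432
    }
}
```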
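4.4 OOM demonstration
The following program gives a simple demonstration. Because devices are configured with different parameters, the exact OOM error may differ from device to device.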
private void testOOMCreatThread() {
    int count = 0;
    while (true) {
        Log.e("OOM_TEST", "create thread : " + ++count);
        new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(10000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }, "thread-" + count).start();
    }
}
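For experiments where crashing the process is not desirable, a capped variant of the same idea can report how many threads were started before creation failed. This is a sketch of my own, not the author's test: ThreadCreationProbe and its cap parameter are hypothetical, it uses plain Java without android.util.Log, and the threads are parked on a latch and released at the end.

```java
import java.util.concurrent.CountDownLatch;

public class ThreadCreationProbe {
    /**
     * Starts up to cap parked threads and returns how many were started
     * before an OutOfMemoryError occurred (or cap if none occurred).
     */
    public static int createThreads(int cap) {
        CountDownLatch release = new CountDownLatch(1);
        int started = 0;
        try {
            while (started < cap) {
                Thread t = new Thread(() -> {
                    try {
                        release.await(); // keep the thread alive until the probe finishes
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }, "probe-" + started);
                t.setDaemon(true);
                t.start();
                started++;
            }
        } catch (OutOfMemoryError e) {
            System.err.println("thread creation failed after " + started + ": " + e.getMessage());
        } finally {
            release.countDown(); // let every parked thread exit
        }
        return started;
    }

    public static void main(String[] args) {
        System.out.println("started " + createThreads(8) + " threads");
    }
}
```

Raising the cap step by step shows roughly where a given device starts failing, and the caught message indicates which of the causes in 4.1 or 4.2 was hit.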
—
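5. References and recommended reading
✔️ Android application OutOfMemory, part 1: understanding the OOM mechanism
✔️ On tuning virtual machine parameters: configuring heapgrowthlimit/heapsize
✔️ Analysis of thread (Thread) creation and memory allocation on Android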
—