Scanning Python Code for Unsafe Function Usage with bandit

Technical Background

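When running security scans over open-source Python libraries, we sometimes need to determine whether the functions a library uses could have unintended effects on the environment in which the code runs. A typical example is the Python sandbox escape problem: certain Python libraries (including standard library modules such as subprocess) can execute system shell commands, which falls outside what a Python sandbox can protect against. We will not go into sandbox escapes in depth here. The problem has troubled the industry for years, and even the official Python developers have said that there is no perfect protection scheme for a Python sandbox. It serves only as a background example: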

# subprocess_Popen.py

import subprocess
import uuid

subprocess.Popen('touch ' + str(uuid.uuid1()) +'.txt', shell = True)

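This script uses the subprocess module to open a system shell and run a touch command, which creates a file with the given name, much as mkdir creates a directory. After the script runs successfully, a txt file named with a random uuid appears in the current directory: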

[dechin@dechin-manjaro bandit_test]$ python3 subprocess_Popen.py 
[dechin@dechin-manjaro bandit_test]$ ll
total 4
-rw-r--r-- 1 dechin dechin   0 Jan 26 23:03 b7aa0fc8-5fe7-11eb-b5d3-058313e110e4.txt
-rw-r--r-- 1 dechin dechin 123 Jan 26 23:03 subprocess_Popen.py

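However, the point here is not what this script does but the fact that it imports the subprocess module. Because of how Python works, once a module in your system imports subprocess, any code that can import that module can also reach subprocess through it. Below is Python code that a malicious user might write: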

# bad.py

from subprocess_Popen import subprocess as subprocess

subprocess.Popen('touch bad.txt', shell = True)

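The purpose of this code is to call subprocess functions without importing subprocess directly, using the previously created subprocess_Popen.py as a bridge. Running the script gives the following result: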

[dechin@dechin-manjaro bandit_test]$ python3 bad.py 
[dechin@dechin-manjaro bandit_test]$ ll
total 12
-rw-r--r-- 1 dechin dechin    0 Jan 26 23:13 0fda7ede-5fe9-11eb-80a8-ad279ab4e0a6.txt
-rw-r--r-- 1 dechin dechin    0 Jan 26 23:03 b7aa0fc8-5fe7-11eb-b5d3-058313e110e4.txt
-rw-r--r-- 1 dechin dechin  113 Jan 26 23:13 bad.py
-rw-r--r-- 1 dechin dechin    0 Jan 26 23:13 bad.txt
drwxr-xr-x 2 dechin dechin 4096 Jan 26 23:13 __pycache__
-rw-r--r-- 1 dechin dechin  123 Jan 26 23:03 subprocess_Popen.py

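This result means that bad.py successfully called the subprocess module imported by subprocess_Popen.py and created a file named bad.txt with touch. Note that importing subprocess_Popen also executed its module-level code, which is why a second uuid-named txt file appeared.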
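The same bridging effect can be observed with the standard library alone. The sketch below is only an illustration: the helper name reachable_modules is hypothetical, and shutil is used simply because it happens to import os at module level.

```python
import os
import shutil
import types


def reachable_modules(module):
    """Return the names of module-level attributes that are themselves modules."""
    return sorted(
        name
        for name, value in vars(module).items()
        if isinstance(value, types.ModuleType)
    )


# shutil imports os internally, so os is reachable as an attribute of shutil
# even if the calling code never writes "import os" itself.
bridged = shutil.os
same_object = bridged is os
names = reachable_modules(shutil)
```

Any module listed in names can be reached through shutil in exactly the way bad.py reaches subprocess through subprocess_Popen.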

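That concludes the background example, but the logic behind it deserves a recap. Our original goal is to keep sandbox escape problems out of our own system, so we scan code submitted by other people, for instance with the bandit tool introduced below, in order to block "dangerous functions" such as subprocess. If, however, a Python library we wrote ourselves, or a third-party library we depend on, contains an import of something like subprocess, the block stops working: users can go through that import as a bridge and call subprocess functions at will. Under such requirements we therefore need to scan our own code for unsafe function usage as well, so that it does not introduce unexpected security risks into other people's systems. bandit is just one of the tools available for this kind of scan. Next we cover its basic installation and usage.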
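To make the idea of scanning our own code concrete, here is a minimal sketch built on the standard ast module. It is not how bandit is implemented; the function name find_blocked_imports and the one-entry blocklist are assumptions made for illustration.

```python
import ast

BLOCKED = {"subprocess"}  # illustrative blocklist, not bandit's actual list


def find_blocked_imports(source, blocked=BLOCKED):
    """Return (line, name) pairs for imports that expose a blocked module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Plain form: import subprocess
            for alias in node.names:
                if alias.name.split(".")[0] in blocked:
                    hits.append((node.lineno, alias.name))
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            # Form: from subprocess import Popen
            if module.split(".")[0] in blocked:
                hits.append((node.lineno, module))
            # Bridged form: from subprocess_Popen import subprocess
            for alias in node.names:
                if alias.name in blocked:
                    hits.append((node.lineno, f"{module}.{alias.name}"))
    return hits


direct = "import subprocess\nimport uuid\n"
bridged = "from subprocess_Popen import subprocess as subprocess\n"
direct_hits = find_blocked_imports(direct)
bridged_hits = find_blocked_imports(bridged)
```

The bridged form is caught here only because the imported name is literally subprocess; a bridge module that re-exports it under another name would slip past this sketch.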

Installing bandit with pip

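Here we install bandit directly with pip; it can also be installed from source if needed. For configuring a mirror in China for pip, see this blog post on installing third-party Python libraries.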

[dechin@dechin-manjaro bandit_test]$ python3 -m pip install bandit
Collecting bandit
  Downloading bandit-1.7.0-py3-none-any.whl (115 kB)
     |████████████████████████████████| 115 kB 101 kB/s 
Requirement already satisfied: PyYAML>=5.3.1 in /home/dechin/anaconda3/lib/python3.8/site-packages (from bandit) (5.3.1)
Collecting GitPython>=1.0.1
  Downloading GitPython-3.1.12-py3-none-any.whl (159 kB)
     |████████████████████████████████| 159 kB 28 kB/s 
Requirement already satisfied: six>=1.10.0 in /home/dechin/anaconda3/lib/python3.8/site-packages (from bandit) (1.15.0)
Collecting stevedore>=1.20.0
  Downloading stevedore-3.3.0-py3-none-any.whl (49 kB)
     |████████████████████████████████| 49 kB 25 kB/s 
Collecting gitdb<5,>=4.0.1
  Downloading gitdb-4.0.5-py3-none-any.whl (63 kB)
     |████████████████████████████████| 63 kB 28 kB/s 
Collecting pbr!=2.1.0,>=2.0.0
  Downloading pbr-5.5.1-py2.py3-none-any.whl (106 kB)
     |████████████████████████████████| 106 kB 26 kB/s 
Collecting smmap<4,>=3.0.1
  Downloading smmap-3.0.5-py2.py3-none-any.whl (25 kB)
Installing collected packages: smmap, gitdb, GitPython, pbr, stevedore, bandit
Successfully installed GitPython-3.1.12 bandit-1.7.0 gitdb-4.0.5 pbr-5.5.1 smmap-3.0.5 stevedore-3.3.0

Once the installation finishes, the following command verifies that it succeeded:

[dechin@dechin-manjaro bandit_test]$ bandit -h
usage: bandit [-h] [-r] [-a {file,vuln}] [-n CONTEXT_LINES] [-c CONFIG_FILE] [-p PROFILE] [-t TESTS] [-s SKIPS] [-l] [-i] [-f {csv,custom,html,json,screen,txt,xml,yaml}] [--msg-template MSG_TEMPLATE] [-o [OUTPUT_FILE]] [-v] [-d] [-q]
              [--ignore-nosec] [-x EXCLUDED_PATHS] [-b BASELINE] [--ini INI_PATH] [--exit-zero] [--version]
              [targets [targets ...]]

Bandit - a Python source code security analyzer

positional arguments:
  targets               source file(s) or directory(s) to be tested

optional arguments:
  -h, --help            show this help message and exit
  -r, --recursive       find and process files in subdirectories
  -a {file,vuln}, --aggregate {file,vuln}
                        aggregate output by vulnerability (default) or by filename
  -n CONTEXT_LINES, --number CONTEXT_LINES
                        maximum number of code lines to output for each issue
  -c CONFIG_FILE, --configfile CONFIG_FILE
                        optional config file to use for selecting plugins and overriding defaults
  -p PROFILE, --profile PROFILE
                        profile to use (defaults to executing all tests)
  -t TESTS, --tests TESTS
                        comma-separated list of test IDs to run
  -s SKIPS, --skip SKIPS
                        comma-separated list of test IDs to skip
  -l, --level           report only issues of a given severity level or higher (-l for LOW, -ll for MEDIUM, -lll for HIGH)
  -i, --confidence      report only issues of a given confidence level or higher (-i for LOW, -ii for MEDIUM, -iii for HIGH)
  -f {csv,custom,html,json,screen,txt,xml,yaml}, --format {csv,custom,html,json,screen,txt,xml,yaml}
                        specify output format
  --msg-template MSG_TEMPLATE
                        specify output message template (only usable with --format custom), see CUSTOM FORMAT section for list of available values
  -o [OUTPUT_FILE], --output [OUTPUT_FILE]
                        write report to filename
  -v, --verbose         output extra information like excluded and included files
  -d, --debug           turn on debug mode
  -q, --quiet, --silent
                        only show output in the case of an error
  --ignore-nosec        do not skip lines with # nosec comments
  -x EXCLUDED_PATHS, --exclude EXCLUDED_PATHS
                        comma-separated list of paths (glob patterns supported) to exclude from scan (note that these are in addition to the excluded paths provided in the config file) (default:
                        .svn,CVS,.bzr,.hg,.git,__pycache__,.tox,.eggs,*.egg)
  -b BASELINE, --baseline BASELINE
                        path of a baseline report to compare against (only JSON-formatted files are accepted)
  --ini INI_PATH        path to a .bandit file that supplies command line arguments
  --exit-zero           exit with 0, even with results found
  --version             show program's version number and exit

CUSTOM FORMATTING
-----------------

Available tags:

    {abspath}, {relpath}, {line},  {test_id},
    {severity}, {msg}, {confidence}, {range}

Example usage:

    Default template:
    bandit -r examples/ --format custom --msg-template \
    "{abspath}:{line}: {test_id}[bandit]: {severity}: {msg}"

    Provides same output as:
    bandit -r examples/ --format custom

    Tags can also be formatted in python string.format() style:
    bandit -r examples/ --format custom --msg-template \
    "{relpath:20.20s}: {line:03}: {test_id:^8}: DEFECT: {msg:>20}"

    See python documentation for more information about formatting style:
    //docs.python.org/3/library/string.html

The following tests were discovered and loaded:
-----------------------------------------------
        B101    assert_used
        B102    exec_used
        B103    set_bad_file_permissions
        B104    hardcoded_bind_all_interfaces
        B105    hardcoded_password_string
        B106    hardcoded_password_funcarg
        B107    hardcoded_password_default
        B108    hardcoded_tmp_directory
        B110    try_except_pass
        B112    try_except_continue
        B201    flask_debug_true
        B301    pickle
        B302    marshal
        B303    md5
        B304    ciphers
        B305    cipher_modes
        B306    mktemp_q
        B307    eval
        B308    mark_safe
        B309    httpsconnection
        B310    urllib_urlopen
        B311    random
        B312    telnetlib
        B313    xml_bad_cElementTree
        B314    xml_bad_ElementTree
        B315    xml_bad_expatreader
        B316    xml_bad_expatbuilder
        B317    xml_bad_sax
        B318    xml_bad_minidom
        B319    xml_bad_pulldom
        B320    xml_bad_etree
        B321    ftplib
        B323    unverified_context
        B324    hashlib_new_insecure_functions
        B325    tempnam
        B401    import_telnetlib
        B402    import_ftplib
        B403    import_pickle
        B404    import_subprocess
        B405    import_xml_etree
        B406    import_xml_sax
        B407    import_xml_expat
        B408    import_xml_minidom
        B409    import_xml_pulldom
        B410    import_lxml
        B411    import_xmlrpclib
        B412    import_httpoxy
        B413    import_pycrypto
        B501    request_with_no_cert_validation
        B502    ssl_with_bad_version
        B503    ssl_with_bad_defaults
        B504    ssl_with_no_version
        B505    weak_cryptographic_key
        B506    yaml_load
        B507    ssh_no_host_key_verification
        B601    paramiko_calls
        B602    subprocess_popen_with_shell_equals_true
        B603    subprocess_without_shell_equals_true
        B604    any_other_function_with_shell_equals_true
        B605    start_process_with_a_shell
        B606    start_process_with_no_shell
        B607    start_process_with_partial_path
        B608    hardcoded_sql_expressions
        B609    linux_commands_wildcard_injection
        B610    django_extra_used
        B611    django_rawsql_used
        B701    jinja2_autoescape_false
        B702    use_of_mako_templates
        B703    django_mark_safe
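The installation check can also be done from Python, for example in a build script. This is a small sketch; the helper name bandit_available is hypothetical, and it only reports whether the package or its command-line entry point can be located, not which version is installed.

```python
import importlib.util
import shutil


def bandit_available():
    """Return True if the bandit package or its CLI entry point can be found."""
    has_module = importlib.util.find_spec("bandit") is not None
    has_cli = shutil.which("bandit") is not None
    return has_module or has_cli


available = bandit_available()
```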

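From this list of checks we can see what the so-called "dangerous functions" are; commonly used modules such as subprocess and random are both included. subprocess is listed because it can invoke a shell, while random is listed because it generates pseudo-random numbers. (As a brief aside, the secrets module is now generally recommended when security-grade random numbers are needed, although arguably only the measurement of a quantum superposition yields truly random numbers.)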
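The reason random is flagged can be shown in a few lines: its output is fully determined by the seed, whereas secrets draws from the operating system's randomness source. The variable names below are illustrative only.

```python
import random
import secrets

# The same seed always reproduces the same "random" value.
random.seed(42)
first = random.random()
random.seed(42)
second = random.random()
reproducible = first == second

# secrets has no seed to replay; token_hex(16) returns 16 random bytes
# encoded as 32 hexadecimal characters.
token = secrets.token_hex(16)
```

Reproducibility is useful for simulations and tests, which is why a B311 finding is a prompt to check the use case rather than an automatic defect.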

Common bandit Usage

  1. Scan a single .py file directly:
[dechin@dechin-manjaro bandit_test]$ bandit subprocess_Popen.py 
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[node_visitor]  INFO    Unable to find qualified name for module: subprocess_Popen.py
Run started:2021-01-26 15:31:00.425603

Test results:
>> Issue: [B404:blacklist] Consider possible security implications associated with subprocess module.
   Severity: Low   Confidence: High
   Location: subprocess_Popen.py:3
   More Info: //bandit.readthedocs.io/en/latest/blacklists/blacklist_imports.html#b404-import-subprocess
2
3       import subprocess
4       import uuid

--------------------------------------------------
>> Issue: [B602:subprocess_popen_with_shell_equals_true] subprocess call with shell=True identified, security issue.
   Severity: High   Confidence: High
   Location: subprocess_Popen.py:6
   More Info: //bandit.readthedocs.io/en/latest/plugins/b602_subprocess_popen_with_shell_equals_true.html
5
6       subprocess.Popen('touch ' + str(uuid.uuid1()) +'.txt', shell = True)

--------------------------------------------------

Code scanned:
        Total lines of code: 3
        Total lines skipped (#nosec): 0

Run metrics:
        Total issues (by severity):
                Undefined: 0.0
                Low: 1.0
                Medium: 0.0
                High: 1.0
        Total issues (by confidence):
                Undefined: 0.0
                Low: 0.0
                Medium: 0.0
                High: 2.0
Files skipped (0):

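Scanning subprocess_Popen.py, the file we created earlier that calls the dangerous subprocess module, identifies the "dangerous function" in it. Note that besides the B404 import warning (Severity: Low, Confidence: High), the call itself is reported as issue B602, rated Severity: High, Confidence: High. If we instead use bandit to scan bad.py, which reaches the dangerous function indirectly through another module's import, the result is different: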

[dechin@dechin-manjaro bandit_test]$ bandit bad.py 
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[node_visitor]  INFO    Unable to find qualified name for module: bad.py
Run started:2021-01-26 15:30:47.370468

Test results:
>> Issue: [B404:blacklist] Consider possible security implications associated with subprocess module.
   Severity: Low   Confidence: High
   Location: bad.py:3
   More Info: //bandit.readthedocs.io/en/latest/blacklists/blacklist_imports.html#b404-import-subprocess
2
3       from subprocess_Popen import subprocess as subprocess
4
5       subprocess.Popen('touch bad.txt', shell = True)

--------------------------------------------------
>> Issue: [B604:any_other_function_with_shell_equals_true] Function call with shell=True parameter identified, possible security issue.
   Severity: Medium   Confidence: Low
   Location: bad.py:5
   More Info: //bandit.readthedocs.io/en/latest/plugins/b604_any_other_function_with_shell_equals_true.html
4
5       subprocess.Popen('touch bad.txt', shell = True)

--------------------------------------------------

Code scanned:
        Total lines of code: 2
        Total lines skipped (#nosec): 0

Run metrics:
        Total issues (by severity):
                Undefined: 0.0
                Low: 1.0
                Medium: 1.0
                High: 0.0
        Total issues (by confidence):
                Undefined: 0.0
                Low: 1.0
                Medium: 0.0
                High: 1.0
Files skipped (0):

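Note that although this script does the same thing as the previous example, the issue reported for the call is now B604, and its rating has become Severity: Medium, Confidence: Low. What matters is not the new rating itself but the fact that it changed. bandit analyzes each file statically and matches the names it finds there against lists of known dangerous names; it does not execute the code or follow an import into another module to find out what a name really refers to. In bad.py the call therefore no longer matches the subprocess-specific check B602 and is caught only by the generic shell=True check B604. For indirect calls of this kind, bandit cannot always identify the dangerous function accurately, and with further indirection it may fail to detect the risky call at all.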
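The change from B602 to B604 can be illustrated with a simplified model of name resolution. The sketch below is an assumption-laden toy, not bandit's actual code: the function call_qualnames is hypothetical, and it resolves each call's dotted name using only the import statements of the file being examined.

```python
import ast


def call_qualnames(source):
    """Resolve each call's dotted name using only this file's own imports."""
    tree = ast.parse(source)
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for a in node.names:
                aliases[a.asname or a.name] = a.name
        elif isinstance(node, ast.ImportFrom):
            module = node.module or ""
            for a in node.names:
                aliases[a.asname or a.name] = f"{module}.{a.name}"
    names = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            parts = []
            func = node.func
            # Unwind attribute access such as subprocess.Popen into its parts.
            while isinstance(func, ast.Attribute):
                parts.append(func.attr)
                func = func.value
            if isinstance(func, ast.Name):
                head = aliases.get(func.id, func.id)
                names.append(".".join([head] + parts[::-1]))
    return names


direct = (
    "import subprocess\n"
    "import uuid\n"
    "subprocess.Popen('touch ' + str(uuid.uuid1()) + '.txt', shell=True)\n"
)
bridged = (
    "from subprocess_Popen import subprocess as subprocess\n"
    "subprocess.Popen('touch bad.txt', shell=True)\n"
)
direct_names = call_qualnames(direct)
bridged_names = call_qualnames(bridged)
```

Under this model the direct file yields subprocess.Popen, while the bridged file yields subprocess_Popen.subprocess.Popen, a name that a check looking for subprocess.Popen would not match.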

  2. Scan all .py files in a directory and write the result to a txt file
[dechin@dechin-manjaro bandit_test]$ bandit *.py -o test_bandit.txt -f txt
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[node_visitor]  INFO    Unable to find qualified name for module: bad.py
[node_visitor]  INFO    Unable to find qualified name for module: subprocess_Popen.py
[text]  INFO    Text output written to file: test_bandit.txt

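This example scans every .py file in the current directory, which here means just bad.py and subprocess_Popen.py, and saves the final result to test_bandit.txt. We do not show the contents of the txt file; it is essentially the two reports from the previous item combined.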
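When bandit is driven from a script, the command line can be assembled programmatically. The helper below is a hypothetical sketch (build_bandit_argv is not part of bandit); it uses only the -r, -f, -o and -s flags documented in the help output above.

```python
def build_bandit_argv(targets, recursive=False, fmt=None, output=None, skips=()):
    """Build an argument list for the bandit CLI from documented flags."""
    argv = ["bandit"]
    if recursive:
        argv.append("-r")          # descend into subdirectories
    if fmt:
        argv += ["-f", fmt]        # output format, e.g. txt, html, json
    if output:
        argv += ["-o", output]     # write the report to this file
    if skips:
        argv += ["-s", ",".join(skips)]  # comma-separated test IDs to skip
    return argv + list(targets)


txt_scan = build_bandit_argv(
    ["bad.py", "subprocess_Popen.py"], fmt="txt", output="test_bandit.txt"
)
```

The resulting list can be passed to subprocess.run. According to the help text, bandit exits with a non-zero status when it finds issues unless --exit-zero is given, so a caller should not treat a non-zero return code as a crash.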

  3. Scan .py files in nested subdirectories and write the result to an html file

Suppose we need to scan a directory structured as follows:

[dechin@dechin-manjaro bandit_test]$ tree
.
├── bad.py
├── bad.txt
├── level2
│   └── test_random.py
├── subprocess_Popen.py
├── test_bandit.html
└── test_bandit.txt

1 directory, 6 files
[dechin@dechin-manjaro bandit_test]$ cat level2/test_random.py 
# test_bandit.py

import random

a = random.random()

We can run the following command in the current directory:

[dechin@dechin-manjaro bandit_test]$ bandit -r . -f html -o test_bandit.html
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[html]  INFO    HTML output written to file: test_bandit.html

The result is a file named test_bandit.html, whose contents are shown in the figure below:

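  4. Disable selected issues with a configuration file
    Create a .bandit file in the directory where the scan is run; the following configuration skips the B404 check: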
[bandit]
skips: B404

The scan result is shown in the figure below; the B404 issues no longer appear in the list:
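The .bandit file uses INI syntax, so its contents can be inspected with the standard configparser module, for example to review which checks a project has switched off. This sketch parses the text from a string, and the variable names are illustrative; how bandit itself reads the file is not shown here.

```python
import configparser

# Same content as the .bandit file above, inlined so the example is self-contained.
BANDIT_INI = """\
[bandit]
skips: B404
"""

parser = configparser.ConfigParser()
parser.read_string(BANDIT_INI)

# Split the comma-separated value into individual test IDs.
skipped = [item.strip() for item in parser["bandit"]["skips"].split(",")]
```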

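  5. Suppress bandit findings directly in a .py file
    Append the following comment to the line that uses the risky function, and bandit will ignore that line during its audit: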
# bad.py

from subprocess_Popen import subprocess as sb

sb.Popen('touch bad.txt', shell = 1) # nosec

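Here we can see that B604 has also disappeared from the final audit result, as shown in the figure below. This example also makes clear that bandit is not a security protection tool. It only performs a fairly preliminary review of whether Python code follows safe function usage conventions, and whether the reported issues are addressed is ultimately up to the developers themselves.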
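Because a nosec comment silently removes a finding, it can be worth listing every suppression in a codebase for review. The sketch below uses the standard tokenize module so that only real comments are matched, not string literals that happen to contain the word; the function name find_nosec_lines is hypothetical.

```python
import io
import tokenize


def find_nosec_lines(source):
    """Return the line numbers of comments that contain 'nosec'."""
    lines = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and "nosec" in tok.string:
            lines.append(tok.start[0])
    return lines


bad_py = (
    "from subprocess_Popen import subprocess as sb\n"
    "\n"
    "sb.Popen('touch bad.txt', shell = 1) # nosec\n"
)
suppressed = find_nosec_lines(bad_py)
```

The help output above also lists an --ignore-nosec option, which makes bandit report such lines instead of skipping them.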

A Simple bandit Performance Test

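Python is not known for raw execution speed, and bandit is itself written in Python, so its scanning performance may be a concern. Let us measure quantitatively where it stands. First we create a .py file of 10000 lines, every one of which uses a dangerous function: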

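# gen.py
# Generate a test file that consists of 10000 calls flagged by bandit.

with open('test_bandit_power.py', 'w') as py_file:
    # Import subprocess under an alias, then repeat the same flagged call.
    py_file.write('import subprocess as sb\n')
    for i in range(10000):
        py_file.write("sb.Popen('whoami', shell = 1)\n")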

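Running python3 gen.py generates test_bandit_power.py, a file of 10000 dangerous function calls that is about 300KB in size. We then run a bandit scan on this single file and find that the scan is noticeably slow and produces a large amount of error logging: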

[dechin@dechin-manjaro bandit_test]$ time bandit test_bandit_power.py -f html -o test_power.html
[main]  INFO    profile include tests: None
[main]  INFO    profile exclude tests: None
[main]  INFO    cli include tests: None
[main]  INFO    cli exclude tests: None
[main]  INFO    running on Python 3.8.5
[node_visitor]  INFO    Unable to find qualified name for module: test_bandit_power.py
[html]  INFO    HTML output written to file: test_power.html

real    0m6.239s
user    0m6.082s
sys     0m0.150s

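A rough estimate: if 10000 lines of code take about 6s to scan, then, assuming scan time grows linearly with the number of lines, a larger project with 1000000+ lines could take 10min or more. That is not prohibitively long, but for a large project it is certainly not a very efficient choice.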
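The estimate is a plain linear extrapolation from the single measurement above. The helper below spells out the arithmetic; the function name is hypothetical, and the linear-scaling assumption is only a first approximation, since real projects contain far fewer findings per line than this synthetic file.

```python
def estimate_scan_seconds(lines, ref_lines=10_000, ref_seconds=6.0):
    """Linearly extrapolate scan time from one reference measurement."""
    return lines / ref_lines * ref_seconds


seconds = estimate_scan_seconds(1_000_000)
minutes = seconds / 60
```

With the reference point of roughly 6s per 10000 lines, one million lines comes to 600s, which is the 10min figure quoted above.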

Summary

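Some development projects with strict security requirements may prohibit the use of dangerous functions such as those in subprocess. bandit aims to scan code and automatically produce an analysis of risky function usage; whether to act on its findings depends on the needs of each project's maintainers. Our test also shows that bandit's performance in a realistic scenario is less than satisfactory, so we do not recommend it for large projects. If it must be used there, consider configuring it to run only the checks that are needed.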

Copyright Notice

This article was first published at: //www.cnblogs.com/dechinphy/p/bandit.html
Author ID: DechinPhy
For more original articles, see: //www.cnblogs.com/dechinphy/