Streaming Media Codec and Image Processing in Java (JavaCPP + FFmpeg)

October 28, 2021
Notes

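Welcome to my GitHub

https://github.com/zq2599/blog_demos

Contents: a categorized index of all my original articles with accompanying source code, covering Java, Docker, Kubernetes, DevOps, and more.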

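The relationship between FFmpeg, JavaCPP, and JavaCV

First, a quick overview of how FFmpeg, JavaCPP, and JavaCV relate to each other:

FFmpeg and OpenCV can be thought of as native libraries written in C/C++ that a Java application cannot call directly
JavaCPP wraps commonly used libraries such as FFmpeg and OpenCV so that Java applications can call their native APIs (under the hood, JavaCPP is built on JNI)
JavaCV then wraps these JavaCPP-generated APIs into utility classes that are simpler to use than the raw native API

In short: JavaCPP turns native APIs into Java APIs, and JavaCV packages those Java APIs into easier-to-use utility classes.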

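Learning goals

My goal is to learn and master JavaCV. Digging into the JavaCPP layer that JavaCV is built on lays the groundwork, so that later, when using JavaCV, I can also follow how it works internally.
Using FFmpeg through JavaCPP is therefore a basic skill. This article builds a Java application that calls the JavaCPP API to do the following:

Open a given media stream
Read and decode one frame, yielding an image in YUV420P format
Convert the YUV420P image to YUVJ420P
Save the image as a JPG file at a given location
Release all opened resources

These steps cover common decoding, encoding, and image-processing operations, so working through them is a good way to get to know the FFmpeg libraries.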

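Background knowledge

Before writing any code, it is worth getting familiar with FFmpeg's key data structures and APIs. The classic reference here is the tutorial series by Lei Shen (雷神); in particular, skim the data structures (contexts) and APIs involved in protocol handling, demuxing, and decoding.
If you really have no time to read those classics, I have prepared a quick summary of the key points. To be clear: my quick version is no substitute for Lei Shen's series.
Start with the data structures, which fall into two groups, media data and contexts, together with the Java classes that wrap the underlying pointers. In this article's code the media data types are AVPacket (compressed data) and AVFrame (decoded data), the contexts are AVIOContext, AVFormatContext, AVCodecContext, and SwsContext, and raw memory is handled through Pointer subclasses such as BytePointer.

Then come the commonly used APIs. Grouping them by Lei Shen's stages of protocol handling, demuxing, and decoding (plus the reverse direction, encoding and muxing) makes them easy to keep straight. For example, avformat_open_input and av_read_frame belong to demuxing, avcodec_open2 and avcodec_decode_video2 to decoding, and avcodec_encode_video2 and av_write_frame to encoding and muxing.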

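Version information

The operating system, software, and library versions used here are:

Operating system: Windows 10, 64-bit
IDE: IDEA 2021.1.3 (Ultimate Edition)
JDK: 1.8.0_291
Maven: 3.8.1
javacpp: 1.4.3
ffmpeg: 4.0.2 (so the ffmpeg-platform library version is 4.0.2-1.4.3)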

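Source code download

The complete source code for this article is available on GitHub (https://github.com/zq2599/blog_demos); the addresses are listed in the table below:

Name	Link	Notes
Project home page	https://github.com/zq2599/blog_demos	The project's home page on GitHub
Git repository (https)	https://github.com/zq2599/blog_demos.git	Repository address of the project source, https protocol
Git repository (ssh)	git@github.com:zq2599/blog_demos.git	Repository address of the project source, ssh protocol

The git project contains several folders; the source for this article is in the javacv-tutorials folder.

The javacv-tutorials folder in turn contains several child projects; the source for this article is in the ffmpeg-basic folder.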

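Writing the code

To manage source code and jar dependencies in one place, the project uses a Maven parent/child structure. The parent project, javacv-tutorials, holds the jar version definitions and is not covered in detail here.
Create a child project named ffmpeg-basic under javacv-tutorials with the pom.xml below. Note that it uses only JavaCPP, not JavaCV: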

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <artifactId>javacv-tutorials</artifactId>
        <groupId>com.bolingcavalry</groupId>
        <version>1.0-SNAPSHOT</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>ffmpeg-basic</artifactId>

    <dependencies>
        <dependency>
            <groupId>org.bytedeco</groupId>
            <artifactId>javacpp</artifactId>
        </dependency>
        <dependency>
            <groupId>org.bytedeco.javacpp-presets</groupId>
            <artifactId>ffmpeg-platform</artifactId>
        </dependency>

        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
        </dependency>

        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
        </dependency>

        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
        </dependency>
    </dependencies>
</project>

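Now for the code. The snippets in this article omit imports: with JavaCPP 1.4.3 the FFmpeg bindings live in org.bytedeco.javacpp.avcodec, avformat, avutil, and swscale, and the log field presumably comes from Lombok's @Slf4j, given the dependencies above. Start with a very simple inner class that holds an AVFrame together with its backing data pointer (a BytePointer), so the two can be passed around as one value: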

class FrameData {

    AVFrame avFrame;
    BytePointer buffer;

    public FrameData(AVFrame avFrame, BytePointer buffer) {
        this.avFrame = avFrame;
        this.buffer = buffer;
    }
}

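Next is the most important method, openMediaAndSaveImage. It is the body of the program and chains together five steps: open the stream, decode, convert the format, save, and release. Callers only need to invoke this one method: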

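    /**
     * Open a media stream, take one frame, convert it to YUVJ420P, and save it as a JPG file
     * @param url address of the media stream
     * @param out_file path of the JPG file to write
     * @throws IOException
     */
    public void openMediaAndSaveImage(String url,String out_file) throws IOException {
        log.info("Opening media stream [{}]", url);

        // Open the stream and demux it, yielding the format (demuxing) context
        AVFormatContext pFormatCtx = getFormatContext(url);

        if (null==pFormatCtx) {
            log.error("Failed to get the format context");
            return;
        }

        // Print stream information to the console
        av_dump_format(pFormatCtx, 0, url, 0);

        // After demuxing, all streams are held in an array; videoStreamIndex is the video stream's position in it
        int videoStreamIndex = getVideoStreamIndex(pFormatCtx);

        // No video stream: release what has been opened so far and return
        if (videoStreamIndex<0) {
            log.error("No video stream found");
            release(true, null, null, null, pFormatCtx);
            return;
        }

        log.info("Stream [{}] in the stream array is the video stream (zero-based)", videoStreamIndex);

        // Get the decoding context, already initialized
        AVCodecContext pCodecCtx = getCodecContext(pFormatCtx, videoStreamIndex);

        if (null==pCodecCtx) {
            log.error("Failed to create the decoding context");
            release(true, null, null, null, pFormatCtx);
            return;
        }

        // Decode one frame from the video stream
        AVFrame pFrame = getSingleFrame(pCodecCtx,pFormatCtx, videoStreamIndex);

        if (null==pFrame) {
            log.error("Failed to get a frame from the video stream");
            release(true, null, null, pCodecCtx, pFormatCtx);
            return;
        }

        // Convert the YUV420P image to YUVJ420P
        // The converted AVFrame and its backing data pointer are both held in frameData
        FrameData frameData = YUV420PToYUVJ420P(pCodecCtx, pFrame);

        if (null==frameData) {
            log.error("Failed to convert YUV420P to YUVJ420P");
            release(true, null, null, pCodecCtx, pFormatCtx, pFrame);
            return;
        }

        // Persist the image; a negative result means failure
        int ret = saveImg(frameData.avFrame,out_file);

        // Release in order: data first, then contexts
        release(true, null, null, pCodecCtx, pFormatCtx, frameData.buffer, frameData.avFrame, pFrame);

        if (ret < 0) {
            log.error("Failed to save the image");
            return;
        }

        log.info("Operation succeeded");
    }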

With the overall flow clear, let's look at the methods that openMediaAndSaveImage calls, starting with getFormatContext, which opens the stream:

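    /**
     * Create the format (demuxing) context
     * @param url address of the media stream
     * @return the format context, or null on failure
     */
    private AVFormatContext getFormatContext(String url) {
        // Format (demuxing) context
        AVFormatContext pFormatCtx = new avformat.AVFormatContext(null);

        // Open the media stream
        if (avformat_open_input(pFormatCtx, url, null, null) != 0) {
            log.error("Failed to open the media");
            return null;
        }

        // Read some stream data to obtain stream information
        if (avformat_find_stream_info(pFormatCtx, (PointerPointer<Pointer>) null) < 0) {
            log.error("Failed to get stream information");
            // The input is already open at this point, so close it before returning
            avformat_close_input(pFormatCtx);
            return null;
        }

        return pFormatCtx;
    }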

After demuxing, the streams are held in an array; getVideoStreamIndex finds the position of the video stream in it:

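    /**
     * Demuxing yields an array of streams; this method finds the position of the video stream in that array
     * @param pFormatCtx format context of the opened input
     * @return index of the first video stream, or -1 if there is none
     */
    private static int getVideoStreamIndex(AVFormatContext pFormatCtx) {
        int videoStream = -1;

        // The input contains several streams; find which one is the video stream
        for (int i = 0; i < pFormatCtx.nb_streams(); i++) {
            if (pFormatCtx.streams(i).codec().codec_type() == AVMEDIA_TYPE_VIDEO) {
                videoStream = i;
                break;
            }
        }

        return videoStream;
    }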

Decoding follows demuxing. The getCodecContext method returns the decoding context:

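    /**
     * Create the decoding context
     * @param pFormatCtx format context of the opened input
     * @param videoStreamIndex index of the video stream
     * @return the opened decoding context, or null on failure
     */
    private AVCodecContext getCodecContext(AVFormatContext pFormatCtx, int videoStreamIndex) {
        // Decoder
        AVCodec pCodec;

        // Get the decoding context of the video stream
        AVCodecContext pCodecCtx = pFormatCtx.streams(videoStreamIndex).codec();

        // Look up the decoder that matches the context's codec id
        pCodec = avcodec_find_decoder(pCodecCtx.codec_id());

        if (pCodec == null) {
            return null;
        }

        // Initialize the decoding context with the decoder
        if (avcodec_open2(pCodecCtx, pCodec, (AVDictionary)null) < 0) {
            return null;
        }

        return pCodecCtx;
    }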

Next, read packets from the stream until one video frame has been decoded:

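    /**
     * Read packets and decode one frame
     * @param pCodecCtx opened decoding context
     * @param pFormatCtx format context of the opened input
     * @param videoStreamIndex index of the video stream
     * @return the decoded frame, or null if no frame could be decoded
     */
    private AVFrame getSingleFrame(AVCodecContext pCodecCtx, AVFormatContext pFormatCtx, int videoStreamIndex) {
        // Allocate the frame object
        AVFrame pFrame = av_frame_alloc();

        // frameFinished[0] is set to non-zero once a complete picture is available
        int[] frameFinished = new int[1];

        // Whether a frame was found
        boolean exists = false;

        AVPacket packet = new AVPacket();

        try {
            // Each pass of the loop reads one packet
            while (av_read_frame(pFormatCtx, packet) >= 0) {
                // Only packets that belong to the video stream are decoded
                if (packet.stream_index() == videoStreamIndex) {
                    // Decode the AVPacket into an AVFrame
                    avcodec_decode_video2(pCodecCtx, pFrame, frameFinished, packet);

                    // Stop as soon as a picture is available
                    if (frameFinished[0] != 0 && !pFrame.isNull()) {
                        exists = true;
                        break;
                    }
                }

                // Release this packet's data before reading the next one
                av_free_packet(packet);
            }
        } finally {
            // Always release the last packet that was read
            av_free_packet(packet);
        }

        // Nothing decoded: free the unused frame and return null
        if (!exists) {
            av_frame_free(pFrame);
            return null;
        }

        return pFrame;
    }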

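The decoded image is in YUV420P format. Convert it to YUVJ420P, the pixel format that the MJPEG encoder used for the JPG output expects: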

    /**
     * Convert a YUV420P image to YUVJ420P
     * @param pCodecCtx decoding context
     * @param sourceFrame source frame
     * @return the converted frame and its backing data pointer
     */
    private static FrameData YUV420PToYUVJ420P(AVCodecContext pCodecCtx, AVFrame sourceFrame) {
        // Allocate a frame to hold the converted image (despite its name, it holds YUVJ420P data, not RGB)
        AVFrame pFrameRGB = av_frame_alloc();

        if (pFrameRGB == null) {
            return null;
        }

        int width = pCodecCtx.width(), height = pCodecCtx.height();

        // Describe the target image
        pFrameRGB.width(width);
        pFrameRGB.height(height);
        pFrameRGB.format(AV_PIX_FMT_YUVJ420P);

        // Number of bytes the image needs once converted to YUVJ420P
        int numBytes = avpicture_get_size(AV_PIX_FMT_YUVJ420P, width, height);

        // Allocate the memory
        BytePointer buffer = new BytePointer(av_malloc(numBytes));

        // Initialize the image conversion context
        SwsContext sws_ctx = sws_getContext(width, height, pCodecCtx.pix_fmt(), width, height, AV_PIX_FMT_YUVJ420P, SWS_BICUBIC, null, null, (DoublePointer) null);

        // Without a conversion context nothing can be converted; free what was allocated
        if (sws_ctx == null) {
            av_free(buffer);
            av_free(pFrameRGB);
            return null;
        }

        // Point pFrameRGB's data pointers at the memory just allocated (buffer)
        avpicture_fill(new avcodec.AVPicture(pFrameRGB), buffer, AV_PIX_FMT_YUVJ420P, width, height);

        // Convert the decoded image from its source format (YUV420P here) to YUVJ420P
        sws_scale(sws_ctx, sourceFrame.data(), sourceFrame.linesize(), 0, height, pFrameRGB.data(), pFrameRGB.linesize());

        // Release the conversion context right away
        sws_freeContext(sws_ctx);

        // Return the AVFrame and BytePointer together; both must be released explicitly by the caller
        return new FrameData(pFrameRGB, buffer);
    }
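
To make the conversion step more concrete, here is a minimal pure-Java sketch of the arithmetic behind it; it does not call FFmpeg. It assumes the usual conventions: a 4:2:0 planar image stores a full-resolution Y plane plus U and V planes halved in each direction, YUV420P normally carries limited-range levels (Y in 16 to 235, U/V in 16 to 240), and YUVJ420P carries full-range JPEG levels (0 to 255). The class and method names (YuvRangeSketch, bufferSize, expandLuma, expandChroma) are made up for illustration, and the real avpicture_get_size and sws_scale may align, round, or dither differently.

```java
/**
 * Illustrative arithmetic only; not swscale's actual implementation.
 * All names here are hypothetical helpers, not part of JavaCPP or FFmpeg.
 */
final class YuvRangeSketch {

    /** Bytes needed for a tightly packed 4:2:0 planar image (no row padding assumed). */
    static int bufferSize(int width, int height) {
        int lumaBytes = width * height;
        // Chroma planes are halved in both directions, rounding up for odd sizes
        int chromaWidth = (width + 1) / 2;
        int chromaHeight = (height + 1) / 2;
        return lumaBytes + 2 * chromaWidth * chromaHeight;
    }

    /** Map a limited-range luma sample (16..235) to full range (0..255). */
    static int expandLuma(int y) {
        return clamp(Math.round((y - 16) * 255.0 / 219.0));
    }

    /** Map a limited-range chroma sample (16..240, centered on 128) to full range. */
    static int expandChroma(int c) {
        return clamp(Math.round((c - 128) * 255.0 / 224.0) + 128);
    }

    /** Keep out-of-range inputs inside the valid 8-bit sample range. */
    private static int clamp(long v) {
        return (int) Math.max(0, Math.min(255, v));
    }

    public static void main(String[] args) {
        System.out.println("1920x1080 buffer: " + bufferSize(1920, 1080) + " bytes");
        System.out.println("luma 16 -> " + expandLuma(16) + ", luma 235 -> " + expandLuma(235));
        System.out.println("chroma 128 -> " + expandChroma(128));
    }
}
```

For a 1920x1080 picture, bufferSize gives 3,110,400 bytes, which is half the size of the same picture stored as 24-bit RGB.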

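Next is another important method, saveImg, which follows the typical encode-and-output flow. We have already seen how to open, demux, and decode a media stream; now we look at producing one, which involves encoding, muxing, and output: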

    /**
     * Save the given frame as an image at the specified location
     * @param pFrame frame to encode (YUVJ420P)
     * @param out_file path of the output file
     * @return a negative value on failure
     */
    private int saveImg(avutil.AVFrame pFrame, String out_file) {
        // Set the FFmpeg log level (the default is info; error suppresses most unneeded console output)
        av_log_set_level(AV_LOG_ERROR);

        AVPacket pkt = null;
        AVStream pAVStream = null;

        int width = pFrame.width(), height = pFrame.height();

        // Allocate the AVFormatContext
        avformat.AVFormatContext pFormatCtx = avformat_alloc_context();

        // Set the output format (this concerns muxing and the container)
        pFormatCtx.oformat(av_guess_format("mjpeg", null, null));

        if (pFormatCtx.oformat() == null) {
            log.error("Failed to set the output container format");
            avformat_free_context(pFormatCtx);
            return -1;
        }

        try {
            // Create the AVIOContext that avio_open will bind to the output URL
            avformat.AVIOContext pb = new avformat.AVIOContext();

            // Open the output file
            if (avio_open(pb, out_file, AVIO_FLAG_READ_WRITE) < 0) {
                log.error("Failed to open the output file");
                return -1;
            }

            // The protocol layer sits on top of muxing; attach the protocol (I/O) context to the format context
            pFormatCtx.pb(pb);

            // Create a new stream
            pAVStream = avformat_new_stream(pFormatCtx, null);

            if (pAVStream == null) {
                log.error("Failed to add a new stream to the output");
                return -1;
            }

            int codec_id = pFormatCtx.oformat().video_codec();

            // Configure the stream's codec context
            avcodec.AVCodecContext pCodecCtx = pAVStream.codec();
            pCodecCtx.codec_id(codec_id);
            pCodecCtx.codec_type(AVMEDIA_TYPE_VIDEO);
            pCodecCtx.pix_fmt(AV_PIX_FMT_YUVJ420P);
            pCodecCtx.width(width);
            pCodecCtx.height(height);
            pCodecCtx.time_base().num(1);
            pCodecCtx.time_base().den(25);

            // Print information about the output
            av_dump_format(pFormatCtx, 0, out_file, 1);

            // Find the encoder
            avcodec.AVCodec pCodec = avcodec_find_encoder(codec_id);
            if (pCodec == null) {
                log.error("Failed to find the encoder");
                return -1;
            }

            // Initialize the encoding context with the encoder
            if (avcodec_open2(pCodecCtx, pCodec, (PointerPointer<Pointer>) null) < 0) {
                log.error("Failed to initialize the encoding context");
                return -1;
            }

            // Output packet
            pkt = new avcodec.AVPacket();

            // Allocate the packet's payload buffer
            if (av_new_packet(pkt, width * height * 3) < 0) {
                return -1;
            }

            int[] got_picture = { 0 };

            // Write the stream header to the output file
            if (avformat_write_header(pFormatCtx, (PointerPointer<Pointer>) null) < 0) {
                log.error("Failed to write the file header");
                return -1;
            }

            // Encode the frame into a packet; got_picture[0] stays 0 if no packet was produced
            if (avcodec_encode_video2(pCodecCtx, pkt, pFrame, got_picture)<0 || got_picture[0] == 0) {
                log.error("Failed to encode the frame into a packet");
                return -1;
            }

            // Write the packet
            if ((av_write_frame(pFormatCtx, pkt)) < 0) {
                log.error("Failed to write the frame");
                return -1;
            }

            // Write the file trailer
            if (av_write_trailer(pFormatCtx) < 0) {
                log.error("Failed to write the file trailer");
                return -1;
            }

            return 0;
        } finally {
            // Clean up; pAVStream is still null if we failed before the stream was created
            release(false, pkt, pFormatCtx.pb(), pAVStream == null ? null : pAVStream.codec(), pFormatCtx);
        }
    }

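Last comes releasing resources. Different objects are released with different APIs, and even for AVFormatContext the API depends on the scenario: avformat_close_input for a context opened for input, avformat_free_context for one allocated for output. Using the wrong one crashes the process. Note also that release is called for both the input side and the output side, so the resources and objects used for opening and for writing media are all released in the end: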

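    /**
     * Release resources: data first, then contexts
     * @param isInput true if pFormatCtx was opened for input, false if it was allocated for output
     * @param pkt packet to free, may be null
     * @param pb I/O context to close, may be null
     * @param pCodecCtx codec context to close, may be null
     * @param pFormatCtx format context to close or free, may be null
     * @param ptrs other native pointers to free with av_free
     */
    private void release(boolean isInput, AVPacket pkt, AVIOContext pb, AVCodecContext pCodecCtx, AVFormatContext pFormatCtx, Pointer...ptrs) {

        if (null!=pkt) {
            av_free_packet(pkt);
        }

        // Decoded data: this is an array, so free each element in turn
        if (null!=ptrs) {
            Arrays.stream(ptrs).forEach(avutil::av_free);
        }

        // Codec context
        if (null!=pCodecCtx) {
            avcodec_close(pCodecCtx);
        }

        // Protocol (I/O) context
        if (null!=pb) {
            avio_close(pb);
        }

        // Format context: the API differs for input and output
        if (null!=pFormatCtx) {
            if (isInput) {
                avformat_close_input(pFormatCtx);
            } else {
                avformat_free_context(pFormatCtx);
            }
        }
    }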
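
With this many early returns it is easy to miss a release call on one path. One common way to keep the order right is to register a cleanup action as soon as each resource is acquired and run the actions in reverse order at the end. The sketch below shows the pattern in plain Java, with strings standing in for native resources. CleanupStack is a hypothetical helper, not part of JavaCPP; in real code each action would call the matching FFmpeg function, for example avformat_close_input for an input context.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper: runs registered cleanup actions in reverse order of registration. */
final class CleanupStack implements AutoCloseable {

    private final Deque<Runnable> actions = new ArrayDeque<>();

    /** Register a cleanup action right after the matching resource is acquired. */
    void onClose(Runnable action) {
        actions.push(action);
    }

    /** Run every action, most recently registered first. */
    @Override
    public void close() {
        while (!actions.isEmpty()) {
            actions.pop().run();
        }
    }

    /** Simulates the input side of the pipeline; failAt names a step that fails early. */
    static List<String> simulate(String failAt) {
        List<String> released = new ArrayList<>();
        try (CleanupStack cleanup = new CleanupStack()) {
            cleanup.onClose(() -> released.add("format context"));
            if ("codec".equals(failAt)) {
                return released; // early return: close() still runs
            }
            cleanup.onClose(() -> released.add("codec context"));
            if ("frame".equals(failAt)) {
                return released;
            }
            cleanup.onClose(() -> released.add("frame"));
        }
        return released;
    }

    public static void main(String[] args) {
        System.out.println(simulate(null));    // prints [frame, codec context, format context]
        System.out.println(simulate("codec")); // prints [format context]
    }
}
```

The release order that falls out of this, data first and contexts last, is the same order the release method above follows by hand.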

Finally, write a main method that calls openMediaAndSaveImage with the address of the media stream and the path where the image should be stored:

    public static void main(String[] args) throws Exception {
        // CCTV13, 1920x1080; unstable, so retry a few times if opening fails
        String url = "http://ivi.bupt.edu.cn/hls/cctv13hd.m3u8";

        // Anhui TV, 1024x576; relatively stable
//        String url = "rtmp://58.200.131.2:1935/livetv/ahtv";
        // Local video file; change this to a file on your own machine
//        String url = "E:\\temp\\202107\\24\\test.mp4";

        // Full path of the image file; the file name is the current date and time (yyyyMMddHHmmss)
        String localPath = "E:\\temp\\202107\\24\\save\\" + new SimpleDateFormat("yyyyMMddHHmmss").format(new Date()) + ".jpg";

        // Run
        new Stream2Image().openMediaAndSaveImage(url, localPath);
    }
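
The hard-coded Windows path only works if that directory already exists; as far as I know, avio_open does not create missing directories. As a small sketch, the timestamped file name can be built with java.time and the directory created up front. OutputPathSketch and its methods are hypothetical names, not part of the original project, and the temporary-directory location is just an example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for building the snapshot path; not part of the original project. */
final class OutputPathSketch {

    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** File name made of the given time as year, month, day, hour, minute, second, plus ".jpg". */
    static String fileName(LocalDateTime time) {
        return STAMP.format(time) + ".jpg";
    }

    /** Creates the directory if needed and returns the full path of the new image file. */
    static Path prepare(Path directory, LocalDateTime time) throws IOException {
        Files.createDirectories(directory);
        return directory.resolve(fileName(time));
    }

    public static void main(String[] args) throws IOException {
        // Example location only: a folder under the system temporary directory
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"), "stream2image");
        Path target = prepare(dir, LocalDateTime.now());
        // target.toString() can be passed as the out_file argument of openMediaAndSaveImage
        System.out.println(target);
    }
}
```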

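All of the code above lives in Stream2Image.java in the ffmpeg-basic child project. Running the main method produces the console output below: the stream opens successfully and detailed media information is printed: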

18:28:35.553 [main] INFO com.bolingcavalry.basic.Stream2Image - Opening media stream [http://ivi.bupt.edu.cn/hls/cctv13hd.m3u8]
18:28:37.062 [main] INFO com.bolingcavalry.basic.Stream2Image - Stream [0] in the stream array is the video stream (zero-based)
18:28:37.219 [main] INFO com.bolingcavalry.basic.Stream2Image - Operation succeeded
[hls,applehttp @ 00000188548ab140] Opening 'http://ivi.bupt.edu.cn/hls/cctv13hd-1627208880000.ts' for reading
[hls,applehttp @ 00000188548ab140] Opening 'http://ivi.bupt.edu.cn/hls/cctv13hd-1627208890000.ts' for reading
[NULL @ 000001887ba68bc0] non-existing SPS 0 referenced in buffering period
[NULL @ 000001887ba68bc0] SPS unavailable in decode_picture_timing
[h264 @ 000001887ba6aa80] non-existing SPS 0 referenced in buffering period
[h264 @ 000001887ba6aa80] SPS unavailable in decode_picture_timing
Input #0, hls,applehttp, from 'http://ivi.bupt.edu.cn/hls/cctv13hd.m3u8':
  Duration: N/A, start: 1730.227267, bitrate: N/A
  Program 0 
    Metadata:
      variant_bitrate : 0
    Stream #0:0: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 25 tbr, 90k tbn, 50 tbc
    Metadata:
      variant_bitrate : 0
    Stream #0:1: Audio: aac (LC) ([15][0][0][0] / 0x000F), 48000 Hz, 5.1, fltp
    Metadata:
      variant_bitrate : 0
[swscaler @ 000001887cb28bc0] deprecated pixel format used, make sure you did set range correctly

Process finished with exit code 0

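That completes this hands-on exercise in decoding a media stream and saving a frame as an image in Java. We now have a basic grasp of the FFmpeg functions that JavaCPP wraps and of the usual patterns for decoding, encoding, and image processing. When using the JavaCV utility classes later, we will understand the basics of what happens inside them, which helps with troubleshooting, performance tuning, and deeper study.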

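You are not alone: Xinchen's original articles are with you all the way

Follow the WeChat official account: 程序員欣宸 (Programmer Xinchen)

Search WeChat for「程序員欣宸」. I'm Xinchen, and I look forward to exploring the Java world with you…
https://github.com/zq2599/blog_demos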
