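Face Recognition and Grouping with .NET

November 28, 2019
Notes

Introduction

At amusement parks, glass skywalks, ski resorts, and similar venues, you often see photographers taking pictures. One headache for the operators is that there are simply too many photos: it is not easy for a customer to find themselves among thousands of pictures. The same goes for a trip or a family gathering, where the sheer number of photos makes picking them out very difficult.

Fortunately there is .NET: with only a small amount of code, you can easily find the faces and group them.

This article uses the Cognitive Services API provided by Microsoft Azure to detect faces and group them. It can be used for free; the sign-up address is https://portal.azure.com. After signing up you get two keys, and a key is all you need to run the code in this article. A key looks like this (not a real key):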

fa3a7bfd807ccd6b17cf559ad584cbaa

Usage

First install the NuGet package Microsoft.Azure.CognitiveServices.Vision.Face (the latest version at the time of writing is 2.5.0-preview.1), then create a FaceClient:

string key = "fa3a7bfd807ccd6b17cf559ad584cbaa"; // replace with your own key
using var fc = new FaceClient(new ApiKeyServiceClientCredentials(key))
{
    Endpoint = "https://southeastasia.api.cognitive.microsoft.com",
};

Then detect the faces in a photo:

using var file = File.OpenRead(@"C:\Photos\DSC_996ICU.JPG");
IList<DetectedFace> faces = await fc.Face.DetectWithStreamAsync(file);

The returned faces is an IList, so a single call can clearly detect more than one face. One sample result looks like this (converted to JSON):

[
  {
    "FaceId": "9997b64e-6e62-4424-88b5-f4780d3767c6",
    "RecognitionModel": null,
    "FaceRectangle": {
      "Width": 174,
      "Height": 174,
      "Left": 62,
      "Top": 559
    },
    "FaceLandmarks": null,
    "FaceAttributes": null
  },
  {
    "FaceId": "8793b251-8cc8-45c5-ab68-e7c9064c4cfd",
    "RecognitionModel": null,
    "FaceRectangle": {
      "Width": 152,
      "Height": 152,
      "Left": 775,
      "Top": 580
    },
    "FaceLandmarks": null,
    "FaceAttributes": null
  }
]

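As you can see, this photo returned two DetectedFace objects. FaceId stores the face's Id, which is used in the later steps, and FaceRectangle stores the position of the face, which you can use for further processing. RecognitionModel, FaceLandmarks, and FaceAttributes are extra properties that cover information such as gender, age, and emotion. They are not returned by default; as the API shown in the figure below indicates, they can be switched on through various parameters. It is a lot of fun, so give it a try if you are interested:

Finally, call .GroupAsync to group the faceIds detected earlier: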

var faceIds = faces.Select(x => x.FaceId.Value).ToList();
GroupResult result = await fc.Face.GroupAsync(faceIds);

This returns a GroupResult, which is defined as follows:

public class GroupResult
{
    public IList<IList<Guid>> Groups { get; set; }

    public IList<Guid> MessyGroup { get; set; }

    // ...
}

It contains a Groups property and a MessyGroup property. Groups is a list of lists that holds the grouped faces, while MessyGroup holds the FaceIds for which no group could be found.
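To make the shape of that result concrete, here is a minimal sketch that flattens the nested lists into a lookup from face id to group number. GroupIndex and Build are hypothetical names introduced only for this illustration; they are not part of the Azure SDK, and the sketch works on plain Guid lists so it runs without calling the service.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the Face SDK.
public static class GroupIndex
{
    // Maps every face id to a 1-based group number.
    // Ids from the messy group are mapped to 0.
    public static Dictionary<Guid, int> Build(IList<IList<Guid>> groups, IList<Guid> messyGroup)
    {
        var map = new Dictionary<Guid, int>();
        for (int i = 0; i < groups.Count; i++)
        {
            foreach (Guid id in groups[i])
            {
                map[id] = i + 1;
            }
        }
        foreach (Guid id in messyGroup)
        {
            map[id] = 0;
        }
        return map;
    }
}
```

Given a GroupResult named result, GroupIndex.Build(result.Groups, result.MessyGroup) would then tell you which numbered folder each face belongs in.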

With this, a short piece of code can copy each group of faces into its own folder:

void CopyGroup(string outputPath, GroupResult result, Dictionary<Guid, (string file, DetectedFace face)> faces)
{
    // Pair every face id with its 1-based group number and look up its source file.
    // Dump() is LINQPad's output helper; drop it when running outside LINQPad.
    foreach (var item in result.Groups
        .SelectMany((group, index) => group.Select(v => (faceId: v, index)))
        .Select(x => (info: faces[x.faceId], i: x.index + 1)).Dump())
    {
        string dir = Path.Combine(outputPath, item.i.ToString());
        Directory.CreateDirectory(dir);
        File.Copy(item.info.file, Path.Combine(dir, Path.GetFileName(item.info.file)), overwrite: true);
    }

    // Photos whose faces matched no group go into a "messy" folder.
    string messyFolder = Path.Combine(outputPath, "messy");
    Directory.CreateDirectory(messyFolder);
    foreach (var file in result.MessyGroup.Select(x => faces[x].file).Distinct())
    {
        File.Copy(file, Path.Combine(messyFolder, Path.GetFileName(file)), overwrite: true);
    }
}

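Running it gives the result shown in the figure: I passed in 102 photos and got 15 groups plus one "no teammates found" group:

What could still go wrong?

Just two API calls and the code is done in one go. Feels too easy? Not quite; there are still plenty of problems.

Images are too large and need compressing

After all, the images have to be uploaded to a cloud service, and on a slow uplink that is a lot of traffic. Today's phones, DSLRs, and mirrorless cameras easily reach tens of megapixels, and a jpg easily exceeds 10MB. Uploading without compression is, first of all, more than the bandwidth and upload speed can bear.

Second... Azure does not actually support it. The documentation (https://docs.microsoft.com/en-us/rest/api/cognitiveservices/face/face/detectwithstream) shows that images of at most 6MB are supported, and that the minimum detectable face size is specified for images no larger than 1920x1080: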

JPEG, PNG, GIF (the first frame), and BMP format are supported. The allowed image file size is from 1KB to 6MB.
The minimum detectable face size is 36×36 pixels in an image no larger than 1920×1080 pixels. Images with dimensions higher than 1920×1080 pixels will need a proportionally larger minimum face size.

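So if an image is too large it must be compressed (and of course, if it is already small enough, there is no need to). Using .NET's Bitmap together with the C# 8.0 switch expression, the check and the compression can be written in one go: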

byte[] CompressImage(string image, int edgeLimit = 1920)
{
    using var bmp = Bitmap.FromFile(image);

    // Ratio of the longest edge to the limit; a value above 1 means the image must shrink.
    using var resized = (1.0 * Math.Max(bmp.Width, bmp.Height) / edgeLimit) switch
    {
        var x when x > 1 => new Bitmap(bmp, new Size((int)(bmp.Size.Width / x), (int)(bmp.Size.Height / x))),
        _ => bmp,
    };

    // Re-encode as JPEG in memory and return the bytes for upload.
    using var ms = new MemoryStream();
    resized.Save(ms, ImageFormat.Jpeg);
    return ms.ToArray();
}
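The size calculation on its own can be sketched without System.Drawing, which makes it easy to check the numbers. ImageScaling and FitWithin are hypothetical names for this illustration. The sketch uses integer arithmetic rather than the floating-point ratio in the code above, so in rare cases its result may differ from that code by a pixel.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ImageScaling
{
    // Returns the dimensions after shrinking so that the longest edge is at
    // most edgeLimit, keeping the aspect ratio. Images that already fit are
    // returned unchanged.
    public static (int Width, int Height) FitWithin(int width, int height, int edgeLimit = 1920)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));
        if (height <= 0) throw new ArgumentOutOfRangeException(nameof(height));
        if (edgeLimit <= 0) throw new ArgumentOutOfRangeException(nameof(edgeLimit));

        int longest = Math.Max(width, height);
        if (longest <= edgeLimit)
        {
            return (width, height);
        }

        // long avoids overflow in the intermediate product.
        int newWidth = (int)((long)width * edgeLimit / longest);
        int newHeight = (int)((long)height * edgeLimit / longest);

        // Extremely elongated images could round down to zero; clamp to 1.
        return (Math.Max(1, newWidth), Math.Max(1, newHeight));
    }
}
```

For example, a 6000x4000 photo would be reduced to 1920x1280, while an 800x600 photo would be left alone.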

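Portrait-orientation photos

Cameras generally have a 3:2 sensor, so photos usually come out in landscape. Occasionally, for the sake of composition, we shoot in portrait instead. Although many APIs now handle faces tilted by up to plus or minus 30 degrees, a face lying on its side is basically unsupported, as in the picture below (I really could not find a model whose photo I was licensed to use):

Fortunately, photos keep their exif information after shooting, so we only need to read the exif data and rotate the photo accordingly: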

void HandleOrientation(Image image, PropertyItem[] propertyItems)
{
    const int exifOrientationId = 0x112;
    PropertyItem orientationProp = propertyItems.FirstOrDefault(i => i.Id == exifOrientationId);

    if (orientationProp == null) return;

    int val = BitConverter.ToUInt16(orientationProp.Value, 0);
    RotateFlipType rotateFlipType = val switch
    {
        2 => RotateFlipType.RotateNoneFlipX,
        3 => RotateFlipType.Rotate180FlipNone,
        4 => RotateFlipType.Rotate180FlipX,
        5 => RotateFlipType.Rotate90FlipX,
        6 => RotateFlipType.Rotate90FlipNone,
        7 => RotateFlipType.Rotate270FlipX,
        8 => RotateFlipType.Rotate270FlipNone,
        _ => RotateFlipType.RotateNoneFlipNone,
    };

    if (rotateFlipType != RotateFlipType.RotateNoneFlipNone)
    {
        image.RotateFlip(rotateFlipType);
    }
}

After rotation, my photo looks like this:

Now photos shot in portrait can be detected as well.
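One consequence of the rotation is worth spelling out: for EXIF orientation values 5 to 8 the image is turned by a quarter turn, so its width and height swap. A minimal sketch of that rule follows; ExifOrientation and its methods are hypothetical names for this illustration and are not part of System.Drawing.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ExifOrientation
{
    // EXIF orientation values 5, 6, 7 and 8 all involve a 90 or 270 degree
    // rotation, so the displayed width and height are swapped.
    public static bool SwapsDimensions(int orientation)
    {
        return orientation >= 5 && orientation <= 8;
    }

    // Returns the size of the image as displayed, given the size as stored.
    public static (int Width, int Height) DisplaySize(int storedWidth, int storedHeight, int orientation)
    {
        return SwapsDimensions(orientation)
            ? (storedHeight, storedWidth)
            : (storedWidth, storedHeight);
    }
}
```

This matters later on, because the face rectangles returned by the service are relative to the rotated image, not to the pixels as stored in the file.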

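Speeding things up with parallelism

As mentioned earlier, a folder may contain thousands of files, and uploading and detecting them one by one can be slow. The code might look like this: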

Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
  .Select(file =>
  {
    byte[] bytes = CompressImage(file);
    var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
    (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
    return (file, faces: result.faces.ToList());
  })
  .SelectMany(x => x.faces.Select(face => (x.file, face)))
  .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));

To speed this up, the uploads can run in parallel. With LINQ support in C#/.NET, adding a single .AsParallel() line is all it takes:

Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
  .AsParallel() // this is the only line that was added
  .Select(file =>
  {
    byte[] bytes = CompressImage(file);
    var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
    (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
    return (file, faces: result.faces.ToList());
  })
  .SelectMany(x => x.faces.Select(face => (x.file, face)))
  .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));
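By default PLINQ chooses the number of worker threads itself. If the uplink or the service turns out to be the bottleneck, it may help to cap that number with WithDegreeOfParallelism. The sketch below shows the idea with a stand-in for the network call: ParallelDetection, CountFaces, and the detect delegate are hypothetical names for this illustration, and the default of 4 is an arbitrary assumption rather than a recommended value.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class ParallelDetection
{
    // Runs detect for every file with at most maxParallel calls in flight.
    // detect stands in for "compress, upload, and count the detected faces".
    public static Dictionary<string, int> CountFaces(
        IEnumerable<string> files, Func<string, int> detect, int maxParallel = 4)
    {
        return files
            .AsParallel()
            .WithDegreeOfParallelism(maxParallel) // upper bound on concurrent workers
            .Select(file => (file, count: detect(file)))
            .ToDictionary(x => x.file, x => x.count);
    }
}
```

Because the work here is blocking I/O rather than CPU-bound computation, the best value depends on the connection and on any request limits of the service, so it is worth experimenting.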

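Resuming after an interruption

As also mentioned above, there are thousands of photos. If the network transfer fails, or you knock over the coffee on your desk (who knows?), or everything is perfectly fine and you just want to run some further analysis, everything has to start from scratch. We can add the kind of "resume" mechanism familiar from download tools.

It is really just a cache: record the result for each file, then read from the cache first on the next run. The cache is stored in a json file: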

class Cache<T>
{
    // outFolder is the output directory defined elsewhere in the script.
    static string cacheFile = Path.Combine(outFolder, $"cache-{typeof(T).Name}.json");
    readonly object syncRoot = new object();
    Dictionary<string, T> cachingData;

    public Cache()
    {
        cachingData = File.Exists(cacheFile) switch
        {
            true => JsonSerializer.Deserialize<Dictionary<string, T>>(File.ReadAllBytes(cacheFile)),
            _ => new Dictionary<string, T>()
        };
    }

    public T GetOrCreate(string key, Func<T> fetchMethod)
    {
        // Dictionary is not safe for concurrent reads and writes, so reads take the lock too.
        lock (syncRoot)
        {
            if (cachingData.TryGetValue(key, out T cachedValue))
            {
                return cachedValue;
            }
        }

        // The slow network call stays outside the lock so other threads are not blocked.
        var realValue = fetchMethod();

        lock (syncRoot)
        {
            cachingData[key] = realValue;
            File.WriteAllBytes(cacheFile, JsonSerializer.SerializeToUtf8Bytes(cachingData, new JsonSerializerOptions
            {
                WriteIndented = true,
            }));
            return realValue;
        }
    }
}

Note the lock in the code above: it keeps the cache thread-safe when several threads are working at once.

To use it, just add one line inside the Select:

var cache = new Cache<List<DetectedFace>>(); // key line
Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
  .AsParallel()
  .Select(file => (file: file, faces: cache.GetOrCreate(file, () => // key line
  {
    byte[] bytes = CompressImage(file);
    var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
    (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
    return result.faces.ToList();
  })))
  .SelectMany(x => x.faces.Select(face => (x.file, face)))
  .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));

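Drawing a box around each face

With so many photos, if the event is big or a group shot has dozens of people, the resulting groups will look like this:

You have no idea where your own face is, so the detected faces need to be boxed.

Drawing the box takes some care as well. Recall that the uploaded photos had already been compressed and rotated, so the values in the returned DetectedFace objects are relative to the compressed, rotated image. If the same compression and rotation are not taken into account, the face positions will be completely wrong, so the earlier calculation has to be carried out again: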

using var bmp = Bitmap.FromFile(item.info.file);
HandleOrientation(bmp, bmp.PropertyItems);
using (var g = Graphics.FromImage(bmp))
{
  using var brush = new SolidBrush(Color.Red);
  using var pen = new Pen(brush, 5.0f);
  var rect = item.info.face.FaceRectangle;
  float scale = Math.Max(1.0f, (float)(1.0 * Math.Max(bmp.Width, bmp.Height) / 1920.0));
  g.ScaleTransform(scale, scale);
  g.DrawRectangle(pen, new Rectangle(rect.Left, rect.Top, rect.Width, rect.Height));
}
bmp.Save(Path.Combine(dir, Path.GetFileName(item.info.file)));
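The same correction can also be expressed as a plain coordinate conversion, which may be easier to reason about and to check. The sketch below maps a rectangle reported for the compressed image back to the coordinates of the full-size image. FaceBoxMapping and ToOriginal are hypothetical names for this illustration, and it assumes the upload was limited to a longest edge of 1920 as in the compression code. It is a different technique from the ScaleTransform call above: it scales the coordinates rather than the drawing surface, so the pen width would not be scaled along with the box.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class FaceBoxMapping
{
    // Converts a rectangle measured on the compressed image into the
    // coordinate space of the original (already rotated) image.
    public static (int Left, int Top, int Width, int Height) ToOriginal(
        (int Left, int Top, int Width, int Height) box,
        int originalWidth, int originalHeight, int edgeLimit = 1920)
    {
        // Images that were never shrunk keep a scale of 1.
        double scale = Math.Max(1.0, (double)Math.Max(originalWidth, originalHeight) / edgeLimit);

        return (
            (int)Math.Round(box.Left * scale),
            (int)Math.Round(box.Top * scale),
            (int)Math.Round(box.Width * scale),
            (int)Math.Round(box.Height * scale));
    }
}
```

For a 3840x2160 original the scale is 2, so the first sample rectangle from earlier (Left 62, Top 559, 174x174) would map to Left 124, Top 1118, 348x348.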

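Using my photo from above, the detection result looks like this (a bit like the face detection box you see when a camera focuses):

The 1000-face limit

The .GroupAsync method can only handle 1000 FaceIds per call, and the 800-odd photos from a recent event contained more than 2000 FaceIds, so they need to be split into batches.

The simplest way to batch is the System.Interactive package. It offers the kind of convenient APIs that Rx.NET has (and that LINQ does not provide) without pulling in something as heavyweight as IObservable<T>, so it is very handy.

Here I use the .Buffer(int) function, which splits an IEnumerable<T> into batches of a given size (such as 1000). The code is as follows: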

foreach (var buffer in faces
  .Buffer(1000)
  .Select((list, groupId) => (list, groupId)))
{
  GroupResult group = await fc.Face.GroupAsync(buffer.list.Select(x => x.Key).ToList());
  var folder = Path.Combine(outFolder, "gid-" + buffer.groupId);
  CopyGroup(folder, group, faces);
}
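If you would rather not add a package just for this, the batching itself is small enough to write by hand. The following is a minimal sketch of the same idea, not the System.Interactive implementation; Batching and InBatches are hypothetical names for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class Batching
{
    // Splits a sequence into consecutive lists of at most `size` items.
    // The last batch may be smaller; an empty source yields no batches.
    public static IEnumerable<IList<T>> InBatches<T>(this IEnumerable<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        return Iterate();

        // Local iterator so the argument checks above run immediately.
        IEnumerable<IList<T>> Iterate()
        {
            var batch = new List<T>(size);
            foreach (T item in source)
            {
                batch.Add(item);
                if (batch.Count == size)
                {
                    yield return batch;
                    batch = new List<T>(size);
                }
            }
            if (batch.Count > 0)
            {
                yield return batch;
            }
        }
    }
}
```

With 2500 items and a size of 1000 this yields batches of 1000, 1000, and 500. One thing to keep in mind with any batching approach here: each batch is grouped independently, so the same person can end up in a group in more than one gid folder.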

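Summary

All of the code used in this article has been uploaded to my blog-data GitHub repository. Just supply the images and a key and it can be run directly: https://github.com/sdcb/blog-data/tree/master/2019/20191122-dotnet-face-detection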
