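.NET Core Service Diagnostics Tool

Preface:

 The previous article covered how to use the global .NET Core performance diagnostic tools. This one builds a simple .NET Core diagnostics tool from scratch.

 The tool covers a few basic features: viewing process information for .NET Core programs, reading performance counter values, capturing dumps, and generating trace files.

 All of these features are built on the Microsoft.Diagnostics.NETCore.Client library.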

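I. Introduction to Microsoft.Diagnostics.NETCore.Client:

 Overview:

  Microsoft.Diagnostics.NETCore.Client (also known as the Diagnostics client library) is a managed library that lets you interact with the .NET Core runtime (CoreCLR) to perform various diagnostics-related tasks, such as tracing, requesting a dump, or attaching an ICorProfiler. With this library you can write your own diagnostics tools tailored to a specific scenario.

 Installation: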

Install-Package Microsoft.Diagnostics.NETCore.Client

II. Implementing the tool:

 1. Create the project: DiagnosticsTools (.NET 5.0 WinForms project)

  Add the package references:

Install-Package Microsoft.Diagnostics.NETCore.Client
Install-Package Microsoft.Diagnostics.Tracing.TraceEvent 

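  Project structure:

   [Screenshot: project structure]

  Lay out the form as follows:

   [Screenshot: form layout]

 2. Get the list of all running processes on .NET Core 3.0 or later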

/// <summary>
/// Gets process status for processes running on .NET Core 3.0 or later
/// </summary>
private void PrintProcessStatus()
{
    // Remember the selected cell so the selection can be restored after the refresh
    int row = dgvPros.CurrentCell == null ? 0 : dgvPros.CurrentCell.RowIndex;
    int col = dgvPros.CurrentCell == null ? 0 : dgvPros.CurrentCell.ColumnIndex;
    // GetPublishedProcesses returns the IDs of processes that expose a diagnostics endpoint
    var data = DiagnosticsClient.GetPublishedProcesses()
        .Select(Process.GetProcessById)
        .Where(process => process != null)
        .Select(o => new { o.Id, o.ProcessName, o.StartTime, o.Threads.Count });
    dgvPros.DataSource = data.ToList();
    if (dgvPros.Rows.Count > row)
        dgvPros.CurrentCell = dgvPros.Rows[row].Cells[col];
}
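
One caveat about the listing above: Process.GetProcessById throws an ArgumentException when the process is no longer running rather than returning null, so the null filter does not cover a process that exits between GetPublishedProcesses() and the lookup. Below is a minimal sketch of a more tolerant lookup. The ProcessLookup class and its method names are illustrative and not part of the original project.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper (not in the original project).
public static class ProcessLookup
{
    // Returns the process with the given id, or null if it is no longer running.
    public static Process TryGet(int processId)
    {
        try
        {
            return Process.GetProcessById(processId);
        }
        catch (ArgumentException)
        {
            // Thrown when no running process has this id.
            return null;
        }
    }

    // Resolves a sequence of ids, skipping processes that have already exited.
    public static List<Process> Resolve(IEnumerable<int> processIds)
    {
        var result = new List<Process>();
        foreach (int id in processIds)
        {
            Process p = TryGet(id);
            if (p != null) result.Add(p);
        }
        return result;
    }
}
```

In PrintProcessStatus, something like ProcessLookup.Resolve(DiagnosticsClient.GetPublishedProcesses()) could stand in for the Select/Where pair.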

 3. Get basic information about the selected process:

private string GetProInfo(Process info)
{
    StringBuilder stringBuilder = new StringBuilder();
    stringBuilder.Append("Process image name: " + info.ProcessName + "\r\n");
    stringBuilder.Append("Process ID: " + info.Id + "\r\n");
    stringBuilder.Append("Thread count: " + info.Threads.Count.ToString() + "\r\n");
    stringBuilder.Append("Total CPU time: " + info.TotalProcessorTime.ToString() + "\r\n");
    stringBuilder.Append("Priority class: " + info.PriorityClass.ToString() + "\r\n");
    stringBuilder.Append("Start time: " + info.StartTime.ToLongTimeString() + "\r\n");
    stringBuilder.Append("Private memory: " + (info.PrivateMemorySize64 / 1024).ToString() + "K" + "\r\n");
    stringBuilder.Append("Peak virtual memory: " + (info.PeakVirtualMemorySize64 / 1024).ToString() + "K" + "\r\n");
    stringBuilder.Append("Peak paged memory: " + (info.PeakPagedMemorySize64 / 1024).ToString() + "K" + "\r\n");
    stringBuilder.Append("Paged system memory: " + (info.PagedSystemMemorySize64 / 1024).ToString() + "K" + "\r\n");
    stringBuilder.Append("Paged memory: " + (info.PagedMemorySize64 / 1024).ToString() + "K" + "\r\n");
    stringBuilder.Append("Nonpaged system memory: " + (info.NonpagedSystemMemorySize64 / 1024).ToString() + "K" + "\r\n");
    stringBuilder.Append("Physical memory (working set): " + (info.WorkingSet64 / 1024).ToString() + "K" + "\r\n");
    stringBuilder.Append("Virtual memory: " + (info.VirtualMemorySize64 / 1024).ToString() + "K");
    return stringBuilder.ToString();
}

 4. Listen for process events (CLR, Dynamic, Kernel, etc.):

/// <summary>
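/// Outputs runtime events of a process: CLR events, performance counters, and dynamic handling (captures a dump when CPU usage exceeds 90%)
/// </summary>
/// <param name="processId">Process ID</param>
/// <param name="threshold">CPU usage threshold, in percent</param>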
private void PrintRuntime(int processId, int threshold = 90)
{
    if (!diagnosticsCache.ContainsKey(processId))
    {
        var providers = new List<EventPipeProvider>()
        {
           new EventPipeProvider("Microsoft-Windows-DotNETRuntime",EventLevel.Informational, (long)ClrTraceEventParser.Keywords.Default),
           // Performance counters: reported at a 1 s interval
           new EventPipeProvider("System.Runtime",EventLevel.Informational,(long)ClrTraceEventParser.Keywords.None,
                                new Dictionary<string, string>() {{ "EventCounterIntervalSec", "1" }})
         };

        DiagnosticsClient client = new DiagnosticsClient(processId);
        diagnosticsCache[processId] = client;
        using (EventPipeSession session = client.StartEventPipeSession(providers, false))
        {
            var source = new EventPipeEventSource(session.EventStream);

            source.Clr.All += (TraceEvent obj) =>
            {
                if (dgvPros.CurrentRow != null && obj.ProcessID.Equals(dgvPros.CurrentRow.Cells[0].Value))
                {
                    string msg = $"Clr-{obj.EventName}-";
                    if (obj.PayloadNames.Length > 0)
                    {
                        foreach (var item in obj.PayloadNames)
                            msg += $"{item}:{ obj.PayloadStringByName(item)}-";
                    }
                    TextAppendLine(msg);
                }
            };
            source.Dynamic.All += (TraceEvent obj) =>
            {
                if (dgvPros.CurrentRow != null && obj.ProcessID.Equals(dgvPros.CurrentRow.Cells[0].Value))
                {
                    string msg = $"Dynamic-{obj.EventName}-{string.Join("|", obj.PayloadNames)}";
                    // Performance counter events
                    if (obj.EventName.Equals("EventCounters"))
                    {
                        var payloadFields = (IDictionary<string, object>)(obj.PayloadByName(""));
                        if (payloadFields != null)
                            payloadFields = payloadFields["Payload"] as IDictionary<string, object>;

                        if (payloadFields != null)
                        {
                            msg = $"Dynamic-{obj.EventName}-{payloadFields["DisplayName"]}:{payloadFields["Mean"]}{payloadFields["DisplayUnits"]}";
                            TextAppendLine(msg);
                        }
                        // Capture a dump if CPU usage exceeds the threshold (90% by default)
                        if (payloadFields != null && payloadFields["Name"].ToString().Equals("cpu-usage"))
                        {
                            double cpuUsage = Double.Parse(payloadFields["Mean"].ToString());
                            if (cpuUsage > (double)threshold)
                            {
                                client.WriteDump(DumpType.Normal, "/tmp/minidump.dmp");
                            }
                        }
                    }
                    else
                    {
                        if (obj.PayloadNames.Length > 0)
                        {
                            foreach (var item in obj.PayloadNames)
                                msg += $"{item}:{ obj.PayloadStringByName(item)}-";
                        }
                        TextAppendLine(msg);
                    }
                }
            };
            source.Kernel.All += (TraceEvent obj) =>
            {
                if (dgvPros.CurrentRow != null && obj.ProcessID.Equals(dgvPros.CurrentRow.Cells[0].Value))
                {
                    string msg = $"Kernel-{obj.EventName}-{string.Join("|", obj.PayloadNames)}";
                    TextAppendLine(msg);
                }
            };

            try
            {
                source.Process();
            }
            catch (Exception e)
            {
                string errorMsg = $"Error: {e}";
                TextAppendLine(errorMsg);
            }
        }
    }
}
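
The EventCounters branch above reads payloadFields["Mean"] for every counter. As far as I know, incrementing counters (for example exception-count) report their value in an "Increment" field instead of "Mean", in which case the indexer would throw for those events. The following is a minimal sketch of a more defensive reader under that assumption. CounterPayload and TryRead are illustrative names, not part of the original project or of the TraceEvent library.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper (not in the original project).
public static class CounterPayload
{
    // Extracts the counter name and numeric value from an EventCounters payload.
    // Assumption: mean-style counters carry "Mean", incrementing counters carry "Increment".
    public static bool TryRead(IDictionary<string, object> fields, out string name, out double value)
    {
        name = null;
        value = 0;
        if (fields == null) return false;
        if (!fields.TryGetValue("Name", out object rawName) || rawName == null) return false;

        object rawValue;
        if (!fields.TryGetValue("Mean", out rawValue) && !fields.TryGetValue("Increment", out rawValue))
            return false;
        if (rawValue == null) return false;

        try
        {
            value = Convert.ToDouble(rawValue, CultureInfo.InvariantCulture);
        }
        catch (Exception ex) when (ex is FormatException || ex is InvalidCastException)
        {
            // The value was present but not numeric.
            return false;
        }
        name = rawName.ToString();
        return true;
    }
}
```

Inside the handler, the "cpu-usage" check could then compare the name and value returned by TryRead instead of indexing the dictionary directly.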

 5. Capturing a dump:

/// <summary>
/// Captures a dump file
/// </summary>
/// <param name="processId">Process ID</param>
private void TriggerCoreDump(int processId)
{
    saveFileDialog1.Filter = "Dump files|*.dmp";
    if (saveFileDialog1.ShowDialog() == DialogResult.OK)
    {
        var client = new DiagnosticsClient(processId);
        //Normal = 1,WithHeap = 2,Triage = 3,Full = 4
        client.WriteDump(DumpType.Normal, saveFileDialog1.FileName, false);
    }
}

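  Parameter description:

Dump type.

  • Normal: Includes only the information needed to capture stack traces for all existing threads in the process. Limited GC heap memory and information.
  • WithHeap: Includes the GC heaps plus the information needed to capture stack traces for all existing threads in the process.
  • Triage: Includes only the information needed to capture stack traces for all existing threads in the process. Limited GC heap memory and information.
  • Full: Includes all accessible memory in the process. The raw memory data is placed at the end, so the initial structures can be mapped directly without the raw memory information. This option can produce a very large dump file.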
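
Relating this back to step 4: with a 1 s counter interval, the threshold branch there would request a dump on every sample above the threshold, always to the same fixed path. Below is a minimal sketch of a cooldown plus a timestamped file name that could guard that call. DumpThrottle, its members, and the file name pattern are all hypothetical and not part of the original project.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not in the original project).
public sealed class DumpThrottle
{
    private readonly double _threshold;
    private readonly TimeSpan _cooldown;
    private DateTime? _lastCapture;

    public DumpThrottle(double threshold, TimeSpan cooldown)
    {
        _threshold = threshold;
        _cooldown = cooldown;
    }

    // Returns true when a dump should be written for this CPU sample:
    // the sample is above the threshold and the cooldown since the last capture has elapsed.
    public bool ShouldCapture(double cpuUsage, DateTime now)
    {
        if (cpuUsage <= _threshold) return false;
        if (_lastCapture.HasValue && now - _lastCapture.Value < _cooldown) return false;
        _lastCapture = now;
        return true;
    }

    // Builds a per-capture file name so successive dumps do not share one path.
    public static string BuildFileName(int processId, DateTime now)
    {
        string stamp = now.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture);
        return "dump_" + processId + "_" + stamp + ".dmp";
    }
}
```

The handler would call ShouldCapture(cpuUsage, DateTime.Now) before client.WriteDump and pass a path built from BuildFileName. The cooldown length is a judgment call that depends on how large the dumps are.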

 6. Generate a trace file for a process over a specified duration

/// <summary>
/// Writes a trace file
/// </summary>
/// <param name="processId">Process ID</param>
/// <param name="duration">Duration to trace, in seconds</param>
private void TraceProcessForDuration(int processId, int duration)
{
    saveFileDialog1.Filter = "Nettrace files|*.nettrace";
    if (saveFileDialog1.ShowDialog() == DialogResult.OK)
    {
        var cpuProviders = new List<EventPipeProvider>()
        {
            new EventPipeProvider("Microsoft-Windows-DotNETRuntime", EventLevel.Informational, (long)ClrTraceEventParser.Keywords.Default),
            new EventPipeProvider("Microsoft-DotNETCore-SampleProfiler", EventLevel.Informational, (long)ClrTraceEventParser.Keywords.None)
        };
        var client = new DiagnosticsClient(processId);
        using (var traceSession = client.StartEventPipeSession(cpuProviders))
        {
            Task copyTask = Task.Run(async () =>
            {
                using (FileStream fs = new FileStream(saveFileDialog1.FileName, FileMode.Create, FileAccess.Write))
                {
                    await traceSession.EventStream.CopyToAsync(fs);
                }
            });

            copyTask.Wait(duration * 1000);
            traceSession.Stop();
        }
    }
}
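
In the listing above, the session is stopped and disposed right after the timed wait, without waiting for the copy task to finish writing. A minimal sketch of a bounded copy that waits for the remaining data is shown below, assuming that stopping the session makes the event stream reach end-of-stream. BoundedCopy and RunAsync are illustrative names, not part of the original project.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper (not in the original project).
public static class BoundedCopy
{
    // Copies source to destination for at most `duration`. If the limit is hit,
    // calls `stop` and then waits for the copy to drain the remaining data.
    // Returns true if the time limit was hit, false if the source ended on its own.
    // Assumption: `stop` causes the source stream to reach end-of-stream.
    public static async Task<bool> RunAsync(Stream source, Stream destination, TimeSpan duration, Action stop)
    {
        Task copy = source.CopyToAsync(destination);
        Task first = await Task.WhenAny(copy, Task.Delay(duration));
        bool timedOut = first != copy;
        if (timedOut) stop();
        await copy;
        return timedOut;
    }
}
```

With the names from the listing, a call might look like BoundedCopy.RunAsync(traceSession.EventStream, fs, TimeSpan.FromSeconds(duration), traceSession.Stop). In a WinForms handler it would be better awaited than blocked on.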

III. Results

 The screenshot below shows the tool running with the features described above:

  

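 Summary:

 With Microsoft.Diagnostics.NETCore.Client from Microsoft, these features were fairly simple to implement. The registered events carry far more information than is used here, and much of it is still waiting to be analyzed.

 This is only a simple first step, and there is a long way to go.

Other:

 Reference: //github.com/dotnet/diagnostics

 Source code: //github.com/cwsheng/DiagnosticsTools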