Advanced Data Analysis Tutorial (Part 3)
- October 6, 2019
- Notes
Workflow Unit Testing
1. Uploading the workflow definitions and configuration

    [hadoop@hdp-node-01 wf-oozie]$ hadoop fs -put hive2-etl /user/hadoop/oozie/myapps/
    [hadoop@hdp-node-01 wf-oozie]$ hadoop fs -put hive2-dw /user/hadoop/oozie/myapps/
    [hadoop@hdp-node-01 wf-oozie]$ ll
    total 12
    drwxrwxr-x. 2 hadoop hadoop 4096 Nov 23 16:32 hive2-dw
    drwxrwxr-x. 2 hadoop hadoop 4096 Nov 23 16:32 hive2-etl
    drwxrwxr-x. 3 hadoop hadoop 4096 Nov 23 11:24 weblog
    [hadoop@hdp-node-01 wf-oozie]$ export OOZIE_URL=http://localhost:11000/oozie

2. Submitting and starting the workflow units
oozie job -D inpath=/weblog/input -D outpath=/weblog/outpre -config weblog/job.properties -run
Start the Hive workflow for the ETL step:
oozie job -config hive2-etl/job.properties -run
Start the Hive workflow that computes the pvs statistics:
oozie job -config hive2-dw/job.properties -run
3. Workflow coordinator configuration (excerpt)
Multiple workflow jobs are organized and scheduled by a coordinator:

    [hadoop@hdp-node-01 hive2-etl]$ ll
    total 28
    -rw-rw-r--. 1 hadoop hadoop  265 Nov 13 16:39 config-default.xml
    -rw-rw-r--. 1 hadoop hadoop  512 Nov 26 16:43 coordinator.xml
    -rw-rw-r--. 1 hadoop hadoop  382 Nov 26 16:49 job.properties
    drwxrwxr-x. 2 hadoop hadoop 4096 Nov 27 11:26 lib
    -rw-rw-r--. 1 hadoop hadoop 1910 Nov 23 17:49 script.q
    -rw-rw-r--. 1 hadoop hadoop  687 Nov 23 16:32 workflow.xml

config-default.xml:

    <configuration>
        <property>
            <name>jobTracker</name>
            <value>hdp-node-01:8032</value>
        </property>
        <property>
            <name>nameNode</name>
            <value>hdfs://hdp-node-01:9000</value>
        </property>
        <property>
            <name>queueName</name>
            <value>default</value>
        </property>
    </configuration>

job.properties:

    user.name=hadoop
    oozie.use.system.libpath=true
    oozie.libpath=hdfs://hdp-node-01:9000/user/hadoop/share/lib
    oozie.wf.application.path=hdfs://hdp-node-01:9000/user/hadoop/oozie/myapps/hive2-etl/

workflow.xml:

    <workflow-app xmlns="uri:oozie:workflow:0.5" name="hive2-wf">
        <start to="hive2-node"/>

        <action name="hive2-node">
            <hive2 xmlns="uri:oozie:hive2-action:0.1">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <configuration>
                    <property>
                        <name>mapred.job.queue.name</name>
                        <value>${queueName}</value>
                    </property>
                </configuration>
                <jdbc-url>jdbc:hive2://hdp-node-01:10000</jdbc-url>
                <script>script.q</script>
                <param>input=/weblog/outpre2</param>
            </hive2>
            <ok to="end"/>
            <error to="fail"/>
        </action>

        <kill name="fail">
            <message>Hive2 (Beeline) action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
        </kill>
        <end name="end"/>
    </workflow-app>

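The `${jobTracker}`, `${nameNode}` and `${queueName}` placeholders in workflow.xml are filled from the job configuration, with config-default.xml supplying defaults that job.properties can override. The sketch below only illustrates that lookup order in plain Java; it is not how Oozie implements it. The class `ParamResolutionSketch`, its `resolve` method, and the `queueName=etl` override are hypothetical and do not appear in the project.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Resolves simple ${name} placeholders from defaults overlaid by job properties. */
class ParamResolutionSketch {

    // Matches plain names only, so EL calls such as ${wf:errorMessage(...)} are left untouched.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_.]*)\\}");

    static String resolve(String text, Properties defaults, Properties jobProps) {
        // Properties passed to the constructor act as fallbacks for missing keys.
        Properties merged = new Properties(defaults);
        merged.putAll(jobProps);

        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = merged.getProperty(m.group(1));
            // Leave unknown placeholders as they are instead of failing.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Values copied from config-default.xml.
        Properties defaults = new Properties();
        defaults.setProperty("jobTracker", "hdp-node-01:8032");
        defaults.setProperty("nameNode", "hdfs://hdp-node-01:9000");
        defaults.setProperty("queueName", "default");

        // Hypothetical override, as if job.properties contained queueName=etl.
        Properties jobProps = new Properties();
        jobProps.setProperty("queueName", "etl");

        System.out.println(resolve("<name-node>${nameNode}</name-node>", defaults, jobProps));
        System.out.println(resolve("<value>${queueName}</value>", defaults, jobProps));
    }
}
```

Running a check like this against workflow.xml before uploading can catch a placeholder that neither file defines, since it is returned unchanged.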
coordinator.xml:

    <coordinator-app name="cron-coord" frequency="${coord:minutes(5)}" start="${start}" end="${end}"
                     timezone="Asia/Shanghai" xmlns="uri:oozie:coordinator:0.2">
        <action>
            <workflow>
                <app-path>${workflowAppUri}</app-path>
                <configuration>
                    <property>
                        <name>jobTracker</name>
                        <value>${jobTracker}</value>
                    </property>
                    <property>
                        <name>nameNode</name>
                        <value>${nameNode}</value>
                    </property>
                    <property>
                        <name>queueName</name>
                        <value>${queueName}</value>
                    </property>
                </configuration>
            </workflow>
        </action>
    </coordinator-app>

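The coordinator above triggers the workflow every five minutes between `${start}` and `${end}`. To make that schedule concrete, the following sketch lists the nominal trigger times for a given window. It is an illustration only, not Oozie code: the class `CoordinatorSchedule`, the sample start and end values, and the choice to treat the end time as exclusive are all assumptions of this example.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

/** Lists the nominal times at which a fixed-frequency schedule would fire. */
class CoordinatorSchedule {

    /**
     * Returns start, start + frequency, ... for every instant strictly before end.
     * Treating end as exclusive is an assumption of this sketch.
     */
    static List<LocalDateTime> nominalTimes(LocalDateTime start, LocalDateTime end,
                                            Duration frequency) {
        if (frequency.isZero() || frequency.isNegative()) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        List<LocalDateTime> times = new ArrayList<>();
        for (LocalDateTime t = start; t.isBefore(end); t = t.plus(frequency)) {
            times.add(t);
        }
        return times;
    }

    public static void main(String[] args) {
        // Hypothetical values for the ${start} and ${end} parameters.
        LocalDateTime start = LocalDateTime.of(2016, 11, 27, 10, 0);
        LocalDateTime end = LocalDateTime.of(2016, 11, 27, 10, 20);
        for (LocalDateTime t : nominalTimes(start, end, Duration.ofMinutes(5))) {
            System.out.println(t);
        }
    }
}
```

With a 20-minute window and a 5-minute frequency, the loop yields four trigger times, which is a quick way to estimate how many workflow runs a test window will create.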
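Module Development: Data Presentation
Enterprise data analysis systems can choose from many front-end presentation tools:
- Standalone dedicated systems: represented by foreign products such as BusinessObjects (BO, Crystal Reports), Hyperion (Brio), and Cognos. Their servers are deployed separately and exchange information with the application over some protocol.
- Web application presentation: a standalone or embedded Java web system reads the report statistics and presents them as web pages, for example the 100% pure-Java Runqian Report (潤乾報表).
This log analysis project presents its results through a self-developed web application.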
Technology stack used by the web presentation application:
jQuery + ECharts + Spring MVC + Spring + MyBatis + MySQL
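Presentation flow:
1. Use the back-end framework stack (Spring MVC + Spring + MyBatis) to read the data to be displayed from MySQL
2. Return the data to the page in JSON format
3. On the page, use ECharts to parse the JSON and render the chart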
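Step 2 of the flow above can be sketched with the JDK alone. The example serializes a list of day labels and pv counts into a JSON object shaped loosely like an ECharts option. The class `ChartJsonSketch`, its methods, and the sample rows are hypothetical; the project itself uses Gson for this, and a real application should rely on a JSON library instead of string concatenation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Builds a small chart payload as a JSON string using only the JDK. */
class ChartJsonSketch {

    /** Quotes a string for JSON, escaping backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Serializes category labels and their values into one JSON object. */
    static String toChartJson(String seriesName, List<String> labels, List<Integer> values) {
        if (labels.size() != values.size()) {
            throw new IllegalArgumentException("labels and values must have the same length");
        }
        String labelJson = labels.stream()
                .map(ChartJsonSketch::quote)
                .collect(Collectors.joining(","));
        String valueJson = values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        return "{\"xAxis\":{\"type\":\"category\",\"data\":[" + labelJson + "]},"
                + "\"yAxis\":{\"type\":\"value\"},"
                + "\"series\":[{\"name\":" + quote(seriesName)
                + ",\"type\":\"line\",\"data\":[" + valueJson + "]}]}";
    }

    public static void main(String[] args) {
        // Hypothetical rows standing in for the result of the MySQL query.
        List<String> days = Arrays.asList("1123", "1124", "1125");
        List<Integer> pvs = Arrays.asList(120, 98, 143);
        System.out.println(toChartJson("pvs", days, pvs));
    }
}
```

On the page, the returned string would be parsed and handed to the chart instance, which corresponds to step 3.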
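Web application project structure
The project is managed with Maven, which pulls in the framework dependencies (Spring MVC, Spring, MyBatis) and the jQuery and ECharts JavaScript libraries.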

Web application implementation code
The implementation follows a typical MVC architecture:
Layer | Technology |
---|---|
Page | HTML + jQuery + ECharts |
Controller | Spring MVC |
Service | Service |
DAO | MyBatis |
Database | MySQL |
See the project source for the full code.
Code example: ChartServiceImpl

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.ArrayList;
    import java.util.Calendar;
    import java.util.Date;
    import java.util.HashMap;
    import java.util.List;

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Service;

    import com.google.gson.Gson;

    // Imports for the project's own classes (IChartService, IEchartsDao, EchartsData,
    // EchartsOptionUtil, ToolBox, Serie, XAxi, YAxi) are omitted; see the project source.

    @Service("chartService")
    public class ChartServiceImpl implements IChartService {

        @Autowired
        IEchartsDao iEchartsDao;

        // Assembles the ECharts option object for the pvs chart.
        public EchartsData getChartsData() {
            List<Integer> xAxiesList = iEchartsDao.getXAxiesList("");
            List<Integer> pointsDataList = iEchartsDao.getPointsDataList("");

            EchartsData data = new EchartsData();
            ToolBox toolBox = EchartsOptionUtil.getToolBox();
            Serie serie = EchartsOptionUtil.getSerie(pointsDataList);
            ArrayList<Serie> series = new ArrayList<Serie>();
            series.add(serie);

            List<XAxi> xAxis = EchartsOptionUtil.getXAxis(xAxiesList);
            List<YAxi> yAxis = EchartsOptionUtil.getYAxis();

            HashMap<String, String> title = new HashMap<String, String>();
            title.put("text", "pvs");
            title.put("subtext", "超級pvs"); // "super pvs"

            HashMap<String, String> tooltip = new HashMap<String, String>();
            tooltip.put("trigger", "axis");

            HashMap<String, String[]> legend = new HashMap<String, String[]>();
            legend.put("data", new String[]{"pv統計"}); // "pv statistics"

            data.setTitle(title);
            data.setTooltip(tooltip);
            data.setLegend(legend);
            data.setToolbox(toolBox);
            data.setCalculable(true);
            data.setxAxis(xAxis);
            data.setyAxis(yAxis);
            data.setSeries(series);
            return data;
        }

        // Returns the overview figures for the given day ("MMdd") and for the day before it.
        public List<HashMap<String, Integer>> getGaiKuangList(String date) throws ParseException {
            HashMap<String, Integer> gaiKuangToday = iEchartsDao.getGaiKuang(date);

            // Step back one day from the requested date.
            SimpleDateFormat sf = new SimpleDateFormat("MMdd");
            Date parse = sf.parse(date);
            Calendar calendar = Calendar.getInstance();
            calendar.setTime(parse);
            calendar.add(Calendar.DAY_OF_MONTH, -1);
            Date before = calendar.getTime();
            String beforeString = sf.format(before);
            System.out.println(beforeString);

            HashMap<String, Integer> gaiKuangBefore = iEchartsDao.getGaiKuang(beforeString);

            ArrayList<HashMap<String, Integer>> gaiKuangList = new ArrayList<HashMap<String, Integer>>();
            gaiKuangList.add(gaiKuangToday);
            gaiKuangList.add(gaiKuangBefore);
            return gaiKuangList;
        }

        // Quick manual check that prints the chart data as JSON.
        // Note: an instance created with `new` is not managed by Spring, so iEchartsDao
        // is not injected here; obtain the bean from the application context to run this.
        public static void main(String[] args) {
            ChartServiceImpl chartServiceImpl = new ChartServiceImpl();
            EchartsData chartsData = chartServiceImpl.getChartsData();
            Gson gson = new Gson();
            String json = gson.toJson(chartsData);
            System.out.println(json);
        }
    }

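One detail of `getGaiKuangList` is worth a closer look: the `"MMdd"` pattern carries no year, so `SimpleDateFormat` falls back to its default year (1970, which is not a leap year), and the day before `0301` is always reported as `0228`. The sketch below shows one possible alternative using `java.time` with an explicit year. The class `PreviousDaySketch` and the extra `year` parameter are hypothetical and are not part of the project.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Computes the day before a date given as "MMdd" plus an explicit year. */
class PreviousDaySketch {

    private static final DateTimeFormatter MMDD = DateTimeFormatter.ofPattern("MMdd");

    /** Returns the previous calendar day, formatted as "MMdd". */
    static String previousDay(int year, String mmdd) {
        // BASIC_ISO_DATE parses "yyyyMMdd", so prefix the zero-padded year.
        LocalDate day = LocalDate.parse(String.format("%04d%s", year, mmdd),
                DateTimeFormatter.BASIC_ISO_DATE);
        return day.minusDays(1).format(MMDD);
    }

    public static void main(String[] args) {
        System.out.println(previousDay(2016, "0301")); // 2016 is a leap year
        System.out.println(previousDay(2015, "0301"));
        System.out.println(previousDay(2016, "0101")); // crosses the year boundary
    }
}
```

Passing the year explicitly keeps leap days correct, and an invalid input such as `0230` fails at parse time instead of being silently rolled over.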
Web application presentation results
Site overview



Traffic analysis


Referral source analysis


Visitor analysis

OVER. This concludes the hands-on data project!