SZT-bigdata

Shenzhen Metro big data passenger flow analysis system🚇🚄🌟

OTHER License




   ___     ____   _____           _         _      __ _      _             _
  / __|   |_  /  |_   _|   ___   | |__     (_)    / _` |  __| |   __ _    | |_    __ _
  \__ \    / /     | |    |___|  | '_ \    | |    \__, | / _` |  / _` |   |  _|  / _` |
  |___/   /___|   _|_|_   _____  |_.__/   _|_|_   |___/  \__,_|  \__,_|   _\__|  \__,_|
_|"""""|_|"""""|_|"""""|_|     |_|"""""|_|"""""|_|"""""|_|"""""|_|"""""|_|"""""|_|"""""|
"`-0-0-'"`-0-0-'"`-0-0-'"`-0-0-'"`-0-0-'"`-0-0-'"`-0-0-'"`-0-0-'"`-0-0-'"`-0-0-'"`-0-0-'

  • ...

.file/.doc/SZT-bigdata-2.png


1-cn.java666.sztcommon.util.SZTData
2-cn.java666.etlflink.app.Jsons2Redis
3-cn.java666.etlspringboot.controller.RedisController#get
4-cn.java666.etlflink.app.Redis2ES
5-cn.java666.etlflink.app.Redis2Csv
6-Hive sql 
7-Spark
8-HUE  Hive 
9-cn.java666.etlflink.app.Redis2HBase
10, 14-cn.java666.szthbase.controller.KafkaListen#sink2Hbase
11-cn.java666.etlflink.app.Redis2HBase
12-CDH HDFS+HUE+Hbase+Hive 
13-cn.java666.etlflink.app.Redis2Kafka
15-cn.java666.sztflink.realtime.Kafka2MyCH
16-cn.java666.sztflink.realtime.sink.MyClickhouseSinkFun


+ + ()

  • Java-1.8/Scala-2.11
  • Flink-1.10 ETL
  • Redis-3.2 SSDB Win10|CentOS7|Docker Redis-3.2 CentOS REPL yum 3.2
  • Kafka-2.1 CP kafka-eagle-1.4.5 Ksql zk Kafka
    • KafkaOffsetMonitor
    • Kafka Manager CMAK Kafka 0.11 Kafka 2.4
    • Kafka
  • Zookeeper-3.4.5 ID
  • CDH-6.2
  • Docker-19 docker
  • SpringBoot-2.1.13 JAVA
  • knife4j-2.0 swagger-bootstrap-ui REST API
  • Elasticsearch-7
  • Kibana-7.4 ELK
  • ClickHouse nginx clickhouse PB
  • MongoDB-4.0 Json
  • Spark-2.3 spark Flink
  • Hive-2.1 Hadoop OLAP HQL Mysql
  • Impala-3.2 hive sql impala hive 80
  • HBase-2.1 + Phoenix Hadoop HBase rowkey hbase
  • Kylin-2.5
  • HUE-4.3 CDH hive + impala hdfs oozie
  • DataX FlinkX Flink
  • Oozie-5.1 UI HUE
  • Sqoop-1.4 Mysql HDFS
  • Mysql-5.7 SQL Mysql 8.0 MariaDB Mysql
  • Hadoop3.0 HDFS+Yarn HDFS Yarn hadoop MR
  • DataV
  • ...

Apache CDH 2021 CDH USDP 32G RAM * 3 Hadoop Apache Hadoop

  • Win10 VMware + Win10 VMware + CentOS7 SSD + HDFS

  • 16G RAM Linux hadoop vagrant
#  linux RAM > 16G centos7
# 
# 
# 
# https://github.com/juewuy/ShellClash

curl -sSL https://raw.githubusercontent.com/geekyouth/vagrant/main/start.sh | sh -x

kafka


java scala IDEA VMware CDH


1- appKey

https://opendata.sz.gov.cn/data/api/toApiDetails/29200_00403601

2-

2.1- cn.java666.etlspringboot.source.SZTData#saveData /tmp/szt-data/szt-data-page.jsons 1337 1000


2.2- cn.java666.etlflink.sink.RedisSinkPageJson#main etl redis redis 1337


2.3- redis redis-cli hget szt:pageJson 1

dbeaver


2.4- cn.java666.etlspringboot.EtlSApp#main knife4j REST API


2.5- cn.java666.etlflink.source.MyRedisSourceFun#run 133.7 9 station car_no

{
	"deal_date": "2018-08-31 21:15:55",
	"close_date": "2018-09-01 00:00:00",
	"card_no": "CBHGDEEJB",
	"deal_value": "0",
	"deal_type": "",
	"company_name": "",
	"car_no": "IGT-104",
	"station": "",
	"conn_mark": "0",
	"deal_money": "0",
	"equ_no": "263032104"
}
{
	"deal_date": "2018-09-01 05:24:22",
	"close_date": "2018-09-01 00:00:00",
	"card_no": "HHAAABGEH",
	"deal_value": "0",
	"deal_type": "",
	"company_name": "",
	"conn_mark": "0",
	"deal_money": "0",
	"equ_no": "268005140"
}
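
The two sample records above differ in shape: the first carries 11 fields, the second only 9 (it has no `car_no` and no `station`). A minimal Java sketch of a completeness check follows. It assumes the records are already parsed into a map and that the cleaning rule is to keep only records carrying both fields; that rule is inferred from the samples, and the class and method names are hypothetical, not taken from the project.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the project.
final class RecordFilter {
    // Fields present in the first sample record but absent from the second
    // (assumed to be required for downstream analysis).
    static final List<String> REQUIRED = Arrays.asList("car_no", "station");

    // A record is complete when every required key exists, even if its value is empty.
    static boolean isComplete(Map<String, String> record) {
        return record.keySet().containsAll(REQUIRED);
    }

    public static void main(String[] args) {
        Map<String, String> full = new HashMap<>();
        full.put("card_no", "CBHGDEEJB");
        full.put("car_no", "IGT-104");
        full.put("station", "");

        Map<String, String> partial = new HashMap<>();
        partial.put("card_no", "HHAAABGEH");

        System.out.println(isComplete(full));    // true
        System.out.println(isComplete(partial)); // false
    }
}
```

The check looks at key presence only, so a record with an empty `station` value still passes, as in the first sample.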

2.6- cn.java666.etlflink.app.Redis2Kafka#main kafka topic-flink-szt-all 1337000 topic-flink-szt 1266039


2.7- kafka-eagle topic

ksql select * from "topic-flink-szt" where "partition" in (0) limit 1000


2.8- cn.java666.etlflink.app.Redis2Csv#main flink sink csv


2.9- cn.java666.etlflink.app.Redis2ES#main ES

ES

ES



2018-09-01 kibana 2018-09-01 00:00:00.000~2018-09-01 23:59:59.999

1266039 2018-09-01 1229180
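
The gap between the two counts is consistent with some records falling outside the 2018-09-01 window: the first sample record, for instance, has a `deal_date` of 2018-08-31 21:15:55. A minimal Java sketch of such a day filter, assuming `deal_date` uses the `yyyy-MM-dd HH:mm:ss` pattern shown in the samples (the class and method names are hypothetical):

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the project.
final class DayWindow {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // True when dealDate falls on the given calendar day,
    // i.e. between 00:00:00.000 and 23:59:59.999 of that day.
    static boolean onDay(String dealDate, LocalDate day) {
        return LocalDateTime.parse(dealDate, FMT).toLocalDate().equals(day);
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2018, 9, 1);
        System.out.println(onDay("2018-08-31 21:15:55", day)); // false
        System.out.println(onDay("2018-09-01 05:24:22", day)); // true
    }
}
```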

2018-09-01 6~12 kibana

ETL

1337000 1266039 ES szt-data

1266039 1227234 2018-09-01

122 X 2

ES

  • ES kibana
    index
{
  "properties": {
	"deal_date": {
	  "format": "yyyy-MM-dd HH:mm:ss",
	  "type": "date"
	}
  }
}  

ES 0 ES 0 kibana UTC kibana


  • ES json json
  • ES bean fastjson Gson
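
The `deal_date` mapping above uses a format with no zone information, and Elasticsearch interprets such values as UTC, while Kibana renders dates in the browser time zone by default. The Java sketch below shows the shift involved when a local timestamp is converted to UTC before indexing. It assumes the source timestamps are in Asia/Shanghai time, which the document does not state; the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the project.
final class DealDateToUtc {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Interprets dealDate as wall-clock time in sourceZone and returns the UTC instant.
    static Instant toUtc(String dealDate, ZoneId sourceZone) {
        return LocalDateTime.parse(dealDate, FMT).atZone(sourceZone).toInstant();
    }

    public static void main(String[] args) {
        // Assumption: the card records use Asia/Shanghai local time (UTC+8).
        Instant utc = toUtc("2018-08-31 21:15:55", ZoneId.of("Asia/Shanghai"));
        System.out.println(utc); // 2018-08-31T13:15:55Z
    }
}
```

If the raw string is indexed unchanged instead, the stored instant is eight hours later than the real event, which is worth keeping in mind when choosing a Kibana time range.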

TIPS

  • Gson fastjsonGson fastjson

2.10- ES

J AA != 0 BCDEFGHIJ K


2.11- cn.java666.sztcommon.util.ParseCardNo#parse cn.java666.etlspringboot.controller.CardController#get REST API


3-

3.1-

---> ---> --->

3.2-

ODS DWD DWS ADS

  • ODS
ods/ods_szt_data/day=2018-09-01/   
# szt_szt_page/day=2018-09-01/  
  • DWD
    dim_ fact_
dwd_fact_szt_in_detail      
dwd_fact_szt_out_detail     
dwd_fact_szt_in_out_detail  
  • DWS
dws_card_record_day_wide  
  • ADS
       
	ads_in_station_day_top
       
	ads_out_station_day_top
       
	ads_in_out_station_day_top
       
	ads_card_deal_day_top  
      
	ads_line_send_passengers_day_top  
        
	ads_stations_send_passengers_day_top
      
	ads_line_single_ride_average_time_day_top
     
	ads_all_passengers_single_ride_spend_time_average
      
	ads_passenger_spend_time_day_top
 
	  		ads_station_in_equ_num_top
	    		ads_station_out_equ_num_top
 
	 		ads_line_in_equ_num_top
	 		ads_line_out_equ_num_top
    
	ads_station_deal_day_top
    
	ads_line_deal_day_top
   
	ads_conn_ratio_day_top
 9.5       
	ads_line_sale_ratio_top
 	
	ads_conn_spend_time_top
    
	ads_on_line_min_top
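
Most of the ADS tables listed above are per-day rankings (`*_day_top`). In the project these are built with Hive SQL; the Java sketch below only illustrates the underlying group, count and sort step for a table such as `ads_in_station_day_top`. The station names and the class are hypothetical, and ties are broken alphabetically as an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the project.
final class StationTop {
    // Counts occurrences per station and returns the n busiest, highest count first.
    static List<Map.Entry<String, Integer>> top(List<String> stations, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String s : stations) {
            counts.merge(s, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return ranked.subList(0, Math.min(n, ranked.size()));
    }

    public static void main(String[] args) {
        // One element per entry record; "A", "B", "C" stand in for station names.
        List<String> entries = Arrays.asList("A", "B", "A", "C", "B", "A");
        System.out.println(top(entries, 2)); // [A=3, B=2]
    }
}
```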

3.3-

hdfs hive /warehouse hue hue hue hue hive sql szt ods dwd dws ads

/warehouse/szt.db/ods/ szt-etl-data.csv szt-etl-data_2018-09-01.csv szt-page.jsons

hdfs dfs -ls -h hdfs://cdh231:8020/warehouse/szt.db/ods/

HUE sql/hive.sql HQL .....

IDEA Database idea cdh hive https://github.com/timveil/hive-jdbc-uber-jar/releases

DBeaver Sqlyog navicat heidisql workbench debug DBeaver HUE



3.3.1 -

2018-09-01


3.3.2 -

2018-09-01


3.3.3-

**2018-09-01

**


3.3.4-

**2018-09-01 48 **


3.3.5-

2018-09-01


3.3.6-

2018-09-01>>>


3.3.7-

**2018-09-01 1500s 25 11 40 **


3.3.8-

**2018-09-01 1791 s 30 **


3.3.9-

**2018-09-01 17123 4.75 20 **


3.3.10-

2018-09-01


3.3.11-

2018-09-01@_@


3.3.12-

**2018-09-01 4 **


3.3.12-

**2018-09-01 1 30 **


3.3.13-

** 15.6% 9.42%**


3.3.14-

9.5 2018-09-01 90.36% 84.3%


3.3.15-


4- SZT-kafka-hbase

SZT-kafka-hbase project for Spring Boot2 spring-boot-starter-hbase spring-data-hadoop-hbase API

hbase-2.1 + springboot-2.1.13 + kafka-2.0 hbase

  • knife4j hbase

  • hbase 10 10

  • hbase rowkey

  • hbase szt hbase

  • hbase SZT-kafka-hbase

api-debug

hue-hbase

hue-hbase

hbase-shell

scan 'szt:data', {FORMATTER => 'toString',VERSIONS=>10}


  • kafka
    cn.java666.etlflink.app.Redis2Kafka
    SZT-kafka-hbase

hbase 2GB X 3

5- SZT-flink cn.java666.etlflink.app.Json2HBase

redis json hbase redis json kafka flink hbase flink:flink2hbase 1010

hbase bean JSON

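import scala.collection.JavaConverters._

// Write each JSON field as its own HBase cell: the field name becomes the column qualifier.
// Assumes jsonObj.keySet() returns a java.util.Set[String].
for (key <- jsonObj.keySet().asScala) {
	val value = jsonObj.getStr(key)
	putCell(card_no_re, cf, key, value)
}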
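
The identifier `card_no_re` in the loop above suggests that the row key is derived from the card number. Reversing a key is a common HBase technique for spreading writes with similar prefixes across regions. The Java sketch below assumes that `_re` stands for "reversed"; this is a guess from the name, not something the document confirms, and the class is hypothetical.

```java
// Hypothetical helper, not part of the project.
final class RowKeys {
    // Builds a row key by reversing the card number (assumed meaning of card_no_re).
    static String reversedCardNo(String cardNo) {
        return new StringBuilder(cardNo).reverse().toString();
    }

    public static void main(String[] args) {
        // Card number taken from the first sample record.
        System.out.println(reversedCardNo("CBHGDEEJB")); // BJEEDGHBC
    }
}
```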

6- SZT-flink

flink kafka clickhouse


......


TODO:

  • redis pageJson csv
  • kafka
  • elasticsearch kibana
  • ODS DWD DWS ADS
  • hive on spark
  • spark on hive spark hive
  • hbase
  • [-] ~~oozie ~~;
  • flink
  • spark
  • DataV

  • 2022-05-28:

    • fastjson
    • 996 apache-2.0
  • 2020-05-25

    • flink flink kafka clickhouse
  • 2020-05-22:

  • 2020-05-14

    • RedisSinkPageJson package cn.java666.etlflink.sink package cn.java666.etlflink.app Jsons2Redis json redis
  • 2020-05-01

    • redis json hbase
    • hbase-2.1 + springboot-2.1.13 + kafka-2.0
    • kafka hbase n
  • 2020-04-30

    • hbase-2.1 + springboot-2.1.13 hbase
  • 2020-04-27

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-devtools</artifactId>
	<scope>runtime</scope>
	<optional>true</optional>
</dependency>

#########################  ###################################
#", "
spring.freemarker.cache=false
spring.thymeleaf.cache=false

#
spring.devtools.restart.enabled=true
#livereload
spring.devtools.livereload.enabled=true
#,restart
spring.devtools.restart.additional-paths=src/main/*
#
#spring.devtools.restart.exclude=static/**,public/**
  • 2020-04-27:
      • 45932
    • hive
alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
alter table PARTITION_PARAMS  modify column PARAM_VALUE varchar(4000) character set utf8;
alter table PARTITION_KEYS  modify column PKEY_COMMENT varchar(4000) character set utf8;
alter table  INDEX_PARAMS  modify column PARAM_VALUE  varchar(4000) character set utf8;
  • 2020-04-24

  • 2020-04-23

  • 2020-04-22

  • 2020-04-21:

    • SZT-spark-hive spark Hive
    • Debug spark on hive yarn
  • 2020-04-20

    • logo
    • SQL hive 3.1 TEZ hive on spark MR 10
  • 2020-04-19

    • vmware rm -rf /usr/ HDFS Kafka ES cdh
    • hive on MR hive on spark
  • 2020-04-18

  • 2020-04-17

    • v0.12;
  • 2020-04-16

    • v0.1
  • 2020-04-15

    • common
    • REST API
    • ES
    • Redis2Csv csv
  • 2020-04-14

    • csv
    • GPL-3
    • ES ,kibana
  • 2020-04-13

    • redis
    • redis REST API
    • flink source redis
    • kafka

github

