JUST: JD Urban Spatio-Temporal Data Engine
Manage Big Spatio-Temporal Data Efficiently in a Convenient Way.
(public beta version)
Product Homepage | JUST Portal | Development Manual

Paper

Ruiyuan Li, Huajun He, Rubin Wang, Yuchuan Huang, Junwen Liu, Sijie Ruan, Tianfu He, Jie Bao, Yu Zheng. JUST: JD Urban Spatio-Temporal Data Engine. The 36th IEEE International Conference on Data Engineering. (ICDE 2020) (Slides) (EI: 20202308794054) [System]

Architecture

* Our newly proposed modules are marked with orange boxes.

Key Features

(1) Scalability

JUST can manage very large volumes of spatio-temporal data while requiring only modest cluster resources.

(2) Efficiency

In our experimental settings, JUST shows competitive query performance and is much more scalable than six state-of-the-art distributed spatio-temporal data management systems.

(3) Update-enabled

JUST natively supports new data insertions and historical data updates without index reconstruction.

(4) Ease of Use

JUST incorporates a complete SQL engine and provides many out-of-the-box spatio-temporal analysis functions. All operations in JUST can be performed with a SQL-like query language, JustQL.

(5) Multiple Execution Engines Supported

JUST supports both a local engine and a distributed engine (e.g., Spark). Users can click the tag in the upper right corner to switch the execution engine.

Experiments

Data Settings


* Order distribution in Beijing, China.
* Lorry trajectory distribution in Guangzhou, China.

Results

JUST shows query efficiency competitive with advanced spatio-temporal data management systems, and is much more scalable than they are.

Experimental Code

Code of the Compared Methods

Applications

Quick Start

We have prepared a test account for the reviewers of ICDE 2020. UserName: icde2020 Password: icde2020!

Login

1. Open the login page: http://portal-just.urban-computing.cn/login
2. Enter the user name and password, then click the 登录 (Log in) button, as shown in the following picture.
3. The user interface, shown in the following picture, has four panels: Table Panel, View Panel, JustQL Panel, and Result Panel.

An Example of the Order Data (Common Table, Point-based Data)

1. Create a table:

CREATE TABLE order_table(
    order_time Timestamp,
    order_position Point,
    attr1 integer,
    attr2 long,
    attr3 long,
    attr4 integer,
    attr5 string,
    attr6 integer,
    attr7 integer,
    attr8 integer,
    attr9 string,
    attr10 string,
    attr11 string,
    attr12 double,
    attr13 integer,
    attr14 integer,
    attr15 integer,
    attr16 integer,
    attr17 integer
  ) WITH (
    "geomesa.indices.enabled" = "z2,z2t",
    "geomesa.z3.interval" = "day",
    "geomesa.xz.precision" = "16"
  );
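The `z2` entry in `geomesa.indices.enabled` names a two-dimensional Z-order (Morton) curve index over longitude and latitude. As a rough illustration of the idea only, and not the code JUST or GeoMesa actually runs, the Python sketch below maps a point to a grid cell and interleaves the bits of the two cell indices into a single sortable key. The function names and the 16-bit resolution are assumptions made for this example.

```python
def normalize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    cells = 1 << bits
    idx = int((value - lo) / (hi - lo) * cells)
    return min(max(idx, 0), cells - 1)  # clamp so that value == hi stays in range


def interleave(x, y, bits):
    """Interleave the low `bits` bits of x (even positions) and y (odd positions)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def z2_key(lng, lat, bits=16):
    """Hypothetical helper: Z-order key of a longitude/latitude pair."""
    x = normalize(lng, -180.0, 180.0, bits)
    y = normalize(lat, -90.0, 90.0, bits)
    return interleave(x, y, bits)


# Points that are close in space usually share a long key prefix.
print(bin(z2_key(116.40, 39.90)))
print(bin(z2_key(116.41, 39.91)))
```

Because nearby points tend to receive nearby keys, a key-value store sorted by such keys can answer a bounding-box query by scanning a small number of key ranges.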
        
2. Load data from the Hive warehouse:

LOAD hive :just_tutorial.order_table to JUST :order_table (
  order_time to_timestampInMS(time),
  order_position st_makePoint(lng, lat),
  attr1 attr1,
  attr2 attr2,
  attr3 attr3,
  attr4 attr4,
  attr5 attr5,
  attr6 attr6,
  attr7 attr7,
  attr8 attr8,
  attr9 attr9,
  attr10 attr10,
  attr11 attr11,
  attr12 attr12,
  attr13 attr13,
  attr14 attr14,
  attr15 attr15,
  attr16 attr16,
  attr17 attr17
);
        
3. Spatial range query:

SELECT
  order_time,
  order_position,
  attr12
FROM
  order_table
WHERE
  st_within(
    order_position,
    st_makebbox(116, 39, 116.5, 39.5)
  )
        
Users can display the result in a map view.
4. Spatio-temporal range query:

SELECT
  order_time,
  order_position,
  attr12
FROM
  order_table
WHERE
  st_within(
    order_position,
    st_makebbox(116, 39, 116.5, 39.5)
  )
  AND order_time >= '2018-10-01 00:00:00'
  AND order_time <= '2018-11-01 00:00:00'
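To make the predicate of the query above concrete, here is a small Python sketch of the same filter applied to in-memory rows. It assumes that the arguments of `st_makebbox` are ordered as (min longitude, min latitude, max longitude, max latitude) and that points exactly on the box boundary do not count as "within"; both are assumptions for this sketch, and the helper names are hypothetical.

```python
from datetime import datetime


def in_bbox(lng, lat, min_lng, min_lat, max_lng, max_lat):
    """True if the point lies strictly inside the bounding box (assumed semantics)."""
    return min_lng < lng < max_lng and min_lat < lat < max_lat


def matches(row, bbox, start, end):
    """Spatial box test combined with an inclusive time range, as in the query."""
    lng, lat = row["order_position"]
    return in_bbox(lng, lat, *bbox) and start <= row["order_time"] <= end


bbox = (116.0, 39.0, 116.5, 39.5)
start = datetime(2018, 10, 1)
end = datetime(2018, 11, 1)

rows = [
    {"order_time": datetime(2018, 10, 15, 8, 30), "order_position": (116.3, 39.2)},
    {"order_time": datetime(2018, 10, 15, 8, 30), "order_position": (117.0, 39.2)},
    {"order_time": datetime(2018, 11, 2, 0, 0), "order_position": (116.3, 39.2)},
]
selected = [r for r in rows if matches(r, bbox, start, end)]
print(len(selected))
```

Only the first row satisfies both the spatial and the temporal condition.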
        
5. K-NN query:

SELECT
  *
FROM
  order_table
WHERE
  st_knn(
    order_position,
    'POINT(115.71 39.57)',
    'common',
    2
  )
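As a reference for what a k-nearest-neighbor query returns, the following Python sketch performs a brute-force search with great-circle distances. It assumes that the last argument of `st_knn` (2 in the query above) is k; the meaning of `'common'` is not covered here. JUST presumably answers the query through its indexes rather than a full scan, so this is only a description of the expected result, with hypothetical function names.

```python
import heapq
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters


def haversine_m(lng1, lat1, lng2, lat2):
    """Great-circle distance in meters between two longitude/latitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def knn(points, query, k):
    """Return the k points closest to `query`; points are (lng, lat) tuples."""
    return heapq.nsmallest(
        k, points, key=lambda p: haversine_m(p[0], p[1], query[0], query[1])
    )


points = [(115.71, 39.57), (116.0, 39.9), (115.72, 39.57), (120.0, 30.0)]
print(knn(points, (115.71, 39.57), 2))
```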
        
6. Create a view:

CREATE VIEW order_view AS
SELECT
  *
FROM
  order_table
WHERE
  attr12 > 3
  AND st_within(
    order_position,
    st_makebbox(116, 39, 116.5, 39.5)
  )
LIMIT
  200
        
7. Store the view as a table:

STORE VIEW order_view TO TABLE order_table_small
        
8. Cluster the order data using the DBSCAN method:

SELECT
  st_dbscan('order_position', t1, 1, 1)
FROM
  (
    SELECT
      collect_list(struct(*)) AS t1
    FROM
      order_table_small
  )
        
The third and fourth parameters are minPts and radius, respectively: a point is a core point if at least minPts points lie within radius meters of it.
Users can then view the clustering results.
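The roles of minPts and radius can be sketched with a minimal textbook DBSCAN in Python. This is not the implementation behind `st_dbscan`: it assumes planar coordinates already expressed in meters, counts a point as its own neighbor, and uses a quadratic-time neighbor search. All names are hypothetical.

```python
import math


def region_query(points, i, radius):
    """Indices of all points within `radius` of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= radius]


def dbscan(points, min_pts, radius):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        neighbors = region_query(points, i, radius)
        if len(neighbors) < min_pts:
            labels[i] = noise  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, radius)
            if len(j_neighbors) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
labels = dbscan(points, min_pts=2, radius=1.5)
print(labels)
```

Under these assumptions, setting minPts to 1 (as in the query above) makes every point a core point, so no point is labeled as noise.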
9. Drop a view:

DROP VIEW order_view
        
10. Drop tables:

DROP TABLE order_table
        

DROP TABLE order_table_small
        

An Example of the Trajectory Data

1. Create a table:

CREATE TABLE trajectory_table (traj Trajectory)
        
2. Load data from HDFS:

LOAD hdfs:'/just_tutorial/trajectory_data' to JUST:trajectory_table (
    traj.oid 0,
    traj.time to_timestamp(3),
    traj.point st_makePoint(1,2)
);            
        
3. Spatial range query:

SELECT
  traj_linestring(traj), traj_starttime(traj)
FROM
  trajectory_table
WHERE
  st_within(
    traj_linestring(traj),
    st_makebbox(113.1, 23.2, 113.5, 23.6)
  )
        
Users can display the result in a map view.
4. Spatio-temporal range query:

SELECT
  *
FROM
  trajectory_table
WHERE
  st_within(
    traj_linestring(traj),
    st_makebbox(113.1, 23.2, 113.5, 23.6)
  )
  AND traj_starttime(traj) > '2014-03-01 00:00:00'
  AND traj_starttime(traj) < '2014-03-15 00:00:00'
        
5. K-NN query:

SELECT
  *
FROM
  trajectory_table
WHERE
  st_knn(
    traj,
    'POINT(115.71 39.57)',
    'common',
    2
  )
        
6. Create a view:

CREATE VIEW trajectory_view AS 
SELECT
  traj
FROM
  trajectory_table
WHERE
  st_within(
    traj_linestring(traj),
    st_makebbox(113.1, 23.2, 113.5, 23.6)
  )
        
7. Trajectory noise filter:

SELECT
  st_trajnoisefilter(
     traj,
    '{ "filterType": "COMPLEX_FILTER",
      "maxSpeedMeterPerSecond": 100.0,
      "segmenterParams": { "maxTimeIntervalInMinute": 60,
      "maxStayDistInMeter": 100,
      "minStayTimeInSecond": 100,
      "minTrajLengthInKM": 1,
      "segmenterType": "ST_DENSITY_SEGMENTER"}}'
  )
FROM
  trajectory_view       
        
The second parameter of st_trajNoiseFilter configures which filtering method is used. If it is omitted, the default method is used.
Users can also run this query on the distributed execution engine.
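To give an intuition for the `maxSpeedMeterPerSecond` setting, the Python sketch below applies a simple speed heuristic: a point is dropped when reaching it from the last kept point would require an implausibly high speed. This is not the `COMPLEX_FILTER` used in the query, whose internals are not described here. The sketch assumes points of the form (time in seconds, x in meters, y in meters) sorted by time, and the function name is hypothetical.

```python
import math


def speed_filter(points, max_speed_mps):
    """Keep points reachable from the previously kept point at <= max_speed_mps."""
    if not points:
        return []
    kept = [points[0]]  # the first point is trusted by assumption
    for t, x, y in points[1:]:
        t0, x0, y0 = kept[-1]
        dt = t - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        speed = math.dist((x0, y0), (x, y)) / dt
        if speed <= max_speed_mps:
            kept.append((t, x, y))
    return kept


trajectory = [(0, 0, 0), (10, 100, 0), (20, 5000, 0), (30, 300, 0)]
cleaned = speed_filter(trajectory, max_speed_mps=100.0)
print(cleaned)
```

The third point would require 490 m/s from its predecessor and is therefore removed, while the fourth point is checked against the last kept point.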
8. Trajectory segmentation:

SELECT
  st_trajsegmentation(
    traj,
    '{ "maxTimeIntervalInMinute": 10,
      "maxStayDistInMeter": 100,
      "minStayTimeInSecond": 100,
      "minTrajLengthInKM": 1,
      "segmenterType": "HYBRID_SEGMENTER"}'
  )
FROM
  trajectory_view
        
The second parameter of st_trajSegmentation configures which segmentation method is used. If it is omitted, the default method is used.
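One ingredient of trajectory segmentation is splitting at long temporal gaps, which is what a parameter such as `maxTimeIntervalInMinute` suggests. The Python sketch below shows only that time-gap rule; the `HYBRID_SEGMENTER` in the query presumably combines further criteria that are not reproduced here. Points are assumed to be tuples whose first element is a time in seconds, and the function name is hypothetical.

```python
def split_by_time_gap(points, max_gap_minutes):
    """Split a time-sorted trajectory wherever consecutive points are too far apart in time."""
    segments, current = [], []
    for p in points:
        # Close the running segment when the gap to the previous point is too long.
        if current and p[0] - current[-1][0] > max_gap_minutes * 60:
            segments.append(current)
            current = []
        current.append(p)
    if current:
        segments.append(current)
    return segments


trajectory = [(0,), (60,), (120,), (1000,), (1060,)]
segments = split_by_time_gap(trajectory, max_gap_minutes=10)
print(segments)
```

The gap of 880 seconds between the third and fourth points exceeds 10 minutes, so the trajectory is split into two segments.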
9. Trajectory stay point:

SELECT
  st_trajStayPoint(
    traj,
    '{ "maxStayDistInMeter": 10,
       "minStayTimeInSecond": 60,
       "stayPointType": "CLASSIC_DETECTOR"}'
  )
FROM
  trajectory_view
        
The second parameter of st_trajStayPoint configures which stay point detection method is used. If it is omitted, the default method is used.
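The two thresholds in the configuration can be read as follows: a stay point is a run of consecutive points that remain within `maxStayDistInMeter` of an anchor point for at least `minStayTimeInSecond`. The Python sketch below follows this widely used anchor-based scheme; whether `CLASSIC_DETECTOR` behaves exactly this way is an assumption. Points are assumed to be (time in seconds, x in meters, y in meters) tuples sorted by time, and the function name is hypothetical.

```python
import math


def detect_stay_points(points, max_dist_m, min_time_s):
    """Return stay points as (arrive_time, leave_time, center_x, center_y)."""
    stays = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        # Extend the run while points stay within max_dist_m of the anchor points[i].
        while j < n and math.dist(points[i][1:], points[j][1:]) <= max_dist_m:
            j += 1
        if points[j - 1][0] - points[i][0] >= min_time_s:
            group = points[i:j]
            cx = sum(p[1] for p in group) / len(group)
            cy = sum(p[2] for p in group) / len(group)
            stays.append((group[0][0], group[-1][0], cx, cy))
            i = j  # continue after the detected stay
        else:
            i += 1  # no stay anchored here; try the next point
    return stays


trajectory = [
    (0, 0, 0), (30, 2, 0), (60, 4, 0), (90, 3, 1),
    (120, 500, 500), (150, 1000, 1000),
]
stays = detect_stay_points(trajectory, max_dist_m=10, min_time_s=60)
print(stays)
```

The first four points stay within 10 meters of the anchor for 90 seconds and form one stay point centered at their mean position.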
10. Store the stay point detection result:

CREATE VIEW stay_point_view AS 
SELECT
  st_trajStayPoint(
    traj,
    '{ "maxStayDistInMeter": 10,
       "minStayTimeInSecond": 60,
       "stayPointType": "CLASSIC_DETECTOR"}'
  )
FROM
  trajectory_view       
        

STORE VIEW stay_point_view TO TABLE stay_point_table
        
11. Drop views:

DROP VIEW trajectory_view
        

DROP VIEW stay_point_view
        
12. Drop tables:

DROP TABLE trajectory_table
        

DROP TABLE stay_point_table
        

An Example of the Road Network Data

1. Create tables:

CREATE TABLE guiyang_rn (road roadSegment)
        

CREATE TABLE guiyang_traj_table (traj Trajectory)
        
2. Load data from HDFS:

LOAD hdfs: '/just_test_lhy/data/roadnetwork/guiyang_rn.csv' TO just: guiyang_rn(
  road.oid oid,
  road.direction direction,
  road.speed_limit speed_limit,
  road.level level,
  road.geom st_linefromtext(geom)
) WITH ("just.separator" = "|", "just.header" = "true")
        

LOAD hdfs :'/just_tutorial/guiyang_traj.csv' TO JUST :guiyang_traj_table (
  traj.oid oid,
  traj.time to_timestamp(time),
  traj.point st_makePoint(lng, lat)
) WITH ("just.separator" = "|", "just.header" = "true")
        
3. Spatial range query:

SELECT
  *
FROM
  guiyang_rn
WHERE
  st_within(
    roadsegment_linestring(road),
    st_makebbox(106.674686,26.635553,106.712055,26.667067)
  )
        
4. Trajectory map matching:

SELECT
  st_trajLmmMapMatchToProjection(t1.traj, t2.t)
FROM
  guiyang_traj_table t1,
  (
    SELECT
      st_makeRoadNetwork(collect_list(road)) AS t
    FROM
      guiyang_rn
  ) AS t2
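The geometric core of projecting a GPS point onto a road network can be sketched as snapping the point to the closest position on the nearest road segment. The Python sketch below shows only this step, in planar coordinates; it is not the algorithm behind `st_trajLmmMapMatchToProjection`, and practical map matching also uses the order of points in the trajectory and the network topology. All names are hypothetical.

```python
import math


def project_onto_segment(p, a, b):
    """Closest point to p on the segment from a to b (planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a  # degenerate segment: both endpoints coincide
    # Parameter of the orthogonal projection, clamped to stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_network(p, segments):
    """Return (segment_index, projected_point) for the segment nearest to p."""
    candidates = (
        (i, project_onto_segment(p, a, b)) for i, (a, b) in enumerate(segments)
    )
    return min(candidates, key=lambda item: math.dist(p, item[1]))


roads = [((0, 0), (10, 0)), ((0, 5), (10, 5))]
print(snap_to_network((3, 1), roads))
print(snap_to_network((3, 4), roads))
```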
        
5. Drop tables:

DROP TABLE guiyang_rn
        

DROP TABLE guiyang_traj_table