Http
Http source connector
Description
Used to read data from Http.
Key features
Supported DataSource Info
Http
universal
Source Options
| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| url | String | Yes | - | Http request url. |
| schema | Config | No | - | Http and Nexus data structure mapping. |
| schema.fields | Config | No | - | The schema fields of upstream data. |
| json_field | Config | No | - | Helps you configure the schema, so it must be used together with schema. |
| pageing | Config | No | - | Used for paging queries. |
| pageing.page_field | String | No | - | Specifies the name of the page field in the request parameters. |
| pageing.total_page_size | Int | No | - | Controls the total number of pages. |
| pageing.batch_size | Int | No | - | The batch size returned per request. Used to decide whether to continue when the total number of pages is unknown. |
| content_field | String | No | - | Extracts part of the JSON response. For example, if you only need the data in the 'book' section, configure content_field = "$.store.book.*". |
| format | String | No | text | The format of upstream data. Only json and text are supported; the default is text. |
| method | String | No | get | Http request method. Only GET and POST are supported. |
| headers | Map | No | - | Http headers. |
| params | Map | No | - | Http params. The program automatically adds the http header application/x-www-form-urlencoded. |
| body | String | No | - | Http body, given as a JSON string. The program automatically adds the http header application/json. |
| poll_interval_millis | Int | No | - | Interval (in milliseconds) between requests to the http api in stream mode. |
| retry | Int | No | - | The maximum number of retries if the http request throws an IOException. |
| retry_backoff_multiplier_ms | Int | No | 100 | The retry-backoff time multiplier (in milliseconds) if the http request fails. |
| retry_backoff_max_ms | Int | No | 10000 | The maximum retry-backoff time (in milliseconds) if the http request fails. |
| enable_multi_lines | Boolean | No | false | |
| connect_timeout_ms | Int | No | 12000 | Connection timeout setting, default 12s. |
| socket_timeout_ms | Int | No | 60000 | Socket timeout setting, default 60s. |
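As an illustration of how several of the options listed above combine, here is a minimal sketch of a POST source with custom headers, a JSON body, and explicit retry and timeout settings. The URL, the header, the body content, the schema fields, and the retry count of 3 are hypothetical values chosen for the example; only the option names and the backoff and timeout defaults come from the option list.

```hocon
source {
  Http {
    # Hypothetical endpoint, for illustration only
    url = "http://localhost:8080/api/search"
    method = "POST"
    format = "json"
    headers = {
      Accept = "application/json"
    }
    # body is a JSON string; the application/json header is added automatically
    body = "{\"keyword\": \"book\"}"
    # Retry up to 3 times on IOException (illustrative value)
    retry = 3
    # Backoff and timeout values below are the documented defaults
    retry_backoff_multiplier_ms = 100
    retry_backoff_max_ms = 10000
    connect_timeout_ms = 12000
    socket_timeout_ms = 60000
    # Hypothetical response schema
    schema = {
      fields {
        code = int
        data = string
      }
    }
  }
}
```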
How to Create an Http Data Synchronization Job
env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  Http {
    result_table_name = "http"
    url = "http://mockserver:1080/example/http"
    method = "GET"
    format = "json"
    schema = {
      fields {
        c_map = "map<string, string>"
        c_array = "array<int>"
        c_string = string
        c_boolean = boolean
        c_tinyint = tinyint
        c_smallint = smallint
        c_int = int
        c_bigint = bigint
        c_float = float
        c_double = double
        c_bytes = bytes
        c_date = date
        c_decimal = "decimal(38, 18)"
        c_timestamp = timestamp
        c_row = {
          C_MAP = "map<string, string>"
          C_ARRAY = "array<int>"
          C_STRING = string
          C_BOOLEAN = boolean
          C_TINYINT = tinyint
          C_SMALLINT = smallint
          C_INT = int
          C_BIGINT = bigint
          C_FLOAT = float
          C_DOUBLE = double
          C_BYTES = bytes
          C_DATE = date
          C_DECIMAL = "decimal(38, 18)"
          C_TIMESTAMP = timestamp
        }
      }
    }
  }
}
# Print the data read from Http to the console
sink {
  Console {
    parallelism = 1
  }
}
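The job above runs once in batch mode. For polling an endpoint repeatedly, a sketch along the following lines may work. It assumes the engine accepts job.mode = "STREAMING" as the stream-mode counterpart of "BATCH" (that value does not appear in this document), and it uses poll_interval_millis to set the interval between requests. The 5000 ms interval and the three schema fields are illustrative.

```hocon
env {
  parallelism = 1
  # Assumption: "STREAMING" selects stream mode
  job.mode = "STREAMING"
}
source {
  Http {
    result_table_name = "http"
    url = "http://mockserver:1080/example/http"
    method = "GET"
    format = "json"
    # Request the http api every 5 seconds (illustrative value)
    poll_interval_millis = 5000
    schema = {
      fields {
        code = int
        data = string
        success = boolean
      }
    }
  }
}
sink {
  Console {
    parallelism = 1
  }
}
```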
Parameter Interpretation
format
When format is set to json, you must also set the schema option. For example, if the upstream data is the following:
{
  "code": 200,
  "data": "get success",
  "success": true
}
you should set schema as follows:
schema {
  fields {
    code = int
    data = string
    success = boolean
  }
}
The connector will generate the following data:
| code | data | success |
|------|------|---------|
| 200 | get success | true |
When format is set to text, the connector leaves the upstream data unchanged. For example, if the upstream data is the following:
{
  "code": 200,
  "data": "get success",
  "success": true
}
the connector will generate the following data:
{"code": 200, "data": "get success", "success": true}
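For comparison, a minimal sketch of a source that keeps the raw response by using format = "text". It assumes that no schema is needed in this mode, since the connector passes the upstream data through unchanged. The URL is reused from the job example purely for illustration.

```hocon
source {
  Http {
    url = "http://mockserver:1080/example/http"
    method = "GET"
    # text is the default format, so this line could also be omitted
    format = "text"
  }
}
```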
content_field
This parameter extracts part of the JSON response. For example, if you only need the data in the 'book' section, configure content_field = "$.store.book.*".
If the returned data looks like this:
{
  "store": {
    "book": [
      {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  },
  "expensive": 10
}
You can configure content_field = "$.store.book.*" and the returned result looks like this:
[
  {
    "category": "reference",
    "author": "Nigel Rees",
    "title": "Sayings of the Century",
    "price": 8.95
  },
  {
    "category": "fiction",
    "author": "Evelyn Waugh",
    "title": "Sword of Honour",
    "price": 12.99
  }
]
Then you can get the desired result with a simpler schema, like:
Http {
  url = "http://mockserver:1080/contentjson/mock"
  method = "GET"
  format = "json"
  content_field = "$.store.book.*"
  schema = {
    fields {
      category = string
      author = string
      title = string
      price = string
    }
  }
}
Here is an example:
Test data can be found in mockserver-config.json.
The task configuration can be found in http_contentjson_to_assert.conf.
json_field
This parameter helps you configure the schema, so it must be used together with schema.
If your data looks like this:
{
  "store": {
    "book": [
      {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  },
  "expensive": 10
}
You can get the contents of 'book' by configuring the task as follows:
source {
  Http {
    url = "http://mockserver:1080/jsonpath/mock"
    method = "GET"
    format = "json"
    json_field = {
      category = "$.store.book[*].category"
      author = "$.store.book[*].author"
      title = "$.store.book[*].title"
      price = "$.store.book[*].price"
    }
    schema = {
      fields {
        category = string
        author = string
        title = string
        price = string
      }
    }
  }
}
Test data can be found in mockserver-config.json.
The task configuration can be found in http_jsonpath_to_assert.conf.
pageing
source {
  Http {
    url = "http://localhost:8080/mock/queryData"
    method = "GET"
    format = "json"
    params = {
      page: "${page}"
    }
    content_field = "$.data.*"
    pageing = {
      total_page_size = 20
      page_field = page
      # When total_page_size is unknown, use batch_size instead:
      # if a request returns fewer than batch_size records, paging finishes; otherwise it continues.
      #batch_size = 10
    }
    schema = {
      fields {
        name = string
        age = string
      }
    }
  }
}
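When the total number of pages is not known in advance, the commented-out batch_size option in the example can take the place of total_page_size. The following sketch shows that variant. It reuses the URL, params, and schema from the example, and the value 10 is illustrative.

```hocon
source {
  Http {
    url = "http://localhost:8080/mock/queryData"
    method = "GET"
    format = "json"
    params = {
      page: "${page}"
    }
    content_field = "$.data.*"
    pageing = {
      page_field = page
      # No total_page_size: paging finishes once a request
      # returns fewer than batch_size records
      batch_size = 10
    }
    schema = {
      fields {
        name = string
        age = string
      }
    }
  }
}
```

With this configuration, paging would continue while each response contains 10 records and finish at the first shorter response.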