Publishing
The simplest way to start publishing to Hermes is to send a POST request to the topic endpoint:
curl -X POST -H "Content-Type: application/json" http://hermes-frontend/topics/group.topic -d '{"message": "Hello world!"}'
Creating group
As the data model describes, topics are gathered into topic groups. If you don't have a group yet,
you need to create one by sending a POST request with the application/json content type to /groups.
The request body must contain the groupName field, which is the name of the group.
Sample request:
{
"groupName": "my.group"
}
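As an illustration, the group creation request can be assembled in Python using only the standard library. This is a minimal sketch: the base URL `http://hermes-management` and the helper name `build_create_group_request` are assumptions made for this example, not part of Hermes. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical base URL of the Hermes Management API; replace with your own.
MANAGEMENT_URL = "http://hermes-management"


def build_create_group_request(group_name: str) -> urllib.request.Request:
    """Build (but do not send) a POST /groups request for the given group name."""
    body = json.dumps({"groupName": group_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{MANAGEMENT_URL}/groups",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_group_request("my.group")
# To actually send it against a running instance:
# urllib.request.urlopen(request)
```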
Creating topic
Use the Hermes Management REST API to create a topic by sending a POST request with the application/json content type to the topics resource:
/topics
Request body must contain at least:
- name: fully qualified name of topic including group name, separated with a dot (see: naming convention)
- description: topic description
- contentType: format of data sent to Kafka, either JSON or AVRO
- retentionTime: time to keep data in Kafka, in days
- owner: the owner of this topic, given as a source and an id. The id is an identifier meaningful in a given source implementation. The default implementation is Plaintext, where id is simply a label.
Minimal request:
{
"name": "my.group.my-topic",
"description": "This is my topic",
"contentType": "JSON",
"retentionTime": {
"duration": 1
},
"owner": {
"source": "Plaintext",
"id": "My Team"
}
}
Other options:
Option | Description | Options | Default value |
---|---|---|---|
ack | acknowledgement level | ALL, LEADER | LEADER |
trackingEnabled | track incoming messages? | - | false |
Request that specifies all available options:
{
"name": "my.group.my-topic",
"description": "This is my topic",
"ack": "LEADER",
"retentionTime": {
"duration": 1
},
"trackingEnabled": false,
"contentType": "JSON",
"owner": {
"source": "Plaintext",
"id": "My Team"
}
}
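The same request body can be assembled programmatically. The sketch below is illustrative only: `build_topic_payload`, `build_create_topic_request` and the `http://hermes-management` base URL are hypothetical names chosen for this example, and the checks simply mirror the value sets listed above (JSON or AVRO, ALL or LEADER). The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical base URL of the Hermes Management API; replace with your own.
MANAGEMENT_URL = "http://hermes-management"


def build_topic_payload(name, description, owner_id, *, content_type="JSON",
                        retention_days=1, owner_source="Plaintext",
                        ack="LEADER", tracking_enabled=False):
    """Assemble the JSON body for POST /topics, checking the documented value sets."""
    if content_type not in ("JSON", "AVRO"):
        raise ValueError(f"contentType must be JSON or AVRO, got {content_type!r}")
    if ack not in ("ALL", "LEADER"):
        raise ValueError(f"ack must be ALL or LEADER, got {ack!r}")
    if "." not in name:
        raise ValueError("name must be fully qualified: <group>.<topic>")
    return {
        "name": name,
        "description": description,
        "ack": ack,
        "retentionTime": {"duration": retention_days},
        "trackingEnabled": tracking_enabled,
        "contentType": content_type,
        "owner": {"source": owner_source, "id": owner_id},
    }


def build_create_topic_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /topics request."""
    return urllib.request.Request(
        url=f"{MANAGEMENT_URL}/topics",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_topic_payload("my.group.my-topic", "This is my topic", "My Team")
request = build_create_topic_request(payload)
```

Validating the enumerated values on the client side is a design choice for this sketch; Hermes performs its own validation regardless.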
Message format
Each topic has a defined content type that describes the format of data sent to Kafka.
Currently Hermes supports messages sent in JSON and AVRO.
JSON
When a topic has its content type set to JSON, it accepts messages in JSON format and stores them as JSON in Kafka.
This mode is intended for simple use cases and offers no validation of published messages.
Avro
Avro is the recommended message format for topics in Hermes. It has many advantages over plain JSON: for example, it has built-in message validation (against a defined schema) and lowers the volume of data sent to Kafka.
See the detailed documentation on publishing messages in Avro format.
Response format
Message Id
The response contains a special header: Hermes-Message-Id. This is the event UUID generated by Hermes, which can be
used to track how the event flowed through the system.
Response codes
There are two possible response status codes that represent success:
- 201 Created - event received and acknowledged by Kafka
- 202 Accepted - event has not been acknowledged by Kafka, Hermes is buffering it and will try to deliver ASAP
Failure statuses:
- 400 Bad Request - message did not pass validation (see docs about each data format for more)
- 404 Not Found - topic does not exist
- 408 Request Timeout - message was not sent to Hermes within timeout (took too much time on the network)
- 500 Internal Server Error - something went terribly wrong
- 503 Service Unavailable - node is in shutdown mode
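A client might translate these status codes into a small result object. The following is a sketch under assumptions: `PublishResult` and `interpret_publish_response` are hypothetical names, and the descriptions are taken from the lists above. Header names are matched case-insensitively, as HTTP requires.

```python
from typing import Mapping, NamedTuple, Optional


class PublishResult(NamedTuple):
    status: int
    acknowledged: bool         # True only for 201: Kafka acknowledged the event
    buffered: bool             # True only for 202: Hermes is buffering the event
    message_id: Optional[str]  # value of the Hermes-Message-Id header, if present
    detail: str


# Failure descriptions, as listed in this section.
_FAILURES = {
    400: "message did not pass validation",
    404: "topic does not exist",
    408: "message was not sent to Hermes within timeout",
    500: "internal server error",
    503: "node is in shutdown mode",
}


def interpret_publish_response(status: int, headers: Mapping[str, str]) -> PublishResult:
    """Classify a publish response by its status code and extract the message id."""
    lowered = {key.lower(): value for key, value in headers.items()}
    message_id = lowered.get("hermes-message-id")
    if status == 201:
        return PublishResult(status, True, False, message_id,
                             "event received and acknowledged by Kafka")
    if status == 202:
        return PublishResult(status, False, True, message_id,
                             "event buffered by Hermes, not yet acknowledged by Kafka")
    detail = _FAILURES.get(status, "unexpected status code")
    return PublishResult(status, False, False, message_id, detail)
```

Keeping 201 and 202 distinct lets callers decide whether a buffered event is good enough for their durability needs.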
Acknowledgment level
Each topic can define its level of acknowledgement (ACK):
- leader ACK - only one Kafka node (the leader) needs to acknowledge receipt of the message
- all ACK - at least min.insync.replicas nodes must acknowledge receipt of the message
ACK configuration has the following consequences:
- with ACK leader, message writes are replicated asynchronously, so the acknowledgment latency is low. However, a message write may be lost when there is a topic leadership change, e.g. due to a rebalance or broker restart.
- with ACK all, message writes are synchronously replicated to replicas. Write acknowledgement latency is much higher than with leader ACK, and it also has higher variance due to tail latency. However, messages will be persisted as long as the whole replica set does not go down simultaneously.
Publishers are advised to select topic ACK level based on their latency and durability requirements.
Hermes also provides a feature called Buffering (described in the paragraphs below) which provides consistent write latency
despite long Kafka response times. Note, however, that this mode may decrease message durability for the ACK all setting.
Buffering [deprecated]
A Hermes administrator can set the maximum time for which Hermes will wait for Kafka acknowledgment. By default, it is set to 65ms. After that time, a 202 response is sent to the client. The event is kept in the Kafka producer buffer and its delivery will be retried until successful.
This makes Hermes resilient to Kafka malfunctions or hiccups, and we are able to guarantee a maximum response time to clients. Also, in case of a Kafka cluster failure, Hermes is able to receive incoming events and send them when Kafka is back online.
Buffer persistence
By default, events are buffered in memory only. This raises the question of what happens in case of a Hermes node failure (or a forced kill of the process). The Hermes Frontend API exposes callbacks that can be used to implement a persistence model for buffered events.
The default implementation uses OpenHFT ChronicleMap to persist unsent messages to disk. The map structure is continuously persisted to disk, as it is stored in off-heap memory as a memory-mapped file.
Using buffering with the ACK all setting means that the durability of events may be lowered when a 202 status code is received. If the Hermes instance
is killed before the message is spilled to disk, or the data on disk becomes corrupted, the message is gone. Thus ACK all with a 202 status code
is similar to ACK leader, because a single node failure could cause the message to be lost.
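The interaction between ACK level and response status can be summarized as a small lookup. This is only a sketch of the reasoning above: the function name `durability_after_publish` and the labels it returns are invented for this example and do not correspond to anything in Hermes.

```python
def durability_after_publish(ack: str, status: int) -> str:
    """Describe where a successfully published event is safe, per the text above.

    Returns one of three illustrative labels:
      "replica-set"   - lost only if the whole replica set goes down simultaneously
      "leader-only"   - may be lost on a topic leadership change
      "hermes-buffer" - may be lost if the Hermes instance is killed before the
                        message is spilled to disk, or the disk data is corrupted
    """
    if ack not in ("ALL", "LEADER"):
        raise ValueError(f"ack must be ALL or LEADER, got {ack!r}")
    if status == 202:
        # Buffered by Hermes: the topic ACK level has not taken effect yet.
        return "hermes-buffer"
    if status == 201:
        return "replica-set" if ack == "ALL" else "leader-only"
    raise ValueError(f"{status} is not a success status")
```

For example, `durability_after_publish("ALL", 202)` yields the same single-node exposure as a buffered `LEADER` publish, which is the point made in the paragraph above.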
Deprecation notice
The buffering mechanism in Hermes is considered deprecated and is set to be removed in the future.
Remote DC fallback
Hermes supports a remote datacenter fallback mechanism for multi datacenter deployments.
Fallback is configured on a per-topic basis, using the fallbackToRemoteDatacenterEnabled property:
PUT /topics/my.group.my-topic
{
"fallbackToRemoteDatacenterEnabled": true
}
Using this setting automatically disables the buffering mechanism for a topic.
When this setting is enabled for a topic, Hermes will try to send a message to the local datacenter Kafka first and will fall back to the remote datacenter Kafka if the local send fails.
Hermes also provides a speculative fallback mechanism, which sends messages to the remote Kafka if the local Kafka is not responding in a timely manner.
A speculative send is performed after frontend.kafka.fail-fast-producer.speculativeSendDelay elapses.
When using remote DC fallback, Hermes attempts to send a message to Kafka for the duration of the frontend.handlers.maxPublishRequestDuration property. If, after maxPublishRequestDuration,
Hermes has not received an acknowledgment from Kafka, it will respond to the client with a 500 status code.
The table below summarizes remote fallback configuration options:
Option | Scope | Default value |
---|---|---|
fallbackToRemoteDatacenterEnabled | topic | false |
frontend.kafka.fail-fast-producer.speculativeSendDelay | global | 250ms |
frontend.handlers.maxPublishRequestDuration | global | 500ms |
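To make the timing concrete, the speculative path can be modelled as a toy timeline. This is a simplified sketch, not Hermes code: the function `simulate_publish` and its parameters are hypothetical, it assumes a successful publish is reported as 201 (acknowledged by Kafka), it treats an acknowledgment arriving exactly at the deadline as on time, and it does not model an explicit local send failure before the speculative delay.

```python
from typing import Optional, Tuple


def simulate_publish(
    local_ack_ms: Optional[int],       # when local Kafka acks; None = never
    remote_latency_ms: Optional[int],  # remote ack latency once sent; None = never
    speculative_send_delay_ms: int = 250,
    max_publish_request_duration_ms: int = 500,
) -> Tuple[int, Optional[str]]:
    """Return (status code, datacenter that acknowledged first or None)."""
    candidates = []
    if local_ack_ms is not None:
        candidates.append((local_ack_ms, "local"))

    # The speculative remote send starts only if the local Kafka has not
    # acknowledged by the time the speculative delay elapses.
    local_pending = local_ack_ms is None or local_ack_ms > speculative_send_delay_ms
    if local_pending and remote_latency_ms is not None:
        candidates.append((speculative_send_delay_ms + remote_latency_ms, "remote"))

    # Only acknowledgments that arrive within the request deadline count.
    in_time = [c for c in candidates if c[0] <= max_publish_request_duration_ms]
    if not in_time:
        return 500, None
    _, datacenter = min(in_time)
    return 201, datacenter
```

With the default values from the table, a local Kafka that never responds and a remote Kafka that acknowledges in 100ms gives an acknowledgment at 350ms, within the 500ms limit. A remote latency of 300ms would push it to 550ms and the client would receive a 500.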
Partition assignment
The Partition-Key header can be used by publishers to specify the Kafka key
that will be used for partition assignment of a message. This ensures
that all messages with a given Partition-Key are sent to the same Kafka partition.
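A publish request carrying this header could be built as follows. This is a minimal sketch using only the Python standard library: the helper name `build_publish_request` and the key value `user-42` are made up for illustration, while the `http://hermes-frontend` host is the one from the curl example at the top of this page. The request is built but not sent.

```python
import json
import urllib.request
from typing import Optional

# Frontend host as used in the curl example at the top of this page.
FRONTEND_URL = "http://hermes-frontend"


def build_publish_request(topic: str, message: dict,
                          partition_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /topics/<topic> publish request."""
    headers = {"Content-Type": "application/json"}
    if partition_key is not None:
        # Messages sharing this key are sent to the same Kafka partition.
        headers["Partition-Key"] = partition_key
    return urllib.request.Request(
        url=f"{FRONTEND_URL}/topics/{topic}",
        data=json.dumps(message).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_publish_request(
    "group.topic", {"message": "Hello world!"}, partition_key="user-42"
)
# To actually send it against a running instance:
# urllib.request.urlopen(request)
```

Note that `urllib.request` normalizes the capitalization of header names it stores (for example `Partition-key`); HTTP header names are case-insensitive, so this does not change the meaning of the request.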