| HTTP SUBSCRIBE Method | August 2024 | |
| Wright | Experimental | [Page] |
The SUBSCRIBE HTTP method allows clients to request the history of changes to a resource and to receive further changes in real time.¶
The SUBSCRIBE method offers a way to retrieve a list of changes to a resource, including changes as they are made. In the same way that PUT is a reverse GET (you use GET to retrieve the previous PUT), PATCH [RFC5789] may be thought of as a reverse SUBSCRIBE (a SUBSCRIBE retrieves a journal of PATCH messages).¶
A resource that supports the SUBSCRIBE method is said to have an underlying journal, which is an append-only list of changes made to it over time. This journal does not necessarily have a definite length, which means the journal representation may not have an end, in which case the response will be held open to write new events as they become available.¶
The journal may be implemented in a variety of ways and may be represented in a variety of media types suitable to the client. The server is allowed to forget parts of the journal, or restart the journal from the beginning, as it desires.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document uses ABNF as defined in [RFC5234] and imports grammar rules from [RFC7230].¶
For brevity, example HTTP messages may add folding whitespace, or omit some headers necessary for message transfer.¶
This section to be removed before publication.¶
Get a history of changes to the current resource.¶
Immediately know when the resource on a server changed.¶
Get a list of changes from the previous request through now.¶
Receive changes in a form preferred by the client.¶
Allow responses in a variety of media types to suit different needs.¶
Interoperability with existing caches and intermediaries.¶
Support for incremental (changes since previous request) or streaming (real-time) updates.¶
The SUBSCRIBE method is used to request a continuous stream of changes to a resource as they occur. The primary use for SUBSCRIBE is for clients to monitor and react to changes in real-time.¶
A resource that supports the SUBSCRIBE method is said to have an underlying journal, which is a list of changes associated with the resource over its existence. This journal has no definite length, so that future changes to the resource may be "written to the journal" and sent to the client without causing a "change" (new data is streamed from the journal, but it does not amount to a change for caching purposes, since none of the previously sent content, nor the position of the end, has changed from a previous request). The journal is an abstract resource, but may be notated in any suitable media type that can describe changes to a resource. Such an encoding of the journal is called the journal representation.¶
A journal can itself be a resource with its own URI. In this case, a GET to the journal's URI is the same as a SUBSCRIBE to the resource keeping that journal. Because the journal is append-only, there need not be any recursion; the journal is its own journal.¶
A deleted resource may still have a journal, allowing for the possibility that a 410 (Gone) response to a GET request may still have a 2xx response to SUBSCRIBE. This is consistent with the fact that a resource and its underlying journal may be two separate resources, with different URIs. However, the journal may be wiped for any reason, including a DELETE or an overwritten resource. A 404 (Not Found), by contrast, implies the resource has no associated journal, and a SUBSCRIBE request will also return 404.¶
The server is allowed to "forget" portions of a journal, in which case the response will be a 206 Partial Content response. A client may indicate it does not care about changes after a particular point, or it can request changes only after a particular point.¶
The journal may be implemented in a variety of ways suitable to the origin. The journal may be the complete record of all changes to a resource since its creation. The journal may also be a "shift buffer" that only keeps a certain number of changes, based on time, events, memory size, or other metrics. The journal may even be simply a list of timestamps or single bytes that signal the resource changed, without conveying any other information.¶
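As a non-normative illustration, the "shift buffer" variant could be sketched as follows. The class name `Journal`, the `max_entries` retention policy, and the `read_from` helper are assumptions made for this example and are not defined by this document. Offsets are absolute positions in the journal, so they stay stable after older changes are forgotten.

```python
from collections import deque


class Journal:
    """Hypothetical append-only journal that keeps only the newest changes."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries  # assumed retention policy
        self.entries = deque()          # (absolute start offset, bytes) pairs
        self.end = 0                    # offset one past the last byte written

    @property
    def start(self) -> int:
        """Absolute offset of the oldest byte that is still remembered."""
        return self.entries[0][0] if self.entries else self.end

    def append(self, change: bytes) -> None:
        self.entries.append((self.end, change))
        self.end += len(change)
        while len(self.entries) > self.max_entries:
            self.entries.popleft()      # "forget" the oldest change

    def read_from(self, offset: int) -> tuple[int, bytes]:
        """Return (first_offset, data) for remembered bytes at or after offset."""
        first = max(offset, self.start)
        out = bytearray()
        for entry_start, change in self.entries:
            if entry_start + len(change) <= first:
                continue                # entirely before the requested point
            out += change[max(0, first - entry_start):]
        return first, bytes(out)
```

A caller that asks for offset 0 after the first entries were dropped gets back a later `first_offset`, which is the signal that part of the journal has been forgotten.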
The journal normally has no length, and therefore no end. However, a resource, once deleted, closes the journal, which then becomes finitely long; this closes all responses currently watching the journal.¶
A SUBSCRIBE request issued to the target resource establishes a monitoring connection. The request may include Range headers to specify a subset of the journal.¶
The SUBSCRIBE method uses the following HTTP header fields:¶
The Accept request header lists the media types that the client is willing to receive.¶
The Range request header specifies a byte range to request a specific subset of the journal. If no end is provided, the response remains open to stream new events as they become available.¶
To indicate that the client is only interested in recent changes to a resource, the client may make "partial content" Range requests.¶
The If-Range request header is used to download only new additions to the journal, or the entire journal if it has been reset.¶
TODO: Also specify a header that provides the entire original resource in this situation, if you don't need the entire change history, but only the current version plus updates. Also consider ways to provide the current version of a resource (the current length of different journal representations).¶
A successful SUBSCRIBE request results in a response that is a representation of unknown length. The connection remains open, streaming updates to the client until the server determines that no further changes will be made.¶
200 (OK): The subscription was successfully established, and the initial part of the journal is included in the response.¶
206 (Partial Content): The request contained a Range header, and the specified subset of the journal is provided in the response.¶
416 (Range Not Satisfiable): The Range specified in the request is not applicable or out of bounds for the resource's journal.¶
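The following non-normative sketch shows one policy an origin might use to choose among these status codes. The function name `choose_status` and its parameters are hypothetical. It also makes an assumption this document does not settle: a range that starts inside a forgotten portion of the journal is treated as out of bounds (416), while a range that starts exactly at the current end is accepted, since new bytes may still arrive.

```python
from typing import Optional


def choose_status(range_first: Optional[int],
                  journal_start: int,
                  journal_end: int) -> int:
    """Pick a status code for a SUBSCRIBE request (assumed policy).

    range_first   -- first byte requested, or None if no Range header was sent
    journal_start -- offset of the oldest byte the server still remembers
    journal_end   -- offset one past the newest byte written so far
    """
    if range_first is None:
        # No Range: 200 for the whole journal, 206 if its beginning was forgotten.
        return 200 if journal_start == 0 else 206
    if range_first < journal_start or range_first > journal_end:
        return 416
    return 206
```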
The Content-Type response header specifies the media type of the representation that was negotiated.¶
The Content-Range response header specifies that the attached body is a range of the journal.¶
The Content-Location response header, if present, provides a URI for the journal itself. In this case, a SUBSCRIBE request is the same as a GET request to the journal. However, see Content Negotiation for specifics [RFC9110].¶
If content negotiation was used to select from among different media types, then the Vary response header must list Accept.¶
The ETag of the journal is established when it is created and does not change unless the journal is recreated. All journal resources MUST have an ETag.¶
As desired, the server may "forget" all history and truncate the journal to empty. In this case, all existing connections to the resource must close, and the ETag of the resource, if any, MUST change.¶
The Last-Modified field of a resource could mean a few different things. The most useful behavior may be to describe the creation date of the journal rather than the last write (since writes are not really "modifications"; appends to the journal signal writes).¶
A SUBSCRIBE request establishes a connection where the server sends a stream of updates to the client. Each update in the stream corresponds to a change in the resource, formatted as a diff or patch document.¶
When the media types of the resource before and after a change are the same, the server emits a PATCH method verbatim on all active SUBSCRIBE connections.¶
The server continuously sends these updates as long as changes are made to the resource or until the connection is explicitly closed by the server.¶
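One way to picture this fan-out, as a non-normative sketch: each open SUBSCRIBE response owns a queue, and every accepted PATCH body is copied to all of them. `Broadcaster`, its methods, and the use of `None` as an end-of-stream marker are hypothetical names chosen for the example. A real server would also remove a queue when its client disconnects.

```python
import queue


class Broadcaster:
    """Hypothetical fan-out of patch documents to open SUBSCRIBE responses."""

    def __init__(self) -> None:
        self._subscribers = []  # one queue per open SUBSCRIBE response

    def subscribe(self) -> queue.Queue:
        """Register a new SUBSCRIBE response and return its queue."""
        q = queue.Queue()
        self._subscribers.append(q)
        return q

    def publish(self, patch: bytes) -> None:
        """Copy one patch document, verbatim, to every active subscriber."""
        for q in self._subscribers:
            q.put(patch)

    def close(self) -> None:
        """Tell every subscriber that the server is closing the stream."""
        for q in self._subscribers:
            q.put(None)
        self._subscribers.clear()
```

Each response handler would then loop on `q.get()`, writing every item to the connection until it reads `None`.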
If a Range request is included in the SUBSCRIBE request, the server returns the specified subset of the journal when possible.¶
If the partial content is finitely long, the response is closed after the subset is transmitted.¶
If the partial content is indefinitely long, the response remains open until the end is known and reached.¶
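The distinction between a response that closes and one that stays open can be sketched as below. This is a non-normative illustration: `parse_range` and `stays_open` are hypothetical helpers, only a single `bytes=first-last` or `bytes=first-` range is handled, and suffix ranges and multiple ranges are left out.

```python
import re
from typing import Optional

_RANGE = re.compile(r"^bytes=(\d+)-(\d*)$")


def parse_range(value: str) -> tuple[int, Optional[int]]:
    """Parse 'bytes=first-' or 'bytes=first-last' into (first, last or None)."""
    match = _RANGE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Range value: " + value)
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else None
    if last is not None and last < first:
        raise ValueError("last byte precedes first byte")
    return first, last


def stays_open(last: Optional[int], journal_end: int, journal_closed: bool) -> bool:
    """True if the response must wait for bytes that are not yet written.

    last is the inclusive last byte requested (None for an open-ended range);
    journal_end is the offset one past the newest byte written so far.
    """
    if journal_closed:
        return False  # the journal is now finite; nothing more will arrive
    return last is None or last >= journal_end
```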
The following is a SUBSCRIBE request, and the associated snapshot-in-time of a partial response, reflecting the known contents of the journal up to that time. Note how the end of the JSON Patch document [RFC6902] is not closed, but can still be written out, and must be parsed by a suitable streaming parser:¶
SUBSCRIBE /example/resource HTTP/1.1
Range: bytes=0-¶
Response:¶
HTTP/1.1 200 OK
Content-Type: application/json-patch+json

[ { "op": "test", "path": "/a/b/c", "value": "foo" }
, { "op": "remove", "path": "/a/b/c" }
, { "op": "add", "path": "/a/b/c", "value": [ "foo", "bar" ] }
, { "op": "replace", "path": "/a/b/c", "value": 42 }
, { "op": "move", "from": "/a/b/c", "path": "/a/b/d" }
¶
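A streaming parser for such an unclosed array could look like the following non-normative sketch. `StreamingArrayParser` is a hypothetical name. The sketch assumes the body has already been decoded to text and that every array element is a JSON object, as JSON Patch operations are. It relies on `json.JSONDecoder.raw_decode` from the Python standard library, and it treats any element that fails to decode as incomplete, so malformed input simply stalls instead of raising an error.

```python
import json


class StreamingArrayParser:
    """Yields the elements of a JSON array that may never be closed."""

    def __init__(self) -> None:
        self._buf = ""
        self._decoder = json.JSONDecoder()
        self._started = False
        self.closed = False  # becomes True once the final "]" is seen

    def feed(self, text: str) -> list:
        """Add newly received text and return the elements completed so far."""
        self._buf += text
        out = []
        while not self.closed:
            self._buf = self._buf.lstrip()
            if not self._buf:
                break
            head = self._buf[0]
            if not self._started:
                if head != "[":
                    raise ValueError("expected start of a JSON array")
                self._started = True
                self._buf = self._buf[1:]
            elif head == ",":
                self._buf = self._buf[1:]  # separator between elements
            elif head == "]":
                self.closed = True
                self._buf = self._buf[1:]
            elif head != "{":
                raise ValueError("expected a JSON object")
            else:
                try:
                    value, end = self._decoder.raw_decode(self._buf)
                except json.JSONDecodeError:
                    break  # element is incomplete; wait for more data
                out.append(value)
                self._buf = self._buf[end:]
        return out
```

Each call to `feed` returns only the operations that became complete with that chunk, so a client can apply them as they arrive.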
If the patch format does not support streaming, or if the patch format is not self-synchronizing (the entire patch must be read from the beginning), or lacks other metadata that is desired (such as Date or ancestry information), then multiple separate patches may be delivered in a multipart response.¶
Support for the preferred multipart containers, and embedded patch media type, both need to be specified in the Accept request header.¶
The following is a SUBSCRIBE request where the client requests, and receives, a multipart/byteranges response indicating all of the modifications that have happened over time:¶
SUBSCRIBE /example/resource HTTP/1.1
Accept: multipart/byteranges¶
Response:¶
HTTP/1.1 200 OK
Content-Type: multipart/byteranges

--BOUNDARY
Date: Fri, 26 Jul 2024 22:08:20 GMT
Content-Range: bytes 0-99/*

This is the first line of a log file...
--BOUNDARY
Date: Fri, 26 Jul 2024 22:14:29 GMT
Content-Range: bytes 100-199/*

This is the second line of a log file...
--BOUNDARY
Date: Fri, 26 Jul 2024 22:22:10 GMT
Content-Range: bytes 200-299/*

This is the third line of a log file...
--BOUNDARY¶
This response is, again, a snapshot-in-time of a partial response; the response is still open, and the last boundary of a multipart response would normally end with a dash-dash.¶
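A client reading such a snapshot can only trust the parts that are followed by another boundary line. The following non-normative sketch extracts those parts. `complete_parts` is a hypothetical helper; it works on decoded text, normalizes line endings, and splits naively on the delimiter string, so it assumes the delimiter never occurs inside a part.

```python
def complete_parts(body: str, boundary: str) -> list[tuple[dict[str, str], str]]:
    """Return (headers, content) for every part that is already terminated."""
    text = body.replace("\r\n", "\n")
    chunks = text.split("--" + boundary)
    parts = []
    # chunks[0] is the preamble. chunks[-1] is whatever follows the last
    # delimiter: nothing, an unfinished part, or the closing "--".
    for chunk in chunks[1:-1]:
        head, _, content = chunk.strip("\n").partition("\n\n")
        headers = {}
        for line in head.split("\n"):
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        parts.append((headers, content))
    return parts
```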
It is hypothetically possible to break multipart encodings when streaming, since the recipient can asynchronously read the boundary, then write it into the resource. This can be mitigated by splitting the part into two at the boundary delimiter, but this may be undesired, or introduce side effects if the server needs parts to be atomic for some reason (e.g. each part is a separate transaction).¶
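The splitting mitigation might be sketched as follows. This is a non-normative illustration with a hypothetical helper name. It cuts the content one byte into each occurrence of the delimiter, so that no resulting piece contains the full `--` plus boundary sequence; each piece would then be sent as its own part.

```python
def split_around_delimiter(content: bytes, boundary: bytes) -> list[bytes]:
    """Split content so that no piece contains b"--" + boundary."""
    delimiter = b"--" + boundary
    pieces = []
    rest = content
    while True:
        index = rest.find(delimiter)
        if index < 0:
            pieces.append(rest)
            return pieces
        cut = index + 1           # cut between the two leading dashes
        pieces.append(rest[:cut])
        rest = rest[cut:]
```

Concatenating the pieces yields the original content, but the parts are no longer atomic, which is the side effect noted above.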
A publish-subscribe endpoint lets one client publish an event for many other clients to receive in real time, while storing only the most recent value and the real-time changes to that value.¶
In this case, events may be published with a PUT to the resource, and the new value is streamed with a SUBSCRIBE. The journal is pruned to show only the value of the latest PUT; on every PUT, the last ending offset becomes the new starting offset.¶
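The offset bookkeeping for such a pruned journal can be sketched as below. This is a non-normative illustration; `LatestValueJournal` and its methods are hypothetical names, and empty values are rejected because they cannot be expressed as a byte range.

```python
class LatestValueJournal:
    """Hypothetical journal that remembers only the most recent PUT."""

    def __init__(self) -> None:
        self.start = 0    # absolute offset of the first byte of the value
        self.value = b""  # body of the latest PUT

    @property
    def end(self) -> int:
        return self.start + len(self.value)

    def put(self, value: bytes) -> None:
        if not value:
            raise ValueError("an empty value has no byte range")
        self.start = self.end  # the last ending offset becomes the new start
        self.value = value

    def content_range(self) -> str:
        """Content-Range value for the part that carries the current value."""
        if not self.value:
            raise ValueError("nothing has been published yet")
        return f"bytes {self.start}-{self.end - 1}/*"
```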
The server can even implement congestion control on a per-client basis by skipping over segments of the journal to "catch up" to the head, by emitting different parts of the journal:¶
HTTP/1.1 200 OK
Content-Type: multipart/byteranges

--BOUNDARY
Date: Fri, 26 Jul 2024 22:08:20 GMT
Content-Range: bytes 0-91/*

This is the first entry of a status message, which is being written to very frequently...
--BOUNDARY
Date: Fri, 26 Jul 2024 22:08:21 GMT
Content-Range: bytes 800-983/*

By the time the previous received part was transmitted, the server received many PUT requests, and now the next state the client is ready to receive is at an offset of 1000 bytes...
--BOUNDARY¶
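A client can tell that the server skipped ahead by comparing the Content-Range of consecutive parts. The following non-normative sketch does this with a hypothetical `find_gaps` helper. It only understands the `bytes first-last/length` form of Content-Range.

```python
import re

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def find_gaps(content_ranges: list[str], expected_start: int = 0) -> list[tuple[int, int]]:
    """Return the inclusive (first, last) byte ranges the server skipped."""
    gaps = []
    next_offset = expected_start
    for value in content_ranges:
        match = _CONTENT_RANGE.match(value)
        if match is None:
            raise ValueError("unsupported Content-Range value: " + value)
        first, last = int(match.group(1)), int(match.group(2))
        if first > next_offset:
            gaps.append((next_offset, first - 1))
        next_offset = max(next_offset, last + 1)
    return gaps
```

For the two parts shown above, `find_gaps(["bytes 0-91/*", "bytes 800-983/*"])` reports that bytes 92 through 799 were never sent.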
For applications such as a wiki, the complete history of the resource up to now may be requested by requesting all of the changes from zero through now. When the server reaches all the changes up to the request time, it will close the connection, because it has reached the end of a finite-length history.¶
If a client merely wants the list of changes since the last time it connected, it can request the changes using the last known ETag and offset:¶
SUBSCRIBE /log HTTP/1.1
If-Range: "foo"
Range: bytes=5540-¶
The "If-Range" field instructs the server to ignore the Range field if the supplied ETag does not match.¶
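Both sides of this exchange can be sketched as follows. This is a non-normative illustration; `resume_headers` and `plan_response` are hypothetical helpers. The sketch handles only the entity-tag form of If-Range (not the HTTP-date form) and applies the strong comparison that [RFC9110] requires for If-Range, so a weak validator never matches.

```python
from typing import Optional


def resume_headers(etag: str, next_offset: int) -> dict[str, str]:
    """Client side: ask only for bytes the client has not yet seen."""
    return {"If-Range": etag, "Range": f"bytes={next_offset}-"}


def plan_response(range_first: Optional[int],
                  if_range: Optional[str],
                  journal_etag: str) -> tuple[int, int]:
    """Server side: return (status, first_offset) for the response."""
    if range_first is None:
        return 200, 0
    if if_range is not None:
        # Strong comparison: a weak validator (W/"...") never matches.
        if if_range.startswith("W/") or if_range != journal_etag:
            return 200, 0  # journal was recreated: ignore Range, start over
    return 206, range_first
```

If the journal was recreated since the client last connected, the ETag no longer matches and the client receives the new journal from its beginning instead of a range that no longer means anything.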
This section is non-normative.¶
SUBSCRIBE may be implemented per resource; it does not need to be supported on all resources together. Often, only singular resources will implement the SUBSCRIBE method. If the method is not implemented, the server should return 405 (Method Not Allowed) for the resource with an Allow header field, as described in [RFC9110].¶
Many HTTP servers are not optimized for long-running responses. As with other methods that keep connections open, care must be taken to manage resources effectively to prevent abuse, such as denial of service attacks. Implementations should consider authentication, authorization, and rate-limiting mechanisms to ensure that only authorized clients can establish and maintain SUBSCRIBE connections.¶
Allows a resource to link to the journal that underlies it. Making a GET request to the link target is essentially the same as making a SUBSCRIBE request on the resource.¶
Lightly repurposing an existing link relation may also suffice.¶