HTTP SUBSCRIBE Method August 2024
Wright Experimental [Page]
Workgroup: HTTP
Author: A. Wright

HTTP SUBSCRIBE Method

Abstract

The SUBSCRIBE HTTP method allows clients to request the history of changes to a resource and to receive subsequent changes in real time.

1. Introduction

The SUBSCRIBE method offers a way to retrieve a list of changes to a resource, including changes as they are made. In the same way that PUT is a reverse GET (you use GET to retrieve the previous PUT), PATCH [RFC5789] may be thought of as a reverse SUBSCRIBE (a SUBSCRIBE retrieves a journal of PATCH messages).

A resource that supports the SUBSCRIBE method is said to have an underlying journal, which is an append-only list of changes made to it over time. This journal does not necessarily have a definite length, so the journal representation may have no end. In that case the response is held open, and new events are written to it as they become available.

The journal may be implemented in a variety of ways and may be represented in a variety of media types suitable to the client. The server is allowed to forget parts of the journal, or to restart the journal from the beginning, as it desires.

1.1. Notational Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

This document uses ABNF as defined in [RFC5234] and imports grammar rules from [RFC7230].

For brevity, example HTTP messages may add folding whitespace, or omit some headers necessary for message transfer.

1.2. Use Cases & Requirements

This section to be removed before publication.

  1. Get a history of changes to the current resource.

  2. Immediately know when the resource on a server changed.

  3. Get a list of changes from the previous request through now.

  4. Receive changes in a form preferred by the client.

  5. Allow responses in a variety of media types to suit different needs.

  6. Interoperability with existing caches and intermediaries.

  7. Support for incremental (changes since previous request) or streaming (real-time) updates.

2. The SUBSCRIBE Method

The SUBSCRIBE method is used to request a continuous stream of changes to a resource as they occur. The primary use for SUBSCRIBE is for clients to monitor and react to changes in real-time.

3. The Journal

A resource that supports the SUBSCRIBE method is said to have an underlying journal, which is a list of changes associated with the resource over its existence. This journal has no definite length, so that future changes to the resource may be "written to the journal" and sent to the client without causing a "change" (new data is streamed from the journal, but it does not amount to a change for caching purposes, since neither the previously sent content nor the position of the end has changed from a previous request). The journal is an abstract resource, but may be notated in any suitable media type that can describe changes to a resource. Such an encoding of the journal is called the journal representation.

A journal can itself be a resource with its own URI. In this case, a GET to the journal's URI is the same as a SUBSCRIBE to the resource keeping that journal. Because the journal is append-only, there need not be any recursion; the journal is its own journal.

A deleted resource may still have a journal, allowing for the possibility that a resource answering a GET request with 410 (Gone) may still answer SUBSCRIBE with a 2xx response. This is consistent with the fact that a resource and its underlying journal may be two separate resources, with different URIs. The journal may nevertheless be wiped for any reason, including a DELETE or an overwritten resource. A 404 (Not Found), by contrast, implies the resource has no associated journal, and a SUBSCRIBE request will also return 404.

The server is allowed to "forget" portions of a journal, in which case the response will be a 206 (Partial Content) response. A client may indicate that it does not care about changes after a particular point, or it can request only the changes after a particular point.

The journal may be implemented in a variety of ways suitable to the origin. The journal may be the complete record of all changes to a resource since its creation. The journal may also be a "shift buffer" that only keeps a certain number of changes, based on time, events, memory size, or other metrics. The journal may even be simply a list of timestamps or single bytes that signal the resource changed, without conveying any other information.
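As an illustration only, the "shift buffer" variant might be sketched in Python as follows. The class name `Journal`, its methods, and the byte-offset bookkeeping are assumptions made for this sketch; this document does not define them.

```python
class Journal:
    """Append-only journal that may forget its oldest bytes (hypothetical)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes  # retention limit of the shift buffer
        self.start = 0              # absolute offset of the first retained byte
        self.data = bytearray()     # retained tail of the journal

    @property
    def end(self):
        # Absolute offset one past the last byte written so far.
        return self.start + len(self.data)

    def append(self, entry):
        # New entries are only ever added at the end.
        self.data += entry
        overflow = len(self.data) - self.max_bytes
        if overflow > 0:
            # Forget the oldest bytes; offsets of the remaining bytes are unchanged.
            del self.data[:overflow]
            self.start += overflow

    def read(self, offset):
        # Return the retained bytes from an absolute offset onward, or None
        # if that offset has been forgotten or has not been written yet.
        if offset < self.start or offset > self.end:
            return None
        return bytes(self.data[offset - self.start:])


journal = Journal(max_bytes=8)
journal.append(b"abcdef")
journal.append(b"ghij")  # exceeds the limit, so the two oldest bytes are dropped
```

Keeping absolute offsets stable after the head is forgotten is what would let a client resume with a Range request against the same journal.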

The journal normally has no length, and therefore no end. However, once a resource is deleted, its journal is closed and becomes finitely long, which ends all responses currently watching the journal.

3.1. Request

A SUBSCRIBE request issued to the target resource establishes a monitoring connection. The request may include Range headers to specify a subset of the journal.

The SUBSCRIBE method uses the following HTTP header fields:

3.1.1. Accept

The Accept request header lists the media types that the client is willing to receive.

Specified in [RFC9110] Section 12.5.1.

3.1.2. Range

Specifies a byte-range to request a specific subset of the journal. If no end is provided, the response remains open to stream new events as they become available.

To indicate that the client is only interested in recent changes to a resource, the client may make "partial content" Range requests.
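A minimal sketch of how a server might interpret the two Range forms discussed above, `bytes=first-` and `bytes=first-last`. The function name `parse_journal_range` is hypothetical, and rejecting suffix ranges and multiple ranges is an assumption of this sketch, not a requirement of this document.

```python
import re

# Only a single "first-" or "first-last" byte range is handled here.
_RANGE = re.compile(r"bytes=(\d+)-(\d*)")


def parse_journal_range(value):
    """Return (first, last) with last=None for an open-ended range, else None."""
    match = _RANGE.fullmatch(value.strip())
    if match is None:
        return None  # unsupported form, e.g. a suffix range or several ranges
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else None
    if last is not None and last < first:
        return None  # invalid: the range ends before it starts
    return first, last
```

A result whose second element is None corresponds to the case where no end is provided and the response stays open.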

3.1.3. If-Match

Only satisfy the request if the resource ETag is the same as the supplied ETag.

Specified in [RFC9110] Section 13.1.1.

3.1.4. If-Range

This is used if you want to download only new additions to the journal, or the entire journal if it has been reset.

TODO: Also specify a header that provides the entire original resource in this situation, if you don't need the entire change history, but only the current version plus updates. Also consider ways to provide the current version of a resource (the current length of different journal representations).

Specified in [RFC9110] Section 13.1.5.

3.2. Response

A successful SUBSCRIBE request results in a response that is a representation of unknown length. The connection remains open, streaming updates to the client until the server determines that no further changes will be made.

3.2.1. Status Codes

  • 200 (OK): The subscription was successfully established, and the initial part of the journal is included in the response.

  • 206 (Partial Content): The request contained a Range header, and the specified subset of the journal is provided in the response.

  • 416 (Range Not Satisfiable): The Range specified in the request is not applicable or out of bounds for the resource's journal.
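The choice among these status codes could be sketched as below. The function `select_status` and its parameters are hypothetical. Two behaviors are assumptions of this sketch rather than statements of this document: a range that starts in a forgotten part of the journal is answered with 416, and a range that starts exactly at the current end is treated as satisfiable so that a client can wait for new entries.

```python
def select_status(range_first, journal_start, journal_end):
    """Pick a status code for a SUBSCRIBE response (illustrative only).

    range_first   -- first byte requested, or None if no Range was sent
    journal_start -- offset of the oldest byte the server still retains
    journal_end   -- offset one past the newest byte written so far
    """
    if range_first is None:
        # Without a Range, a journal whose head was forgotten is still partial.
        return 206 if journal_start > 0 else 200
    if range_first < journal_start or range_first > journal_end:
        return 416  # assumption: forgotten or not-yet-reachable offsets
    return 206
```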

3.2.2. Content-Type

Specifies the media type of the representation that was negotiated.

Specified in [RFC9110] Section 8.3.

3.2.3. Content-Range

Specifies that the attached body is a range on the journal.

3.2.4. Content-Location

The Content-Location response header, if present, provides a URI for the journal itself. In this case, a SUBSCRIBE request is the same as a GET request to the journal. However, see Content Negotiation for specifics [RFC9110].

3.2.5. Vary

If content negotiation was used to select from among different media types, then the Vary field must list Accept.

Specified in [RFC9110] Section 12.5.5.

3.2.6. ETag

The ETag of the journal is established when it is created and does not change unless the journal is recreated. All journal resources MUST have an ETag.

As desired, the server may "forget" all history and truncate the journal to empty. In this case, all existing connections to the resource must close, and the ETag of the resource, if any, MUST change.

3.2.7. Last-Modified

The Last-Modified field of a resource could mean a few different things. The most useful behavior may be to describe the creation date of the journal rather than the last write (since writes are not really "modifications"; appends to the journal signal writes).

3.3. Behavior

3.3.1. Establishing the Subscription

A SUBSCRIBE request establishes a connection where the server sends a stream of updates to the client. Each update in the stream corresponds to a change in the resource, formatted as a diff or patch document.

3.3.2. Streaming Updates

When the media types of the resource before and after a change are the same, the server emits a PATCH method verbatim on all active SUBSCRIBE connections.

The server continuously sends these updates as long as changes are made to the resource or until the connection is explicitly closed by the server.
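One way to picture this fan-out is a small in-process broadcaster, sketched here with one queue per open response. The class `Broadcaster` and the use of None as an end-of-stream marker are assumptions of this sketch; a real server would write to network connections instead of queues.

```python
import queue


class Broadcaster:
    """Copies each change to every active subscriber (hypothetical helper)."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self):
        # Each open SUBSCRIBE response gets its own queue of pending updates.
        updates = queue.Queue()
        self.subscribers.append(updates)
        return updates

    def publish(self, patch_body):
        # Emit the same patch document on every active subscription.
        for updates in self.subscribers:
            updates.put(patch_body)

    def close(self):
        # None tells each response writer to end its response.
        for updates in self.subscribers:
            updates.put(None)
        self.subscribers.clear()


hub = Broadcaster()
first_client = hub.subscribe()
second_client = hub.subscribe()
hub.publish(b'[{"op": "remove", "path": "/a/b/c"}]')
```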

3.3.3. Handling Range Requests

  • If a Range request is included in the SUBSCRIBE request, the server returns the specified subset of the journal when possible.

  • If the partial content is finitely long, the response is closed after the subset is transmitted.

  • If the partial content is indefinitely long, the response remains open until the end is known and reached.
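The closing rules above can be expressed as a single decision about where the response body ends. In this sketch, the function `closing_offset` and its parameters are hypothetical, and offsets are treated as exclusive ends.

```python
def closing_offset(last, journal_end, journal_closed):
    """Return the offset at which to end the response, or None if unknown.

    last           -- last byte requested (inclusive), or None if open-ended
    journal_end    -- offset one past the newest byte written so far
    journal_closed -- True once no further entries will ever be appended
    """
    requested = None if last is None else last + 1
    if journal_closed:
        # The end is known: never wait for bytes that will not be written.
        return journal_end if requested is None else min(requested, journal_end)
    # Finite range: close after its last byte. Open-ended: hold the response open.
    return requested
```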

4. Examples

4.1. Basic Request

The following is a SUBSCRIBE request, and the associated snapshot-in-time of a partial response, reflecting the known contents of the journal up to that time. Note how the end of the JSON Patch document [RFC6902] is not closed, but can still be written out, and must be parsed by a suitable streaming parser:

SUBSCRIBE /example/resource HTTP/1.1
Range: bytes=0-

Response:

HTTP/1.1 200 OK
Content-Type: application/json-patch+json

[ { "op": "test", "path": "/a/b/c", "value": "foo" }
, { "op": "remove", "path": "/a/b/c" }
, { "op": "add", "path": "/a/b/c", "value": [ "foo", "bar" ] }
, { "op": "replace", "path": "/a/b/c", "value": 42 }
, { "op": "move", "from": "/a/b/c", "path": "/a/b/d" }
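A streaming parser for a response like the one above might look like the following sketch, which uses `raw_decode` from Python's standard `json` module to pull complete operations out of an array that has not been closed yet. The class `PatchStreamParser` is hypothetical, and the sketch is deliberately lenient: it treats any decoding failure as "wait for more data" and does not validate comma placement.

```python
import json


class PatchStreamParser:
    """Incrementally extracts operations from an unterminated JSON array."""

    def __init__(self):
        self.buffer = ""
        self.decoder = json.JSONDecoder()
        self.started = False   # True once the opening "[" has been consumed
        self.finished = False  # True once the closing "]" has been seen

    def feed(self, text):
        """Add newly received text and return any operations now complete."""
        self.buffer += text
        operations = []
        while True:
            # Skip whitespace and the commas that separate array elements.
            self.buffer = self.buffer.lstrip(" \t\r\n,")
            if not self.buffer:
                break
            if not self.started:
                if self.buffer[0] != "[":
                    raise ValueError("expected start of JSON array")
                self.started = True
                self.buffer = self.buffer[1:]
                continue
            if self.buffer[0] == "]":
                self.finished = True
                self.buffer = self.buffer[1:]
                break
            try:
                operation, consumed = self.decoder.raw_decode(self.buffer)
            except json.JSONDecodeError:
                break  # the element is incomplete; wait for more data
            operations.append(operation)
            self.buffer = self.buffer[consumed:]
        return operations


parser = PatchStreamParser()
first = parser.feed('[ { "op": "test", "path": "/a/b/c", "value": "foo" }\n, { "op": "rem')
second = parser.feed('ove", "path": "/a/b/c" }\n')
```

Here the first call yields only the complete "test" operation, and the "remove" operation is delivered once its remaining bytes arrive.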

4.2. Multipart Response

If the patch format does not support streaming, or if the patch format is not self-synchronizing (the entire patch must be read from the beginning), or lacks other metadata that is desired (such as Date or ancestry information), then multiple separate patches may be delivered in a multipart response.

Support for the preferred multipart containers, and embedded patch media type, both need to be specified in the Accept request header.

The following is a SUBSCRIBE request where the client requests, and receives, a multipart/byteranges response indicating all of the modifications that have happened over time:

SUBSCRIBE /example/resource HTTP/1.1
Accept: multipart/byteranges

Response:

HTTP/1.1 200 OK
Content-Type: multipart/byteranges

--BOUNDARY
Date: Fri, 26 Jul 2024 22:08:20 GMT
Content-Range: bytes 0-99/*

This is the first line of a log file...
--BOUNDARY
Date: Fri, 26 Jul 2024 22:14:29 GMT
Content-Range: bytes 100-199/*

This is the second line of a log file...
--BOUNDARY
Date: Fri, 26 Jul 2024 22:22:10 GMT
Content-Range: bytes 200-299/*

This is the third line of a log file...
--BOUNDARY

This response is, again, a snapshot-in-time of a partial response; the response is still open, and the last boundary of a multipart response would normally end with a dash-dash.
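A client reading such an open multipart response can only treat a part as complete once the next boundary delimiter has arrived. The sketch below shows that idea under several assumptions: the function `split_complete_parts` is hypothetical, lines end in CRLF, every part carries at least one header field, and no part body contains the delimiter.

```python
def split_complete_parts(buffer, boundary):
    """Return (parts, rest): fully delimited parts and the unparsed remainder."""
    delimiter = b"--" + boundary
    chunks = buffer.split(delimiter)
    if len(chunks) < 2:
        return [], buffer  # no delimiter seen yet
    parts = []
    # chunks[0] is the preamble; chunks[-1] follows the last delimiter and
    # may still be growing, so only the chunks in between are complete.
    for chunk in chunks[1:-1]:
        if chunk.startswith(b"\r\n"):
            chunk = chunk[2:]   # line break that ends the delimiter line
        if chunk.endswith(b"\r\n"):
            chunk = chunk[:-2]  # line break that belongs to the next delimiter
        head, _, body = chunk.partition(b"\r\n\r\n")
        headers = {}
        for line in head.split(b"\r\n"):
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()
        parts.append((headers, body))
    # Keep the last delimiter so the remainder can be parsed again later.
    return parts, delimiter + chunks[-1]


snapshot = (
    b"--BOUNDARY\r\n"
    b"Date: Fri, 26 Jul 2024 22:08:20 GMT\r\n"
    b"Content-Range: bytes 0-99/*\r\n"
    b"\r\n"
    b"This is the first line of a log file...\r\n"
    b"--BOUNDARY\r\n"
    b"Date: Fri, 26 Jul 2024 22:14:29 GMT\r\n"
)
parts, rest = split_complete_parts(snapshot, b"BOUNDARY")
```

The returned remainder would be prepended to the next bytes read from the connection before calling the function again.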

It is hypothetically possible to break multipart encodings when streaming, since the recipient can asynchronously read the boundary, then write it into the resource. This can be mitigated by splitting the part into two at the boundary delimiter, but this may be undesired, or introduce side effects if the server needs parts to be atomic for some reason (e.g. each part is a separate transaction).

4.3. Pub-Sub

A publish-subscribe endpoint lets one client publish an event for many other clients to receive, in realtime, but only storing the most recent value, and realtime changes to that value.

In this case, events may be published with a PUT over the resource, and the new value is streamed with a SUBSCRIBE. The journal is pruned to show only the value of the latest PUT; on every PUT, the last ending offset becomes the new starting offset.

The server can even implement congestion control on a per-client basis by skipping over segments of the journal to "catch up" to the head, by emitting different parts of the journal:

HTTP/1.1 200 OK
Content-Type: multipart/byteranges

--BOUNDARY
Date: Fri, 26 Jul 2024 22:08:20 GMT
Content-Range: bytes 0-91/*

This is the first entry of a status message, which is
being written to very frequently...
--BOUNDARY
Date: Fri, 26 Jul 2024 22:08:21 GMT
Content-Range: bytes 800-983/*

By the time the previous received part was transmitted,
the server received many PUT requests,
and now the next state the client is ready to receive
is at an offset of 800 bytes...
--BOUNDARY
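The "catch up" behavior shown above could be sketched as picking, for each client, the newest journal segment that the client has not yet received. The functions `next_segment` and `content_range`, and the segment offsets used below, are hypothetical and chosen only to mirror the example.

```python
def next_segment(entries, cursor):
    """Return the newest (start, end) segment at or after cursor, or None.

    entries -- (start, end) offsets of each PUT in journal order, end exclusive
    cursor  -- offset one past the last byte already sent to this client
    """
    pending = [entry for entry in entries if entry[0] >= cursor]
    # A client that has fallen behind skips straight to the newest segment.
    return pending[-1] if pending else None


def content_range(start, end):
    # Content-Range uses an inclusive last byte and "*" for the unknown length.
    return "bytes {}-{}/*".format(start, end - 1)


entries = [(0, 92), (92, 300), (300, 800), (800, 984)]
segment = next_segment(entries, 92)
```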

4.4. Revision History

For applications such as a wiki, the complete history of the resource up to now may be requested by requesting all of the changes from zero through now. When the server reaches all the changes up to the request time, it will close the connection, because it has reached the end of a finite-length history.

4.5. Synchronizing Changes

If a client merely wants the list of changes since the last time it connected, it can request the changes using the last known ETag and offset:

SUBSCRIBE /log HTTP/1.1
If-Range: "foo"
Range: bytes=5540-

The "If-Range" field instructs the server to ignore the Range field if the supplied ETag does not match.
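On the server side, that rule might be sketched as follows. The function `resolve_resume` is hypothetical, and returning offset zero with a 200 status when the validator does not match is this sketch's reading of "ignore the Range field".

```python
def resolve_resume(if_range, current_etag, requested_first):
    """Return (status, first_offset) for a resuming SUBSCRIBE (illustrative)."""
    if if_range is None or if_range == current_etag:
        # Same journal as before: continue from the client's last offset.
        return 206, requested_first
    # The journal was recreated: ignore Range and send it from the beginning.
    return 200, 0
```

With the request above, `resolve_resume('"foo"', '"foo"', 5540)` would continue at offset 5540, while a journal whose ETag has since changed would be sent in full.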

5. Implementation Guidance

This section is non-normative.

SUBSCRIBE may be implemented per-resource; it does not need to be supported on all resources together. Often, only singular resources will implement the SUBSCRIBE method. If the method is not implemented, the server should return 405 (Method Not Allowed) for the resource with an Allow header field, as described in [RFC9110].
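A per-resource check of this kind could be sketched as below. The function `reject_unsupported` and the idea of keeping a set of supported methods per resource are assumptions made for illustration.

```python
def reject_unsupported(method, supported):
    """Return (status, headers) if the method is not supported, else None."""
    if method in supported:
        return None  # let the normal handler for this resource run
    # Advertise what the resource does support in the Allow field.
    return 405, {"Allow": ", ".join(sorted(supported))}
```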

6. Security Considerations

Many HTTP servers are not optimized for long-running responses. As with other methods that keep connections open, care must be taken to manage resources effectively to prevent abuse, such as denial of service attacks. Implementations should consider authentication, authorization, and rate-limiting mechanisms to ensure that only authorized clients can establish and maintain SUBSCRIBE connections.
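As one example of such a mechanism, a server might cap the number of concurrent SUBSCRIBE responses per client. The class `SubscriptionLimiter` below is a hypothetical sketch; how a client is identified (for instance by credentials or address) is left open and is not specified by this document.

```python
class SubscriptionLimiter:
    """Caps concurrent subscriptions per client identifier (hypothetical)."""

    def __init__(self, max_per_client):
        self.max_per_client = max_per_client
        self.active = {}  # client identifier -> number of open subscriptions

    def acquire(self, client_id):
        # Call before holding a response open; False means refuse the request.
        count = self.active.get(client_id, 0)
        if count >= self.max_per_client:
            return False
        self.active[client_id] = count + 1
        return True

    def release(self, client_id):
        # Call when a subscription's response ends for any reason.
        count = self.active.get(client_id, 0)
        if count <= 1:
            self.active.pop(client_id, None)
        else:
            self.active[client_id] = count - 1


limiter = SubscriptionLimiter(max_per_client=2)
results = [limiter.acquire("client-a") for _ in range(3)]
```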

7. IANA Considerations

7.1. SUBSCRIBE HTTP method

  • Method name: SUBSCRIBE

  • Safe: Yes

  • Idempotent: Yes

  • Cacheable: Yes

8. Author's Discussion

This section contains considerations for making edits to the document before final publication.

RFC Editor: remove this section for publication as RFC.

8.1. Scope

Things not in scope: Specific media types

8.2. Caching

A new caching flag may be necessary, for example "may-forget-head".

8.3. Retraction

Sometimes certain edits to a resource need to be retracted. How this is done exactly will necessarily vary depending on the patch media type used to represent the changes.

8.4. Relationship to CoAP SUBSCRIBE

CoAP also has a SUBSCRIBE method, but it works somewhat differently. It may not be compatible at all, or may only be compatible with a subset of possible SUBSCRIBE requests.

8.5. Naming

Other names for this method are possible, such as WATCH, MONITOR, READ, or FOLLOW.

9. References

9.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC5234]
Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, <https://www.rfc-editor.org/rfc/rfc5234>.
[RFC7230]
Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing", RFC 7230, DOI 10.17487/RFC7230, <https://www.rfc-editor.org/rfc/rfc7230>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/rfc/rfc8174>.
[RFC9110]
Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "HTTP Semantics", STD 97, RFC 9110, DOI 10.17487/RFC9110, <https://www.rfc-editor.org/rfc/rfc9110>.

9.2. Informative References

[RFC5789]
Dusseault, L. and J. Snell, "PATCH Method for HTTP", RFC 5789, DOI 10.17487/RFC5789, <https://www.rfc-editor.org/rfc/rfc5789>.
[RFC6902]
Bryan, P., Ed. and M. Nottingham, Ed., "JavaScript Object Notation (JSON) Patch", RFC 6902, DOI 10.17487/RFC6902, <https://www.rfc-editor.org/rfc/rfc6902>.

Author's Address

Austin Wright