Another important building block for implementing non-tree-based multicasting strategies is the existence of packet identifications.

What is a Packet Id?

Packet ids have not been implemented in any PSYC software as yet.

We occasionally call them message ids; that is a bad habit, but it means the same thing.

Don't confuse packet ids with counters. Within a packet itself you don't need to provide a packet id, because all the information is there already. But if you want to act upon a packet, like issue a rerequest or reply to it, then it is appropriate to synthesize the packet id string and provide it in the _reply or other appropriate routing variable.

Even the new definition is not part of the specification yet, but it could be, if it makes sense to have it in there sometime soon.

Old definition

Here's the definition from the historic draft:

   {packet id} :=
      {source} {logical target} {counter} [ {fragment} ]
A packet is identified by the sending object in full uniform (that is including the hostname and maybe port number of the sending entity) combined with the logical target of the packet being either the context or the physical target(s).
Should the source also be the context, then the source is left out (See also Routing).
A different counter is used per source and logical target, in order to make the identifications utilizable in both unicasting and multicasting infrastructures.
If a fragment is given it is appended to the packet id.
If you are going to support multiple infrastructures, even if all of them are reliable, you MUST be capable of sorting out duplicate packets since PSYC group managers will make use of that ability.

New definition

Back then I expected contexts and targets to look different, but by now we use the same uniforms for both. Thus we need to add a way to distinguish a unicast from a multicast packet. Here goes:

   {packet id} :=
      {source} {logical target} [ {type} ] {counter} [ {fragment} ]
   {type} := <none> | 'c'

The type is either missing, if the message is unicast, or it is 'c' as in context. You figure this out simply by checking for the presence of _context.
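As a minimal sketch of the new definition, assuming the routing variables of a packet are available as a plain dictionary: the class PacketId and the function packet_id_from_routing are hypothetical names made up for illustration, and since packet ids are not implemented anywhere yet, none of this reflects existing PSYC code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PacketId:
    """Hypothetical in-memory form of {packet id} as defined above."""
    source: Optional[str]       # None when the source is also the context
    logical_target: str         # the context, or the unicast target
    type: str                   # "" for unicast, "c" for context
    counter: int
    fragment: Optional[int] = None


def packet_id_from_routing(routing: dict, counter: int,
                           fragment: Optional[int] = None) -> PacketId:
    """Synthesize a packet id from routing variables (assumed to be a dict)."""
    source = routing.get("_source")
    context = routing.get("_context")
    if context is not None:
        # Presence of _context marks the packet as multicast.
        if source == context:
            source = None       # the source is left out if it is the context
        return PacketId(source, context, "c", counter, fragment)
    # No _context: unicast, so the logical target is the plain target.
    return PacketId(source, routing["_target"], "", counter, fragment)
```

A frozen dataclass is used here so that packet ids can serve as dictionary keys or set members, which is convenient when sorting out duplicates.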

Using default channel instead?

Alternate idea: Don't use {type} in link id and packet id; instead use channels as a means to distinguish a target from a context. Practically, a nameless channel is the default context of a place or user, thus looking like psyc://my.psyc/@talk# instead of the unicast address psyc://my.psyc/@talk, and psyc://my.psyc/~joe# for joe's presence subscribers. The ugly part is that every context message carries one more character, and whenever we want to reply to a context by unicast we have to strip it off. Bad? Or consistent? Could we even declare contexts using a normal _target variable (how boring)?
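To get a feeling for how much work the stripping would be, here is a small sketch. It assumes that the channel is whatever follows the first '#' in a uniform; both function names are invented for this example.

```python
def is_default_context(uniform: str) -> bool:
    """True if the uniform ends in a nameless channel, e.g. '...@talk#'."""
    return uniform.endswith("#")


def unicast_address(uniform: str) -> str:
    """Strip the channel part (assumed to start at the first '#')."""
    return uniform.partition("#")[0]
```

So replying to psyc://my.psyc/@talk# by unicast would mean sending to unicast_address("psyc://my.psyc/@talk#"), which yields psyc://my.psyc/@talk.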

String rendition

Also something I did not specify back then: We need a separator character so we can make a string out of the packet id. Space would be nice, as it is excluded from uniforms, but it is not mouse-paste-friendly. Other suggestions?

<saga> i think we should send it as a list. we've got a syntax for that, so we should use it.
<lynX> sounds good, especially since we have a one-line list syntax.

Link Ids

I should probably split up the definition of packet id using a subdefinition of a link id.

   {link id} :=
      {source} {logical target} [ {type} ]
   {packet id} :=
      {link id} {counter} [ {fragment} ]

Link Identifications are necessary as an internal structure to handle the State for each one-to-one and one-to-many configuration, each of which comes with its own counter.

So the link id does not necessarily have to be rendered to a string as suggested above. It may very well be a struct or array or something. Again it may be useful to strip off the source if it is identical to the logical target.

The MMP parser needs to keep a counter for each link id so it can detect missing packets or delete/prune duplicates. Some multicast strategies may also want to use those duplicates to keep alternate routes in their routing portfolio, although most strategies will always want the fastest arriving copy of a message.
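The bookkeeping described here could look roughly like the following sketch. It assumes a link id is kept as a tuple of (source, logical target, type) and that the first counter seen on a link is taken as its starting point; LinkTracker and its method names are hypothetical and not taken from any PSYC implementation.

```python
from collections import defaultdict


class LinkTracker:
    """Hypothetical per-link counter state for gap and duplicate detection."""

    def __init__(self):
        self.highest = {}                  # link id -> highest counter seen
        self.missing = defaultdict(set)    # link id -> counters not yet seen

    def receive(self, link_id, counter):
        """Classify an incoming packet as 'new', 'recovered' or 'duplicate'."""
        if link_id not in self.highest:
            # Assumption: the first packet seen defines where the link starts.
            self.highest[link_id] = counter
            return "new"
        last = self.highest[link_id]
        if counter > last:
            # Everything between the last counter and this one is missing.
            self.missing[link_id].update(range(last + 1, counter))
            self.highest[link_id] = counter
            return "new"
        if counter in self.missing[link_id]:
            # A previously missing packet arrived late or was resent.
            self.missing[link_id].discard(counter)
            return "recovered"
        # Already seen (or older than the starting point): prune it.
        return "duplicate"
```

A strategy that wants to learn alternate routes from duplicates would look at where the 'duplicate' copies came from before dropping them, rather than discarding them blindly.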

Questions and Answers

<Monkey> What about an md5 hash (or similar) of the headers as a packet id? That would be unique enough and it could be computed later (only when needed) plus it would avoid sending extra identification data with most messages. This is probably way off base but I felt like writing it down...take it or leave it :)

<lynX> If you receive a packet numbered 16, and the last one you saw was 10, then you know, you are missing 11 thru 15. If you receive a packet with a hash, you can check that packet for being consistent in itself (although you don't know who made that hash in the first place, so it's not a guarantee for authenticity) and yes you could use the hash to get the packet again, although you already have it, but you still don't know if you missed any, and which.
<Monkey> Ah, of course. I guess I should have thought that through a little more ;)
<Fippo> Why do we need source and logical target as part of this id?
<lynX> We need a way to identify a packet. The {packet id} is only sent across the network during recovery of lost packets. Whereas during generation and transmission, source and logical target are already part of the communication, so only the MMP counter needs to be added or implicit by circuit stability. There is a different counter for each sender and recipient (with single peer multicasts just the _context is enough, see Routing).

Application

Recovery of Lost Packets

Packet Ids allow you to ask any participant of a context for a resend of missing packets, thus recovering lost parts of a multicast.

File Casting

If you take the recovery of lost packets to the gigabyte scale, it effectively works like the BitTorrent principle, only more real-time.

Referencing

Packet ids also let you reference an event that happened before, for example to comment on it, as is popular with microblogging.

Related Work

Several back-end protocols exist, intended to provide a degree of reliability somewhere between UDP and TCP, like SCTP for example. Some of them also consider the special requirements of acting in a multi-point relay fashion either for multicasting or mere redundancy.

This allows for retransmissions from inside the network rather than having to retransmit data from the source, as is the case with TCP.

In delay-tolerant networking, as devised by Vinton Cerf's team at JPL for the interplanetary Internet, the store-and-forward principle provided by protocols such as SMTP and XMPP (also noteworthy from pre-Internet times, UUCP and BITNET's RSCS) would be delegated deeper into the routing layers of the Internet. PSYC too has the ability to store-and-forward arbitrary payloads (any data that fits into a PSYC packet). Should a technology like DTN become generally available on the Internet, it could make these aspects of PSYC less needed, or could serve as a backbone technology that PSYC can build upon. The fact that Dr. Cerf has been working on this suggests that existing protocols and open-source technologies aren't doing that job yet. But that could be a wrong assumption. Also, since these protocols operate as an overlay network, they may have too high latency or bandwidth requirements for the terrestrial needs of PSYC users, and may simply not share the set of requirements that we have. Still, RFC 4838 does address a lot of very familiar issues. Further investigation may be appropriate.