
PSYC is a flexible text-based protocol for delivery of data to a variable number of recipients or people, by unicast or multicast. It is primarily used for chat conferencing, presence, publish/subscribe and instant messaging, but is not limited to these.

This specification is a work in progress. It is in draft status, incomplete and does not reflect the current real world deployment of PSYC.



Glossary


Node

A PSYC application which is reachable and interconnected with other PSYC applications is called a node in the PSYC network. A node always has both a protocol root and an entity root. It may also provide further entities of varying kind.

Leaf

A PSYC node which is not reachable from the outside by other PSYC nodes is called a leaf. Clients are usually leaves.

Server

A server is a common term for a PSYC node implementing any of person entities, place entities or context functionality. Servers may provide several functions beyond these, too.


Root

There are two concepts of root in PSYC. Depending on architectural choices, both roots, the protocol root and the entity root MAY be implemented in the same place in the source code, or not.

Root entity

The root entity in a server or application is the object that comes without anything behind the / in its uniform, as in psyc://psyced.org/. It performs jobs like authentication. This root is a full PSYC entity.

Protocol root

The protocol root instead leaves out the / as in psyc://psyced.org. It is used for negotiation for example in circuits and thus operates on the routing layer. If a packet is sent, and either the source or destination is not explicitly or implicitly addressed, the protocol root is intended.


Entity

An entity is any object that can be addressed by a uniform using the appropriate protocol, in most cases either a person entity or a place entity. Entities in PSYC must have:

  • state: its own state and the state of other PSYC entities. A state is usually a set of persistent PSYC variables.
  • trust network: know who the entity trusts and who it can ask for second hand trust.

Entity Behavior

There is a set of basic entity types, which implement a specific behaviour to meet the needs of a chat network. These are the specified entities PSYC currently knows:

Provided for in the spec, but not currently implemented:

Some entities implement a command interface, to let users actually do something with the entities (like setting the topic of a place).

Channels

Each entity has at least one context, which is reachable via its uniform. Some entities, like the person entity, offer multiple sub contexts, called channels, which are addressed by the channel part of the entity's uniform. For example, if psyc://psyced.org/~lynX is the UNI of the person entity, the context where presence updates are multicast may have the uniform psyc://psyced.org/~lynX#_presence.

Just like an anchor in HTTP URIs, the channel is not a new entity itself but merely defines a different view on the original entity's context. Channels define a variation in routing and its logic (typically a subset of the entity's subscribers), but do not need new subscription procedures as they inherit the trust relationships of the owning entity. Receiving entities are required to accept these messages immediately as if sent by the entity.

For a list of predefined channels, look at the specification of the entity types. Further channels can be provided and named freely, according to the needs of the application.

Channel inheritance

Channel names are bound to rules for keywords and define their own keyword namespace for the purpose of defining standardized compact channel names like p for _presence. Channel inheritance is defined as follows: Any subscriber of a subchannel is also regarded as the subscriber of the parent channel and the context as a whole, and thus receives traffic transmitted to it, but a subscriber of a subchannel does not receive traffic directed to a completely different subchannel. Examples:

A subscriber to psyc://psyced.org/~lynX#_presence_verbose will receive packets transmitted to psyc://psyced.org/~lynX#_presence_verbose, psyc://psyced.org/~lynX#_presence and psyc://psyced.org/~lynX but not to psyc://psyced.org/~lynX#_rants.

A subscriber to psyc://psyced.org/~lynX#_presence will receive packets on psyc://psyced.org/~lynX#_presence and psyc://psyced.org/~lynX but neither psyc://psyced.org/~lynX#_rants nor psyc://psyced.org/~lynX#_presence_verbose.
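The two examples above can be sketched as a small predicate. This is only an illustration of the inheritance rule as worded here, not an excerpt from any PSYC implementation; the function names are hypothetical, and the choice that a subscriber of the bare context does not receive channel traffic is an assumption the text does not settle.

```python
def split_channel(uniform: str) -> tuple:
    """Split a uniform into (entity uniform, channel); channel is '' if absent."""
    entity, _, channel = uniform.partition("#")
    return entity, channel


def receives(subscription: str, destination: str) -> bool:
    """True if a subscriber of `subscription` gets packets sent to `destination`."""
    sub_entity, sub_channel = split_channel(subscription)
    dst_entity, dst_channel = split_channel(destination)
    if sub_entity != dst_entity:
        return False
    if dst_channel == "":
        # Traffic for the context as a whole reaches every subscriber.
        return True
    # Otherwise the destination must be the subscribed channel itself or one
    # of its parents, i.e. a prefix ending on a subkeyword boundary ("_").
    # Assumption: a subscription without a channel part matches neither case.
    return sub_channel == dst_channel or sub_channel.startswith(dst_channel + "_")
```

Matching on `dst_channel + "_"` rather than on a plain prefix keeps a channel such as `_presence` from being treated as the parent of an unrelated `_presences`.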

Channel state inheritance

This also affects the way state is seen by subscribers of the various channels. A subchannel may have a member list that differs from that of its parent channel by applying its own modifications to it. Modifications to the parent channel still affect all of the subchannels. Member lists are just an example here, as they are not required for channel routing and currently only serve informational purposes.

Here's an example of how this can be useful. Consider a news company that provides headlines and discussion channels. It may want to allow its subscribers to select certain themes. It would find it useful to keep general information like a company logo in a top-level state like psyc://news.example.com/@news, then provide theme headlines in channels like psyc://news.example.com/@news#_sports, and allow users to subscribe to psyc://news.example.com/@news#_sports_talk, where they not only receive the psyc://news.example.com/@news#_sports headlines, but are additionally permitted to talk about them without disturbing the subscribers of the parent channels. This gives the company the possibility to make important global announcements on psyc://news.example.com/@news without having to send copies to a whole stack of chatrooms, because the top channel has all subscribers right there.


the UNL

The Uniform Network Location is the effective IP address where a user agent (a client) or other mobile entity is operating for a user (or other UNI), usually in combination with an allocated port number. UNLs are volatile, as the current IP number of a client changes frequently, and are typically given to trusted friends for peer-to-peer activity. Usually the UNL points to the root entity of the user's client.

Example: psyc://localhost:4711/

NOTE: How locations work and how they are linked to the identity is not covered in this Specification


the UNI

The Uniform Network Identification is a permanent uniform of an entity in the PSYC network, something you can print on your business card or publish in other Internet media. In some cases the entity answering at a UNI will forward or redirect messages to a location addressed by a UNL, in other cases the entity is self-sufficient. UNIs are primarily employed by identities and places.

Example: psyc://psyced.org/~brain

Location

A location is generally a person's client, but it may be any piece of software with a variable address linked to any identity providing some service.

A UNL is used to address a location. It usually reveals a user's current IP address, as for example in psyc://10.12.14.16:49321.

NOTE: How locations work and how they are linked to the identity is not handled in this Specification

Identity

An identity is a linkable entity, typically a person entity, but a service identity may be linkable, too.

A UNI is used to address an identity and the home server is where a user's own identity resides.


physical source

The physical source of a packet is either the _source or, if the _source is not given, _context.

logical source

To determine where the message came from _logically_ you need to get the logical source. The logical source of a packet is in this order (first hit counts): _source_relay, _source, _context.

logical target

The logical target of a packet is either the _context or, if _context is not set, the _target.

logical context

The logical context of a packet tells you which context the message belongs to. It is found in this order (first hit counts): _context, _source_relay, _source.

See logical source above to determine who the sender of the message was.
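The four lookup rules above all follow the same "first hit counts" pattern, which can be sketched as below. The helper names are hypothetical, and the packet's routing variables are assumed to be available as a plain dictionary.

```python
def first_set(variables: dict, names: tuple):
    """Return the value of the first variable in `names` that is set, else None."""
    for name in names:
        value = variables.get(name)
        if value is not None:
            return value
    return None


def physical_source(variables: dict):
    return first_set(variables, ("_source", "_context"))


def logical_source(variables: dict):
    return first_set(variables, ("_source_relay", "_source", "_context"))


def logical_target(variables: dict):
    return first_set(variables, ("_context", "_target"))


def logical_context(variables: dict):
    return first_set(variables, ("_context", "_source_relay", "_source"))
```

For a message relayed by a group, the physical source is the context while the logical source is the member who originally sent it.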


Uniforms

Uniforms are a superset of URIs, they are byte strings with this general syntax:

 uniform       = scheme ":" opaque-string
 scheme        = 1*<printable ascii chars except ":">
 opaque-string = 0*VCHAR

To address entities in the PSYC network a new (URI) scheme has been defined: 'psyc' (see below).

PSYC nodes (especially servers) MAY allow routing of packets based on just the scheme name part of the uniform. This allows gateways to use alternate addressing schemes. (For example IRC_URI).

See also Spec:Routing about this.

ABNF

This ABNF is informational and not completely formal. The general syntax for a URI should be taken from RFC 2396.

 uniform     = "psyc://" hostname [ ":" port [ transport ] ] [ "/" [ object-name [ "#" channel ] ] ]
 hostname    = <hostname, see also 'host' definition in RFC 2396 section 3.2.2.>
 port        = <a port number (should be negative for non connectable ports!)>
 transport   = "c" | "d" | "s"
 object-name = [ type ] name
 type        = "@" | "~" | "$"
 name        = <see 'opaque_part' in RFC 2396 section 3. (any printable characters except "#")>
 channel     = <see 'opaque_part' in RFC 2396 section 3.>

port

Port numbers are typically used when connecting to a client process in a peer to peer manner. Servers would use DNS SRV instead, if for some reason they cannot use the canonical port number.

transport

The 'transport' part of the uniform defines what kind of transport is used. The absence of the transport denominator character implies a recommendation to pick an option at discretion of the caller.

  • "c" - a TCP circuit
  • "d" - a UDP port
  • "s" - a TLS circuit

See also Circuit.
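As a rough illustration of the psyc uniform syntax, the following sketch splits a uniform into its parts with a regular expression. It is not a validating parser: the `Uniform` type and `parse_uniform` function are hypothetical names, IPv6 literals are not handled, and the pattern is slightly more permissive than the ABNF (for instance it accepts a channel after an empty object name).

```python
import re
from typing import NamedTuple, Optional


class Uniform(NamedTuple):
    host: str
    port: Optional[int]       # negative for non connectable ports
    transport: Optional[str]  # "c" (TCP), "d" (UDP), "s" (TLS) or None
    type: Optional[str]       # "@", "~", "$" or None
    name: Optional[str]
    channel: Optional[str]


_PSYC = re.compile(
    r"psyc://(?P<host>[^:/#]+)"
    r"(?::(?P<port>-?\d+)(?P<transport>[cds])?)?"
    r"(?:/(?P<object>[^#]*)(?:#(?P<channel>.*))?)?"
)


def parse_uniform(text: str) -> Uniform:
    match = _PSYC.fullmatch(text)
    if match is None:
        raise ValueError("not a psyc uniform: %r" % text)
    kind = name = None
    obj = match.group("object")
    if obj:
        # The optional first character selects the object type.
        if obj[0] in "@~$":
            kind, name = obj[0], obj[1:]
        else:
            name = obj
    port = match.group("port")
    return Uniform(
        match.group("host"),
        int(port) if port is not None else None,
        match.group("transport"),
        kind,
        name,
        match.group("channel"),
    )
```

A missing transport character is reported as `None`, leaving the choice of transport to the caller as described above.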

type

Channel

About the semantics of the channel part of a uniform, see Spec:Entity.



Protocol Keywords

This document describes the keyword naming strategy and syntax as used by PSYC.

All keywords used as method name of a Packet have a special syntax. This syntax allows generic treatment of specialized methods.

ABNF

Keywords consist of ASCII characters and have the following syntax:

  keyword          = 1*short-subkeyword *long-subkeyword
                   =/ 1*long-subkeyword
  
  short-subkeyword = printchar
  long-subkeyword  = "_" 1*alphanumeric
  
  alphanumeric     = %x30-39 / %x41-5A / %x61-7A 
                      ; The ASCII characters 0-9, a-z and A-Z
  printchar        = <see Spec:Packet for definition of printchar>

Examples

This is a long form of a keyword: _reply_error_invalidTarget_noSuchObject. And the short form for it is: reto

Keywords can also be mixed: ret_invalidNamingSyntaxForThisSite.

Short vs. Long Forms

All keywords defined within this specification have a short form and a long form. However, some special or experimental keywords don't have short forms until they are standardized.

All implementations MUST accept both forms for the keywords defined in this specification.

It is recommended to only pass long forms to applications in programmer APIs, so that short forms stay a simple protocol level optimization. Thus, mapping from short forms to long forms should happen right after parsing.

Keyword inheritance

A keyword consists of a hierarchical set of subkeywords. The meaning of the keyword becomes more precise with the number of additional subkeywords appended to its tail.

That means that the receiver of an unknown keyword can strip off subkeywords from the end of the keyword until it recognizes it. With this the receiver can process the packet by only using the known part of the keyword. This implements a form of semantic inheritance that we call keyword inheritance.
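A minimal sketch of this stripping procedure, assuming keywords have already been mapped to their long forms and that the receiver keeps its known keywords in a set. The function name and the `_quietly` extension used in the usage note are hypothetical.

```python
from typing import Optional


def resolve_keyword(keyword: str, known: set) -> Optional[str]:
    """Strip subkeywords from the tail of `keyword` until a known one is found.

    Returns the known keyword the input is derived from, or None.
    """
    while keyword:
        if keyword in known:
            return keyword
        # Drop the last "_subkeyword"; rpartition leaves "" once nothing remains.
        keyword, _, _ = keyword.rpartition("_")
    return None
```

With `known = {"_notice", "_notice_context_enter"}`, an unknown `_notice_context_enter_quietly` resolves to `_notice_context_enter`, the most specific ancestor the receiver understands.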

Method Families

PSYC packets usually have a method, which adheres to keyword naming. A method has to belong to one of the following method families. Each family comes with a default semantic and behavior. Follow the links for documentation of each family.

It is impossible to support all existing methods, because in PSYC it is legal to add custom extensions, derived by inheritance. An application MUST be able to handle all standard method families, and it MUST be able to handle any unknown method by applying inheritance in order to find a known method it is derived from. It MUST also be able to display any text message it carries using the psyctext template rendering, when appropriate.

All packets with a method that belongs to the _request family SHOULD have a _tag attached (see also Spec:Routing-Variables about Tagging).

NOTE: See also _request_context_enter.

Messages and Requests

Information reporting

Failure reports

Data interchange


PSYC Packets

PSYC packets travel the wire either using TCP circuits or by UDP. UDP is typically used for multicast notices to large contexts whose successful reception is not critical. The PSYC packet format is mostly line-based with some exceptions.

ABNF

(See also ABNF)

PSYC packets are byte sequences which have the following syntax:

  packet     = routing-header [ content-length content ] "|" LF
             ; the length of content is either implicit (scan until LF "|" LF)
             ; or explicitly reported in content-length.
  
  routing-header = *routing-modifier
  entity-header  = *sync-operation *entity-modifier
  content        = entity-header [ body LF ]
  content-length = [ length ] LF
  
  routing-modifier = operator variable ( simple-arg / LF )
  sync-operation   = ( "=" LF / "?" LF )
  entity-modifier  = operator variable ( simple-arg / binary-arg / LF )
  
  body       = method [ LF data ]
  
  operator   = "=" / ":" / "+" / "-" / "?" / "!" / <more reserved glyphs>
  simple-arg = HTAB text-data LF
  binary-arg = SP length HTAB binary-data LF
  
  length      = 1*DIGIT
  binary-data = <a length byte long byte sequence>
  
  method     = 1*kwchar
  variable   = 1*kwchar
  text-data  = *nonlchar
  
  data       = <amount of bytes as given by length or until
                the (LF "|" LF) sequence has been encountered>
  
  nonlchar   = %x00-09 / %x0B-FF             ; basically any byte except newline
  kwchar     = <alpha numeric ASCII char or "_">
  
  For the definition of DIGIT, VCHAR, SP, LF and HTAB see RFC 2234 (ABNF).

List syntax

Either text-data or binary-data can contain lists, which adhere to the following syntax (in ABNF):

 list         =  binary-elem *("|" binary-elem)  ; for binary values
              =/ "|" text-elem *("|" text-elem)  ; for visible/non-binary characters
 binary-elem  = length SP binary-data
 text-elem    = *nonlpipechar
 nonlpipechar = %x00-09 / %x0B-7B / %x7D-FF ; any byte except newline and "|"

Either format can appear in either data container! This list syntax is only valid for variables of the _list type that start with _list.
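The two list encodings could be decoded along the following lines. This is a sketch working on raw bytes, with a hypothetical function name; it assumes the value has already been cut out of the packet and does not check that the variable name starts with _list.

```python
def parse_list(value: bytes) -> list:
    """Decode a _list value in either the text or the binary encoding."""
    if not value:
        return []
    if value.startswith(b"|"):
        # Text form: a leading "|" and "|"-separated elements.
        return value[1:].split(b"|")
    # Binary form: each element is "<length> SP <bytes>", elements joined by "|".
    items = []
    pos = 0
    while pos < len(value):
        space = value.index(b" ", pos)
        length = int(value[pos:space])
        start = space + 1
        if start + length > len(value):
            raise ValueError("element length exceeds the value")
        items.append(value[start:start + length])
        pos = start + length
        if pos < len(value):
            if value[pos:pos + 1] != b"|":
                raise ValueError("expected '|' between binary elements")
            pos += 1
    return items
```

Because binary elements are length-prefixed, they may themselves contain "|" without confusing the decoder, which is not true for the text form.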

Example packets

The following examples illustrate the syntax. Consider the names of variables and methods fictitious: some of them are historic, and some, like _list_topic, will probably never exist.

This is a simple example packet:

   |
   :_source            psyc://example.symlynX.com/~fippo
   :_target            psyc://ente.aquarium.example.org:-32872
   
   :_nick  fippo
   _info_nickname
   Hello [_nick].
   |

And this is an example packet that covers most of the BNF rules above:

   |
   :_context           psyc://example.org/@democracynow
   :_target            psyc://ente.aquarium.example.org:-32872
   
   :_list_member       |psyc://example.symlynX.com/~jim|psyc://example.org/~judy
   :_list_topic        9 democracy|3 now
   :_list_image 9213   4404 <binary data>|4798 <binary data>
   :_list_owner 26     |psyc://example.org/~judy
   :_image 4212        <binary data>
   _status_context
   In [_context:_nick]: [_list_member:_nick]
   |

This example uses entity-oriented psyctext. The images could be used to decorate a member list, but the normal approach should be to obtain the member images from the state of each member entity. So this example really serves the purpose of showing several possible encodings of lists and data.

The decision which strategy to pick is left to the implementor. Mainstream server implementations should however choose an application-developer-friendly style, which is yet to be defined.

Another example:

|
:_source        psyc://base.example.org/~k
:_target        psyc://localhost:1234
175
:_color #CC0000
:_nick  k
:_nick_target   psyc://localhost:1234
_message_private
hi there. this message contains NL | NL here:
|
but it doesn't matter because it has length!
|

Content length

Routing header and content are separated by a special line that is either empty or contains the length of the content (in bytes).

If no length was provided, the bytes after the routing-header are parsed as the content rule in the grammar defines. That means that data ends at the first LF "|" LF.

This means that as soon as you use binary data that might contain LF "|" LF in the content you MUST report the content length.

If content length was provided, the given number of bytes is read from the byte stream after the routing-header and then processed as the content rule in the grammar defines, which means that data ends with the end of the content (This makes it possible to transmit arbitrary opaque binary data as the data part of a packet).
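The framing rules above can be sketched as a function that cuts one packet off the front of a byte buffer. This is an illustrative reading of the grammar, not a reference parser: the function name is hypothetical, the buffer is assumed to hold at least one complete packet, and the explicit length is assumed to count every content byte up to, but not including, the final "|" LF.

```python
def split_packet(buf: bytes):
    """Return (routing_lines, content, rest) for the first packet in `buf`."""
    pos = 0
    routing = []
    while True:
        end = buf.index(b"\n", pos)
        line = buf[pos:end]
        pos = end + 1
        if line == b"|":
            # Packet consisting of a routing header only.
            return routing, b"", buf[pos:]
        if line == b"" or line.isdigit():
            break  # the separator line: empty, or the content length
        routing.append(line)
    if line:
        # Explicit length: read exactly that many bytes, whatever they contain.
        length = int(line)
        content = buf[pos:pos + length]
        if len(content) < length:
            raise ValueError("incomplete content")
        pos += length
        if buf[pos:pos + 2] != b"|\n":
            raise ValueError("missing packet terminator")
        return routing, content, buf[pos + 2:]
    # Implicit length: content runs up to the first LF "|" LF.
    if buf[pos:pos + 2] == b"|\n":
        return routing, b"", buf[pos + 2:]
    end = buf.index(b"\n|\n", pos)
    return routing, buf[pos:end + 1], buf[end + 3:]
```

Only the explicit-length branch can carry a body that itself contains LF "|" LF, which is why binary data requires the length to be reported.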

Encoding

Variable names and methods are ASCII encoded strings while the contents of body or the arguments of variables are kept as UTF-8 unless specified otherwise.

Variables

There are two kinds of variables: routing variables and entity variables.

Routing variables can be persisted, that is, set in the persistent variable set with the '=' operator (or '+', '-'; see below), and modified during the lifetime of the circuit, which makes this a simple mechanism for protocol compression. Context entities, in turn, may persist entity variables with the '=' modifier for all their members to keep for the entire lifetime of the context, which makes it a decentralized storage vehicle.

Each context has its own set of persistent entity variables. If the _context routing variable is NOT set, persistent entity variables MUST NOT be changed. This means that modifiers which change persistent entity variables can only be used when _context is set, and thus updating persistent entity variables can only be done by a context.

Should a modifier change a persistent entity variable while _context is not set, the violation SHOULD be answered with a _failure_unsupported_state_persistent error packet and the circuit MAY be terminated.


NOTE: In theory also unicast entity communications between a _source and a _target could each define a set of persistent variables. Such entity state (as opposed to context state) is however currently not supported as it raises storage requirements of PSYC implementations more than it is likely to prove useful. It is reserved for possible future use.

Out-of-context communication may however still refer to persistent variables from in-context communication in its psyctext template.

Each packet defines a set of current variables which may be different from the persistent set of variables. When passing the variables to an application, the programming interface SHOULD merge current routing and entity variables into a single structure.


routing and entity modifiers

Each packet comes with a set of routing modifiers and entity modifiers. The routing modifiers belong to the routing-header and are separated by a newline from the entity modifiers (entity-modifier).

The routing modifiers modify the current and persistent routing variables. The entity modifiers modify the current and persistent entity variables.

This means that after routing modifiers (routing-mods) have been processed the persistent variables for the sending entity need to be loaded (persistent context slave) before the process of handling content is started (entity-modifier).

Recommendation: The modification of incoming and outgoing routing variables should be done by the circuit whereas entity variable modifications are generated by the entities and received by context slaves.

Current and persistent variable handling

When a packet is being parsed, the modifiers modify the set of current and persistent routing and entity variables. To do so, the set of current variables is initialized by the set of persistent variables before the modifiers are applied.

After the packet has been processed, the current routing and entity variables are the significant variables that belong to the instance of the parsed packet.

NOTE: You have to make sure that you don't apply the changes to the persistent routing variables until the whole routing header has been parsed. The same applies to the entity variables.

Operators

  • = – The variable is modified in the set of current variables and the set of persistent variables. If no variable name is provided the persistent variables and current variables are deleted.
  • : – The variable is only modified in the set of current variables.
  • + – For _list variables, the elements in the modifier argument are appended to the persistent variables and the current variables. Further types MAY define custom uses of this modifier.
  • - – For _list variables, the elements in the modifier argument are removed from the persistent variables and the current variables. Further types MAY define custom uses of this modifier.
  • ? – State sync request.
  • !$@%&*/#;, – Reserved for as yet undefined state operations.
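The handling of current and persistent variables under these operators could be sketched as follows. The function name and the `(operator, name, value)` tuple layout are hypothetical, list values are assumed to be already decoded into Python lists, and '?' and the reserved glyphs are left out.

```python
import copy


def apply_modifiers(persistent: dict, modifiers: list):
    """Apply modifiers and return (current, new_persistent).

    `persistent` itself is never mutated, so the caller can commit
    `new_persistent` only once the whole header has parsed cleanly.
    """
    new_persistent = copy.deepcopy(persistent)
    # Current variables start out as a copy of the persistent ones.
    current = copy.deepcopy(persistent)
    for op, name, value in modifiers:
        if op == ":":
            current[name] = value
        elif op == "=" and not name:
            # A lone "=" deletes both sets.
            current.clear()
            new_persistent.clear()
        elif op == "=":
            current[name] = value
            new_persistent[name] = value
        elif op == "+":
            for variables in (current, new_persistent):
                variables[name] = list(variables.get(name, [])) + list(value)
        elif op == "-":
            for variables in (current, new_persistent):
                variables[name] = [x for x in variables.get(name, []) if x not in value]
        else:
            raise ValueError("unsupported operator: %r" % op)
    return current, new_persistent
```

Returning a fresh persistent set instead of editing in place is one way to satisfy the note about not committing changes before the header has been parsed completely.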

Keyword Naming

The ASCII strings, denoted by the non-terminals variable and method, have to adhere to the keyword naming specification.

MIME Content Type

The content type for complete PSYC packets themselves is message/x-psyc (uncaring of the content-type of the data contained within). It needs to be encapsulated in an 8-bit transparent way, as it may contain binary data.

Example

Implementation Note

The page about "Parse" gives practical instructions on how to write a PSYC parser.

NOTE: When an invalid packet is received, and the routing header has already been parsed successfully, and it has been processed partially, it may or may not have effected state changes before being dismissed as invalid. Thus, after receiving an invalid packet (which means, there are syntax errors in the content part of the packet), the current state data of the addressed context MUST be invalidated.

NOTE: With the new syntax strictly expecting LF between lines, no longer accepting CRLF like the old syntax, you can no longer use telnet for testing purposes. Please obtain netcat (nc command) or similar.


Routing variables

FIXME: _time (or was it _date?) family which typically contain the timestamps of earlier multicast messages need to move into routing as well, since routers need to know if they are retransmitting an earlier packet, or are doing a fresh multicast.

These are the basic variables defined by the routing layer of PSYC. How these variables interact in unicast and multicast packet routing can be read on Spec:Routing.

  • _source - The source entity uniform the packet comes from.
  • _target - The recipient entity.
  • _context - The context (a chatroom, a newsroom, a presence subscription or other group or channel).
  • _source_relay - The original sender of a message. E.g. set to the participant's uniform when a group relays messages to its participants.
  • _source_identity - Contains the uniform of the person entity the _source of the packet is linked to. This is used for example when a client wants to speak for its person entity, e.g. psyc://localhost:-21345 wants to speak in the name of psyc://goodadvice.pages.de/~heldensaga. See also Spec:Person for more details about the semantics.
  • _target_relay - The original target of a message. Used when a client sends a message using the Identity to which it is linked as a relay. See the client interface for more information.
  • _tag - A tag that should be referred to when this message is replied to or forwarded. The tags can be used to associate replies with their original requests. (Implementations might use callbacks to signal a response to a request).
  • _tag_relay - The original tag of the message this message is a reply to or forward of.

During the process of routing keyword inheritance is not in effect. A _source_relay MUST NOT be treated as if a _source was given. Inheritance is to be applied to routing variables only in the entity layer.

Tagging


The routing variables _tag and _tag_relay are used to implement a special behaviour called Tagging in PSYC.

It is mostly used to map a response back to its request and gives the sender a way to easily figure out which replies were triggered by packets it sent and thus handle responses asynchronously.

When an entity receives a packet and intends to send a response to it, it MUST copy the contents of the _tag variable to the _tag_relay variable of the response.

The content of _tag is opaque and can be an arbitrary value. It is strongly recommended to generate globally unique ids for tags.

Should the routing layer of any involved router encounter an error while delivering a tagged message and intend to generate an error message, it SHOULD copy the tag to a _tag_relay variable in the error report.
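A short sketch of the tagging rule: the requester picks a unique tag, and the responder copies it into _tag_relay. The function names are hypothetical, routing variables are modelled as a dictionary, the use of a random UUID is just one way to obtain a practically unique tag, and addressing the reply to the request's _source is an assumption made for the illustration.

```python
import uuid


def new_tag() -> str:
    """Generate an opaque tag; a random UUID is one way to make it unique."""
    return uuid.uuid4().hex


def reply_routing(request: dict, own_uniform: str) -> dict:
    """Build the routing variables of a response to `request`."""
    reply = {"_source": own_uniform, "_target": request["_source"]}
    if "_tag" in request:
        # The response MUST carry the request's tag in _tag_relay.
        reply["_tag_relay"] = request["_tag"]
    return reply
```

A client can keep a dictionary from tag to callback and dispatch incoming packets on their _tag_relay, which is what makes asynchronous handling of responses straightforward.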


Entity variables

These are popular variables defined by the entity layer of PSYC.

More variables are defined by each method and entity.


State Synchronization

There is a set of entity variables associated with each context. They are called the state.

Contexts update their state via the =, + and - operators.

To synchronize with a context, one has to send it a '?' operator on a line by itself, with the _target being the uniform of the context. You MAY combine this with a method, or leave the method out. The meaning of '?' is independent of it.

 |
 :_target     psyc://psyced.org/~elmex#friends
 
 ?
 |

The answer from the state is a state reset packet:

 |
 :_context   psyc://psyced.org/~elmex#friends
 :_target    psyc://127.0.0.1:-3234/
 
 =
 =_list_members    |psyc://psyced.org/~lynX|psyc://hancke.name/~fippo
 |

Notice the leading lonely '=', which tells the _target to reset the saved state of the _source.

Again, you MAY provide a method along with this packet, should you want to combine the meaning of '=' with the meaning of such method.

Further updates of the state are either sent explicitly, or the changes are transmitted along with regular context packets, such as _notice_context_enter and _notice_context_leave.

Messaging

The original purpose of PSYC is to transport messages from people to people. The _message method family is used to represent messages in one-to-one and one-to-many conversations. They are used to deliver human generated text messages to humans.

Message types

There is a basic set of message types:

  • _message - A conversational message.
  • _message_echo - Contains the echo of a _message, which was originally sent by an entity.

Content Type

Messages can have special content types. The content type of a message is given in the _type_content variable.

The default _type_content for _message is text/plain as defined by the MIME standard, whereas all other PSYC methods by default have a content type of text/x-psyc.

All of these content types use LF for line endings like HTTP, not CRLF like MIME does. In fact the protocol as such uses LF for its line endings, even if encapsulated into MIME.

text/x-psyc

PSYC has a template syntax with the MIME type text/x-psyc, also known as psyctext. It allows errors, warnings and other automatically generated messages to be readable for both humans and automatons. The variable names in brackets SHOULD be replaced by the actual content of the variable when presented to the user.

Example:

       :_method        i
       _error_unsupported_method
       No such method '[_method]' defined here.
       |

This should display as

       No such method 'i' defined here.

When no match is found, like in the case of a template containing array[i++] where a variable called i++ does not exist, the string remains unchanged. This is not an error.
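The substitution rule, including the behaviour for unmatched brackets, can be sketched as below. The function name is hypothetical, and the sketch covers only plain variable names; entity-oriented forms such as [_context:_nick] are simply left untouched unless a variable of exactly that name exists.

```python
import re

# Anything between a pair of brackets is treated as a candidate variable name.
_PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def render_psyctext(template: str, variables: dict) -> str:
    """Replace [name] by the variable's value; leave unknown names unchanged."""
    def substitute(match):
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        return match.group(0)  # no such variable: not an error
    return _PLACEHOLDER.sub(substitute, template)
```

Leaving unmatched brackets as they are means ordinary text that happens to contain brackets passes through the renderer unharmed.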

Warnings

Messages of the _warning method family inform the sender of a problem, but not as grave as an _error or a _failure. It means the sender's request has been processed, but maybe not as intended.

The most common variant of _warning is _warning_usage. It returns usage syntax of manual commands. It is customary to have a derivate of this for each command, as in _warning_usage_mandate.

Errors

Messages of the _error method family inform the sender of a problem on the sender's side.

Standard derivate families of _error are:

  • _error_duplicate – something has been provided twice or more.
  • _error_illegal – something is not legal.
  • _error_invalid – something is not valid.
  • _error_necessary – something has not been provided.
  • _error_unavailable – something is not available.
  • _error_unknown – something does not exist or isn't known.

Failures

Messages of the _failure method family inform the sender of a problem on the recipient's side. A failure is the inability to accomplish a task (a request or other) on the side of the source issuing the _failure.

A classic example would be _failure_filter_strangers indicating that your message cannot be delivered to the recipient as she has denied reception of messages from strangers.

Standard derivate families of _failure are:

  • _failure_deliver – could not deliver to destination.
  • _failure_necessary – something needs to be provided on recipient's side.
  • _failure_redirect – could not accomplish request because something has moved elsewhere.
  • _failure_unavailable – something is not available although it should be.
  • _failure_unsupported – something is not implemented although it should be.


Circuit Definition

A circuit is a virtual connection between two PSYC nodes. Packets between any two entities that are leaves of the nodes are sent over the circuit. There should normally be only one circuit between two nodes.

The circuit provides:

  • bidirectional reliable message delivery
  • verification of sender address

In addition, it may provide

  • encryption and compression between the root entities

Overview of packet delivery process

TODO: This specification does not discuss relaying, that is when a sender trusts an intermediate node to forward the message to a destination, and the intermediate node has a reason to trust the sender in order to implement such relaying operation. Since even simple notification applications depend on this behaviour, it should be documented in the Spec somewhere, and taken into consideration here:

When a packet is sent to a non-local address, the following steps happen:

  • the packet is queued until delivery
  • the _target uniform is parsed and the hostname/port is extracted
  • the hostname is resolved to obtain an IP address
  • if there is no circuit to that IP address, a new circuit is opened
  • if a trusted public key or X.509 certificate for the host is available, a TLS connection is initiated on the circuit
  • when either type of connection is established, an empty packet is sent by the initiator, this establishes the circuit
  • after opening an unencrypted circuit (or when reusing an unencrypted circuit), the initiator requests sender address verification from the accepting side. In the same step, it may require additional verification information from the accepting side
  • the accepting side attempts to verify the sender address of the initiating side and answers the verification request for the root _source and _target addresses
  • the initiating side receives the verification message. If the result is positive, the initiating side may start its own verification process, if it requested and required the accepting side to provide further information
  • the initiating side begins to push any queued packets.
  • if the accepting side receives any non circuit establishment packet from the initiating side, it may conclude that the initiating side has successfully verified the sender address of the accepting side and may start to use the circuit in the reverse direction.
TODO: Specification on when and how errors are returned to sender and queue is dismantled (Check the source, Luke)...
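
The second step above (parsing the _target uniform into hostname and port) can be sketched as follows. This is only an illustration, not a normative parser: the names Uniform and parse_uniform are hypothetical, IPv6 literals are not handled, and the negative port numbers described under "Circuit endpoints" are accepted as plain integers.

```python
from typing import NamedTuple, Optional


class Uniform(NamedTuple):
    """Hypothetical container for the parts of a uniform."""
    scheme: str
    host: str
    port: Optional[int]  # None when the uniform carries no port
    resource: str        # everything after host[:port]; "" for the protocol root


def parse_uniform(uniform: str) -> Uniform:
    """Split a uniform such as psyc://example.org:4404/@place into its parts."""
    scheme, sep, rest = uniform.partition("://")
    if not sep or not scheme:
        raise ValueError(f"not a uniform: {uniform!r}")
    # The host[:port] part ends at the first "/", if there is one.
    slash = rest.find("/")
    authority = rest if slash < 0 else rest[:slash]
    resource = "" if slash < 0 else rest[slash:]
    host, colon, port_text = authority.rpartition(":")
    if not colon:
        host, port = authority, None
    else:
        port = int(port_text)  # int() also accepts a leading minus sign
    if not host:
        raise ValueError(f"uniform without host: {uniform!r}")
    return Uniform(scheme.lower(), host.lower(), port, resource)
```

With this sketch, psyc://psyced.org yields an empty resource (the protocol root), while psyc://psyced.org/ yields "/" (the root entity).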

Hostname resolution

When a PSYC uniform has to be mapped to a host and port pair and no port is given in the uniform, a DNS SRV lookup SHOULD be performed to retrieve the real host and port number. The lookup is performed on _psyc._tcp.hostname (See also: RFC2782).

NOTE: The DNS SRV lookup SHOULD NOT return more than one host:port pair, as PSYC currently does not support mapping of hostnames to multiple IPs. Multiple hostnames of a PSYC node SHOULD NOT resolve to multiple ip:port combinations. These things have to be taken care of by the administrators of the PSYC nodes at the moment.

When UDP is selected as a transport, _psyc._udp.hostname MAY be consulted.

If the SRV lookup fails, or if a port number was provided in the uniform, a forward resolution of the hostname's A/AAAA or CNAME records is performed.
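
The lookup order described so far can be sketched as below. The Python standard library has no SRV resolver, so the sketch takes one as a parameter; srv_lookup, addr_lookup, resolve_endpoint and system_addr_lookup are hypothetical names, and the default port 4404 is an assumption that this section does not state explicitly.

```python
import socket
from typing import Callable, Optional, Tuple

DEFAULT_PORT = 4404  # assumption: the default PSYC port is not given in this section

SrvLookup = Callable[[str], Optional[Tuple[str, int]]]


def system_addr_lookup(host: str) -> str:
    """Forward A/AAAA resolution through the operating system resolver."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    return infos[0][4][0]


def resolve_endpoint(
    host: str,
    port: Optional[int],
    srv_lookup: SrvLookup,
    addr_lookup: Callable[[str], str] = system_addr_lookup,
) -> Tuple[str, int]:
    """Map a hostname and optional port to an (ip, port) pair."""
    if port is None:
        # No port in the uniform: try the SRV record first.
        srv = srv_lookup(f"_psyc._tcp.{host}")
        if srv is not None:
            host, port = srv
        else:
            port = DEFAULT_PORT  # SRV lookup failed, fall back
    # An explicit port skips SRV; either way a forward lookup follows.
    return addr_lookup(host), port
```

Passing the resolvers in as callables keeps the decision logic testable without network access; a real node would plug in its own asynchronous DNS client.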

The IP address and port resulting from the resolution process may point to an already existing circuit. That circuit SHOULD be reused. If the root of the uniform hasn't already been verified as being a valid _target for that circuit, a _request_authorization is necessary.

An implementation may let the user circumvent the SRV lookup by explicitly overriding the physical host and port number. This is different from requiring the user to provide a different uniform that contains the port number.

A minimal application MAY leave out DNS resolution altogether, if the implications are understood and accepted. Best practice for SRV usage in 2008 is to support it in all PSYC applications, but to not deploy SRV usage in actual installations, if possible.

Circuit endpoints

The endpoints of a circuit are defined by the two IP addresses and port numbers they are connecting.

Should no public port number be available, the receiving peer port number is used, prefixed by a minus sign (thus, a negative port number). Such an endpoint is only routable while the connection is established. After that, any message to be routed to an address with a negative port number should trigger an _error_network_connect_invalid_port error.

This means that a PSYC node that wants to deliver a packet for host A has to resolve its IP address and check whether it already has a circuit to that host.

If a circuit exists, but hostname A hasn't been authorized on it yet, a _request_authorization request needs to be issued.

If no circuit exists yet, a new circuit has to be established.
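
The three cases above can be sketched as a small lookup table keyed by endpoint. CircuitTable and the returned step names are hypothetical and only illustrate the decision; they are not part of the protocol.

```python
from typing import Dict, Set, Tuple

Endpoint = Tuple[str, int]  # (ip address, port)


class CircuitTable:
    """Tracks open circuits and the hostnames authorized on each of them."""

    def __init__(self) -> None:
        self._authorized: Dict[Endpoint, Set[str]] = {}

    def opened(self, endpoint: Endpoint) -> None:
        """Record that a circuit to this endpoint has been established."""
        self._authorized.setdefault(endpoint, set())

    def authorized(self, endpoint: Endpoint, hostname: str) -> None:
        """Record a positive answer to _request_authorization for hostname."""
        self._authorized[endpoint].add(hostname)

    def next_step(self, endpoint: Endpoint, hostname: str) -> str:
        """Decide what has to happen before a packet for hostname can be sent."""
        hosts = self._authorized.get(endpoint)
        if hosts is None:
            return "open_circuit"
        if hostname not in hosts:
            return "request_authorization"
        return "send"
```

Keying the table by IP address and port rather than by hostname mirrors the endpoint definition above: two hostnames that resolve to the same endpoint share one circuit but are authorized separately.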

TCP

Connection establishment

Immediately after a TCP or TLS connection is established, the initiator sends an empty greeting packet (consisting of a pipe and a line feed, aka "|\n"). This establishes a circuit. The initiator should not wait for the accepting side to reply to this packet as explained below, but may proceed to request circuit features from the accepting side, or to transmit content that doesn't require any particular circuit features. The accepting side must reply to the empty greeting packet with an empty packet of its own, to acknowledge the establishment of the circuit.

Sender address verification

Rationale
Sender address verification serves the purpose of giving a recipient server time to check the validity of a sender (traditionally that would be an asynchronous DNS request, but in future it could be a synchronous calculation of trust metrics) before allowing him to send. Why do this? Because an attacker might send large packets that the recipient would need to queue until she knows they are invalid, creating an opportunity for a denial of service attack. There may be circumstances that make this level of paranoia unnecessary. Also, the receiving server could tell the sending server by which names she already expects him to send (Hello foo.bar, foo.baz, bar.tender).

The protocol flow is as follows: The initiating side sends a tagged _request_authorization with the entity variable _uniform_source containing a single source root uniform that the sender wants to get authorization for. The _uniform_target entity variable contains a single target root uniform that the initiator presumes to be hosted on the recipient server.

|
:_source
:_target
:_tag sometagasexplainedinexplanationofrequestresponsetagging

:_uniform_source psyc://example.org
:_uniform_target psyc://example.com
_request_authorization
|

This message may occur at any time. First, the receiving side checks if the root uniform given in the _uniform_target entity variable is indeed hosted on it. If that is not the case, it sends back an _error_invalid_uniform_target as follows:

:_source
:_target
:_tag_relay sometagasexplainedinexplanationofrequestresponsetagging

:_uniform_source psyc://example.org
:_uniform_target psyc://example.com
_error_invalid_uniform_target
|

The receiving side additionally verifies that the root uniform contained in the _uniform_source variable is permitted to send on that circuit. The preferred method for this is that the initiating side has provided a trusted certificate which contains the identity; another method is a reverse DNS lookup that takes into account the existence of SRV records. If that check fails, the receiving entity answers with an _error_invalid_uniform_source as follows:

:_source
:_target
:_tag_relay sometagasexplainedinexplanationofrequestresponsetagging

:_uniform_source psyc://example.org
:_uniform_target psyc://example.com
_error_invalid_uniform_source
|

In both cases, the _uniform_source and _uniform_target entity variables should be sent back as-is.

If both _uniform_source and _uniform_target are considered acceptable, the receiving entity sends back a confirmation:

|
:_source
:_target
:_tag_reply sometagasexplainedinexplanationofrequestresponsetagging

:_uniform_source psyc://example.org
:_uniform_target psyc://example.com
_status_authorization
|
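
The order of the checks on the receiving side can be sketched as follows. The function name answer_authorization and the source_is_permitted callback are hypothetical; the callback stands in for whichever method (certificate or reverse DNS) the node uses. Tag handling is left out on purpose.

```python
from typing import Callable, Dict, Set


def answer_authorization(
    request: Dict[str, str],
    hosted_roots: Set[str],
    source_is_permitted: Callable[[str], bool],
) -> Dict[str, str]:
    """Pick the reply method for a _request_authorization.

    Both uniforms are echoed back exactly as received.
    """
    source = request["_uniform_source"]
    target = request["_uniform_target"]
    if target not in hosted_roots:
        method = "_error_invalid_uniform_target"
    elif not source_is_permitted(source):
        method = "_error_invalid_uniform_source"
    else:
        method = "_status_authorization"
    return {
        "method": method,
        "_uniform_source": source,
        "_uniform_target": target,
    }
```

The target check comes first because it needs only local knowledge, whereas the source check may involve a certificate or DNS lookup.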

Note: if a client sends a packet to the accepting side, the accepting side may consider the hostname portion of _target to be verified. It may also consider the hostname of each target to be acceptable.

TODO: In case the side the initiator connects to is itself already connected, it answers with a _failure_redirect and reports in _source_redirect the actual uniform of the destination, to which the initiator may already be connected. The initiator then first has to check whether it already has a connection there, and repeat the _request_verification for the target host (the one that just redirected it) on that connection, using _request_verification and _list_targets (but with an empty _list_sources, in case that has already been negotiated there).

Encryption using TLS


Connection establishment

Immediately after establishing the TCP connection, the initiating side enables TLS. A TLSv1 client handshake MUST be sent. Backward compatibility to clients sending SSLv2 client hellos is only permitted on connections to leaves, however SSLv2 MUST NOT be negotiated. The initiating entity should use the TLS server name indication specified in RFC 4366. This may even be done on the default PSYC port, as the receiving side can use the following heuristic to determine if the initiating side is using TLS: The incoming connection is considered to be TLS if the first received byte is 0x16.
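
The heuristic in the last sentence amounts to a one-byte test. A minimal sketch, with looks_like_tls as a hypothetical name:

```python
TLS_HANDSHAKE_RECORD = 0x16  # first byte named by the heuristic above


def looks_like_tls(first_bytes: bytes) -> bool:
    """Return True if the first received byte marks a TLS connection."""
    return len(first_bytes) > 0 and first_bytes[0] == TLS_HANDSHAKE_RECORD
```

An accepting node could obtain the byte without consuming it, for example with sock.recv(1, socket.MSG_PEEK), and only then decide whether to wrap the socket in TLS. A plain circuit starts with the greeting packet "|\n" instead, so its first byte is 0x7c.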

After the handshake the initiator and the recipient MUST also check the TLS certificates. If they are not valid and trusted, either side MAY break the connection. In case of an untrusted certificate, the certificate SHOULD be logged and the administrator SHOULD be notified of the failed communication attempt. See RFC 6125 for BCP regarding this check.

Note: the usage of a TLS-DHE ciphersuite with ephemeral Diffie-Hellman parameters is recommended.

Requesting a circuit feature

Same as for TCP, except that the client should not offer the _compression module, as compression is already handled by TLS.

Recipient handling of a feature request

Same as for TCP, except that the server should ignore the _compression module if the client offers it.

Sender address verification

Sender address verification relies on X.509. The sender may use any host names contained in the subjectAltName/dNSName extension. The same applies for the receiving side. Use of a host name not listed MUST lead to an immediate termination of the circuit. DNS and its associated verification methods MUST NOT be used.
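
A minimal sketch of the host name check, assuming the certificate is available in the dictionary form returned by Python's ssl.SSLSocket.getpeercert(). The function names are hypothetical, and only exact, case-insensitive matches are accepted, since wildcard handling is not discussed here.

```python
from typing import Any, Dict, Set


def cert_dns_names(cert: Dict[str, Any]) -> Set[str]:
    """Collect the dNSName entries of the subjectAltName extension."""
    return {
        value.lower()
        for kind, value in cert.get("subjectAltName", ())
        if kind == "DNS"
    }


def host_permitted(cert: Dict[str, Any], host: str) -> bool:
    """True if the peer may use this host name on the circuit."""
    return host.lower() in cert_dns_names(cert)
```

When host_permitted returns False for a host name the peer tries to use, the rule above requires the circuit to be terminated immediately.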

Canonical uniform

A circuit may be initiated by connecting to a uniform which isn't the actual canonical uniform of the recipient. For example connecting to psyc://example.org:4404 may actually be answered by an entity that carries psyc://example.net as its canonical name. The canonical name obviously has to be checked for validity.

Should this procedure have created two circuits to the same node, it makes sense to dismantle one again.

Outgoing uniforms (as typed by a user or clicked on as a link) should not be stored persistently, except to cache the knowledge of which actual circuit they belong to, in case the user insists on using them. Entities should always perform a subscription procedure before communicating, therefore they will always be directed to the canonical uniform before completing the subscription procedure.

Canonical uniforms may be deprecated at a later time. In this case they should issue permanent redirects to the new canonical uniform. See the paragraph on redirection.

As a simplification in host authorization we presume that there are no malicious alternate nodes on the same host as a legitimate node that are willing to impersonate that legitimate node. That is also why processes on localhost are usually given full trust. Thus, any process on a host can produce messages in the name of other entities residing on that host.

Still, there can be several nodes on the same host using different canonical uniforms (different domains or even just different port numbers) as long as they co-exist peacefully.


Routing

When an entity sends a packet to another entity the destination root entity has to decide where to put the packet next. Either it relays it to a remote entity or delivers it to a local one.

Where packets come from and where they should go to is mainly defined by the three routing variables _source, _target and _context.

Contents of _source, _target and _context

The routing variables can contain any uniform scheme. PSYC nodes SHOULD NOT assume that only psyc: scheme uniforms appear in these variables.

When the uniform scheme matches the psyc: scheme routing is performed as specified in this section.

When the scheme of the uniform in _target is not known the PSYC node MAY choose to forward the packet to a locally connected gateway. When receiving packets from a gateway _source and _context can also contain unknown uniform schemes.

NOTE: The psyc: scheme allows for more fine-grained routing decisions in the PSYC node and the PSYC network in general. This means that PSYC nodes know how to resolve PSYC uniforms to other hostnames and ports and can establish circuits to them to deliver the packet. Other schemes usually only travel between gateways and the PSYC server node they are connected to, no further resolution of the uniforms is done in this case.

Overview of basic routing operations

The basic operations of routing are defined by the presence or absence of the three _source, _context and _target variables. The various multicast strategies build on top of that. So let's first look at the important constellations, then open up a window to future PSYC developments.

Regular Unicast Messages

A regular unicast message uses _source and _target, no _context.


A message with only a _source is a message to the root entity of the receiving node. (Typically a user requesting the number of logged-in users or similar.)

A message with only a _target is a message coming from the root entity of the sending node. (Typically a short-lived script or automation like wikinotify, which sends some event notification to a given recipient.)

NOTE: Setting persistent entity variables is invalid for unicast messages in current PSYC. This means you MUST NOT try to apply modifiers (see also Spec:Packet) that modify end-to-end entity variables (e.g. with the = operator) in messages without a _context set. Supporting this special case would cause plenty of extra implementation complexity with little advantage, and is therefore reserved for a potential future use. Templates MAY still refer to persistent variables of the entity's context, however.

Multicast Messages

A multicast message comes from a context (for example a chatroom, a user's own presence multicast or some other kind of context) and its recipients are defined by the context subscription mechanism. Such a message has only the _context routing variable set. The original sender is conveyed in _source_relay.

These messages have to be multicast to all members of a context. The general rule is: A multicast message SHOULD travel a physical Spec:Circuit only once.

For this to succeed a PSYC node needs to keep track of who is a member of either a local or a remote context. There is a special case with person entities.
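
The rule that a multicast message should travel each circuit only once can be sketched by grouping members by the circuit they are reached through. plan_multicast and its input mapping are hypothetical; how a node learns which circuit serves which member is not covered by this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]  # (ip address, port) identifying a circuit


def plan_multicast(member_circuits: Dict[str, Endpoint]) -> Dict[Endpoint, List[str]]:
    """Group context members by circuit so each circuit gets one copy."""
    per_circuit: Dict[Endpoint, List[str]] = defaultdict(list)
    for member, circuit in member_circuits.items():
        per_circuit[circuit].append(member)
    return dict(per_circuit)
```

The sending node emits one packet per key of the result; the node at the other end of the circuit is then responsible for fanning it out to its local members.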

NOTE: The term for the manager of context multicasts in psyced is context slave.


Unicast Message for State Signaling

A message with _target and _context is a unicast from the place to a single member. The affected state however is the state of the self-sending _context. This is for example used to bring a new place member's state up to speed with the current conversation.

Future Multipeer Multicasting and Signaling

  • A message with _source and _context is reserved for future use (It will probably be used for peer-to-multicast messages, as defined in a future PSYC version).
  • A message with all three routing variables _source, _context and _target set is also reserved for future use (It will probably be a unicast from _source to _target, which configures the state between _source and _context. In a peer-to-multicast scenario this is a message that each sending participant unicasts to a new member of the group, given he finds his outgoing state important enough to do that. Other solutions to achieve the same result would be to reassign the state with the next posting, or not to use much state at all.)

Routing Variable Table

The same information again in tabular form (x means the variable is set, - means it is absent).

_context  _source  _target  ... and what to do with it
-         -        -        unicast from root to root
-         -        x        unicast from root to target
-         x        -        unicast from source to root
-         x        x        unicast from source to target
x         -        -        multicast from context to members
x         -        x        unicast from context to single member
x         x        -        invalid: multicast from source to context members
x         x        x        invalid: unicast from source to single member

All invalid configurations are reserved for future use and are invalid only in the current PSYC version.
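
The eight combinations can be expressed as a lookup keyed by which routing variables are present. In this sketch, classify_routing and the dictionary form of the routing header are hypothetical, and a variable with an empty value is treated as absent.

```python
from typing import Dict, Tuple

# Key: (_context set, _source set, _target set)
ROUTING_TABLE: Dict[Tuple[bool, bool, bool], str] = {
    (False, False, False): "unicast from root to root",
    (False, False, True): "unicast from root to target",
    (False, True, False): "unicast from source to root",
    (False, True, True): "unicast from source to target",
    (True, False, False): "multicast from context to members",
    (True, False, True): "unicast from context to single member",
    (True, True, False): "invalid: multicast from source to context members",
    (True, True, True): "invalid: unicast from source to single member",
}


def classify_routing(routing: Dict[str, str]) -> str:
    """Name the routing operation implied by the routing variables."""
    key = (
        bool(routing.get("_context")),
        bool(routing.get("_source")),
        bool(routing.get("_target")),
    )
    return ROUTING_TABLE[key]
```

A node would typically reject or ignore packets whose classification starts with "invalid", since those combinations are reserved.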

To support these routing rules we define some helper terms like logical source in Spec:Glossary.

Relaying

Some PSYC nodes (usually servers) or entities may provide the ability to relay packets.

Relaying MAY be done by servers that receive packets whose _target variable doesn't address a local entity. In that case the root entity relays the packet.

When a packet is relayed its original _source is moved to the _source_relay variable. The uniform of the relaying entity (either the root entity of a server or some other relaying entity) is then provided in the _source field.
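
The rewriting of the routing variables during relaying can be sketched as follows. relay_routing is a hypothetical helper that works on a dictionary of routing variables and leaves the original untouched.

```python
from typing import Dict


def relay_routing(routing: Dict[str, str], relay_uniform: str) -> Dict[str, str]:
    """Return the routing variables of the relayed packet."""
    relayed = dict(routing)  # do not modify the caller's dictionary
    if relayed.get("_source"):
        # The original sender moves to _source_relay.
        relayed["_source_relay"] = relayed["_source"]
    # The relaying entity becomes the new _source.
    relayed["_source"] = relay_uniform
    return relayed
```

If the incoming packet carries no _source (it came from a root), the sketch adds no _source_relay; whether a relay should fill in the sending root instead is not specified here.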

NOTE: Usually PSYC roots limit relaying to trusted entities and sources, like for example Spec:Circuits from localhost.


Multicast Contexts

PSYC implements multicasting to deliver messages, events and information in general from one to multiple recipients. Contexts provide a semantic model to provide this functionality. Entities usually provide one or more contexts, which other entities can subscribe to.

Subscribing to an entity's context means that you want to receive messages and events related to that entity.

Subscribing to a context (a multicast sender) whose data interests you is a fundamental operation. To protect yourself from SPIM, it is necessary that nearly all communication is established by mutual agreement. Before a sender can multicast information to you, you shall request entry, and it shall grant it to you.

Concepts

  • A subscriber (an entity) would like to enter a context to receive its updates.
  • Entities like persons and places can manage multiple contexts in order to multicast group communication, presence data or any other kind of updates or many-to-many exchanges.
  • Contexts can be organized in a hierarchical manner (we speak of subcontexts or channels in that case). You can read about Channel Inheritance for more information on this.
  • Multicast is the one-to-many routing operation of PSYC with optimized distribution, see also Spec:Routing.

An entity stays subscribed until one of the following happens:

  • The entity unsubscribes
  • A ping to it goes unanswered
  • The circuit is lost or the remote server reports that the recipient could not be reached
  • The context decides to remove (unsubscribe/kick) the entity for arbitrary reasons

Depending on the needs of the context, the context MAY choose to mark the recipient as currently unreachable (instead of instantly unsubscribing her) and remove her from the multicast tree, which still allows it to:

  • retry some time later
  • simplify the repair process if the person wants to return – not have to go through subscription process again, very useful for friend management that might otherwise need user interaction
  • redirect to a different UNI or UNL, if more than one was provided at subscription time.

Context interface

This section describes the interface of a context. Entities can offer this interface to let other entities subscribe to them.

Subscriber Requesting Entry

The following request methods are the basic set of methods that an entity can use to request becoming part of a context.

Subscribing to a context

This requests the membership of a context.

Required Variables in the packet:

  • _target — the entity you want to subscribe to
  • _tag — a tag identifying your request
  • _source — optional, your UNL

Method name: _request_context_enter

NOTE: As PSYC nodes know which contexts their entities are subscribed to, they usually also know when they need to synchronize with the state of a newly entered context. To accomplish this, it is recommended that the PSYC node adds a "?" modifier to the outgoing _request_context_enter that was initiated by the entity wanting to enter. This is only necessary when the node isn't already hosting other members of the same context; if it is, it already holds a copy of the current state.


Example

A client requesting subscription:

  |
  :_target        psyc://server.tld/@place
  :_tag           284232
  
  _request_context_enter
  |
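
A request like the one above could be produced by a small rendering helper. This is a sketch that only mirrors the layout of the examples in this document: render_packet is a hypothetical name, a single space is used between variable name and value, and the authoritative syntax rules (Spec:Packet) are not reproduced here.

```python
from typing import Dict


def render_packet(
    routing: Dict[str, str],
    entity: Dict[str, str],
    method: str,
    body: str = "",
) -> str:
    """Render routing variables, entity variables, method and body as text."""
    lines = []
    for name, value in routing.items():
        # A routing variable without a value is written as its bare name.
        lines.append(f":{name} {value}" if value else f":{name}")
    lines.append("")  # empty line separates the routing header from the rest
    for name, value in entity.items():
        lines.append(f":{name} {value}")
    lines.append(method)
    if body:
        lines.append(body)
    lines.append("|")  # packet terminator
    return "\n".join(lines) + "\n"


request = render_packet(
    {"_target": "psyc://server.tld/@place", "_tag": "284232"},
    {},
    "_request_context_enter",
)
```

The leading "|" line shown in the examples is not emitted by the helper, on the assumption that it is the terminator of the preceding packet (or the greeting) on the circuit.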


Informing other subscribers

If the place decides to inform people about new members, it multicasts a _notice_context_enter to all occupants (including the newly joined entity, as it also informs the joined entity, that others were informed about her arrival). The person who joined is kept in _source_relay in this configuration.

Required Variables:

  • _context — The uniform of the sending context
  • _source_relay — the uniform of the new entity

Method name: _notice_context_enter

This message can only be distinguished from other people joining the channel by looking at the _source_relay routing variable.

Example

A place informing all subscribers (including the new one) about the new member:

  |
  :_context          psyc://server.tld/@place
  :_source_relay     psyc://someip:-3457/
  
  _notice_context_enter
  |


Confirming a subscription request

When the request succeeds, an _echo_context_enter is sent back as a reply to the subscription request. After the response is sent, the entered entity has (logically) become a member of the context. The person who joined is the _target, obviously.

Required Variables:

  • _source — the sending entity
  • _target — the sender of the request subscription
  • _tag_relay — the tag from the original request
  • _interface — optional, used to describe the interface the context provides

Method name: _echo_context_enter

The _source variable contains the address of the context you are a member of now.

When receiving a _echo_context_enter you MUST check whether it is in response to a former request or you trust the context you enter enough. This check is required to prevent others from forcing you into a SPIM context.

State synchronization MAY be included in the _echo_context_enter reply (for example when the requesting entity asked for a sync, or when the context believes the state needs to be resent). But note that the entering entity is NOT a member at the time of replying to the request. The additional state update MAY alternatively be sent as a separate packet (potentially without method), but it would not be sent in the _notice_context_enter message, as that is going to all recipients and all multicast packets are identical.

Example

A place confirming a subscription request:

  |
  :_source        psyc://server.tld/@place
  :_target        psyc://someip:-3457/
  :_tag_relay     284232
  
  _echo_context_enter
  |
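
The check required when receiving an _echo_context_enter (is it a reply to an earlier request, or is the context trusted?) can be sketched with a table of pending tags. PendingEnters and its method names are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


class PendingEnters:
    """Remembers outstanding _request_context_enter tags per context."""

    def __init__(self, trusted_contexts: Iterable[str] = ()) -> None:
        self._pending: Dict[str, str] = {}  # tag -> context uniform
        self._trusted: Set[str] = set(trusted_contexts)

    def requested(self, tag: str, context: str) -> None:
        """Call when a _request_context_enter with this _tag is sent."""
        self._pending[tag] = context

    def accept_echo(self, source: str, tag_relay: Optional[str]) -> bool:
        """Decide whether an incoming _echo_context_enter may be honoured."""
        if tag_relay is not None and self._pending.get(tag_relay) == source:
            del self._pending[tag_relay]  # each request is answered once
            return True
        # No matching request: accept only from explicitly trusted contexts.
        return source in self._trusted
```

Rejecting unmatched echoes is what prevents a third party from forcing an entity into a SPIM context.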

Subscriber requesting leave

Leaving a context

This method asks the context to let you leave it. Even though this is formally a polite request, it MUST NOT be denied by the context.

Required Variables:

  • _target — the context from which the entity wants to unsubscribe
  • _tag — a random tag
  • _source — optionally, the uniform of the sender.

Method name: _request_context_leave

The subscriber implementation MUST NOT disturb his human user with further multicasts from the sender, but it MAY report abuse where appropriate.

Example
  |
  :_target        psyc://server.tld/@place
  :_tag           284232
  
  _request_context_leave
  |


Confirming a request for unsubscription

Upon receiving a request for unsubscription, a context MUST confirm the request with an _echo_context_leave.

Required Variables:

  • _target — the requesting uniform
  • _source — the sending context
  • _tag_relay — the original _tag from the request

Method name: _echo_context_leave


Example
  |
  :_source         psyc://server.tld/@place
  :_target         psyc://someip:-3457/
  :_tag_relay      284232
  
  _echo_context_leave
  |

Telling others about leaving

  • When unicast from a context, this method informs the member that she left the context.
  • When multicast from a context, it informs all members that the member has left the context.
  • When unicast to a context, this method signals an impolite end of the membership. No confirmation is needed then. (REMARK: Unfortunately some IRC gateways require this to properly emulate IRC behaviour)


Required Variables when context is sender:

  • _context — the sending context
  • _source_relay — the leaving member

Required Variables when context is receiver:

  • _target — the receiving context
  • _source — optional, the senders uniform

Method name: _notice_context_leave


Example

A multicast notice that a member left us:

  |
  :_context        psyc://server.tld/@place
  :_source_relay   psyc://someip:-3457/
  
  _notice_context_leave
  |

_warning, _error & _failure

You may also receive warnings, errors or failures stopping the transaction flow, both because there may be a problem or simply because the context doesn't like your face. See the respective chapters in the specification for appropriate candidates.

The _tag_relay included in these messages helps you take appropriate action.

Custom Commands

Custom commands provided by the context (see the psyced documentation Create Places for examples) may have collateral effects of entering or leaving channels. This is necessary to implement complex functions like a newscast with a preferences pop-up box. By the choice of preferences the appropriate channels are selected as to maintain optimized multicasting.

Entities


Root

There are two concepts of root in PSYC. Depending on architectural choices, both roots, the protocol root and the entity root MAY be implemented in the same place in the source code, or not.

Root entity

The root entity in a server or application is the object that comes without anything behind the / in its uniform, as in psyc://psyced.org/. It performs jobs like authentication. This root is a full PSYC entity.

Protocol root

The protocol root instead leaves out the / as in psyc://psyced.org. It is used for negotiation for example in circuits and thus operates on the routing layer. If a packet is sent, and either the source or destination is not explicitly or implicitly addressed, the protocol root is intended.


Person Entity

The person identity entity in PSYC is a manager object which handles the needs of a human being, regardless of whether she is currently connected to the system with some sort of client.

Message handling and routing

When a packet with a _message method is received for a person entity it MAY be stored in a log of last messages.

When there are entities linked to the person entity, and the message is a multicast message, routing is done as described in Spec:Routing.

If the message was a unicast, the person entity relays it to the linked entities as described in Spec:Routing (the original _source is put in the _source_relay variable).

This routing behavior is applied to all other packet methods the person entity isn't able or willing to handle.

Some exceptions are described below for message echoes.

For more information about message handling and relaying see Spec:Message and Spec:Routing.

Message echoes

When the person entity receives a _message_private it SHOULD generate a _message_echo_private and send it back to the _source unless the sender is unwelcome or other reasons make the generation of echo problematic.

When the person receives a _message_public from one of the subscribed contexts and the _source_relay of that message is the uniform of the person entity itself, it SHOULD generate a _message_echo_public and send it (instead of relaying the _message_public) to all linked entities.
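
The two echo rules can be sketched as a function that returns the echo to generate, if any. echo_for, its parameters and the sender_welcome flag are hypothetical; the flag stands in for whatever policy makes a sender unwelcome.

```python
from typing import Dict, List, Optional, Tuple


def echo_for(
    person: str,
    linked: List[str],
    method: str,
    routing: Dict[str, str],
    sender_welcome: bool = True,
) -> Optional[Tuple[str, List[str]]]:
    """Return (echo method, recipients) or None if no echo is generated."""
    if method == "_message_private":
        source = routing.get("_source")
        if source and sender_welcome:
            return ("_message_echo_private", [source])
        return None
    if method == "_message_public":
        # Only our own message coming back from a context is echoed.
        if routing.get("_context") and routing.get("_source_relay") == person:
            return ("_message_echo_public", list(linked))
    return None
```

In the public case the echo replaces the relayed _message_public, so linked clients can tell their own contribution apart from those of others.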

See also Spec:Message.

Channels

The person entity offers these channels:

  • # - The context all its contacts are member of.
  • #_presence - The context all contacts receiving presence updates are member of.

In order to subscribe to basic contact data of a person (make friendship), apply the context subscription protocol to his contact channel. To also subscribe to a person's presence data, use it on her presence channel. An example presence channel's uniform looks like this: psyc://example.net/~heela#_presence

Friendship

To offer friendship to another person entity, a _request_context_enter has to be sent to the other entity's #friends channel.

The other entity then SHOULD request to enter your #friends channel, and if access is granted, it should also grant your enter request.

Person entities should store whether someone made this request, so a user can later come back and approve the friendship request.


Places

A place can be a chatroom, a newsfeed or any other entity which offers some special multicast messages or events.

By default, messages of all kinds of methods are submitted to a place for redistribution in the form of a unicast to it. It is sufficient to provide a _target for this purpose. Example:

|
:_target    psyc://example.org/@kitchen
:_source    psyc://example.net/~dj

:_action    spins
_message
Hey, it's much nicer in the living room, won't you come over?
|

Place command requests

_request...      variables                   equivalent command
_kick            _person                     /kick
_nick            _nick_local (masquerade)    /nick
_set_masquerade  _flag_masquerade            /masq(uerade)
_set_owners      _list_owners                /owners
_set_public      _flag_public                /public
_set_style       _uniform_style              /style
_show_log        _match _amount _parameter   /history


The _target MUST be aimed at a place (consider however, places potentially provide differing choices of methods). If your request is to be performed on behalf of an identity not identical to the source (typically the case for a client or external tool) you need to provide a _source_identification.

These _requests can be replied to using any method. The packet tags SHOULD be used to detect whether a request failed or to get the reply of the request.