PSYC packets travel the wire either using TCP circuits or by UDP. UDP is typically used for large multicast notices whose successful reception is not critical. The PSYC packet format is mostly line-based with some exceptions.
(See also ABNF)
PSYC packets are byte sequences which have the following syntax:
packet = routing-header [ LF content LF ] "|" LF
  ; the length of content is either implicit (by LF "|" LF)
  ; or explicitly given by the _length variable in the routing vars.
routing-header = *routing-modifier
content = *entity-modifier [ body ]
routing-modifier = glyph [ variable ] ( simple-arg / LF )
entity-modifier = glyph [ variable ] ( simple-arg / binary-arg / LF )
body = method [ LF data ]
glyph = "=" / ":" / "+" / "-" / "?"
simple-arg = HTAB arg-data LF
binary-arg = SP length HTAB binary-data LF
length = 1*DIGIT
binary-data = <a length byte long byte sequence>
method = 1*kwchar
variable = 1*kwchar
arg-data = *nonlchar
data = <amount of bytes as given by _length or until the (LF "|" LF) sequence has been encountered>
nonlchar = %x00-09 / %x0B-FF ; basically any byte except newline
kwchar = <alpha numeric ASCII char or "_">

For the definition of DIGIT, VCHAR, SP, LF and HTAB see RFC 2234 (ABNF).
Either arg-data or binary-data can contain lists, which adhere to the following syntax (in ABNF):
list = binary-val *("|" binary-val)
list =/ 1*("|" list-value)
binary-val = length SP binary-data
list-value = *nonlpipechar
nonlpipechar = %x00-09 / %x0B-7B / %x7D-FF ; any byte except newline and "|"
This list syntax is only valid for variables of the _list type, that is, variables whose names start with _list.
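The two list forms can be told apart by their first byte. The following is a minimal Python sketch of a list parser, not a reference implementation; the function name parse_list is hypothetical and error handling is kept deliberately small.

```python
def parse_list(value: bytes) -> list:
    """Parse the value of a _list variable (hypothetical helper)."""
    if value.startswith(b"|"):
        # Text form: 1*("|" list-value). Elements contain no "|" or LF.
        return value[1:].split(b"|")
    # Binary form: binary-val *("|" binary-val), each "length SP data".
    items = []
    pos = 0
    while True:
        sp = value.index(b" ", pos)      # end of the length field
        length = int(value[pos:sp])
        start = sp + 1
        end = start + length
        if end > len(value):
            raise ValueError("element is longer than the remaining data")
        items.append(value[start:end])   # element may itself contain "|"
        if end == len(value):
            return items
        if value[end:end + 1] != b"|":
            raise ValueError('expected "|" between elements')
        pos = end + 1
```

Because the binary form carries explicit lengths, an element such as `3 a|b` can contain the "|" separator itself, which the text form cannot express.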
This is a simple example packet:
|
:_source	psyc://example.symlynX.com/~fippo
:_target	psyc://ente.aquarium.example.org:-32872

:_nick	fippo
_info_nickname
Hello [_nick].
|
And this is an example packet that covers most of the ABNF rules above:
|
:_context	psyc://example.org/@democracynow
:_target	psyc://ente.aquarium.example.org:-32872

:_list_member	|psyc://example.symlynX.com/~jim|psyc://example.org/~judy
:_list_topic	9 democracy|3 now
:_list_image 9213	4404 <binary data>|4798 <binary data>
:_list_owner 26	|psyc://example.org/~judy
:_image 4212	<binary data>
_status_context
In [_context:_nick]: [_list_member:_nick]
|
This example uses the entity-oriented psyctext variant. The images could be used to decorate a memberlist, but the normal approach should be to obtain the member images from the state of each member entity. So this example really serves the purpose of showing several possible encodings of lists and data.
The decision which strategy to pick is left to the implementor. Mainstream server implementations should, however, choose an application-developer-friendly style, which is yet to be defined.
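To illustrate the difference between simple-arg and binary-arg in the grammar, here is a hedged Python sketch that parses a single entity-modifier from a byte buffer. The function name parse_modifier and its return shape are assumptions made for this illustration only.

```python
GLYPHS = (b"=", b":", b"+", b"-", b"?")

def parse_modifier(buf: bytes, pos: int = 0):
    """Parse one entity-modifier starting at buf[pos].

    Returns (glyph, name, value, next_pos); value is None if no argument
    was given. Hypothetical helper, not part of any PSYC library.
    """
    glyph = buf[pos:pos + 1]
    if glyph not in GLYPHS:
        raise ValueError("not a modifier")
    pos += 1
    start = pos
    # variable = 1*kwchar (ASCII alphanumerics or "_"); may be absent
    while buf[pos:pos + 1].isalnum() or buf[pos:pos + 1] == b"_":
        pos += 1
    name = buf[start:pos]
    sep = buf[pos:pos + 1]
    if sep == b"\n":                      # bare modifier, no argument
        return glyph, name, None, pos + 1
    if sep == b"\t":                      # simple-arg: runs up to the next LF
        eol = buf.index(b"\n", pos + 1)
        return glyph, name, buf[pos + 1:eol], eol + 1
    if sep == b" ":                       # binary-arg: SP length HTAB data LF
        tab = buf.index(b"\t", pos + 1)
        length = int(buf[pos + 1:tab])
        end = tab + 1 + length
        if buf[end:end + 1] != b"\n":
            raise ValueError("binary argument not terminated by LF")
        return glyph, name, buf[tab + 1:end], end + 1
    raise ValueError("malformed modifier")
```

Only the binary-arg form can carry a value that itself contains a newline, since its end is determined by the length field rather than by the next LF.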
Another example that my renderer just spat out:
|
:_source	psyc://base.example.org/~k
:_target	psyc://localhost:1234
:_length	573

:_color	#CC0000
:_nick	k
:_nick_target	psyc://localhost:1234
_message_private
hi there2 39u2392fhfk <other random bytes filling up the content length to 573>
|
There is a special variable (_length) in the set of routing variables that defines how long (in bytes) the content part of a PSYC packet is.
If no _length variable was set, the bytes after the routing-header are parsed as the content rule in the grammar defines. That means that data ends at the first LF "|" LF.
This means that as soon as you use binary data that contains LF "|" LF in the content you have to set the _length variable.
If _length was provided, the given number of bytes is read from the byte stream after the routing-header and then processed as the content rule in the grammar defines, which means that data ends with the end of the content. This makes it possible to transmit arbitrary opaque binary data as the data part of a packet.
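The two framing modes described above can be sketched in Python as follows. This is a simplified illustration under several assumptions: the function name split_packet is hypothetical, the buffer is assumed to start at the routing header (after the "|" line that closed the previous packet), and glyphs of routing modifiers are ignored.

```python
def split_packet(buf: bytes):
    """Split one packet off the front of buf.

    Returns (routing_vars, content, rest). content is None when the packet
    has no content part. Raises ValueError on incomplete or malformed input.
    """
    routing = {}
    pos = 0
    while True:
        eol = buf.index(b"\n", pos)
        line = buf[pos:eol]
        pos = eol + 1
        if line == b"|":                  # packet ends right after the header
            return routing, None, buf[pos:]
        if line == b"":                   # empty line: content follows
            break
        name, _, arg = line[1:].partition(b"\t")   # drop glyph, split at HTAB
        routing[name] = arg

    if b"_length" in routing:
        end = pos + int(routing[b"_length"])       # explicit content length
    else:
        end = buf.index(b"\n|\n", pos)             # implicit: first LF "|" LF
    if buf[end:end + 3] != b"\n|\n":
        raise ValueError("missing packet terminator")
    return routing, buf[pos:end], buf[end + 3:]
```

Without _length, data that happens to contain LF "|" LF would be cut short at that sequence; with _length the same bytes pass through intact.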
Variable names and methods are ASCII-encoded strings, while the contents of the body and the arguments of variables are encoded as UTF-8 unless specified otherwise.
There are two kinds of variables: routing variables and entity variables.
Routing variables can be persisted, meaning they are set in the persistent variable set (see below) via the '=' glyph (or '+', '-'), and modified during the lifetime of the circuit, which makes this a simple mechanism for protocol compression. Context entities, in turn, may persist entity variables (using the '=' modifier) for all their members to keep for the entire lifetime of the context, which makes this a decentralized storage vehicle.
Each context has its own set of persistent entity variables. If the _context routing variable is NOT set, persistent entity variables MUST NOT be changed. This means that modifiers which change persistent entity variables can only be used when _context is set, and thus persistent entity variables can only be updated by a context.
Should a modifier change a persistent entity variable while _context is not set, the violation SHOULD be answered with a _failure_unsupported_state_persistent error packet and the circuit MAY be terminated. (See also Spec:Entity).
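The rule above could be checked along these lines. This Python sketch assumes that modifiers have already been parsed into (glyph, name, value) tuples and that '=', '+' and '-' are the glyphs that change persistent variables; the function name check_entity_modifiers is hypothetical.

```python
# Glyphs that modify the persistent set (assumption drawn from the
# glyph descriptions in this document; ':' and '?' leave it untouched).
PERSISTENT_GLYPHS = {"=", "+", "-"}

def check_entity_modifiers(routing_vars, entity_modifiers):
    """Return None if the modifiers are acceptable, otherwise the name of
    the error method that should be sent back."""
    if "_context" in routing_vars:
        return None
    for glyph, name, value in entity_modifiers:
        if glyph in PERSISTENT_GLYPHS:
            return "_failure_unsupported_state_persistent"
    return None
```

Whether the circuit is also terminated after sending the error is left to the implementation, as the text says MAY.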
NOTE: In theory also unicast entity communications between a _source and a _target could each define a set of persistent variables. Such entity state (as opposed to context state) is however currently not supported as it raises storage requirements of PSYC implementations more than it is likely to prove useful. It is reserved for possible future use.
Out-of-context communication may however still refer to persistent variables from in-context communication in its psyctext template.
Each packet defines a set of current variables which may be different from the persistent set of variables. When passing the variables to an application, the programming interface SHOULD merge current routing and entity variables into a single structure.
See also Spec:State.
Routing and entity modifiers
Each packet comes with a set of routing modifiers and entity modifiers. The routing modifiers belong to the routing-header and are separated by a newline from the entity modifiers (entity-modifier).
The routing modifiers modify the current and persistent routing variables. The entity modifiers modify the current and persistent entity variables.
This means that after you have processed the routing modifiers (routing-modifier), and before you process the content, you need to look up the persistent variables of the addressed entity. Only then can you continue with the entity modifiers (entity-modifier).
Current and persistent variable handling
When a packet is being parsed, the modifiers modify the set of current and persistent routing and entity variables. To do so, the set of current variables is initialized by the set of persistent variables before the modifiers are applied.
After the packet has been processed, the current routing and entity variables are the significant variables that belong to the instance of the parsed packet.
NOTE: You have to make sure that you don't apply the changes to the persistent routing variables until the whole routing header has been parsed. The same applies to the entity variables.
- '=': The variable is modified in the set of current variables and the set of persistent variables. If no variable name is provided, all persistent variables and current variables are deleted.
- ':': The variable is only modified in the set of current variables.
- '+': Only valid for _list variables. The elements in the modifier argument, which is a list, are appended to the persistent variables and the current variables.
- '-': Only valid for _list variables. The elements in the modifier argument, which is a list, are removed from the persistent variables and the current variables.
- '?': This modifier asks the destination for the contents of any persistent variable matching the inheritance family provided. If no variable name is given, all persistent variables are meant, and the destination entity should include a state sync in the reply to this packet or in a separate packet.
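The glyph semantics can be summarized in a short Python sketch. It assumes modifiers arrive as already-parsed (glyph, name, value) tuples with _list values represented as Python lists; the function name apply_modifiers is hypothetical. Changes to the persistent set are staged in a copy, so the caller can commit them only once the whole header or content has been parsed successfully.

```python
def apply_modifiers(persistent, modifiers):
    """Return (current, staged_persistent) without mutating `persistent`."""
    current = dict(persistent)   # current set is initialized from persistent
    staged = dict(persistent)    # committed by the caller after parsing
    for glyph, name, value in modifiers:
        if glyph == ":":
            current[name] = value
        elif glyph == "=":
            if not name:         # no variable name: delete everything
                current.clear()
                staged.clear()
            else:
                current[name] = value
                staged[name] = value
        elif glyph in ("+", "-"):
            if not name or not name.startswith("_list"):
                raise ValueError("'+' and '-' are only valid for _list variables")
            for target in (current, staged):
                old = list(target.get(name, []))
                if glyph == "+":
                    target[name] = old + list(value)
                else:
                    target[name] = [x for x in old if x not in value]
        elif glyph == "?":
            continue             # a query; it does not change any state
        else:
            raise ValueError("unknown glyph")
    return current, staged
```

If parsing fails midway, the caller simply discards the staged copy, which matches the requirement not to touch the persistent variables before parsing is complete.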
The ASCII strings, denoted by the non-terminals variable and method, have to adhere to the keyword naming specification.
MIME Content Type
The content type for complete PSYC packets themselves is message/x-psyc (regardless of the content type of the data contained within). It needs to be encapsulated in an 8-bit transparent way, as it may contain binary data (see also the _length variable in the routing header).
http://about.psyc.eu/Parse gives practical instructions on how to write a PSYC parser.
NOTE: When an invalid packet is received whose routing header has already been parsed successfully and which has been partially processed, it may or may not have effected state changes before being dismissed as invalid. Thus, after receiving an invalid packet (that is, one with syntax errors in the content part), the current state data of the addressed context MUST be invalidated.
See also Packet ids for future packet identification strategies in PSYC.