Formerly known as Jabber, XMPP is the 'Extensible Messaging and Presence Protocol.' This page discusses how PSYC relates to XMPP and how psyced implements it, and covers some open points about XMPP which so far have been reason enough not to drop the development of PSYC. Documentation on how to use Jabber with psyced and PSYC is kept at http://www.psyced.org/psyc-jabber.en.html
See also the FAQ.
psyced and the XMPP
The psyced implements XMPP (Jabber) in several ways, gatewaying into PSYC and the other things psyced does. Please note that XMPP client access isn't as stable as IRC's. psyced performs better as a multi-protocol interserver gateway. Here's a list of XMPP extension protocols that we support:
- XEP-0012: Last Activity
- XEP-0030: Service Discovery
- XEP-0045: Multi-User Chat
- XEP-0054: vcard-temp
- XEP-0077: In-Band Registration
- XEP-0092: Software Version
- XEP-0107: User Mood
- XEP-0178: Best Practices for Use of SASL EXTERNAL with Certificates
- XEP-0185: Dialback Key Generation and Validation
- XEP-0190: Best Practice for Closing Idle Streams
- XEP-0199: XMPP Ping
- XEP-0212: XMPP Basic Server 2008
- XEP-0220: Server Dialback
Technical Issues in Jabber
In the following paragraphs we address problems in Jabber. That said, we are glad that so many nice clients can make use of our psyced server through the very elaborate Jabber client protocol, and we are generally positive about somebody carrying the flag of open source in the fight against proprietary closed chat systems.
But writing about things that already do their job is pointless, that's why we're talking about the things which look like they need work.
We even published XEPs (XMPP Extension Protocols) to address some (see below).
Closing Idle Connections
In 2006, four implementations of XMPP were closing connections in an unsafe manner, potentially losing messages that the other side may have started to send while this side is closing the TCP connection. XEP 0190 has been written to address the problem. The new behaviour has also been worked into the upcoming new XMPP RFCs. The issue is hopefully solved by now, although we haven't made another survey.
mawis says rosters fall out of sync mostly during periods when another server is not available. Even though an unsubscribe should be acked by an unsubscribed, nobody will wait for that ack (do you want to keep the subscription up just because the other side doesn't know it is down?). At best the unsubscribe is resent after a while, or once a probe comes in which has no corresponding subscription state. mawis compared his user database with two other large Jabber installations and found that 11.4% of contact relationships were out of sync.
PSYC addresses this problem with the concept of packet ids. If you can ensure that all packets arrive, you automatically keep states in sync. Unfortunately we haven't implemented that yet, so a large-scale deployment of PSYC would currently run into similar problems. psyced also has an implementation of roster revision numbering, but it isn't in use.
XEP 0198 Stream Management introduces acknowledgments and retransmissions into XMPP streams (roughly equivalent to our packet id concept). This XEP was advanced to draft status in June 2009 and is expected to be implemented in several server technologies soon. It should alleviate, if not solve, the problem of desynced state.
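The idea behind packet ids and acknowledgments can be sketched as follows. This is only an illustration in Python: the class and method names are hypothetical, and it reproduces neither the PSYC packet id mechanism nor the XEP 0198 wire format. The sender keeps every packet until the other side confirms it, so nothing is silently lost when a link goes down.

```python
from collections import deque


class Sender:
    """Hypothetical sender that numbers packets and keeps them until acked."""

    def __init__(self):
        self.next_id = 1
        self.unacked = deque()  # (id, payload) pairs not yet confirmed

    def send(self, payload):
        # Remember the packet so it can be resent if no ack arrives.
        packet = (self.next_id, payload)
        self.unacked.append(packet)
        self.next_id += 1
        return packet

    def on_ack(self, highest_received):
        # The peer confirms everything up to and including this id.
        while self.unacked and self.unacked[0][0] <= highest_received:
            self.unacked.popleft()

    def to_retransmit(self):
        # Whatever is left was never confirmed and must be sent again.
        return list(self.unacked)


sender = Sender()
sender.send("subscribe")
sender.send("unsubscribe")
sender.on_ack(1)
print(sender.to_retransmit())
```

With this bookkeeping an unacknowledged unsubscribe stays queued for retransmission instead of leaving the two rosters in different states.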
From a protocol and architectural point of view, Jabber generates a lot more traffic than it necessarily should and will not perform well when users have many friends or enter huge chatrooms.
Here are some statistics that Matthias Wimmer, the friendly systems administrator of amessage, has provided us (Thank you!):
- "I just checked the log of the three biggest domains (amessage.be, amessage.de, and amessage.info) where for the last 9,211,845 stanzas that have been delivered to these three domains (either from remote or locale JIDs) it's the following:"
- 71.4% are presence stanzas (not including subscription)
- 14.5% are message stanzas
- 13.4% are iq stanzas
- 0.6% are subscription stanzas
Then, he investigated a bit further, by looking at how many presence stanzas are actually identical copies being retransmitted over and over, and found out that:
- "6200 out of 10374 presence stanzas have been dupes (59.76 %)."
That means that roughly 42.7% of the packets sent across the amessage servers were redundant (71.4% presence × 59.76% duplicates ≈ 42.67%), assuming the duplicate rate of the sample holds for all presence traffic. Why is XMPP sending the same information again and again? Presence, like groupchat, is information that needs to be distributed to many recipients, so XMPP has to do it somehow. XMPP simply generates a copy of each message for each and every recipient and even resubmits the same information onto the same TCP connection for each recipient. At least when both sides of the XMPP communication support XEP 0033, the list of recipients can be delivered in an SMTP-like fashion.
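The redundancy figure follows from the two numbers quoted above. A small sketch of the arithmetic, assuming that the duplicate rate measured on the sample of 10374 presence stanzas applies to all presence traffic:

```python
# Figures quoted from the amessage logs above.
presence_share = 0.714        # fraction of all stanzas that are presence
dupes, sampled = 6200, 10374  # duplicate presence stanzas in the sample

dupe_ratio = dupes / sampled  # fraction of presence stanzas that are dupes
redundant_share = presence_share * dupe_ratio

print(f"duplicate presence stanzas: {dupe_ratio:.2%}")
print(f"redundant share of all stanzas: {redundant_share:.2%}")
```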
Still, true bandwidth benefits appear when
- subscription state is not retransmitted with every communication,
- the burden of transmission is not going from the source to every single recipient, but using a distribution tree instead.
When you have hundreds of friends on your roster and distributed on several remote servers, that's when XMPP servers start connecting each of the recipient servers and send a presence update to each person one by one.
The same is done for pubsub distribution. Let's say a news source has a million subscribers. Each time it has news, the publishing server needs to address all of these people one by one. In PSYC, by contrast, the server will contact a set of servers, which will forward the news to another set of servers each, until all the recipients receive the news. Since each of these servers knows who of their users have subscribed to this distribution context, the list of recipients does not need to be retransmitted with the message.
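To get a feeling for the difference, here is a rough sketch comparing the two strategies. The numbers (a million subscribers on 10000 servers, each server forwarding to 10 others) and the function names are made-up assumptions for illustration, not measurements or PSYC internals.

```python
def unicast_sends(recipients):
    """Fan-out: the source transmits one copy per recipient."""
    return recipients


def tree_depth(servers, fanout):
    """Forwarding rounds needed until `servers` servers have been reached
    when every server passes the message on to `fanout` further servers."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    depth, reached, level = 0, 0, 1
    while reached < servers:
        level *= fanout   # servers reached in this round
        reached += level
        depth += 1
    return depth


subscribers, servers, fanout = 1_000_000, 10_000, 10
print("copies sent by the source, unicast:", unicast_sends(subscribers))
print("copies sent by the source, tree:", fanout)
print("forwarding rounds in the tree:", tree_depth(servers, fanout))
```

In the tree case the source only ever transmits `fanout` copies; the remaining work is spread across the servers that carry the subscribers anyway.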
It's a protocol design issue that cannot easily be circumvented by server design. Some XEP drafts have been written to address the problem, but they didn't find enough acceptance in the XMPP community. If you're interested:
The SIMPLE working group has published an Internet Draft on Optimizing Federated Presence with View Sharing. Their approach is similar to what we proposed for XMPP, but they have an explicit negotiation of distribution context while we intended to couple that with presence subscriptions. See presence for more details.
Ironically, even if Jabber were using IRC as a kind of multicast overlay network to distribute its presence, it would probably perform better on the whole. fippo wrote up a little proposal: Jabber Relay Chat. IRC has its weaknesses, but some things can be learnt from it.
XMPP is generally claimed to be scalable, but its scalability is best achieved within a single domain, in a closed system, that uses its own decentralization strategies, like the Erlang language provides, not trying to use S2S XMPP internally. The open federation network of XMPP instead would benefit greatly from more elaborate routing strategies like stanza repeaters and multicasting.
Still, a distribution network based on established TCP lines is generally superior to similar efforts using HTTP POST requests (see also XMPP To The Rescue), so if you are still doing things using HTTP, you will of course experience an improvement upgrading to XMPP.
PSYC and XMPP to team up
It is said that many large XMPP deployments like Google Talk or Wave use custom tree-based distribution protocols within the cluster, interfacing with the outer XMPP federation only through the XMPP S2S protocol. So in a way their existence is proof that you can do XMPP in a scalable way, and yet they also prove that it is impractical to do so using XMPP itself.
It would be good for the federation, decentralized communications, privacy and open source in general if such strengths weren't limited to a few commercial offerings, but rather generally available and evenly integrated into the XMPP experience.
In PSYC as Jabber S2S we have laid out a plan on how to use PSYC as an interserver federation protocol for XMPP applications. This would combine the respective strengths of the two technologies into a fabulous team. psyced already implements several parts of that, but it sure would be fruitful to extend solid XMPP server technologies to also process PSYC as a more effective network serialisation format and distribution system for the same content.
Presence itself can be a problem for the protocol that was specifically designed to solve it. Here's an example issue: http://mail.jabber.org/pipermail/jdev/2007-April/026204.html See Presence for more.
It's not a question of religion - it's not true that one format is like another - there are clear and logical technical reasons why XML and specifically the XMPP dialect of XML are unsuitable for instant messaging applications:
Major disadvantage: Binary Data Transfer
Just like HTTP, PSYC has the ability to say, the following n Bytes are binary - transfer them without trying to interpret them. This may sound totally unspectacular, but several protocols cannot do that. It makes File Transfers or embedding of a photograph or cryptographic key as simple as a fingersnap. XML instead has no simple solution for binary transfers. You either have to escape all XML control characters, or encode the data using something like base64 that needlessly consumes computation power and bandwidth, or you have to encode each message as a separate XML document, and put a binary capable framing protocol around it. XMPP doesn't do that, either.
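The cost of the base64 route is easy to measure. The sketch below compares it with a length-prefixed frame; the frame layout (decimal byte count, newline, raw bytes) is a hypothetical stand-in chosen for illustration and is not PSYC's actual syntax.

```python
import base64
import os

payload = os.urandom(3000)  # stand-in for a photograph or a key

# What an XML-based protocol typically has to do with binary data.
encoded = base64.b64encode(payload)

# Hypothetical length-prefixed frame: byte count, newline, raw bytes.
framed = str(len(payload)).encode("ascii") + b"\n" + payload

print("raw bytes:      ", len(payload))
print("base64 bytes:   ", len(encoded))
print("framed bytes:   ", len(framed))
```

base64 always turns three bytes into four characters, so the encoded form is about a third larger and has to be decoded again on arrival, while the framed form only adds a short header.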
Ironically any non-XML protocol is much better suited for delivery of XML data than XMPP. Just like HTTP, PSYC can simply transmit XML as is while XMPP has to encode it to avoid collision with its own syntax. The result will also be quite hard to read, should anyone need to do so. Alternatively, if you omit the XML headers and twist it in a way that the XML document becomes itself part of the protocol, then you made it, too. But that's not simple, not transparent, not spontaneous at all, and you have to define a namespace to do that. You can't say you're delivering an XML document at the snap of a finger as you would when using HTTP.
XMPP extensions like SOAP over XMPP or ATOM over XMPP re-implement the foreign XML format into XMPP stanzas. This means that whenever you want to receive those packets, you have to parse them, then render the tree structure into a regular XML document, then pass it on to the software library which implements SOAP, ATOM (or whatever) for it to parse it again. On the other side when you want to publish a document you received via SOAP or ATOM you have to parse it, then re-render it into XMPP. This is waste of processing power! The only way to avoid this is either to not use libraries and implement everything yourself, or to not use XMPP.
Major problem with XMPP: Missing framing
Being able to provide the length of a packet isn't an advantage only for binary packets. It generally allows the receiver to wait for the complete packet before starting to parse it, which permits more efficient low-level read operations. In the case of PSYC nodes providing routing services only, it is even better, as parsing can stop immediately after the routing header of the packet. Everything else is just read into a buffer at once, which is very processing-friendly.
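A minimal sketch of what a length header buys the receiver, again using a hypothetical frame layout (decimal byte count, newline, body) rather than the real PSYC packet syntax: the receiver never looks inside a body until all of it has arrived.

```python
def extract_frames(buffer):
    """Pop every complete frame from `buffer` (a bytearray) and
    leave any partial frame in place for the next read."""
    frames = []
    while True:
        newline = buffer.find(b"\n")
        if newline < 0:
            break  # the length header itself is still incomplete
        length = int(buffer[:newline])
        end = newline + 1 + length
        if len(buffer) < end:
            break  # body incomplete: wait, no parsing attempted
        frames.append(bytes(buffer[newline + 1:end]))
        del buffer[:end]
    return frames


buf = bytearray()
buf += b"5\nhel"             # first network read: frame is incomplete
print(extract_frames(buf))
buf += b"lo3\n<a>"           # second read completes it and adds another
print(extract_frames(buf))
```

The body can contain anything, including XML or arbitrary binary data, since only the header is ever interpreted by the framing layer.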
All of this is impossible using XML. In XML you have to parse the complete tree before you can find out if something has reached completion, so you have to try parsing every little chunk of data coming from the network, because it might just be the last one you've been waiting for - this makes processing more expensive. Again, you may be wrapping a non-XML framing protocol around XML, that should work. In the case of XMPP this is however not being done:
- "You're correct in your assertion that framed data would mean clean binary transfers, but that isn't a goal of XMPP - anyone shipping binary data directly through XMPP is simply doing something inefficient. If you do want to ship large binary objects, it's more efficient to send them either via email, or send a URL via XMPP, and ship the data by a more suitable protocol."
Flash and some other applications introduced null bytes into the XML stream to frame packets. This doesn't help optimizing read operations, but at least gives you a chance to postpone parsing. XMPP could adopt that.
XMPP looks like XML
One advantage of XML is the existence of ready to use parsers, and by now many of them handle the XMPP dialect pretty well. In fact XMPP developers prefer if you don't roll your own. In order to employ such an XML parser, Jabber developers need to tweak certain things before and after parsing:
- XMPP does not allow encodings other than UTF-8, but UTF-16 is a requirement for XML compliance.
- XMPP forbids XML comments.
- XMPP forbids processing instructions, like <?include ...
- XMPP does not allow unescaped use of > as the XML spec expects.
- See also Mr. Karneges' mail.
- XMPP applications do not implement namespaces properly.
- XMPP doesn't properly close its document (the stream in that case) when negotiating TLS, instead it reopens a stream on the existing stream. The end result is not a valid XML document. Same thing when using compression.
See also elmex' AnyEvent-XMPP Parser comments.
Having to Guess the Meaning of a Packet
The PSYC protocol provides you with a method (or message code, message type) for every protocol message, which immediately makes it clear what the meaning (semantics) of the packet is, whereas a Jabber server, even psyced, has to closely look at what a client or server sends and find out what it wants. This means parsing the complete XML packet, no matter who it is intended for, then inspecting a lot of individual aspects of every packet. For example,
<presence from='foo' to='bar'/>
is what we call a _notice_friend_present, while the Jabber construct
<presence from='foo' to='bar'> <x xmlns='jabber:x:delay' stamp='20051015T20:07:42' from='foo'/> </presence>
is a _status_friend_present in PSYC terminology. While in PSYC we can quickly glance at the method name, then decide what other parts of the packet we need, in Jabber we have to look at all the attributes and children to figure out what semantic meaning the packet has, before we can start acting upon it. The lack of a method name as a unified speedway to packet interpretation makes programming XMPP applications harder than it needs to be.
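The kind of inspection described here can be sketched as follows. The function name is hypothetical and the rule is deliberately reduced to the two examples above; a real gateway also has to look at the type attribute, show and status children and more.

```python
import xml.etree.ElementTree as ET

DELAY_NS = "jabber:x:delay"


def psyc_method_for_presence(stanza_xml):
    """Guess the PSYC method for a presence stanza (simplified)."""
    stanza = ET.fromstring(stanza_xml)
    # Strip a namespace prefix such as {jabber:client} if there is one.
    if stanza.tag.rsplit("}", 1)[-1] != "presence":
        raise ValueError("not a presence stanza")
    # A delay stamp marks stored state rather than a fresh event,
    # so the children have to be searched before the meaning is known.
    if stanza.find(f"{{{DELAY_NS}}}x") is not None:
        return "_status_friend_present"
    return "_notice_friend_present"


print(psyc_method_for_presence("<presence from='foo' to='bar'/>"))
print(psyc_method_for_presence(
    "<presence from='foo' to='bar'>"
    "<x xmlns='jabber:x:delay' stamp='20051015T20:07:42' from='foo'/>"
    "</presence>"))
```

Note that the whole stanza has to be parsed into a tree before the first decision can be made, which is exactly the overhead a method name avoids.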
MUC the "Multi-User Conference"
MUCs are organized in the most natural way you would expect a chatroom to be organized: It is running somewhere on a server, all users register with it, all communication flows through the server of the chatroom. You could say XMPP MUC implements the most straightforward way to implement a chatroom, similarly to most webchats.
In situations of either highly distributed topology or large audiences, this approach has the same scalability issues we have seen above with presence and pubsub. Other technologies have more elaborate strategies.
IRC hosts its channels on each and every server of its network, multicasts the messages to each other in a close to optimal way, then each server takes care of letting its users take part. IRC has hosted moderated chatrooms with thousands of participants in the past. The drawback is, that no single authority is in charge of conference control of an IRC channel, so a single rogue server can cause havoc.
PSYC uses a combination of both approaches: it has a central authority organizing a chatroom, yet uses multicast distribution to even out the load on the network. PSYC events have been run hosting tens of thousands of users in a single conference. Not all of them were permitted to speak freely, but all of them were able to write up submissions for the editorial team to choose from, in realtime.
Some MUC implementations use custom overlay networks to achieve a similar result as IRC channels. The drawback is, it is not a generic solution all XMPP servers can use to distribute the load - it only works for a set of servers run under the same administration.
It would be nice to see this kind of improvement in a generic way, so that all servers in the XMPP federation can take part (see the Smart Chat proposal above for example). This would address those frequently discussed scalability issues of MUC. It would be even better if such a multicast improvement were generic enough to also encompass presence and pubsub.
A minor privacy concern with MUCs is the /msg command which sends private messages through the MUC, not directly to the recipient. This is frequently not obvious to the people using this function.
Another minor concern is about identity. XMPP resources are case sensitive. Due to that, there can be 3 persons in a MUC with the identities foo@bar/AbC, foo@bar/aBC and foo@bar/ABc. This could cause confusion, as they may appear as the same person to other participants.
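A client or room could at least warn about such lookalikes. The sketch below groups occupant addresses that differ only in case; the function name is hypothetical, and Python's casefold() is a simplification that stands in for the proper string preparation rules of XMPP.

```python
from collections import defaultdict


def lookalike_groups(jids):
    """Return groups of addresses that are equal when case is ignored."""
    groups = defaultdict(list)
    for jid in jids:
        groups[jid.casefold()].append(jid)
    return [group for group in groups.values() if len(group) > 1]


occupants = ["foo@bar/AbC", "foo@bar/aBC", "foo@bar/ABc", "baz@bar/home"]
print(lookalike_groups(occupants))
```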
Supporting MUCs from PSYC
PSYC will however not be able to apply multicast optimizations retroactively: If several PSYC users join the same MUC, each of them will receive its own copy of each message. For a cleaner set-up it's good to use PSYC chatrooms, and invite XMPP users in. In this constellation, all PSYC and IRC users will benefit from interserver multicast routing. Only XMPP users will be handed out redundant copies of messages. This is how it works for Jabber users:
XMPP users entering PSYC places
If you want to enter psyc://example.org/@someplace, you need to join an XMPP JID like *firstname.lastname@example.org. The best way is to query the official addresses of the room by issuing the /status command in it.
This applies to both people using psyced as a Jabber server directly and regular Jabber users in the XMPP federation network, given the PSYC server runs interserver XMPP, too. As explained before, entering via XMPP defeats multicast routing and similar PSYC functions, but provides you with the extras of a native XMPP server in exchange.
Profiles in Jabber
Jabber comes with two different profile formats. One (XEP 0154 aka User Profile), which is very elaborate, but rarely implemented and another one that is very limited, but available everywhere (XEP 0054 aka vcard-temp).
VOICE, HOME and NUMBER are adjacent children of TEL, defeating XPath-like XML access strategies. Even parsing this XML with a line mode parser doesn't work as the DTD doesn't enforce this order of children. HOME may come before VOICE and VOICE after NUMBER.
Minor protocol optimizations
- XMPP uses uptime as a technical value (JEP-0012). That's impractical, as the value gets old while it travels across the network. Using boottime is much better: You can keep it for as long as you know the other side is up (typically while the TCP line is up) and calculate the uptime from it by a simple time() - boottime subtraction. Needless to say, PSYC uses boottime. For end user messages it provides both values.
- Jabber has no practicable solution for migrating users or services from a hostname to another. PSYC has tools like forward and redirect.
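The boottime approach from the first point can be sketched in a few lines. The class name is hypothetical; the point is only that the stored value never goes stale, because the uptime is derived from it at the moment it is needed.

```python
import time


class PeerInfo:
    """Remembers when a peer booted instead of how long it has been up."""

    def __init__(self, boottime):
        self.boottime = boottime  # Unix timestamp, received once

    def uptime(self, now=None):
        # The time() - boottime subtraction, done when the value is asked for.
        if now is None:
            now = time.time()
        return now - self.boottime


peer = PeerInfo(boottime=time.time() - 3600)  # peer booted an hour ago
print(f"peer has been up for about {peer.uptime():.0f} seconds")
```

An uptime value would have to be requested again each time, whereas the boottime stays valid for as long as the TCP line to the peer is up.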
JIDs aren't flexible enough
Because the world is round and not flat, I mean because there are other chat systems out there beyond Jabber, the user@host syntax, also known as Jabber Identity, is bound to create problems.
It could have all been so easy if the JID was simply an opaque string, where only the user's server needed to understand what's in it. Why does a client need to know what an icq:1234 is, or an INFITD0E at DM0TUI1S - as long as both the user and his server know what it means, that should be fine. Okay, forget the second example (which is a BITNET address, by the way). If JIDs in XMPP client-server were opaque, then we could have asked our users to drop the uniform into the add buddy wizard's address field, and poof, the connection is made. Instead people often have separate username and hostname input fields, and the majority of chatsystems out there aren't compatible with that.
Therefore Jabber has to mingle all kinds of gateway addresses into the user@host syntax. This may work for icq:NUMBER as you can turn it around into NUMBER@icq, and that's kind of logical to the average user too. But once you have an address like msn:email@example.com, how are you going to go about it? 'msn:user' is not a legal user name in Jabber, and even if it was, it is just nothing the end user would guess - to put the protocol scheme into the username field. Oh, XMPP uses UUCP syntax for that: firstname.lastname@example.org.
All of this is just theory anyway, as according to Jabber philosophy your home server is not supposed to handle any gateways itself (psyced is the exception, as usual). You get friends with a transport service instead, then you encode all your friends as being at home in that transport - This means you have to shift all your friends to new addresses if that transport service goes out of business or you start doubting its disinterest in your private affairs. These are issues that are being worked upon, however.
The fact that the protocol scheme is not supposed to be transmitted in an XMPP JID has a strength and a weakness:
- CON: Jabber servers have no easy way to figure out if XMPP is the only possible protocol to reach a certain person.
- PRO: All Jabber clients and servers have a scheme-free form of address of all people in their data structures. Thus one could replace the XMPP protocol at the backbone of Jabber and use PSYC (or whatever else) instead. So, if a sufficient number of Jabber installations also provided PSYC protocol, the network would slowly and transparently shift to a possibly more efficient interserver protocol. This idea is discussed in 'PSYC as Jabber S2S'.
Concerning dialback, one of the reasons why it was preferred over an IP address check is that jabberd was not able to bind() the socket to a particular IP at the time. And why does it require two TCP connections? Because in Java it is nicer to have a reader and a writer thread, and having them use the same socket is impractical (as mentioned here). At times there are even more than two TCP links between two XMPP servers, because of multiple hostnames.
Other Techie Details
In the XMPP world the word broadcast is being used in a kind-of metaphysical way. Broadcast stands for a single transmission operation to reach all recipients on a medium in a single go - like on a radio. Once you transmit, everyone can listen - there is no extra effort involved in getting the content to each recipient. In Jabber lingo however it actually intends the quite inefficient fan-out of copies of a message to each member of a collection of recipients, which, in terms of bandwidth consumption, results in pretty much the opposite of what is expected from a broadcast.
There has also been a time when the word multicast was used in a confusing way to actually mean what the XCAST RFC describes as multi-unicast on page 14. These uses, in particular the naming of XEP 0033, have however been discontinued.
About InnerXML in psyced
The radical new way to fully support Jabber in psyced: If a Jabber client sends a message to an XMPP user, then it will be delivered directly without being interpreted by psyced. In turn, if net/jabber/gateway receives a message for user@localdomain/someresource and there is a Jabber client connected with that resource, it is delivered directly. This may circumvent some extras in psyced, but it enables full XMPP compatibility by being transparent.
Quite often Jabber is being advertised as being more secure because of its interserver encryption using TLS. Some even propose to remove dialback from the XMPP standard, because Jabber is secure now. fippo's recent survey however shows that only 4% of publicly accessible XMPP servers actually provide proper valid certification, everything else is either untrustworthy or unencrypted.
Questions and Answers
Why don't you help Jabber getting better instead of spending so much time documenting its weaknesses?
By writing a Jabber server and gateway as a part of psyced we are just documenting the problems we keep encountering. This page is a by-product of that. At the same time we are helping Jabber. Our first XEP (XMPP Extension Protocol, formerly known as Jabber Enhancement Proposal aka JEP) on Dialback Key Generation and Validation has been accepted as number 0185. Also XEP 0190 on Best Practice for Closing Idle Streams has been adopted.
See above for our Smart Unicast XEPs, which have been rejected in a council meeting on 2006-06-14. The hard data had been given to them before. See in the 2.2 Scalability paragraph above. Also the test framework existed at the time. psyced implements this experimental extension (just #define XMPPERIMENTAL).
Why do messages to Jabber users take so long to echo on the screen sometimes?
First of all, feedback that a message has successfully reached its destination, so-called echoes, is a PSYC feature. The Jabber gateway generates an echo as soon as it successfully delivered a message to the recipient Jabber server, which isn't exactly the same thing, but an acceptable approximation.
Jabber echoes should therefore actually appear quicker than PSYC echoes. Sometimes they don't, however, because most Jabber servers shut down inactive network links after a short period. Such links are considered a cause for load rather than a chance for quick delivery - which is an architectural problem. As a consequence, after that short period a new connection has to be established between the servers. Using the XMPP protocol this means opening two TCP connections, involving two TCP handshake processes plus the XMPP dialback handshakes. This adds up to a typical delay of several seconds.
This still is no cause for concern. Only when no echoes appear at all, you should presume that your message has not successfully reached the destination, as is the case with PSYC messages, too. Normally you will receive a detailed error notice as soon as your server gives up on delivery of your message, but this may take a while as psyced keeps a queue of outgoing messages and tries to recover the communication link a couple of times before returning the queue to its respective senders.
- Discussion on Routing on the XMPP Mailing List
- MUC Traffic Issues discussion
- Server Reliability Comparison Chart
- don’t criticize design choices from 10+ years ago if better alternatives were not available hmmm.....