Major disadvantage: Performance inefficiency

Many people are currently turning away from XML and turning to JSON for efficiency gains, but if you look at the libpsyc benchmarks comparing XML, JSON and PSYC you will find out that they both aren't very performant formats unless you really make use of their respective strengths. Check those benchmark results out, they're really quite enlightening. PSYC turns out several factors faster than both, whatever input you feed it.

Major disadvantage: Binary Data Transfer

Just like HTTP, PSYC has the ability to say, the following n bytes are binary data - transfer them without trying to interpret them. This may sound totally unspectacular, but several protocols cannot do that. It makes File Transfers or embedding of a photograph or cryptographic key as simple as a fingersnap. XML instead has no simple solution for binary transfers. You either have to escape all XML control characters, or encode the data using something like base64 that needlessly consumes computation power and bandwidth, or you have to encode each message as a separate XML document, and put a binary capable framing protocol around it. XMPP doesn't do that, either.

It also means that it can't just deliver XML data. Ironically, any non-XML protocol is much better suited for delivery of XML than XMPP – because XMPP itself looks like XML so it can't handle anything similar to it. Just like HTTP, PSYC can simply transmit XML as is while XMPP has to encode it to avoid collision with its own syntax. The result will also be quite hard to read, should anyone need to do so. Either escaped or base64-encoded.

Alternatively, if you twist the XML data in a way that the XML document becomes itself part of the XMPP protocol, then it works. But that's not simple, not transparent, not spontaneous at all: you have to define an XMPP protocol extension with a custom namespace for that. You can't say you're delivering an XML document at the snap of a finger as you would when using HTTP.

XMPP extensions like SOAP over XMPP or ATOM over XMPP re-implement the foreign XML format into XMPP stanzas. This means that whenever you want to receive those packets, you have to parse them, then render the tree structure into a regular XML document, then pass it on to the software library which implements SOAP, ATOM (or whatever).

On the other side when you want to publish a document you received via SOAP or ATOM you have to parse it, then re-render it into XMPP. This is waste of processing power! The only way to avoid this is either to not use libraries and implement everything yourself, or to not use XMPP. Guess what PSYC does: It knows there is an XML thing in its message body. It can pass it on without even looking at it.

Major problem with XMPP: Missing framing

Being able to provide the length of a packet isn't an advantage only for binary packets. It generally allows to wait for the complete packet before starting to work on it, which allows for more efficient low-level read operations. In the case of PSYC nodes providing routing services only, it is even better as parsing of the packet can stop immediately after the routing header of the packet. Everything else is just read into a buffer at once - very processing-friendly.

In XML you have to parse the complete tree before you can find out if something has reached completion, so you have to try parsing every little chunk of data coming from the network, because it might just be the last one you've been waiting for - this makes processing more expensive. XML parser implementations have gotten pretty smart at guessing the number of closed braces to figure out when it is worthwhile to start parsing, still it is an unnecessarily complicated heuristic approach. Several protocols solve this by wrapping a non-XML length-prefixed framing protocol around XML. In the case of XMPP this is however not being done:

Dave Cridland writes:

"You're correct in your assertion that framed data would mean clean binary transfers, but that isn't a goal of XMPP - anyone shipping binary data directly through XMPP is simply doing something inefficient. If you do want to ship large binary objects, it's more efficient to send them either via email, or send a URL via XMPP, and ship the data by a more suitable protocol."

And we would prefer to transfer any data across our PSYC links once we have them up and ready for use, not be limited to XML packets, especially since an HTTP file transfer cannot be multicast.

Flash and some other applications introduced null bytes into the XML stream to frame packets. This doesn't help optimizing read operations, but at least gives you a chance to postpone parsing. XMPP could adopt that.