Extending the existing IRC URL schemes

The following IRC URL schemes have been considered:

http://www.mozilla.org/projects/rt-messaging/chatzilla/irc-urls.html
http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt
http://tools.ietf.org/html/draft-butcher-irc-url-04

None of them defines IRC identities, however. Each of those proposals addresses how to access a certain network or chatroom. Even though only a few networks enforce nickname ownership, which makes the whole idea of identity flaky on IRC, we still need a way to refer to a sender or recipient in a real-world application.

This proposal fills that gap by specifying an IRC URI scheme that extends the existing IRC URL schemes. The semantic difference is signalled by the presence or absence of the double slash "//": with it, the string is an access URL; without it, an identity.

Since IRC does not provide any protocol for reaching identities on remote servers, this URI scheme has not found much application. We considered using it in conjunction with interserver-level gateways, but concluded that it is more useful to assign PSYC and XMPP uniforms to people on such IRC networks than to define a routing principle for irc: identities. Bummskraut used this syntax to identify IRC contacts, but later dropped the entire URI approach to routing. psyced uses irc: access URLs rather than identities, with the more humble benefit of being compatible with IRC clients that support such syntax. Maybe one day we'll implement a generic routing strategy for IRC identities over gatebots.

See also: IRCPLUS.

The new IRC URI Scheme

   irc:<destination>@<authority>   ; IRC-URI as specified here
   irc://<access>                  ; IRC-URL as specified elsewhere
   IRC-URI-reference := "irc" ":" opaque-string
   opaque-string     := access | identity

   access            := hier_part  ; see RFC 2396 and other IRC URI specs
   identity          := destination "@" authority  
   destination       := "~" nick | channel_prefix channel | reserved_prefix reserved_destination
   nick              := 1*(unreserved | escaped)
                        ; Note: The following characters in nicknames (as defined in RFC 2812) 
                        ; can only be used in URI nicknames in the escaped form:
                        ; "[" | "]" | "\" | "`" | "^" | "{" | "|" | "}"
                        ; See also the note about allowed characters in nicks below.

   channel           := 1*(unreserved | escaped)
                        ; Note: All special characters in the channelname (which are not unreserved) 
                        ; need to be escaped.
                        ; See also the note about allowed characters in channels below.

   channel_prefix    := "*" | "!" | "+" | "&"
                        ; These are the currently defined channel prefixes. 
                        ; Note: The prefix "*" is mapped to the IRC channel prefix "#"

   reserved_destination := 1*(unreserved | escaped)
   reserved_prefix   := alphanum | prefix_mark | ";" | "?" | ":" | "@" | "=" | "$" | ","
                        ; reserved_prefix is basically uric (as defined in RFC 2396) 
                        ; except "%", "/", the nick prefix "~" and the characters in channel_prefix

   prefix_mark       := "-" | "_" | "." | "'" | "(" | ")"

   ; For definition of alphanum see RFC 2396.
   ; For definition of unreserved and escaped see RFC 2396.
   ; For definition of the authority part see RFC 2396. The authority part should
   ; be a host of the IRC network or some other authority which allows you to reach the destination
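
The grammar above can be exercised with a small parser. The following is a minimal Python sketch and not part of the proposal: the function name parse_irc_uri and the layout of the returned dictionary are illustrative assumptions. It treats anything that starts with "/" after the scheme as an access URL and everything else as an identity. It does not validate the authority, does not map "*" to "#", and assumes escaped octets are UTF-8.

```python
from urllib.parse import unquote

NICK_PREFIX = "~"
CHANNEL_PREFIXES = "*!+&"


def parse_irc_uri(uri):
    """Split an irc: URI reference into its parts (illustrative sketch)."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "irc" or not rest:
        raise ValueError("not an irc: URI: %r" % uri)

    # hier_part (RFC 2396) always starts with "/", so a leading slash
    # means this is an access URL, not an identity.
    if rest.startswith("/"):
        return {"kind": "access", "hier_part": rest}

    # identity := destination "@" authority
    # The first character is the prefix. A literal "@" inside the name
    # must be escaped, so the first "@" after the prefix ends the name.
    prefix, tail = rest[0], rest[1:]
    if prefix == "%":
        raise ValueError("'%' is not a valid destination prefix")
    name, at, authority = tail.partition("@")
    if not at or not name or not authority:
        raise ValueError("identity needs <destination>@<authority>")

    if prefix == NICK_PREFIX:
        kind = "nick"
    elif prefix in CHANNEL_PREFIXES:
        kind = "channel"
    else:
        kind = "reserved"

    return {
        "kind": kind,
        "prefix": prefix,
        # unquote() decodes %XX sequences, assuming UTF-8 (an assumption;
        # IRC itself does not define a character encoding).
        "name": unquote(name),
        "authority": unquote(authority),
    }
```

For example, parse_irc_uri("irc:*psyc@irc.example.org") yields a "channel" entry with prefix "*", name "psyc" and authority "irc.example.org", where irc.example.org is a placeholder host.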

Character set of nicknames

The characters actually allowed in the <nick> part of the URI depend on the IRC server or network. If the allowed characters are unknown, they should be limited to the characters that are allowed in a nickname as defined in RFC 2812.

NOTE: The prefix "*"

The channel prefix "*" needs to be mapped to "#" internally. Conversely, to express an IRC channel that has a "#" prefix as a URI, you need to map the "#" back to "*".

In theory we could stick with "#", which means encoding it as "%23". But is that acceptable? Every channel URI would then carry an escape code, and every user would have to learn what it means.

For completeness: Earlier versions of the URI RFC permitted "#" to appear in URIs without a path. This has not been the case since 1998.
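
The mapping between "*" and "#" can be sketched in a few lines of Python. The function names uri_channel_to_irc and irc_channel_to_uri are assumptions made for this example, and both expect the channel string to include its prefix. Only the leading character is mapped; escaping the rest of the name is a separate step.

```python
def uri_channel_to_irc(channel):
    """Turn the URI form of a channel (e.g. "*psyc") into IRC form."""
    if channel.startswith("*"):
        return "#" + channel[1:]
    return channel


def irc_channel_to_uri(channel):
    """Turn the IRC form of a channel (e.g. "#psyc") into URI form.

    Only the leading "#" is mapped. Any further "#" in the name still
    has to be escaped as "%23" when the URI is built.
    """
    if channel.startswith("#"):
        return "*" + channel[1:]
    return channel
```

Channels with the other prefixes ("!", "+", "&") pass through both functions unchanged.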

Character set of channels

The same rules that apply to <nick> apply to <channel>: the characters should be limited to the ones that are allowed in channel names as defined in RFC 2812, unless it is known that the server or network allows a different set of characters in channel names.

References

  • RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax
  • RFC 2812 - Internet Relay Chat: Client Protocol

Developer Tips

Regular Expression

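   /^irc:([~*!+&])([^@]+)@(.+)$/

   $1 is the prefix, which determines the meaning of $2
   $2 is the destination, either a channelname or a nickname
   $3 is the authority

   This pattern only matches the nick prefix "~" and the currently defined channel prefixes; reserved prefixes are not matched.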

   $2 and $3 have to be unescaped as described in RFC 2396 section "2.4. Escape Sequences".

Escaping

<authority>, <nick>, <channel> and <reserved_destination> must be escaped when generating a URI and unescaped when parsing/splitting up the URI.

All characters that don't belong to the restricted set of characters in a URI (see RFC 2396 "2. URI Characters and Escape Sequences") or that are used as delimiters (e.g. "@" in <nick> and <channel>) must be escaped when constructing a URI.
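
For readers who do not use Perl, the construction step might look like the following Python sketch. The function name build_irc_identity and its parameters are assumptions made for this example. It expects the prefix already in URI form ("*" rather than "#"), encodes non-ASCII characters as UTF-8 (an assumption, since IRC does not define an encoding), and takes a plain host name; an authority with a port or user part would have to be assembled from separately escaped components.

```python
from urllib.parse import quote


def build_irc_identity(prefix, name, host):
    """Build an irc: identity URI from unescaped parts (sketch)."""
    # safe="" makes quote() escape every delimiter, including "@" and
    # "/". Letters, digits and "_", ".", "-", "~" are left as they are.
    return "irc:%s%s@%s" % (
        prefix,
        quote(name, safe=""),
        quote(host, safe=""),
    )
```

With the placeholder host irc.example.org, a nickname such as nick[away] comes out as irc:~nick%5Baway%5D@irc.example.org.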

URI escape line for Perl (for nicknames, channelnames):

   # this also works for components of the authority part:
   $other =~ s/([^a-zA-Z0-9_.-])/sprintf("%%%02X", ord($1))/eg;

URI unescape line for Perl:

   $str =~ s/%([0-9a-fA-F]{2})/chr hex ($1)/eg;