How to use delimiters other than ampersands in URLs

I recently came across the idea of using different formatting for URLs. By this, I mean doing things such as omitting the question mark before the query string or using a character other than the ampersand to delimit the variables in the query string. This idea is potentially golden. I understand that there are programmers who feel that the standards should be upheld at all times, including when building URLs. However, I also understand that short, readable URLs are ideal and being able to build the URL as you please can help you towards that goal.

The focus of this tutorial

Before we begin, I’d like to state that this tutorial will focus on changing the ampersand delimiters and the equality signs in URLs, and will touch on using a character other than a question mark to start the query string. As much as I would love to discuss this topic in full, I’m prone to ranting and this tutorial would become far too confusing. We are going to build a function that rebuilds our superglobal $_GET array from any format of query string that we give it. Our goal is to go from this:

http://example.com/example.php?type=page&id=20&title=Hello+World

To this:

http://example.com/example.php?type:page,id:20,title:Hello+World

Simple enough, right? We’ve got this.
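
To make the goal concrete, here is a small sketch of the array that PHP’s built-in parse_str() produces for the conventional query string from the first URL. The aim is for the comma-and-colon version to end up as the same array. The variable names are only for illustration.

```php
<?php
// The conventional query string from the first URL above
$standard = 'type=page&id=20&title=Hello+World';

// parse_str() is PHP's built-in parser for the "&" and "=" format
parse_str($standard, $expected);

print_r($expected);
// Array
// (
//     [type] => page
//     [id] => 20
//     [title] => Hello World
// )
```

Note that the plus sign in the title is decoded to a space, which is something a custom parser has to take care of as well.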

Getting the query string

Before we build the function, we have to know what data we will be dealing with. For example, if you are going to be using a character other than a question mark to start your query string, you need to make use of method #2 (below) to get your query string to work with. We will be finding (or building) our query string now, so that we may use it later.

There are three ways that we can get the query string from this URL:

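  1. One way is to rebuild it from the superglobal $_GET array. This method consists of joining each key to its value with an equality sign and then joining the pairs with ampersands. Because PHP has already decoded the keys and values in $_GET, they have to be encoded again along the way. This method is lengthy and should be a last resort for when, for some reason or another, your $_SERVER variables have incorrect values.

    $key_value_pairs = array();
    
    foreach ($_GET as $key => $value) {
    	// Re-encode each part, since $_GET holds decoded values (assumes no array parameters)
    	$key_value_pairs[] = strlen($value) > 0 ? urlencode($key) . '=' . urlencode($value) : urlencode($key);
    }
    
    $query_string = implode('&', $key_value_pairs);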

  2. Another way is to get the query string from the REQUEST_URI index of the superglobal $_SERVER array. It contains the path and query string of the request as the server received them, but not the scheme or host name. This method should only be used if the next method produces an incorrect value for some reason, or if you are using a character other than a question mark to begin your query string (such as a slash).

    $file_path = '/relative/path/to/this/file.php';
    $query_string_delimiter = '?';
    
    // Escape both values so that characters such as "." and "?" are matched literally
    $regex = '~' . preg_quote($file_path, '~') . '.*?(?:' . preg_quote($query_string_delimiter, '~') . '(.*))?$~';
    
    // $1 holds everything after the delimiter (an empty string if there is none)
    $query_string = preg_replace($regex, '$1', $_SERVER['REQUEST_URI']);

    This code strips the file path and the delimiter, leaving only the query string, which is defined to start after the string in $query_string_delimiter.

  3. Finally, the easiest way is to simply pull the query string out of the QUERY_STRING index of the superglobal $_SERVER array.

    $query_string = $_SERVER['QUERY_STRING'];
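
As a rough sketch, methods 2 and 3 can be combined into one helper that prefers QUERY_STRING and falls back to REQUEST_URI. The function name find_query_string() and its parameters are hypothetical, and it assumes that SCRIPT_NAME holds the path of the current script, which depends on your server setup. It takes a $_SERVER-like array as an argument so that it can be tried out without a real request.

```php
<?php
/**
 * Hypothetical helper: pick the raw query string out of a $_SERVER-like array.
 *
 * @param array $server An array shaped like $_SERVER
 * @param string $delimiter The string that starts the query string
 * @return string
 */
function find_query_string(array $server, $delimiter = '?') {
    // Method 3: use QUERY_STRING when the server populated it
    if (isset($server['QUERY_STRING']) && $server['QUERY_STRING'] !== '') {
        return $server['QUERY_STRING'];
    }

    // Method 2: otherwise look for the delimiter in REQUEST_URI
    $uri = isset($server['REQUEST_URI']) ? $server['REQUEST_URI'] : '';
    $script = isset($server['SCRIPT_NAME']) ? $server['SCRIPT_NAME'] : '';

    // Start searching after the script path, so that a "/" delimiter
    // is not confused with the slashes in the path itself
    $offset = ($script !== '' && strpos($uri, $script) === 0) ? strlen($script) : 0;
    $position = strpos($uri, $delimiter, $offset);

    return ($position === false) ? '' : (string) substr($uri, $position + strlen($delimiter));
}

// A request for /example.php/type:page,id:20 has no "?", so QUERY_STRING is empty
$query_string = find_query_string(array(
    'QUERY_STRING' => '',
    'REQUEST_URI' => '/example.php/type:page,id:20',
    'SCRIPT_NAME' => '/example.php',
), '/');

echo $query_string, "\n"; // type:page,id:20
```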

Building the function

Before we build any function, it’s a good idea to know how flexible we want it to be. Personally, I’d like the ability to use any string that I want in place of the ampersands of the query string and any string that I want in place of the equality signs in the query string. So, we’ll make those options into parameters of the function.

Secondly, we have to figure out what kind of output we want from our function. Since we are essentially replacing the query string, we should alter the superglobal $_GET array directly. As such, there’s no need to actually return anything from our function. From here, we write the code.

/**
 * Rebuild the superglobal $_GET array
 *
 * @param string $query_string The query string
 * @param string $ampersand The replacement for ampersands in the query string
 * @param string $equality The replacement for equality signs in the query string
 * @return void
 */
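function rebuild_get($query_string, $ampersand = '&', $equality = '=') {
	// Compare against '' rather than using empty(), which would also reject the string "0"
	if ((string) $query_string !== '') {
		// Empty the $_GET array
		$_GET = array();
		
		// Insert key => value pairs into the $_GET array
		foreach (explode($ampersand, $query_string) as $pair) {
			// Skip empty segments, such as the one left by a trailing delimiter
			if ($pair === '') {
				continue;
			}
			
			// Split before decoding, so that encoded delimiters stay inside their value
			$pair_data = explode($equality, $pair, 2);
			$_GET[urldecode($pair_data[0])] = (count($pair_data) > 1) ? urldecode($pair_data[1]) : null;
		}
	}
}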
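
If you would rather not overwrite a superglobal while experimenting, the same idea can be sketched as a function that returns the array instead. The name parse_custom_query() is hypothetical, and decoding keys and values with urldecode() is an assumption of this sketch, made so that the result resembles what PHP normally puts in $_GET.

```php
<?php
/**
 * Hypothetical variant of rebuild_get() that returns the parsed array
 * instead of overwriting $_GET.
 */
function parse_custom_query($query_string, $ampersand = '&', $equality = '=') {
    $result = array();

    if ((string) $query_string === '') {
        return $result;
    }

    foreach (explode($ampersand, $query_string) as $pair) {
        if ($pair === '') {
            continue; // Ignore empty segments, such as after a trailing delimiter
        }

        // Split first and decode second, so encoded delimiters survive inside values
        $pair_data = explode($equality, $pair, 2);
        $key = urldecode($pair_data[0]);
        $result[$key] = (count($pair_data) > 1) ? urldecode($pair_data[1]) : null;
    }

    return $result;
}

$params = parse_custom_query('type:page,id:20,title:Hello+World', ',', ':');

print_r($params);
// Array
// (
//     [type] => page
//     [id] => 20
//     [title] => Hello World
// )
```

A function like this is easier to test, since nothing global changes. Assigning its result to $_GET gives the same effect as the function above.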

How to use this function

  1. Retrieve the query string,
  2. Call the function using the query string (with the ampersand and equality sign replacements) before attempting to use the superglobal $_GET array, and
  3. Use the superglobal $_GET array as usual.

So, place this code at the top of every file, or at the top of a file that is included in every file (e.g. a config file):

<?php
rebuild_get($_SERVER['QUERY_STRING'], ',', ':');

// The rest of the page goes here
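
Reading the new format is only half of the job, because the links on your pages have to be written in that format too. Below is a minimal sketch of a helper for that. The name build_custom_query() and its parameters are hypothetical. It relies on urlencode() escaping characters such as commas and colons, so it assumes that your delimiters are characters which urlencode() escapes (a hyphen or a dot, for example, would not be).

```php
<?php
/**
 * Hypothetical helper: build a query string in the custom format.
 *
 * @param array $params The key => value pairs to encode
 * @param string $ampersand The replacement for ampersands
 * @param string $equality The replacement for equality signs
 * @return string
 */
function build_custom_query(array $params, $ampersand = '&', $equality = '=') {
    $pairs = array();

    foreach ($params as $key => $value) {
        // urlencode() escapes "," and ":" inside keys and values, so they
        // cannot be mistaken for the delimiters when the URL is parsed again
        $pair = urlencode((string) $key);

        if ($value !== null && $value !== '') {
            $pair .= $equality . urlencode((string) $value);
        }

        $pairs[] = $pair;
    }

    return implode($ampersand, $pairs);
}

$url = 'http://example.com/example.php?' . build_custom_query(
    array('type' => 'page', 'id' => 20, 'title' => 'Hello World'),
    ',',
    ':'
);

echo $url, "\n";
// http://example.com/example.php?type:page,id:20,title:Hello+World
```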

If you have any questions or have found any glitches, feel free to inform me with a comment.

  • Bryan Lee

    “…that the standards should be upheld at all times…”

    Apparently, the ampersand is not a standard but a convention.
    Which brings up a related issue, I’ve always been annoyed that query strings start with ‘?’ and then use some other character ‘&’ to separate parameters. Why not just use the same character for both?

    foo.com?var1=this?var2=that?var3=whatever

    Is there anything that says that’s not valid? And if so, should anyone care – i.e., does that actually cause problems?

    Every time I see or write code that says “if there’s a question mark in the URL, then add &var=val otherwise add ?var=val” it gets on my nerves….

  • http://koviko.net/ Koviko

    It’s easy to write a regex that doesn’t account for question marks inside of a query string (even though it should), so that’d be the only risk with doing what you’ve suggested. It may also confuse visitors that want to share a URL but without a query string (they may be confused as to where the query string begins). But aside from possible errors from other people, there’s nothing stopping you from doing that.
