PHP Script for Converting Relative to Absolute URL

July 26, 2011

This is a very simple PHP script that converts a relative URL to an absolute URL, given a base URL, e.g. converting lena.jpg to http://example.com/a/b/lena.jpg.

When I searched Google for scripts that convert relative URLs to absolute ones, I came across a site that had done it, but only in Perl. That script is very simple compared to the others, so I ported the code to PHP.

Note what the Perl geek said about this:

“I wrote this mainly because someone asked how to do this in PHP. Turns out there is nothing in like the venerable URI module in all of PHP-dom (how pathetic is that for a language as web-centric as it?)”

Here’s the script:

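<?php
/**
 * Convert a relative URL to an absolute URL, given a base URL.
 *
 * @param   string  $rel   the relative URL
 * @param   string  $base  the base URL
 * @return  string         the absolute URL
 */
function rel2abs($rel, $base)
{
  // An empty reference points at the base itself.
  if ($rel === '')
    return $base;
  // Already absolute (has a scheme): return it unchanged.
  if (parse_url($rel, PHP_URL_SCHEME) != '')
    return $rel;
  // Fragment or query only: replace that part of the base URL.
  if ($rel[0] == '#')
    return preg_replace('/#.*$/', '', $base).$rel;
  if ($rel[0] == '?')
    return preg_replace('/[?#].*$/', '', $base).$rel;

  // Read the parts explicitly instead of using extract(), so a base URL
  // without a path or port does not leave variables undefined.
  $parts  = parse_url($base);
  $scheme = $parts['scheme'];
  $host   = $parts['host'];
  $port   = isset($parts['port']) ? ':'.$parts['port'] : '';
  $path   = isset($parts['path']) ? $parts['path'] : '/';

  // A root-relative path replaces the base path; anything else replaces
  // only the last segment of the base path.
  $abs = ($rel[0] == '/' ? '' : preg_replace('#/[^/]*$#', '', $path))."/$rel";
  // First pattern collapses "//" and "/./"; second resolves "dir/../".
  $re  = array('#(/\.?/)#', '#/(?!\.\.)[^/]+/\.\./#');

  // Repeat until a pass makes no more replacements.
  for ($n = 1; $n > 0; $abs = preg_replace($re, '/', $abs, -1, $n));
  // Drop any leftover "../" that would climb above the root.
  return $scheme.'://'.$host.$port.str_replace('../', '', $abs);
}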

The results are shown in the table below, given the base URL

http://example.com/a/b/c/page.php
Relative URL                  Absolute URL
lena.jpg                      http://example.com/a/b/c/lena.jpg
./lena.jpg                    http://example.com/a/b/c/lena.jpg
../lena.jpg                   http://example.com/a/b/lena.jpg
../../lena.jpg                http://example.com/a/lena.jpg
../../../lena.jpg             http://example.com/lena.jpg
/lena.jpg                     http://example.com/lena.jpg
../x/lena.jpg                 http://example.com/a/b/x/lena.jpg
../../x/y/lena.jpg            http://example.com/a/x/y/lena.jpg
../../../x/y/z/lena.jpg       http://example.com/x/y/z/lena.jpg
../../../../../../lena.jpg    http://example.com/lena.jpg
http://google.com             http://google.com
?f=lena.jpg                   http://example.com/a/b/c/page.php?f=lena.jpg
#lena                         http://example.com/a/b/c/page.php#lena

When would I need this function?

Let’s say you’re writing a link checker script. You download a page, grab the links, and make HTTP requests to check for broken links:

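$file = 'http://bsd-noobz.com/blog/';

// Real-world HTML is rarely valid; keep libxml warnings out of the output.
libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->loadHTMLFile($file);
$xpath = new DOMXPath($doc);
// Select every anchor that actually has an href attribute.
$elems = $xpath->query('//a[@href]');

foreach ($elems as $elem) {
  // Resolve each link against the URL of the page it came from.
  $next_file = rel2abs($elem->getAttribute('href'), $file);
  //...Check for broken links...
}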

The links need to be converted to absolute URLs so you can make HTTP requests.
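
The check itself could look something like the sketch below. It assumes the response headers arrive as an array of lines, which is the shape PHP's get_headers() returns, with one status line per hop when redirects are followed. The helper names last_status_code() and is_broken() are made up for this example, and treating 4xx/5xx codes as "broken" is my own assumption, not a rule.

```php
<?php
// Hypothetical helper: pull the final HTTP status code out of a header
// array shaped like the one get_headers() returns.
function last_status_code(array $headers)
{
  $code = 0;
  foreach ($headers as $line) {
    // Status lines look like "HTTP/1.1 404 Not Found". Keep the last one,
    // since earlier ones belong to redirects.
    if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m))
      $code = (int) $m[1];
  }
  return $code;
}

// Assumption: no status line at all, or a 4xx/5xx code, counts as broken.
function is_broken(array $headers)
{
  $code = last_status_code($headers);
  return $code == 0 || $code >= 400;
}
```

Inside the loop you could then call `$headers = @get_headers($next_file);` and treat the link as broken when `$headers === false || is_broken($headers)`.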

The function fails for relative URL xxx and yyy?

The function above assumes that the base URL points to a file. If the base URL points to a directory, you should add a trailing slash. Failing to do this will give you a headache later. Consider this URL,

http://example.com/a/b/c

where “/c” is actually a directory but the script assumes that it’s a file. If you use the above base URL to convert

img/lena.jpg

You will obtain

http://example.com/a/b/img/lena.jpg

That is wrong because the base URL does not end with a slash. Add a trailing slash to the base URL to get the correct result:

http://example.com/a/b/c/img/lena.jpg
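
Adding the slash can be done in code too. Here is a minimal sketch, assuming you already know that the base URL names a directory. The helper name as_directory() is hypothetical. It puts the slash on the path, in front of any query string or fragment.

```php
<?php
// Hypothetical helper: make sure the path of a base URL ends with "/".
// Only call it when you already know the URL names a directory.
function as_directory($base)
{
  // Split off any query string or fragment so the slash lands on the path.
  $pos  = strcspn($base, '?#');
  $head = substr($base, 0, $pos);
  $tail = substr($base, $pos);
  if (substr($head, -1) != '/')
    $head .= '/';
  return $head.$tail;
}
```

With this, `rel2abs('img/lena.jpg', as_directory('http://example.com/a/b/c'))` should resolve against “/a/b/c/” rather than “/a/b/”.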

This is the hard part. The script has no way to determine from the URL alone whether “/c” is a file or a directory. You have to figure it out yourself.

You can obtain this from the HTTP response headers. Take the link checker above as an example. Say you’re fetching http://example.com/a/b/c:

GET /a/b/c HTTP/1.1
Host: example.com
Connection: close

A server that recognizes “/c” as a directory will typically redirect you to the correct location:

HTTP/1.1 301 Moved Permanently
Location: http://example.com/a/b/c/

You should be able to catch this redirect and use the URL in the “Location:” header for the base URL.
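
Catching the redirect might look like the sketch below. It assumes you have the response header lines in an array (for example from get_headers(), which by default follows redirects and returns the headers of every hop). The helper name base_from_headers() is made up, and for simplicity it only accepts absolute “Location:” values. A relative one is ignored.

```php
<?php
// Hypothetical helper: choose the base URL after a fetch. $headers is a
// list of raw response header lines, e.g. what get_headers() returns.
function base_from_headers($url, array $headers)
{
  foreach ($headers as $line) {
    // Header names are case-insensitive; a later Location overrides an
    // earlier one, so the last redirect target wins.
    if (is_string($line) && preg_match('/^Location:\s*(\S+)/i', $line, $m)) {
      // Assumption: only absolute Location values are handled here.
      if (parse_url($m[1], PHP_URL_SCHEME) != '')
        $url = $m[1];
    }
  }
  return $url;
}
```

The returned URL is what you would pass to rel2abs() as the base instead of the URL you originally requested.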


Comments


  1. Rizky Sat, 11 Feb 2012 19:30:32 UTC

    Thank you. I am using the function you wrote. :D

  2. Rafael Thu, 08 Mar 2012 01:14:23 UTC

    Maaaaaaaaan, thank you a lot!

    I’ve tried do this script by myself and I didn’t get anywhere.

    You really help me.

    Once again, thank you.
