This is a very simple PHP script that converts a relative URL to an absolute URL, given a base URL, e.g. converting lena.jpg to http://example.com/a/b/lena.jpg.
When I searched Google for scripts that convert relative URLs to absolute ones, I came across a site that had done this, but only in Perl. It was a very simple script compared to the others, so I tried porting the code to PHP.
Note what the Perl geek said about this:
“I wrote this mainly because someone asked how to do this in PHP. Turns out there is nothing in like the venerable URI module in all of PHP-dom (how pathetic is that for a language as web-centric as it?)”
Here’s the script:
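<?php
/**
 * Convert a relative URL to an absolute URL, given a base URL.
 *
 * @param string $rel  the relative URL
 * @param string $base the base URL
 * @return string the absolute URL
 */
function rel2abs($rel, $base)
{
    // An empty reference points at the base URL itself
    if ($rel === '')
        return $base;

    // Already absolute (has a scheme): return unchanged
    if (parse_url($rel, PHP_URL_SCHEME) != '')
        return $rel;

    // Fragment or query only: append to the base URL
    if ($rel[0] == '#' || $rel[0] == '?')
        return $base.$rel;

    // Split the base URL; the path and port may be absent
    $parts = parse_url($base);
    $path = isset($parts['path']) ? $parts['path'] : '';
    $port = isset($parts['port']) ? ':'.$parts['port'] : '';

    // Drop the file name from the base path unless $rel is root-relative
    $abs = ($rel[0] == '/' ? '' : preg_replace('#/[^/]*$#', '', $path))."/$rel";

    // Collapse "//", "/./" and "/dir/../" until nothing changes
    $re = array('#(/\.?/)#', '#/(?!\.\.)[^/]+/\.\./#');
    for ($n = 1; $n > 0; $abs = preg_replace($re, '/', $abs, -1, $n));

    // Discard any "../" that would climb above the root
    return $parts['scheme'].'://'.$parts['host'].$port.str_replace('../', '', $abs);
}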
The results are shown in the table below, given the base URL
http://example.com/a/b/c/page.php
| Relative URL | Absolute URL |
| --- | --- |
| lena.jpg | http://example.com/a/b/c/lena.jpg |
| ./lena.jpg | http://example.com/a/b/c/lena.jpg |
| ../lena.jpg | http://example.com/a/b/lena.jpg |
| ../../lena.jpg | http://example.com/a/lena.jpg |
| ../../../lena.jpg | http://example.com/lena.jpg |
| /lena.jpg | http://example.com/lena.jpg |
| ../x/lena.jpg | http://example.com/a/b/x/lena.jpg |
| ../../x/y/lena.jpg | http://example.com/a/x/y/lena.jpg |
| ../../../x/y/z/lena.jpg | http://example.com/x/y/z/lena.jpg |
| ../../../../../../lena.jpg | http://example.com/lena.jpg |
| http://google.com | http://google.com |
| ?f=lena.jpg | http://example.com/a/b/c/page.php?f=lena.jpg |
| #lena | http://example.com/a/b/c/page.php#lena |
When would I need this function?
Let’s say you’re writing a link checker script. You download a page, grab the links, and make HTTP requests to check for broken links:
$file = 'http://bsd-noobz.com/blog/';
$doc = new DOMDocument;
$doc->loadHTMLFile($file);
$xpath = new DOMXPath($doc);
$elems = $xpath->query('*//a');
foreach ($elems as $elem) {
    $next_file = rel2abs($elem->getAttribute('href'), $file);
    //...Check for broken links...
}
The links need to be converted to absolute URLs so you can make HTTP requests.
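One way to fill in the "check for broken links" step is sketched below. It is only an illustration, not part of the original script: the helper names status_code() and is_broken() are made up for this example, and treating every status of 400 or above as "broken" is an assumption you may want to adjust.

```php
<?php
/**
 * Hypothetical helper: extract the numeric status code from an HTTP
 * status line such as "HTTP/1.1 404 Not Found".
 * Returns 0 if the line is not a status line.
 */
function status_code($status_line)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $status_line, $m))
        return (int) $m[1];
    return 0;
}

/**
 * Hypothetical helper: a link counts as broken when no response arrives
 * or the last status code seen is 400 or above (an assumed threshold).
 */
function is_broken($url)
{
    // get_headers() follows redirects; every hop adds its own status line
    $headers = @get_headers($url);
    if ($headers === false)
        return true;

    $code = 0;
    foreach ($headers as $line) {
        $c = status_code($line);
        if ($c > 0)
            $code = $c; // keep the status of the final response
    }
    return $code === 0 || $code >= 400;
}
```

Inside the loop above you could then call is_broken($next_file) and report the links for which it returns true.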
The function fails for relative URLs xxx and yyy?
The function above assumes that the base URL points to a file. If the base URL points to a directory, you should add a trailing slash. Failing to do so will give you a headache later. Consider this URL,
http://example.com/a/b/c
where “/c” is actually a directory, but the script assumes that it’s a file. If you use the above base URL for converting
img/lena.jpg
You will obtain
http://example.com/a/b/img/lena.jpg
This is wrong because the base URL does not end with a slash. Add a trailing slash to the base URL to get the correct result:
http://example.com/a/b/c/img/lena.jpg
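If you already know that a base URL is a directory, the trailing-slash fix can be wrapped in a tiny helper. This is a minimal sketch: as_directory() is a hypothetical name, and it assumes the base URL has no query string or fragment (otherwise the slash would land in the wrong place).

```php
<?php
/**
 * Hypothetical helper: make sure a base URL that is known to be a
 * directory ends with exactly one trailing slash.
 * Assumes $base carries no query string or fragment.
 */
function as_directory($base)
{
    return rtrim($base, '/').'/';
}

echo as_directory('http://example.com/a/b/c'), "\n";
// prints http://example.com/a/b/c/
```

Passing the result as the base, i.e. rel2abs('img/lena.jpg', as_directory('http://example.com/a/b/c')), gives the correct http://example.com/a/b/c/img/lena.jpg.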
This is the hard part. The script cannot tell from the URL alone whether “/c” is a file or a directory, so you have to figure it out yourself.
You can obtain this from the HTTP headers. Let’s take the above link checker as an example. Say you’re fetching http://example.com/a/b/c,
GET /a/b/c HTTP/1.1
Host: example.com
Connection: close
The server recognizes that “/c” is a directory and redirects you to the correct location:
HTTP/1.1 301 Moved Permanently
Location: http://example.com/a/b/c/
You should be able to catch this redirect and use the URL in the “Location:” header for the base URL.
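A rough sketch of catching that redirect follows. The helper name effective_base() is hypothetical, and the code assumes the header lines come in the flat format returned by get_headers($url) and that the “Location:” value is an absolute URL. Servers may also send a relative Location, in which case you would have to resolve it with rel2abs() first.

```php
<?php
/**
 * Hypothetical helper: given the URL that was requested and the header
 * lines of the response(s), return the URL to use as the base.
 * The last "Location:" header wins; without one, $url is kept.
 * Assumes every Location value is an absolute URL.
 */
function effective_base($url, array $headers)
{
    $base = $url;
    foreach ($headers as $line) {
        // Header names are case-insensitive, hence stripos()
        if (stripos($line, 'Location:') === 0)
            $base = trim(substr($line, strlen('Location:')));
    }
    return $base;
}

// The redirect response from the example above
$headers = array(
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://example.com/a/b/c/',
);
echo effective_base('http://example.com/a/b/c', $headers), "\n";
// prints http://example.com/a/b/c/
```

In the link checker, $headers would come from get_headers($file), which returns false on failure, so check for that before calling the helper.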
Thank you. I am using the function you wrote.
Maaaaaaaaan, thank you a lot!
I tried to write this script myself and didn’t get anywhere.
You really helped me.
Once again, thank you.