
PHP

Displaying links safely and validly and then working with the GET parameter data:

​

I always forget exactly how to do this. There are several points to consider. First off, all of this applies only if you are actually displaying a link, not just forming a URL. If you are displaying a link, the most obvious consideration is making sure the HTML gets displayed in a safe way, as you always must when part of the HTML is dynamic, that is, based on data that changes. That data could contain anything, including characters that are special in HTML. Personally, I have been using the built-in function htmlspecialchars() for this. I pass in the text that I want to make safe as the first parameter and ENT_QUOTES as the second, which ends up looking something like this:

​

$safeHtml = htmlspecialchars($unsafeHtml, ENT_QUOTES);

​

The thing to be careful of is that you only want to pass in the part of the HTML that is dynamically generated and is not intended to be interpreted as HTML. If you pass in your own tags as well, they get escaped too and show up on the page as literal text.
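
As a minimal sketch of that point, the snippet below escapes only a dynamic value and then drops it into hard-coded markup. The sample text and the $paragraphHtml variable are made up for illustration.

```php
<?php
// Hypothetical user-supplied text (illustrative only).
$unsafeName = 'Tom & "Jerry" <script>';

// Escape only the dynamic part...
$safeName = htmlspecialchars($unsafeName, ENT_QUOTES);

// ...then place it inside the hard-coded HTML, which is left untouched.
$paragraphHtml = "<p>Hello, $safeName</p>";

echo $paragraphHtml, "\n";
// <p>Hello, Tom &amp; &quot;Jerry&quot; &lt;script&gt;</p>
```

The surrounding p tags stay real tags because they never went through htmlspecialchars().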

​

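I no longer remember why I use htmlspecialchars() instead of htmlentities(), but I thought I had a good reason to choose this at one point and it seems to have served me well. One likely reason is that htmlspecialchars() converts only the handful of characters that are actually special in HTML (&, <, >, and the quotes), whereas htmlentities() converts every character that has a named entity. The same thing goes for ENT_QUOTES, which makes the function convert single quotes as well as double quotes. That matters when the value ends up inside a single-quoted attribute.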

​

This is only half of the story for making the link: it makes the HTML safe to display in general, but it does not ensure the URL of the link is valid. It is also the second step rather than the first (the URL encoding has to happen before the HTML escaping), but explaining things in this order lets you understand the simpler concept first.

​

Ensuring the URL of a link is valid becomes an issue you have to consider when part of it is generated / dynamic. If you are just typing in what the URL should be, then you only have to make sure you write it in a valid way to begin with.

​

To make sure the URL is generated in a valid way, use urlencode() to encode the parts of the URL that are dynamically generated (normally you would not use rawurlencode()). urlencode() is simple: it takes one parameter and returns one value. Using it might look something like this:

​

$encodedGetParameterData = urlencode($rawGetParameterData);

​

The name urlencode is slightly misleading because it sounds like you should pass the entire URL to this function. However, if you do that, it will mess up the URL. You should only pass in parts of the URL where none of the data is intended to be interpreted as having a special URL meaning. That is the point of this function: it encodes characters that might otherwise have a special meaning inside a URL, such as these: &?=

Perhaps you are thinking urlencode() would be more accurately named if it were called urlparameterdataencode(). This shows you are beginning to understand, but that would not really be correct either as this function could be used to make safe any part of the URL that is both dynamically generated and requires encoding. That could include the GET parameter names or even parts of the URL before the parameters begin. However it is quite common for those parts of the URL to be hard coded and thus they often do not need to be encoded.
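
To make the difference concrete, here is a small sketch that encodes just the dynamic value and, for contrast, the whole URL. The example.com address, the q parameter, and the search term are placeholders I made up.

```php
<?php
// Hypothetical search term containing characters with special URL meanings.
$rawTerm = 'salt & vinegar?';

// Right: encode only the dynamic value, then append it to the hard-coded part.
$goodUrl = 'https://example.com/search.php?q=' . urlencode($rawTerm);

// Wrong: encoding the whole URL also encodes the ':', '/', '?' and '='
// that were supposed to keep their special meanings.
$badUrl = urlencode('https://example.com/search.php?q=' . $rawTerm);

echo $goodUrl, "\n"; // https://example.com/search.php?q=salt+%26+vinegar%3F
echo $badUrl, "\n";  // https%3A%2F%2Fexample.com%2Fsearch.php%3Fq%3Dsalt+%26+vinegar%3F
```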

​

Here is a more "complete" (displays the link) example that illustrates the concepts and shows how all of this can come together, showing the function calls in the right order and way:

​

// $unsafeName is a name a user decided to search on. It could be anything, therefore it could contain

// characters that have a special meaning in HTML or in URLs, or are not valid as part of a URL

$safeName = htmlspecialchars(urlencode($unsafeName), ENT_QUOTES);

// $unsafeAddress is an address a user decided to search on. It could be anything, therefore it could

// contain characters that have a special meaning in HTML or in URLs, or are not valid as part of a URL

$safeAddress = htmlspecialchars(urlencode($unsafeAddress), ENT_QUOTES);

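// The URL needs a scheme (otherwise the browser treats it as a relative path), and the & between

// parameters is written as &amp; because the URL sits inside an HTML attribute

$linkHtml = "<a href='https://mystupidwebsite.com/search.php?name=$safeName&amp;address=$safeAddress'>Search again</a>";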

​

Don't ask me why I came up with an example where the page that has the user search criteria is linking to itself...
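
A different way to get the same kind of link, sketched here as an alternative rather than as the approach used above, is to let PHP's built-in http_build_query() encode the names and values, and then escape the finished URL once. The example.com address, the sample values, and the link text are assumptions for illustration.

```php
<?php
// Hypothetical user input (illustrative only).
$unsafeName = 'Tom & Jerry';
$unsafeAddress = '12 Main St #4';

// http_build_query() URL-encodes each name and value and joins them with
// the separator given as the third argument.
$query = http_build_query(
    ['name' => $unsafeName, 'address' => $unsafeAddress],
    '',
    '&'
);

// Hard-coded part plus the generated query string.
$url = 'https://example.com/search.php?' . $query;

// Escape the whole URL once for use inside an HTML attribute.
// This is what turns the & separator into &amp;.
$linkHtml = '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '">Search again</a>';

echo $query, "\n";    // name=Tom+%26+Jerry&address=12+Main+St+%234
echo $linkHtml, "\n";
```

The order is the same as in the hand-built version: URL encoding first, HTML escaping last.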

​

Here comes the last part. Do nothing. Once that link is clicked and the request reaches your script, PHP has already decoded the GET parameter data back to the original data by the time you read it from $_GET. You do not need to use urldecode(), and calling it anyway can corrupt values that contain a plus or percent sign.
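
Since $_GET is only filled during a real request, the sketch below uses parse_str(), which parses a string as if it were the query string of a URL, as a stand-in. The query string and parameter names are made up.

```php
<?php
// Query string as it would look in the address bar after the link is clicked.
$queryString = 'name=Tom+%26+Jerry&sum=1%2B1';

// parse_str() decodes the values the way PHP does when it fills $_GET.
parse_str($queryString, $params);

echo $params['name'], "\n"; // Tom & Jerry
echo $params['sum'], "\n";  // 1+1

// Decoding a second time is harmful: the literal plus sign becomes a space.
$decodedTwice = urldecode($params['sum']);
echo $decodedTwice, "\n";   // 1 1
```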

​

So which one is right, urlencode() or rawurlencode()? The difference centers on the way they handle spaces and the tilde character (~). urlencode() encodes spaces by changing them into plus signs, whereas rawurlencode() encodes them as %20, the same percent-encoding used for every other character that requires encoding. By encoding things in this more consistent way, rawurlencode() follows RFC 3986 (the standard that defines URI syntax), whereas urlencode() deviates from it exactly where it behaves differently from rawurlencode(). As for the tilde, rawurlencode() leaves it alone in current versions of PHP, while urlencode() still encodes it as %7E.

Why would someone want to encode spaces as a plus sign by using urlencode()? This is consistent with the way forms send data (the application/x-www-form-urlencoded media type), and PHP decodes it properly when it fills $_GET. Why would you use rawurlencode() instead? To make sure RFC 3986 is being followed, which matters in parts of a URL where a plus sign is not read as a space, such as the path. The documentation's own example is forming an ftp URL, and it also mentions protecting URLs from being mangled by transmission media that convert characters, such as some email systems.
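
A quick side-by-side, assuming a reasonably recent PHP version, shows both differences at once. The sample string is arbitrary.

```php
<?php
// Arbitrary sample containing spaces and a tilde.
$sample = 'fried potatoes ~ salt';

$formStyle = urlencode($sample);    // spaces become +, tilde becomes %7E
$rfcStyle  = rawurlencode($sample); // spaces become %20, tilde is left alone

echo $formStyle, "\n"; // fried+potatoes+%7E+salt
echo $rfcStyle, "\n";  // fried%20potatoes%20~%20salt
```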

​

Questions I'm asking myself:

​

Would it be safer to encode the entire portion of the URL from the start of the parameters to the end of the URL? This might be the case if it was only the combination of generated and hard coded text that resulted in part of the URL having a special URL meaning, but I'm not sure if such a case actually exists. The same thing might be true of making it safe to display as HTML.

​

Is it really necessary to make the encoded parts of the URL safe to display as HTML like what I showed in the code examples above? Would all the special HTML characters already be encoded because they are also special URL characters and would the way they are encoded always ensure there are no unwanted special HTML meanings?
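
One way to probe that second question is a quick experiment. The sketch below checks whether htmlspecialchars() changes anything in a urlencode() result, and then does the same for a query string that contains a hard-coded & separator. The sample values are made up, and a single test string is evidence rather than proof.

```php
<?php
// Hypothetical value packed with characters that are special in HTML.
$unsafe = "<b>\"Tom\" & 'Jerry'</b>";

$encoded = urlencode($unsafe);
echo $encoded, "\n"; // %3Cb%3E%22Tom%22+%26+%27Jerry%27%3C%2Fb%3E

// For this input, urlencode() left nothing for htmlspecialchars() to escape.
$valueUnchanged = htmlspecialchars($encoded, ENT_QUOTES) === $encoded;
var_dump($valueUnchanged); // bool(true)

// The hard-coded & between parameters is a different matter.
$query = 'name=' . $encoded . '&address=' . urlencode('12 Main St');
$escapedQuery = htmlspecialchars($query, ENT_QUOTES);
$queryUnchanged = $escapedQuery === $query;
var_dump($queryUnchanged); // bool(false)
```

So for this sample the encoded value comes through untouched, while the separator between parameters is the part that still needs HTML escaping.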

​

Global Variable Creation and Access

​

The following applies whether or not you are using a namespace, because PHP namespaces affect classes, interfaces, functions, and constants, but not variables. If you create a variable outside of any class or function, it is considered a global variable:

 

$myGlobalVariable = 'fried potato slices are tasty with salt and ketchup';

​

You can then access that variable from any context, including inside a function or class method by using the $GLOBALS superglobal array, like this:

​

$frenchFryAddictsOnlyThought = $GLOBALS['myGlobalVariable'];

​

Sometimes you may want to check if the variable exists first, which you can do like this:

​

if (isset($GLOBALS['myGlobalVariable'])) {

    $frenchFryAddictsOnlyThought = $GLOBALS['myGlobalVariable'];

}

​

Alternatively, you could access the variable by specifying you want to work with the global version of a variable, which you can do with the global keyword. In the context of the previous code sample, it would look like this:

​

if (isset($GLOBALS['myGlobalVariable'])) {

    global $myGlobalVariable;

}

​

Then you could simply use that variable via $myGlobalVariable like so:

​

if (isset($GLOBALS['myGlobalVariable'])) {

    global $myGlobalVariable;

    $myGlobalVariable .= ' and grilled onions and cheese';

}
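
Putting the pieces together, here is a small self-contained sketch with both access styles used from inside functions. The function names addToppings() and readThought() are made up for illustration.

```php
<?php
// Created outside any function or class, so it is a global variable.
$myGlobalVariable = 'fried potato slices are tasty with salt and ketchup';

function addToppings(): void
{
    // Without this declaration, $myGlobalVariable would be a separate,
    // undefined local variable inside the function.
    global $myGlobalVariable;

    $myGlobalVariable .= ' and grilled onions and cheese';
}

function readThought(): ?string
{
    // Read through the superglobal array; null if the variable does not exist.
    return $GLOBALS['myGlobalVariable'] ?? null;
}

addToppings();
$thought = readThought();
echo $thought, "\n";
// fried potato slices are tasty with salt and ketchup and grilled onions and cheese
```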

​

​
