URL Management

Complete URL management for a Web application involves two aspects:

  1. When a user request comes in the form of a URL, the application needs to parse it into understandable parameters.
  2. The application needs to provide a way of creating URLs so that the created URLs can be understood by the application.

For a Yii application, these are accomplished with the help of CUrlManager.

Note: You can specify URLs without using Yii, but this is not recommended: you would not be able to change URLs through configuration without touching application code, and the application would be less portable.

1. Creating URLs

Although URLs can be hardcoded in controller views, it is often more flexible to create them dynamically:

$url=$this->createUrl($route,$params);

where $this refers to the controller instance; $route specifies the route of the request; and $params is a list of GET parameters to be appended to the URL.

By default, URLs created by createUrl are in the so-called get format. For example, given $route='post/read' and $params=array('id'=>100), we would obtain the following URL:

/index.php?r=post/read&id=100

where parameters appear in the query string as a list of Name=Value pairs joined by ampersand characters, and the r parameter specifies the request route. This URL format is not very user-friendly because it requires several non-word characters.

Tip: To generate a URL with a fragment (anchor), for example /index.php?r=post/read&id=100#title, specify a parameter named # by calling $this->createUrl('post/read',array('id'=>100,'#'=>'title')).

We could make the above URL look cleaner and more self-explanatory by using the so-called path format which eliminates the query string and puts the GET parameters into the path info part of URL:

/index.php/post/read/id/100

To change the URL format, we should configure the urlManager application component so that createUrl can automatically switch to the new format and the application can properly understand the new URLs:

array(
    ......
    'components'=>array(
        ......
        'urlManager'=>array(
            'urlFormat'=>'path',
        ),
    ),
);

Note that we do not need to specify the class of the urlManager component because it is pre-declared as CUrlManager in CWebApplication.

Tip: The URL generated by the createUrl method is a relative one. In order to get an absolute URL, we can prefix it with Yii::app()->request->hostInfo, or call createAbsoluteUrl.
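As a minimal sketch of the tip above, the following lines could appear inside a controller action or view. The route and parameters are reused from the earlier example; the variable names are only illustrative.

```php
// Assumes $this is a controller instance (e.g. inside an action or a view).
$relative = $this->createUrl('post/read', array('id' => 100));

// Option 1: prefix the relative URL with the host info of the current request.
$absolute = Yii::app()->request->hostInfo . $relative;

// Option 2: let the controller build the absolute URL directly.
$absolute = $this->createAbsoluteUrl('post/read', array('id' => 100));
```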

2. User-friendly URLs

When path is used as the URL format, we can specify some URL rules to make our URLs even more user-friendly. For example, we can generate a URL as short as /post/100, instead of the lengthy /index.php/post/read/id/100. URL rules are used by CUrlManager for both URL creation and parsing purposes.

To specify URL rules, we need to configure the rules property of the urlManager application component:

array(
    ......
    'components'=>array(
        ......
        'urlManager'=>array(
            'urlFormat'=>'path',
            'rules'=>array(
                'pattern1'=>'route1',
                'pattern2'=>'route2',
                'pattern3'=>'route3',
            ),
        ),
    ),
);

The rules are specified as an array of pattern-route pairs, each corresponding to a single rule. The pattern of a rule is a string used to match the path info part of URLs. The route of a rule should refer to a valid controller route.

Besides the above pattern-route format, a rule may also be specified with customized options, like the following:

'pattern1'=>array('route1', 'urlSuffix'=>'.xml', 'caseSensitive'=>false)

Starting from version 1.1.7, the following format may also be used (that is, the pattern is specified as an array element), which allows specifying several rules with the same pattern:

array('route1', 'pattern'=>'pattern1', 'urlSuffix'=>'.xml', 'caseSensitive'=>false)

In the above, the array contains a list of extra options for the rule. Possible options are explained as follows:

  • pattern: the pattern to be used for matching and creating URLs. This option has been available since version 1.1.7.

  • urlSuffix: the URL suffix used specifically for this rule. Defaults to null, meaning using the value of CUrlManager::urlSuffix.

  • caseSensitive: whether this rule is case sensitive. Defaults to null, meaning using the value of CUrlManager::caseSensitive.

  • defaultParams: the default GET parameters (name=>value) that this rule provides. When this rule is used to parse the incoming request, the values declared in this property will be injected into $_GET.

  • matchValue: whether the GET parameter values should match the corresponding sub-patterns in the rule when creating a URL. Defaults to null, meaning using the value of CUrlManager::matchValue. If this property is false, it means a rule will be used for creating a URL if its route and parameter names match the given ones. If this property is set true, then the given parameter values must also match the corresponding parameter sub-patterns. Note that setting this property to true will degrade performance.

  • verb: the HTTP verb (e.g. GET, POST, DELETE) that this rule must match in order to be used for parsing the current request. Defaults to null, meaning the rule can match any HTTP verb. If a rule can match multiple verbs, they must be separated by commas. When a rule does not match the specified verb(s), it will be skipped during the request parsing process. This option is only used for request parsing. This option is provided mainly for RESTful URL support. This option has been available since version 1.1.7.

  • parsingOnly: whether the rule is used for parsing request only. Defaults to false, meaning a rule is used for both URL parsing and creation. This option has been available since version 1.1.7.
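As an illustration of the verb option combined with the array format, a rule set along the following lines could map the same pattern to different actions depending on the HTTP verb. The post/read, post/update and post/delete routes are assumptions made for this sketch; they are not defined elsewhere in this section.

```php
'urlManager'=>array(
    'urlFormat'=>'path',
    'rules'=>array(
        // The same pattern is declared three times; the verb decides which rule parses the request.
        array('post/read',   'pattern'=>'post/<id:\d+>', 'verb'=>'GET'),
        // Multiple verbs are separated by commas.
        array('post/update', 'pattern'=>'post/<id:\d+>', 'verb'=>'PUT, POST'),
        array('post/delete', 'pattern'=>'post/<id:\d+>', 'verb'=>'DELETE'),
    ),
),
```

Because verb is only considered during request parsing, URL creation for these routes still works as with ordinary rules.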

3. Using Named Parameters

A rule can be associated with a few GET parameters. These GET parameters appear in the rule's pattern as special tokens in the following format:

<ParamName:ParamPattern>

where ParamName specifies the name of a GET parameter, and the optional ParamPattern specifies the regular expression that should be used to match the value of the GET parameter. When ParamPattern is omitted, the parameter matches any characters except the slash /. When creating a URL, these parameter tokens will be replaced with the corresponding parameter values; when parsing a URL, the corresponding GET parameters will be populated with the parsed results.

Let's use some examples to explain how URL rules work. We assume that our rule set consists of three rules:

array(
    'posts'=>'post/list',
    'post/<id:\d+>'=>'post/read',
    'post/<year:\d{4}>/<title>'=>'post/read',
)

  • Calling $this->createUrl('post/list') generates /index.php/posts. The first rule is applied.

  • Calling $this->createUrl('post/read',array('id'=>100)) generates /index.php/post/100. The second rule is applied.

  • Calling $this->createUrl('post/read',array('year'=>2008,'title'=>'a sample post')) generates /index.php/post/2008/a%20sample%20post. The third rule is applied.

  • Calling $this->createUrl('post/read') generates /index.php/post/read. None of the rules is applied.

In summary, when using createUrl to generate a URL, the route and the GET parameters passed to the method are used to decide which URL rule is applied. If every parameter associated with a rule can be found in the GET parameters passed to createUrl, and if the route of the rule also matches the route parameter, the rule will be used to generate the URL.

If the GET parameters passed to createUrl are more than those required by a rule, the additional parameters will appear in the query string. For example, if we call $this->createUrl('post/read',array('id'=>100,'year'=>2008)), we would obtain /index.php/post/100?year=2008. In order to make these additional parameters appear in the path info part, we should append /* to the rule. Therefore, with the rule post/<id:\d+>/*, we can obtain the URL as /index.php/post/100/year/2008.
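The /* suffix described above might be configured as in the following fragment. It is only a sketch that reuses the post/read rule from the earlier example.

```php
'rules'=>array(
    // The trailing /* lets additional GET parameters be appended
    // to the path info as name/value segments.
    'post/<id:\d+>/*'=>'post/read',
),

// With this rule, the call from the paragraph above:
//   $this->createUrl('post/read', array('id'=>100, 'year'=>2008));
// yields /index.php/post/100/year/2008 instead of /index.php/post/100?year=2008.
```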

As we mentioned, the other purpose of URL rules is to parse the requested URLs. Naturally, this is the inverse process of URL creation. For example, when a user requests /index.php/post/100, the second rule in the above example will apply, which resolves to the route post/read and the GET parameter array('id'=>100) (accessible via $_GET).

Note: Using URL rules will degrade application performance. This is because when parsing the request URL, CUrlManager will attempt to match it with each rule until one can be applied. The more rules there are, the greater the performance impact. Therefore, a high-traffic Web application should minimize its use of URL rules.

4. Parameterizing Routes

We may reference named parameters in the route part of a rule. This allows a rule to be applied to multiple routes based on matching criteria. It may also help reduce the number of rules needed for an application, and thus improve the overall performance.

We use the following example rules to illustrate how to parameterize routes with named parameters:

array(
    '<_c:(post|comment)>/<id:\d+>/<_a:(create|update|delete)>' => '<_c>/<_a>',
    '<_c:(post|comment)>/<id:\d+>' => '<_c>/read',
    '<_c:(post|comment)>s' => '<_c>/list',
)

In the above, we use two named parameters in the route part of the rules: _c and _a. The former matches a controller ID to be either post or comment, while the latter matches an action ID to be create, update or delete. You may name the parameters differently as long as they do not conflict with GET parameters that may appear in URLs.

Using the above rules, the URL /index.php/post/123/create would be parsed as the route post/create with GET parameter id=123. And given the route comment/list and GET parameter page=2, we can create a URL /index.php/comments?page=2.
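To make the creation side concrete, the calls below sketch how the three parameterized rules above would be used from a controller. The second call is an additional illustration that follows the same logic as the first rule; the expected results assume the path format with the script name still shown.

```php
// Matches the third rule ('<_c>/list'); the extra parameter goes to the query string.
$url = $this->createUrl('comment/list', array('page'=>2));
// expected: /index.php/comments?page=2

// Matches the first rule ('<_c>/<_a>') because 'update' is one of the allowed action IDs.
$url = $this->createUrl('post/update', array('id'=>123));
// expected: /index.php/post/123/update
```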

5. Parameterizing Hostnames

It is also possible to include the hostname in the rules for parsing and creating URLs. Part of the hostname can be extracted as a GET parameter. For example, the URL http://admin.example.com/en/profile may be parsed into GET parameters user=admin and lang=en. Conversely, rules with a hostname can also be used to create URLs with parameterized hostnames.

In order to use parameterized hostnames, simply declare URL rules with host info, e.g.:

array(
    'http://<user:\w+>.example.com/<lang:\w+>/profile' => 'user/profile',
)

The above example says that the first segment in the hostname should be treated as the user parameter while the first segment in the path info should be the lang parameter. The rule corresponds to the user/profile route.

Note that CUrlManager::showScriptName will not take effect when a URL is being created using a rule with parameterized hostname.

Also note that the rule with parameterized hostname should NOT contain the sub-folder if the application is under a sub-folder of the Web root. For example, if the application is under http://www.example.com/sandbox/blog, then we should still use the same URL rule as described above without the sub-folder sandbox/blog.
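As a hedged sketch of the creation side, a call like the following could be used with the hostname rule above. The parameter values are taken from the earlier parsing example; the exact result may depend on the host of the current request.

```php
// Assumes the rule 'http://<user:\w+>.example.com/<lang:\w+>/profile' => 'user/profile'
// is configured and $this is a controller instance.
$url = $this->createUrl('user/profile', array('user'=>'admin', 'lang'=>'en'));
// The resulting URL should be of the form http://admin.example.com/en/profile
```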

6. Hiding index.php

There is one more thing that we can do to further clean our URLs, i.e., hiding the entry script index.php in the URL. This requires us to configure the Web server as well as the urlManager application component.

We first need to configure the Web server so that a URL without the entry script can still be handled by the entry script. For Apache HTTP server, this can be done by turning on the URL rewriting engine and specifying some rewriting rules. We can create the file /wwwroot/blog/.htaccess with the following content. Note that the same content can also be put in the Apache configuration file within the Directory element for /wwwroot/blog.

RewriteEngine on

# if a directory or a file exists, use it directly
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# otherwise forward it to index.php
RewriteRule . index.php

We then configure the showScriptName property of the urlManager component to be false.
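Put together, the urlManager configuration for this step might look like the following fragment. It is a sketch that reuses the post/<id:\d+> rule from the earlier examples; only showScriptName is new here.

```php
'urlManager'=>array(
    'urlFormat'=>'path',
    // Do not include the entry script (index.php) in generated URLs.
    'showScriptName'=>false,
    'rules'=>array(
        'post/<id:\d+>'=>'post/read',
    ),
),
```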

Now if we call $this->createUrl('post/read',array('id'=>100)), we would obtain the URL /post/100. More importantly, this URL can be properly recognized by our Web application.

7. Faking URL Suffix

We may also add a suffix to our URLs. For example, we can have /post/100.html instead of /post/100. This makes it look more like a URL to a static Web page. To do so, simply configure the urlManager component by setting its urlSuffix property to the suffix you like.
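For instance, the .html suffix mentioned above could be configured as in this fragment. The other settings are carried over from the earlier examples and are assumptions of the sketch, not requirements of urlSuffix.

```php
'urlManager'=>array(
    'urlFormat'=>'path',
    'showScriptName'=>false,
    // Appended to every generated URL and stripped again when parsing requests.
    'urlSuffix'=>'.html',
    'rules'=>array(
        'post/<id:\d+>'=>'post/read',
    ),
),
```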

8. Using Custom URL Rule Classes

Note: Using custom URL rule classes has been supported since version 1.1.8.

By default, each URL rule declared with CUrlManager is represented as a CUrlRule object which performs the task of parsing requests and creating URLs based on the rule specified. While CUrlRule is flexible enough to handle most URL formats, sometimes we still want to enhance it with special features.

For example, in a car dealer website, we may want to support the URL format like /Manufacturer/Model, where Manufacturer and Model must both match some data in a database table. The CUrlRule class will not work because it mostly relies on statically declared regular expressions which have no database knowledge.

We can write a new URL rule class by extending from CBaseUrlRule and use it in one or multiple URL rules. Using the above car dealer website as an example, we may declare the following URL rules:

array(
    // a standard rule mapping '/' to 'site/index' action
    '' => 'site/index',
 
    // a standard rule mapping '/login' to 'site/login', and so on
    '<action:(login|logout|about)>' => 'site/<action>',
 
    // a custom rule to handle '/Manufacturer/Model'
    array(
        'class' => 'application.components.CarUrlRule',
        'connectionID' => 'db',
    ),
 
    // a standard rule to handle 'post/update' and so on
    '<controller:\w+>/<action:\w+>' => '<controller>/<action>',
),

In the above, we use the custom URL rule class CarUrlRule to handle the URL format /Manufacturer/Model. The class can be written like the following:

class CarUrlRule extends CBaseUrlRule
{
    public $connectionID = 'db';
 
    public function createUrl($manager,$route,$params,$ampersand)
    {
        if ($route==='car/index')
        {
            if (isset($params['manufacturer'], $params['model']))
                return $params['manufacturer'] . '/' . $params['model'];
            else if (isset($params['manufacturer']))
                return $params['manufacturer'];
        }
        return false;  // this rule does not apply
    }
 
    public function parseUrl($manager,$request,$pathInfo,$rawPathInfo)
    {
        if (preg_match('%^(\w+)(/(\w+))?$%', $pathInfo, $matches))
        {
            // check $matches[1] and $matches[3] to see
            // if they match a manufacturer and a model in the database
            // If so, set $_GET['manufacturer'] and/or $_GET['model']
            // and return 'car/index'
        }
        return false;  // this rule does not apply
    }
}

The custom URL rule class must implement the two abstract methods declared in CBaseUrlRule:

  • createUrl()

  • parseUrl()

Besides the above typical usage, custom URL rule classes can also be implemented for many other purposes. For example, we can write a rule class to log the URL parsing and creation requests. This may be useful during the development stage. We can also write a rule class to display a special 404 error page in case all other URL rules fail to resolve the current request. Note that in this case, the rule of this special class must be declared as the last rule.
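The logging idea mentioned above could be sketched as follows. The class name LoggingUrlRule, its location under application.components and the log category are hypothetical choices for this illustration. Both methods return false so that the remaining rules still get a chance to handle the request.

```php
// Hypothetical rule class that only records URL parsing and creation requests.
class LoggingUrlRule extends CBaseUrlRule
{
    public function createUrl($manager,$route,$params,$ampersand)
    {
        // Record the route being turned into a URL, then let other rules handle it.
        Yii::log('createUrl: '.$route, CLogger::LEVEL_TRACE, 'application.url');
        return false;  // this rule never applies
    }

    public function parseUrl($manager,$request,$pathInfo,$rawPathInfo)
    {
        // Record the path info being parsed, then let other rules handle it.
        Yii::log('parseUrl: '.$pathInfo, CLogger::LEVEL_TRACE, 'application.url');
        return false;  // this rule never applies
    }
}
```

Since rules are tried in order until one applies, such a rule would presumably be declared first, e.g. array('class'=>'application.components.LoggingUrlRule'), so that it sees every request before any other rule matches.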
