URL Map (conf/urlmap.xml)

The URL map maps source servers onto paths in the current site. It is used in two situations:

  • to translate the site path of the incoming FIT URL into a proper source URL
  • to rewrite source URLs (in the loaded content) to FIT URLs

For an introduction to how FIT handles URLs, read the article about URL Rewriting.

CAUTION: The URL map is not a security feature! Use Request ACLs to specify the permitted data sources.

Syntax

The path attribute defines the local path in the FIT URL. The source attribute defines the source URL, where content is loaded from.

<urlmap mandatory="true">
  <map path="/shop" source="//shop.example.com/" />
  <map path="/" source="//example.com/" />
</urlmap>

The map rules are evaluated in the order of their definition, not by path length or specificity, so the first rule with a matching path is used. “Matching” means that the subject (site path or source URL) starts with the corresponding value of the rule (its path or source attribute, respectively).

In the example above, the incoming site path /shop/index matches both /shop and /. Because /shop is defined first, it takes precedence, and the resulting main URL is http://shop.example.com/index. For this reason, the fallback rule / should be placed at the end of the map.
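The first-match behaviour can be modelled in a few lines of Python. This is only an illustrative sketch, not FIT's implementation: the function name to_source_url, the tuple representation of the rules, and the way the remaining path is joined onto the source are assumptions made for the example.

```python
# Illustrative model of first-match prefix lookup; not FIT's actual code.
RULES = [
    ("/shop", "//shop.example.com/"),
    ("/", "//example.com/"),
]

def to_source_url(site_path, rules, scheme="http"):
    """Return the source URL for the first rule whose path prefixes site_path."""
    for path, source in rules:
        if site_path.startswith(path):
            # Whatever follows the matched prefix is appended to the source.
            rest = site_path[len(path):].lstrip("/")
            # Sources without a protocol inherit the scheme of the request.
            base = scheme + ":" + source if source.startswith("//") else source
            return base.rstrip("/") + "/" + rest
    return None

print(to_source_url("/shop/index", RULES))
print(to_source_url("/shop/index", list(reversed(RULES))))
```

With the rules in their original order, the first call yields http://shop.example.com/index. With the order reversed, the fallback rule / matches first and the result is http://example.com/shop/index, which is why the fallback belongs at the end.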

The URL Rewriter uses the URL map “flipped”: the source URLs are matched against the source attributes in order. The first hit defines the path in the FIT URL.
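The “flipped” lookup can be sketched the same way. The following Python model is an assumption-laden illustration, not FIT's code: to_fit_path is a hypothetical name, and ignoring the scheme for protocol-less sources is a guess at how such rules match both HTTP and HTTPS URLs.

```python
# Illustrative model of the reverse ("flipped") lookup; not FIT's actual code.
RULES = [
    ("/shop", "//shop.example.com/"),
    ("/", "//example.com/"),
]

def to_fit_path(source_url, rules):
    """Return the FIT path for the first rule whose source prefixes source_url."""
    # Strip the scheme so that protocol-less sources can match any scheme.
    _, _, without_scheme = source_url.partition(":")
    for path, source in rules:
        candidate = without_scheme if source.startswith("//") else source_url
        if candidate.startswith(source):
            rest = candidate[len(source):]
            return path.rstrip("/") + "/" + rest
    # No rule handles this source URL.
    return None

print(to_fit_path("http://shop.example.com/index", RULES))
print(to_fit_path("https://example.com/a/b", RULES))
```

In this model, the first call returns /shop/index and the second returns /a/b. A URL on a host that no rule covers returns None.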

Be careful to define path/source pairs that work in both directions. Since the URL map is designed to map local site paths to URLs pointing to source (or “backend”) servers, using a FIT URL as a source might not work as intended.

Exact matches

In some cases the site path should match the given path rule exactly (rather than starting with the rule value). These patterns are marked with a $ at the end of the rule.

This can be used to define mapping exceptions. This is often used to have the source for the start page at a different location:

<urlmap mandatory="true">
  <map path="/$" source="fit://site/public/index.html" />
  <map path="/" source="//example.com/" />
</urlmap>

Again, order wins over specificity. A /$ rule placed after the / rule will never match, because the / rule already matches every site path.
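The interaction between exact rules and order can be illustrated with a small Python model. It is a sketch under assumptions: find_rule is a hypothetical helper, and treating the trailing $ as “compare for equality instead of prefix” is simply a reading of the description above.

```python
# Illustrative model of exact ($) and prefix rules; not FIT's actual code.
RULES = [
    ("/$", "fit://site/public/index.html"),
    ("/", "//example.com/"),
]

def find_rule(site_path, rules):
    """Return the first (path, source) rule that matches site_path."""
    for path, source in rules:
        if path.endswith("$"):
            # Exact rule: the site path must equal the rule without the $.
            if site_path == path[:-1]:
                return path, source
        elif site_path.startswith(path):
            # Ordinary rule: a prefix match is enough.
            return path, source
    return None

print(find_rule("/", RULES))
print(find_rule("/about", RULES))
print(find_rule("/", list(reversed(RULES))))
```

The start page / hits the exact rule, /about falls through to the fallback, and with the order reversed the exact rule is never reached.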

Protocols

The source part of the rules may be defined with or without protocol and port. If missing, the corresponding values of the current FIT request will be used.

In the example above, / is translated to https://example.com/ if FIT was called via HTTPS. Omitting protocol and port in this way is the recommended approach.

If protocol or port is given explicitly, the current request has no effect on the URL translation. This could be desirable if the source server does not support both HTTP and HTTPS, or if non-canonical ports are used. In the latter case you may specify two rules for the same path that map to different sources:

<urlmap>
  <map path="/" source="http://example.com:8080" />
  <map path="/" source="https://example.com:8443" />
</urlmap>

There is a shorthand syntax for this case:

<urlmap>
  <map path="/" source="//example.com:{8080,8443}" />
</urlmap>
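To make the shorthand concrete, the following Python sketch expands it into the two explicit sources shown earlier. It assumes that the first port in the braces belongs to HTTP and the second to HTTPS, as the pair of explicit rules suggests; expand_ports is a hypothetical helper and not part of FIT.

```python
import re

def expand_ports(source):
    """Expand '//host:{a,b}' into explicit http and https sources.

    Assumption: the first port is used for HTTP, the second for HTTPS.
    """
    match = re.fullmatch(r"//([^:/]+):\{(\d+),(\d+)\}(/.*)?", source)
    if match is None:
        # Not the shorthand form; leave the source untouched.
        return [source]
    host, http_port, https_port, tail = match.groups()
    tail = tail or ""
    return [
        f"http://{host}:{http_port}{tail}",
        f"https://{host}:{https_port}{tail}",
    ]

print(expand_ports("//example.com:{8080,8443}"))
```

Under that assumption, the call produces http://example.com:8080 and https://example.com:8443.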

Default URL Map

If you don’t set up a urlmap.xml for your site, a default URL map is implicitly used. It maps all URLs to the local files of the site:

<urlmap>
  <map path="/" source="fit://site/public/" />
</urlmap>

Mandatory URL Map

The ACLs decide which backend URLs may be loaded. By default, the URL Rewriter also uses the ACLs to decide if the URL should be changed to point to FIT.

When the URL map root attribute mandatory="true" is set, backend URLs must be allowed by the ACLs and also be handled by a map rule in order to be rewritten to FIT. (Note that this is still not a security feature, as a client may deliberately construct a URL that uses an unmapped backend URL.)
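The difference between the two modes comes down to a small decision rule, sketched here in Python. The function and parameter names are hypothetical and only restate the two paragraphs above.

```python
def should_rewrite(allowed_by_acl, has_map_rule, mandatory):
    """Decide whether a backend URL is rewritten to point to FIT."""
    if mandatory:
        # Mandatory map: the ACLs must allow it and a map rule must handle it.
        return allowed_by_acl and has_map_rule
    # Default: the ACLs alone decide.
    return allowed_by_acl

print(should_rewrite(allowed_by_acl=True, has_map_rule=False, mandatory=False))
print(should_rewrite(allowed_by_acl=True, has_map_rule=False, mandatory=True))
```

An allowed but unmapped backend URL is rewritten in the default mode and left unchanged when the map is mandatory.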

Filters

There are cases where your URL mapping depends on the Delivery Context properties. For example, your development and production setups may use different backend URLs. You can accomplish this by using the Dynamic Configuration mechanisms in your urlmap.xml.

The URL map is loaded very early, when most of the request environment is not yet available. This applies to most data under fit://request/* and to the following Delivery Context properties, which are either not set or have a default value:

  • request/url
  • request/path
  • request/query
  • request/ppl
  • request/purpose
  • request/ucm

You may also use data from the incoming request in fit://request/request in your URL map filters. It allows easy access to the incoming HTTP request headers, GET and POST parameters and cookies for the FIT Server (not to be confused with encoded backend cookies that are not yet loaded here).

<urlmap mandatory="true">
  <choose>
    <when test="fit-document('fit://request/request')/request/header[@name='X-View' and @value='dev']">
      <map path="/" source="http://localhost:8000/" />
    </when>
    <otherwise>
      <map path="/" source="//production.example.com/" />
    </otherwise>
  </choose>
</urlmap>

As the URL map is needed to determine the main URL, fit://request/request contains no url attribute at this point (it is added afterwards). However, the DC request properties request/frontend-url or request/host among others may be helpful for URL map filtering.

Keep in mind that data coming from the client should be handled with care. In the example above, anyone could add the X-View header. You should take additional measures to protect your system.

Domain Placeholders

The URL map allows the use of * wildcards for matching and rewriting dynamic domain parts in the hostname of a URL, e.g. to map multiple country codes with a single rule instead of defining a similar rule for each country code. The matched parts are referenced in the path attribute as #1, #2 and so on, numbered in the order of the wildcards.

<urlmap>
  <map path="/#1" source="http://*.wikipedia.org/" />
</urlmap>

For example, the URLs http://de.wikipedia.org/wiki/Test, http://en.wikipedia.org/wiki/Test and http://fr.wikipedia.org/wiki/Test will be shortened equally to:

  • de: http://example.com/de/wiki/Test
  • en: http://example.com/en/wiki/Test
  • fr: http://example.com/fr/wiki/Test

The wildcards and their matches can be used multiple times or even be omitted. In this example, the matches #2 and #4 are discarded whereas #1 is used twice:

<urlmap>
  <map path="/#1/#3/#1" source="http://*.*.*.*/" />
</urlmap>

In the following example, only the last part of the third-level domain is wildcarded (www*). This kind of pattern may be useful if the same content is provided under several subdomains like www1.example.net, www2.example.net and www3.example.net (round robin DNS load balancing):

<urlmap>
  <map path="/#1" source="http://www*.example.net/" />
</urlmap>
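The placeholder examples in this section can be modelled with regular expressions. The Python sketch below is an illustration, not FIT's implementation: the helper names are hypothetical, and it assumes that a wildcard matches at least one character and never spans a dot or a slash.

```python
import re

def compile_source(source):
    """Turn a source pattern with * wildcards into a regex, one group per wildcard."""
    parts = [re.escape(part) for part in source.split("*")]
    # Assumption: a wildcard matches one or more characters except '.' and '/'.
    return re.compile("([^./]+)".join(parts))

def shorten(url, path_template, source):
    """Map a source URL to a FIT path, filling #1, #2, ... from the wildcard matches."""
    match = compile_source(source).match(url)
    if match is None:
        return None
    prefix = re.sub(r"#(\d+)", lambda m: match.group(int(m.group(1))), path_template)
    # Append whatever follows the matched source prefix.
    return prefix.rstrip("/") + "/" + url[match.end():]

print(shorten("http://de.wikipedia.org/wiki/Test", "/#1", "http://*.wikipedia.org/"))
print(shorten("http://a.b.c.d/x", "/#1/#3/#1", "http://*.*.*.*/"))
print(shorten("http://www2.example.net/news", "/#1", "http://www*.example.net/"))
```

In this model the three calls return /de/wiki/Test, /a/c/a/x and /2/news. The second call shows matches #2 and #4 being discarded while #1 is used twice.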

Rewriting of local public resources

Every publicly accessible resource that is referenced via the fit protocol will be rewritten to FIT. These files may be referenced by local URLs, which are FIT URLs that contain a ;local mark denoting the realm of the referenced file (site, local, project or extension). The site path holds the rest of the local file’s path.

This type of URL is used if local paths (e.g. fit://project/public/) are not defined as mapping rules. They will also work if the URL map is mandatory. This makes local (public) files always accessible in your site without having to configure it explicitly.

Example

Let’s say we have a project with two sites (with default URL maps) that use shared resources from the public directory of the project. A site uses an image like this: <img src="fit://project/public/image.jpg">. In this example, the content of the src attribute will be rewritten to /;local=project/image.jpg.

However, if you map fit://project/public/ to /shared/ in your urlmap.xml, the URL will point to /shared/image.jpg.
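Both outcomes of this example can be reproduced with a short Python model. It is a sketch under assumptions: rewrite_local is a hypothetical helper, the rule representation is invented for the example, and the fit://<realm>/public/ layout is assumed to apply to all four realms.

```python
import re

def rewrite_local(url, rules=()):
    """Rewrite a fit:// URL of a public file to a FIT URL."""
    # An explicit map rule for the local path takes effect first.
    for path, source in rules:
        if url.startswith(source):
            return path.rstrip("/") + "/" + url[len(source):]
    # Otherwise fall back to a local URL carrying the ;local mark.
    match = re.fullmatch(r"fit://(site|local|project|extension)/public/(.*)", url)
    if match is None:
        return None
    realm, rest = match.groups()
    return f"/;local={realm}/{rest}"

print(rewrite_local("fit://project/public/image.jpg"))
print(rewrite_local("fit://project/public/image.jpg",
                    rules=[("/shared/", "fit://project/public/")]))
```

Without a matching rule the result is /;local=project/image.jpg. With the /shared/ rule it is /shared/image.jpg.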