Basic configuration tutorial

Basic setup

Since version 2.25, the normal open-source stanza of configure, make, make install no longer installs the default configuration. You will need to read the file INSTALL that came with the distribution to install the default configuration that this tutorial uses, as well as for things like generating certificates.

The default configuration you get by following the instructions in the file INSTALL is used in this tutorial to explain the basics of how Yxorp works and to show some of the things Yxorp can do.

Nothing complicated is needed to run this configuration; it only needs a machine with Internet access to run Yxorp and a machine (possibly the same one) that runs a browser and can connect to the Yxorp machine. You don't need to set up a web server, because the tutorial uses Yxorp's web site at http://yxorp.sourceforge.net/ (this is why the system needs Internet access).

First, we will look at the contents of the default configuration file.

<?xml version="1.0"?>
<yxorpconfig>
<!-- Main yxorp configuration file -->
<!-- where the log files go -->

<log accesslog="/var/log/yxorpaccess" errorlog="/var/log/yxorperror" />

The first interesting line here is the <log> tag; it sets where the logfiles will go. The file names set here are the files Yxorp will actually write to; log file rotation will cause Yxorp to close the current file, rename it, and open a fresh one with the original name.

The format of the access and error logs can be changed, but for this tutorial, log formats are left at their defaults.

<!-- This is the tcp port the xml listener will run on -->

<configlistener port="7780" ipaddress="127.0.0.1" />

The <configlistener> tag enables the configuration listener. This makes it possible to use online reconfiguration; i.e. you do not have to stop and restart Yxorp every time you make a change in the configuration. Also, the command “yxorpconfig -r” can be used to look at the actual configuration (including all defaults) that is running, the command “yxorprealserver” can be used to set servers in or out of service, “yxorpclientstate” can be used to look at client states, etc.

<!-- request rule that will check and modify a request -->

<rule id="rule1" type="request">

<![CDATA[
if (Host: ~/^[a-z\.]+$/) {
   Host: = "yxorp.sourceforge.net";
} else if (Host: ~/^[0-9\.]+$/) {
   reject("You are not allowed access through the IP address of this server");
} else {
   Host: = "yxorp.sourceforge.net";
}
]]>

</rule>

This <rule> tag defines the rule program that is executed when a request is received. Which rule runs for a request is defined on a listener, as we will see later in this example. This particular rule looks at the value of the Host: header (every variable name ending in a colon ':' is interpreted as a header name). If the Host: header contains only lowercase letters and dots (as in most domain names), the rule changes it to “yxorp.sourceforge.net”. If the Host: header contains only digits and dots (as in IP addresses), the rule rejects the request. Any other value falls through to the final else branch, which also sets the Host: header to “yxorp.sourceforge.net”.
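To make the three branches concrete, here is a small Python sketch of the same decision logic. It is not Yxorp code: the function name apply_rule1 and the tuples it returns are made up for illustration, and it assumes that Python's re module treats these simple patterns the same way Yxorp's rule language does.

```python
import re

# Patterns copied from rule1; assumed to behave the same in Python's re module.
LETTERS_AND_DOTS = re.compile(r"^[a-z\.]+$")
DIGITS_AND_DOTS = re.compile(r"^[0-9\.]+$")

TARGET_HOST = "yxorp.sourceforge.net"
REJECT_TEXT = "You are not allowed access through the IP address of this server"


def apply_rule1(host):
    """Return a hypothetical (action, value) pair for a Host: header value."""
    if LETTERS_AND_DOTS.match(host):
        return ("rewrite", TARGET_HOST)
    if DIGITS_AND_DOTS.match(host):
        return ("reject", REJECT_TEXT)
    # Everything else: mixed letters and digits, a port suffix, upper case, ...
    return ("rewrite", TARGET_HOST)


for host in ["localhost", "192.168.1.10", "localhost:8080"]:
    print(host, "->", apply_rule1(host)[0])
```

Note that a Host: value such as localhost:8080 matches neither pattern and therefore ends up in the final branch.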

<!--the reject rule will be executed if something is wrong with the request-->
<rule id="rule3" type="reject">
<![CDATA[
errorhtml="<html>
   <head>
   <title>Nicht Touchen, U only watchen das Blinkenlights!</title>
   </head>
   <body>
   <h1>Error ";
errorhtml=concat(errorhtml, errorcode, " - ", rejectreason);
errorhtml=concat(errorhtml, "</h1></body></html>");
Server: = "yxorp-x.x";
]]>
</rule>

This <rule> tag defines the reject rule, which is used whenever a request is rejected (either by the Yxorp code itself, or by a call to the reject function in a script). As with the request rule, it is normally set on the listener. This rule composes a customized error message using special variables like errorhtml and errorcode. Note how the errorhtml variable is set; the string spans multiple lines. After the multi-line string, the special variables errorcode and rejectreason are appended; the result is a simple HTML page that shows the error code Yxorp generated (or received from the server) and a text message giving the reason why the request was rejected. In a real situation, you may not want to give out this information; however, the choice is yours, and for this tutorial it makes sense not to hide the reason why Yxorp rejects a request.
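If it helps to see the resulting page, the following Python sketch repeats the three assignments to errorhtml step by step. The function name build_error_page is hypothetical, and the error code 403 used in the call is only an assumed example value; which code Yxorp actually sets for a given rejection is not shown here.

```python
def build_error_page(errorcode, rejectreason):
    """Mirror the three assignments to errorhtml in the reject rule."""
    # First assignment: a string literal that spans several lines.
    errorhtml = """<html>
   <head>
   <title>Nicht Touchen, U only watchen das Blinkenlights!</title>
   </head>
   <body>
   <h1>Error """
    # Second assignment: concat(errorhtml, errorcode, " - ", rejectreason)
    errorhtml = errorhtml + str(errorcode) + " - " + rejectreason
    # Third assignment: concat(errorhtml, "</h1></body></html>")
    errorhtml = errorhtml + "</h1></body></html>"
    return errorhtml


# 403 is an assumed example code, not taken from Yxorp.
page = build_error_page(
    403, "You are not allowed access through the IP address of this server"
)
print(page)
```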

<!-- the definitions for the listeners -->
<listener
   id="test1"
   ipaddress="0.0.0.0"
   port="80"
   rule="rule1"
   rejrule="rule3"
/>

This bit defines the listener. Note especially the IP address; set to 0.0.0.0 as it is here, it causes Yxorp to bind to all the available interfaces in the system. If you set a specific IP address, Yxorp will listen only on that address. Also note the rule and rejrule settings; they link the listener to the rules that run for the requests that come in on this listener. There are several rule types you can set on the listener, and in other places; most go beyond the scope of this chapter. Further on, though, we will deal with special rules that run if servers are not available.

<server id="yxorp.sourceforge.net" virtualserver="group1" />

<virtualserver id="group1" schedule="lru">
   <real id="sf" />
</virtualserver>

<realserver id="sf" ip="dns" inservice="yes" />

This bit defines what happens with the contents of the Host: header in a request (i.e. what would normally be the domain name of a server). The Host: header value is matched to a <server> definition (i.e. the <server> definition should normally have the same name as the domain name). The <server> in this example is named “yxorp.sourceforge.net”, and thus will be selected if the Host: header contains this domain name (remember we used the request rule above to make sure it does). The <server> maps to a <virtualserver> group; a <virtualserver> creates the link between one “logical” <server> and one or more “physical” <realserver> definitions. This is how Yxorp is configured for load balancing.

This virtualserver group defines only one real server (more will be added later on in this tutorial). The last line defines the real server; note that it does not list an IP address but “dns”; this causes the value of the Host: header to be resolved through DNS into an IP address. This is useful for examples and testing, but in a real-life situation it is better to define IP addresses, for reasons of security and performance. For this tutorial the shortcut is convenient, because I don't have to update configurations and documentation if Sourceforge decides to change its IP addresses.
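The chain from Host: header to real server can be pictured as three lookup tables. The Python sketch below only mirrors the three tags from the example; the table layout and the function name candidates_for are illustrative assumptions and say nothing about how Yxorp stores its configuration internally.

```python
# Tables mirroring the <server>, <virtualserver> and <realserver> tags above.
SERVERS = {"yxorp.sourceforge.net": "group1"}
VIRTUALSERVERS = {"group1": ["sf"]}
REALSERVERS = {"sf": {"ip": "dns", "inservice": True}}


def candidates_for(host):
    """Return the real servers that could handle a request for this Host: value."""
    group = SERVERS.get(host)
    if group is None:
        return []  # no <server> definition matches the Host: header
    return [
        name
        for name in VIRTUALSERVERS[group]
        if REALSERVERS[name]["inservice"]
    ]


print(candidates_for("yxorp.sourceforge.net"))  # ['sf']
print(candidates_for("example.org"))            # []
```

This is also why the request rule rewrites the Host: header first: without a matching <server> name, the lookup has nowhere to start.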

If you followed the instructions in the file INSTALL that comes with the source code, the complete configuration file for this example should have been installed on your system. You can also pick it up from http://yxorp.sourceforge.net/examples/yxorpconfig.xml

Testing the configuration

As root, start yxorp with the following command:

# yxorp

Yxorp needs to be started by root; however, it will release most privileges the root user has (on Solaris and Linux). You can also configure Yxorp to change to another userid and group and/or run it in a chroot jail; see the configuration reference for details.

Since version 2, Yxorp starts as a background daemon by default (so it may appear as though nothing happened). You can use any of the following commands to verify that Yxorp is running:

# yxorpconfig -r             (reads the actual configuration from the daemon)
or
# ps -e| grep yxorp          (lists the process table and looks for yxorp)

If you did not install Yxorp into some directory in your path, you'll have to include a path to the yxorp command. Yxorp uses a default configuration file name to start from; the default filename is “yxorpconfig.xml”, and Yxorp looks for this file in the default location for configuration files. On most systems, this is /usr/local/etc. Both can be changed as in the following example:

# yxorp -c my-yxorpconfig.xml -s /directory-for-yxorp-configurations

If you did not change the default installation, and did not set -c <configfilename> as an option to yxorp, the configuration described in the paragraphs above should be active.

Verify it is working by pointing your favorite browser to http://localhost/ ; you should now get the Yxorp website that is really at http://yxorp.sourceforge.net/

You can leave the Yxorp daemon running for the next steps in this tutorial; however, if you want to stop the daemon for some reason, use the following command:

# yxorp -K

Adding SSL

In this step, we will merge the following bits in the active configuration:

<?xml version="1.0"?>
<yxorpconfig>
<!-- the definitions for the ssl listener -->
<listener
   id="test1ssl"
   ipaddress="0.0.0.0"
   port="443"
   rule="rule1"
   rejrule="rule3"
   ssl="yes"
   certfile="yxorptest.pem"
/>
</yxorpconfig>

As you see above, this configuration snippet defines another listener, but with some added fields for SSL. The most important bit is the “ssl=yes” switch, which determines that this listener will actually expect SSL. The other important bit is the certfile definition; this is the name of the file the certificate is stored in. Yxorp will try to read this file from the sysconfdir that was set by configure; usually this is something like /usr/local/etc or /etc.

The certificate should have been installed if you followed the instructions in the INSTALL file, but you can also use your own certificate. Yxorp tries to read the file from the same directory its configuration is in; usually this is /usr/local/etc.

The complete configuration file for this example is available in the Yxorp source tree in the data/examples directory, under the name add_ssl.xml. You can also pick it up from http://yxorp.sourceforge.net/examples/add_ssl.xml

Testing the configuration

If you stopped Yxorp after the step above, start it again. Apply the extra SSL configuration bits by running the command:

# yxorpconfig -c add_ssl.xml

Verify it is working by pointing your favorite browser to https://localhost/ . You should get a popup window from your browser about the certificate, since it is self-signed. Depending on your browser, you may also get a warning about nonsecure items on the page; this is because the Sourceforge logo on top of the Yxorp page is inserted with an absolute http URL, and thus is retrieved directly from the original Sourceforge site instead of being proxied by Yxorp.

Adding basic authentication

A simple example of adding basic authentication is in the following configuration snippet:

<?xml version="1.0"?>
<yxorpconfig>

<rule id="rule1" type="request">
<![CDATA[
if (Host: ~/^[a-z\.]+$/) {
   Host: = "yxorp.sourceforge.net";
} else if (Host: ~/^[0-9\.]+$/) {
   reject("You are not allowed access through the IP address of this server");
} else {
   Host: = "yxorp.sourceforge.net";
}
if (uri ~/^\/yxorp.*$/) {
   basic_auth_check("my-realm", "local");
}
]]>
</rule>

<basicauth realm="my-realm" userid="aladdin" passwd="sesame" />

</yxorpconfig>

Note that the source for rule1 is completely replaced. In the last if-statement, the regexp checks whether the URI begins with /yxorp; if it does, the rule demands basic authentication through the function call 'basic_auth_check("my-realm", "local")'. The basic_auth_check function rejects the request if valid basic authentication credentials are not present in the request; thus, the reject rule we already had will be activated.

The basic authentication credentials that are accepted are defined in

<basicauth realm="my-realm" userid="aladdin" passwd="sesame" />

The realm="my-realm" portion in the definition must correspond to the first parameter in the basic_auth_check call. This allows you to define several “realms”, each of which may give access to different parts of your content. The userid/password combinations are unique within a realm, but not across realms; so the same userid and password may be defined multiple times in several realms.
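The relation between realms and credentials can be sketched as a table keyed by realm name. In the Python below, “my-realm” and aladdin/sesame come from the example; the second realm, its extra user, and the function name credentials_accepted are hypothetical and only serve to show that the same pair may live in more than one realm.

```python
# Accepted (userid, passwd) pairs per realm.
# "admin-realm" and the user "morgiana" are made up for this illustration.
REALMS = {
    "my-realm": {("aladdin", "sesame")},
    "admin-realm": {("aladdin", "sesame"), ("morgiana", "oil-jars")},
}


def credentials_accepted(realm, userid, passwd):
    """True if the pair is defined in the named realm."""
    return (userid, passwd) in REALMS.get(realm, set())


print(credentials_accepted("my-realm", "aladdin", "sesame"))      # True
print(credentials_accepted("my-realm", "morgiana", "oil-jars"))   # False
print(credentials_accepted("admin-realm", "aladdin", "sesame"))   # True
```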

Testing the configuration

If you stopped Yxorp after the step above, start it again. Apply the basic authentication configuration bits by running the command “yxorpconfig -c add_basicauth.xml”. Again, the file is in data/examples or at http://yxorp.sourceforge.net/examples/add_basicauth.xml

Verify it is working by pointing your favorite browser to http://localhost/ . This should work as before. Now point your browser at http://localhost/yxorpdoc-2.html and you should get a popup window asking for userid and password.

Now run the command

yxorpconfig -r

This reads the complete configuration from Yxorp as it is currently active. In the output, look for the line saying:

<basicauth realm="my-realm" base64="YWxhZGRpbjpzZXNhbWU=" />

This is the internal representation that Yxorp uses for basic authentication credentials. Note that this is not any form of encryption (which would not make a lot of sense since basic authentication is not very secure), but the base64-encoded form of userid and password. You can specify either the base64 or userid/password forms in a configuration; Yxorp will always report the base64 form.
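You can check for yourself that the base64 value is only an encoding of “userid:password”, the same form that HTTP basic authentication sends over the wire. The Python sketch below uses the standard base64 module; the two function names are made up for this illustration.

```python
import base64


def encode_credentials(userid, passwd):
    """Encode 'userid:passwd' the way HTTP basic authentication does."""
    raw = f"{userid}:{passwd}".encode("ascii")
    return base64.b64encode(raw).decode("ascii")


def decode_credentials(token):
    """Reverse the encoding; no key or secret is needed."""
    userid, _, passwd = base64.b64decode(token).decode("ascii").partition(":")
    return userid, passwd


print(encode_credentials("aladdin", "sesame"))     # YWxhZGRpbjpzZXNhbWU=
print(decode_credentials("YWxhZGRpbjpzZXNhbWU="))  # ('aladdin', 'sesame')
```

Anyone who can read the configuration output can therefore recover the password, so protect it accordingly.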

Adding support for non-standard headers

Some sites use headers other than those described in RFC2616 and later; also, Yxorp does not necessarily know about all headers in all RFCs (since there are lots). By default, Yxorp discards headers it does not know about. If, however, you need one of these headers, you can add it to Yxorp's table of headers as in the following example:

<?xml version="1.0"?>
<yxorpconfig>
<globalconfiguration>
   <header id="X-Pad:" xlateid="X_Pad:" client="1" server="1"
      check="rfc2616-text" maxlen="80" />
</globalconfiguration>
</yxorpconfig>

In this example, a header named “X-Pad” is added (this header is at the time of writing sent out by Sourceforge's project web servers, where Yxorp's website resides).

Refer to the configuration chapter for full detail on the settings that you can specify for a header. Two things to look at are the check parameter, which sets the character set that Yxorp checks the contents of the header against, and the maxlen parameter, which is the maximum length of this header that Yxorp will accept. For most header attributes, you can specify whether Yxorp should ignore (and discard) the header or reject the request.
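As a rough idea of what a check plus maxlen combination amounts to, here is a Python sketch. It assumes that the “rfc2616-text” check corresponds to the TEXT rule of RFC 2616 (any octet except control characters, with horizontal tab allowed as whitespace); what Yxorp's check really does is documented in the configuration chapter, and both function names here are hypothetical.

```python
def is_rfc2616_text(value):
    """True if value holds no control characters other than horizontal tab."""
    return all(ch == "\t" or (ord(ch) >= 32 and ord(ch) != 127) for ch in value)


def header_acceptable(value, maxlen=80):
    """Apply the length limit first, then the character check."""
    return len(value) <= maxlen and is_rfc2616_text(value)


print(header_acceptable("avoid browser bug"))  # True
print(header_acceptable("x" * 81))             # False: longer than maxlen
print(header_acceptable("bad\x00value"))       # False: contains a control character
```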

Testing the configuration

Testing the effect of the change above is a bit harder than with the previous examples, since I'm not exactly clear on when Sourceforge's servers insert this header, and some browsers don't allow you to look at the headers. One definite way would be to trace the traffic between your browser and Yxorp with Ethereal, for example.

As before, apply the change with the command yxorpconfig -c add_headers.xml; the file is in data/examples or at http://yxorp.sourceforge.net/examples/add_headers.xml

Adding load balancing

To use load balancing, we need to add a second web server (i.e., a “real server”) to the configuration and set up the “virtualserver” to know about this second real server. How this is done is demonstrated by the following configuration snippet:

<?xml version="1.0"?>
<yxorpconfig>

<virtualserver id="group1" schedule="lru">
   <real id="sf" />
   <real id="sf2" />
</virtualserver>

<realserver id="sf" ip="dns" inservice="yes" />
<realserver id="sf2" ip="dns" inservice="yes" />

</yxorpconfig>

Note that what this does is just duplicate the definition of the original real server. In a real example, you would obviously define different servers (otherwise, what's the point of load balancing), and use IP addresses instead of the “dns” testing shortcut.

Testing the configuration

Apply the change with the command yxorpconfig -c add_loadbalancing.xml (the file is in data/examples or at http://yxorp.sourceforge.net/examples/add_loadbalancing.xml) and access the site a couple of times via http://localhost/ . Then look at the access log file (usually /var/log/yxorpaccess) and look for the realserver names in the log entries; these should look similar to the following:

127.0.0.1 27340 [09/Jan/2006:17:39:17 +0100] sf "GET http://yxorp.sourceforge.net/yxorpdoc-2_html_m34ac0a87.png" 200 501 15874 546 15836 501 299 546 261 737
127.0.0.1 27342 [09/Jan/2006:17:39:17 +0100] sf2 "GET http://yxorp.sourceforge.net/yxorpdoc-2_html_m64bcb189.png" 200 501 17568 546 17530 501 299 546 261 753
127.0.0.1 27346 [09/Jan/2006:17:39:17 +0100] sf2 "GET http://yxorp.sourceforge.net/yxorpdoc-2_html_1cb9e497.png" 200 500 17250 545 17212 500 299 545 261 744
127.0.0.1 27344 [09/Jan/2006:17:39:17 +0100] sf "GET http://yxorp.sourceforge.net/yxorpdoc-2_html_m5fd42.png" 200 498 32706 543 32668 498 299 543 261 945

As you see, both realservers are scheduled.
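With more than a handful of requests, counting by eye gets tedious. The Python sketch below tallies the realserver names in the sample lines. It assumes the default log format as shown in those lines, where the realserver name is the field right after the bracketed timestamp; if you changed the log format, the pattern needs to change too. The function name hits_per_realserver is made up for this example.

```python
import re
from collections import Counter

# The four sample access log lines from above, default log format assumed.
SAMPLE_LOG = """\
127.0.0.1 27340 [09/Jan/2006:17:39:17 +0100] sf "GET http://yxorp.sourceforge.net/yxorpdoc-2_html_m34ac0a87.png" 200 501 15874 546 15836 501 299 546 261 737
127.0.0.1 27342 [09/Jan/2006:17:39:17 +0100] sf2 "GET http://yxorp.sourceforge.net/yxorpdoc-2_html_m64bcb189.png" 200 501 17568 546 17530 501 299 546 261 753
127.0.0.1 27346 [09/Jan/2006:17:39:17 +0100] sf2 "GET http://yxorp.sourceforge.net/yxorpdoc-2_html_1cb9e497.png" 200 500 17250 545 17212 500 299 545 261 744
127.0.0.1 27344 [09/Jan/2006:17:39:17 +0100] sf "GET http://yxorp.sourceforge.net/yxorpdoc-2_html_m5fd42.png" 200 498 32706 543 32668 498 299 543 261 945
"""

# Two leading fields, the [timestamp], then the realserver name.
REALSERVER_FIELD = re.compile(r'^\S+ \S+ \[[^\]]+\] (\S+) "')


def hits_per_realserver(lines):
    """Count log entries per realserver name; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = REALSERVER_FIELD.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


counts = hits_per_realserver(SAMPLE_LOG.splitlines())
print(dict(counts))  # {'sf': 2, 'sf2': 2}
```

To run it against a live log, pass an open file instead of the sample, for instance hits_per_realserver(open("/var/log/yxorpaccess")).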

You can now have a look at the realserver status with the yxorprealserver command, as follows:

# yxorprealserver -v
sf          : inservice        available        config
sf2         : inservice        available        config
#

Other options on the yxorprealserver command allow you to manually set a server out-of-service or unavailable:

# yxorprealserver -v -u sf
sf          : inservice        unavailable      config
sf2         : inservice        available        config
#

Note the difference between inservice and available; Yxorp can automatically set a realserver out-of-service if it fails to respond to a connection several times in a row (and Yxorp will also automatically set it inservice again if it responds to a new connection that is scheduled by the auto-wakeup mechanism). Yxorp will however not change the available/unavailable flag.

There is another difference between inservice and available; this has to do with sticky load balancing. See the example on that for a discussion.

One last thing to try with this setup is what happens if you set both realservers out-of-service. Note the ugly message a client gets in this case. This is where the sorry rule comes in.

Adding a sorry rule

If no servers in a virtualserver group are available, it is possible to let a sorry rule run. A sorry rule can generate a custom HTML page, as we did above with the reject rule. However, especially with a complex layout for a sorry page, this can be a lot of work, and another technique is possible that we will see in the example below:

<?xml version="1.0"?>
<yxorpconfig>

<rule id="sorryrule" type="sorry">
   redirect("http://yxorp.sourceforge.net/noserver.html");
</rule>

<server id="yxorp.sourceforge.net" virtualserver="group1" sorry="sorryrule" />

</yxorpconfig>

The sorry rule just generates a redirect (i.e., a 307 Temporary Redirect status code, with a Location: header set to http://yxorp.sourceforge.net/noserver.html).

Also note that the definition of the sorryrule is on the server level, not on the virtualserver. This is because it may be necessary to have a different sorry page for each server definition, even though these servers may share the virtualserver group.
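For readers who have not looked at a redirect on the wire, the response a client receives could look roughly like what the following Python sketch builds. Only the 307 status and the Location: header are taken from the description above; the HTTP/1.1 version string, the Content-Length header, and the function name build_redirect are assumptions, and the headers Yxorp really sends may differ.

```python
def build_redirect(location):
    """Build a minimal 307 response that points the client at location."""
    return (
        "HTTP/1.1 307 Temporary Redirect\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"  # assumed; an empty body keeps the sketch simple
        "\r\n"
    )


response = build_redirect("http://yxorp.sourceforge.net/noserver.html")
print(response)
```

The browser then fetches the Location: URL itself, which is why the sorry page can live on any server that is still reachable.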

Testing the configuration

As above, activate the file add_sorryrule.xml from data/examples or http://yxorp.sourceforge.net/examples/add_sorryrule.xml. Then make sure both realservers are set out-of-service or unavailable, and try http://localhost/ . If all has gone well, you should now be redirected to http://yxorp.sourceforge.net/noserver.html.

Sticky load balancing

Sticky load balancing is the technique where many clients are distributed evenly over a group of servers, but each request from a specific client is always sent to the same server, i.e. the load balancer “remembers” the association between a client and a server. This is necessary if an application runs on the webservers that is not stateless, and there is no other means of sharing the state over the servers. Whether to share state at the web/application server level or to use sticky load balancing is a discussion that goes rather far beyond the scope of this tutorial.
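The idea of “remembering” the association fits in a few lines of Python. This is a generic sketch of the technique, not Yxorp's implementation: the class name StickyBalancer is made up, and new clients are handed out round-robin purely to keep the example short.

```python
import itertools


class StickyBalancer:
    """Assign each new client to a server once, then keep returning that server."""

    def __init__(self, realservers):
        # Round-robin for new clients is an assumption made for this sketch.
        self._next_server = itertools.cycle(realservers)
        self._sticky = {}  # client id -> realserver name

    def pick(self, client_id):
        if client_id not in self._sticky:
            self._sticky[client_id] = next(self._next_server)
        return self._sticky[client_id]


balancer = StickyBalancer(["sf", "sf2"])
print(balancer.pick("client-a"))  # sf
print(balancer.pick("client-b"))  # sf2
print(balancer.pick("client-a"))  # sf again: the mapping is remembered
```

The dictionary in this sketch is also where the cost comes from: it grows with every new client until entries are cleaned up.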

Yxorp can do sticky load balancing; however, the implementation has some drawbacks compared to straight load balancing. At the least, the total throughput of Yxorp will decrease somewhat. Also, the table in which the state mapping is maintained may grow large, especially if someone is doing denial-of-service attacks on your site (or worse, targeted attacks tailored to let Yxorp's tables grow).

The way Yxorp keeps track of which client is which is by setting a session cookie. This cookie will by default have the name “<your-hostname>_state”, and will be valid until the end of the “session” (in most cases, this is the lifetime of the browser instance). The cookie value will be a randomly generated string like XL61DS0FPS4PLEDXUIBVIB8LLU301IZX1ZOB1WVWSOW9SWNMG2CEXGZFW6CQ1L. The cookies and associated client state information are timed out after a configurable time of inactivity. See the chapters on client state for more details.
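To get a feeling for what such a cookie looks like, the Python sketch below generates a value of the same length and from the same kind of characters as the sample (uppercase letters and digits). How Yxorp really generates its values is not described here, so the alphabet, the use of the secrets module, and both function names are assumptions.

```python
import secrets
import string

# Sample value from the text; only its length and character set are used.
SAMPLE = "XL61DS0FPS4PLEDXUIBVIB8LLU301IZX1ZOB1WVWSOW9SWNMG2CEXGZFW6CQ1L"
ALPHABET = string.ascii_uppercase + string.digits


def new_state_token(length=len(SAMPLE)):
    """Return an unpredictable token of uppercase letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def state_cookie_name(hostname):
    """Default cookie name as described above: <your-hostname>_state."""
    return f"{hostname}_state"


print(state_cookie_name("myhost"), "=", new_state_token())
```

The value needs to be hard to guess, since anyone who presents another client's cookie is treated as that client.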

The configuration changes to enable sticky load balancing are simple:

<?xml version="1.0"?>
<yxorpconfig>

<virtualserver id="group1" sticky="yes">
   <real id="sf" />
   <real id="sf2" />
</virtualserver>

</yxorpconfig>

In fact, the only change this makes is the “sticky=yes” part.

Testing the configuration

Activate the configuration from http://yxorp.sourceforge.net/examples/add_sticky.xml or data/examples. Then, point your browser to http://localhost/ and cause a couple of hits.

Then, use the yxorpclientstate command as follows:

# yxorpclientstate -v
<yxorpclientstate>

<clientstate
   id="IQ3AHIIPFKEDO1SQUQ1H67CPX68NMFSUARUWQESXGEXGGQT9TU74DJYU92C7V0"
   clientip="127.0.0.1" sticky="1" lastactive="[09/Jan/2006:23:52:37 +0100]"
   hitcount="1" toclient="9850" fromclient="451" toserver="9850"
   fromserver="451" >

<stickymap server="yxorp.sourceforge.net" realserver="sf2" />

</clientstate>

<tablestats entries="1" tablebytesize="824" maxchainlength="1"
   tabletruncated="0" />

</yxorpclientstate>
#

As you see, the value of the state cookie is linked to a lot of information. For this example, note the stickymap tag; this links requests for a certain server to a specific realserver.

Client states (and following from that, also the sticky load balancing mappings) are cleaned up automatically by Yxorp after a configurable time of inactivity; see the chapter on client state for more details.

Sticky loss rules

Similar to the sorry rules we already saw, there is also the case in sticky load balancing when the mapped server is no longer available. In this case, Yxorp has a special rule called “stickyloss” that handles what happens with the request.

<?xml version="1.0"?>
<yxorpconfig>

<rule id="stickylossrule" type="stickyloss">
   redirect("http://yxorp.sourceforge.net/stickyloss.html");
</rule>

<server id="yxorp.sourceforge.net" stickyloss="stickylossrule" />

</yxorpconfig>

You would typically want a stickyloss rule to explain that clients have lost their state in an application, and need to retry on another server.

Testing the configuration

Activate the configuration from http://yxorp.sourceforge.net/examples/add_stickyloss.xml or data/examples. Use the yxorprealserver command to check if you need to restore the realservers to available and inservice after the experiments in the previous example. Then, point your browser to http://localhost/ and cause a couple of hits.

Then, use the yxorpclientstate command as follows:

# yxorpclientstate
<yxorpclientstate>

<clientstate
   id="PQJZNM51ELUXTKDNECCZQIRLC1ME0BZBALX68AWBLMKIE978NV22YEXWLWMKV4"
   clientip="127.0.0.1" sticky="1" lastactive="[10/Jan/2006:00:35:36 +0100]"
   hitcount="3" toclient="29584" fromclient="1763" toserver="29584"
   fromserver="1763" >

<stickymap server="yxorp.sourceforge.net" realserver="sf" />

</clientstate>

<tablestats entries="1" tablebytesize="824" maxchainlength="1"
   tabletruncated="0" />

</yxorpclientstate>
#

Note the server that is mapped (in this case it is sf), and set it out-of-service as follows:

# yxorprealserver -v -o sf
sf          : out-of-service   available        config
sf2         : inservice        available        config
#

and see what happens if you reload your browser (remember, it needs to be the same window as before, since Yxorp's state cookie should only be valid in a single browser instance). If everything is correct, nothing appears to have changed. Why? Because we only set the sf server out-of-service; it is no longer eligible for new sticky mappings or for normal scheduling, but it continues to serve clients that already have a valid state.

Next, to see the stickylossrule in action, we will do:

# yxorprealserver -v -u sf
sf          : out-of-service   unavailable      config
sf2         : inservice        available        config
#

If you reload your browser again, you should now see the effect of the stickyloss rule.
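The behaviour seen in this walkthrough can be summarized in a small Python sketch. It encodes only what was observed above (out-of-service servers keep serving existing sticky states, unavailable servers trigger the stickyloss rule); the class, the function names, and the outcome strings are hypothetical, and the real decision logic in Yxorp may take more into account.

```python
from dataclasses import dataclass


@dataclass
class RealServer:
    name: str
    inservice: bool = True
    available: bool = True


def eligible_for_new_clients(server):
    """New mappings and normal scheduling need both flags set."""
    return server.inservice and server.available


def outcome(server, has_sticky_state):
    """Describe what happens to a request that would go to this server."""
    if has_sticky_state:
        # Existing states survive out-of-service, but not unavailable.
        return "serve" if server.available else "stickyloss"
    return "serve" if eligible_for_new_clients(server) else "schedule elsewhere"


sf = RealServer("sf")
sf.inservice = False        # like: yxorprealserver -v -o sf
print(outcome(sf, True))    # serve
print(outcome(sf, False))   # schedule elsewhere
sf.available = False        # like: yxorprealserver -v -u sf
print(outcome(sf, True))    # stickyloss
```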