Tutorial: The Service of Server-Side Includes
Many librarians are turning to dynamically generated Web pages to more effectively manage the growing numbers of library Web pages. Creating Web pages on-the-fly minimizes the amount of HTML coding required by staff, encourages HTML-resistant staff to participate in Web site maintenance and development, and allows libraries to offer more customization to their Web site visitors. It also allows libraries to respond more quickly to needed design or content changes. As Antelman notes, "To move forward, libraries must stop thinking of their Web sites as collections of HTML pages and view them as dynamic resources for information and services that patrons will use in highly individualized ways." 1
Server-side includes (SSIs) are special commands embedded in an HTML document that tell the server to perform specific actions when a user requests the page. The server then creates the Web page on-the-fly by merging files or inserting requested information. From the Web administrator's perspective, maintenance is reduced considerably, since a single change to a single file affects all other files pointing to it. Users viewing the Web page will be unable to tell that SSIs have been used (see figure 1). Even if users view the source code, they will see only the generated HTML code, not the SSIs (see figure 2). Only the Web administrator or an author with Web server access sees the code with the SSIs (see figure 3).
Setting up the Web server to use SSIs requires two steps. A general description of the steps for the Apache server running under UNIX follows. Filenames and directory paths will vary depending upon the system version and configuration. For Netscape or IIS servers, see the server documentation for setup.
First, find the Options directive in the Directory section of the global access configuration file. On newer systems, this file is named httpd.conf, while older versions may use access.conf. In the Directory section, one of two lines is required: either Options Includes, which permits all types of includes, or Options IncludesNoExec, which excludes the executable types of includes. For example:
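A minimal sketch of such a section, assuming a hypothetical document root of /usr/local/apache/htdocs (substitute the path used on the local server):

```apache
# Hypothetical path; use the directory that holds the library's Web pages.
<Directory /usr/local/apache/htdocs>
    # Permit all SSI commands, including those that execute programs.
    Options Includes
    # To forbid the executable types instead, replace the line above with:
    # Options IncludesNoExec
</Directory>
```

Because a bare Options line replaces any options already set for that directory, existing entries on the line should be kept alongside Includes.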
Some Web administrators may prefer the second type (Options IncludesNoExec) for security or server-load reasons.
In the second step, tell the server which files should be parsed by indicating the file extension. Depending upon the system, this information may be included in the same httpd.conf file or in the srm.conf file. In the file, insert the following line:
AddType text/x-server-parsed-html .shtml
This tells the Web server to treat all files ending in .shtml as if they contained SSIs. Some Web administrators prefer this extension, as it limits the number of files processed by the server. However, renaming hundreds or thousands of files and links with a different extension is not a viable option for many libraries. Unless the Web site is extraordinarily popular, most Web servers should be able to handle the extra load of parsing all files, so the .html extension already in place can simply be kept. This is especially true if the Options IncludesNoExec setting, which disables executable scripts or programs, is in place. The .htm or .html extensions may be added to the file like this:
AddHandler server-parsed .shtml .htm .html
AddType text/x-server-parsed-html .shtml .htm .html
An alternative setup method uses the XBitHack directive to parse HTML files based on file permissions. 2 This method, which also allows you to automatically keep the .html file extension, works by setting the user's execute bit in the file permissions. This means that the file would need to be set to read, write, and execute for the user and read only for others. With this method, security becomes an issue if executable files are housed in the same directory as the files with SSIs.
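As a rough sketch of this alternative, assuming a hypothetical directory path and a page named index.html, the directive is switched on in the configuration file:

```apache
# Hypothetical path; use the directory that holds the library's Web pages.
<Directory /usr/local/apache/htdocs>
    Options Includes
    # Parse any HTML file whose user-execute bit is set.
    XBitHack on
</Directory>
```

The page is then marked for parsing from the UNIX shell with a command such as chmod 744 index.html, which gives the owner read, write, and execute permission and everyone else read-only access.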
Even if the Web administrator does not allow changes to the server settings, individual users may still be allowed to run SSIs in their own directories. To do this, create a file named .htaccess and insert the Options Includes, AddType, and AddHandler statements mentioned in the preceding example.
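A sketch of such an .htaccess file, reusing the statements shown earlier. It takes effect only if the server's AllowOverride setting permits these directives in user directories, which is assumed here:

```apache
# .htaccess placed in the user's own Web directory
Options Includes
AddHandler server-parsed .shtml .htm .html
AddType text/x-server-parsed-html .shtml .htm .html
```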
SSIs have three common uses. First, SSIs may be used to echo variables, meaning to display information that the computer receives about the user. This information could include the user's Web browser type or the date and time the user viewed the page. Second, SSIs may embed information like graphics, text, or HTML from another file or directory. The third use, executing an external script, will be discussed in the security section of this article.
SSI statements look similar to HTML comments. They begin with <!--# and end with -->. There is no space between the # and the command, but there should be at least one space before the closing -->. No hard returns are permitted inside the comment tags. The general syntax for a server-side include is:
<!--#command tag1=value tag2=value -->
The server searches for the <!--# sequence and replaces everything from it through the closing --> with the included information. That information is inserted exactly where the SSI appears, so add whatever extra text, spacing, or HTML formatting is needed around it.
To see the list of all the environment variables available on a specific computer, use the printenv command. For example:
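A minimal sketch of such a page fragment:

```html
<pre>
<!--#printenv -->
</pre>
```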
The <pre> tags preserve the line breaks in the output. The first part of each line, listed in capitals, is the variable; the second part is its value for that particular computer (see figure 4). Lists of SSI commands and their descriptions are available on the NCSA and W3C Web sites. 3
Date and Time
Although a last-updated date is a standard criterion of Web site quality, keeping that date and time current by hand requires constant attention. This SSI automatically places the information on the page:
Last updated: <!--#echo var="LAST_MODIFIED" -->
In addition to LAST_MODIFIED, echo supports a number of other variables, such as DATE_LOCAL. DATE_LOCAL can be useful for Web pages where the user may need to be reminded of the current date and time. Calendars, instruction session schedules, hours pages, or document delivery services that maintain a specific time schedule are all good candidates for the DATE_LOCAL variable. For example:
Library Workshops are held every Friday at 2 P.M.
Today is: <!--#echo var="DATE_LOCAL" -->
If a book is requested from storage before noon, Monday-Thursday, then it will be available for pickup after 3 P.M. that day. Your book was requested at <!--#echo var="DATE_LOCAL" -->
Be careful not to mix up the DATE_LOCAL and LAST_MODIFIED variables. "It is easy to mistakenly (or intentionally) use the DATE_LOCAL SSIs after a statement such as 'Page last updated on,' which then makes it look as if that page is always updated every day," comments Notess. 4
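The display format of both variables can be adjusted with the config command and its timefmt attribute, which accepts the format codes of the C strftime function. A brief sketch, with the format string chosen only for illustration:

```html
<!--#config timefmt="%B %d, %Y" -->
Last updated: <!--#echo var="LAST_MODIFIED" -->
```

With this format string, the date would be displayed in a form such as January 19, 2001.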
Displaying the URL on a Web page offers a nice service for users who print pages for later reference, especially if the pages have long URLs that are likely to be cut off. This requires cobbling together several echo variables to form a full URL, including the domain (HTTP_HOST) and the directory path and filename (DOCUMENT_URI). Notice that http:// is not included in the HTTP_HOST variable and is simply typed into the HTML code.
URL: http://<!--#echo var="HTTP_HOST" --><!--#echo var="DOCUMENT_URI" -->
The user will see:
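For instance, assuming a hypothetical host named library.example.edu and a page stored at /services/hours.shtml, the generated line would read:

```html
URL: http://library.example.edu/services/hours.shtml
```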
In a similar manner, indicate file size for files that users must download, like a browser plug-in needed for a library tutorial or a program to automatically configure a proxy server. Users accessing the library site over a slow modem will especially appreciate this information. For example:
<a href="proxy.exe">Download Proxy Configuration Program</a> (File size: <!--#fsize file="proxy.exe" -->)
will be viewed as:
Download Proxy Configuration Program (File size: 44K)
Headers and Footers
In many libraries, standard page elements like headers and footers are subject to sudden change by outside parties such as the university administration or the board of directors. Making a change such as adding a link to the university home page or changing the color of the navigation bar becomes much simpler and faster with SSIs in place because a change in a single file will affect all other files pointing to it. Easing the management of standard features also encourages more staff to participate in Web development. Most medium and large libraries have gone beyond the "webmaster" model where one person is responsible for every change on the site. 5 In theory, having many people edit Web pages lessens the workload. However, the use of multiple Web authors of varying skill levels invariably means that some standard features will get mangled in the process and will need to be redone. The use of SSIs simplifies the template, making it easier for staff to understand and edit. 6 If standard headers and footers are currently embedded into all library Web pages, a simple search-and-replace script may be used to replace bits of HTML code with the SSIs.
To include a standard footer, create an HTML file with the footer information, which may include other include tags. If the files are in the same directory, only the filename is needed in the tag like this:
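A minimal sketch, assuming the footer is stored in a file named footer.html:

```html
<!--#include file="footer.html" -->
```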
If the files are in different directories, then indicate the subdirectory like this:
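A sketch of two ways to do this, assuming the shared files live in a hypothetical directory named includes. The file attribute cannot point above the current page's directory, so a file elsewhere on the site is addressed by its URL path with the virtual attribute:

```html
<!-- File in a subdirectory of the current page's directory -->
<!--#include file="includes/footer.html" -->

<!-- File addressed by its URL path from the top of the site -->
<!--#include virtual="/includes/footer.html" -->
```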
When pointing to other files, be careful not to end up with a document with multiple HTML, HEAD, or BODY tags. In addition, be certain of the filename and path. If the server cannot find the correct file or path, the user will see the error message "an error occurred while processing this directive" generated as part of the Web page, rather than the more meaningful information in the header or footer.
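The default error text can be replaced with something friendlier by placing the config command with its errmsg attribute before the include. A short sketch, with the wording of the message invented for illustration:

```html
<!--#config errmsg="[This section is temporarily unavailable.]" -->
<!--#include file="footer.html" -->
```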
Beyond headers and footers, SSIs also work with non-HTML files. This makes anyone capable of typing a few lines a potential Web-page contributor. As Notess writes, "By using includes, a simple text file can contain the content, and the people with no HTML experience can be given access to change that text." 7 A staff Web page, for example, may include a single paragraph of information about each staff member. Using SSIs, staff members could edit their individual text files, rather than the HTML document. For example:
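A sketch of a staff page built this way. The staff names and the text files smith.txt and jones.txt are hypothetical and are assumed to sit in the same directory as the page:

```html
<h2>Jane Smith, Reference Librarian</h2>
<p><!--#include file="smith.txt" --></p>

<h2>Robert Jones, Cataloger</h2>
<p><!--#include file="jones.txt" --></p>
```

Each staff member edits only his or her own text file, and the page picks up the change the next time it is requested.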
When including fragments, you may want to use a non-HTML extension, so that the file is not indexed as a separate Web page either internally or externally. HTML tags may still be included, even if the file extension is not .htm or .html.
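The HTTP_REFERER variable can also be echoed inside a link to return users to the page they came from:
<a href="<!--#echo var="HTTP_REFERER" -->">Return to Previous Page</a>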
Multiple Web Authors and WYSIWYG Editing Tools
If multiple staff members edit Web pages containing SSIs, two issues must be considered. First, if staff save files from the Web browser for editing, rather than directly accessing the Web server directories, the saved file will not contain SSIs, only the generated code. If direct access to the Web server directories is not possible, then a workaround like a local FTP mirror site will be required in order to preserve the SSIs.
The second consideration is how popular WYSIWYG editors such as Microsoft FrontPage, Adobe GoLive, and Macromedia Dreamweaver treat SSIs. The good news is that no matter which editor is used, Web authors will be able to edit the SSIs in the HTML code view, assuming the file was copied from the server. However, beginning Web authors, especially those who use only the WYSIWYG editing screen, may require some initial training to avoid several common pitfalls.
First, Web authors must recognize how SSI content is displayed in their WYSIWYG editing screen, so that they do not accidentally alter or delete this content. Newer Web editors like Adobe GoLive 5.0 display placeholder graphics to indicate SSIs in the WYSIWYG editing screen, while older editors like FrontPage 98 do not. Second, Web authors who use the preview function available in most WYSIWYG editors may be dismayed to discover that the SSIs will not translate on their local machine. Previewing a Web page without crucial elements like background color, header, or footer for the first time can be disconcerting, especially to visually oriented designers (see figure 5). Third, Web authors will need a firm grasp on constructing valid SSI statements to include other files, since the WYSIWYG editors do not generally check the validity of these statements. Just as beginning Web authors may include local file information when creating links to Web pages or graphics, they may also create faulty include statements like this:
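A sketch of such a faulty statement, with an invented local path, followed by a form the server can resolve:

```html
<!-- Faulty: points to a file on the author's own hard drive -->
<!--#include file="C:\My Documents\website\footer.html" -->

<!-- Valid: path relative to the page's directory on the server -->
<!--#include file="footer.html" -->
```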
New Web authors may also be tempted to insert non-Web documents into an include statement, such as a Microsoft Word or Adobe Acrobat document. For example, a WYSIWYG editor may permit the statement:
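For illustration, with a hypothetical Word file named handbook.doc:

```html
<!--#include file="handbook.doc" -->
```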
but when the computer attempts to translate this file, meaningless characters like ÜìÁ± will be inserted into the document.
Currently, Dreamweaver 4.0 offers the most support for SSIs with an "Insert Server-Side Include" menu option for local files and SSI rendering for files within the same directory. Figure 6 demonstrates how the information in footer.html is displayed in the Dreamweaver 4.0 editing screen, while header.html is not. Nontranslated SSIs like the environment variables used to display the last modified date and URL are shown by placeholder graphics. For new Web authors, the rendering of footer.html may be misleading, because the included file cannot be edited from that screen, unlike other parts of the page. Instead, the Web author must open the included file separately in order to edit it.
Although Dreamweaver offers the most support for SSIs, other WYSIWYG editors like Adobe GoLive 5.0 and FrontPage 2000 offer a similar feature called components. Like SSIs, components simplify Web maintenance by housing common information in a single file, such as a page header. However, there are two major differences. The first difference is that unlike SSIs, Web pages using components are not generated on-the-fly. Instead, the WYSIWYG editor uses a search-and-replace function to hard code the information onto each Web page. This means that if a common file like a page header is changed, then all files pointing to it must be reloaded on the server. Second, editors such as Adobe GoLive or FrontPage recognize and update these components by inserting a proprietary code in the HTML. This means that if a library decides to change Web editors, then none of the component code will work. SSIs, in contrast, will work no matter which editor a library uses.
Extended SSI (XSSI), available in Apache, allows some advanced dynamic features, such as hiding or displaying links depending upon IP address or browser type, generating random images, or using hit counters. Basically, XSSI works by allowing the use of variables in commands and by allowing conditionals. To set variables, the basic syntax is:
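A minimal sketch, in which the variable name and value are placeholders. The second line echoes the variable back to show that it was set:

```html
<!--#set var="name" value="value" -->
<!--#echo var="name" -->
```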
To set conditionals (if, else, elif, endif), the basic syntax is:
<!--#if expr="first" -->do this first task
<!--#elif expr="second" -->do this other task
<!--#else -->do something else
<!--#endif -->
As the example demonstrates, conditionals make it possible to send users to various pages or display different information depending upon their particular needs. This feature is extremely valuable for libraries, particularly for their distance users. For example, users outside of the university IP domain could first be sent to a proxy server setup page, rather than immediately to the listing of databases that on-campus users would see. Users whose browser is not compatible with the proxy server could automatically be shown an explanatory message, rather than being expected to find the help page themselves (figure 7). Instead of frustrating users with a "forbidden" message, links to restricted sites could be hidden from external users. For example, to hide the link to staff_intranet.html from users whose IP address does not begin with 1.2.3, use:
<!--#if expr="$REMOTE_ADDR = /^1\.2\.3\./" --><a href="staff_intranet.html">Library Staff Intranet</a><!--#endif -->
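The proxy scenario described above can be sketched the same way. The page names databases.html and proxy_setup.html are hypothetical, and the expression follows the syntax of the example above; later Apache releases may expect a different expression syntax by default.

```html
<!--#if expr="$REMOTE_ADDR = /^1\.2\.3\./" -->
<a href="databases.html">Library Databases</a>
<!--#else -->
<a href="proxy_setup.html">Set Up the Proxy Server Before Using the Databases</a>
<!--#endif -->
```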
Security and Other Issues
In addition to the server-load issue mentioned earlier, SSIs also raise some security concerns. The list of potential security issues includes crashing Web servers, killing other users' processes, and sending e-mail. 8 Many of these risks arise when the exec command is used with either its cmd or cgi attribute to launch an external program. For example, to run the script called sample.cgi, use:
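A sketch of both forms, assuming the script is stored in the server's /cgi-bin/ directory (the path is an assumption) and using the UNIX date command purely as an illustration:

```html
<!-- Run a CGI script, addressed by its URL path -->
<!--#exec cgi="/cgi-bin/sample.cgi" -->

<!-- Run a shell command directly on the server -->
<!--#exec cmd="date" -->
```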
On the positive side, using SSIs in combination with a CGI script allows the Web administrator to offer more sophisticated customization than is possible with SSIs alone. However, running scripts opens the server to possible harm, causing many Web administrators to use the Options IncludesNoExec setting. Many articles offer some steps to minimize the security risks, including:
- Disable includes in the CGI-bin directory or other directories with executable files.
- Run the Web server as the user nobody, not root. Root has unlimited access to the system.
- Keep server software up-to-date and use the IncludesNoExec option to disable scripts.
- Use the include virtual command, rather than exec. 9
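As a sketch of the last recommendation, the sample.cgi script from the earlier example could be called with include virtual instead of exec, again assuming it is stored under /cgi-bin/:

```html
<!--#include virtual="/cgi-bin/sample.cgi" -->
```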
The use of SSIs is a simple, relatively quick way to add much-needed manageability to growing library Web sites. It is also an effective way to involve all staff, even those with no HTML experience, in the development of a library Web site. While not as full-featured as a database-driven site, a site using SSIs does allow the library to add limited customization. Most importantly, it also frees the Web administrator from the more routine maintenance, affording more time for all those other Web projects at the library.
References and Notes
2. For more information about the XBitHack method, see: "How to Enable Server Side Includes on Your Web Server." Accessed Jan. 19, 2001, http://bignosebird.com/sdocs/enable.html.
3. NCSA HTTPd Tutorial: Server Side Includes (Sept. 28, 1995). Accessed Aug. 9, 2000, http://hoohoo.ncsa.uiuc.edu/docs/tutorials/includes.html. W3C Server Side Include Commands. Accessed Jan. 19, 2001, www.w3.org/Jigsaw/Doc/User/SSIs.html.
8. Jared Karro and Jie Wang, "Protecting Web Servers from Security Holes in Server-Side Includes," in Proceedings of the 14th Annual Computer Security Applications Conference (Los Alamitos, Calif.: IEEE Computer Society Pr., 1998): 103-11.
9. Art Sackett, SSIs for the Rest of Us. Accessed July 12, 2000, http://artsackett.com/grey_papers/ssi/rest_of_us.html. Chuck Musciano, "Inject Compelling and Up-to-the-Minute Data into Your Web Site," Sun World, Dec. 1996. Accessed Oct. 5, 2000, www.sunworld.com/swol-12-1996/swol-12-webmaster.html. Karro and Wang, "Protecting Web Servers from Security Holes in Server-Side Includes," 109. Notess, "Server Side Includes for Site Management," 80.
Michelle Mach (mmach@manta.colostate.edu) is an Assistant Professor and Web Librarian, Colorado State University Libraries, Fort Collins.