
Servers

The electronic files containing the texts, images, and supporting apparatus for the William Blake Archive are distributed across three separate servers: the public site server (www.blakearchive.org), an internal testing site server, and the file server containing our archives, records, and work in progress. All three servers run versions of the Linux operating system. They are hosted and maintained by the Library Systems department at the University of North Carolina.

Imaging

Scanning: We produce digital images by scanning three types of source media: 4" x 5" transparencies, 8" x 10" transparencies, and 35mm slides. The transparencies, which include color bars and gray scales to ensure color fidelity, are verified for color accuracy against the original artifact by the photographer and often by an editor as well. In the past, we scanned transparencies on a Microtek Scanmaker III with a transparent media adapter. We upgraded to a Microtek Scanmaker V in May 1998. This model offered two advances: a separate drawer to hold 4" x 5" film inside the body of the scanner and Microtek's EDIT (Emulsion Direct Imaging Technology) system. From 2002 until early 2005, we scanned transparencies on a Microtek ArtixScan 1100, an advanced version of the Scanmaker V. In 2005, we replaced the ArtixScan 1100 with a Microtek Scanmaker i900. Slides are used only occasionally in the Archive. We scan them using the 35mm slide tray of the Scanmaker i900; in the past we used a Nikon LS-3510AF Slide Scanner and a Microtek 35t Plus Scanner. We use the most recent version of Microtek's ScanWizard software (as of this writing, 7.6). The i900 is attached to a dual 2GHz Power Macintosh G5 running Mac OS X. We began with a baseline standard of 24-bit color and 300 dpi (dots per inch) resolution for all scanned images. Thanks to hardware improvements (more internal storage, more RAM, writeable CDs and DVDs, and better scanners), we now scan at 600 dpi.
The 600 dpi "raw scans" are stored on DVD as LZW-compressed TIFFs, from which we derive 300 dpi TIFF images for color correction. Images smaller than 40 x 30 cm are scaled 1:1 against the source dimensions of the original artifact so that on a monitor with 100 dpi screen resolution they display in the Object View Page at true size and their enlargements display at three times the size of the original. Images larger than 40 x 30 cm are scaled 1:2/3 and display smaller than the original but can be shown at true size using the ImageSizer applet (see below) on the Object View Page; the enlargements of these larger images are displayed at twice the size of the original.

As part of the scanning process for each image, a project assistant completes a form known as an Image Production (IP) record. The IP records contain detailed technical data about the creation of the digital file for each image. These records are retained in hard copy at the project office, and they become part of the Image Information record that is inserted into each image as metadata (see below). To ensure color consistency, we calibrate our scanners and computer monitors on a regular basis; to ensure image clarity, we use compressed air to blow dust, lint, and hair from the scanner bed and the transparency before each scan.

Color Correction: Using Adobe Photoshop, an editor color-corrects the "raw" scanned images against the original transparency or slide, which has itself been color-corrected against the original artifact. Between 1996 and 2001, we used Adobe Photoshop 4.0 and 5.0 in conjunction with hooded Radius PressView 17SR and 21SR monitors, calibrated using ProSense 1.8 software. Between 2002 and 2004, we used LaCie Electron 19 Blue and Electron 22 Blue monitors calibrated with the LaCie Blue Eye Sensor and software. Between 2004 and 2007, we used Adobe Photoshop CS with Apple 20" Cinema and 23" Cinema HD Displays.
As of this writing, we use Adobe Photoshop CS3 and Apple 30" Cinema HD Displays. Since 2004, all displays have been calibrated with a GretagMacbeth Eye-One calibrator. The color correction process is necessary to bring the color channels of the digital image into alignment with the hues and color tones of the original (see A Tour of the William Blake Archive). The process can take up to several hours per image, though it ordinarily takes around thirty minutes. This step is key in establishing the scholarly integrity of the Archive, for it ensures that each image will match the original artifact when displayed under optimal conditions. Although we cannot control the color settings of an individual user's monitor, we specify these optimal viewing conditions, allowing users to adjust their monitor settings appropriately.

File Formats and Archival Storage: All scanned images are saved using the Tagged Image File Format (TIFF) and archived as such on removable storage media. The Archive began on 8mm magnetic Exabyte tape and is now maintained on CD-ROM (in ISO 9660 hybrid format) and DVD. In addition, TIFF images are backed up to networked "dark archive" storage provided by Carolina Digital Library and Archives. These archived raw images would provide the source for newly color-corrected images if necessary. The color-corrected images displayed to users in the online Archive are presented in the JPEG (Joint Photographic Experts Group, ISO/IEC 10918) format. Users are presented with an inline image in the Object View Page at 100 dpi and have the option to view an enlargement at 300 dpi for the study of details (see A Tour of the William Blake Archive). The 100 dpi JPEG is derived from the color-corrected 300 dpi JPEG using ImageMagick, a UNIX software package that enables the batch processing of image files from the command line.

Metadata: Every image in the Archive also contains textual metadata comprising its Image Information record.
The Image Information record combines the technical data recorded in the Image Production record (see above) with additional bibliographic documentation of the image, as well as information pertaining to provenance, present location, and the owning institution. These textual records are, at the most literal level, a part of the Archive's image files. Image files are typically considered to be nothing but information about the images themselves, but in practice, an image file can be the container for several different kinds of information. The William Blake Archive takes advantage of this capability by inserting its Image Information records into the portion of the image file reserved for textual metadata. This integration allows the record to travel with the image, even if the image is downloaded and detached from the Archive. The Image Information record may be viewed using the "Info" button located on the control panel of the Archive's ImageSizer applet (see below) or with the Text Display feature of standard software such as Adobe Photoshop or X-View.

eXtensible Markup Language (XML)

All significant textual data in the Archive (Blake's actual poetry and prose, as well as the editors' bibliographic commentary and illustration descriptions) is encoded using eXtensible Markup Language (XML). XML is not a programming language; it is a descriptive meta-language used to encode (or "tag") textual data in such a way that it will remain usable even as platforms and file formats change over time. To take a very simple example, whereas the word processor used to write this document would represent italics by means of a proprietary binary code, XML would indicate italics with a plain ASCII tag such as: <hi rend="italic">this</hi>. But XML does not have to be merely descriptive; unlike HTML, XML allows us to identify and encode the structure of documents.
A title or heading (to once again take a very simple example) can be tagged and described as such rather than being simply rendered in a large font or in boldface, etc., as HTML would encourage. By explicitly describing textual data according to a recognized W3C standard, XML frees the Archive from reliance on the vicissitudes of proprietary software packages. A set of XML tags designed for a specific purpose is defined in a Document Type Definition (DTD). A DTD provides a hierarchical system of contexts and constraints which enables its tags to be used to create consistent document structures. The Blake Archive makes use of several DTDs developed specifically for the project when it was hosted at IATH. The primary and most expansive of these is known as the Blake Archive Description (BAD). The BAD DTD is used to encode all works at both the object and the collection level; its emphasis is on the description of Blake's works as physical artifacts. The BAD provides the basic document structure used to deliver the Archive's content to users and also serves as the information-base consulted by the Archive's search engines (see below). The Archive's second DTD, the Blake Object Description (BOD), is used to encode the textual metadata that constitutes the Image Information record. The Text Encoding Initiative (TEI) DTD is used for other materials in the Archive, such as its bibliographies, collection lists, and Erdman's Complete Poetry and Prose of William Blake, where description of the physical artifact is not the DTD's central purpose.

The Archive on the Web: Delivery of the William Blake Archive on the Web is supported by three key components. All technologies are open source and standards based. First, the XML-encoded documents are stored and indexed in an eXist database. Second, an assortment of custom XSL transforms produces HTML from our XML. Third, Apache Cocoon plays traffic cop, receiving and dispatching requests between users and applications.
Each of these technologies is described at greater length below.
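
As a rough illustration of the XML-to-HTML step described above, the following Python sketch converts a small structurally tagged fragment into HTML. It is only a stand-in for the idea: the Archive uses XSL transforms, not Python, and the element names in the sample (div, head, l, hi) and the mapping to HTML tags are assumptions chosen for illustration, not taken from the BAD or BOD DTDs.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical fragment: the element names are illustrative only and are
# not drawn from the Archive's own DTDs.
SAMPLE = (
    '<div>'
    '<head>A sample heading</head>'
    '<l>A line with <hi rend="italic">this</hi> word emphasized.</l>'
    '</div>'
)


def to_html(elem):
    """Recursively render a structurally tagged element as an HTML string."""
    # Text that appears before the first child element.
    inner = escape(elem.text or "")
    for child in elem:
        inner += to_html(child)
        # Text that follows the child but still belongs to this element.
        inner += escape(child.tail or "")

    # The structure (heading, line, emphasis) is decided by the tag,
    # and only at this point is it mapped to a presentation.
    if elem.tag == "head":
        return f"<h2>{inner}</h2>"
    if elem.tag == "hi" and elem.get("rend") == "italic":
        return f"<i>{inner}</i>"
    if elem.tag == "l":
        return f"<p>{inner}</p>"
    if elem.tag == "div":
        return f"<div>{inner}</div>"
    # Unknown elements: keep their text, drop the tag.
    return inner


html_out = to_html(ET.fromstring(SAMPLE))
print(html_out)
```

Because the source marks the heading as a heading rather than as large or bold type, the same XML could be given a different HTML rendering simply by changing the mapping, without touching the encoded text.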
Java

Java is a platform-independent programming language developed by Sun Microsystems in order to facilitate object-oriented programming in conjunction with the HTTP layer of the World Wide Web. Software written in Java can be "run" (activated) directly from ordinary Web pages, without requiring users to have pre-installed any of the software's files on their own personal machine and without regard for the type of computer or operating system used to access the Web page from which the Java software is invoked. The Blake Archive uses two separate Java applets (or applications). Both were developed at IATH, and continue to be maintained at CDLA, in order to support the image-based editing that is fundamental to the project. Both of these applets, Inote and the ImageSizer, should be understood as computational implementations of the editorial practices governing the design of the Archive and its scholarly objectives. Both applets are based on version 1.3 of the Java Development Kit (JDK).

Inote: Inote is an image-annotation tool. It permits us to append textual notes ("annotations") to selected regions (or "details") of a particular image; these annotations are generated directly from the XML-encoded illustration descriptions prepared by the editors. Inote functions most powerfully when used in conjunction with the Archive's image searching capabilities, where it can open an image found by the search engine, zoomed to the quadrant of the image containing the object(s) of the search query, with the relevant textual annotation displayed in a separate window (see A Tour of the William Blake Archive). From there, Inote allows the user to enlarge the image for further study and/or to access additional annotations located in other regions of the image. Inote may also be invoked directly from any of the Archive's Object View Pages, allowing users to "browse" the annotations created for a given image (see A Tour of the William Blake Archive).
In addition, users can download and install their own executable copies of Inote on their personal computers (using a version of the software programmed in the Java Runtime Environment); upon doing so, they may attach annotations of their own making to locally saved copies of an image, for use in either teaching or research. The most recent release of Inote is version 6.0.

ImageSizer: The ImageSizer is a sophisticated image manipulation tool (see A Tour of the William Blake Archive). Its principal function for the Archive is to allow users to view Blake's work on their computer screens at its actual physical dimensions. Users may invoke the ImageSizer's calibration applet to set a "cookie" informing the ImageSizer of their own unique screen resolution. Based on this data (recorded in the cookie), all subsequently viewed images will be resized on the fly so as to appear at their true size on the user's screen. If a user returns to the Archive at some later date from the same machine, the data stored by the cookie will remain intact, and there will be no need to recalibrate. Users may also set the ImageSizer's calibration applet to deliver images sized at consistent proportions other than true size, for example, at twice normal size (for the study of details). In addition, the ImageSizer allows users to enlarge or reduce the image within its on-screen display area, and to view the textual metadata comprising the Image Information record embedded in each digital image file (see above).

Other

Work in Progress Site: The Archive maintains a password-protected Work in Progress (WIP) site for the exclusive use of the editors and the project staff. The WIP site provides gateways to private testing ports on our servers, which allow us to proof works in the Archive's online environment without their being publicly accessible to users before they've reached their finished state.
The WIP site also houses a variety of "tracking sheets," enabling the editors and project staff to accurately monitor the different stages of preparation for the many hundreds of text and image files in the Archive, as well as a Reference area for other materials (such as an "X-Files" of unsolved problems, agendas for project meetings, copies of grant and development materials, a complete archive of postings from the blake-proj list, etc.; see below).

E-mail: The Blake Archive currently operates three electronic mailing lists. The oldest of these, an internal communications list known as "blake-proj," has existed since the project's inception and serves as the focal point for discussions among the editors, the project staff, and the technical staff at CDLA. All traffic on blake-proj is archived and available to the project editors and staff. A second internal list, "blake-board," was created in the fall of 1998 to facilitate communication between the editors and the Archive's Advisory Board. Finally, a public list, also created in 1998 and known as "blake-update," is used for distribution of periodic updates and announcements to our users. Any visitor to the Archive may subscribe to blake-update via a form on our front-end Web pages. All lists are maintained by the University of North Carolina at Chapel Hill using the Lyris Listmanager software.

Access Tracking: Access to the Archive's servers is tracked by the Apache Web server. The log files provide daily records that allow us to observe the frequency with which texts and images are requested by users, as well as the type of browser and platform being used to access the Archive, the IP address of the users, and their domain name (allowing us to compare, say, access from educational sites with access from commercial sites).

Backup and Record Keeping: All of the Archive's data is safeguarded via daily, incremental backups. Weekly backups are stored off-site.
In the event of a catastrophic disk failure or a server break-in, the Archive's data could be quickly restored from the backup system. The Archive's project office retains hard copies of all Image Production records, as well as ledgers tracking electronic file transfers, consignment of TIFF images to CD-ROM, and shipping of transparencies and slides.

[See also "Managing the Blake Archive." Romantic Circles (March 1998).]
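
The comparison of educational and commercial access mentioned under Access Tracking can be sketched roughly as follows. This is a minimal illustration, not the Archive's actual log analysis: it assumes the logs are written in Apache's Combined Log Format with client hostnames already resolved, and the sample hosts, paths, and the `site_type` and `tally` helpers are all hypothetical.

```python
import re
from collections import Counter

# Apache Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def site_type(host):
    """Classify a client hostname by its top-level domain (assumed scheme)."""
    if host.endswith(".edu"):
        return "educational"
    if host.endswith(".com"):
        return "commercial"
    return "other"  # unresolved IP addresses and all other domains


def tally(lines):
    """Count requests per site type, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue
        counts[site_type(match.group("host"))] += 1
    return counts


# Hypothetical sample entries, invented for illustration.
SAMPLE_LINES = [
    'reader.example.edu - - [10/Oct/2007:13:55:36 -0400] '
    '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    'host.example.com - - [10/Oct/2007:13:56:01 -0400] '
    '"GET /images/sample.jpg HTTP/1.1" 200 51200 "-" "Mozilla/5.0"',
    'lab.example.edu - - [10/Oct/2007:13:57:12 -0400] '
    '"GET /index.html HTTP/1.1" 304 - "-" "Mozilla/4.0"',
    '192.0.2.7 - - [10/Oct/2007:13:58:40 -0400] '
    '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    'not a log line',
]

counts = tally(SAMPLE_LINES)
for kind in ("educational", "commercial", "other"):
    print(kind, counts[kind])
```

Clients whose addresses were never resolved to a hostname fall into the "other" bucket here, so a real comparison would depend on how the server is configured to record hosts.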