BioXRT Installation

  1. PREREQUISITES

    The BioXRT system runs on top of several software packages. These must be
    installed and configured before you can run BioXRT. Most preconfigured
    Linux systems will already have some of these packages installed.

    A) MySQL 4.0.13 or higher -- <http://www.mysql.com>
        The MySQL database is a fast open source relational database that is
        widely used for web applications.

    B) Apache Web Server -- <http://www.apache.org>
        The Apache web server is the industry standard open source web
        server for Unix and Windows systems.

    C) Perl 5.005 -- <http://www.cpan.org>
        The Perl language is widely used for web applications. Version 5.6
        is preferred, but 5.00503 or higher will work.

    D) Standard Perl modules -- <http://www.cpan.org>
        The following Perl modules must be installed for BioXRT to work.
        They can be found on the Comprehensive Perl Archive Network (CPAN):
                CGI                  (2.56 or higher)
                DBI                  (any version)
                DBD::mysql           (any version)
                Digest::MD5          (any version)

    E) Bioperl version 1.4 or higher -- <http://www.bioperl.org>
        BioXRT uses Bio::DB::GFF::Adaptor::dbi::caching_handle to access the
        MySQL database. Several other small Bioperl modules are used as well.

    F) XML::Parser, XML::Parser::EasyTree -- <http://www.cpan.org>
        These two modules are needed by XView. The configuration file for a
        BioXRT view is written in XML, so XView has to parse XML to read the
        view settings.
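
    Before running the installer, it can help to confirm that the Perl
    modules listed above are loadable. The following is a minimal sketch,
    not part of the BioXRT distribution. It assumes "perl" is on your PATH
    and only tests whether each module loads; it does not check version
    numbers such as the CGI 2.56 minimum.

```shell
#!/bin/sh
# Report whether a Perl module can be loaded by the perl on your PATH.
# Prints "ok <module>" or "MISSING <module>"; never aborts the script.
check_module() {
    if perl -M"$1" -e 1 >/dev/null 2>&1; then
        echo "ok      $1"
    else
        echo "MISSING $1"
    fi
}

# Modules named in the prerequisites above.
for module in CGI DBI DBD::mysql Digest::MD5 \
              Bio::DB::GFF::Adaptor::dbi::caching_handle \
              XML::Parser XML::Parser::EasyTree; do
    check_module "$module"
done
```

    Any module reported as MISSING can be installed from CPAN before you
    continue.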


  2. INSTALLING BioXRT FROM SOURCE

    Brief synopsis:

            perl Install.PL

    The installation program runs step by step, asking you for the
    directories in which to install the different software components. You
    can specify the locations by passing parameters on the command line, by
    typing them in when prompted, or both.

    Details:

    The BioXRT system consists of two CGI scripts named "tbrowse" and "xview",
    Perl modules that handle some of the gory details, a configuration
    directory that contains configuration files for TBrowse and XView, and
    a perl script which loads XRT tables into a MySQL database. By default,
    these will be installed in the following locations:

              CGI scripts:  /usr/local/apache/cgi-bin
             Perl modules:  -standard site-specific Perl library location-
             Config files:  /usr/local/apache/conf/bioxrt.conf
     Command line scripts:  /usr/local/bin

    You can change the location of the installation by passing Install.PL
    one or more NAME=VALUE pairs, like so:

      perl Install.PL CONF=/etc SCRIPT=/home/myhome/bin

    This will cause the configuration files to be installed in
    /etc/bioxrt.conf and the perl scripts to be installed in
    /home/myhome/bin.

    The following arguments are recognized:

      CONF            Configuration file directory
      CGIBIN          CGI script directory
      LIB             Perl site-specific modules directory
      SCRIPT          Directory to put the perl command line scripts
      PREFIX          Base directory for the conf and cgi-bin directories

    For example, if you are on a RedHat system, where the default Apache
    installation uses /var/www/cgi-bin for CGI scripts, and /etc/httpd/conf
    for the configuration files, you should specify the following
    configuration:

      perl Install.PL CONF=/etc/httpd/conf \
                      CGIBIN=/var/www/cgi-bin

    (The backslash is there only to split the command across multiple
    lines.) To make it easier when upgrading to new versions of the software,
    you can put this command into a shell script.
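
    Such a wrapper script might look like the sketch below. The script name
    and the variables CONF_DIR, CGIBIN_DIR and RUN are illustrative
    assumptions, not something Install.PL knows about. By default the script
    only prints the command, so you can review it before running it.

```shell
#!/bin/sh
# reinstall_bioxrt.sh (hypothetical name): keeps the install paths in one place.
# Defaults are the RedHat locations from the example above.
CONF_DIR=${CONF_DIR:-/etc/httpd/conf}
CGIBIN_DIR=${CGIBIN_DIR:-/var/www/cgi-bin}

# Print the install command; nothing is executed here.
install_command() {
    echo "perl Install.PL CONF=$CONF_DIR CGIBIN=$CGIBIN_DIR"
}

# Dry run by default. Set RUN=1 to execute the command; in that case the
# script must be started from the BioXRT source directory.
if [ "${RUN:-0}" = "1" ]; then
    perl Install.PL "CONF=$CONF_DIR" "CGIBIN=$CGIBIN_DIR"
else
    install_command
fi
```

    Keeping the paths in variables means an upgrade only requires running
    the same script again from the new source directory.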

    As a convenience, you can use the configuration option PREFIX, in which
    case the configuration directory and CGI files will be placed into
    PREFIX/conf, and PREFIX/cgi-bin respectively, where PREFIX is the
    location you specified:

      perl Install.PL PREFIX=/home/www

    During installation, each specified directory is checked. Installation
    stops if any directory does not exist or if you do not have write
    permission to it. Before installing BioXRT, make sure the directories
    exist and that you have the appropriate privileges.
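
    You can run the same kind of check yourself before starting the
    installer. This is a minimal sketch, not the check that Install.PL
    performs internally. The directories in the loop are the default
    locations listed earlier and should be replaced with your own.

```shell
#!/bin/sh
# Return 0 if the directory exists and is writable; otherwise say why not
# and return 1.
check_target_dir() {
    if [ ! -d "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -w "$1" ]; then
        echo "not writable: $1"
        return 1
    fi
    echo "ok: $1"
}

# Default locations listed earlier; adjust to match your own arguments.
for dir in /usr/local/apache/cgi-bin /usr/local/apache/conf /usr/local/bin; do
    check_target_dir "$dir" || true
done
```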

    Note that the configuration files are always placed in a subdirectory
    named bioxrt.conf. You cannot change this. The install script detects
    whether the selected directory already contains configuration files and,
    if so, does not overwrite them. However, other files (CGI scripts,
    modules, etc.) are NOT checked before being overwritten, so if you have
    modified any of them, copy your modified versions somewhere safe before
    reinstalling.
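
    One way to keep modified files safe is to copy them aside with a
    timestamp before each reinstall. The helper below is a sketch; the
    function name backup_file and the backup directory are assumptions
    chosen for illustration.

```shell
#!/bin/sh
# Copy a file into a backup directory, adding a timestamp suffix.
# Prints the path of the backup copy on success.
backup_file() {
    src=$1
    backup_dir=$2
    stamp=$(date +%Y%m%d%H%M%S)
    dest="$backup_dir/$(basename "$src").$stamp"
    mkdir -p "$backup_dir" && cp -p "$src" "$dest" && echo "$dest"
}

# Example (paths are assumptions; use your own CGI directory):
#   backup_file /usr/local/apache/cgi-bin/tbrowse "$HOME/bioxrt-backup"
#   backup_file /usr/local/apache/cgi-bin/xview   "$HOME/bioxrt-backup"
```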

    You can always move the files manually after installation. See the
    BioXRT tutorial for details.

    The first time you run Install.PL, a file named BioXRT.def will be created
    to keep your file path settings. When Install.PL is run again, it will
    ask you whether you wish to reuse the settings stored in the file.


  3. LOADING XRT TABLES INTO THE DATABASE (MySQL)

    This step takes you through loading BioXRT data into the database.

    Synopsis:

      mysql -uroot -ppassword -e 'create database locuslink_test'

      mysql -uroot -ppassword -e 'grant all privileges on locuslink_test.* to me@localhost'
      mysql -uroot -ppassword -e 'grant file on *.* to me@localhost'
      mysql -uroot -ppassword -e 'grant select on locuslink_test.* to nobody@localhost'

      bulk_load_xrt.pl -d locuslink_test -u me -pass password sample_data/locuslink/*.xrt

    Details:

    Note for RedHat Linux users: if you are using the default Apache
    installation, Apache runs as the user 'apache' rather than the otherwise
    standard 'nobody'. Therefore, wherever 'nobody' occurs in these
    directions, replace it with 'apache' or whatever the user is on your
    system.
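
    If you are unsure which account your web server runs as, the process
    list usually shows it. The filter below is a sketch that assumes the
    Apache worker processes are named httpd, apache or apache2; other
    process names will need a different pattern.

```shell
#!/bin/sh
# Read "USER COMMAND" lines (as printed by `ps -eo user,comm`) on stdin and
# print the first non-root user running an Apache process.
# The process names httpd, apache and apache2 are assumptions.
apache_user() {
    awk '$1 != "root" && $2 ~ /^(httpd|apache2?)$/ { print $1; exit }'
}

# On a live system:
#   ps -eo user,comm | apache_user
```

    The parent Apache process normally runs as root, which is why root is
    skipped and the first worker's account is reported instead.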

    You will need a MySQL database in order to start using BioXRT. Using
    the mysql command line, create a database (called "locuslink_test" in the
    synopsis above), and ensure that you have update and file privileges on
    it. The example above assumes that you have a username of "me" and that
    you will allow updates from the local machine only. It also gives all
    privileges to "me". You may be comfortable with a more restricted set of
    privileges, but be sure to provide at least SELECT, UPDATE and INSERT
    privileges. You will need to provide the MySQL administrator's user name
    and correct password for these commands to succeed. With the mysql
    client, the password must follow -p with no space in between; if you give
    -p alone, mysql prompts for the password.

    In addition, grant the "nobody" user the SELECT privilege. The web
    server usually runs as nobody, and must be able to make queries on the
    database. Modify this as needed if the web server runs under a different
    account.
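
    Because the database name, the loading account and the web server
    account all vary between sites, it can be convenient to generate the
    GRANT statements from the synopsis instead of editing them by hand. The
    function print_grants below is a hypothetical helper, not part of
    BioXRT. It only prints SQL and sends nothing to MySQL.

```shell
#!/bin/sh
# Print the GRANT statements from the synopsis for a given database,
# loading account, and web server account. Nothing is sent to MySQL.
print_grants() {
    db=$1
    loader=$2
    webuser=$3
    cat <<EOF
grant all privileges on $db.* to $loader@localhost;
grant file on *.* to $loader@localhost;
grant select on $db.* to $webuser@localhost;
EOF
}

# RedHat-style setup, where Apache runs as 'apache':
print_grants locuslink_test me apache
```

    After reviewing the output, you could pipe it into the mysql client
    while logged in as the administrator, for example by appending
    "| mysql -uroot -p" to the last line.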

    The next step is to load the database with data. This is accomplished by
    loading the database from XRT tables (tab-delimited files). The
    distribution comes with a tool for loading Bio::DB::XRT databases,
    bulk_load_xrt.pl. This Perl script can be used in two main situations:
    
        1. When you create a database for the very first time, or when you
           want to wipe all existing data in a database and reload the XRT
           tables from scratch. The script initializes a new Bio::DB::XRT
           database with a fresh schema, deleting anything that was there
           before, and then loads the XRT tables.

        2. When you want to load additional data into a database that already
           contains some XRT data. In this case, use the no_wipe option, and
           the script will load the new data without removing the existing
           data.

           Be very careful when using the no_wipe option: you may have to
           remove overlapping data from the database manually before loading
           new data. Overlapping data is data already in the database that
           overlaps the data you are going to load: it belongs to the same
           class and has the same attributes and IDs. For example, suppose
           you loaded the XRT tables of a Gene class and some other classes,
           and some time later the Gene data is updated and you want to load
           it again. You then have two choices: (i) without no_wipe, load
           the Gene data together with all the other classes, and the script
           removes all existing data first; (ii) with no_wipe, manually
           remove the Gene class data from the database, then run
           bulk_load_xrt.pl with the no_wipe option to load the updated Gene
           class only. We normally choose the latter because it is faster
           and less data has to be loaded. We recommend the no_wipe option
           only to advanced users.

    For testing purposes, this distribution includes XRT tables derived from
    a subset of LocusLink data (in file LL_tmpl). The files can be found in
    the sample_data subdirectory.
    
    If the load is successful, you should see a message indicating that 77026
    entries were successfully loaded.


  4. TRYING OUT TBROWSE AND XVIEW

    Go to the directory that holds the configuration files for TBrowse and
    XView (this directory was specified when the BioXRT system was installed,
    for example: /usr/local/apache/conf/bioxrt.conf). In the files
    01.locuslink.conf and locuslink.xview.xml, change the database user name
    and password to the real ones.

    You should now be able to browse the XRT table. Type the following
    URL into your favorite browser:

      http://name.of.your.host/cgi-bin/tbrowse?source=locuslink

    This will display the TBrowse data query interface, which allows users
    to search for keywords and filter results by column values to obtain data
    of interest. Examples at the bottom of the page show how to use keywords
    and set column filters. Try searching for the keyword ACHE: the gene ACHE
    will appear in a result table. Click its ID to go to XView, which shows
    the detailed information for ACHE.
    
    *IF YOU GET AN ERROR* examine the Apache server error log (depending on
    how Apache was installed, it may be located in /usr/local/apache/logs/,
    /var/log/httpd/, /var/log/apache, or elsewhere). Usually there will be
    an informative error message in the error log. The most common problems
    are an incorrect MySQL password and insufficient MySQL permissions.
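
    The search for the error log can be scripted. The sketch below looks in
    the directories named above and assumes the log file is called either
    error_log or error.log; your installation may use a different name or
    location.

```shell
#!/bin/sh
# Print the first Apache error log found under the given directories.
# The file names error_log and error.log are assumptions.
find_error_log() {
    for dir in "$@"; do
        for name in error_log error.log; do
            if [ -f "$dir/$name" ]; then
                echo "$dir/$name"
                return 0
            fi
        done
    done
    return 1
}

# Candidate directories from the text above.
if log=$(find_error_log /usr/local/apache/logs /var/log/httpd /var/log/apache); then
    echo "Last lines of $log:"
    tail -n 20 "$log" || echo "Cannot read $log (you may need root access)."
else
    echo "No error log found in the usual locations."
fi
```

    Reload the failing page first, so that the relevant message is among the
    last lines of the log.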


    More configuration information and a short tutorial are located at:

       http://projects.tcag.ca/bioxrt/tutorial


    Have fun!

    Junjun Zhang
    junjun@genet.sickkids.on.ca
    June 12, 2005

