Delila Program: alist

alist program

Documentation for the alist program is below, with links to related programs in the "see also" section.

{   version = 6.64; (* of alist.p 2017 Jan 02}

(* begin module describe.alist *)
(*
name
      alist: aligned listing of a book

synopsis
      alist(inst: in, book: in, alistp: inout, colors: in,
            namebook: in, namelist: in, avalues: in,
            list: out, clist: out, output: out)

files
      inst: delila instructions of the form 'get from 56 -5 to 56 +10;'
         If this file is empty, then the sequences will be
         aligned either by their 5' ends or by their zero base,
         depending on the 4th parameter in alistp.

      book: the book generated by delila using inst

      alistp: parameters to control the program.  If empty, the range of the
         instructions is used.  Otherwise,

         0. The version of alist that this parameter file is designed for.
            If the program finds an old version, it will *upgrade* the
            alistp file.

         1. The first line contains two integers defining the
         range of bases to display.  This allows one to have a wide alignment,
         but look only at a portion.

         2. If the first character of the second line is:  'p' the piece name
            and coordinates are given in the list.  If it is ' ' then neither
            is given.  If one has only one piece that one is working with,
            one may not want the piece name but will want the coordinate.
            In this case use 'c'.

            If the second character is 'l', then the
            long name of the piece is given in the list preceding the piece
            name.  Note that this long name can be written into the book from
            the instructions by the Delila "name" instruction.  (See <NAME> in
            the libdef.)  Blank names (i.e., 'name "";') are accepted.

            If the third character is not '-', then the sequences are
            numbered.  If it is '-' they are not.
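
            The three display characters can be read as in the following
            Python sketch.  It is not taken from alist.p: the record type,
            its field names, the blank padding of short lines, and the
            handling of a first character other than 'p', 'c' or ' '
            (treated like ' ') are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record type; the field names are not taken from alist.p.
DisplayControl = namedtuple(
    "DisplayControl", "piece_name coordinate long_name numbered")


def parse_display_control(line):
    """Interpret the first three characters of the second alistp line."""
    # Assumption: a line shorter than three characters is padded with blanks.
    first, second, third = line.ljust(3)[:3]
    return DisplayControl(
        piece_name=(first == "p"),
        coordinate=(first in ("p", "c")),
        long_name=(second == "l"),
        numbered=(third != "-"),
    )


# The display line of the example alistp on this page begins with "pl".
print(parse_display_control("pl          display control"))
```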

         3. If the first character of the third line is 'n' then paging is not
         done to the list.

         4. If the first character of the fourth line is
            'f' (for 'first') then the sequences are always aligned by their
            first base.

            'i' then the sequences are aligned by the delila instructions.  If
            the inst file is empty, alignment is forced to the 'b' mode.

            'b' (for 'book') then the alignment is on the internal zero of
            the book's sequence.  This option is to be used when "default
            coordinate zero" is used in the Delila instructions.

            The following table should clarify the cases and their uses:

      state |instructions empty |              instructions exist
      ------|-------------------|-------------------------------------------
            |                   |  instruction alignment |  book alignment
            |                   |  (def coo nor)         |  (def coo zer)
            |-------------------|------------------------|------------------
        'f' |    first base     |  first base            |  first base
            |   (first base)    | (first base)           | (first base)
            |-------------------|------------------------|------------------
        'i' |    book           |  inst                  |  inst (DO NOT USE)
            |    (0)            | (aligning base)        |  (aligning base)
            |-------------------|------------------------|------------------
        'b' |    book           |  book (DO NOT USE)     |  book
            |    (0)            | (0)                    |  (0)

        The first line of each entry defines how the alignment will be
        assigned.  Thus 'f' forces the first base to be used at all times and
        'b' forces the book to be used.  In two cases this does not make sense.
        First, if the instructions were generated with the "default coordinate
        zero", then the Delila instructions do not correspond to the base
        coordinates in the book (by definition) and so the alignment should not
        use the instruction file.  In the second case, the instructions use
        "default coordinate normal" so the zero base in the book does not
        correspond to the zero base in the instructions.  The basic problem
        here is that there is no way for the program to know which situation
        occurs, without spending time reading the Delila instructions.  So the
        user must specify.  (This may be automated in the future.)

        The second line of each entry is the coordinate number which appears on
        the left column of the aligned listing.
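
        The table can be restated as a small decision function.  The
        Python sketch below is only an illustration of the table, not
        code from alist.p.  The function names and the 'normal'/'zero'
        strings are hypothetical, and since alist does not read the
        "default coordinate" setting, the second function is a check the
        user would have to apply by hand.

```python
def effective_alignment(mode, inst_is_empty):
    """Return 'first', 'inst' or 'book' for an alistp alignment character."""
    if mode == "f":
        return "first"
    if mode == "i":
        # With an empty inst file, 'i' is forced to book alignment.
        return "book" if inst_is_empty else "inst"
    if mode == "b":
        return "book"
    raise ValueError("unknown alignment mode: %r" % mode)


def is_discouraged(mode, default_coordinate):
    """True for the two DO NOT USE cells of the table.

    default_coordinate is 'normal' or 'zero', as set in the Delila
    instructions that produced the book.
    """
    return (mode, default_coordinate) in (("i", "zero"), ("b", "normal"))


print(effective_alignment("i", inst_is_empty=True))
print(is_discouraged("b", "normal"))
```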

        5. Column number to read from avalues file (integer),
           followed by the field width and number of decimal places to write
           the values to the list and clist.
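
        As an illustration of how the three numbers act together, the
        Python sketch below reads them from the parameter line and formats
        a value.  The function names are hypothetical, and the assumption
        that the output behaves like Pascal's write(x:width:decimals),
        i.e. right-justified fixed-point, has not been checked against
        alist.p.

```python
def parse_avalues_control(line):
    """Read column, field width and decimal places from the alistp line."""
    # Anything after the third token is treated as a comment.
    column, width, decimals = (int(token) for token in line.split()[:3])
    return column, width, decimals


def format_value(value, width, decimals):
    """Right-justify value in width characters with the given decimals."""
    return "%*.*f" % (width, decimals, value)


column, width, decimals = parse_avalues_control(
    "6  4  1     avalues: column, output width, output decimals")
print(repr(format_value(3.14159, width, decimals)))
```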

        6. edgecontrol edgeleft, edgeright, edgelow, edgehigh:
             edgecontrol is a single character that controls how the bounding
             box of the figure is handled.  If it is 'p' then the bounding
             box will be the page parameters defined in constants inside the
             program (llx, lly, urx, ury).  Otherwise, there are four real
             numbers that define the edges around the clist in cm.  To allow
             a clist to be embedded into another figure, its size must be
             defined in PostScript (with %%BoundingBox).  By setting these
             four numbers, the edges are defined.

        7. map control: A series of values:
         * mapcontrol: If the first character on the line is a 'C', then the
           color map file will be written.  If it is 'R' then the page will
           be set up so that the upper left corner is moved to the lower left
           corner and the image is rotated 90 degrees counter clockwise.
           This has the effect of making the image in "landscape" mode.

         * fontsize (integer): The character height in points (there
           are 72 points/inch, 2.54 cm/inch).  Typical value for alist: 15.
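
         Since the font size is in points while the edge and shift
         parameters are in cm, a conversion is often handy.  The Python
         helpers below use only the two factors quoted above; the function
         names are hypothetical and are not part of alist.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def points_to_cm(points):
    """Convert a length in PostScript points to centimeters."""
    return points / POINTS_PER_INCH * CM_PER_INCH


def cm_to_points(cm):
    """Convert a length in centimeters to PostScript points."""
    return cm / CM_PER_INCH * POINTS_PER_INCH


# The typical alist font size of 15 points is a little over half a cm.
print(round(points_to_cm(15), 3))
```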

        8. deltaXcm deltaYcm scaleimage: image positioning controls

         * deltaXcm: The amount to move the image in X (cm).
         * deltaYcm: The amount to move the image in Y (cm).
         * scaleimage: the scaling factor.

         The image will be shifted on the printed page.  X is positive to the
         right and Y is positive up the page.  Generally one would use
         positive values for X and negative values of Y since the image
         should otherwise fit snugly in the upper left corner of the page.

         The scaling is performed after movement from the lower left hand
         corner of the image as one would read it.  If the image has been put
         in "landscape" mode the delta-shifts are given in the new coordinate
         system.  This allows one to switch between "landscape" and regular
         "portrait" mode without changing the parameters, and it allows one
         to think in terms of a normally held page.

        9. headercontrol: the first character on the line determines
         whether the header description is written to list and clist.
         If the character is 'h' it is written, otherwise not.
         Headers can also be removed from the clist by deleting lines
         containing the word "NOHEADER".  In Unix this is done by:

            grep -v NOHEADER clist > clist.noheader

         With 'h' the numbar (bar of vertically written numbers) is included
         above the sequence, but if the character is '0' (zero) the numbar is
         not written.  Omitting the numbar makes it easy to extract column
         data from the list file; otherwise it is not recommended.

      namebook: names of genes or transcripts from this book appear in
         the list.  If namebook is empty, then only the items specified in
         alistp are given.

      namelist: if this file is not empty, then it should contain a simple list
         of names to give to each sequence listed.  These are placed to the
         left of the alist and may contain anything one wants.  The number of
         columns used is determined by the longest line in the file.
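
         The column-width rule can be pictured with this Python sketch.
         It is an illustration only: the function name is hypothetical,
         and left-justifying the shorter names with blanks is an
         assumption about how alist fills the column.

```python
def pad_names(lines):
    """Left-justify each name to the width of the longest line."""
    names = [line.rstrip("\n") for line in lines]
    width = max((len(name) for name in names), default=0)
    return [name.ljust(width) for name in names]


for name in pad_names(["lacZ\n", "araBAD\n", "trp\n"]):
    print(name + "|")
```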

      avalues: Aligned list values.  A file containing values to list for
         each of the sequences.  If the file is not empty, the values appear
         to the right of the sequences.  The first line of the file is
         expected to begin with "* " followed by the title of the values.
         All other lines that begin with "*" are ignored.  The program uses
         the data column of avalues as defined in the alistp parameter file.
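
         A reader for this layout might look like the Python sketch below.
         It is not the routine in alist.p.  The function name is
         hypothetical, and three details are assumptions: columns are
         separated by white space, the column number counts from 1, and
         lines with too few columns are skipped.

```python
def read_avalues(lines, column):
    """Return (title, values) from the lines of an avalues file."""
    title = None
    values = []
    for number, line in enumerate(lines):
        if line.startswith("*"):
            # Only the first line supplies the title; other "*" lines
            # are ignored.
            if number == 0 and line.startswith("* "):
                title = line[2:].strip()
            continue
        fields = line.split()
        if len(fields) >= column:
            values.append(float(fields[column - 1]))
    return title, values


example = [
    "* Ri (bits)\n",
    "* an ignored comment line\n",
    "lac 1 43 + 0 10.5\n",
    "lac 2 35 - 0 7.25\n",
]
print(read_avalues(example, 6))
```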

      list: the aligned listing

      clist: the aligned listing, in PostScript color.  Paging is ALWAYS done
         to this file, using the page parameter.  However, it can be removed
         by deleting all lines with the word "REMOVE" on them.  This is
         easily done in Unix with:
            grep -v REMOVE clist
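
         Where grep is not available, the same filtering can be done with
         a few lines of Python.  This is a sketch: the function name is
         hypothetical and the marker is matched as a plain substring.
         The same function serves for the "NOHEADER" lines described
         under headercontrol.

```python
def drop_marked_lines(lines, marker):
    """Keep only the lines that do not contain the marker word."""
    return [line for line in lines if marker not in line]


page = [
    "first line of a page\n",
    "showpage % REMOVE\n",
    "second line of a page\n",
]
print("".join(drop_marked_lines(page, "REMOVE")), end="")
```

         To filter a real clist, read the lines with open("clist") and
         write the result to a new file of your choosing.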

      colors: colors defining the bases, see makelogo for definition.

      output: messages to the user

description

      Alist creates an aligned listing of a set of sequences.  The pieces in
      the book are aligned according to the instructions in file inst, and
      listed in the list file.  Each piece is identified, and a bar of numbers
      (called a 'numbar') that are read vertically defines the locations of
      bases around the aligning point.
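
      The numbar layout can be sketched in Python as follows.  This is
      not the alist.p routine; the function name is hypothetical, and the
      rules used (each position written top to bottom, right-justified,
      with a sign on every position except zero) are inferred from the
      example list file shown on this page.

```python
def numbar(first, last):
    """Return the rows of a numbar for positions first..last inclusive."""
    labels = []
    for position in range(first, last + 1):
        # Zero carries no sign; every other position shows + or -.
        labels.append("0" if position == 0 else "%+d" % position)
    height = max(len(label) for label in labels)
    columns = [label.rjust(height) for label in labels]
    # Transpose: row r holds character r of every column.
    return ["".join(col[row] for col in columns) for row in range(height)]


for row in numbar(-30, 20):
    print(row)
```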

example

      To generate an example input set using namebook, start with a set of
      instructions that name genes and get them (as 'get from gene beginning -0
      to gene beginning +2;').  Produce namebook.  Check for genes that are
      reversed relative to the piece (use hist and alist without instructions),
      and correct the delila instructions.  To convert these instructions to
      absolute form, use program search with 'd f -54321 t +12345 q atg gtg
      ttg' on namebook.  Now convert -54321 and +12345 to the range of interest
      (beware of absolute locations with the same numbers).  Finally, generate
      the book using delila.  (Someday this process will be simpler.)

      Here are some search instructions (file: sea):

 * instructions for the input file of the search program
 d
 q
 #gtt
 ~ =
 q

(The blanks at the beginning of each line protect from the compiler detecting
the # on the first line, and should be removed to try this example.)

When these are given to search along with the book 'ex0bk':

      cp ex0bk book
      search < sea

The inst file is:

title "95/01/24 21:12:11 search 6.05";
(@ * 86/12/12 13:06:31, 84/05/05 21:12:50, ex0: example
@)
default numbering piece;
default numbering 1;
default out-of-range reduce-range;
default coordinate zero;

(@   typed pattern: "#gtt" @)
organism ecoli; chromosome ecoli;
piece lac;
get from 43 -100 to 43 +100 direction +;
(@ the complementary search string is now: "aa#c" @)

(@   typed pattern: "aa#c" @)
piece lac;
get from 35 +100 to 35 -100 direction -;
get from 42 +100 to 42 -100 direction -;

      (The '*' of comments was converted to '@' so that this page could be a
      comment in the alist source code.) Note the "default coordinate zero"
      which was inserted by hand.  When these instructions are run through
      delila (using ex0bk as the library) and then given to alist with
      parameters alistp:

-10 10      From and To
pl          display control p: piece&coordinate of zero base; l: long name
n           no paging
i           f: first base, i: inst, b: book alignment
6  4  1     avalues: column, output width, output decimals
alistp: parameters for alist 5.54 and higher

An example list file is:

 alist 6.05, aligned listing of book:
 * 95/01/24 22:43:43, 95/01/24 19:36:00,
   95/01/24 21:12:11 search 6.05
 piece names from:
 * 95/01/24 22:43:43, 95/01/24 19:36:00,
   95/01/24 21:12:11 search 6.05
 The book is from:         0 to 20
 This alignment is from: -30 to 20

           ---------------------                   +++++++++++
           322222222221111111111--------- +++++++++11111111112
           098765432109876543210987654321012345678901234567890
           ...................................................
lac   0  1                 gtgaaaccagtaacgttatac
lac   0  2                 gtataacgttactggtttcac
lac   0  3                        gtataacgttactggtttcac

documentation
      delman.use.aligned.books

see also
      program that produces the book:  delila.p
      search program to help locate sites: search.p

      example inst: spliceA.in
      example book: spliceA.bk
      example aligned listing parameter file: alistp
      example colors file: colors

      To learn about page printer boundaries, go to
      https://alum.mit.edu/www/toms/postscript.html#tricks

author
      Thomas D. Schneider

bugs

      If you use relative instructions, then alist will bomb.
      I.e., do not use instructions of the form:
          get from gene beginning - 5 to gene beginning +5;
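
      One way to guard against this is to screen the inst file before
      running alist.  The Python sketch below is a rough check, not a
      Delila parser: the function name and both patterns are hypothetical,
      and they cover only the 'get' forms shown on this page (an
      absolute instruction has an integer right after 'from' and 'to').

```python
import re

# First word after "from" and first word after the following "to".
GET_PATTERN = re.compile(r"\bget\s+from\s+(\S+).*?\bto\s+(\S+)")
INTEGER = re.compile(r"[+-]?\d+$")


def is_absolute_get(instruction):
    """True if a 'get' instruction names absolute coordinates.

    Lines that are not 'get' instructions return False.
    """
    match = GET_PATTERN.search(instruction)
    if match is None:
        return False
    return all(INTEGER.match(token) for token in match.groups())


print(is_absolute_get("get from 43 -100 to 43 +100 direction +;"))
print(is_absolute_get("get from gene beginning - 5 to gene beginning +5;"))
```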

      There is also an unsolved bug in alist:
      When the pieces and instructions are not 'just right', alist will
      produce listings that are thousands of characters wide...  The reason
      for this is not completely clear, but it is related to attempting
      to extend the from-to range of an aligned book, and perhaps to incorrect
      responses of delila when attempting to 'reduce' a piece beginning or
      ending that is off the end of a fragment of a circular piece.  The code
      now contains traps that halt the program when wide listings would have
      been generated.  This bug may have been solved.

      Alist cannot align a sequence if the alignment point is outside the
      sequence.

      Note:  it is possible to use the 'i' mode when "default coordinate zero"
      has been set, but this can lead to confusing output.  There is no simple
      mechanism to prevent this in DelilaI.

      [1995 Dec 7]  The namebook mechanism is currently broken for the clist.

technical notes

   The variable 'nametype' defines the kind of name picked up in namebook.

   The constant 'pagelength' defines the length of the page in the list.

   The constant 'topofpage' defines the top of the page in cm in the clist.

   There are 4 constants that tell the program the printer page boundaries:

   The following bounding box is for the Canon Color Laser Copier 1150.
   defaultllx =   7.10999; default for llx, lower left x
   defaultlly =   7.01995; default for lly, lower left y
   defaulturx = 588.15;    default for urx, upper right x
   defaultury = 784.98;    default for ury, upper right y

   These should be set for your printer.  To see how this is
   done, go to the link given in the "see also" section.
   Alternatively, you can use the edgecontrol parameter.

   As of version 5.96, alist can sense that a parameter file (alistp) is out
   of date and it will automatically upgrade the file.  For this reason the
   parameter file is now listed as 'inout', meaning that it can be modified
   by this program.

*)
(* end module describe.alist *)
{This manual page was created by makman 1.45}


{created by htmlink 1.62}