Delila Program: hc

hc program

Documentation for the hc program is below, with links to related programs in the "see also" section.

{   version = 1.05; (* of hc.p 2017 Jul 12}

(* begin module describe.hc *)
(*
name
      hc: H-curve and H uncertainty

synopsis
      hc(inst: in, book: in, hcp: inout, colors: in,
            namebook: in, namelist: in, avalues: in,
            list: out, clist: out, xygraph: out, output: out)

files
      inst: delila instructions of the form 'get from 56 -5 to 56 +10;'
         If this file is empty, then the sequences will be
         aligned either by their 5' ends or by their zero base,
         depending on the 4th parameter in hcp.

      book: the book generated by delila using inst

      hcp: parameters to control the program.  If empty, the range of the
         instructions is used.  Otherwise,

         0. The version of hc that this parameter file is designed for.
            If the program finds an old version, it will *upgrade* the
            hcp file.

         1. The first line contains two integers defining the
         range of bases to display.  This allows one to have a wide alignment,
         but look only at a portion.

         2. If the first character of the second line is:  'p' the piece name
            and coordinates are given in the list.  If it is ' ' then neither
            is given.  If one has only one piece that one is working with,
            one may not want the piece name but will want the coordinate.
            In this case use 'c'.

            If the second character is 'l', then the
            long name of the piece is given in the list preceding the piece
            name.  Note that this long name can be written into the book from
            the instructions by the Delila "name" instruction.  (See <NAME> in
            the libdef.)  Blank names (ie, 'name "";') are accepted.

            If the third character is not '-', then the sequences are
            numbered.  If it is '-' they are not.

         3. If the first character of the third line is 'n' then paging is not
         done to the list.

         4. If the first character of the fourth line is
            'f' (for 'first') then the sequences are always aligned by their
            first base.

            'i' then the sequences are aligned by the delila instructions.  If
            the inst file is empty, alignment is forced to the 'b' mode.

            'b' (for 'book') then the alignment is on the internal zero of
            the book's sequence.  This option is to be used when "default
            coordinate zero" is used in the Delila instructions.

            The following table should clarify the cases and their uses:

      state |instructions empty |              instructions exist
      ------|-------------------|-------------------------------------------
            |                   |  instruction alignment |  book alignment
            |                   |  (def coo nor)         |  (def coo zer)
            |-------------------|------------------------|------------------
        'f' |    first base     |  first base            |  first base
            |   (first base)    | (first base)           | (first base)
            |-------------------|------------------------|------------------
        'i' |    book           |  inst                  |  inst (DO NOT USE)
            |    (0)            | (aligning base)        |  (aligning base)
            |-------------------|------------------------|------------------
        'b' |    book           |  book (DO NOT USE)     |  book
            |    (0)            | (0)                    |  (0)

        The first line of each entry defines how the alignment will be
        assigned.  Thus 'f' forces the first base to be used at all times and
        'b' forces the book to be used.  In two cases this does not make sense.
        First, if the instructions were generated with the "default coordinate
        zero", then the Delila instructions do not correspond to the base
        coordinates in the book (by definition) and so the alignment should not
        use the instruction file.  In the second case, the instructions use
        "default coordinate normal" so the zero base in the book does not
        correspond to the zero base in the instructions.  The basic problem
        here is that there is no way for the program to know which situation
        occurs, without spending time reading the Delila instructions.  So the
        user must specify.  (This may be automated in the future.)

        The second line of each entry is the coordinate number which appears on
        the left column of the aligned listing.

        5. Column number to read from avalues file (integer),
           followed by the field width and number of decimal places to write
           the values to the list and clist.

        6. edgecontrol edgeleft, edgeright, edgelow, edgehigh:
             edgecontrol is a single character that controls how the bounding
             box of the figure is handled.  If it is 'p' then the bounding
             box will be the page parameters defined in constants inside the
             program (llx, lly, urx, ury).  Otherwise, there are four real
             numbers that define the edges around the clist in cm.  To allow
             a clist to be embedded into another figure, its size must be
             defined in PostScript (with %%BoundingBox).  By setting these
             four numbers, the edges are defined.

        7. map control: A series of values:
         * mapcontrol: If the first character on the line is a 'C', then the
           color map file will be written.  If it is 'R' then the page will
           be set up so that the upper left corner is moved to the lower left
           corner and the image is rotated 90 degrees counter clockwise.
           This has the effect of making the image in "landscape" mode.

         * fontsize (integer): The character height in points (there
           are 72 points/inch, 2.54 cm/inch).  Typical value for hc: 15.

        8. deltaXcm deltaYcm scaleimage: image positioning controls

         * deltaXcm: The amount to move the image in X (cm).
         * deltaYcm: The amount to move the image in Y (cm).
         * scaleimage: the scaling factor.

         The image will be shifted on the printed page.  X is positive to the
         right and Y is positive up the page.  Generally one would use
         positive values for X and negative values of Y since the image
         should otherwise fit snugly in the upper left corner of the page.

         The scaling is performed after movement from the lower left hand
         corner of the image as one would read it.  If the image has been put
         in "landscape" mode the delta-shifts are given in the new coordinate
         system.  This allows one to switch between "landscape" and regular
         "portrait" mode without changing the parameters, and it allows one
         to think in terms of a normally held page.

        9. headercontrol: the first character on the line determines
         whether the header description is written to list and clist.
         If the character is 'h' it is written, otherwise not.
         Headers can also be removed from the clist by deleting lines
         containing the word "NOHEADER".  In Unix this is done by:

            grep -v NOHEADER clist > clist.noheader

         With 'h' the numbar (bar of vertically written numbers) is included
         above the sequence, but if the character is '0' (zero) the numbar is
         not written.  This allows one to use the list file to extract column
         data easily; otherwise it is not recommended.
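
      The alignment table for parameter 4 above can be read as a small
      decision function.  The Python sketch below only restates that table;
      it is not code from hc.p, and the name resolve_alignment and the
      strings it returns are made up for this illustration.

```python
def resolve_alignment(mode, inst_empty):
    """Return (alignment source, left-column coordinate) for an hcp mode.

    mode is the first character of the fourth hcp line ('f', 'i' or 'b').
    inst_empty says whether the inst file is empty.
    The returned strings mirror the two lines of each table entry.
    """
    if mode == "f":
        return ("first base", "first base")
    if mode == "i" and not inst_empty:
        return ("inst", "aligning base")
    if mode in ("i", "b"):
        # 'b' itself, or 'i' forced to 'b' because there are no instructions.
        return ("book", "0")
    raise ValueError("unknown alignment mode: %r" % mode)


if __name__ == "__main__":
    for mode in "fib":
        for inst_empty in (True, False):
            print(mode, inst_empty, resolve_alignment(mode, inst_empty))
```

      The sketch cannot flag the two DO NOT USE cases, for the reason given
      above: whether the instructions used "default coordinate zero" or
      "default coordinate normal" is not known without reading them.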

      namebook: names of genes or transcripts from this book appear in
         the list.  If namebook is empty, then only the items specified in
         hcp are given.

      namelist: if this file is not empty, then it should contain a simple list
         of names to give to each sequence listed.  These are placed to the
         left of the hc and may contain anything one wants.  The number of
         columns used is determined by the longest line in the file.

      avalues: Aligned list values.  A file containing values to list for
         each of the sequences.  If the file is not empty, the values appear
         to the right of the sequences.  The first line of the file is
         expected to begin with "* " followed by the title of the values.
         All other lines that begin with "*" are ignored.  The program uses
         the data column of avalues as defined in the hcp parameter file.
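
      As an illustration of the avalues layout described above, here is a
      minimal Python sketch of a reader.  It assumes whitespace-separated
      columns and 1-based column numbering, neither of which is stated in
      this page; read_avalues and format_value are hypothetical names, not
      routines from hc.p.

```python
def read_avalues(lines, column):
    """Return (title, values) from the lines of an avalues file.

    column is the data column to read, counted from 1 (an assumption).
    """
    title = None
    values = []
    for number, line in enumerate(lines):
        if line.startswith("*"):
            if number == 0:
                # First line: "* " followed by the title of the values.
                title = line[1:].strip()
            continue  # every other "*" line is ignored
        fields = line.split()
        if fields:
            values.append(float(fields[column - 1]))
    return title, values


def format_value(value, width, decimals):
    """Write a value using a field width and a number of decimal places."""
    return "%*.*f" % (width, decimals, value)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = ["* information (bits)", "* a comment", "1 10.5 0.2", "2 8.25 0.1"]
    title, values = read_avalues(sample, 2)
    print(title)
    for value in values:
        print(format_value(value, 8, 2))
```

      The width and decimals arguments correspond to the field width and
      number of decimal places given on line 5 of hcp.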

      list: the aligned listing

      clist: the aligned listing, in PostScript color.  Paging is ALWAYS done
         to this file, using the page parameter.  However, it can be removed
         by deleting all lines with the word "REMOVE" on them.  This is
         easily done in Unix with:
            grep -v REMOVE clist
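
      Where grep is not available, the same filtering can be sketched in
      Python.  The function name drop_marked_lines and the sample lines are
      invented for this example; only the marker words REMOVE and NOHEADER
      come from this page.

```python
def drop_marked_lines(lines, marker):
    """Keep only the lines that do not contain marker (like grep -v)."""
    return [line for line in lines if marker not in line]


if __name__ == "__main__":
    # Made-up stand-ins for clist lines, for illustration only.
    sample = ["line one", "paging line % REMOVE", "line two"]
    for line in drop_marked_lines(sample, "REMOVE"):
        print(line)
```

      Calling the function with "NOHEADER" instead of "REMOVE" strips the
      header lines in the same way.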

      xygraph: H curve xy coordinates for each sequence

      colors: colors defining the bases, see makelogo for definition.

      output: messages to the user

description

      Build H-curve or H-computation (or both) on top of the alist program.

      Hc is like alist in that it creates an aligned listing of a set
      of sequences.  However, the value column is the Shannon uncertainty
      in bits.
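
      For reference, Shannon uncertainty in bits is H = -sum(p * log2(p))
      over the symbol frequencies p.  The Python sketch below applies that
      formula to the base composition of a single sequence.  That hc
      computes H from the composition of each sequence is an assumption
      made here; this page does not say exactly which frequencies are used.

```python
import math
from collections import Counter


def shannon_uncertainty(sequence):
    """Return H = -sum(p * log2(p)) in bits over the symbols of sequence."""
    counts = Counter(sequence.lower())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    for seq in ("acgt", "aacc", "aaaa"):
        print(seq, shannon_uncertainty(seq))
```

      A sequence with equal numbers of all four bases gives the maximum of
      2 bits, and a sequence made of a single base gives 0 bits.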

      In addition, hc produces an xygraph file which gives the H-curve
      coordinates (starting at zero) for each sequence.  These can be
      plotted using denplo.
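
      The idea of cumulative curve coordinates can be sketched as follows,
      assuming each base moves the curve one step along a base-specific
      diagonal in the xy plane.  The direction assigned to each base in
      STEPS is hypothetical and chosen only for illustration; the mapping
      actually used by hc.p should be taken from the program source or from
      the Hamori and Ruskin paper cited below.

```python
# Hypothetical direction table: which diagonal each base steps along is an
# assumption made for this example, not taken from hc.p.
STEPS = dict(a=(-1, 1), c=(1, 1), g=(1, -1), t=(-1, -1))


def h_curve_xy(sequence):
    """Return the cumulative (x, y) points of a sequence, starting at (0, 0)."""
    x, y = 0, 0
    points = [(x, y)]
    for base in sequence.lower():
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in h_curve_xy("acgt"):
        print("%d %d" % point)
```

      Each sequence yields one list of points, the first of which is the
      origin, matching the "starting at zero" convention above.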

documentation

@article{Hamori.Ruskin1983,
author = "E. Hamori
 and J. Ruskin",
title = "{H curves, a novel method of representation of nucleotide
series especially suited for long DNA sequences}",
journal = "J Biol Chem",
volume = "258",
pages = "1318--1327",
pmid = "6822501",
year = "1983"}

see also

      Program that does aligned listings upon which hc is built:  alist.p

      program that produces the book:  delila.p
      search program to help locate sites: search.p

      example inst: spliceA.in
      example book: spliceA.bk
      example aligned listing parameter file: hcp
      example colors file: colors

      To learn about page printer boundaries, go to
      https://alum.mit.edu/www/toms/postscript.html#tricks

author
      Thomas D. Schneider

bugs

      If you use relative instructions, then hc will bomb.
      Ie, do not use instructions of the form:
          get from gene beginning - 5 to gene beginning +5;

      There is also an unsolved bug in hc:
      When the pieces and instructions are not 'just right', hc will
      produce listings that are thousands of characters wide...  The reason
      for this is not completely clear, but it is related to attempting
      to extend the from-to range of an aligned book, and perhaps to incorrect
      responses of delila when attempting to 'reduce' a piece beginning or
      ending that is off the end of a fragment of a circular piece.  The code
      now contains traps that halt the program when wide listings would have
      been generated.  This bug may have been solved.

      Alist cannot align a sequence if the alignment point is outside the
      sequence.

      Note:  it is possible to use the 'i' mode when "default coordinate zero"
      has been set, but this can lead to confusing output.  There is no simple
      mechanism to prevent this in DelilaI.

      [1995 Dec 7]  The namebook mechanism is currently broken for the clist.

technical notes

   The variable 'nametype' defines the kind of name picked up in namebook.

   The constant 'pagelength' defines the length of the page in the list.

   The constant 'topofpage' defines the top of the page in cm in the clist.

   There are 4 constants that tell the program the printer page boundaries:

   The following bounding box is for the Canon Color Laser Copier 1150.
   defaultllx =   7.10999; default for llx, lower left x
   defaultlly =   7.01995; default for lly, lower left y
   defaulturx = 588.15;    default for urx, upper right x
   defaultury = 784.98;    default for ury, upper right y

   These should be set for your printer.  To see how this is
   done, go to the link given in the See Also.
   Alternatively, you can use the edgecontrol parameter.
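
   Because the edge and shift parameters are given in cm while font sizes
   are in points, a small conversion helper can be handy when choosing
   values.  The Python sketch below uses only the 72 points/inch and
   2.54 cm/inch factors quoted earlier.  It assumes the four default
   constants are in PostScript points, as %%BoundingBox values normally
   are; the function names are invented for this example.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def points_to_cm(points):
    """Convert a length in PostScript points to centimeters."""
    return points / POINTS_PER_INCH * CM_PER_INCH


def cm_to_points(cm):
    """Convert a length in centimeters to PostScript points."""
    return cm / CM_PER_INCH * POINTS_PER_INCH


# Default bounding box constants listed above (assumed to be in points).
llx, lly, urx, ury = 7.10999, 7.01995, 588.15, 784.98
width_cm = points_to_cm(urx - llx)
height_cm = points_to_cm(ury - lly)

if __name__ == "__main__":
    print("printable area: %.2f cm x %.2f cm" % (width_cm, height_cm))
    print("15 point font height: %.3f cm" % points_to_cm(15))
```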

   As of version 5.96, hc can sense that a parameter file (hcp) is out
   of date and it will automatically upgrade the file.  For this reason the
   parameter file is now listed as 'inout', meaning that it can be modified
   by this program.

*)
(* end module describe.hc *)
{This manual page was created by makman 1.45}


{created by htmlink 1.62}