Delila Program: xyplo

xyplo program

Documentation for the xyplo program is below, with links to related programs in the "see also" section.

{   version = 9.51; (* of xyplo.p 2024 Jan 17}

(* begin module describe.xyplo *)
(*
name
   xyplo: x, y data plotter (pronounced: "zyplo")

synopsis
   xyplo(xyin: in, xyout: out, xyplop: inout, xyplom: in,
         warnings: out, output: out)

files
   xyin: A set of header lines that begin with asterisk ('*') or pound
       sign ('#') are copied to output.  Remaining lines are the data
       in columns, ending with end of file.  Tabs may separate data.
       Missing columns are not allowed.  See the demonstration file
       xyin.demo for an example.  Once the first data line has been
       read, lines that begin with an '*' or '#' or that are entirely
       blank will be ignored.  This allows one to place comments or
       other information deeper into the file without having xyplo
       object.  The # allows files to be read by gnuplot too.
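
       The reading rules above can be sketched in Python.  This is only
       an illustration, not xyplo's own Pascal code: the name read_xyin,
       the splitting of columns on blanks or tabs, and the skipping of
       blank lines before the first data line are all assumptions.

```python
def read_xyin(lines):
    """Split xyin-style lines into (header, rows).

    Leading lines that start with an asterisk or '#' are header lines.
    After the first data line, such lines and blank lines are skipped.
    """
    header, rows = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        is_comment = line.startswith("*") or line.startswith("#")
        if not rows and is_comment:
            header.append(line)        # header: copied to output
        elif is_comment or not line.strip():
            continue                   # comment or blank line: ignored
        else:
            rows.append(line.split())  # blanks or tabs separate columns
    return header, rows
```

       Calling read_xyin(open("xyin")) would return the header lines and
       a list of rows, each row being a list of column strings.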

   xyplop:  Parameters to control the plot, on lines as shown.  The major
       sections of the parameter file are separated by lines that are used by
       the program as separators.  A separator line may begin with blanks, and
       these must be followed by asterisks, as shown below.  These lines simply
       make the file easier to deal with, but you must have them in the file!
       The easiest way to create a xyplop file is to copy the demonstration
       file (xyplop.std or xyplop.demo) and modify that to suit your needs.

       xzero yzero         amounts to move the graph origin (cm)
       zx min max          (character, real, real) if zx='x' then set xaxis
       zy min max          (character, real, real) if zy='y' then set yaxis
                           These two lines set the minimum and maximum range of
                           the data to graph.  Other characters mean the
                           program automatically uses the range of the data.
       xinterval yinterval xsubintervals ysubintervals:
                           These 4 parameters, all on one line, define the
                           number of numbered intervals on each axis to plot
                           and the number of unnumbered subintervals.  Note
                           that on a regular scale, going from 10 to 20
                           requires 10 intervals (to get whole numbers), but on
                           a log scale, going from 10 to 100 requires 9
                           intervals (to get tics every 10), not 10 intervals!

       xwidth    ywidth    width of numbers in characters
       xdecimal  ydecimal  number of decimal places
       xsize     ysize     size of axes in cm
       xlabel              the x axis label
       ylabel              the y axis label
                           NEW: 2016 Sep 29 the y label is now rotated and
                           on the left side of the graph.  To restore
                           the original behavior of the y label above
                           the y axis, put '^' in front of the title.
       zc                  if zc='c' then crosshairs are put on zero of x and y
                                 'x' then only X axis is plotted
                                 'X' then only X axis and crosshairs
                                 'y' then only Y axis is plotted
                                 'Y' then only Y axis and crosshairs
                                 'n' then neither axis nor crosshairs
                                 'N' then neither axis with crosshairs
                                 'i' then numbering and tic marks but no line
                                     (invisible)
                           Otherwise, both axes are plotted without crosshairs.
       zxl base            if zxl='l' then convert the x axis to a log scale
                           using the indicated base.
                           zxl='L' is like 'l' but the numbers given on the
                           axis have not had the log taken of them.
       zyl base            if zyl='l' then convert the y axis to a log scale
                           using the indicated base.
                           zyl='L' is like 'l' but the numbers given on the
                           axis have not had the log taken of them.
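
       The interval counts in the xinterval note above can be checked
       with a short Python sketch.  The function tick_values is a
       hypothetical helper, and it only does the arithmetic on the
       numbers; how xyplo places the tics on a log axis is not shown.

```python
def tick_values(lo, hi, intervals):
    """Return the interval boundaries from lo to hi, evenly spaced in data units."""
    step = (hi - lo) / intervals
    return [lo + i * step for i in range(intervals + 1)]

linear = tick_values(10, 20, 10)   # whole numbers 10, 11, ... 20
decade = tick_values(10, 100, 9)   # tics every 10: 10, 20, ... 100
print(linear)
print(decade)
```

       Ten boundaries need nine intervals, which is why the 10 to 100
       case uses 9 and not 10.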

       * define columns to read data from ***********************************
                           This section defines which column of xyin contains
                           what kind of data.  You can use a column only once.

       xcolumn   ycolumn   columns of xyin that determine the
                           location of the symbol
       symbol-column       the xyin column to read symbols from
                           if zero, then use the first symbol defined below
       xscolumn  yscolumn  columns of xyin that determine the size  of the
                           symbol.  If zero, then no data is expected.

                           NOTE: for most symbols this is the entire size of
                           the symbol.  For the I beam symbol, the yscolumn is
                           half of the total size plotted.  Thus one may use
                           standard deviations and obtain a symbol of 2
                           standard deviations high centered on the y
                           coordinate.

       colorkind hucolumn sacolumn brcolumn
                           define the hue saturation brightness columns.
                           These control the color of the rectangle symbol.
                           1 0 0 is black (assumed if columns are all zero)
                           1 0 1 is white
                           Hue runs from red (value 0) through the spectrum
                           to violet (value 1).  See the technical notes for
                           further details.

                           If the first character of the line is a
                           digit, blank or 'h' then the hsb (hue
                           saturation brightness) color model is used.

                           If the first character of the line is 'r'
                           then the rgb (red green blue) color model
                           is used.
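
       The two reference colors given above (1 0 0 black, 1 0 1 white)
       can be checked in Python.  This sketch assumes that the hsb
       model here behaves like the HSV model of Python's colorsys
       module; the helper name hsb_to_rgb is made up for illustration.

```python
import colorsys

def hsb_to_rgb(hue, saturation, brightness):
    """Convert an HSB triple (each 0..1) to an RGB triple (each 0..1)."""
    return colorsys.hsv_to_rgb(hue, saturation, brightness)

print(hsb_to_rgb(1, 0, 0))  # all components zero: black
print(hsb_to_rgb(1, 0, 1))  # all components one: white
```

       With saturation 0 the hue has no effect, so only the brightness
       column matters for gray values.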

       * define one or more symbols *****************************************

                           Each of these sections defines one of the symbols by
                           specifying what to do for each symbol flag seen in
                           the symbol column.  There may be as many symbols as
                           will fit in memory.

                           The last of these sections must contain just a '.'
                           as the 'symbol-to-plot'.  This is required to end
                           the symbol definition section since there are an
                           indefinite number of symbols.

       symbol-to-plot      (character) Most symbols are plotted at the
                           coordinates given in xcolumn and ycolumn.

                           'c' plot a circle
                           'C' plot a circle; with connections in color
                           'b' plot a box
                           'B' plot a box; with connections in color
                           'x' plot an x
                           '+' plot a plus
                           'I' plot an I beam symbol
                           'd' plot a box with central dot
                           'p' point (or dot) alone.
                           'm' mark according to a xyplom definition
                           'M' mark according to a xyplom definition,
                               with connections in color

                           'R' plot a filled rectangle in color.  Unlike the
                           other symbols, which are centered on the data, the
                           lower right hand corner of this rectangle is placed
                           on the data.  This allows the user more control on
                           placement.

                           's' plot a histogram from zero to the height

                           'r' like 'R' but gray scale.  The brightness
                           column is used for controlling the brightness.

                           'f' Means to plot the symbol-flag (defined below).
                           The 'f' type allows several symbols to be made each
                           with its own regression and connection lines, but
                           plotted with the entire flag string in xyin.  The
                           symbols are distinguished by their first character.
                           The symbol-flag in xyplop should be set to the string
                           that one desires to be recognized.
                           'f' will center the string.
                           'F' will left justify the string.

                           'g' Means 'grab bag'.  The 'g' type has lower
                           priority than any other symbol.  Xyplo searches
                           through all the available symbols looking for a
                           match to the symbol-flag.  If a symbol-flag cannot
                           be found, then the data are assigned to the
                           'grab-bag'.  The program uses the symbol-flag on the
                           graph.
                           The symbol-flag in xyplop can be anything.
                           'g' will center the string.
                           'G' will left justify the string.

                           'L' will make just lines using the current colors.

                           The symbol underscore (_) in xyin is converted
                           to a blank to allow the appearance of separated
                           words.

                           One can do grab-bag connected curves without symbols
                           by setting g and the symbol-flag to ' '.  One can
                           also set the symbol-to-plot to blank (or other
                           unrecognized symbol) to get specific connected
                           curves.  In this case, the symbols MUST be connected
                           or the program will object (invisible symbol and
                           invisible connection means data loss).

                           'm' Means to look up the symbol-flag name
                           in the xyplom file and to use the PostScript
                           definition there to create the symbol.

                           'M' like 'm' but draw the connecting line
                           in color and if there is no symbol in the
                           xyplom, don't draw any symbol.  This allows
                           one to make curves without symbols.

       symbol-flag         The string of characters that indicates that this
                           symbol should be plotted.  Eg, if the
                           'symbol-to-plot' is I and the flag is x, then
                           whenever an x is seen in the symbol column, an I
                           beam will be plotted.  The flag can be more than one
                           character long, but (unfortunately) it cannot
                           contain blanks.

       symbol-sizex        Side in cm on the x axis of the symbol.  If this
                           value is negative, the data in xscolumn is used to
                           determine the size.  For circles, sizex determines
                           the diameter, sizey is ignored.

       symbol-sizey        Side in cm on the y axis of the symbol.  If this
                           value is negative, the data in yscolumn is used to
                           determine the size.  For circles, sizeX determines
                           the diameter but a positive number is still required
                           for sizey.

       connection linetype size   If the first character is 'c' then the
                           symbols will be connected by lines of linetype as
                           defined below.  (Linetype must follow the c
                           immediately, without blanks.)

       linetype  size      linetype is a character defining the kind of
                           regression line to plot for this symbol:
                           'l' means do regression line
                           'i' invisible,
                           '.' dotted
                           '-' dashed
                           'n' means no line.

                           '-' and '.' require a size in cm for the
                           spacing.  The others also require a number, but it
                           is ignored.
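
       The priority of the grab-bag symbol described above can be
       sketched in Python.  This is a guess at the logic, not xyplo's
       code: the names choose_symbol and flag_label are hypothetical,
       flags are compared for exact equality, and the first-character
       rule for 'f' symbols is left out.

```python
def choose_symbol(definitions, flag):
    """Pick the symbol-to-plot for a flag read from the symbol column.

    definitions: list of (symbol_to_plot, symbol_flag) pairs, in file order.
    Returns the symbol-to-plot character, or None if nothing applies.
    """
    grab_bag = None
    for symbol, symbol_flag in definitions:
        if symbol in ("g", "G"):
            if grab_bag is None:
                grab_bag = symbol      # remembered, used only as a fallback
        elif symbol_flag == flag:
            return symbol              # an exact match always wins
    return grab_bag

def flag_label(flag):
    """Text drawn for flag-type symbols: underscores become blanks."""
    return flag.replace("_", " ")
```

       For example, with definitions [("g", "any"), ("I", "x")] the flag
       "x" selects the I beam and every other flag falls into the grab
       bag.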

       * end the symbol definitions with a period (left justified!) *********
.
       * define zero or more user defined lines *****************************
       linetype m b size   One or more lines to be drawn on the plot; m and b
                           are slope and intercept.  Linetype and size are
                           defined as for the symbol connection lines.  Blank
                           lines and lines that begin with "*" are ignored
                           in this section.
                           Linetype is defined as for the regression lines.

       * end of the line definitions and start of more parameters ***********

          edgecontrol edgeleft, edgeright, edgelow, edgehigh:
          edgecontrol is a single character that controls how the bounding
          box of the figure is handled.  If it is 'n' then the bounding box
          will be the page parameters defined in constants inside the program
          (llx, lly, urx, ury AND changes as set by the previous parameter
          line).  If the parameter is 'p', there are four real numbers that
          define the edges around the graph in cm.  To allow a graph to be
          embedded into another figure, its size must be defined in
          PostScript (with %%BoundingBox).  By setting these four numbers,
          the edges are defined.  Negative values are allowed, so one may
          move the edges as desired.

          (New as of 2003 Aug 22)
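
          Edges are given in cm, while a %%BoundingBox comment uses
          integer values in points (1/72 inch).  The Python sketch below
          shows only that unit conversion.  It assumes the four numbers
          are already the corner positions of the box; how xyplo combines
          the edge values with the graph is not shown, and the function
          name is hypothetical.

```python
import math

POINTS_PER_CM = 72 / 2.54   # PostScript points are 1/72 inch

def bounding_box_comment(llx_cm, lly_cm, urx_cm, ury_cm):
    """Format a %%BoundingBox line from corner positions given in cm."""
    llx = math.floor(llx_cm * POINTS_PER_CM)   # round outward so that
    lly = math.floor(lly_cm * POINTS_PER_CM)   # nothing is cut off
    urx = math.ceil(urx_cm * POINTS_PER_CM)
    ury = math.ceil(ury_cm * POINTS_PER_CM)
    return "%%BoundingBox: {} {} {} {}".format(llx, lly, urx, ury)

print(bounding_box_comment(1, 1, 10, 20))
```

          Negative values work the same way, since floor and ceil still
          round away from the inside of the box.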

   xyout: regression results, ready for PostScript input.  (See technical
          notes.)

   xyplom: A file containing definitions of additional symbols,
      written in PostScript.  The symbols must be of the form:

             /mysymbol { % xsize ysize mysymbol -
                % xsize x size in points
                % ysize y size in points
                /xsize exch def
                /ysize exch def
                ...
             } bind def

      Xyplo translates to the proper location and sets the color before
      calling the routine.  This allows the xyin file to control the color of
      the symbols, but of course it can be overridden by the user's routine.
      It is even possible for the symbol to use both colors!

      Be sure to define the symbol in the xyplop.

      If the xyplom file is empty, there are no user defined symbols.

      FONT CONTROL (New as of 2010 Oct 26)
         %F  If the xyplom file contains any line starting with '%F'
         then the user defines the font.

         %T If the xyplom file contains any line starting with '%T'
         then the user can scale the tic marks on both axes.

         %X If the xyplom file contains any line starting with '%X'
         then the user can scale the tic marks on the X axis.

         %Y If the xyplom file contains any line starting with '%Y'
         then the user can scale the tic marks on the Y axis.

         %x If the xyplom file contains any line starting with '%x'
         then the user can move the x label up or down (cm).

         %y If the xyplom file contains any line starting with '%y'
         then the user can move the y label up or down (cm).

         %W If the xyplom file contains any line starting with '%W'
         then a white rectangle is written on the background.  If the
         figure is converted to a jpg using the ImageMagick convert
         program, the background will not be black.
         The white rectangle is determined from the bounding box.
         You can control the bounding box using edgecontrol.

         Example:

            %F % font control from xyplom
            /Helvetica-Bold findfont
            16 scalefont
            setfont
            2 setlinewidth % sets linewidth
            %T 1.5 % tic mark scale factor
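
      A small Python sketch can list which control lines an xyplom file
      contains.  It is an illustration only: the names CONTROL_LINES and
      find_controls are made up, and the marker is assumed to sit in the
      first two columns of the line.

```python
CONTROL_LINES = {
    "%F": "user defines the font",
    "%T": "scale tic marks on both axes",
    "%X": "scale tic marks on the X axis",
    "%Y": "scale tic marks on the Y axis",
    "%x": "move the x label",
    "%y": "move the y label",
    "%W": "white background rectangle",
}

def find_controls(lines):
    """Return the control markers found at the start of any line, in order."""
    found = []
    for line in lines:
        marker = line[:2]
        if marker in CONTROL_LINES and marker not in found:
            found.append(marker)
    return found
```

      Applied to the example above, this finds the %F and %T markers and
      ignores the ordinary PostScript lines between them.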

   warnings: extensive warning messages.  These get in the way if they go to
      output.  One line appears on the output file if there are any
      warnings.

   output: messages to the user

description
   The data in the xyin file are converted to graphics in the PostScript
   language on the xyout file, under control of the parameters set in xyplop.
   There are several distinct sections of the parameters:

   1. The first set of parameters determines the overall characteristics of the
      graph.
   2. The second set of parameters defines the columns of xyin to be read.
   3. The next section of the parameter file defines one or more symbols to be
      plotted on the graph.  If desired, a linear regression is performed
      between the data columns, and this may be graphed for each symbol.  The
      invisible option allows one to obtain the regression data without the
      graph.  Regression data include the correlation coefficient
      and Fisher's z'.
   4. A section with just a period ends the symbols section.
   5. The last section contains lines you define.
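
   The regression mentioned in item 3 can be sketched in Python as an
   ordinary least-squares fit.  This is a textbook version under the
   assumption that xyplo fits y on x; the function name regress is
   hypothetical and the code is not taken from xyplo.

```python
import math

def regress(xs, ys):
    """Least-squares line y = m*x + b, with correlation r and Fisher z."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx                  # slope
    b = mean_y - m * mean_x        # intercept
    r = sxy / math.sqrt(sxx * syy) # correlation coefficient
    z = math.atanh(r)              # Fisher z, undefined when r is +1 or -1
    return m, b, r, z
```

   The sketch needs at least two distinct x values and two distinct y
   values; a perfect fit (r of exactly 1 or -1) makes the Fisher value
   undefined.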

   Recommended procedure for using xyplo: obtain a copy of xyplop.demo and
   xyin.demo, set permission to read them for yourself (on a Unix system use
   chmod), and copy them to the names xyplop and xyin.  Try them out as is.  If
   you don't get a graph, doing your own data will not do any good!  Then
   convert the xyplop to your own use by changing the xyplop.demo file and
   substitute your xyin file.  This way the complexity of xyplop can be held at
   bay.

see also

   Basic example files:
   xyin, xyout, xyplop, xyplom,

   Standard parameter file:  xyplop.std

   Demonstration examples:
   xyplop.demo, xyin.demo,
   xyplop.test, xyin.test,
   xyplop.mul, xyin.mul,
   xyin.logs
   xyplop.xn xyplop.xl xyplop.xL
   xyplop.yn xyplop.yl xyplop.yL
   xyin.genbank xyplop.genbank

   Related program - generate histogram:
   genhis.p

   Related programs - graphics routines:
   dops.p
   doodle.p

   Technical note:
   The bounding box for the graph must be defined.
   The PostScript program
   https://alum.mit.edu/www/toms/ftp/printerarea.ps
   will give the values for your printer.
   Substitute the values at %%BoundingBox.

   Xyplo now accepts the '#' as the start of a comment
   so that data files can also be used for gnuplot.  See:
   http://www.gnuplot.info/

   Confidence Limits for the Correlation,
   Fisher's z':
   http://sportsci.org/resource/stats/sscorr.html#fisherz

   who says that to get the confidence limits of the correlation
   coefficient,

     "use the Fisher z transformation: z = 0.5log[(1 + r)/(1 - r)].
     The transformed correlation (z) is normally distributed with
     variance 1/(n - 3), so the 95% confidence limits are given by z
     ± 1.96/sqrt(n - 3).  You then have to back-transform these
     limits to correlation coefficients using the equation r =
     [(e^(2z) - 1)/(e^(2z) + 1)]."
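
   The quoted recipe can be written out as a short Python sketch,
   reading "log" as the natural logarithm.  The function name
   correlation_limits and its z_crit parameter are hypothetical; the
   sketch needs more than 3 data points and a correlation strictly
   between -1 and 1.

```python
import math

def correlation_limits(r, n, z_crit=1.96):
    """Approximate 95% confidence limits for a correlation r from n points."""
    z = 0.5 * math.log((1 + r) / (1 - r))      # Fisher z transformation
    half_width = z_crit / math.sqrt(n - 3)     # standard error is 1/sqrt(n - 3)

    def back(v):
        # back-transform a z value to a correlation coefficient
        return (math.exp(2 * v) - 1) / (math.exp(2 * v) + 1)

    return back(z - half_width), back(z + half_width)

low, high = correlation_limits(0.5, 28)
print(low, high)
```

   The limits are not symmetric around r, because the back-transform
   compresses values near 1 and -1.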

   See also:
   Fisher RA (1921). On the probable error of a coefficient of
   correlation deduced from a small sample. Metron 1, 3-32.
   http://davidmlane.com/hyperstat/A50760.html

author
   Thomas Schneider

technical notes

   The program originally generated output in the pic format.  One
   could then run this through pic and troff to produce a graph.
   However, the program has been modified to eliminate the pic
   notation (by substituting modules from dops rather than domods).
   All lines outside the graphics now are preceded by a %, which is
   the beginning of a comment in PostScript.  Thus the output of the
   program can be run directly into a PostScript interpreter.  This
   saves on both memory and speed of graphing since the intermediate
   file is no longer created.

   Colors in PostScript are defined with hue, saturation and
   brightness with the sethsbcolor function.  (Xyplo does not use the red,
   green, blue model because it does not give the most useful
   continuous scale, though maybe someday I'll put in a switch.)  The
   standard hue runs from red at 0 to red at 1 with Roy G. Biv in
   between.  (The famous physicist Roy G. Biv stands for: red, orange,
   yellow, green, blue, indigo, violet, the colors of the spectrum.)
   On 1992 September 16 I realized that, amazingly, the wavelengths in
   nm match quite nicely!

      700 red
      650 orange
      600 yellow
      550 green
      500 blue
      450 indigo
      400 violet

   Ok, Ok, back to xyplo!  Since the hue runs in a circle, the
   spectrum is not generated from the hue range 0 to 1.  Xyplo
   therefore converts the input numbers by multiplying by 0.84.
   This gives the color range from red through violet
   corresponding to values 0 to 1.  (Note:  adding 0.16 gives the range
   from yellow through red, but that is not a spectrum which runs from
   red through violet.)
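
   The effect of the 0.84 scale factor, with and without the 0.16
   offset mentioned in the note, can be checked numerically.  The
   Python sketch below assumes that the hsb model matches the HSV
   model of Python's colorsys module; hue_for is a hypothetical helper
   and not a routine from xyplo.

```python
import colorsys

def hue_for(value, scale=0.84, offset=0.0):
    """Map an input value in 0..1 onto a hue in 0..1."""
    return value * scale + offset

for offset in (0.0, 0.16):
    for value in (0.0, 1.0):
        hue = hue_for(value, offset=offset)
        rgb = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        print(offset, value, round(hue, 2), [round(c, 2) for c in rgb])
```

   Without the offset the end points are red (hue 0) and a red-blue
   mixture near the violet end (hue 0.84).  With the offset they are
   yellow (hue 0.16) and red (hue 1).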

   As of version 8.50 (1996 March 21) the xyplo program now uses cm
   (YEA!) which means that ALL previous graphs are out of date.  If
   you happen to be stuck in a backwards, primitive country, tough
   luck.  Bite the bullet.

   Minor unobvious things have prevented people from getting graphs.
   Most problems occur when badly formed xyplop files are used, and
   the program has no way to tell what the difficulty is.  More checks
   have been put in, so the program can detect most oddly formed xyplop
   and xyin files.  Check your xyplop carefully.

   Setting the Default Printer Area:

   The size of the page in PostScript is determined with 4 constants
   llx, lly, urx and ury.  These must be set correctly for each
   printer.  These can be easily obtained by printing the file:

   https://alum.mit.edu/www/toms/ftp/printerarea.ps

   In this program these parameters are set as 4 constants defaultllx,
   defaultlly, defaulturx and defaultury.  The parameters for page
   edges define whether to use the defaults or to compute the size to
   show.

bugs

ENHANCEMENTS:
* xyplo should apply a PostScript clip to remove everything outside the drawing area

* xyplo has a numerical drift problem.  For extremely large numbers of
data points, the computed position differs from the position rendered
by PostScript.  This should be corrected by using gsave and grestore,
but it is not clear how to modify the draw routines to do this
cleanly.  As it is the program tracks the location of where it THINKS
it is on the graphical display.
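
The kind of drift described above can be reproduced in Python by adding
the same relative move many times and comparing with the position
computed in one step.  This shows only the general floating-point
effect; it is not xyplo's draw routine, the names are hypothetical, and
the size of the error in xyplo or in a PostScript interpreter may differ.

```python
def accumulated_position(step, count):
    """Position reached by adding the same relative move count times."""
    position = 0.0
    for _ in range(count):
        position += step       # rounding error builds up with every move
    return position

step, count = 0.1, 1000000
drift = accumulated_position(step, count) - step * count
print(drift)
```

Returning to a saved absolute position at intervals, rather than always
moving relative to the last point, keeps such errors from adding up.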

ENHANCEMENTS:
* xyplo: cannot handle error bars in log mode

* xyplo: user defined lines have to be type l??

* xyplo: define symbols in postscript - should speed graphics a lot

* xyplo bug: when the last line of the file is empty, the program halts (that's
fine).  The count of the location of the error does NOT include the * lines
though!  Yet it says:  "at line 27 of data (INCLUDING * lines)" Correct this.

* xyplo bug from Mark (email 1992 May 27)

*)
(* end module describe.xyplo *)
{This manual page was created by makman 1.45}


{created by htmlink 1.62}