Delila Program: calhnb

calhnb program

By downloading this code you agree to the
Source Code Use License (PDF).

Pascal source code: calhnb.p (wget instructions)
Instructions on compiling
MacOS binary: calhnb
delilabundle.zip = All Programs and MacOS Binaries
Copyright Statement for Delila Programs

Documentation for the calhnb program is below, with links to related programs in the "see also" section.

{ version = 2.29; (* of calhnb.p 2005 Jul 16}

(* begin module describe.calhnb *)
(*
name
      calhnb: small-sample correction for information and uncertainty

synopsis
      calhnb(fin: in, fout: out, output: out)

files
      fin: the genomic composition (integers) on one line, followed by
         a set of integers, one per line, representing values of n

      fout: a table showing n, e(hnb), ae(hnb) and their difference.
         The variances var(hnb) and avar(hnb) are tabulated along with
         the difference between their square roots, that is, the
         difference between the exact and approximate standard deviations.
         e(n) is computed as the genomic uncertainty minus e(hnb).
         Finally, sd(n) = sqrt(var(hnb)) is given.

      output: messages to the user.
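
   As a rough illustration of the file layout described above, the
   following Python sketch parses a fin-style text and derives the last
   two fout columns from values of e(hnb) and var(hnb) that are supplied
   by the caller.  The function names (parse_fin, genomic_uncertainty,
   derived_columns) and the sample input are hypothetical and are not
   taken from calhnb.p; the real program may treat blank lines or
   trailing text differently.

```python
from math import log2, sqrt


def parse_fin(text):
    """Split fin-style text into (composition, list of n values).

    Assumes the first non-blank line holds the composition integers and
    every later non-blank line starts with one integer n.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    composition = [int(tok) for tok in lines[0].split()]
    n_values = [int(ln.split()[0]) for ln in lines[1:]]
    return composition, n_values


def genomic_uncertainty(composition):
    """Uncertainty in bits of the symbol frequencies in the composition."""
    total = sum(composition)
    return -sum((c / total) * log2(c / total) for c in composition if c > 0)


def derived_columns(hg, e_hnb, var_hnb):
    """Return e(n) = Hg - e(hnb) and sd(n) = sqrt(var(hnb))."""
    return hg - e_hnb, sqrt(var_hnb)


# Hypothetical input: an equiprobable composition, then n = 1 and n = 2.
composition, n_values = parse_fin("1 1 1 1\n1\n2\n")
hg = genomic_uncertainty(composition)
```

   With an equiprobable composition of four symbols, hg is 2 bits, so
   e(n) is simply 2 minus whatever e(hnb) is supplied for that n.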

describe

   Given a genomic composition and a series of integers (n) that represent
   the number of sample sites, calhnb calculates the sampling error as e(hnb)
   and the variance var(hnb).  It also finds the approximations ae(hnb) and
   avar(hnb).  These values are presented in a table along with the
   differences between the exact and approximate calculations.  This table
   will allow a user to decide when to use the approximations.  Beware that
   the exact calculation becomes very expensive for large n.  For this
   reason, I use the approximate computation for n > 20 in rseq and alpro.
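
   The trade-off described above can be sketched in Python.  This is not
   the algorithm from calhnb.p: it is a brute-force enumeration of every
   possible set of symbol counts in n sites, each weighted by its
   multinomial probability, written under the assumption that e(hnb) and
   var(hnb) are the mean and variance of the sample uncertainty Hnb.
   The approximation ae(hnb) = Hg - (s - 1)/(2 ln(2) n), for s symbols,
   is assumed here to be the form given in Schneider et al. (1986).  All
   function names are hypothetical, and avar(hnb) is not covered.

```python
from math import factorial, log, log2


def compositions(n, parts):
    """Yield every tuple of `parts` non-negative integers that sum to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, parts - 1):
            yield (first,) + rest


def exact_hnb(genome_counts, n):
    """Return (mean, variance) of Hnb over all samples of n >= 1 sites."""
    total = sum(genome_counts)
    probs = [c / total for c in genome_counts]
    mean = 0.0
    mean_sq = 0.0
    for counts in compositions(n, len(probs)):
        # Multinomial coefficient n! / (k1! k2! ...), kept as an integer.
        weight = factorial(n)
        for k in counts:
            weight //= factorial(k)
        # Probability of drawing exactly these counts from the genome.
        prob = float(weight)
        for k, p in zip(counts, probs):
            prob *= p ** k
        # Uncertainty in bits of this particular sample.
        h = -sum((k / n) * log2(k / n) for k in counts if k > 0)
        mean += prob * h
        mean_sq += prob * h * h
    return mean, mean_sq - mean * mean


def approx_e_hnb(genome_counts, n):
    """Approximate mean of Hnb: Hg minus (s - 1) / (2 ln(2) n)."""
    total = sum(genome_counts)
    hg = -sum((c / total) * log2(c / total) for c in genome_counts if c > 0)
    return hg - (len(genome_counts) - 1) / (2 * log(2) * n)


print(exact_hnb([1, 1, 1, 1], 2))  # -> (0.75, 0.1875)
```

   For a four-symbol alphabet this enumeration visits
   (n+1)(n+2)(n+3)/6 count tuples, so the work in this sketch grows
   roughly as the cube of n, which illustrates why an approximation is
   attractive for larger samples.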

examples

   When used as fin, the calhnb.fin file should produce output in fout
   that matches the calhnb.fout file.  The data should be identical to
   those given in Figure A.2 on page 428 of the Appendix of Schneider
   et al. 1986.

documentation

   "Information content of binding sites on nucleotide sequences"
   T. D. Schneider, G. D. Stormo, L. Gold, and A. Ehrenfeucht
   JMB 188:415-431 (1986)  [see link below]

see also

   Example       input  file, fin:  calhnb.fin
   Corresponding output file, fout: calhnb.fout

   fin  file for values up to n = 50: calhnb.50.fin
   fout file for values up to n = 50: calhnb.50.fout

   Discussion about correcting for small sample size:
   https://alum.mit.edu/www/toms/small.sample.correction.html

   Schneider et al. (1986):
   https://alum.mit.edu/www/toms/paper/schneider1986

   related programs: rseq.p, alpro.p

author

      Thomas D. Schneider

bugs

   It would be nice to have a generalized algorithm for any number
   of symbols.

*)
(* end module describe.calhnb *)
{This manual page was created by makman 1.45}
