program ttest(ttestp, list, output);
(* ttest: Student's t-test

  Thomas D. Schneider, Ph.D.
  toms@alum.mit.edu
  https://alum.mit.edu/www/toms

*)

label 1; (* end of program *)

const
(* begin module version *)
version = 1.12; (* of ttest.p 2015 Jul 20
2015 Jul 20, 1.12: improve documentation
2015 Jan 22, 1.11: fix documentation, Bulyk.Church2002
2013 Mar 27, 1.10: align columns of output
2000 Apr 19, 1.08: program produces probability.
origin probably 1994 January 15
*)
(* end module version *)

(* begin module describe.ttest *)
(*
name
   ttest: Student's t-test

synopsis
   ttest(ttestp: in, list: out, output: out)

files
   ttestp: parameters to control the program:

      A set of 6 lines defines the two distributions:

         N (integer) for first distribution
         mean (real) for first distribution
         standard deviation (real) for first distribution

         N (integer) for second distribution
         mean (real) for second distribution
         standard deviation (real) for second distribution

      The 7th line is the factor to divide the sample by.  If we have an
      Rsequence calculated from a dimeric sequence, then the two halves
      are NOT independent.  The correct calculation takes this into
      account by using the same N for the one-way sites and by dividing
      the mean by 2.  Squaring the standard deviation gives the variance.
      This variance is divided by 2 and then square rooted to get the
      standard deviation of the half sites.

      If this "sample division factor" is 1, then the calculations
      proceed without these adjustments.  If the factor is 2, then the
      changes described above are made.

   list: Input values and calculated T value

   output: messages to the user

description

   This simple program performs the T test computations.

examples

   7 OxyR binding site sequences were analyzed for information content
   and the standard deviation calculated by the rsim.p program.  This gave
   15.4 +/- 1.9 bits for n = 14 sequences.

   A randomization experiment was performed and 16 sequences which bind
   OxyR were recovered.  These were analyzed as above.
   This gave 17.5 +/- 1.2 bits for n = 32 sequences.

   Since both the sequences and their complements were used for the
   calculation, only half site information should be used.  The ttestp
   file is:

14       n1: number of samples, sample 1
15.4     m1: mean, sample 1
1.9      s1: standard deviation, sample 1
32       n2: number of samples, sample 2
17.5     m2: mean, sample 2
1.2      s2: standard deviation, sample 2
2        sample division factor.

   The resulting list file is:

********************************************************************************
old:

ttest 1.04
sample division by a factor of 2

                distribution 1 | distribution 2
number                      14 | 32
mean                   7.70000 | 8.75000
standard dev.          1.34350 | 0.84853

sigma-D = 0.38914
degrees of freedom = 44
t = -2.69827

   This is significant (p < 0.02).  So the randomization did not give a
   similar information content to the wild type.

********************************************************************************

ttest 1.08
sample division by a factor of 2

                distribution 1 | distribution 2
number                      14 | 32
mean                   7.70000 | 8.75000
standard dev.          1.34350 | 0.84853

sigma-D = 0.38914
degrees of freedom = 44
t = -2.69827
p = 0.99508

documentation

@book{Press1989,
author = "W. H. Press
 and B. P. Flannery
 and S. A. Teukolsky
 and W. T. Vetterling",
title = "Numerical Recipes in Pascal.
The Art of Scientific Computing",
publisher = "Cambridge University Press",
address = "Cambridge",
year = "1989"}

@article{Schneider.oxyr,
author = "T. D. Schneider",
title = "Reading of {DNA} Sequence Logos:
Prediction of Major Groove Binding by Information Theory",
journal = "Meth. Enzym.",
volume = "274",
pages = "445-455",
year = "1996"}

   Given a t value from a Student's t test, and the degrees of freedom,
   df, return the probability for a two tailed test.
   The code was originally in JavaScript, from:

      Richard Lowry
      Department of Psychology
      Vassar College
      Poughkeepsie, NY 12604-0396 USA
      office: (914)437-7381
      fax: (914)437-7538
      lowry@vassar.edu
      http://faculty.vassar.edu/~lowry/VassarStats.html

   The original functional HTML containing this code is given below the
   Pascal.  It was translated to Pascal by Tom Schneider.

   A concise description of the t-test is given on page 1256 of:

@article{Bulyk.Church2002,
author = "M. L. Bulyk
 and P. L. Johnson
 and G. M. Church",
title = "{Nucleotides of transcription factor binding sites exert
interdependent effects on the binding affinities of transcription
factors}",
journal = "Nucleic Acids Res.",
volume = "30",
pages = "1255--1261",
pmid = "11861919",
pmcid = "PMC101241",
year = "2002"}

see also

   rseq.p, rsim.p, multtest.p

   http://www.statsol.com/tools/stattools/ttestindependenttool.html
   http://faculty.vassar.edu/~lowry/VassarStats.html

   {Bulyk.Church2002:}
   http://www.ncbi.nlm.nih.gov/pubmed/11861919

author

   Thomas Dana Schneider

bugs

technical notes

*)
(* end module describe.ttest *)

var
   ttestp, list: text; (* files used by this program *)

(* begin module halt *)
procedure halt;
(* stop the program.  the procedure performs a goto to the end of the
   program.  you must have a label:
      label 1;
   declared, and also the end of the program must have this label:
      1: end.
   examples are in the module libraries.
   this is the only goto in the delila system. *)
begin
   writeln(output,' program halt.');
   goto 1
end;
(* end module halt version = 1.17; (@ of multtest.p 2000 April 19 *)

(* begin module ttestprobability *)
function ttestprobability(t, df: real): real;
(*
Given a t value from a Student's t test, and the degrees of freedom, df,
return the probability for a two tailed test.
The code was originally in JavaScript, from:

   Richard Lowry
   Department of Psychology
   Vassar College
   Poughkeepsie, NY 12604-0396 USA
   office: (914)437-7381
   fax: (914)437-7538
   lowry@vassar.edu
   http://faculty.vassar.edu/~lowry/VassarStats.html

The original functional HTML containing this code is given below the Pascal.

It was translated to Pascal by:

   Dr. Thomas D. Schneider
   toms@alum.mit.edu
   permanent email: toms@alum.mit.edu
   https://alum.mit.edu/www/toms/

It is tested carefully in: testttestprobability.
*)
var
   pi: real; (* ratio of circle circumference to diameter *)
   pj2: real; (* pi/2 *)
   pj4: real; (* pi/4 *)
   pi2: real; (* 2*pi *)
   e: real; (* e *)
   dgr: real; (* 180/pi *)

   function zip(q,i,j,b: real): real;
   (* sum a series that starts at 1; each new term is the previous term
      times q*k/(k-b), for k = i, i+2, ... while k <= j *)
   var
      k: real;
      zz, z: real;
   begin
      zz := 1;
      z := zz;
      k := i;
      while (k<=j) do begin
         zz := zz*q*k/(k-b);
         z := z+zz;
         k := k+2
      end;
      zip := z;
   end;

   function buzz(t,n: real): real;
   (* two-tailed probability of exceeding |t| with n degrees of freedom *)
   var
      rt, fk, ek, dk : real;
   begin
      t := abs(t);
      rt := t/sqrt(n);
      fk := arctan(rt);
      if (n=1)
      then buzz := 1-fk/pj2
      else begin
         ek := sin(fk);
         dk := cos(fk);
         if ((round(n) mod 2)=1)
         then buzz := 1-(fk+ek*dk*zip(dk*dk,2,n-3,-1))/pj2
         else buzz := 1-ek*zip(dk*dk,1,n-3,-1)
      end
   end;

{ NOT USED
   function Abuzz(p,n: real): real;
   var
      v, dv, t: real;
   begin
      v := 0.5;
      dv := 0.5;
      t := 0;
      while (dv>1e-6) do begin
         t := 1/v-1;
         dv := dv/2;
         if(buzz(t,n)>p)
         then v := v-dv
         else v := v+dv
      end;
      Abuzz := t
   end;
}

begin (* ttestprobability *)
   pi := 4.0*arctan(1.0);
   pj2 := pi/2;
   pj4 := pi/4;
   pi2 := 2*pi;
   e := exp(1);
   dgr := 180/pi;

   ttestprobability := 1.0-(buzz(t, df)/2.0);
end; (* ttestprobability *)

(*
The material below this point is a functioning JavaScript program
that can be used as a web page.

********************************************************************************