By downloading this code you agree to the
Source Code Use License (PDF).
{ version = 3.61; (* of makewalker.p 2006 Jul 07}
(* begin module describe.makewalker *)
(*
name
   makewalker: walk an information weight matrix across a sequence

synopsis
   makewalker(book: in, ribl: in, colors: in, makewalkerp: in,
              walk: out, output: out)

files
   book: a book from the Delila system

   ribl: a weight matrix from the Ri program

   colors: definitions of how to color letters. See makelogo.p for
      details.

   makewalkerp: parameters to control this program

      The first line must be the version number of the program. This
      allows the program to recognize when the parameter file is old.

      rangefrom: integer, FROM of the ribl matrix to use.
      rangeto: integer, TO of the ribl matrix to use.
      basesperline: integer, number of bases per line to display.
      linesperpage: integer, number of lines per page to display.
      basenumber: integer, the base on the line to place the zero of
         the walker at initially on the page. It must be between 0
         and basesperline - 1. Counting begins at zero on the left
         side of the page.
      linenumber: integer, the line number to place the zero of the
         walker at initially on the page. It must be between 0 and
         linesperpage - 1. Counting begins at zero on the bottom of
         the page.
      coornumber: integer, the coordinate number to place the zero of
         the walker at initially. If this number is not found in the
         piece coordinate system, the walker will be placed at the
         beginning of the sequence when coornumber's value is zero or
         negative and placed at the end of the sequence when
         coornumber's value is positive.
      pagewidth: real, the width of the lines of sequence in cm.
      pageheight: real, the height of the lines of sequence in cm.
      pagex: real, the x coordinate of the page lower left corner in
         cm.
      pagey: real, the y coordinate of the page lower left corner in
         cm.
      lowerbound: real < 0, the lowest Ri(b,l) value in bits that can
         be fully displayed (bases with lower values are clipped and
         have a red line on the bottom).
      boxes: character: if 'b' then the walker characters are
         surrounded by character-boxes as defined below. Otherwise
         the boxes are invisible.
      outofsequence: character: if 'o' then the walker is set next to
         the sequence. Otherwise the walker is in line with the
         sequence. Thanks to Seth Taylor for suggesting this option
         on 1994 November 22.
      ALL LINES FOLLOWING THIS POINT: These are inserted into the
         walk as commands before the initial display.

   walk: A PostScript program that implements the walk. It is to be
      run with ghostscript:

         gs -q walk

      Ghostscript then pops up a graphics window and the user types
      commands to control the display. (The -q just makes ghostscript
      quiet on startup.) The program reports information to the user
      that includes the position, the individual information for the
      current position (Ri, bits) and the Z score for this Ri given
      the mean (Rsequence) and standard deviation of the original
      population of sequences used to create the ribl matrix. When
      the absolute value of the Z score is less than or equal to 2,
      an arrow (<---) indicates that the position is likely to be a
      site. Likewise, when the Ri value is positive, this is
      indicated by plus signs (++++). (The actual test can be set by
      the user.) The user can type '?' or 'help' to get a list of
      commands. These commands are discussed in further detail below.

      NOTE: the Ri evaluation is ONLY for the portion of the walker
      displayed on the screen.

   output: Messages to the user.

description
   This program creates a PostScript program, called the "walk", by
   reformatting the DNA sequences in a Delila book and joining them
   to the ribl matrix. The user then runs the "walk" using the
   interactive PostScript interpreter ghostscript. Within the
   ghostscript graphic page appears part or all of the sequence(s) in
   the book. The majority of the letters are black, but a portion are
   in color. The colored letters correspond to the evaluation of
   those bases by the Ri(b,l) matrix read from the ribl file.
   The height of each letter is proportional to its weight in the
   matrix. Thus the user can immediately see the components of the
   weight matrix as applied to the particular sequence. The user may
   then type commands to move the evaluated region around. The user
   literally walks the evaluation across the sequence, and thereby
   gains a sense of the reaction of each part of the recognizer to
   each part of the sequence.

   GENERAL SCHEME OF A WALKER PAGE

   A walker page consists of a rectangular array of character boxes:

       <------------- basesperline ------------>  (10 in this case)
         0   1   2   3   4   5   6   7   8   9
   ^   -----------------------------------------  ^
   p | |152|153|154|155|156|157|158|159|1  |2  |  |
   a | |   |   |   |   |   |   |   |   |   |   |  2
   g | |   |   |   |   |   |   |   |   |   |   |  |
   e | |   |   |   |   |   |   |   |   |   |   |  |
   h | -----------------------------------------  |
   e | |3  |4  |5  |6  |7  |8  |9  |10 |11 |12 |  |
   i | |   |   |   |   |   |   |   |   |   |   |  1  linesperpage
   g | |   |   |   |   |   | ! |   |   |   |   |  |  (3 in this case)
   h | |   |   |   |   |   |   |   |   |   |   |  |
   t | -----------------------------------------  |
     | |13 |14 |15 |16 |17 |18 |19 |20 |21 |22 |  |
   ( | |   |   |   |   |   |   |   |   |   |   |  0
   c | |   |   |   |   |   |   |   |   |   |   |  |
   m | |   |   |   |   |   |   |   |   |   |   |  |
   ) v *----------------------------------------  v
       *
       * <----------- pagewidth (cm) ------------>
       *
       **** lower left hand corner is at pagex horizontal (cm) and
            pagey vertical (cm) on the page, starting from the
            PostScript default zero coordinate.

   The "!" is at basenumber = 5, linenumber = 1, coornumber = 8

   All the parameters: basenumber, linenumber, coornumber,
   basesperline, linesperpage, pageheight, pagex and pagey are
   defined independently. The physical positioning parameters pagex,
   pagey, pagewidth and pageheight determine where the entire set of
   character boxes is placed on the page. Each character box size is
   determined by the basesperline and linesperpage so that the
   required number fit the defined area of the page. The zerobase of
   the walker is set initially at the coordinate given by basenumber
   and linenumber.
   The coordinates of the bases for the rest of the sequence are
   determined by the coordinate of the zerobase of the walker. Note
   that the coordinate system in the example above represents a
   fragment of a circular DNA, with coordinates running from 152 up
   to 159, followed by a jump to the start of numbering at 1 and then
   proceeding up to 22. (These kinds of coordinates can be generated
   and handled by Delila programs.)

   GENERAL SCHEME OF A WALKER CHARACTER BOX

   +---+ <-- 2 bits per base
   |   |
   |---| <-- 0 bits per base
   |   |
   |   |
   |   |
   +---+ <-- lowerbound bits per base

   The box has a part above zero in which letters appear upright and
   a part below zero in which the letters appear rotated 180 degrees
   if they are within the evaluated region, or black and upright if
   they are outside it.

   If the walker is out of the sequence, then a gap of height 1 bit
   is created just above the 2 bits mark. The sequence is put there.
   The rest of the character box is scaled accordingly.

   Bases which have positive Ri(b,l) values run upward from 0 to 2
   bits, those that have a negative value run downward. If a base
   evaluates to a number of bits lower than lowerbound, it will be
   drawn downward but any amount below lowerbound is cut off. To
   indicate this situation, the background becomes purple. If the
   base has a value less than -log2(n) bits (where n is the number of
   sequences used to make the ribl model), it is considered to be
   negative infinity, and the background becomes black.

   COMMANDS

   When the walk program is run in GhostView, the user can control
   the display by means of typed commands. These commands are built
   from PostScript procedures. This means that any arguments must be
   given before the command itself. This may feel a little strange at
   first, but it is easy to get used to. For example, to go to
   location 132, the user types:

      132 goto<cr>

   where <cr> is a carriage return.

   # means that the command is preceded by a number.
   * means not implemented yet

   Movement Commands: These commands affect the direction that the
   walker or the sequence moves. Which moves depends on the w
   command. The commands are the same as those of the Unix editor vi.

   # h: move left on the page (# is optional)
   # j: move down on the page (# is optional)
   # k: move up on the page (# is optional)
   # l: move right on the page (# is optional)

   Move commands may have an integer in front which says how many
   times to move. The program will repeat the command.

   * n: next sequence
   * p: previous sequence
   w: A toggle between two states: the walker moves along the
      stationary sequence, or the sequence moves along the stationary
      walker.
   q: quit
   ?: help message
   r: Refresh the page.
   R: restore or restart ghostscript on the current walk file. This
      allows one to start over or to modify the walk and restart
      without quitting ghostscript. The modification could be done by
      the makewalker program, by hand-editing or by another program.
   cl: clear the ghostscript command screen.
   # A,C,G,T: Mutate the given absolute location to the desired base.
      For example, to set base 100 to be an "A", type "100 A".
   # a,c,g,t: Mutate the given relative location to the desired base.
      The location is relative to the current position of the walker.
      For example, to set the base 10 to the left of the walker zero
      to be an "a", type "-10 a".
   # setwait: set the wait time in seconds after display (starts at
      zero)
   # isasecond: set the number of {1 pop} cycles per second. This
      depends on how fast your computer is and should be adjusted.
   # goto: Type a coordinate and then "goto". For example, to get to
      coordinate 100 type "100 goto". The zero base of the walker
      will be set to the coordinate.
   # invert: invert the Ribl matrix. This is only useful if you have
      an asymmetric binding site.
   # jump: Like goto except one gives the relative number of bases to
      move. For example, to move 5 bases in the 5' direction, type
      "-5 jump".
      The zero base of the walker will be set to the new coordinate.
   boxes: toggle between having boxes and not. These are mostly
      helpful for seeing where things are on the page.
   # lines: Set the number of lines per page, e.g. type "3 lines".
   # bases: Set the number of bases per line, e.g. type "30 bases".
      ("wide" can also be used)
   # left, right, up, down: move the graphic on the page in units of
      cm. Example: "0.5 right" moves the graphic right half a cm.
   # height, width: set the page height or width in cm.
   in: Put the walker into the sequence.
   out: Put the walker out of the sequence.
   # wave: define the base at which the low point of the cosine wave
      is set. Example: "5 wave" puts the low point at base +5.
   waveon: Turns on drawing the wave.
   waveoff: Turns off drawing the wave.
   toggleprinting or tp: a toggle that turns printing on and off.
      This allows one to give several commands without seeing the
      display change. Turning printing on automatically causes a
      display. NOTE: printing is initially off to allow displays to
      be created without showing anything. It may be turned on as the
      first user command following the other makewalkerp parameters.
   toggleerase or te: a toggle that turns erasing the page on and
      off. In conjunction with the toggleprinting command this allows
      one to display several walkers on a page for making a figure.
   togglereport or tr: a toggle that turns reports to output on and
      off. If it is placed as the first user defined command in the
      makewalkerp, then there will be no output messages and
      ghostview will not put up a display message. This is useful for
      embedding in another figure.
   # from: change FROM range of the matrix to use
   # to: change TO range of the matrix to use
   help: help message
   # setri: set minimum Ri for searching and display
   # setz: set minimum Z for searching and display
   # f: search forward to next site which fits search criteria
   # b: search backward to next site which fits search criteria

   TO MAKE PRINTOUTS

   The walker is interactive, which means that the PostScript
   showpage function is not called, since it would pause the screen
   and then wipe out the display at every command. However, printers
   require showpage, and if it is not included they won't print
   anything: they will spend a few minutes rendering the page and
   then nothing will come out! To make printouts, attach:

      gsave showpage grestore

   to the end of the walk file. The gsave/grestore ensure that the
   graphics state is not lost during the showpage. You can put any
   commands you like in front of the showpage:

      180 goto boxes out showpage

   This allows one to set up the page as desired.

   TO IMBED IN FIGURES

   In addition to the note above about showpage, the walk file
   contains commands that translate the image. To prevent these from
   affecting the surrounding PostScript, they must be enclosed in a
   gsave-grestore pair. The gsave is provided at the start of the
   walk file. The grestore is provided by the q command.

   Commands can be put at the end of the parameter (makewalkerp)
   file. The command toggleprint is called before and after these
   commands, so the commands are normally not seen. If you surround
   your commands with calls to toggleprint, you will see a movie of
   the actions taken.

   The command toggleerase allows one to draw several walkers on a
   page, merely by preventing the previously drawn one from being
   erased. However, if a figure is imbedded into an AdobeIllustrator
   figure and toggleerase is called when printing is active, this
   action may wipe out other parts of the figure.
   This can be prevented by turning off the erase with toggleerase
   before turning on the printing with toggleprint.

   If the command togglereport is the first command, then the
   messages sent to standard output, which appear on the ghostscript
   control window, are all suppressed (errors are still reported).
   This prevents a display window from popping up in ghostview.

   This is an example of what to add to the end of the makewalkerp to
   make a figure:

      togglereport    % turn off messages to output
      waveoff 5 up    % do some things silently
      toggleerase     % do this before the toggleprint
      toggleprint     % turn on printing
      6 down l        % jump around
      toggleprinting toggleprinting % force printing
      6 down l        % jump around
      toggleprinting toggleprinting % force printing
      showpage

   Do not use copypage for figures as this halts the display.

   ACKNOWLEDGMENTS

   I thank Seth Taylor for suggesting the mode for the walker being
   outside the sequence, Paul Hengen for suggesting the cosine wave
   applied to the letters and Denise Rubens for suggesting the
   mutation function.

examples

   -10 rangefrom: integer, FROM of the ribl matrix to use
   +10 rangeto: integer, TO of the ribl matrix to use
   50 basesperline: integer, number of bases per line to display.
   3 linesperpage: integer, number of lines per page to display.
   20 basenumber: integer, the base on the line to place the zero of the walker
   1 0 linenumber: integer, the line number to place the zero of the walker
   132 coornumber: integer, the coordinate number to place the walker zero
   18.5 pagewidth: real, the width of the lines of sequence in cm.
   24.9 pageheight: real, the height of the lines of sequence in cm.
   1.5 pagex: real, the x coordinate of the page lower left corner in cm.
   1.5 pagey: real, the y coordinate of the page lower left corner in cm.
   -4 lowerbound: real < 0, the lowest Ri(b,l) value in bits displayed
   nb boxes: b: boxes around each character
   io insequence: i: in the sequence, else out
   % all lines from this point on are PostScript commands
   % The "%" makes a comment
   % makewalkerp: parameters for makewalker 3.03 and higher
   % The following commands make a picture of 2 walkers
   %
   waveoff       % turn off waves
   1 lines       % display only one line
   10 up         % move 10 cm up
   5 height      % make the line only 5 cm high
   44 wide       % show 44 characters across
   w 5 h w       % move the sequence 5 positions left
   132 goto      % put the walker in a new spot
   toggleprinting toggleprinting % force printing
   toggleerase   % prevent erasing during the next steps
   6 down        % jump 6 cm down
   143 goto      % put the walker in a new spot
   toggleprinting toggleprinting % force printing
   % gsave showpage grestore % uncomment this command if you send this to a printer!

documentation
   Ghostscript documentation can be found from:
   http://www.cs.wisc.edu/~ghost/index.html

see also
   delila.p, makelogo.p, ri.p, scan.p, dnaplot.p

author
   Thomas Dana Schneider

bugs
   Known Bugs:

   Only one sequence is loaded from the book.

   With parameter for 3 lines, reset to 1 line puts the entire
   display too low. Yet starting with 1 line it's ok. Some global
   parameter is not being set in definepageparameters. (Same thing:
   When there is one line per page the position is too low, one needs
   to use (e.g.) "5 up".)

   180 goto 1 goto - it doesn't erase old stuff to left!

   Something uses up virtual memory every time the walker takes a
   step. Eventually this causes an error and GhostScript dies:

      Error: /VMerror in --charpath--
      VM status: 0 16061098 16168018
      Current file position is 5
      XIO: fatal IO error 12 (Not enough memory) on X server ":0.0"
      after 47675 requests (45252 known processed) with 2497 events
      remaining.

   Why?

   When number of lines per page is changed, the cosine wave height
   does not change correctly, often being too small. (Apparently
   fixed.)
   The display glitches sometimes by leaving behind pieces that
   should get erased. This occurs when numbers are being displayed
   that don't fit into the available area and get clipped. A relevant
   location in the code is in the routine displaywalker at:
   "white 0 0 charbox fill"
   A replacement:
   "0 0 charbox clip erasepage initclip"
   does not help. Perhaps this is the wrong part of the code. It is
   also possible that the problem is in ghostscript. The effect
   sometimes occurs as one is moving the walker around.

   Letters that are drawn that go below the lower bound don't get
   clipped properly, they leave a slight edge there.

   Range checking does not work properly. If the ribl has a range
   from -100 to +99, then a request for -99 to +100 bombs. This
   should be caught in walker.

   Perhaps there should be a function that automatically defines the
   lower bound in bits so that the user does not need to figure this
   out.

   Resetting lower bound messes up the display!

   f (and probably b) searches don't work when the display is toggled
   off. Fortunately this is easy to get around: just determine the
   locations and use goto.

   If one has a small sequence, visible on the screen, and then sets
   the move mode to move the sequence with the walker steady (i.e.
   use the w toggle), then when the end of the sequence moves in, the
   last character is not removed, so there are repeating bases on the
   end.

technical notes
   Note: encapsulation of the figure requires a gsave and a grestore
   to surround the walk code to undo the translation to the
   basenumber = 0, linenumber = 0 coordinate and any other
   translations done by commands.

   No showpage is provided, since this does not help during
   interactive graphics. Worse, ghostscript pauses at every showpage
   or copypage, saying:
   ">>copypage, press <return> to continue<<"
   So the user would be forced to type extra carriage returns for
   every command. If a showpage is needed for making a printout, it
   must be added later as "gsave showpage grestore".
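
   The note above about adding "gsave showpage grestore" later can be
   automated. The following is only a minimal sketch in Python (not
   part of makewalker); the function name add_showpage and the
   constant PRINT_LINE are hypothetical, and it assumes the walk file
   is ordinary text that can safely take one more line at its end.

```python
# Hypothetical helper, not part of makewalker: append the printing line
# recommended in the manual to the text of a walk file.
PRINT_LINE = "gsave showpage grestore"


def add_showpage(walk_text: str) -> str:
    """Return walk_text with the printing line appended exactly once."""
    body = walk_text.rstrip("\n")
    if body.endswith(PRINT_LINE):
        # Already prepared for a printer: only normalize the final newline.
        return body + "\n"
    return body + "\n" + PRINT_LINE + "\n"


if __name__ == "__main__":
    # Tiny stand-in for the tail of a walk file.
    print(add_showpage("180 goto\nboxes\nout\n"), end="")
```

   In practice one would read the walk file, pass its text through
   add_showpage and write the result to a separate copy, so that the
   interactive walk file stays free of showpage.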

   isasecond is a global constant that defines the number of {1 pop}
   operations that the display can run through in 1 second. This must
   be determined for each computer.

   The bounding box for EPS is defined in the constants.

*)
(* end module describe.makewalker *)
{This manual page was created by makman 1.45}{created by htmlink 1.62}
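
The report markers described for the walk file (an arrow when the absolute Z score is at most 2, plus signs when Ri is positive) can be illustrated with a short Python sketch. This is not the code of the walk program: the function names, the zmax parameter and the numeric values are hypothetical, and the Z score is assumed to be the usual standard score, (Ri - Rsequence) / standard deviation. The manual notes that the actual test can be set by the user.

```python
# Illustrative sketch only; names and numbers are assumptions, not
# taken from makewalker or the walk file.

def z_score(ri: float, rsequence: float, sd: float) -> float:
    """Standard score of one Ri value against the site population."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (ri - rsequence) / sd


def report_marks(ri: float, rsequence: float, sd: float,
                 zmax: float = 2.0) -> list[str]:
    """Return the markers the manual describes for one position."""
    marks = []
    if abs(z_score(ri, rsequence, sd)) <= zmax:
        marks.append("<---")   # position is likely to be a site
    if ri > 0:
        marks.append("++++")   # positive individual information
    return marks


if __name__ == "__main__":
    # Hypothetical values: Ri = 10 bits, Rsequence = 8 bits, sd = 2 bits.
    print(report_marks(10.0, 8.0, 2.0))
```

With the hypothetical values shown, the Z score is 1, so both markers are returned; a position with Ri = -1 bit against the same population has a Z score of -4.5 and gets neither.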