
WatCut: An on-line tool for restriction analysis, silent mutation scanning, and SNP-RFLP analysis



Help topics:

Troubleshooting, Overview, Restriction analysis, Silent mutation analysis, SNP-RFLP analysis, Selecting enzymes, Preferences, Saving and deleting data, How to, FAQ, Credits


Troubleshooting

Several people have reported errors with 'printer-friendly output' or 'select reading frames', and the error might surface in other contexts as well. The error message they received was

Warning: extract(): First argument should be an array in (...)\class_restriction_project.php on line 130

In at least one case, this has been resolved by clearing all WatCut cookies from the browser. (Note: Clearing the cookies will cause you to lose all data previously saved on the server.) If this does not resolve your problem, please let me know.



Overview

WatCut offers several analysis tools related to the use of restriction enzymes in cloning experiments. These are:

Silent mutation analysis is intended for use with oligonucleotide sequences. It starts from a DNA (not protein) sequence, for which you also need to indicate the reading frame of protein translation. WatCut then scans your sequence for restriction sites that can be introduced without changing the encoded protein sequence. It will find sites created by any number of mutations, with both non-degenerate and degenerate recognition sequences - that is, it will find all possible sites.

Restriction analysis works very much as usual. WatCut can display the results in a graphical format, as a plain table, or in a complete textual format along with the DNA and translated protein sequences. All of these displays can also be formatted for printing. Sequences may be supplied by file-upload or by copy-and-paste.

SNP-RFLP analysis scans a sequence that contains a single nucleotide polymorphism for existing or possible mutant restriction sites that will distinguish between the two polymorphic nucleotides.

WatCut knows and uses all commercially available type II restriction enzymes. You may

Enzyme sets may also be saved directly from within the displayed results, and the saved sets applied in further analyses. Once you wrap your mind around this, you will see that you can use this feature to very quickly shortlist your options for a cloning experiment. The How to section of this page gives several examples.

Apart from enzyme sets, you may also customize the screen display and print versions of your results. This is described in the 'Preferences' section below.

WatCut uses information available from Rebase. All enzymes listed on WatCut's pages will link to their reference pages at Rebase. Follow these links for easy access to information on recognition sequences, methylation sensitivity, commercial availability, and isoschizomers.

WatCut remembers you by using cookies, and it uses JavaScript on some of its pages to make their usage more smooth and pleasant. It is suggested that you turn on these features in your browser. WatCut will still be usable without JavaScript. Without cookies, however, it will not remember your previous projects, enzyme sets and display settings once you leave this site.

WatCut is a hobby of mine and only receives intermittent attention, so it may still have some bugs and problems. If you identify a problem, please report it using the E-mail address at the bottom of this page. This will help me to make WatCut more useful and reliable. Thank you!



Restriction analysis


This function is available through the 'Restriction analysis' entry in the main menu (and, in fact, right from the first page you see upon visiting WatCut). It lets you analyze DNA sequences up to 50 kb in length. Analysis is performed with all commercially available type II restriction enzymes, as listed by Rebase.
Note that WatCut does not presently care about circular sequences. This means that it will not find, e.g., the EcoRI site in pBR322, as this site is split into halves located at the two ends of the sequence. (Proper treatment of circular sequences will be added in the future.) Another quirk is that the base counting is 0-based, which means that a sequence of 1000 bases will be numbered 0 - 999.
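
Both quirks can be illustrated with a minimal Python sketch. This is not WatCut's code (WatCut is written in PHP); `find_sites` is a hypothetical helper, and the short sequences are made up for illustration.

```python
def find_sites(seq, site):
    """Return the 0-based start positions of `site` in a linear sequence."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


ECORI = "GAATTC"

# Linear sequence: the first base is numbered 0, so the site starts at 4.
linear = "ACGTGAATTCACGT"
print(find_sites(linear, ECORI))   # [4]

# A circular molecule written so that the site straddles the two ends.
# Read as a linear string, it contains no complete GAATTC.
split = "TTCACGTACGTGAA"
print(find_sites(split, ECORI))    # []

# Appending the first len(site) - 1 bases emulates the circular case.
wrapped = split + split[:len(ECORI) - 1]
print(find_sites(wrapped, ECORI))  # [11]
```

If you need to check a circular sequence today, pasting it with the first few bases repeated at the end, as in the last step, is one possible workaround.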

Acceptable file / data formats

WatCut will extract DNA sequences from raw text, fasta, and GenBank formats. It will work with other file formats as well if

  1. the DNA sequence is separated from other information in the file by line breaks,
  2. the DNA sequence section of the file is not interrupted by anything other than spaces, tabs, line breaks, page breaks, and digits.

Note that, besides A, C, G, and T, N is accepted as a 'nucleotide' for restriction analysis (but not for silent mutation analysis). This lets you work with crude sequencing results. However, it will of course limit the accuracy of the analysis in those regions that are polluted with Ns. This will affect both the detection of restriction sites and of open reading frames. (WatCut will translate a codon that contains an 'N' as '???' but will allow open reading frames to continue.)
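
The second condition above amounts to a simple clean-up step, sketched here in Python. This is an assumption about the general idea, not WatCut's actual parser; `clean_sequence` is a hypothetical name, and the sketch only handles a block of text that already contains nothing but the sequence section.

```python
import re


def clean_sequence(text):
    """Strip spaces, tabs, line/page breaks and digits, then validate.

    Accepts A, C, G, T and N, as described for restriction analysis.
    """
    # \s covers spaces, tabs, line breaks and form feeds (page breaks).
    stripped = re.sub(r"[\s\d]+", "", text).upper()
    unexpected = sorted(set(stripped) - set("ACGTN"))
    if unexpected:
        raise ValueError("unexpected characters: " + "".join(unexpected))
    return stripped


# Sequence lines in the style of a GenBank record (made-up data).
block = """
        1 acgtgaattc acgtnnacgt
       21 ttga
"""
print(clean_sequence(block))  # ACGTGAATTCACGTNNACGTTTGA
```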

Display of restriction analysis results

Upon submission of a new sequence (or selecting an old project) from the start page, you will be taken to the results display page. Here, you may apply a variety of settings, which are put into effect with the 'Update display' button. These settings are:

In addition, from this page you also may access the 'printer-friendly' version of the results. Note, however, that clicking this link always gives you the print version of the current state of the display. Thus, if you want to change the current settings, hit 'Update display' first to put your changes into effect, and then click 'Printer-friendly version'.

Display formats for restriction analysis

The results of the restriction analysis can be displayed in various ways:

To switch between the display formats, check the appropriate radio button on the Results page, and hit 'Update display'. All display formats can also be formatted for printing. Note, however, that the corresponding link on the 'Results' page gives you the print version of the current state of the display. Therefore, if you want to change the display, click 'Update display' first and then 'Printer-friendly version'.

Graphical display

In this mode, you will see a map of the sequence with the open reading frames (ORF's) colorized. Below, you will see the current selection of restriction enzymes, with the cleavage sites represented by vertical bars and projected onto the ORF map. Note that this display is character-based and therefore somewhat coarse; don't print and use it as a yardstick for restriction mapping in forensic applications.

Clicking on the ORF map will display the start and end bases of ORF's and intervening regions in the 'Show sequence' fields at the top of the page. Click on two regions in the map to span the sequence segment they delimit; click twice on one region to see both its start and end. Click 'Update display' to zoom in to sequence segment thus defined; check 'entire sequence' and click 'Update display' again to revert to the display of the whole sequence.

Mousing over an ORF in the map will also display its pertinent information in the status bar (at the bottom of the page).
Note that in the graphical mode the display of overlapping ORF's is limited to two. Any further overlapping ones get kicked out without further notice (unfairly enough, the smallest ones first). Display of ORF's is further restricted by display resolution - if you have a very long sequence, ORF's will be excluded if they span less than the equivalent of one character.

Table display

This is a plain-vanilla enumeration of the cleavage sites. Nothing much about it, except that it may save some trees when printing the output.

Display with sequence

In this mode, you can see all cleavage sites next to the DNA (and protein) sequence. The actual cleavage position is in front of the enzyme name, like so:

             EcoRI                           EcoRI
      ..ACGTGAATTCACGT..     ->      ..ACGTG AATTCACGT..

Enzymes are displayed with checkboxes and links as long as the data set doesn't get overly big; beyond a certain limit, display will be simplified in the same way as for print output, and saving of enzyme sets will no longer be possible.
ORF's are numbered 1 to 3 to indicate translation in the forward, or -1 to -3 in the reverse direction. In contrast to the graphical mode, overlapping ORF's are completely displayed.
Note that, in the 'sequence' display mode, you will not see enzymes that don't cut. Use 'table' or 'graphical' mode if you need those.

Saving enzyme sets from a display of restriction results

First, click all the appropriate checkboxes (or 'check all' as a shortcut). Enter a name for your set, and hit 'Save'. If you don't give a name, the default name 'my_working_set' will be used. WatCut will overwrite 'my_working_set' each time you don't specify a name. Therefore, going with 'my_working_set' will avoid cluttering up your account if you do not want to permanently save your enzyme sets.



Silent mutation analysis

This function lets you find mutations in your oligo sequence that create novel restriction sites for any commercially available type II restriction enzyme. WatCut will

Start by clicking on 'Silent mutation analysis' in the main link menu. Type or paste in your oligo sequence. 'N' is not allowed here - it will plainly be stripped out, as will any other unsuitable input. Give your oligo a name - or it will be christened 'my_working_oligo', which sounds silly, doesn't it? Click 'Submit sequence' to proceed to choosing the correct reading frame, and finally click 'Analyze'. This will take you to the 'Results' page, where you will have the following options:

Use the 'Select enzymes' link on the top of the page to change the subset of restriction enzymes to be included in the analysis.
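
As a rough illustration of what such a scan involves, the following Python sketch tries every single-base substitution in an oligo and keeps those that create a given recognition sequence without changing the translation. It is not WatCut's implementation, which handles any number of mutations as well as degenerate recognition sequences. The function names and the example oligo are made up, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate complete codons starting at offset `frame` (0, 1 or 2)."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(frame, len(seq) - 2, 3))


def silent_single_mutants(seq, site, frame=0):
    """List (position, new_base, mutant) for silent changes that create `site`."""
    seq = seq.upper()
    protein = translate(seq, frame)
    found = []
    if site in seq:          # site already present, nothing to create
        return found
    for pos, old in enumerate(seq):
        for new in "ACGT":
            if new == old:
                continue
            mutant = seq[:pos] + new + seq[pos + 1:]
            if site in mutant and translate(mutant, frame) == protein:
                found.append((pos, new, mutant))
    return found


# ATG GAG TTC AAA encodes M-E-F-K; GAG -> GAA keeps Glu and creates GAATTC.
print(silent_single_mutants("ATGGAGTTCAAA", "GAATTC"))
# [(5, 'A', 'ATGGAATTCAAA')]
```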

Next to the sequences of your mutant oligos, you will see the numbers of mutations, and two melting temperatures. The left one ('Tmtemplate') factors in the effect of the mutations - use this one as an estimate of the oligo's behaviour in hybridization experiments, and in the first round of PCR. The second one ('Tmself') neglects the mutations. This should be a more accurate estimate of the oligo's behaviour in subsequent rounds of PCR, because the oligo will then become its own template. You might consider both temperatures if you are fond of writing sophisticated PCR protocols.

Note that both these values are approximations. The formula used is:

Tm = 64.9 + 41 * ([number of G's and C's] - 16.4) / [total number of bases] - 100 * [number of base changes] / [total number of bases]

There are better algorithms. One is implemented at IDT's oligo analyzer (this link seems to be dead - if you know a good website for this, please tell me). I found the values to agree within ± 2 °C.
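
For reference, the approximation given above can be written as a few lines of Python. The function name `melting_temps` and the example oligo are made up for illustration; the assumption is that 'Tmself' is the same formula with the number of base changes set to zero.

```python
def melting_temps(oligo, n_changes):
    """Return (Tm_template, Tm_self) in degrees Celsius.

    Tm = 64.9 + 41 * (GC - 16.4) / N - 100 * changes / N
    """
    seq = oligo.upper()
    n = len(seq)
    gc = seq.count("G") + seq.count("C")
    tm_self = 64.9 + 41 * (gc - 16.4) / n          # mutations neglected
    tm_template = tm_self - 100 * n_changes / n    # mutations factored in
    return tm_template, tm_self


# A 20-mer with 10 G/C bases and one base change.
tm_template, tm_self = melting_temps("ACGTACGTACGTACGTACGT", 1)
print(round(tm_template, 2), round(tm_self, 2))  # 46.78 51.78
```

In this example the single mismatch in a 20-mer lowers the estimate by 100 / 20 = 5 degrees.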



SNP-RFLP analysis

SNP (single nucleotide polymorphism) can be detected by restriction analysis (or 'restriction fragment length polymorphism' - RFLP), if only one of the two variant bases creates a cleavage site for a restriction enzyme. For example, if we have a polymorphic site

    ATCGAA[C/T]TCAA

then only

    ATCGAATTCAA

but not

    ATCGAACTCAA

will be cleaved by EcoRI.

In addition to naturally occurring sites, useful sites may also be created by introducing mutations close to the variant position. E.g., if the wild type sequence were

    ATCTAA[C/T]TCAA

neither variant would be cleaved by EcoRI. However, we could use a mutant primer for amplification of the SNP sequence, like so:

    primer:    5'-.......ATCGAA-3'
    template:  5'-.......ATCTAA[C/T]TCAA..............

Since the polymorphic site is not covered by the primer, the natural variants will be preserved in the PCR product. The T→G mutation will be present in both variants and produce a discriminating EcoRI site. This is known as an 'amplification-created restriction site'.
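
The EcoRI examples above can be checked with a short Python sketch. The helper names `alleles` and `discriminates` are hypothetical, and the sketch only handles a single non-degenerate recognition sequence on the given strand.

```python
ECORI = "GAATTC"


def alleles(snp_seq):
    """Expand 'ATCGAA[C/T]TCAA' into its two allele sequences."""
    left, rest = snp_seq.split("[")
    variants, right = rest.split("]")
    first, second = variants.split("/")
    return left + first + right, left + second + right


def discriminates(snp_seq, site=ECORI):
    """True if exactly one of the two alleles contains the site."""
    first, second = alleles(snp_seq)
    return (site in first) != (site in second)


# Naturally occurring site: only the T allele is cleaved.
print(alleles("ATCGAA[C/T]TCAA"))        # ('ATCGAACTCAA', 'ATCGAATTCAA')
print(discriminates("ATCGAA[C/T]TCAA"))  # True

# Wild type without a usable site: neither allele is cleaved.
print(discriminates("ATCTAA[C/T]TCAA"))  # False

# Amplification-created site: the primer carries the T->G change and
# stops just before the polymorphic position.
primer = "ATCGAA"
template = "ATCTAA[C/T]TCAA"
product = primer + template[len(primer):]
print(product)                           # ATCGAA[C/T]TCAA
print(discriminates(product))            # True
```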

WatCut will work both with 'true' SNP (two allelic bases at one position, as in the above example) and with 'false' SNP, in which one or more nucleotides are missing from one of the alleles, e.g.

    ATCTAA[-/AT]TCAA

The maximum number of missing / inserted bases is 50 - which is way more than required, since one can obviously detect such large differences in size directly by electrophoresis. However, it will not work with more than two alleles; higher numbers of alleles are not very common and are not as readily amenable to restriction analysis.

The input format matches that of the NCBI SNP database. To indicate a 'true' SNP, you can either use [A/G] notation, or the corresponding binary degeneracy code (R for A/G, Y for C/T, W for A/T and so on - this is used in the 'Fasta' format of the NCBI database).

Of course, for 'false' SNP, only the first notation will do. (The NCBI Fasta format uses a single 'N' for false SNPs of any kind, which is of course ambiguous and cannot be processed.) Please give the dash (representing the void) first, and then one or more inserted bases, e.g. [-/AT].
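
To make the accepted notations concrete, here is a hedged Python sketch of a parser for them. It is not WatCut's code; `parse_snp` is a hypothetical name, and the sketch assumes exactly one polymorphic position per input and uses the standard IUPAC two-base codes.

```python
import re

# IUPAC codes that stand for exactly two bases.
BINARY_CODES = {"R": "AG", "Y": "CT", "W": "AT",
                "S": "CG", "K": "GT", "M": "AC"}

BRACKETED = re.compile(r"([ACGT]*)\[(-|[ACGT]+)/([ACGT]+)\]([ACGT]*)")


def parse_snp(seq):
    """Return (left flank, allele 1, allele 2, right flank).

    A missing allele ('-') is returned as an empty string.
    """
    seq = seq.upper()
    match = BRACKETED.fullmatch(seq)
    if match:
        left, first, second, right = match.groups()
        return left, first.replace("-", ""), second, right
    hits = [i for i, base in enumerate(seq) if base in BINARY_CODES]
    if len(hits) == 1 and set(seq) <= set("ACGT") | set(BINARY_CODES):
        i = hits[0]
        first, second = BINARY_CODES[seq[i]]
        return seq[:i], first, second, seq[i + 1:]
    raise ValueError("expected one SNP in [A/G], [-/AT] or binary-code form")


print(parse_snp("ATCTAA[C/T]TCAA"))   # ('ATCTAA', 'C', 'T', 'TCAA')
print(parse_snp("ATCTAAYTCAA"))       # ('ATCTAA', 'C', 'T', 'TCAA')
print(parse_snp("ATCTAA[-/AT]TCAA"))  # ('ATCTAA', '', 'AT', 'TCAA')
```

A lone 'N' is rejected by this sketch, which mirrors the point that it is ambiguous.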

Alternatively, it is also possible to specify one or more numerical identifiers from the NCBI database. WatCut will then retrieve the sequences automatically.

When staring at the results, note that the numbering of the cleavage positions is relative to the polymorphic position, not the beginning of the sequence. Note also that, in rare cases, one mutation may result in multiple cleavage sites, all of which will be listed.

The following considerations apply to the design and the selection of mutations in the amplification primers:

  1. The fewer mutations, the better. You can limit the number of mutations to scan for; WatCut's default is 1.
  2. Mutations must not be introduced directly adjacent to the polymorphic site, since that would be the 3' end of the oligonucleotide. By default, WatCut will report sites that leave three positions next to the SNP unchanged, but you can change this number by adjusting the (minimum) 'Distance to SNP' setting if you do not receive any hits. Allowing more than one mutation and having a gap of less than 3 base pairs is probably not a good idea.
  3. Enzymes should recognize only the polymorphic site but not any other sites close by, since otherwise both alleles will be cleaved. WatCut will find all enzymes that cleave the sequence outside the polymorphic site, and exclude them from further analysis.
  4. Since the oligo may not span the polymorphic site, multiple mutations must all be on the same side of the SNP position. WatCut will report multiple mutations only if they fulfill this condition.

WatCut will exclude enzymes that have a recognition site within 25 base pairs of the polymorphic site. Cleavage sites outside this interval are not detected or reported; a size difference of ≥ 25 bp should still be detectable on a gel, and exclusion of enzymes outside this interval would be overly restrictive. However, before ordering an enzyme, it is probably a good idea to check the entire intended PCR product for further occurrences of the recognition site to see exactly what you are going to get out of a restriction digest. You can use WatCut's restriction analysis function for that purpose.



Selecting enzymes

This function is accessed through the entry 'Select enzymes' in the main menu. The settings made on this page will affect all functions (restriction, silent mutations, SNP), and they will be remembered between sessions (provided you have cookies enabled). There are two ways of selecting enzymes: (1) applying a filter, and (2) selecting enzymes individually into a set.

A filter combines settings in the following criteria:

Once you are done defining your filter, click 'Select enzymes' to apply your settings to the presently displayed results, or to the next analysis you are going to perform.

To create an enzyme set from scratch or edit a previously defined set, click on 'Edit or create custom set' in the 'Select enzymes' dialog. On the page you will bring up, you may again use a filter to preselect the enzymes listed. If you had an enzyme set selected, you will see that set displayed. Choose 'create new' or any other previously defined set from the dropdown at the top, and proceed to edit. From the preselected list, pick and choose the individual ones and delete others.

Apart from picking individual enzymes, you may also 'include compatibles'. This function lets you expand the currently edited set by including enzymes that match any one of those in the set with respect to the overhangs. E.g., having BamHI (G^GATCC) in the set and choosing 'include compatibles' will cause BglII (A^GATCT), BclI (T^GATCA), Sau3AI (^GATC) and XhoII (R^GATCY) to be co-opted. Note that this happens regardless of current filter settings, and that it will not work for blunt ends (use filtering, not enzyme sets, to find all enzymes that create blunt ends).
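
The notion of compatible overhangs can be sketched in Python as follows. This is an illustration under simplifying assumptions, not WatCut's algorithm: it assumes palindromic recognition sequences (so the cut on the opposite strand mirrors the one shown), compares overhangs as plain strings without resolving degenerate bases inside the overhang, and uses a small hand-picked enzyme table. The names `overhang` and `compatibles` are hypothetical.

```python
# A few enzymes with their recognition sequences; '^' marks the cut.
ENZYMES = {
    "BamHI": "G^GATCC", "BglII": "A^GATCT", "BclI": "T^GATCA",
    "Sau3AI": "^GATC", "XhoII": "R^GATCY", "EcoRI": "G^AATTC",
    "EcoRV": "GAT^ATC", "KpnI": "GGTAC^C", "Acc65I": "G^GTACC",
}


def overhang(site):
    """Return (kind, bases) for a palindromic site such as 'G^GATCC'."""
    cut = site.index("^")
    bases = site.replace("^", "")
    mirror = len(bases) - cut   # cut position on the opposite strand
    if cut < mirror:
        return "5'", bases[cut:mirror]
    if cut > mirror:
        return "3'", bases[mirror:cut]
    return "blunt", ""


def compatibles(name, enzymes=ENZYMES):
    """Names of other enzymes that leave the same sticky end."""
    target = overhang(enzymes[name])
    if target[0] == "blunt":    # blunt ends are deliberately not expanded
        return []
    return sorted(other for other, site in enzymes.items()
                  if other != name and overhang(site) == target)


print(overhang("G^GATCC"))   # ("5'", 'GATC')
print(compatibles("BamHI"))  # ['BclI', 'BglII', 'Sau3AI', 'XhoII']
print(compatibles("EcoRV"))  # []
print(compatibles("KpnI"))   # []
```

KpnI and Acc65I recognize the same sequence, but one leaves a 3' and the other a 5' overhang, which is why the sketch tracks the overhang type as well as its bases.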

Before saving your set, supply a name, or accept the default name 'my_working_set'. Once you save a new set, it will be applied in any subsequent restriction or silent mutation analyses.

Another way to create enzyme sets is to save them from within your displayed results. The How to section of this page gives examples illustrating the usefulness of this feature.

Enzyme sets and filters are essentially independent of each other. However, the 'preferred supplier' filter setting will affect the enzymes displayed with a set. E.g., if you have saved EcoRV into your set and later choose Fermentas as the supplier, your results will display Eco32I instead of EcoRV. This, however, is a transient change, so upon reverting to the default 'None' setting for the supplier you will again see the original contents of your set.



Preferences

You may set several options for both screen display and printed output in order to get your results displayed the way you want. These settings are applied in the 'Preferences' dialog accessible through the main menu.

Printing from HTML has the advantage that some sequence highlighting is preserved. However, page breaks will work only with CSS2-enabled browsers, which excludes Netscape 4 (I am not sure about IE 4). If your browser messes up the page breaks, or if you want to download the results and edit them before printing, choose 'plain text'. The choices of file extensions (.wri, .txt, .doc) do not reflect any real changes in content - they only serve to trick your browser into spawning different applications on download. Page breaks will work with Word and WordPad but not with Notepad - that's why 'txt' is not the default option.



Saving and deleting data with WatCut

This refers to the data stored on the WatCut server, not to the download of your results (which is described in the preceding section). WatCut stores your results, enzyme sets and preferences in a database, along with an identifier code that is also stored on your system in a cookie. If your browser is set to reject cookies, or if you delete them, you will not be able to access your data any more.

WatCut will overwrite old data with new data if you use the same name again, and it will do so without asking for your confirmation. E.g., if you don't supply a name for a new enzyme set, WatCut will overwrite the previous 'my_working_set' with the newly saved one. In this way, you won't clutter up your account with enzyme sets that are only transiently needed. On the other hand, it's probably not a good idea for sets you want to use repeatedly, e.g., a set representing everything in your freezer, so be sure to supply names for manually defined enzyme sets.

If you upload sequence files for restriction analysis and do not at that time supply a name, WatCut will use the file name instead.

To delete old data: 'Preferences' → 'Clean up old projects'. You will see a list with all the data in your account.



How to

The following examples will show you how to get more out of WatCut and to quickly shortlist your options for a cloning experiment. You will note that enzyme sets are very versatile and key in most tasks.



FAQ



Credits

WatCut was written entirely in PHP (although I would now do it in Python) and uses a MySQL database.
The SNP function was suggested by Alex Postma. Several people helped me by finding and reporting bugs, and by proposing improvements. Thanks!




Site created and maintained by Michael Palmer, University of Waterloo, Ontario, Canada. Questions / Problems? Send email.
Page last modified: April 17, 2014. Enzymes as of April 2014, Rebase.