{
"tool": [
{
"name": "abyss-pe",
"display_name": "Abyss Paired-End",
"version": "1.0.0",
"summary": "\n**What it does**\n\nABySS is a de novo, paired-end sequence assembler that is designed for short reads. \n\n**Input**\n\nThe suffix of the read identifier for a pair of reads must be one of '1' and '2', or 'A' and 'B', or 'F' and 'R', or 'F3' and 'R3', or 'forward' and 'reverse'. The reads may be interleaved in the same file or found in different files; however, interleaved mates will use less memory.\n\n**Description**\n\nThis tool performs the following commands:\n\nABYSS - the single-end assembler\nAdjList - finds overlaps of length k-1 between contigs\nKAligner** - aligns reads to contigs\nParseAligns** - finds pairs of reads in alignments\nDistanceEst** - estimates distances between contigs\nOverlap - find overlaps between blunt contigs\nSimpleGraph - finds paths between pairs of contigs\nMergePaths - merges consistent paths\nConsensus - for a colour-space assembly, convert the colour-space contigs to nucleotide contigs\n\n**Reference**\n\nhttp://www.bcgsc.ca/platform/bioinfo/software/abyss\n",
"description": "Assemble short paired reads",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "abyss",
"display_name": "Abyss",
"version": "1.0.0",
"summary": "\n**What it does**\n\nABySS is a de novo sequence assembler that is designed for short reads.\n\n.. image:: http://www.bcgsc.ca/platform/bioinfo/software/abyss/screenshot\n\n**Reference**\n\nhttp://www.bcgsc.ca/platform/bioinfo/software/abyss\n",
"description": "Assemble short unpaired reads",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "WigAdd",
"display_name": "Add",
"version": "1.1.0",
"summary": "\n \nThis tool will add all values in the specified Wig files base pair by base pair.\n \n.. class:: infomark\n\n**TIP:** If your dataset does not appear in the pulldown menu, it means that it is not in Wig or BigWig format. Use \"edit attributes\" to set the correct format if it was not detected correctly.\n \n",
"description": "multiple (Big)Wig files",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "gd_add_fst_column",
"display_name": "Per-SNP FSTs",
"version": "1.2.0",
"summary": "\n\n**Dataset formats**\n\nThe input datasets are in gd_snp_, gd_genotype_, and gd_indivs_ formats.\nThe output dataset is in gd_snp_ or gd_genotype_ format. (`Dataset missing?`_)\n\n.. _gd_snp: ./static/formatHelp.html#gd_snp\n.. _gd_genotype: ./static/formatHelp.html#gd_genotype\n.. _gd_indivs: ./static/formatHelp.html#gd_indivs\n.. _Dataset missing?: ./static/formatHelp.html\n\n-----\n\n**What it does**\n\nThe user specifies a SNP table and two \"populations\" of individuals, both previously defined using the Galaxy tool to specify individuals from a SNP table. No individual can be in both populations. Other choices are as follows.\n\nFrequency metric. The allele frequencies of a SNP in the two populations can be estimated either by the total number of reads of each allele (if the table is in gd_snp format, but not with gd_genotype), or by adding the frequencies inferred from genotypes of individuals in the populations.\n\nAfter specifying the frequency metric, the user sets lower bounds on amount of data required at a SNP. For estimating the Fst using read counts, the bound is the minimum count of reads of the two alleles in a population. For estimations based on genotype, the bound is the minimum reported genotype quality per individual.\n\nThe user specifies whether the SNPs that violate the lower bound should be ignored or the Fst set to -1.\n\nThe user specifies whether SNPs where both populations appear to be fixed for the same allele should be retained or discarded.\n\nFinally, the user chooses which definition of Fst to use: Wright's original definition, the Weir-Cockerham unbiased estimator, or the Reich-Patterson estimator.\n\nA column is appended to the SNP table giving the Fst for each retained SNP.\n\nReferences:\n\nSewall Wright (1951) The genetical structure of populations. Ann Eugen 15:323-354.\n\nWeir, B.S. and Cockerham, C. Clark (1984) Estimating F-statistics for the analysis of population structure. 
Evolution 38:1358-1370.\n\nWeir, B.S. 1996. Population substructure. Genetic data analysis II, pp. 161-173. Sinauer Associates, Sunderland, MA.\n\nDavid Reich, Kumarasamy Thangaraj, Nick Patterson, Alkes L. Price, and Lalji Singh (2009) Reconstructing Indian population history. Nature 461:489-494, especially Supplement 2. \n\nTheir effectiveness for computing FSTs when there are many SNPs but few individuals is discussed in the following paper.\n\nEva-Maria Willing, Christine Dreyer, Cock van Oosterhout (2012) Estimates of genetic differentiation measured by FST do not necessarily require large sample sizes when using many SNP markers. PLoS One 7:e42649.\n\n-----\n\n**Example**\n\n- input, SNP table::\n\n #{\"column_names\":[\"scaf\",\"pos\",\"A\",\"B\",\"qual\",\"ref\",\"rpos\",\"rnuc\",\"1A\",\"1B\",\"1G\",\"1Q\",\"2A\",\"2B\",\"2G\",\"2Q\",\"3A\",\"3B\",\"3G\",\"3Q\",\"4A\",\"4B\",\"4G\",\"4Q\",\n #\"5A\",\"5B\",\"5G\",\"5Q\",\"6A\",\"6B\",\"6G\",\"6Q\",\"pair\",\"dist\",\"prim\",\"rflp\"],\"dbkey\":\"canFam2\",\n #\"individuals\":[[\"PB1\",9],[\"PB2\",13],[\"PB3\",17],[\"PB4\",21],[\"PB6\",25],[\"PB8\",29]],\n #\"pos\":2,\"rPos\":7,\"ref\":6,\"scaffold\":1,\"species\":\"bear\"}\n Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0\n Contig113_chr5_11052263_11052603 28 C T 38.2 chr5 11052280 C 1 2 1 12 3 2 1 10 5 0 2 42 2 1 2 13 3 0 2 36 8 0 2 51 Y 161 +99. 0\n Contig215_chr5_70946445_70947428 363 T G 28.2 chr5 70946809 C 4 0 2 39 0 5 0 12 9 0 2 54 6 0 2 45 3 3 2 1 9 0 2 54 N 43 0.153 0\n etc.\n\n- input, Population 1 individuals::\n\n 9 PB1\n 13 PB2\n\n- input, Population 2 individuals::\n\n 17 PB3\n 21 PB4\n\n- output (minimum read count of 3, discard fixed)::\n\n Contig113_chr5_11052263_11052603 28 C T 38.2 chr5 11052280 C 1 2 1 12 3 2 1 10 5 0 2 42 2 1 2 13 3 0 2 36 8 0 2 51 Y 161 +99. 
0 0.1636\n Contig215_chr5_70946445_70947428 363 T G 28.2 chr5 70946809 C 4 0 2 39 0 5 0 12 9 0 2 54 6 0 2 45 3 3 2 1 9 0 2 54 N 43 0.153 0 0.3846\n etc.\n\n ",
"description": ": Compute a fixation index score for each SNP",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "add_taxa",
"display_name": "add_taxa",
"version": "1.2.0",
"summary": "\n \n ",
"description": "Add taxa to OTU table",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "adjust_seq_orientation",
"display_name": "adjust_seq_orientation",
"version": "1.2.0",
"summary": "\n \n ",
"description": "Get the reverse complement of all sequences",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "gd_sum_gd_snp",
"display_name": "Aggregate Individuals",
"version": "1.1.0",
"summary": "\n\n**Dataset formats**\n\nThe input datasets are in gd_snp_, gd_genotype_, and gd_indivs_ formats.\nThe output dataset is in gd_snp_ or gd_genotype_ format. (`Dataset missing?`_)\n\n.. _gd_snp: ./static/formatHelp.html#gd_snp\n.. _gd_genotype: ./static/formatHelp.html#gd_genotype\n.. _gd_indivs: ./static/formatHelp.html#gd_indivs\n.. _Dataset missing?: ./static/formatHelp.html\n\n-----\n\n**What it does**\n\nThe user specifies that some of the individuals in a gd_snp or gd_genotype\ndataset form a \"population\", by supplying a list that has been previously\ncreated using the Specify Individuals tool. The program appends a new\n\"entity\" (set of four columns for a gd_snp table, or one column for a\ngd_genotype table), analogous to the column(s) for an individual but\ncontaining summary data for the population as a group. For a gd_snp\ntable, these four columns give the total counts for the two alleles,\nthe \"genotype\" for the population, and the maximum quality value, taken\nover all individuals in the population. If all defined genotypes in\nthe population are 2 (agree with the reference), then the population's\ngenotype is 2, and similarly for 0; otherwise the genotype is 1 (unless\nall individuals have undefined genotype, in which case it is -1).\nFor a gd_genotype file, only the aggregate genotype is appended.\n\n-----\n\n**Example**\n\n- input gd_snp::\n\n Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0\n Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0\n Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 
0\n etc.\n\n- input individuals::\n\n 9 PB1\n 13 PB2\n 17 PB3\n\n- output::\n\n Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0 29 0 2 72\n Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0 3 0 2 30\n Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0 13 0 2 42\n etc.\n\n ",
"description": ": Append summary columns for a population",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "agile_wrapper",
"display_name": "AGILE",
"version": "1.0.0",
"summary": "\n\t\n.. class:: warningmark \n\nThe default parameter values can be altered in the agile tool xml file\n\n-----\n\t\n**What it does**\n \nThis tool uses the **AGILE** alignment program, a faster replacement for the **BLAT** algorithm. Your reads file is searched against a genome build or another uploaded file. \n\n-----\n\n**Parameters**\n\n- *Maximum Single Imperfect Matches* (**-maxSIMs**) : The number of allowable mismatches as a percentage of read length.\n\n- *Tuple Length* (**-tileSize**) : The length of tuples for creating a hash table.\n\n- *Maximum Frequency* (**-maxFreq**) : The maximum number of pattern occurrences allowed.\n\n- *All Matches* (**-all**) : Output all matches satisfying the match criteria (true/false).\n\n- *Output Format* (**-out**) : Define the output format for the match file.\n\n-----\n\n**Reference**\n \n **AGILE**: Sanchit Misra, Ankit Agrawal, Wei-keng Liao, Alok Choudhary. Anatomy of a Hash-based Long Read Sequence Mapping Algorithm for Next Generation DNA Sequencing. Bioinformatics 2010; doi: 10.1093/bioinformatics/btq648.\n\n\n\t",
"description": " Quickly match reads to a reference genome or sequence file",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "ctb_alignit_create_db",
"display_name": "Pharmacophore",
"version": "0.1",
"summary": "\n\n.. class:: infomark\n\n**What this tool does**\n\nAlign-it_ is a tool to align molecules according to their pharmacophores.\nA pharmacophore is an abstract concept based on the specific interactions \nobserved in drug-receptor interactions: hydrogen bonding, \ncharge transfer, electrostatic and hydrophobic interactions. \nMolecular modeling and/or screening based on pharmacophore similarities \nhas been proven to be an important and useful method in drug discovery.\n\nThe functionality of Align-it_ consists mainly of two parts. \nThe first functionality is the generation of pharmacophores from molecules\n(the function of this tool). Secondly, pairs of pharmacophores \ncan be aligned (use the tool **Pharmacophore Alignment**). The resulting \nscore is calculated from the volume overlap resulting of the alignments.\n\n.. _Align-it: http://www.silicos-it.com/software/align-it/1.0.3/align-it.html\n\n-----\n\n.. class:: infomark\n\n**Input**\n\n* Example::\n\n - database\n\n 30 31 0 0 0 0 0 0 0999 V2000\n 1.9541 1.1500 -2.5078 Cl 0 0 0 0 0 0 0 0 0 0 0 0\n 1.1377 -1.6392 2.1136 Cl 0 0 0 0 0 0 0 0 0 0 0 0\n -3.2620 -2.9284 -1.0647 O 0 0 0 0 0 0 0 0 0 0 0 0\n -2.7906 -1.9108 0.9092 O 0 0 0 0 0 0 0 0 0 0 0 0\n 0.2679 -0.2051 -0.3990 N 0 0 0 0 0 0 0 0 0 0 0 0\n -2.0640 0.5139 -0.3769 C 0 0 0 0 0 0 0 0 0 0 0 0\n -0.7313 0.7178 -0.0192 C 0 0 0 0 0 0 0 0 0 0 0 0\n -2.4761 -0.6830 -1.1703 C 0 0 0 0 0 0 0 0 0 0 0 0\n 1.6571 -0.2482 -0.1795 C 0 0 0 0 0 0 0 0 0 0 0 0\n -3.0382 1.4350 0.0081 C 0 0 0 0 0 0 0 0 0 0 0 0\n -0.3728 1.8429 0.7234 C 0 0 0 0 0 0 0 0 0 0 0 0\n -2.6797 2.5600 0.7506 C 0 0 0 0 0 0 0 0 0 0 0 0\n -1.3470 2.7640 1.1083 C 0 0 0 0 0 0 0 0 0 0 0 0\n 2.5353 0.3477 -1.0918 C 0 0 0 0 0 0 0 0 0 0 0 0\n 2.1740 -0.8865 0.9534 C 0 0 0 0 0 0 0 0 0 0 0 0\n -2.8480 -1.8749 -0.3123 C 0 0 0 0 0 0 0 0 0 0 0 0\n 3.9124 0.3058 -0.8739 C 0 0 0 0 0 0 0 0 0 0 0 0\n 3.5511 -0.9285 1.1713 C 0 0 0 0 0 0 0 0 0 0 0 0\n 4.4203 -0.3324 0.2576 C 0 0 0 0 0 0 0 0 0 0 0 0\n 
-1.7086 -0.9792 -1.8930 H 0 0 0 0 0 0 0 0 0 0 0 0\n -3.3614 -0.4266 -1.7676 H 0 0 0 0 0 0 0 0 0 0 0 0\n -0.0861 -1.1146 -0.6780 H 0 0 0 0 0 0 0 0 0 0 0 0\n -4.0812 1.2885 -0.2604 H 0 0 0 0 0 0 0 0 0 0 0 0\n 0.6569 2.0278 1.0167 H 0 0 0 0 0 0 0 0 0 0 0 0\n -3.4382 3.2769 1.0511 H 0 0 0 0 0 0 0 0 0 0 0 0\n -1.0683 3.6399 1.6868 H 0 0 0 0 0 0 0 0 0 0 0 0\n 4.6037 0.7654 -1.5758 H 0 0 0 0 0 0 0 0 0 0 0 0\n 3.9635 -1.4215 2.0480 H 0 0 0 0 0 0 0 0 0 0 0 0\n 5.4925 -0.3651 0.4274 H 0 0 0 0 0 0 0 0 0 0 0 0\n -3.5025 -3.7011 -0.5102 H 0 0 0 0 0 0 0 0 0 0 0 0\n\n - cutoff : 0.0\n\n-----\n\n.. class:: infomark\n\n**Output**\n\n* Example::\n \n - aligned Pharmacophores \n\n 3033\n HYBL -1.98494 1.9958 0.532089 0.7 0 0 0 0\n HYBL 3.52122 -0.309347 0.122783 0.7 0 0 0 0\n HYBH -3.262 -2.9284 -1.0647 1 1 -3.5666 -3.7035 -1.61827\n HDON 0.2679 -0.2051 -0.399 1 1 -0.076102 -0.981133 -0.927616\n HACC -2.7906 -1.9108 0.9092 1 1 -2.74368 -1.94015 1.90767\n $$$$ \n\n-----\n\n.. class:: infomark\n\n**Cite**\n\n`Silicos-it`_ - align-it\n\n.. _Silicos-it: http://www.silicos-it.com/software/align-it/1.0.3/align-it.html\n\n ",
"description": "generation (Align-it)",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "mothur_align_check",
"display_name": "Align.check",
"version": "1.20.0",
"summary": "\n**Mothur Overview**\n\nMothur_, initiated by Dr. Patrick Schloss and his software development team \nin the Department of Microbiology and Immunology at The University of Michigan, \nprovides bioinformatics for the microbial ecology community.\n\n.. _Mothur: http://www.mothur.org/wiki/Main_Page\n\n**Command Documentation**\n\nThe align.check_ command allows you to calculate the number of potentially misaligned bases in a 16S rRNA gene sequence alignment using a secondary_structure_map_. If you are familiar with the editor window in ARB, this is the same as counting the number of ~, #, -, and = signs. \n\n.. _secondary_structure_map: http://www.mothur.org/wiki/Secondary_structure_map\n.. _align.check: http://www.mothur.org/wiki/Align.check\n\n ",
"description": "Calculate the number of potentially misaligned bases",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "mothur_align_seqs",
"display_name": "Align.seqs",
"version": "1.20.0",
"summary": "\n**Mothur Overview**\n\nMothur_, initiated by Dr. Patrick Schloss and his software development team\nin the Department of Microbiology and Immunology at The University of Michigan,\nprovides bioinformatics for the microbial ecology community.\n\n.. _Mothur: http://www.mothur.org/wiki/Main_Page\n\n**Command Documentation**\n\nThe align.seqs_ command aligns a user-supplied fasta-formatted candidate sequence file to a user-supplied fasta-formatted template_alignment_.\n\nThe general approach is to\n i) find the closest template for each candidate using kmer searching, blastn, or suffix tree searching;\n ii) to make a pairwise alignment between the candidate and de-gapped template sequences using the Needleman-Wunsch, Gotoh, or blastn algorithms; and\n iii) to re-insert gaps to the candidate and template pairwise alignments using the NAST algorithm so that the candidate sequence alignment is compatible with the original template alignment.\n\nIn general the alignment is very fast - we are able to align over 186,000 full-length sequences to the SILVA alignment in less than 3 hrs with a quality as good as the SINA aligner. Furthermore, this rate can be accelerated using multiple processors. While the aligner doesn't explicitly take into account the secondary structure of the 16S rRNA gene, if the template database is based on the secondary structure, then the resulting alignment will at least be implicitly based on the secondary structure.\n\n.. _template_alignment: http://www.mothur.org/wiki/Alignment_database\n.. _align.seqs: http://www.mothur.org/wiki/Align.seqs\n\n\n ",
"description": "Align sequences to a template alignment",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "align2database",
"display_name": "align-to-database",
"version": "1.0.0",
"summary": "\n\n**Example output**\n\n.. image:: ./static/operation_icons/align_multiple2.png\n\n\n**What it does**\n\nThis tool aligns a query interval set (such as ChIP peaks) to a database of features (such as other ChIP peaks or TSS/splice sites), calculates and plots the relative distance of database features to the query intervals. Currently two databases are available: \n\n-- **ChIP peaks** from 191 ChIP experiments (processed from hmChIP database, see individual peak/BED files in **Shared Data**)\n\n-- **Annotated gene features**, such as: TSS, TES, 5'ss, 3'ss, CDS start and end, miRNA seed matches, enhancers, CpG island, microsatellite, small RNA, poly A sites (3P-seq-tags), miRNA genes, and tRNA genes. \n\nTwo output files are generated. One is the coverage/profile for each feature in the database that has a minimum overlap with the query set. The first two columns are feature name and the total number of overlapping intervals from the query. Column 3 to column 102 are coverage at each bin. The other file is a PDF file plotting both the heatmap for all features and the average coverage for each individual database feature.\n\n\n**How it works**\n\nFor each interval/peak in the query file, a window (default 10,000bp) is created around the center of the interval and is divided into 100 bins. For each database feature set (such as Pol II peaks), the tool counts how many intervals in the database feature file overlap with each bin. The count is then averaged over all query intervals that have at least one hit in at least one bin. Overall the plotted 'average coverage' represents the fraction of query features (only those with hits, number shown in individual plot title) that have a database feature interval covering that bin. In the extreme case where the database feature set is the same as the query, every query interval is covered at the center, so the average coverage of the center bin will be 1. \n\nThe heatmap is scaled for each row before clustering.\n\n ",
"description": " features ",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "align2multiple",
"display_name": "align-to-multiple",
"version": "1.0.0",
"summary": "\n.. class:: infomark\n\nThis tool allows you to check the co-localization pattern of multiple interval sets. All interval sets are aligned to the center of the intervals in the query interval set.\n\nEach row represents a window of certain size around the center of one interval in the query set, such as ChIP peaks. Each heatmap shows the position of other features in the SAME window (the same rows in each heatmap represent the same interval/genomic position).\n\n\nThe example below shows that of all Fox2 peaks, half of them are within 1kb of TSS. Of the half outside TSS, about one half has H3K4me1, two thirds of which are further depleted of H3K4me3. \n\n-----\n\n**Example**\n\n.. image:: ./static/images/align2multiple.png\n\n",
"description": "features",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "alignCustomAmplicon",
"display_name": "Align Custom Amplicon",
"version": "0.0.1",
"summary": " \n\n.. class:: infomark\n\n**What it does**\n\nIt is an amplicon aligner that uses primers for higher accuracy.\n\nReads with primers are aligned to the reference, then primers are discarded.\n\nIf both reads are long enough, they are aligned with the reference and a consensus alignment is generated.\n\nOtherwise, each read is aligned separately.\n\nSequences with bad quality reads are discarded. \n\n\n\n**Input**\n\nref:\n\n\tFasta file of the reference genome\n\nread1:\n\n\tFastq file of left to right read\n\t(Can also be compressed [fastq.gz])\n\nread2:\n\n\tFastq file of right to left read\n\t(Can also be compressed [fastq.gz])\n\nprimers:\n\n\tText file with primer names and lengths (see example)\n\nExample primers format::\n\n\t#Name_of_amplicon \tlength_left \tlength_right\n\t1:115256345-115256520\t23\t\t23\n\t1:115256436-115256606\t25\t\t22\n\t1:115256530-115256724\t23\t\t23\n\t1:115256532-115256723\t23\t\t23\n\t4:55151914-55152086\t21\t\t23\n\t4:55151935-55152132\t20\t\t23\n\t4:55151991-55152182\t23\t\t24\n\t4:55591944-55592136\t23\t\t24\n\t4:55592065-55592263\t20\t\t23\n\t4:55593504-55593674\t24\t\t25\n\t...\n\n\n ",
"description": "align amplicon to reference with primers",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "alignr",
"display_name": "align",
"version": "1.0.0",
"summary": "\n\n**What it does**\n\nThis tool aligns two sets of intervals, finds overlaps, calculates and plots the coverage of the first set across the second set. Applications include: \n\n- check read distribution around TSS/poly A site/splice site/motif site/miRNA target site\n- check relative position/overlap of two lists of ChIP-seq peaks\n\nTwo output files are generated. One is the coverage/profile for each interval in input 2. The first two columns are interval ID and the total number of overlapping intervals from input 1. Column 3 to column nbins+2 are coverage at each bin. The other file is a PDF file plotting the average coverage of each bin. To modify the visualization, please download the coverage file and make your own plots.\n\n-----\n\n**Annotated features**\n\nCurrently supports mouse genome build mm9 and human hg18. Each interval spans 1000bp upstream and 1000bp downstream of a feature such as TSS. Features with overlapping exons in the intronic/intergenic part of the 2000bp interval are removed.\n\n-----\n\n**Usage**\n\n -h, --help show this help message and exit\n -a INPUTA (required) input file A, BED-like (first 3 columns: chr, start, end) or BAM format. The\n script computes the depth of coverage of features in file\n A across the features in file B\n -b INPUTB (required) input file B, BED format or MACS peak file.\n Requires a unique name for each line in column 4\n -m inputB is a MACS peak file.\n -f AFORMAT Format of input file A. Can be BED (default) or BAM\n -w WINDOW Generate new inputB by making a window of 2 x WINDOW bp\n (in total) flanking the center of each input feature\n -n NBINS number of bins. Features in B are binned, and the coverage\n is computed for each bin. Default is 100\n -s enforce strandedness: require overlapping on the same\n strand. Default is off\n -p load existing intersectBed output file\n -q suppress output on screen\n -o OUTPUTPROFILE (optional) output profile name.\n -v PLOTFILE (optional) plot file name\n ",
"description": "two interval sets",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "alignvis",
"display_name": "heatmap",
"version": "1.0.0",
"summary": "\n\n**What it does**\n\nThis tool generates a heatmap for output from 'align' tool. Each row is the color-coded coverage of a feature, and the features are sorted by the total coverage in the interval. \n\n**Example**\n\n.. image:: ./static/operation_icons/heatmap.png\n\n ",
"description": "of align output",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "align_back_trans",
"display_name": "Thread nucleotides onto a protein alignment (back-translation)",
"version": "0.0.4",
"summary": "\n**What it does**\n\nTakes an input file of aligned protein sequences (typically FASTA or Clustal\nformat), and a matching file of unaligned nucleotide sequences (FASTA format,\nusing the same identifiers), and threads the nucleotide sequences onto the\nprotein alignment to produce a codon aware nucleotide alignment - which can\nbe viewed as a back translation.\n\nIf you specify one of the standard NCBI genetic codes (recommended), then the\ntranslation is verified. This will allow fuzzy matching if stop codons in the\nprotein sequence have been represented as X, and will allow for a trailing stop\ncodon present in the nucleotide sequences but not the protein.\n\nNote - the protein and nucleotide sequences must use the same identifiers.\n\nNote - If no translation table is specified, the provided nucleotide sequences\nshould be exactly three times the length of the protein sequences (excluding the gaps).\n\nNote - the nucleotide FASTA file may contain extra sequences not in the\nprotein alignment, they will be ignored. This can be useful if for example\nyou have a nucleotide FASTA file containing all the genes in an organism,\nwhile the protein alignment is for a specific gene family.\n\n**Example**\n\nGiven this protein alignment in FASTA format::\n\n >Alpha\n DEER\n >Beta\n DE-R\n >Gamma\n D--R\n\nand this matching unaligned nucleotide FASTA file::\n\n >Alpha\n GATGAGGAACGA\n >Beta\n GATGAGCGU\n >Gamma\n GATCGG\n\nthe tool would return this nucleotide alignment::\n\n >Alpha\n GATGAGGAACGA\n >Beta\n GATGAG---CGU\n >Gamma\n GAT------CGG\n\nNotice that all the gaps are multiples of three in length.\n\n\n**Citation**\n\nThis tool uses Biopython, so if you use this Galaxy tool in work leading to a\nscientific publication please cite the following paper:\n\nCock et al (2009). Biopython: freely available Python tools for computational\nmolecular biology and bioinformatics. Bioinformatics 25(11) 1422-3.\nhttp://dx.doi.org/10.1093/bioinformatics/btp163 pmid:19304878.\n\nThis tool is available to install into other Galaxy Instances via the Galaxy\nTool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/align_back_trans\n ",
"description": "Gives a codon aware alignment",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "align_seqs",
"display_name": "align_seqs",
"version": "1.2.0",
"summary": "\n \n ",
"description": "Align sequences using a variety of alignment methods",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "allele_counts_1",
"display_name": "Variant Annotator",
"version": "1.1",
"summary": "\n\n.. class:: infomark\n\n**What it does**\n\nThis tool parses variant counts from a special VCF file. It counts simple variants, calculates numbers of alleles, and calculates minor allele frequency. It can apply filters based on coverage, strand bias, and minor allele frequency cutoffs.\n\n-----\n\n.. class:: infomark\n\n**Input Format**\n\n.. class:: warningmark\n\n**Note:** variants that are not A/C/G/T SNVs will be ignored!\n\nThe input VCF should be like the output of the **Naive Variant Detector** tool (using the stranded option). The sample column(s) must give the read count for each variant **on each strand**. Below is an example of a valid sample column entry (the important part is after the last colon)::\n\n 0/0:1:0.02:+T=27,+G=1,-T=22,\n\n-----\n\n.. class:: infomark\n\n**Output**\n\nEach row represents one site in one sample. For unstranded output, 12 fields give information about that site::\n\n 1. SAMPLE - Sample name (from VCF sample column labels)\n 2. CHR - Chromosome of the site\n 3. POS - Chromosomal coordinate of the site\n 4. A - Number of reads supporting an 'A'\n 5. C - 'C' reads\n 6. G - 'G' reads\n 7. T - 'T' reads\n 8. CVRG - Total (number of reads supporting one of the four bases above)\n 9. ALLELES - Number of qualifying alleles\n 10. MAJOR - Major allele\n 11. MINOR - Minor allele (2nd most prevalent variant)\n 12. MINOR.FREQ.PERC. - Frequency of minor allele\n\nFor stranded output, instead of using 4 columns to report read counts per base, 8 are used to report the stranded counts per base::\n\n 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16\n SAMPLE CHR POS +A +C +G +T -A -C -G -T CVRG ALLELES MAJOR MINOR MINOR.FREQ.PERC.\n\n**Example**\n\nBelow is a header line, followed by some example data lines. Since the input contained three samples, the data for each site is reported on three consecutive lines. 
However, if a sample fell below the coverage threshold at that site, the line will be omitted::\n\n #SAMPLE CHR POS A C G T CVRG ALLELES MAJOR MINOR MINOR.FREQ.PERC.\n BLOOD_1 chr20 99 0 101 1 2 104 1 C T 0.01923\n BLOOD_2 chr20 99 82 44 0 1 127 2 A C 0.34646\n BLOOD_3 chr20 99 0 110 1 0 111 1 C G 0.009\n BLOOD_1 chr20 100 3 5 100 0 108 1 G C 0.0463\n BLOOD_3 chr20 100 1 118 11 0 130 0 C G 0.08462\n\n-----\n\n.. class:: warningmark\n\n**Site printing and allele tallying requirements**\n\nCoverage threshold:\n\nIf a coverage threshold is used, the number of reads **on each strand** must be at or above the threshold. If either strand is below the threshold, the line will be omitted. **N.B.** this means the total coverage for each printed site will be at least twice the number you give in the \"coverage threshold\" option. Also, since only simple variants are counted, a site with 100 reads, all supporting a deletion variant, would not be printed.\n\nFrequency threshold:\n\nIf a frequency threshold is used, alleles are only counted (in the ALLELES column) if they meet or exceed this minor allele frequency threshold.\n\nStrand bias:\n\nThe alleles passing the threshold on each strand must match (though not in order), or the allele count will be 0. So a site with A, C, G on the plus strand and A, G on the minus strand will get an allele count of zero, though the (strand-independent) major allele, minor allele, and minor allele frequency will still be reported. If there is a tie for the minor allele, one will be randomly chosen.\n\n ",
"description": " process variant counts",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "alpha_diversity",
"display_name": "alpha_diversity",
"version": "1.2.0",
"summary": "\n \n ",
"description": "Calculate alpha diversity on each sample in an otu table, using a variety of alpha diversity metrics",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "alpha_rarefaction",
"display_name": "alpha_rarefaction",
"version": "1.2.0",
"summary": "\n \n ",
"description": "A workflow script for performing alpha rarefaction",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "mothur_amova",
"display_name": "Amova",
"version": "1.20.0",
"summary": "\n**Mothur Overview**\n\nMothur_, initiated by Dr. Patrick Schloss and his software development team\nin the Department of Microbiology and Immunology at The University of Michigan,\nprovides bioinformatics for the microbial ecology community.\n\n.. _Mothur: http://www.mothur.org/wiki/Main_Page\n\n**Command Documentation**\n\nThe amova_ command calculates the analysis of molecular variance from a phylip_distance_matrix_, a nonparametric analog of traditional analysis of variance. This method is widely used in population genetics to test the hypothesis that genetic diversity within two populations is not significantly different from that which would result from pooling the two populations.\n\nA design file partitions a list of names into groups. It is a tab-delimited file with 2 columns: name and group, e.g. :\n\t=======\t=======\n\tduck\tbird\n\tcow\tmammal\n\tpig\tmammal\n\tgoose\tbird\n\tcobra\treptile\n\t=======\t=======\n\nThe Make_Design tool can construct a design file from a Mothur dataset that contains group names.\n\n\n.. _phylip_distance_matrix: http://www.mothur.org/wiki/Phylip-formatted_distance_matrix\n.. _amova: http://www.mothur.org/wiki/Amova\n\n\n ",
"description": "Analysis of molecular variance",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "gatk_analyze_covariates",
"display_name": "Analyze Covariates",
"version": "0.0.5",
"summary": "\n**What it does**\n\nCreates collapsed versions of the recal csv file and calls R scripts to plot residual error versus the various covariates.\n\nFor more information on base quality score recalibration using the GATK, see this `tool specific page <http://www.broadinstitute.org/gsa/wiki/index.php/Base_quality_score_recalibration>`_.\n\nTo learn about best practices for variant detection using GATK, see this `overview <http://www.broadinstitute.org/gsa/wiki/index.php/Best_Practice_Variant_Detection_with_the_GATK_v3>`_.\n\nIf you encounter errors, please view the `GATK FAQ <http://www.broadinstitute.org/gsa/wiki/index.php/Frequently_Asked_Questions>`_.\n\n------\n\n**Inputs**\n\nGenomeAnalysisTK: AnalyzeCovariates accepts a recal CSV file.\n\n\n**Outputs**\n\nThe output is in CSV and HTML files with links to PDF graphs and data files.\n\n\nGo `here <http://www.broadinstitute.org/gsa/wiki/index.php/Input_files_for_the_GATK>`_ for details on GATK file formats.\n\n-------\n\n**Settings**::\n\n recal_file The input recal csv file to analyze\n output_dir The directory in which to output all the plots and intermediate data files\n path_to_Rscript The path to your implementation of Rscript. For Broad users this may be /broad/tools/apps/R-2.6.0/bin/Rscript\n path_to_resources Path to resources folder holding the Sting R scripts.\n ignoreQ Ignore bases with reported quality less than this number.\n numRG Only process N read groups. Default value: -1 (process all read groups)\n max_quality_score The integer value at which to cap the quality scores, default is 50\n max_histogram_value If supplied, this value will be the max value of the histogram plots\n do_indel_quality If supplied, this value will be the max value of the histogram plots\n\n@CITATION_SECTION@\n ",
"description": "- draw plots",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "snv_annotate",
"display_name": "SNV Annotator",
"version": "1.0.0",
"summary": "\n\n**What it does**\nAnnotates filtered SNVMix output that has been attached to codon information (outputs the same data with additional columns describing the predicted effect of the SNV).\nRequires input produced by the \"SNP filtering and pre-annotation\" tool.\n\nThe additional columns are as follows:\n1) Mutated form of the codon\n2) Reference amino acid at that position\n3) Mutant amino acid at that position\n4) CODING or SYNONYMOUS\n5) Mutation with position and wild-type amino acid (separated by semicolon for genes with multiple transcripts)\n\nExample input:\nchr7:148139660 ENSG00000106462 -1 TAC 2 602;646; ENST00000350995;ENST00000320356; T A T:39,A:25,0.0000000000,1.0000000000,0.0000000000,2\n\nExample output:\nchr7:148139660 ENSG00000106462 -1 TAC 2 602;646; ENST00000350995;ENST00000320356; T A T:39,A:25,0.0000000000,1.0000000000,0.0000000000,2 TTC Y F CODING Y602F;Y646F\n\n ",
"description": "Annotates filtered SNVMix output that has been attached to codon information (outputs the same data with additional columns describing the predicted effect of the SNV)",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "bedtools_annotatebed",
"display_name": "AnnotateBed",
"version": "@WRAPPER_VERSION@.0",
"summary": "\n \n**What it does**\n\nbedtools annotate, well, annotates one BED/VCF/GFF file with the coverage and number of overlaps observed from multiple other BED/VCF/GFF files. In this way, it allows one to ask to what degree one feature coincides with multiple other feature types with a single command.\n\n@REFERENCES@\n\n ",
"description": "",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "annotateBed",
"display_name": "Annotates",
"version": "0.0.1",
"summary": "\n\n**What it does**\n\nThis tool is used to annotate the regions in a bed file with features provided in multiple BED files.\n\n.. class:: warningmark\n\nThis tool requires that bedtools__ has been installed on your system.\n\n-----\n\n.. __: http://code.google.com/p/bedtools/\n\n",
"description": " the depth & breadth of coverage of features from multiple files\n\t",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "annotateGenes",
"display_name": "AnnotateGenes",
"version": "1.0",
"summary": "\n**What it does**\n\nThis tool annotates peaks with genomic features (promoter, enhancer, exon, intron, etc.) and creates a .png file showing their distribution\n\n ",
"description": "Annotation of genes with ChIP-Seq peaks",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
},
{
"name": "annotatePeaks",
"display_name": "AnnotatePeaks",
"version": "1.0",
"summary": "\n**What it does**\n\nThis tool annotates peaks with genomic features (promoter, enhancer, exon, intron, etc.) and creates a .png file showing their distribution\n\n ",
"description": "Genomic annotation of ChIP-Seq peaks",
"example": [],
"links": [],
"references": [],
"availability": [],
"technology": [],
"programming_language": [],
"license": [],
"operating_system": [],
"inputs": [],
"parameters": [],
"outputs": [],
"quality": [],
"learn_flow": [],
"algorithm": []
}
]
}