Duplicated genes in TFEA, KEA X2K web output #13

@ajw2329

Hello,

First of all, thanks for providing such a fantastic tool!

I wasn't quite sure where to write this because it's not strictly a bug, but I thought it worth highlighting the fact that duplicate TF and kinase gene names often show up in X2Kweb output as below:

[Screenshot: ChEA output]
[Screenshot: KEA output]

Running the same query through the API makes it clear that the TF gene duplication arises simply because both CHEA and ENCODE entries are used by default. Nonetheless, as this may confuse downstream users, maybe it would be better to use name (e.g. SUZ12_CHEA) instead of simpleName in the TFEA output?
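To illustrate the TF case, here is a minimal sketch of how a downstream user might spot the duplicates in API results. It assumes each TFEA record is a dict carrying the `name` and `simpleName` fields mentioned above. The sample records (including the `SUZ12_ENCODE` and `EZH2_CHEA` names) and the helper functions are hypothetical, not part of X2K.

```python
from collections import defaultdict

# Hypothetical TFEA records. Only the `name` and `simpleName` keys come from
# the issue text; the values other than SUZ12_CHEA are made up for illustration.
tfea_results = [
    {"name": "SUZ12_CHEA", "simpleName": "SUZ12"},
    {"name": "SUZ12_ENCODE", "simpleName": "SUZ12"},
    {"name": "EZH2_CHEA", "simpleName": "EZH2"},
]


def group_by_simple_name(records):
    """Map each simpleName to the list of full names that share it."""
    groups = defaultdict(list)
    for record in records:
        groups[record["simpleName"]].append(record["name"])
    return dict(groups)


def duplicated_simple_names(records):
    """Keep only the simpleName values that appear more than once."""
    return {
        simple: names
        for simple, names in group_by_simple_name(records).items()
        if len(names) > 1
    }


duplicates = duplicated_simple_names(tfea_results)
print(duplicates)
```

Displaying the full `name` for every key returned here would be enough to tell the CHEA and ENCODE rows apart.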

In the KEA output, there are occasionally multiple entries for the same kinase under different names. For instance, GSKB and GSKBETA have separate entries (as do GSKA, GSKALPHA, and GSK). I suspect this has a similar underlying cause, in that (if I understand correctly) KEA 2018 also aggregates from many sources. Possibly the underlying source could be attached to each entry, as with the TFs above? It is also worth mentioning that for the 'non-standard' names (e.g. GSK3BETA), the Harmonizome link provided with the output does not match an entry (http://amp.pharm.mssm.edu/Harmonizome/gene/GSK3BETA).
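As a workaround on the user side, kinase aliases could be collapsed before interpreting the results. The sketch below is only an illustration: the alias table is hand-written and hypothetical (it is not taken from KEA or Harmonizome), the function names are my own, and the input list stands in for kinase names pulled from the KEA output.

```python
# Hand-written, hypothetical alias table: non-standard name -> official symbol.
# A real table would need to be curated or taken from a gene-symbol resource.
KINASE_ALIASES = {
    "GSK3BETA": "GSK3B",
    "GSK3ALPHA": "GSK3A",
}


def canonical_symbol(name, aliases=KINASE_ALIASES):
    """Return the mapped symbol if the name is a known alias, else the name itself."""
    key = name.strip().upper()
    return aliases.get(key, key)


def merge_by_symbol(names, aliases=KINASE_ALIASES):
    """Group reported kinase names under their canonical symbol, keeping input order."""
    merged = {}
    for name in names:
        merged.setdefault(canonical_symbol(name, aliases), []).append(name)
    return merged


# Illustrative input, standing in for the kinase names in a KEA result.
kea_names = ["GSK3B", "GSK3BETA", "GSK3A", "GSK3ALPHA", "MAPK1"]
merged = merge_by_symbol(kea_names)
print(merged)
```

Any symbol that ends up with more than one reported name marks a kinase that appears several times in the output. The canonical symbol is also the safer choice when building the Harmonizome link.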

Thanks very much again!

Best,
Andrew
