Description
Hello,
First of all, thanks for providing such a fantastic tool!
I wasn't quite sure where to post this, since it's not strictly a bug, but I thought it worth highlighting that duplicate TF and kinase gene names often show up in X2Kweb output, as below:
Running the analysis through the API makes it clear that the TF gene duplication is simply due to the default use of both CHEA and ENCODE entries. Nonetheless, as this may be confusing for downstream users, maybe it would be better to use name (i.e. SUZ12_CHEA) instead of simpleName in the TFEA output?
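To illustrate the point, here is a minimal sketch of how the duplication can be spotted on the client side. It assumes each TFEA result is a record carrying the `name` and `simpleName` fields mentioned above; the sample records and the helper names (`group_by_simple_name`, `duplicated_simple_names`) are hypothetical and not part of the X2K API.

```python
from collections import defaultdict

# Hypothetical TFEA records. Only the `name` and `simpleName` field names
# come from the API output discussed above; the values are illustrative.
tfea_results = [
    {"name": "SUZ12_CHEA", "simpleName": "SUZ12"},
    {"name": "SUZ12_ENCODE", "simpleName": "SUZ12"},
    {"name": "EZH2_CHEA", "simpleName": "EZH2"},
]


def group_by_simple_name(records):
    """Map each simpleName to the list of full names that share it."""
    groups = defaultdict(list)
    for record in records:
        groups[record["simpleName"]].append(record["name"])
    return dict(groups)


def duplicated_simple_names(records):
    """Keep only the simpleName values that occur more than once."""
    grouped = group_by_simple_name(records)
    return {simple: names for simple, names in grouped.items() if len(names) > 1}


duplicates = duplicated_simple_names(tfea_results)
print(duplicates)
```

Reporting `name` rather than `simpleName` would keep these entries distinguishable without any such post-processing.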
In the KEA output, there are occasionally multiple entries for the same kinase referred to by different names. For instance, GSKB and GSKBETA both have separate entries (as do GSKA, GSKALPHA, and GSK). I suspect this has a similar underlying cause in that (if I understand correctly) KEA 2018 also aggregates from many sources. Possibly the underlying source could be attached to each entry, as with the TFs above? It is also worth mentioning that for the 'non-standard' names (e.g. GSK3BETA), the Harmonizome link provided with the output does not match an entry (http://amp.pharm.mssm.edu/Harmonizome/gene/GSK3BETA).
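As a rough sketch of one possible workaround, kinase labels could be collapsed onto a single symbol before the results are reported. The alias table below is hypothetical and hand-written for illustration; it is not something KEA or X2K provides, and the helper names are assumptions too.

```python
# Hypothetical alias table mapping non-standard kinase labels to one symbol.
# A real table would need to be curated from an authoritative gene nomenclature source.
KINASE_ALIASES = {
    "GSK3BETA": "GSK3B",
    "GSK3ALPHA": "GSK3A",
}


def canonical_kinase(label):
    """Return the mapped symbol for a label, or the upper-cased label itself if unmapped."""
    key = label.strip().upper()
    return KINASE_ALIASES.get(key, key)


def merge_kinase_entries(labels):
    """Group the original labels under their canonical symbol, preserving input order."""
    merged = {}
    for label in labels:
        merged.setdefault(canonical_kinase(label), []).append(label)
    return merged


merged = merge_kinase_entries(["GSK3B", "GSK3BETA", "GSK3A", "GSK3ALPHA", "MAPK1"])
print(merged)
```

Keeping the original labels alongside the canonical symbol preserves the information about which source contributed each entry.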
Thanks very much again!
Best,
Andrew

