Salvador Santiago (EMBL, Heidelberg, Germany)

Automatically extracting mutations from the literature is a challenging task, and most systems produce false positives. Our aim is to develop a semi-automatic tool, MutationLocator, to extract mutations from different types of documents. Every mutation is shown to the user, together with its effect, in the context of the document through a web interface, so that the user can discard likely wrong mutations and link the list of mutations found directly to a sequence and to the positions within it. To accomplish this, the tool matches regular expressions built from the list of mutations in the Gold Standard (e.g. T54A) against a set of reference sequences for the selected organism (e.g. HXB2 for HIV-1). The tool also provides in advance a set of possible hits, calculated from the whole set of mutations and from subsets of it, to improve performance. Once a sequence has been selected, we have found it very useful to place this mutation information in the context of the 3D structure using the SRS3D tool. MutationLocator has been applied to extract mutations related to the HIV-1 Gag-Pol polyprotein from about 100 selected, freely available full-text articles, obtaining valuable information, eas
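The matching step described above can be illustrated with a minimal sketch. It is not MutationLocator's actual code: the function names, the toy reference sequences and the restriction to simple one-letter substitutions such as T54A are all assumptions made for illustration. The idea shown is to turn the wild-type residues of the extracted mutations into one regular expression with fixed gaps, then search each reference sequence for it.

```python
import re
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Simple substitutions written as <wild type><position><mutant>, e.g. T54A.
MUTATION_RE = re.compile(rf"\b([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])\b")


def extract_mutations(text: str) -> Dict[int, str]:
    """Return {position: wild-type residue} for each substitution in the text."""
    wild_types: Dict[int, str] = {}
    for match in MUTATION_RE.finditer(text):
        # Keep the first wild-type residue reported for a given position.
        wild_types.setdefault(int(match.group(2)), match.group(1))
    return wild_types


def build_pattern(wild_types: Dict[int, str]) -> "re.Pattern[str]":
    """Build a regex such as 'G.{0}T.{1}K' from wild-type residues and gaps."""
    positions = sorted(wild_types)
    parts = [wild_types[positions[0]]]
    for prev, cur in zip(positions, positions[1:]):
        parts.append(f".{{{cur - prev - 1}}}")  # residues between two mutations
        parts.append(wild_types[cur])
    # The lookahead lets finditer report overlapping candidate placements.
    return re.compile("(?=(" + "".join(parts) + "))")


def numbering_offsets(wild_types: Dict[int, str], sequence: str) -> List[int]:
    """Return every shift at which all wild-type residues fit the sequence.

    A shift of 0 means the numbering used in the article coincides with the
    numbering of the reference sequence.
    """
    if not wild_types:
        return []
    first = min(wild_types)
    pattern = build_pattern(wild_types)
    return [m.start() - (first - 1) for m in pattern.finditer(sequence)]


# Toy data, invented for this example (not real HIV-1 sequences).
text = "The T3A and K5R substitutions reduced activity, while G2D had no effect."
references = {"ref_a": "MGTAKLV", "ref_b": "MATGRLV"}

wild_types = extract_mutations(text)
hits = {name: numbering_offsets(wild_types, seq) for name, seq in references.items()}
```

Here `hits` maps each hypothetical reference to the numbering shifts at which every mutation is consistent with it; an empty list means the full set does not fit. Repeating the search with subsets of the mutations, as the abstract mentions, would tolerate a few false positives among the extracted mutations.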