Some rough and ready notes from day two of the first ChEBI workshop, 20th May 2008. There were two talks, one from Kirill Degtyarenko (European Patent Office) and the other from Janna Hastings (EBI), followed by a discussion.
Kirill Degtyarenko: Good annotation practice for chemical data, ChEBI experience
Kirill’s talk described how to choose the most appropriate names for chemicals, especially since “biologists don’t name things properly, if at all” (!). Systematic (IUPAC) names are usually better than common names, except for “the unpronounceables”. For example, an antibiotic called (E)-roxithromycin (ChEBI:48935) has the IUPAC name:
(3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one
…which just trips off the tongue (and fits beautifully, without line breaks, onto regular computer screens). Fortunately, the curator can draw the chemical using the curator tools (note the wavy bond, which indicates unknown stereochemistry), and the InChI and SMILES strings are then generated from the drawing. Currently they use something called ACD/Name, which can generate PubChem links automatically. As of May 2008, 14,000 ChEBI IDs translate to around 11,000 PubChem CIDs, which cover structures only.
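To see why a systematic name is worth the length, here is a rough Python sketch that unpacks the stereodescriptor block at the start of the roxithromycin name quoted above. The function name `parse_stereodescriptors` is made up for this note, and it assumes only simple numeric locants followed by R, S, E or Z; it is nowhere near a general IUPAC name parser, and it has nothing to do with how the ChEBI curator tools actually work.

```python
import re

# Leading stereodescriptor block copied from the IUPAC name quoted above.
NAME_PREFIX = "(3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)"


def parse_stereodescriptors(prefix):
    """Return (locant, descriptor) pairs from a block such as '(3R,4S,10E)'.

    Hypothetical helper: handles only numeric locants with R/S/E/Z.
    """
    block = prefix.strip()
    if not (block.startswith("(") and block.endswith(")")):
        raise ValueError("expected a parenthesised descriptor block")
    pairs = []
    for token in block[1:-1].split(","):
        match = re.fullmatch(r"(\d+)([RSEZ])", token.strip())
        if match is None:
            raise ValueError(f"unrecognised descriptor: {token!r}")
        pairs.append((int(match.group(1)), match.group(2)))
    return pairs


descriptors = parse_stereodescriptors(NAME_PREFIX)
# R/S describe tetrahedral stereocentres; E/Z describe double-bond geometry.
centres = [locant for locant, d in descriptors if d in "RS"]
double_bonds = [locant for locant, d in descriptors if d in "EZ"]
print(len(centres), "stereocentres; double-bond geometry at", double_bonds)
```

Every one of those descriptors is information that a common name like "roxithromycin" leaves implicit, which is roughly the trade-off between readable and systematic names that the talk was getting at.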