[Zoobank-list] RE: ZooBank Data Objects

Donat Agosti agosti at amnh.org
Wed Mar 7 08:17:53 GMT 2007


In our current setup for marking up documents using TaxonX (GoldenGATE is
the dedicated editor for it - http://idaho.ipd.uka.de/GoldenGATE/), the
question of how many names or taxon concepts exist is not really an issue,
because the automatic discovery and LSID look-up process picks up every
name and, if a concept is not yet in the database, asks you what to do
with it.
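
As a rough illustration of that discover-and-look-up step, here is a
minimal Python sketch. It is not the GoldenGATE implementation: the
regular expression is a deliberately naive binomial matcher, and
KNOWN_NAMES, discover_names and the example.org LSID are hypothetical
placeholders standing in for a real name server.

```python
import re

# Hypothetical stand-in for a name server: name string -> LSID.
# The LSID below is a made-up placeholder, not a real identifier.
KNOWN_NAMES = {
    "Strumigenys emmae": "urn:lsid:example.org:names:1",
}

# Naive pattern for a binomial: a capitalised genus followed by a
# lower-case epithet. Real name discovery needs far more than this.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+ [a-z]{2,})\b")


def discover_names(text):
    """Split the names found in text into resolved and unresolved.

    resolved maps each known name to its LSID; unresolved lists the
    names that are not in the database and need a human decision.
    """
    resolved, unresolved = {}, []
    for name in BINOMIAL.findall(text):
        lsid = KNOWN_NAMES.get(name)
        if lsid is not None:
            resolved[name] = lsid
        elif name not in unresolved:
            unresolved.append(name)
    return resolved, unresolved


resolved, unresolved = discover_names(
    "Strumigenys emmae was also cited as Smithistruma emmae."
)
```

In this example the known name is resolved to its LSID, while the variant
under the wrong genus ends up in the unresolved list, which is the point
where the editor would ask what to do with it.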

In fact, I am puzzled by how many variants of names exist, up to the wrong
genus (e.g. Smithistruma emmae instead of Strumigenys emmae). I started to
document this because we found out, during the mark-up of the Malagasy ant
literature (http://antbase.org/databases/madagascar.htm), that we need to
spend a considerable amount of time resolving all these new concepts, which
are often very complicated issues, especially when quadrinomina pop up (see
the posts on sloppiness at http://antbase.blogspot.com/). It worries me
that, in the age of name servers, people do not check their names.

The only way to do this work efficiently is to have all the original
literature digitized and a name server that is complete for the particular
taxon. For ants we have this, so it is possible to verify each name.

Instead of building up ZooBank separately, I would suggest taking the
documents in which the names are discovered, marking them up and adding a
GUID to the names (we add LSIDs linking to the Hymenoptera Name Server, but
that could easily be changed or pointed to another system such as uBio or
ZooBank), so that the documents also move up the levels and, hopefully,
with the treatments marked up, become part of ZooBank.
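
To make the idea of attaching a GUID to each name concrete, the following
Python sketch wraps every known name in a text with an element carrying its
identifier. The <name guid="..."> element is a hypothetical placeholder and
not the actual TaxonX markup, and tag_names and the example.org LSID are
assumptions made for illustration only.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def tag_names(text, name_to_guid):
    """Wrap each known name in text with a <name guid="..."> element.

    name_to_guid maps name strings to identifiers. Pointing the mark-up
    at a different resolver only means passing a different mapping.
    This sketch escapes the matched names and GUIDs but leaves the
    surrounding text untouched.
    """
    if not name_to_guid:
        return text
    # Try longer names first so a long name is not split by a shorter one.
    names = sorted(name_to_guid, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in names))

    def wrap(match):
        name = match.group(0)
        guid = quoteattr(name_to_guid[name])  # adds the surrounding quotes
        return "<name guid=%s>%s</name>" % (guid, escape(name))

    return pattern.sub(wrap, text)


marked_up = tag_names(
    "We collected Strumigenys emmae.",
    {"Strumigenys emmae": "urn:lsid:example.org:names:1"},
)
```

Because the identifiers come from a plain mapping, the same document could
be re-tagged against another name service without touching the text.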

Donat



I agree completely!  And this is where initiatives like INOTAXA and uBio and
BHL need to figure out a protocol and structure for an intermediate step
between programmatically-scanned documents and "verified"
IPNI/IF/ZooBank-style records.  This would fall into "Level 2" of the EoL
model.  In the context of ZooBank, I see this realm of data entering as
"unverified" records (see:
http://www2.bishopmuseum.org/iczn/docs/ZooBank-GPP.pdf), and thus being
exposed to expert taxonomists as they encounter them, who then gradually add
the human side of "taxonomic intelligence" working hand-in-hand
(err...hand-in-GUID) with automated "taxonomic intelligence" services --
inching us ever closer to that idealistic data destination.
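
One way to picture such "unverified" records is a record with a status
that an expert later changes. The Python sketch below is only an
illustration under that assumption: NameRecord, its fields and the
pending helper are hypothetical and do not reflect the actual ZooBank
schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NameRecord:
    """A name picked up from a document, unverified until an expert acts."""
    name_string: str
    source_document: str
    status: str = "unverified"
    verified_by: Optional[str] = None
    notes: List[str] = field(default_factory=list)

    def verify(self, expert, note=None):
        """Record that an expert has checked this name."""
        self.status = "verified"
        self.verified_by = expert
        if note:
            self.notes.append(note)


def pending(records):
    """Return the records still waiting for expert review."""
    return [r for r in records if r.status == "unverified"]


records = [
    NameRecord("Strumigenys emmae", "document-1"),
    NameRecord("Smithistruma emmae", "document-2"),
]
records[0].verify("expert-1", note="checked against the original description")
still_open = pending(records)
```

Here the automated step creates both records, and the expert's review
moves one of them out of the pending list.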

Aloha,
Rich

> -----Original Message-----
> From: zoobank-list-bounces at afriherp.org 
> [mailto:zoobank-list-bounces at afriherp.org] On Behalf Of Weitzman, Anna
> Sent: Tuesday, March 06, 2007 2:18 PM
> To: Zoobank Discussion List (ICZN); Zoobank Discussion List (ICZN)
> Subject: RE: [Zoobank-list] RE: ZooBank Data Objects
> 
>  Hi Chris and Rich,
>  
> While I do agree with Chris, I also agree with Rich that what 
> this means is that we need to be able to capture more than 
> one taxon concept in a single work.  A reference to someone 
> else's concept is not the same 'usage instance'.  However, 
> that is not going to be practical to tease out in most cases 
> when we are doing this markup mechanically / programmatically 
> (except in those rare instances where the author of the work 
> clarifies by differentiating between Aus bus L. 1767 and Aus 
> bus sensu Klotsch 1824).  Those clarifications are going to 
> come when/if a subject matter expert is working on something 
> and takes the time to add those bits of information--some of 
> which are a matter of interpretation and some of which are 
> clearly stated in the work.  What we need to do is to make 
> sure that in the larger 'taxonomic workspace' those things 
> can be captured, identified and correctly linked to the 
> correct 'usage instance', the correct taxon concept, etc.
>  
> Cheers,
> Anna
>  
> Anna L. Weitzman, PhD
> Informatics Branch Chief, ITO
> Informatics, Botany and Biodiversity Research National Museum 
> of Natural History Smithsonian Institution
>  
> 202.633.0846
> weitzman at si.edu
> 
> ________________________________
> 
> From: zoobank-list-bounces at afriherp.org on behalf of Richard Pyle
> Sent: Tue 06-Mar-07 6:00 PM
> To: 'Zoobank Discussion List (ICZN)'
> Subject: RE: [Zoobank-list] RE: ZooBank Data Objects
> 
> 
> 
> Hi Chris,
> 
> > Up to a point, probably; the only nagging thought I have here is that
> > in some works the author will be discussing / interpreting previous
> > usages of a name, so part of the text will refer to the current
> > author's usage instance, whilst the name elsewhere in the text will
> > refer to his interpretation (or repetition) of someone else's usage
> > instance.
> 
> Right -- but at that point you're talking about 
> cross-referencing taxonomic concepts, aren't you?  Certainly 
> there needs to be a data structure to accommodate this sort 
> of information (what I believe is called 
> "RelationshipAssertions" in TCS).  But I see this as more 
> concept-based information, than name based information.  
> Certainly it involves names (in the sense of strings of text 
> that represent names), and there needs to be a mechanism for 
> capturing the intended references to those name-strings.
> So...I'm not sure what the best answer is.
> 
> > This may
> > be getting into such obscure detail that we can never hope to parse
> > out and interpret the context, but it does become fairly apparent in
> > catalogues, for example, where one might wish to resolve the 'same'
> > name in different directions, depending on which publication is being
> > cited as its origin.
> > I guess that in all cases we have to look at cost/benefit of applying
> > a GUID (or a placeholder for one).
> 
> Yes -- none of this stuff is written in stone (yet...).  I 
> guess we need to get a clearer sense for where nomenclature 
> ends, and taxon concept information begins (if there is even 
> a way to disambiguate the two).  And yes, at some point it's 
> important to capture the core elements in a structured way, 
> punting the subtleties into a "text-blob" sort of comment or 
> annotation.  But at the same time, I understand your point 
> about the need to parse "sub-usages" (for lack of a better 
> term) within the context of a single publication.
> 
> I just wish my head didn't hurt so much whenever I try to 
> wrap it around these sorts of discussions.... :-)
> 
> Aloha,
> Rich
> 
> 
> 
> --
> This message has been scanned for viruses and dangerous 
> content by MailScanner, and is believed to be clean.
> 
> _______________________________________________
> Zoobank-list mailing list
> Zoobank-list at afriherp.org
> http://list.afriherp.org/mailman/listinfo/zoobank-list