So all I need is a number?
In the Neuroscience Information Framework (http://neuinfo.org), we often tout the importance of using unique identifiers rather than text strings, so that search engines like NIF can mitigate the ambiguity that comes with searching for strings. NIF provides access to the largest source of neuroscience information on the web, offering simultaneous search over multiple databases, catalogs and literature databases. If you search for Ca2 in NIF, you will find information on calcium, the hippocampus and a gene called CA2. Unique identifiers can disambiguate among these by assigning a unique handle to each; a sort of social security number for each thing that we want to talk about. Many groups are creating and promoting unique identifiers for all sorts of entities: people (e.g.,
ORCID) and articles (PubMed IDs), and they are very handy things. NIF itself has gotten into the business through its unique resource identifiers and antibody IDs. So all I need is a number, right? Alas, no. Numbers, like names, are not unique either. I just searched through NIF and found an antibody in the
Beta Cell Consortium Database. There was a column for "people who are using this" with a reference of 10077578. Clicking on it took me to an article in PubMed, so clearly it is a PubMed ID. Great, I thought. I want to see who else references that paper in NIF. So I typed PMID:10077578 into the NIF search interface and was able to retrieve the article in the NIF literature database. But that's not what I wanted. Most of the time, database providers don't include the prefix PMID; rather, they list just the numbers in a column labeled "Reference" or "Citation". So I typed in 10077578 and got multiple hits in the data federation from several databases. Great, I thought. Here are other sources of information that are referencing this paper. Unfortunately, one was to Novus Biologicals antibody 100-77578, and one was to the gene Rumal_1324 (GeneID: 10077578). So, clearly a number is not enough. Some sort of namespace is required; e.g., PMID:10077578 clearly tells me where I am to look. NIF should have known better and is working to resolve this glitch by identifying each number with a prefix and, in time, a full URI (
Uniform Resource Identifier, not an upper respiratory infection). The semantic web community has been working on these standards for a long time, and a discussion of the URI is beyond the scope of this post. But this is yet another example of why we at NIF encourage resource providers to think globally about their data: are we producing our data in a form that makes it easier to link individual parts of our resource to other parts?
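
To make the point concrete, here is a minimal Python sketch of the collision described above and of how a prefix resolves it. It is not how NIF's search is implemented. The function names, the toy record list, the "NovusCatalog" prefix and the URI templates are all assumptions made up for illustration, and the hyphen-stripping step is only a guess at why a bare number also matched the catalog number 100-77578.

```python
from typing import List, NamedTuple, Optional

# Assumed prefix-to-URI templates, for illustration only.
URI_TEMPLATES = {
    "PMID": "https://pubmed.ncbi.nlm.nih.gov/{id}",
    "GeneID": "https://www.ncbi.nlm.nih.gov/gene/{id}",
}


class Identifier(NamedTuple):
    prefix: Optional[str]  # namespace such as "PMID"; None for a bare number
    local_id: str          # the number itself


def parse_identifier(text: str) -> Identifier:
    """Split 'PMID:10077578' into prefix and local ID; a bare number gets prefix None."""
    head, sep, tail = text.strip().partition(":")
    if sep:
        return Identifier(head.strip(), tail.strip())
    return Identifier(None, head)


# Toy stand-ins for hits in a federated search, using the numbers from this post.
# "NovusCatalog" is a hypothetical prefix, not a registered namespace.
RECORDS = [
    Identifier("PMID", "10077578"),
    Identifier("GeneID", "10077578"),
    Identifier("NovusCatalog", "100-77578"),
]


def normalize(local_id: str) -> str:
    """Drop hyphens, imitating a search that ignores punctuation (an assumption)."""
    return local_id.replace("-", "")


def search(query: str) -> List[Identifier]:
    """Match on the number; also require the prefix to match when the query has one."""
    q = parse_identifier(query)
    return [
        r for r in RECORDS
        if normalize(r.local_id) == normalize(q.local_id)
        and (q.prefix is None or r.prefix == q.prefix)
    ]


def to_uri(ident: Identifier) -> Optional[str]:
    """Expand a prefixed identifier to a full URI, or None if the prefix is unknown."""
    template = URI_TEMPLATES.get(ident.prefix)
    return template.format(id=ident.local_id) if template else None


if __name__ == "__main__":
    print(search("10077578"))       # bare number: every record above matches
    print(search("PMID:10077578"))  # prefixed: only the PubMed record matches
    print(to_uri(parse_identifier("PMID:10077578")))
```

The bare number cannot be expanded to a URI at all, because nothing says which namespace it belongs to; that is the same ambiguity a human reader faces with an unlabeled "Reference" column.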