
Re: Items for the Agenda of the COMCIFS closed meeting

I would like to suggest that the real question is not the future direction
of CIF, but the future direction of information management in crystallography
and allied sciences.  The issues of category key naming and the relationship
between DDL1 and DDL2 or between parent and child data items are not central issues
for our science.   The issue of how well our data can move into and among
experiment control systems, databases and publications is.  The definition
of CIF and the structure of its supporting software should be moving in
a direction that supports efficient and reliable data management in 

John and Syd made an important contribution to CIF when they made DDL2 CIFs
better able to carry the information needed to load SQL-based databases.
dREL will add more useful features.  We should be asking what additional
functional capabilities, if any, we may need, and then working to create
a single framework with supporting software within which all the features
we need are easily available to working crystallographers, archivists and

At 3:17 PM -0500 3/21/05, David Brown wrote:
>To members of COMCIFS
>I would like to place the following two topics on the agenda for the 
>closed meetings in Florence.  I welcome suggestions for other agenda 
>1. What is the role of CIF in the current rapidly changing world of 
>information technology?
>2. How can we make transparent the boundary between CIFs written 
>with DDL1 dictionaries and those written with DDL2?
>David Brown
>It should be no surprise that an information technology language 
>adopted in 1990 needs to be reviewed after fifteen years of 
>operation.   The rapid advances in the field and the introduction of 
>XML make such a review more than timely.  A further urgency is added 
>by the need to ensure that incremental changes that we make in the 
>dictionaries and other documents are compatible with future 
>directions of crystallographic information technology.  Two current 
>problems illustrate how this impacts on dictionary structures.
>1. Is it better to have a semantically meaningless item as the 
>_list_reference (DDL1) or _category_key (DDL2) to label each line in 
>a loop, or should we use semantically meaningful items (such as 
>_atom_site_label) that are already present?  The former solution 
>allows more straightforward programming and avoids possible 
>conflicts between the information technology and crystallographic 
>use of the item, but the latter leaves the CIF less cluttered and 
>easier for humans to follow because the links are more readily 
>followed by eye.  The current revision of the core dictionary needs 
>an answer to this question, because the answer will affect future 
>CIF data structures.
>2. Should there be rules defining the relationships that are allowed 
>to be expressed by parent-child links?  These links have been 
>developed in an ad hoc way, but as we move towards more advanced 
>data structures, we may find that we have developed links that are 
>impossible to manipulate.  One way of exploring the logic of the 
>linked structures is to use the Resource Description Framework (RDF) 
>which is being developed as part of the Semantic Web (see 
>http://www.w3.org/RDF/ and http://www.w3.org/RDF/FAQ ).  This scheme 
>expresses the parent-child links as a graph, making it easier to 
>trace the logic.  Another possibility is to use the Unified Modeling 
>Language ( www.uml.org ).
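
To make the graph view of parent-child links concrete, here is a minimal Python sketch. It is not an RDF or UML tool, only a plain dictionary of child-to-parent links with a walk that reports the chain of parents and refuses to follow a circular link. The item names and the function name `ancestry` are illustrative assumptions and have not been checked against any dictionary.

```python
from typing import Dict, List

# Illustrative child -> parent links (assumed for this sketch, not taken
# from any published dictionary).
PARENT_OF: Dict[str, str] = {
    "_geom_bond.atom_site_label_1": "_atom_site.label",
    "_atom_site_aniso.label": "_atom_site.label",
}


def ancestry(item: str, parent_of: Dict[str, str]) -> List[str]:
    """Return the item followed by its chain of parents.

    Raises ValueError if the links loop back on themselves, which is one
    kind of link structure that would be impossible to manipulate.
    """
    chain = [item]
    seen = {item}
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in seen:
            raise ValueError(f"cyclic parent-child link at {parent}")
        seen.add(parent)
        chain.append(parent)
    return chain


if __name__ == "__main__":
    print(ancestry("_geom_bond.atom_site_label_1", PARENT_OF))
```

A rule set for allowed relationships could start from checks of this kind, for example that every chain ends at an item with no parent.
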
>As interest focuses on software that explores the interactions of 
>small and large molecules, the incompatibility between the 
>Dictionary Definition Language 1 (DDL1) and DDL2 is becoming a 
>CoreCIF is designed for use with small molecules and is written in 
>DDL1, but mmCIF, designed for reporting macromolecules, is written 
>using DDL2.  While most of the features of the two standards are 
>similar, there are two significant differences:  first, DDL2 has a 
>tighter structure designed to make automatic computer manipulation 
>of the information easier; second, the names given to the data 
>items have a different structure.  As the similarities between the 
>two languages are far greater than their differences, it should be 
>possible to achieve some convergence;  already the core dictionary 
>is evolving towards the DDL2 standard, but a complete convergence 
>would require major reworking of some dictionaries.
>Convergence can be achieved in different ways.  One way is to ensure 
>that software is able to validate CIFs against both DDL1 and DDL2 
>dictionaries, and since the dictionaries contain synonyms of the 
>data names (alternative data names for items with essentially the 
>same definition, listed under _related_item (DDL1) and 
>_item_aliases.alias_name (DDL2)), any character string used to 
>represent a particular data name should be recognized by software 
>that takes note of any alias names present regardless of the 
>dictionary or version being used.   Since all the items in the 
>coreCIF dictionary appear (transformed to DDL2) in the mmCIF 
>dictionary with their original DDL1 data names given as aliases, 
>mmCIF software should be able to read coreCIFs without difficulty. 
>mmCIF aliases are currently not present in the coreCIF dictionary 
>but could easily be added.  Alternatively, a DDL2 version of the 
>coreCIF dictionary could be separated out and used as an alternative 
>to the DDL1 core dictionary.
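
The alias approach described above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not existing CIF software: the alias table is typed in by hand rather than read from a dictionary's _item_aliases.alias_name entries, and the function names `canonical_name` and `normalize` are hypothetical.

```python
from typing import Dict

# Hand-written DDL1 -> DDL2 alias table (an assumption for this sketch;
# real software would build it from the dictionary's alias entries).
ALIASES: Dict[str, str] = {
    "_atom_site_label": "_atom_site.label",
    "_cell_length_a": "_cell.length_a",
}


def canonical_name(name: str, aliases: Dict[str, str] = ALIASES) -> str:
    """Map a data name to its canonical form; unknown names pass through.

    Names are lower-cased first because CIF data names are
    case-insensitive.
    """
    key = name.lower()
    return aliases.get(key, key)


def normalize(block: Dict[str, str],
              aliases: Dict[str, str] = ALIASES) -> Dict[str, str]:
    """Rewrite every data name in a block to its canonical form.

    Raises ValueError if two spellings of the same item carry
    different values.
    """
    out: Dict[str, str] = {}
    for name, value in block.items():
        canon = canonical_name(name, aliases)
        if canon in out and out[canon] != value:
            raise ValueError(f"conflicting values for {canon}")
        out[canon] = value
    return out


if __name__ == "__main__":
    print(normalize({"_cell_length_a": "5.43", "_cell.length_b": "5.43"}))
```

Software that normalizes names this way before validation would treat either spelling of an item as the same item, whichever dictionary the CIF was written against.
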
>comcifs mailing list

  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

