Joergen Albertsson and I have been discussing the "units problem" of data items such as _refine_diff_density_* and _refln_F_* etc. Joergen is here for a couple of months while I am in the faculty office... and these discussions have been very useful in providing a non-cif-expert sounding board on the clarity of definitions etc.

I sympathise completely with the concerns of Brian and David about "generic" definitions, because I am involved in developments where linkages between definitions will complicate searches... not significantly, I might add, if there are DDL attributes that specify what the link is (more on that later). However, I am just as concerned about the proliferation of data items that will follow if we define a new item for each quantity which differs only in the units. AND, as an application person I have to say that I see just as serious problems with having to search for N names rather than 1 to access a given data quantity, even when the units of that item have to be deduced from another place.

So we seem to be "between a rock and a hard place" with these choices. Joergen prefers the generic approach because he considers it to be more intuitive and, as he says, it will keep the number of items that have to be listed in the Notes for Authors to an absolute minimum! I believe that we all have to sympathise with this human(e) view!

I would like to quickly explore both possibilities, and it will quickly become obvious which direction I prefer.

(1) The multiple definition approach
------------------------------------

Here are the current definitions with units involving electrons.

    _refine_diff_density_max
    _refine_diff_density_min
    _refine_diff_density_rms
    _refln_F_meas
    _refln_F_calc
    _refln_F_sigma
    _refln_F_squared_calc
    _refln_F_squared_meas
    _refln_F_squared_sigma
    _refln_A_calc
    _refln_A_meas
    _refln_B_calc
    _refln_B_meas
    _atom_type_scat_dispersion_imag
    _atom_type_scat_dispersion_real
    _exptl_crystal_F_000

It's not complicated to add....

    _refine_xd_diff_density_max
    _refine_xd_diff_density_min
    _refine_xd_diff_density_rms
    _refln_xd_F_meas
    _refln_xd_F_calc
    ...

or

    _refine_diff_density_xd_max
    _refine_diff_density_xd_min
    _refine_diff_density_xd_rms
    _refln_F_xd_meas
    _refln_F_xd_calc
    ...

and

    _refine_nd_diff_density_max
    _refine_nd_diff_density_min
    _refine_nd_diff_density_rms
    _refln_nd_F_meas
    _refln_nd_F_calc
    ...

and

    _refine_ed_diff_density_max
    _refine_ed_diff_density_min
    _refine_ed_diff_density_rms
    _refln_ed_F_meas
    _refln_ed_F_calc
    ...

but do we really honestly want to add 48 new items!!!!!?

(2) The generic definition approach
-----------------------------------

Here is the current definition of

data_refine_diff_density_
    loop_ _name            '_refine_diff_density_max'
                           '_refine_diff_density_min'
                           '_refine_diff_density_rms'
    _category              refine
    _type                  numb
    _type_conditions       esd
    _units                 e_A^-3^
    _units_detail          'electrons per cubic angstrom'
    _definition
;
    The largest, smallest and root-mean-square-deviation, in
    electrons per angstrom cubed, of the electron density in the
    final difference Fourier map. The *_rms value is measured with
    respect to the arithmetic mean density, and is derived from
    summations over each grid point in the asymmetric unit of the
    cell. This quantity is useful for assessing the significance of
    *_min and *_max values, and also for defining suitable contour
    levels.
;

These items can be easily redefined as... and this is an example for all of the electron-dependent items listed above.

data_refine_diff_density_
    loop_ _name            '_refine_diff_density_max'
                           '_refine_diff_density_min'
                           '_refine_diff_density_rms'
    _category              refine
    _type                  numb
    _type_conditions       esd
    _units_construct       (_diffrn_radiation_scat_units)_A^-3^
    _units_default         e_A^-3^
    _definition
;
    The largest, smallest and root-mean-square-deviation, of the
    density in the final difference Fourier map. The *_rms value is
    measured with respect to the arithmetic mean density, and is
    derived from summations over each grid point in the asymmetric
    unit of the cell.
    This quantity is useful for assessing the significance of *_min
    and *_max values, and also for defining suitable contour levels.
    The units of density are defined in _diffrn_radiation_scat_units
    per angstrom cubed.
;

OK, this introduces two new DDL1 attributes... a bit scary some might say... but I foreshadow that it's only the start of many more such cross linkages which will be introduced in future versions of the DDLs, as Nick, Ian, John and I work on new "methods" paradigms. Such considerations are going on now. Everyone in this learned group appreciates, and indeed has reminded me on occasions, that some data items are defined in a decidedly clumsy way, especially for electronic parsing.

So please... before launching any missiles in this direction for supposedly moving the definition goal posts, understand that adding attributes to the definition language helps preserve the current naming of data items and PREVENT the proliferation of new items for nonsense reasons... such as in this case. Indeed the problems with the data items we are discussing here are a classical example of what happens when the description language is not rich enough. And as we get to understand how to improve this richness intelligently, the actual number of data items we will need should decrease rather than increase.

I have missed an important part of this proposal. What about the single new data item _diffrn_radiation_scat_units and the definition of the two new DDL items _units_construct and _units_default? In the cif core dictionary we will need...

data_diffrn_radiation_scat_units
    _name                  '_diffrn_radiation_scat_units'
    _category              diffrn_radiation
    _type                  char
    _list                  both
    _list_reference        '_diffrn_radiation_wavelength_id'
    loop_ _enumeration
          _enumeration_detail    e     electrons
                                 fm    femtometres
                                 V     volts
    _definition
;
    The scattering units associated with the radiation type as
    defined in _diffrn_radiation_probe.
;

and in the DDL1 dictionary...

data_units_construct
    _definition
;
    String of characters specifying the construction of the units of
    the defined data item. The construction is composed of two
    entities:
        (1) data names
        (2) construction characters
    The rules of construction conform to the regular expression
    (REGEX) specifications detailed in the IEEE document P1003.2
    Draft 11.2 Sept 1991 (ftp file '/doc/POSIX/1003.2/p121-140').
;
    _name                  '_units_construct'
    _category              units_construct
    _type                  char
    _example               (_cell_length_unit)^3^
    _example_detail        'the units for _cell_volume'

data_units_default
    _definition
;
    A unique code which identifies the default units of the defined
    data item. This should be used if the value cannot be specified
    from the value of _units_construct.
;
    _name                  '_units_default'
    _category              units
    _type                  char
    _example               e_A^-3^
    _example_detail        'electrons per angstrom cubed'

OK, the above does require more intelligent parsing but this had to come. Joergen considers it much better than the proliferation alternative... and so do I (:->).

Seriously though, we really do need to address the immediate and long term consequences of such proliferation. I want the existing data names to remain active, AS DO ALL OF THE OTHER PRESENT SOFTWARE DEVELOPERS! And since most don't use the DDL in the definitions actively, the proposed new attributes won't inconvenience them. It's the next generation of cif developers that we need to look out for, and that's who a richer definition language should be aimed at.

Herbert Bernstein and John Westbrook may not be too chuffed by this proposal at first sight, but knowing them they will turn it into an opportunity for even better cif tools.

Whatever happened to the day of rest?

Cheers, Syd.
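
As a rough illustration of what the "more intelligent parsing" mentioned above could involve, here is a minimal Python sketch of resolving a _units_construct string. It is not part of the proposal: the function name resolve_units, the rule that a parenthesised data name is replaced by its value, and the fallback to _units_default when that data name is absent are all assumptions made for illustration.

```python
import re

# Assumption: a data name inside parentheses, e.g. (_cell_length_unit),
# marks a place where that item's value is substituted into the units string.
_NAME_IN_PARENS = re.compile(r"\((_[A-Za-z0-9_]+)\)")


def resolve_units(construct, default, data):
    """Return a units code for a data item (hypothetical helper).

    construct: value of _units_construct, or None if the attribute is absent
    default:   value of _units_default
    data:      dict mapping data names to the values read from a CIF
    """
    if construct is None:
        return default

    missing = False

    def substitute(match):
        nonlocal missing
        value = data.get(match.group(1))
        if value is None:
            # The referenced item is not in the file; remember that so the
            # caller gets the default units instead of a partial string.
            missing = True
            return ""
        return value

    resolved = _NAME_IN_PARENS.sub(substitute, construct)
    return default if missing else resolved


if __name__ == "__main__":
    neutron_cif = {"_diffrn_radiation_scat_units": "fm"}
    print(resolve_units("(_diffrn_radiation_scat_units)_A^-3^", "e_A^-3^", neutron_cif))
    print(resolve_units("(_diffrn_radiation_scat_units)_A^-3^", "e_A^-3^", {}))
```

Under these assumptions an application still looks up only _refine_diff_density_max, whatever the radiation, and obtains the units from one extra lookup. Existing files that carry no _diffrn_radiation_scat_units value fall back to the default of electrons per angstrom cubed.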