[sbml-discuss] What to do about this unit checking case
Michael Hucka
mhucka at caltech.edu
Thu Feb 22 13:39:12 PST 2007
I wanted to get back to the issue of syntax for indicating
units in SBML. Stefan's awesome message was wonderfully
detailed and hit all the right points. (Thanks!) Still,
one item that I dispute slightly is that assuming default
units for numbers makes solutions infeasible. We can define
a default, and tell people what it is, and how to define
other units when the default is not suitable.
Here is a proposal for the simplest way forward for L2v3:
1) Define "raw" numbers appearing in MathML expressions as
having units "dimensionless" (which is not the same as
having undefined units).
2) Tell people that if they want to introduce a number
having other units into their expressions, they should
define a parameter for that number and then use the
parameter's id in the expression instead of the number.
Features of this approach:
(a) No new syntax needs to be introduced.
(b) In some cases, people's existing math expressions will
be correct. In particular, expressions where a "raw"
number is multiplying or dividing another number will
be correct under mechanistic unit checking. However,
when raw numbers are being added or subtracted, in most
cases they would fail unit checking. In Stefan's
original example, the model would still be incorrect
because there is a mix of these two. The way to solve
it for that example is to define a parameter for at
least the "1", and the "2" could be left as-is. By
contrast, assuming that raw numbers have undefined units
would cause all cases to fail unit validation. I think
on balance defaulting to "dimensionless" in this way
will cause people less grief.
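
To make (b) concrete, here is a minimal Python sketch of mechanistic unit checking with raw numbers treated as dimensionless. It is only an illustration, not libSBML's validator: the tuple-based expression format, the names combine, units_of, and passes, and the unit assignments for k, V, and X are all assumptions made up for this example.

```python
# Units are dicts mapping a base unit name to its exponent; {} is dimensionless.

def combine(a, b, sign=1):
    """Multiply unit a by unit b (sign=1) or divide a by b (sign=-1)."""
    out = dict(a)
    for base, exponent in b.items():
        out[base] = out.get(base, 0) + sign * exponent
    return {base: e for base, e in out.items() if e != 0}


def units_of(expr, symbols):
    """Return the units of expr; raise ValueError if plus/minus operands differ."""
    if isinstance(expr, (int, float)):
        return {}                      # a raw number is dimensionless
    if isinstance(expr, str):
        return symbols[expr]           # an identifier carries its declared units
    op, *args = expr
    arg_units = [units_of(arg, symbols) for arg in args]
    if op in ("plus", "minus"):
        if any(u != arg_units[0] for u in arg_units[1:]):
            raise ValueError(f"operands of <{op}/> have different units: {arg_units}")
        return arg_units[0]
    if op == "times":
        result = {}
        for u in arg_units:
            result = combine(result, u)
        return result
    if op == "divide":
        return combine(arg_units[0], arg_units[1], sign=-1)
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical unit assignments, chosen so that the rate comes out in item/second.
SYMBOLS = {
    "X": {"item": 1},
    "V": {"litre": 1},
    "k": {"item": -1, "litre": -1, "second": -1},
    "ONE_ITEM": {"item": 1},           # the parameter that stands in for the "1"
}

# k * V * X * (X - 1) / 2 with both numbers written as literals
RATE_WITH_LITERALS = ("divide", ("times", "k", "V", "X", ("minus", "X", 1)), 2)
# The same rate law with the "1" replaced by a parameter; the "2" stays as-is.
RATE_WITH_PARAMETER = (
    "divide", ("times", "k", "V", "X", ("minus", "X", "ONE_ITEM")), 2
)


def passes(expr):
    """True if expr passes the unit check under SYMBOLS."""
    try:
        units_of(expr, SYMBOLS)
        return True
    except ValueError:
        return False
```

Under these assumptions, passes(RATE_WITH_LITERALS) is False because X - 1 mixes "item" with "dimensionless", while passes(RATE_WITH_PARAMETER) is True even though the "2" is still a raw number.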
My proposal for L3 (it could optionally be done for L2v3 but
I think it would have to be voted on, which would take more
time, and I think no one wants to delay L2v3 further):
1) Get rid of the UnitKind enumeration, and instead, define
UnitKind's current values as being reserved words in
UnitSId.
2) Change the data type of the Unit "kind" field to be
UnitSId. (Combination of #1 and #2 thereby puts all
units into the same identifier space, for #3.)
3) Define the ability to put unit attributes on MathML
elements in this form:
<cn sbml:units="mole"> 1 </cn>
Bonus: if the sbml:units attribute is permitted anywhere
in the MathML, then units of more than just <cn> could be
defined uniformly.
4) Define "raw" numbers appearing in MathML expressions as
having units "dimensionless" (which is not the same as
having undefined units).
5) Tell people that if they want to introduce a number
having other units into their expressions, they can
either define a parameter for that number and then use
the parameter's id in the expression instead of the
number, or else use the sbml:units attribute.
I would personally rather do this 2nd alternative, because
it's more powerful and (I wager) less cumbersome, but it
would delay the issuing of L2v3.
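
For illustration, a checker could read the proposed sbml:units attribute roughly as sketched below, in Python with the standard xml.etree.ElementTree module. The namespace URI bound to the sbml prefix and the function name literal_units are assumptions made for this example; the proposal above does not fix them. A <cn> without the attribute falls back to "dimensionless", as in item 4.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
# Assumed binding for the sbml: prefix; the proposal does not name a URI.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"

# Stefan's rate law k * V * X * (X - 1) / 2, with units attached to the "1" only.
EXAMPLE = f"""
<math xmlns="{MATHML_NS}" xmlns:sbml="{SBML_NS}">
  <apply>
    <divide/>
    <apply>
      <times/>
      <ci> k </ci> <ci> V </ci> <ci> X </ci>
      <apply>
        <minus/>
        <ci> X </ci>
        <cn sbml:units="item"> 1 </cn>
      </apply>
    </apply>
    <cn> 2 </cn>
  </apply>
</math>
"""


def literal_units(mathml_text):
    """Return (number text, unit id) for every <cn>, in document order."""
    root = ET.fromstring(mathml_text.strip())
    found = []
    for cn in root.iter(f"{{{MATHML_NS}}}cn"):
        # A number without the attribute is taken to be dimensionless.
        units = cn.get(f"{{{SBML_NS}}}units", "dimensionless")
        found.append((cn.text.strip(), units))
    return found
```

With this input, literal_units(EXAMPLE) returns [("1", "item"), ("2", "dimensionless")], so no separate parameter is needed for the "1".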
Does anyone have any objections or see any problems with the
first approach?
MH
>>>>> On 12 Feb 2007, Stefan Hoops <shoops at vbi.vt.edu> wrote:
shoops> Dear Darren:
shoops>
shoops> I would like to pick up on your example since it is a
shoops> real one frequently encountered (even in SBML). I
shoops> modified it a little to conform with the
shoops> definition of the kineticLaw of SBML, i.e., it
shoops> looks like
shoops>
shoops> k * V * X(X-1)/2
shoops>
shoops> Here k is the kinetic constant in the appropriate
shoops> units and V the volume of the reaction
shoops> compartment. The MathML for this rate law may be
shoops> found below
shoops>
shoops> <math xmlns="http://www.w3.org/1998/Math/MathML">
shoops>   <apply>
shoops>     <divide/>
shoops>     <apply>
shoops>       <times/>
shoops>       <ci> k </ci>
shoops>       <ci> V </ci>
shoops>       <ci> X </ci>
shoops>       <apply>
shoops>         <minus/>
shoops>         <ci> X </ci>
shoops>         <cn> 1 </cn>
shoops>       </apply>
shoops>     </apply>
shoops>     <cn> 2 </cn>
shoops>   </apply>
shoops> </math>
shoops>
shoops> This MathML includes two cn elements of which one
shoops> <cn> 2 </cn> is dimensionless and the other <cn> 1
shoops> </cn> must have the same units as X. It is easy
shoops> for a human to detect the needed units for the
shoops> above example.
shoops>
shoops> As Mike stated in his initial post to the mailing
shoops> list, defining the units of numbers in MathML
shoops> implicitly has at least one major problem. Let us
shoops> assume that the units of k are incorrectly given
shoops> in the SBML document. In this case the implicit
shoops> unit definition of <cn> 2 </cn> would compensate for
shoops> this obvious error and thus prevent the unit
shoops> checking which SBML facilitates. This is clearly
shoops> not the intent of implicit unit definition for
shoops> numbers.
shoops>
shoops> Please note that the above example can be written
shoops> in SBML without any problems. We just need to
shoops> replace <cn> 1 </cn> with <ci> ONE_ITEM </ci> and
shoops> define ONE_ITEM as a global parameter with the
shoops> appropriate units. This approach is clearly
shoops> cumbersome, but it is the only correct way in SBML
shoops> as it stands now. Implicit unit definitions were
shoops> brought up to make writing correct SBML
shoops> easier. However, the above problem (at least in my
shoops> opinion) makes them infeasible.
shoops>
shoops> What other options do we have?
shoops>
shoops> 1) We could add an attribute to the cn element,
shoops> similar to CellML:
shoops>   <cn sbml:units="UnitSId"> 1.0 </cn>
shoops> In this case the default value should be
shoops> sbml:units="dimensionless", which currently exists
shoops> only as a unit kind and not as a UnitSId. This,
shoops> however, is only a minor problem.
shoops>
shoops> 2) We can adopt the proposal for MathML which can
shoops> be found at:
shoops>   http://www.w3.org/TR/mathml-units/
shoops> This basically introduces text into the content
shoops> markup, i.e., it is by itself not structured
shoops> enough for SBML needs.
shoops>
shoops> 3) We could allow UnitSId to be valid within a ci
shoops> element. This creates the problem that UnitSIds
shoops> and SIds are in separate name spaces and thus we
shoops> may have ambiguity. However, we could remove this
shoops> problem by having just one name space for all SIds.
shoops>
shoops> 4) We define a third csymbol:
shoops>   http://www.sbml.org/sbml/symbols/units
shoops> and allow only UnitSIds as values for this csymbol.
shoops> In case that we need a number with units in MathML
shoops> this expands to:
shoops> <apply>
shoops>   <times/>
shoops>   <cn> 1 </cn>
shoops>   <csymbol encoding="text"
shoops>            definitionURL="http://www.sbml.org/sbml/symbols/units">
shoops>     SUBSTANCE_UNITS
shoops>   </csymbol>
shoops> </apply>
shoops>
shoops> I would like to hear your thoughts on the
shoops> suggested solutions or, even better, other simpler
shoops> ones :)
shoops>
shoops> Thanks, Stefan
shoops>
shoops>
shoops> On Sun, 11 Feb 2007 21:59:40 +0000 (GMT) Darren
shoops> Wilkinson <darrenjwilkinson at btinternet.com> wrote:
shoops>
>> Yes, units really are tricky! I don't have a strong
>> view on what to do about them (yet!), but I always
>> think it helps clarify things to think about my
>> favourite rate law:
>>
>> X(X-1)/2
>>
>> Here it is assumed that X has units of "item". Then the
>> "1" also has units of "item", but the "2" is
>> dimensionless. This difference really matters if you
>> decide to convert units to (say) moles, as then the "1"
>> needs converting (to become the reciprocal of Avogadro's
>> constant), but the 2 remains unchanged. It would
>> obviously be highly desirable for any unit
>> checking/conversion capability in SBML/libSBML to be
>> able to cope with such issues... However, it would be a
>> bit messy to have to define a "one" and a "two"
>> externally with appropriate units just to be able to
>> construct this rate law...
>>
>> Regards,
>>
>> Darren
>>
>>
>>
>> --- Michael Hucka <mhucka at caltech.edu> wrote:
>>
>> > The SBML editors had an off-line discussion which we
>> > realized needs to be put on sbml-discuss. It
>> > concerns an issue that Sarah Keating ran into while
>> > implementing unit validation in libSBML.
>> >
>> > The issue is how the units of pure numbers in MathML
>> > should be treated. Here's an example, in the case of
>> > the 'delay' field on Events:
>> >
>> > <delay>
>> > <math
>> > xmlns="http://www.w3.org/1998/Math/MathML">
>> > <cn> 1 </cn>
>> > </math>
>> > </delay>
>> >
>> > Note the literal "1". What units, if any, should be
>> > assumed for the number?
>> >
>> > One option is to treat literal numbers as always
>> > being without units. This is a valid viewpoint, but
>> > it leads to more model validation failures. Although
>> > modelers should define parameters (which can have
>> > defined units) for a lot of the cases where they
>> > often write numbers directly, the practical reality
>> > is that many people won't think of doing that.
>> >
>> > A second option is to take the units to be whatever
>> > units are defined as the default for the field in
>> > question, or if there is an associated field
>> > defining the units, then whatever that field says
>> > the units are. (For instance, there is a 'timeUnits'
>> > field on Events that could apply in this example.)
>> > The logic behind this idea is that even though the
>> > user didn't specify the units explicitly, a
>> > reasonable interpretation of their intention is that
>> > the units are meant to be whatever the defaults are.
>> > In other words, the modeler didn't mean "1" without
>> > units, they meant "1 in the units of time" for this
>> > example.
>> >
>> > Sounds reasonable? But it gets more complicated.
>> > Imagine now something like
>> >
>> > <math xmlns="http://www.w3.org/1998/Math/MathML">
>> > <apply>
>> > <plus/> <ci> S </ci> <cn> 2 </cn>
>> > </apply>
>> > </math>
>> >
>> > For the purpose of this second example, the "S" can
>> > be anything (e.g., a species identifier) and the
>> > context can be anything, not just the Events example
>> > above.
>> >
>> > The question is, what should be assumed for the units
>> > of "2"?
>> >
>> > If we follow the same guideline, i.e., that the
>> > literal number has whatever units are appropriate for
>> > the situation, then it creates validation problems.
>> > First, it implies a unit validator must deduce the
>> > expected units somehow, and this may sometimes be
>> > impossible due to ambiguities in a given situation.
>> > Second, Stefan Hoops pointed out it opens a hole in
>> > validation: it effectively means that numbers in
>> > MathML are assumed to always have the appropriate
>> > units, so unit checking becomes worthless.
>> >
>> > To cope with this, we are tentatively thinking of
>> > using the following set of guidelines:
>> >
>> > 1) If the number is the only thing contained inside a <math>
>> > (such as the original example involving a delay on
>> > Events), assume that the units of the number are
>> > the units defined for that field (so, timeUnits
>> > for delays).
>> >
>> > 2) In all other cases, assume pure numbers have no
>> > units.
>> >
>> > This handles the first case, and also addresses the
>> > second case. (In the second case, there would be a
>> > unit validation failure because the "2" would not
>> > have units and therefore the overall expression could
>> > not have consistent units.)
>> >
>> > The downside is that this "rule" is defined with an
>> > exception.
>> >
>> > What do the rest of you SBMLers think?
>> >
>> > Whatever is decided, the next SBML specification will
>> > have to spell out the guidelines precisely.
>> >
>> > MH
>> >
>> > ____________________________________________________________
>> > To manage your sbml-discuss list subscription, visit
>> > https://utils.its.caltech.edu/mailman/listinfo/sbml-discuss
>> >
>> > For a web interface to the sbml-discuss mailing list,
>> > visit http://sbml.org/forums/
>> >
>> > For questions or feedback about the sbml-discuss
>> > list, contact sbml-team at caltech.edu.
>> >
>>
>>
>> -- Darren Wilkinson
>> email: darrenjwilkinson at btinternet.com
>> home www: http://www.darrenjwilkinson.btinternet.co.uk/
>> work www: http://www.staff.ncl.ac.uk/d.j.wilkinson/
shoops>
shoops>
shoops> -- Stefan Hoops, Ph.D.
shoops> Senior Project Associate
shoops> Virginia Bioinformatics Institute - 0477
shoops> Virginia Tech
shoops> Bioinformatics Facility I
shoops> Blacksburg, Va 24061, USA
shoops>
shoops> Phone: (540) 231-1799 Fax: (540) 231-2606
shoops> Email: shoops at vbi.vt.edu