Explorations with a VOEvent Ontology for a User-Annotated Solar Catalogue
Elizabeth Auden, VOTech
13 November 2005
Introduction
Solar event catalogues provide spatial, time, spectral and coordinate information for occurrences on the Sun such as flares, coronal mass ejections (CMEs) and solar waves. Because solar events are characterised by this metadata, they are well suited to description as VOEvent packets. Most event catalogues are maintained by scientists associated with specific facilities or instruments, such as the Yohkoh SXT / TRACE flare list [1], the Hessi flare list [2], and the NOAA SGAS energetic event list [3]. As new solar missions are launched that produce increasingly large datasets, more solar events are likely to be discovered by scientists who analyse mission datasets but are not formally associated with the mission in question. Therefore, a solar physicist at MSSL has suggested the development of an online solar event catalogue that could receive contributions from any member of the solar community. The Solar User-Annotated VOEvent Catalogue (SuaveCAT) will be implemented as VOTech research into the use of a VOEvent ontology with practical space science applications.
Background
The EGSO Solar Event Catalogue (SEC) [4] has become quite popular as solar researchers have begun to interact with virtual observatories. This event catalogue combines other online catalogues, such as the Yohkoh, Hessi and NOAA catalogues mentioned above, into a single searchable interface. Users may search one or two catalogues at a time by event start and end times, or they may freely search the combined system using SQL queries. The EGSO SEC has been integrated with AstroGrid using the DataSet Access (DSA) [5] module. This allows users to incorporate searches of the SEC into larger workflows, such as the AstroGrid solar movie maker. In order to integrate successfully with solar workflows, the SuaveCAT facility should offer similar event metadata to the catalogues incorporated in the EGSO SEC.
In addition to the first requirement that virtual observatory users be able to update this event catalogue, two further requirements were imposed to gain experimental value from the project. The second requirement was the ability to both search the catalogue and add new events to it using the AstroGrid infrastructure. Third, as the event metadata contained in most solar event catalogues overlapped with concepts encapsulated in the VOEvent schema developed by the International Virtual Observatory Alliance [6], it was decided that the SuaveCAT project would provide a good base for investigating a VOEvent ontology and accompanying software agents.
This paper examines the initial development of an ontology upon which to base the catalogue, development of the catalogue itself, and the configuration of AstroGrid tools to search the catalogue, add to the catalogue, and retrieve catalogue entries as VOEvent packets. In addition, I hope to raise several questions that can be explored in further work on SuaveCAT: what value can an ontology add to a space event catalogue? Are there science issues that an ontology-based software agent can answer better than an SQL query?
Ontologies
STC UML to OWL
The first draft of an ontology based on the VOEvent schema 0.90 [7] was developed in June 2005. Several issues were raised during discussion on the IVOA DM mailing list [8]. Two such issues were conducive to simultaneous investigation. First, subelements of the WhereWhen element of the VOEvent schema could “in general, be any legal VO STC expression.” [9] The Space-Time Coordinates Metadata for the Virtual Observatory (STC) is an IVOA schema that provides a precise format in which to specify the spatial, time, and spectral information for a VO resource [10]. Although this detailed schema had not yet been encoded as an OWL file that could be imported into a VOEvent ontology, it was available as a series of UML diagrams. This tied in with a second issue announced on the IVOA DM mailing list: Dragan Gasevic’s new XSLT tool could translate an XMI file to OWL, and this tool could be used in a wider context to convert UML diagrams to XMI and finally OWL [11].
Converting the existing STC UML diagrams to OWL with tools seemed to offer potentially large savings in effort compared to building an STC ontology by hand from the schema. Arnold Rots kindly provided me with STC UML diagrams constructed in Microsoft Visio 2003. Microsoft offered a Visio plugin, XMIExprt, that could convert UML static structure diagrams to XMI. After some trial and error, I was able to build the XMIExprt plugin (with minor edits) in Microsoft Visio 2005 Beta, install the plugin in Visio 2003, and export the STC UML diagrams to a single XMI file [12]. Turning to Gasevic’s XMItoOWL.xslt tool, I read that the tool works best with XMI files created with the UML software Poseidon [13]. Applying the XMItoOWL.xslt tool to the STC.xmi file exported with Visio 2003 produced an OWL file containing only XML namespace declarations. I tried opening STC.xmi in Poseidon and re-exporting the file as XMI, but unfortunately the tool produced the same results. I was unable to convert the STC UML diagrams to an OWL file using these methods.
VOEvent 1.0 to OWL
Rather than forging ahead with the creation of an STC ontology by hand, I turned back to the VOEvent ontology. Between June and October 2005, the VOEvent schema had grown from version 0.90 to 1.0 [14]. The initial VOEvent ontology was updated to be contemporary with the 1.0 schema using the ontology tool Protégé [15].
The VOEvent ontology is based on three concepts detailed in the Protégé User Tutorial: classes, object properties, and datatype properties [16]. Each element and subelement of the VOEvent ontology is represented by a class. Relationships between elements are represented with object properties; the most common relationship in this ontology is “has[SubElement]”. Each element has one has[SubElement] object property for each subelement it contains; this design decision was debated on the IVOA DM mailing list. Strictly defining each VOEvent element in terms of a specific number and type of subelements will hopefully aid the software agents and correlation tools later on. Finally, the XML attributes of relevant elements in the VOEvent schema have been included in the ontology as functional datatype properties. Declaring a property “functional” restricts each element to at most one value for the corresponding attribute.
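The three building blocks described above can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not Protégé output or OWL: the `OntClass` and `Individual` types and the `violations` helper are hypothetical names introduced here, and the only rule checked is that a functional datatype property carries no more than one value.

```python
from dataclasses import dataclass, field


@dataclass
class OntClass:
    """A toy ontology class (hypothetical type, for illustration only)."""
    name: str
    # Object properties: "has<SubElement>" -> name of the target class.
    object_properties: dict = field(default_factory=dict)
    # Names of datatype properties that are declared functional.
    functional_properties: set = field(default_factory=set)


@dataclass
class Individual:
    """An instance of an OntClass with datatype property values."""
    of_class: OntClass
    values: dict = field(default_factory=dict)  # property -> list of values

    def violations(self):
        """Return the functional properties that carry more than one value."""
        return sorted(
            prop
            for prop in self.of_class.functional_properties
            if len(self.values.get(prop, [])) > 1
        )


# Param carries "name" and "value" as functional datatype properties.
param = OntClass("Param", functional_properties={"name", "value"})
# What is linked to Param through a "has" object property.
what = OntClass("What", object_properties={"hasParam": "Param"})

ok = Individual(param, {"name": ["ARN"], "value": ["1024"]})
bad = Individual(param, {"name": ["ARN", "InstrumentName"], "value": ["1024"]})

print(ok.violations())   # []
print(bad.violations())  # ['name']
```

An OWL reasoner would express the same constraint declaratively; the sketch only shows what the constraint means for a single instance.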
Solar VOEvent Catalogue Ontology
A VOEvent packet contains up to eight subelements: Who, What, WhereWhen, Why, How, Citations, Description, and Reference. An individual VOEvent packet may contain at most one of each of these subelements. Five of these subelements were chosen for inclusion in the solar VOEvent catalogue ontology.
The Who element provides curation information encapsulated in PublisherID, Contact, and Date elements. Although the Contact element can contain a number of subelements that provide address, email and telephone information, for simplicity in the catalogue only two subelements were chosen: Name and Institution. Therefore, a Who class was created with “has” relationships to PublisherID, Date, and Contact classes. The Contact class has “has” relationships with Name and Institution.
The What element contains observational information; this may include Param elements grouped under Group elements, individual Param elements, References and Descriptions. For this ontology, the What class only has a “has” relationship with the Param class. This reflects the structure of the catalogue; observational information, here restricted to “ARN” (active region number) and “Instrument Name”, can be encapsulated in Param classes using name and value functional datatype properties without further need for Reference or Description classes. “Instrument Name” was included as a Param class under What instead of as a Reference under How for the user’s ease. Existing solar event catalogues simply include the name of the mission and instrument rather than a URI pointing to the instrument’s description. For this reason, the How element was included in the solar VOEvent ontology, but it is not used in the catalogue.
Further observational data is described inside the WhereWhen element. This element contains space-time coordinate metadata as described in the STC schema, such as spatial and time frames, observation time data, coordinates, and spectral data. Until an STC ontology has been built that can be imported into a VOEvent ontology, specific elements relevant to solar observations were chosen and encapsulated in an ObservationLocation class. This class has “has” relationships with AstroCoords, AstroCoordSystem, and AstroCoordArea classes. The AstroCoordSystem class contains SpaceFrame and TimeFrame classes. In the SpaceFrame class, the chosen spatial frame for this ontology is “HGC”, or heliographic coordinates, and “TOPOCENTER” indicates the position of the instrument. The TimeFrame class has TimeScale “UTC” for Coordinated Universal Time, “TOPOCENTER” indicating instrument time, and a Name that can be filled as “Time”. The AstroCoords class contains latitude and longitude data inside the Position2D class along with spectral units, name, value, and error data inside the Spectral class. Finally, the AstroCoordArea class contains time information; within a TimeInterval class, StartTime and StopTime classes each contain ISOTime classes in which users can add the event’s start and stop times as ISO 8601 dates.
The Why element contains Concept and Name subelements that can be grouped under an Inference element. In this ontology, the Why class has a “has” relationship with Inference, which in turn has “has” relationships with Concept and Name. Concept can be used in the solar VOEvent catalogue to describe the event type, such as flare, CME, or wave. Users could be more specific with the Concept class and mark an event as a “Class B X-ray Flare”. As work with this ontology develops, it may be sensible to separate broad event types from specific event classes. The Name class may be used to record an event name if a particularly notable event later acquires one, such as “The Valentine’s Day Flare”. This instance of the Name class may prove to be unnecessary to the catalogue.
Finally, the Citations element was created as a class with “has” relationships to EventID and Description along with a functional datatype property for “reason” (supersedes, followup, retraction). The EventID class allows users to specify events from either the SuaveCAT resource or other solar event catalogues that may be related to the catalogue entry being made, and a user can expound upon the relationship between two events with the Description class. The Citations class will be important to the solar VOEvent catalogue as future development with event correlation tools reveals not only events observed with different instruments in multiple catalogues, but also relationships between flares and coronal mass ejections.
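The “has” relationships described in this section can be collected into a single structure and walked mechanically. The sketch below is an illustration only: the `HAS` dictionary and `paths` helper are hypothetical names, the `VOEvent` root entry is an assumption added to tie the five chosen elements together, and the space and time frames are placed under AstroCoordSystem to follow the nesting of the sample packet.

```python
# Parent class -> classes reached through a "has" object property.
HAS = {
    "VOEvent": ["Who", "What", "WhereWhen", "Why", "Citations"],
    "Who": ["PublisherID", "Date", "Contact"],
    "Contact": ["Name", "Institution"],
    "What": ["Param"],
    "WhereWhen": ["ObservationLocation"],
    "ObservationLocation": ["AstroCoordSystem", "AstroCoords", "AstroCoordArea"],
    "AstroCoordSystem": ["TimeFrame", "SpaceFrame"],
    "AstroCoords": ["Position2D", "Spectral"],
    "AstroCoordArea": ["TimeInterval"],
    "TimeInterval": ["StartTime", "StopTime"],
    "StartTime": ["ISOTime"],
    "StopTime": ["ISOTime"],
    "Why": ["Inference"],
    "Inference": ["Concept", "Name"],
    "Citations": ["EventID", "Description"],
}


def paths(root, has=HAS):
    """Yield every root-to-leaf path through the "has" relationships."""
    children = has.get(root, [])
    if not children:
        yield [root]
        return
    for child in children:
        for tail in paths(child, has):
            yield [root] + tail


all_paths = ["/".join(p) for p in paths("VOEvent")]
for path in all_paths:
    print(path)
```

Each printed path corresponds to one leaf class that a catalogue entry or a hard-coded value has to fill.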
Catalogue Creation
To create a user-updatable database from the solar VOEvent ontology described above, the various elements of a VOEvent packet representative of a solar catalogue entry were sorted into information that could be hard-coded for the catalogue and information that required user input. Much of the space-time “infrastructure” metadata contained in the WhereWhen class could be easily hard-coded along with the catalogue’s PublisherID, Param element names, event roles and VOEvent version. The curation, coordinates, spectral data, time information and citation data were reduced to nineteen fields of user input. This is slightly more information than the average solar event catalogue requires, but it is not overwhelming. Each catalogue entry’s EventID is generated automatically to ensure uniqueness. The catalogue’s user input is stored in a MySQL database.
| User-provided | Hard-coded | Generated |
| Active region number | Role (“observation”) | Event ID |
| Instrument name | Version (“1.0”) | |
| Event description | PublisherID (“ivo://mssl.ucl.ac.uk”) | |
| Solar longitude | Param element for ARN | |
| Solar latitude | Param element for instrument name | |
| Spectral unit | AstroCoordSystem: ID=“HGC-UTC-TOPO” | |
| Spectral name | TimeFrame: Name=“Time” | |
| Spectral value | TimeFrame: TimeScale=“UTC” | |
| Spectral error | TimeFrame: TOPOCENTER | |
| Start time | SpaceFrame: Name=“Solar Space Frame” | |
| End time | SpaceFrame: HGC | |
| Event name | SpaceFrame: TOPOCENTER | |
| Event concept | SpaceFrame: SPHERICAL | |
| Event reporter | AstroCoords: coord_system_id=“HGC-UTC-TOPO” | |
| Reporter’s institution | Position2D: unit=“deg” | |
| Reporting date | Position2D: Name=“Longitude, Latitude” | |
| Cited eventID | AstroCoordArea: ID=“Sun” | |
| Citation reason | AstroCoordArea: coord_system_id=“HGC-UTC-TOPO” | |
| Citation description | | |
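The split into user-provided, hard-coded and generated information can be sketched as follows. The field names, the `HARD_CODED` subset and the `build_entry` helper are hypothetical and chosen only for illustration; the actual database column names are not given in this paper.

```python
# Hypothetical column names for the nineteen user-provided fields.
USER_FIELDS = [
    "active_region_number", "instrument_name", "event_description",
    "solar_longitude", "solar_latitude",
    "spectral_unit", "spectral_name", "spectral_value", "spectral_error",
    "start_time", "end_time",
    "event_name", "event_concept",
    "event_reporter", "reporter_institution", "reporting_date",
    "cited_event_id", "citation_reason", "citation_description",
]

# A subset of the values that are identical for every catalogue entry.
HARD_CODED = {
    "role": "observation",
    "version": "1.0",
    "publisher_id": "ivo://mssl.ucl.ac.uk",
    "coord_system_id": "HGC-UTC-TOPO",
}


def build_entry(submission, event_id):
    """Merge user input, hard-coded values and the generated event ID."""
    unknown = sorted(set(submission) - set(USER_FIELDS))
    if unknown:
        raise ValueError(f"unexpected fields: {unknown}")
    # Fields the user left out are stored as empty strings.
    entry = {name: submission.get(name, "") for name in USER_FIELDS}
    entry.update(HARD_CODED)
    entry["event_id"] = event_id
    return entry


entry = build_entry({"instrument_name": "TRACE"}, "suavecat1")
print(entry["instrument_name"], entry["role"], entry["event_id"])
```

Whether any of the nineteen fields should be mandatory is a design choice the sketch leaves open.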
DSA
The solar VOEvent catalogue data has been made available through an AstroGrid DSA module, using astrogrid-pal-skycatserver-1.1-004pl.war. This DSA instance has been configured as a TabularDB resource. Users can build queries with the Astronomical Data Query Language (ADQL) to search the catalogue entries. Using ADQL queries with DSA provides the ability to search events by date, a feature of most solar event catalogues. However, ADQL also allows searches to be performed on any combination of the nineteen user input columns plus the autogenerated EventID column.
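A date search of the kind described above might look like the following sketch. It is not ADQL and does not go through DSA: it uses plain SQL against an in-memory SQLite database as a stand-in, and the table name, column names and rows are made up for illustration.

```python
import sqlite3


def events_between(conn, start, stop):
    """Return IDs of events whose time interval overlaps [start, stop].

    Times are ISO 8601 strings in a uniform format, so string comparison
    orders them chronologically.
    """
    rows = conn.execute(
        "SELECT event_id FROM events "
        "WHERE start_time <= ? AND end_time >= ? "
        "ORDER BY start_time",
        (stop, start),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "event_id TEXT PRIMARY KEY, instrument_name TEXT, "
    "start_time TEXT, end_time TEXT)"
)
# Made-up test rows, in the spirit of the false event data used for testing.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("suavecat1", "TRACE", "2002-12-02T12:00:00", "2002-12-02T13:00:00"),
        ("suavecat2", "TRACE", "2003-01-10T08:00:00", "2003-01-10T08:30:00"),
    ],
)

found = events_between(conn, "2002-12-01T00:00:00", "2002-12-03T00:00:00")
print(found)  # ['suavecat1']
```

The same WHERE clause could be extended with conditions on any other column, which is the flexibility the ADQL interface adds over a date-only search form.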
Currently, the catalogue’s DSA instance is only available through a development AstroGrid installation at MSSL while testing occurs with false event data. Once the catalogue is stable, it will be migrated to a live AstroGrid installation so that solar physicists may begin inputting data from observed solar events.
CEA
In addition to searching facilities, two other tools have been developed for use with this event catalogue: a tool to add new events and a tool to return a VOEvent packet given an event ID from the catalogue. Both tools have been deployed as AstroGrid CEA applications interfacing with Unix command-line scripts.
The ability to add new events to this catalogue was one of the primary project requirements. Originally, functionality to add, edit and delete solar events was going to be provided through a JSP interface. However, the second requirement, to use the AstroGrid infrastructure where possible, shifted development of these tools to CEA applications. Currently, the SuaveCAT “add event” tool can be accessed through the portal or workbench. Users are presented with nineteen text boxes in which to enter curation, location, spectral, time, and event description metadata. The add tool then appends this information to the MySQL database that holds the event catalogue. An event ID of the format “suavecat#” is generated each time an event is added. No output is returned to the user, but the catalogue gains an additional entry.
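One way to generate a unique “suavecat#” ID when an entry is appended is to derive it from an auto-incrementing row number. The sketch below assumes that approach; the paper does not say how the real tool numbers its events. It uses SQLite in place of MySQL, a reduced set of hypothetical columns, and a hypothetical `add_event` function.

```python
import sqlite3

SCHEMA = """
CREATE TABLE events (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    event_id TEXT UNIQUE,
    instrument_name TEXT,
    start_time TEXT,
    end_time TEXT
)
"""


def add_event(conn, instrument_name, start_time, end_time):
    """Append one event and return its generated "suavecat#" ID."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO events (instrument_name, start_time, end_time) "
            "VALUES (?, ?, ?)",
            (instrument_name, start_time, end_time),
        )
        # The row number assigned by the database makes the ID unique.
        event_id = f"suavecat{cur.lastrowid}"
        conn.execute(
            "UPDATE events SET event_id = ? WHERE seq = ?",
            (event_id, cur.lastrowid),
        )
    return event_id


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first = add_event(conn, "TRACE", "2002-12-02T12:00:00", "2002-12-02T13:00:00")
second = add_event(conn, "TRACE", "2003-01-10T08:00:00", "2003-01-10T08:30:00")
print(first, second)
```

The real tool returns no output to the user; the sketch returns the ID only so that it can be inspected.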
The second tool developed for SuaveCAT generates a VOEvent packet from an individual catalogue entry. The user provides a single entry, the SuaveCAT event ID, and the tool extracts the relevant catalogue entry from the database. The command-line tool then constructs an XML file in accordance with the VOEvent schema. Information from the SuaveCAT database is written into curation, location, time, spectral and description elements, but many of the VOEvent subelements are hard-coded to uniform values for all catalogue entries, particularly spatial and time frame information along with some curation metadata.
A sample SuaveCAT VOEvent packet generated by the tool takes the following format:
<?xml version="1.0" encoding="UTF-8"?>
<VOEvent id="ivo://mssl.ucl.ac.uk/10249" role="observation" version="1.0"
    xmlns="http://www.ivoa.net/xml/VOEvent/v1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ivoa.net/xml/VOEvent/v1.0
    http://www.ivoa.net/internal/IVOA/IvoaVOEvent/VOEvent-v1.0.xsd">
  <Who>
    <PublisherID>ivo://mssl.ucl.ac.uk</PublisherID>
    <Contact>
      <Name>10243</Name>
      <Institution>10244</Institution>
    </Contact>
    <Date>2005-11-05T12:00:00</Date>
  </Who>
  <What>
    <Param name="ARN" value="1024" />
    <Param name="InstrumentName" value="TRACE" />
  </What>
  <Why>
    <Concept>10242</Concept>
  </Why>
  <WhereWhen>
    <ObservationLocation>
      <AstroCoordSystem xmlns="http://www.ivoa.net/xml/STC/stc-v1.20.xsd" ID="HGC-UTC-TOPO">
        <TimeFrame>
          <Name>Time</Name>
          <TimeScale>UTC</TimeScale>
          <TOPOCENTER/>
        </TimeFrame>
        <SpaceFrame>
          <Name>Solar Space Frame</Name>
          <HGC/>
          <TOPOCENTER/>
          <SPHERICAL coord_naxes="2"></SPHERICAL>
        </SpaceFrame>
      </AstroCoordSystem>
      <AstroCoords xmlns="http://www.ivoa.net/xml/STC/STCcoords/v1.20" coord_system_id="HGC-UTC-TOPO">
        <Position2D unit="deg">
          <Name>Longitude, Latitude</Name>
          <Value2>148 72</Value2>
        </Position2D>
        <Spectral unit="Angstrom">
          <Name>x-ray</Name>
          <Value>600</Value>
          <Error>0.8</Error>
        </Spectral>
      </AstroCoords>
      <AstroCoordArea ID="Sun" coord_system_id="HGC-UTC-TOPO">
        <TimeInterval>
          <StartTime>
            <ISOTime xmlns="http://www.ivoa.net/xml/STC/STCcoords/v1.20">2002-12-02T12:00:00</ISOTime>
          </StartTime>
          <StopTime>
            <ISOTime xmlns="http://www.ivoa.net/xml/STC/STCcoords/v1.20">2002-12-02T13:00:00</ISOTime>
          </StopTime>
        </TimeInterval>
      </AstroCoordArea>
    </ObservationLocation>
  </WhereWhen>
  <Citations>
    <EventID cite="followup">ivo://mssl.ucl.ac.uk/suavecat20</EventID>
    <Description>10248</Description>
  </Citations>
</VOEvent>
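A reduced version of such a packet could be assembled from one catalogue entry along the following lines. This is a sketch under stated assumptions rather than the actual command-line tool: the entry keys and the `sub` and `build_packet` helpers are hypothetical, namespace declarations and the coordinate frames are left out for brevity, and the reporter details are made-up sample values.

```python
import xml.etree.ElementTree as ET

PUBLISHER_ID = "ivo://mssl.ucl.ac.uk"  # hard-coded for every entry
COORD_SYSTEM_ID = "HGC-UTC-TOPO"       # hard-coded for every entry


def sub(parent, tag, text=None, **attrib):
    """Create a child element with optional text and attributes."""
    element = ET.SubElement(parent, tag, attrib)
    if text is not None:
        element.text = str(text)
    return element


def build_packet(entry):
    """Return a reduced VOEvent packet for one catalogue entry as a string."""
    root = ET.Element(
        "VOEvent",
        id=f"{PUBLISHER_ID}/{entry['event_id']}",
        role="observation",
        version="1.0",
    )
    who = sub(root, "Who")
    sub(who, "PublisherID", PUBLISHER_ID)
    contact = sub(who, "Contact")
    sub(contact, "Name", entry["event_reporter"])
    sub(contact, "Institution", entry["reporter_institution"])
    sub(who, "Date", entry["reporting_date"])

    what = sub(root, "What")
    sub(what, "Param", name="ARN", value=str(entry["active_region_number"]))
    sub(what, "Param", name="InstrumentName", value=entry["instrument_name"])

    why = sub(root, "Why")
    sub(why, "Concept", entry["event_concept"])

    location = sub(sub(root, "WhereWhen"), "ObservationLocation")
    area = sub(location, "AstroCoordArea", ID="Sun",
               coord_system_id=COORD_SYSTEM_ID)
    interval = sub(area, "TimeInterval")
    sub(sub(interval, "StartTime"), "ISOTime", entry["start_time"])
    sub(sub(interval, "StopTime"), "ISOTime", entry["end_time"])
    return ET.tostring(root, encoding="unicode")


sample = {
    "event_id": "suavecat21",
    "event_reporter": "A. Reporter",             # made-up value
    "reporter_institution": "Example Institute",  # made-up value
    "reporting_date": "2005-11-05T12:00:00",
    "active_region_number": 1024,
    "instrument_name": "TRACE",
    "event_concept": "flare",
    "start_time": "2002-12-02T12:00:00",
    "end_time": "2002-12-02T13:00:00",
}
packet = build_packet(sample)
print(packet)
```

A fuller version would also write the AstroCoordSystem, AstroCoords and Citations elements and would need to be validated against the VOEvent schema.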
Future Work
New Tools
In addition to the existing add, search, and return VOEvent packet facilities, further tools will be developed to add functionality to this solar event catalogue. First, functionality to delete and edit event entries will be added to the catalogue. Next, a series of event correlation tools will use the solar VOEvent ontology to link related events.
The delete functionality may be approached in two ways: either events could be removed by deleting rows from the catalogue database, or the event entry could remain in the database but be set to “inactive”. The first approach has the advantage that, as scientists become accustomed to adding events to the catalogue, event entries made in error can be easily removed without cluttering up the database. However, this approach is open to accidental or malicious deletion of valid event entries. The second approach guards against this problem, as any event may be reset to “active” at any time. The concept of active and inactive catalogue entries reflects the philosophy of resource entries in VO registries. Also, users wishing to publish a VOEvent packet retraction will be able to cite inactive events that remain in the catalogue.
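The second approach amounts to a status flag on each row. A minimal sketch, assuming a hypothetical `active` column and using SQLite in place of the catalogue’s MySQL database:

```python
import sqlite3


def set_active(conn, event_id, active):
    """Mark an event active or inactive; return True if the event exists."""
    with conn:
        cur = conn.execute(
            "UPDATE events SET active = ? WHERE event_id = ?",
            (1 if active else 0, event_id),
        )
    return cur.rowcount == 1


def active_events(conn):
    """Return the IDs of all events that ordinary searches should see."""
    rows = conn.execute(
        "SELECT event_id FROM events WHERE active = 1 ORDER BY event_id"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "event_id TEXT PRIMARY KEY, active INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO events (event_id) VALUES (?)",
    [("suavecat1",), ("suavecat2",)],
)

set_active(conn, "suavecat1", False)   # "delete" without losing the row
after_delete = active_events(conn)
set_active(conn, "suavecat1", True)    # the entry can be restored later
after_restore = active_events(conn)
print(after_delete, after_restore)
```

Because the row is never removed, a later retraction packet can still cite the inactive event by its ID.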
Edit functionality may be difficult to implement as an asynchronous tool. Ideally, a user could search the catalogue for the event to be edited, select the event, and generate a text field form pre-populated with the existing data for the event entry. However, the asynchronous nature of the AstroGrid workflow system’s interaction with CEA applications prevents the return of a pre-populated form for resubmission. One possible implementation would be for the user to generate a VOEvent packet using the existing tool, edit the VOEvent packet outside of the workflow, and submit the edited VOEvent packet to a CEA application that would extract the relevant information in order to update the catalogue database.
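The extraction step of that proposed edit workflow could be sketched as below. The helpers `local`, `find`, `text_of` and `extract_fields` and the returned field names are hypothetical, only a handful of fields are pulled out, and element names are matched without regard to XML namespace so that the mixed VOEvent and STC namespaces do not matter.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any "{namespace}" prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def find(element, *path):
    """Follow a path of local element names; None if a step is missing."""
    for name in path:
        element = next((c for c in element if local(c.tag) == name), None)
        if element is None:
            return None
    return element


def text_of(element):
    """Return stripped element text, or None for a missing or empty element."""
    return element.text.strip() if element is not None and element.text else None


def extract_fields(packet_xml):
    """Pull a few catalogue fields out of an edited VOEvent packet."""
    root = ET.fromstring(packet_xml)
    area = ("WhereWhen", "ObservationLocation", "AstroCoordArea", "TimeInterval")
    fields = {
        "event_id": root.get("id"),
        "event_concept": text_of(find(root, "Why", "Concept")),
        "start_time": text_of(find(root, *area, "StartTime", "ISOTime")),
        "end_time": text_of(find(root, *area, "StopTime", "ISOTime")),
    }
    what = find(root, "What")
    for param in (what if what is not None else []):
        if local(param.tag) == "Param":
            fields[param.get("name")] = param.get("value")
    return fields


# A cut-down packet as a user might return it after editing the stop time.
EDITED = """<VOEvent id="ivo://mssl.ucl.ac.uk/suavecat20" role="observation"
  version="1.0" xmlns="http://www.ivoa.net/xml/VOEvent/v1.0">
  <What><Param name="InstrumentName" value="TRACE"/></What>
  <Why><Concept>flare</Concept></Why>
  <WhereWhen><ObservationLocation><AstroCoordArea><TimeInterval>
    <StartTime><ISOTime>2002-12-02T12:00:00</ISOTime></StartTime>
    <StopTime><ISOTime>2002-12-02T13:30:00</ISOTime></StopTime>
  </TimeInterval></AstroCoordArea></ObservationLocation></WhereWhen>
</VOEvent>"""

fields = extract_fields(EDITED)
print(fields)
```

The resulting dictionary could then drive an UPDATE of the matching catalogue row, keyed on the event ID.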
Aside from edit and delete functionality, a set of event correlation tools will be developed to use the solar VOEvent ontology with the catalogue. The first such tool will correlate events observed with different instruments; not only will this involve events within SuaveCAT, but the tool should also examine events reported through the EGSO SEC. The next correlation function will associate event entries for related solar flares and coronal mass ejections. A third tool will attempt to assign event classifications such as flare, wave, or CME if none has been provided by the event reporter. These correlation tools will investigate whether software using an ontology can uncover richer relationships in a VOEvent context than software using database queries.
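For comparison, the kind of result a plain query-style approach gives for the first correlation task can be sketched as a naive time-overlap match. This is a baseline assumption, not the ontology-based tool proposed above; the `overlaps` and `correlate` functions, the dictionary keys and the sample events are all made up for illustration.

```python
from datetime import datetime
from itertools import combinations


def overlaps(a, b):
    """True if the two events' time intervals share at least one instant."""
    return a["start"] <= b["stop"] and b["start"] <= a["stop"]


def correlate(events):
    """Pair up events from different instruments that overlap in time."""
    return [
        (a["event_id"], b["event_id"])
        for a, b in combinations(events, 2)
        if a["instrument"] != b["instrument"] and overlaps(a, b)
    ]


def event(event_id, instrument, start, stop):
    """Build an event record from ISO 8601 start and stop strings."""
    return {
        "event_id": event_id,
        "instrument": instrument,
        "start": datetime.fromisoformat(start),
        "stop": datetime.fromisoformat(stop),
    }


events = [
    event("suavecat1", "TRACE", "2002-12-02T12:00:00", "2002-12-02T13:00:00"),
    event("suavecat2", "Hessi", "2002-12-02T12:30:00", "2002-12-02T12:45:00"),
    event("suavecat3", "TRACE", "2002-12-02T12:40:00", "2002-12-02T12:50:00"),
    event("suavecat4", "Hessi", "2003-01-10T08:00:00", "2003-01-10T08:30:00"),
]
pairs = correlate(events)
print(pairs)
```

A match on time alone ignores position, wavelength and event type, which is where an ontology-aware tool might be expected to do better.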
Full Ontology
As the solar VOEvent tools are developed, it may become apparent that a fuller VOEvent ontology would be more powerful than an ontology restricted to the needs of a single event catalogue. The main areas for expansion are creating and importing an STC ontology, using a unit ontology, and making greater use of the How element. The combination of full VOEvent and STC ontologies would open the ontology to a wide range of uses with astronomical and solar terrestrial physics events. This could benefit the solar VOEvent catalogue by allowing correlation of solar events with atmospheric magnetic and plasma events; alternatively, solar events could be catalogued and compared with stellar events in a broader catalogue.
The next ontology steps will be the development of an STC ontology and a full VOEvent ontology.
References
[1] Yohkoh SXT TRACE Flare List, http://www.lmsal.com/nitta/sxt_trace_flares/list.html, Updated 8 March 2002, Viewed 13 November 2005.
[2] Hessi Flare List, http://hesperia.gsfc.nasa.gov/ssw/hessi/dbase/, Updated 13 November 2005, Viewed 13 November 2005.
[3] NOAA SGAS Energetic Event List, http://www.nwra-az.com/spawx/listsgas.html, Updated 13 November 2005, Viewed 13 November 2005.
[4] EGSO SEC, http://sec.ts.astro.it/sec_ui.php, Viewed 13 November 2005.
[5] “Publisher’s Astrogrid Library Overview” (DSA), http://www.astrogrid.org/maven/docs/HEAD/pal/index.html, Updated 5 November 2005, Viewed 13 November 2005.
[6] IVOA Status Report, IVOA Executive, http://www.ivoa.net/pub/info/, Updated May 2005, Viewed 13 November 2005.
[7] Auden, E. “VOEvent Ontology”, http://wiki.eurovotech.org/bin/view/VOTech/VoEventOntology, Updated 26 May 2005, Viewed 13 November 2005.
[8] IVOA Data Modelling Forum, http://www.ivoa.net/forum/dm/0506/date.htm, 1-28 June 2005, Viewed 13 November 2005.
[9] “Sky Event Reporting Metadata”, IVOA WG Internal Draft 2005-07-11, http://www.ivoa.net/Documents/WD/VOEvent/VOEvent-20050714.html, Viewed 13 November 2005.
[10] “Space-Time Coordinate Metadata for the Virtual Observatory”, Version 1.21, IVOA Proposed Recommendation 15 March 2005, http://www.ivoa.net/Documents/PR/STC/STC-20050315.html, Viewed 13 November 2005.
[11] Gasevic, D. “UMLtoOWL: Converter from UML to OWL”, http://afrodita.rcub.bg.ac.yu/~gasevic/projects/UMLtoOWL/, Viewed 13 November 2005.
[12] Auden, E. “Creating Ontologies from UML Diagrams”, http://wiki.eurovotech.org/bin/view/VOTech/OntologiesFromUML, Updated 17 October 2005, Viewed 13 November 2005.
[13] “Poseidon for UML”, http://www.gentleware.com/index.php, Viewed 13 November 2005.
[14] VOEvent 1.0 Schema, http://www.ivoa.net/internal/IVOA/IvoaVOEvent/VOEvent-v1.0.xsd, Viewed 13 November 2005.
[15] “The Protégé Ontology Editor and Knowledge Acquisition System”, http://protege.stanford.edu/, Viewed 13 November 2005.
[16] Horridge, M. “A Practical Guide to Building OWL Ontologies Using the Protégé-OWL Plugin and CO-ODE Tools Edition 1.0”, http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial.pdf, Updated 27 August 2004, Viewed 13 November 2005.