HEPData is an open-access repository for
scattering data from experimental particle physics. It includes data points from several
thousand publications produced by multiple Collaborations working in High Energy and Nuclear Physics,
and is hosted by CERN as a part of its Open Data initiative.
The PHENIX Collaboration is using this platform as one of the principal components of its
Data and Analysis Preservation (DAP) effort
and manages a growing collection of HEPData entries.
Under the policy established by the PHENIX IB, every paper containing tables and/or plots must be
accompanied, before it is approved for publication, by a data package containing the data for those
tables and/or plots. According to the official policy document (sec. IV.iv), the data package must:
Conform to the specific format required by the HEPData portal (please see the documentation)
Be certified by the IRC for each publication
Have an Inspire ID associated with it
In order for data to be successfully uploaded to the HEPData portal, it must conform to a specific format
(please check the HEPData site for documentation). Existing text files can be converted to the HEPData
format with some effort. The DAP team is looking at technical solutions to facilitate this process.
For example, if plots are generated using ROOT macros, the code can be instrumented to output the same
data in a format compatible with HEPData. There is a helpful write-up about preparing data for upload:
How to make HEPData input (C. Nattrass)
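As an illustration of converting an existing text file, the following minimal Python sketch reads whitespace-separated columns and renders them in the YAML table layout used by HEPData (independent_variables and dependent_variables, each with a header and a list of values). The three-column input layout (x, y, symmetric error), the function names, and the "stat" error label are assumptions made for this example and are not part of any PHENIX tool. The output should always be checked against the HEPData documentation and validated on the portal.

```python
def parse_columns(text):
    """Parse whitespace-separated 'x y error' rows, skipping blanks and '#' comments."""
    rows = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        x, y, err = (float(field) for field in line.split())
        rows.append((x, y, err))
    return rows


def table_to_yaml(x_header, y_header, rows, error_label="stat"):
    """Render rows as YAML text in the HEPData table layout.

    x_header and y_header are (name, units) pairs; rows are (x, y, error) tuples.
    The single symmetric error per point is an assumption of this sketch.
    """
    def header(pair):
        name, units = pair
        return f"- header: {{name: '{name}', units: '{units}'}}"

    out = ["independent_variables:", header(x_header), "  values:"]
    out += [f"  - {{value: {x}}}" for x, _, _ in rows]
    out += ["dependent_variables:", header(y_header), "  values:"]
    for _, y, err in rows:
        out += [
            f"  - value: {y}",
            "    errors:",
            f"    - {{symerror: {err}, label: {error_label}}}",
        ]
    return "\n".join(out) + "\n"
```

A typical use would be to read an existing data file with parse_columns(open(path).read()) and write the result of table_to_yaml(...) to a file such as table1.yaml (a hypothetical name) inside the submission folder.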
It is important to provide the Inspire ID for the submission as this is required for the upload to work.
Including the PHENIX-internal PPG identifier is highly recommended as it reduces the chances of human
error and facilitates communication. Both the Inspire ID and the PPG identifier can easily be included
in the comment field of the submission.yml file, which is an integral part of the submission package.
Each collaboration using the HEPData portal has a coordinator
registered on that Web resource. At the time of writing, PHENIX has delegated this
responsibility to M. Potekhin (potekhin_at_bnl_dot_gov).
The IRC is responsible for QA of the data. The IRC selects one of its members as the official
reviewer of the data uploaded to the HEPData portal who gives the final approval before the data goes live.
The submission package for a given publication is prepared in the form of properly formatted YAML (and optional PNG) files.
The HEPData portal provides adequate documentation on this and other subjects.
There is a mandatory submission.yml file describing the contents of the package, which
allows optional comments. It is strongly recommended that the comments include:
The Inspire ID of the publication
The internal PPG ID
The name and e-mail address of the designated IRC member for final approval
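The recommended comment contents can be sketched as follows. This minimal Python example builds the text of a submission.yml file whose first YAML document carries the comment, followed by one document per table. The function name, the reviewer details, the ID values in the usage note, and the empty keywords placeholder are hypothetical; the field names (comment, name, description, keywords, data_file) follow the HEPData submission format as commonly documented and should be confirmed in the sandbox.

```python
def build_submission_yaml(inspire_id, ppg_id, reviewer_name, reviewer_email, tables):
    """Return submission.yml text with the recommended comment fields.

    tables is a list of (name, description, data_file) tuples, one per table.
    """
    lines = [
        "comment: |",
        f"  Inspire ID: {inspire_id}",
        f"  PPG ID: {ppg_id}",
        f"  IRC reviewer: {reviewer_name} ({reviewer_email})",
    ]
    for name, description, data_file in tables:
        lines += [
            "---",  # each table is described in its own YAML document
            f'name: "{name}"',
            f'description: "{description}"',
            "keywords: []",  # placeholder; fill in as described in the HEPData docs
            f"data_file: {data_file}",
        ]
    return "\n".join(lines) + "\n"
```

For example, build_submission_yaml(1234567, "ppg000", "A. Reviewer", "reviewer@example.org", [("Table 1", "Example table", "table1.yaml")]) produces a file with one comment block and one table entry; all of these values are placeholders.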
There is a sandbox feature on HEPData which allows you to validate the submission package
and, in particular, to check whether the LaTeX-formatted abstract is rendered correctly.
Please use it. It requires an account on HEPData, which is easy to obtain.
Once you log into the portal, the “sandbox” option will be clearly marked in the menu in the upper right corner of the Web page.
The sandbox is not visible to anyone without the link generated by the system, so the data remain private.
Create a fork of the “documentation” repository on GitHub (easy to do in the Web UI) and clone the resulting repository
Check whether the correct “ppgXXX” folder exists, where “XXX” stands for the PPG serial number; if it does not, create it and add it to your repository. Populate the folder with your HEPData submission files
Run “git commit .” and “git push” to place the material on GitHub (do not push to “master”), then create a pull request on the GitHub website to merge your addition into the repository
The DAP team then uploads the package to HEPData and notifies the designated IRC member
that they need to review the data already uploaded and issue their final approval. This is done via a Web link.
Each table or plot is approved separately by the designated IRC member.
Only after every item in the submission has been approved can the PHENIX HEPData coordinator
finalize the submission. At this point, it becomes globally visible on the HEPData portal and the process is complete.
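Before committing the “ppgXXX” folder and opening the pull request described in the GitHub steps above, a quick local check can catch simple packaging mistakes. The following Python sketch is only an illustration: the function name and the specific checks are assumptions (a folder name of “ppg” followed by digits, a submission.yml file, and at least one other YAML data file), not an official PHENIX or HEPData tool, and it does not replace validation in the HEPData sandbox.

```python
import re
from pathlib import Path

# Assumed naming convention: "ppg" followed by the PPG serial number.
PPG_FOLDER = re.compile(r"ppg\d+")


def check_package(folder):
    """Return a list of problems found in a submission folder (empty if none)."""
    folder = Path(folder)
    problems = []
    if not PPG_FOLDER.fullmatch(folder.name):
        problems.append(f"folder name '{folder.name}' does not look like 'ppgXXX'")
    if not folder.is_dir():
        problems.append(f"'{folder}' is not a directory")
        return problems
    if not (folder / "submission.yml").is_file():
        problems.append("submission.yml is missing")
    has_tables = any(
        p.suffix in (".yaml", ".yml") and p.name != "submission.yml"
        for p in folder.iterdir()
    )
    if not has_tables:
        problems.append("no data table files (*.yaml) found")
    return problems
```

Calling check_package("ppg000") from the top of the cloned repository (with the folder name replaced by the real one) returns an empty list when these basic conditions are met.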