Content Inventory

Overview

Generate reports about the metadata and amount of content in your site’s publications using the Content Inventory.

The Digital Commons Dashboard, also available to administrators, makes it easy to share popular information about a repository with others: the number of works, the number of downloads, and information about your readership. The Content Inventory is more granular, ideal for those who crunch metadata, seek to unify descriptions, or want to identify structures that might be stagnant or empty.

The Content Inventory offers administrators two options for reports at the site level.

  • The Structure Report helps to compare publications by amount of published content.
  • The Metadata Report provides an export of your repository’s metadata or any collection within it.

Only the Metadata Report is provided at the publication level.

Access the Content Inventory

Click the Content Inventory tab at the top of the Configuration screen to access the tool.

Site level: Access to the site-level Content Inventory, which includes both the Structure Report and the Metadata Report, requires the View Digital Commons Dashboard permission.

Publication level: Access to a publication-level Content Inventory, which provides a Metadata Report for just the specified publication, requires both the View Digital Commons Dashboard permission and the Can See All Submissions permission.

More on enabling permissions here.

Run a Structure Report

Available at the site level only.

To view the amount of content per publication:

  1. Visit the site-level Configuration page at http://DOMAINHERE/cgi/user_config.cgi, then click the Content Inventory tab.
  2. If you want to include collected content (such as from other repositories or SelectedWorks profiles), select Include collected content in your report(s) and click Save Changes.
  3. Click Generate Structure Report.
  4. You’ll receive an email with a link to download an Excel report. It will display a row for each publication with a column showing the number of published items. The link will expire in two weeks.

The email is sent to the address you use to log in. The Excel report has one column for the URLs of your publications and one column for the number of works per publication.
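Once you have the Structure Report, a short script can flag publications that appear to be empty. The following is a minimal sketch, not an official tool. It assumes you have saved the Excel sheet as CSV, and the column headers publication_url and published_items are hypothetical placeholders; check the header row of your own report and adjust them.

```python
import csv
import io

# Hypothetical column headers; replace with the headers in your own report.
URL_COLUMN = "publication_url"
COUNT_COLUMN = "published_items"


def find_empty_publications(csv_text, url_column=URL_COLUMN, count_column=COUNT_COLUMN):
    """Return the URLs of publications whose item count is zero or blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    empty = []
    for row in reader:
        raw = (row.get(count_column) or "").strip()
        # Spreadsheet exports sometimes write whole numbers as "12.0".
        count = int(float(raw)) if raw else 0
        if count == 0:
            empty.append(row[url_column])
    return empty


# Made-up sample data standing in for a Structure Report saved as CSV.
sample = """publication_url,published_items
https://example.edu/theses,120
https://example.edu/newsletters,0
https://example.edu/posters,
"""

print(find_empty_publications(sample))
```

With a real report, read the CSV file's text and pass it to find_empty_publications in place of the sample string.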

Run a Metadata Report

Available at the site and publication levels.

At the site level, administrators may export all of the metadata in the repository:

  1. Visit the site-level Configuration page at http://DOMAINHERE/cgi/user_config.cgi, then click the Content Inventory tab.
  2. If you want to include collected content (such as from other repositories or SelectedWorks profiles), select Include collected content in your report(s).
  3. If you want the report to include submissions that are not currently posted, select Include unpublished (withdrawn, rejected, etc) content in the metadata report.
  4. On the View/Edit Metadata tab, select the metadata fields that you would like included in the report. Note that you’ll see both the back-end Field Name and the front-end Display Name. Some of our system’s special and miscellaneous fields appear as well, such as abstract_format, but you may skip selecting them if desired.
  5. On the View/Edit Publications tab, select the publications you want to include in the report. Note that you can expand a community to see and select the publications grouped within it.
  6. Click Save Changes. Your selections apply to this report and to future reports until you save changes again.
  7. Click Generate Metadata Report.
  8. You’ll receive an email with a link to download an Excel report. It will display a column for each field you selected (using the back-end Field Name), as well as some additional columns, such as the URLs where the submissions are posted. The link will expire in two weeks.

At the publication level, you can export just the metadata of that publication:

  1. Click the Content Inventory tab for your publication.
  2. If you want to include collected content (such as from other publications or from SelectedWorks profiles), select Include collected content in your report(s).
  3. If you want the report to include submissions that are not currently posted, select Include unpublished (withdrawn, rejected, etc) content in the metadata report.
  4. On the View/Edit Metadata tab, select the metadata fields that you would like included in the report. Note that you’ll see both the back-end Field Name and the front-end Display Name. Some of our system’s special and miscellaneous fields appear as well, such as abstract_format, but you may skip selecting them if desired.
  5. Click Save Changes. Your selections apply to this report and to future reports until you save changes again.
  6. Click Generate Metadata Report.
  7. You’ll receive an email with a link to download the report in Excel format. The link will expire in two weeks.

At the publication level, the interface will show the metadata fields of the context you selected, plus fields used by content collected from other publications.
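If your goal is to unify descriptions, one useful first pass over a Metadata Report is counting how many records leave each field blank. The sketch below is one possible approach under stated assumptions: the report has been saved as CSV, and the sample headers (title, publication_date, abstract) are illustrative only. Your report will use the back-end Field Names you selected.

```python
import csv
import io
from collections import Counter


def count_blank_values(csv_text):
    """Count, for each column, how many rows have an empty value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Start every column at zero so fully populated fields are reported too.
    blanks = Counter({name: 0 for name in reader.fieldnames or []})
    for row in reader:
        for name, value in row.items():
            # Skip overflow cells (key None) from rows with extra columns.
            if name is not None and not (value or "").strip():
                blanks[name] += 1
    return dict(blanks)


# Made-up sample data standing in for a Metadata Report saved as CSV.
sample = """title,publication_date,abstract
Paper A,2020-01-01,Some text
Paper B,,
,2021-05-05,
"""

for field, missing in count_blank_values(sample).items():
    print(f"{field}: {missing} blank")
```

Fields with a high blank count are candidates for cleanup, or for leaving out of future reports.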

Types of Fields in the Metadata Report

The list of available metadata in the Content Inventory includes four types of fields:

  • Default metadata fields
  • Custom fields enabled by request
  • Special fields the system uses to store basic information about submissions
  • Miscellaneous fields that may appear depending on repository content

Each type of field is described below.

Default metadata fields

Series and other publication structures in Digital Commons feature a default set of metadata fields. These are included in your repository or publication Content Inventory.

Common metadata fields like title and publication_date appear in multiple publication types, while some fields only appear in certain structures, such as the publisher field in book galleries. To see a list of default fields for each type of publication structure, please refer to Metadata Options in Digital Commons and the downloads available on that page.

Custom metadata fields

When administrators or editors request additional fields beyond a publication’s default fields, those fields are available in the Content Inventory. Custom fields may include completely unique fields as well as optional fields that are already configured in Digital Commons, but only enabled by request. Such optional fields include DOI, Creative Commons License, and Embargo Date.

See Metadata Options in Digital Commons for more information about custom fields.

Special fields

Below is a list of special fields that appear in emailed Content Inventory reports but are not displayed in the Content Inventory tool. A report generated with no metadata fields selected in the tool contains only the following fields.

  • native_filename: The name of the non-PDF file uploaded for the submission, if applicable.
  • native_filesize: The size of the non-PDF file uploaded for the submission, if applicable.
  • pdf_filename: The name of the PDF file uploaded (or created by our automatic PDF conversion), if applicable.
  • pdf_filesize: The size of the PDF file uploaded (or created by our automatic PDF conversion), if applicable.
  • supplemental_filenames: The filename(s) of any supplemental files uploaded for the submission. If there are multiple files, they will all be listed here, separated by commas.
  • supplemental_filesizes: The file size(s) of any supplemental files uploaded for the submission. If there are multiple files, their sizes will all be listed here, separated by commas.
  • state: The current state of the submission (e.g., published or queued).
  • front_end_url: The URL of the article page. (This is duplicated by the calc_url field.)
  • download_url: The URL of the PDF or native download or, if the article has a link to full-text, the external URL the metadata page is linking to.
  • uploader_userid: Account number of the user who uploaded the submission. Generally for bepress use.
  • uploader_email: The email address of the user who uploaded the submission.
  • context_key: The internal system reference number used for the submission. Generally for bepress use.
  • issue: The system shortname of the series or other publication where the article has been published.
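Because supplemental_filenames and supplemental_filesizes hold comma-separated lists, you may want to split them into one entry per file. The helper below is a hypothetical sketch. It assumes the sizes are listed in the same order as the filenames, which this page does not state, and it will split incorrectly if a filename itself contains a comma. Sizes are kept as text because the unit is not specified here.

```python
def pair_supplemental_files(filenames_cell, filesizes_cell):
    """Split the two comma-separated cells and pair each name with its size."""
    names = [part.strip() for part in filenames_cell.split(",") if part.strip()]
    sizes = [part.strip() for part in filesizes_cell.split(",") if part.strip()]
    if len(names) != len(sizes):
        # Assumption violated: the lists do not line up, so do not guess.
        raise ValueError(
            f"{len(names)} filename(s) but {len(sizes)} size(s); check for commas in filenames"
        )
    return list(zip(names, sizes))


# Made-up cell values for illustration.
pairs = pair_supplemental_files("data.csv, appendix.docx", "2048, 51200")
for name, size in pairs:
    print(name, size)
```

The function raises an error rather than pairing mismatched lists, so a filename containing a comma is caught instead of silently misaligned.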

Miscellaneous fields

Other system and internal formatting fields may appear in the Content Inventory, depending on the types of content in your repository. Below are some common examples.

  • abstract_format: Formatting field that controls display of the abstract or description field. The default value of “html” appears in most instances.
  • calc_readers_response: The URL of the published response to an article, when the response feature is enabled for a publication.
  • publication_date_date_format: The selected date format (e.g., MM-DD-YYYY) for the publication a submission is published in.
  • response_to_url: The URL of the original article that a published response refers to, when the response feature is enabled for a publication.
  • source_fulltext_url: Shows the URL if there is a link out to full text. This value will also appear in the “download_url” field.
  • Certain fields appear empty in metadata reports, including file_list, preview_image, and calc_thumbnail_image_url. These are used internally by the system to format elements in some publication types.
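Since fields such as file_list, preview_image, and calc_thumbnail_image_url appear empty in metadata reports, you may prefer to drop them before analyzing a report. A minimal sketch, assuming each report row has already been loaded as a dictionary keyed by field name (for example with Python's csv.DictReader on a CSV copy of the report):

```python
# Fields this page describes as appearing empty in metadata reports.
INTERNAL_FIELDS = {"file_list", "preview_image", "calc_thumbnail_image_url"}


def drop_internal_fields(rows, ignored=INTERNAL_FIELDS):
    """Return copies of the rows without the internal, always-empty fields."""
    return [
        {name: value for name, value in row.items() if name not in ignored}
        for row in rows
    ]


# Made-up rows for illustration.
rows = [
    {"title": "Paper A", "file_list": "", "preview_image": ""},
    {"title": "Paper B", "calc_thumbnail_image_url": ""},
]
cleaned = drop_internal_fields(rows)
print(cleaned)
```

The original rows are left unchanged, so you can still inspect the full export if needed.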