Batch Upload, Export, and Revise

Digital Commons allows administrators to upload and revise multiple records at a time. Using the batch feature, administrators may upload, export, or revise an Excel spreadsheet containing metadata for submissions within a publication. As metadata requirements differ from publication to publication, the batch upload feature is tied to individual publications to ensure metadata integrity. For all Digital Commons publication types, administrators have the ability to upload both metadata and full texts.

New records can also be uploaded to the repository using XML batch import.  If you are interested in this method, please contact dc-support@bepress.com for more information.

Common Use Cases

Some common use cases for batch features:

  • Administrators may use batch features to migrate records from other repositories, databases, websites, or related services.
  • Department administrators can use batch features to import content provided via CV or other faculty-based lists.
  • Journal editors can use batch features to make their back content available.
  • Institutions can use batch features for ETD migrations.
  • Metadata librarians may use batch revision features to add a new metadata field to an existing collection.

Prerequisites Before Getting Started

Permissions:  If you are unable to access the batch tools, check with Consulting Services to ensure you have the appropriate administrative permissions.

Excel Spreadsheets: For batch upload, export, and revise, you’ll need to be able to open and work with Excel spreadsheets (specifically, .xls files). The following instructions will explain how to prepare and upload Excel spreadsheets to add or alter records on your Digital Commons repository.

Metadata: Most metadata can be entered freely into the spreadsheet. Some fields, like “Document Type,” rely on controlled lists. For these fields, specific values are required.  If any fields on the upload form use controlled lists, please check with Consulting Services to ensure you have the correct values.

Full-Text Files: To import full-text files to the repository, the files must be located on publicly accessible servers.
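
Before filling in fulltext_url values, it can help to screen them for addresses that clearly cannot be reached from outside your network. The following Python sketch is an illustration only, not a bepress tool: the function name is hypothetical, the restriction to http and https is an assumption, and the check looks only at the form of the URL without contacting the server, so a URL that passes may still be inaccessible.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_reachable(url: str) -> bool:
    """Syntactic pre-check only; it never contacts the server."""
    parts = urlparse(url.strip())
    # Assumption: full-text files are served over http or https.
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    if host == "localhost":
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so it is a DNS name that may be public.
        return True
    # Private and loopback addresses are not visible from outside.
    return not (address.is_private or address.is_loopback)
```

For example, looks_publicly_reachable("http://192.168.1.5/thesis.pdf") returns False because the address is on a private network.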

Batch Upload

Each publication (e.g., series, journal, image gallery) in Digital Commons has a unique Excel spreadsheet. Start the batch upload process by navigating to the publication where you wish to perform the import.

Steps to batch upload:

  1. From the publication’s Manage Submissions tab, click on the Batch Upload Excel sidebar link.
  2. Click Download to save the batch import spreadsheet to your computer.
  3. Complete the spreadsheet, one record per row, with the metadata you wish to include. The columns highlighted in red are required for each record. Hover your cursor over the column headers for information on each field.
  4. When you have completed the spreadsheet, return to the Batch Upload Excel screen in Step 1, and upload the Excel file using the Choose File and Upload buttons.
    • If the upload is successful, you will receive a confirmation email.
    • If there is a problem validating your spreadsheet, the system will return an error message, either on-screen or by email, asking you to correct the error before the spreadsheet is accepted.
    • If the spreadsheet is accepted but there are problems with specific records, the system will import the successful records and create placeholders for the remaining records until you can correct them. The confirmation email will include a list of the errors and a link for revising those submissions. Use the spreadsheet provided via the link and follow the steps below for revising placeholder submissions. Note: correcting and re-uploading the original import spreadsheet instead will create duplicates of the records that already imported.
  5. After you receive the confirmation email, you may preview the records:  Go to the Manage Submissions screen, change the Status to “Queued for update,” and then click a title to preview the record.
  6. If you are satisfied with the preview, press the Update link for your context (e.g., Update ir_series) to make all currently queued records publicly available.
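
If you assemble the spreadsheet rows with a script before pasting them into the downloaded file, a quick check for empty required cells can catch problems before upload. This is a minimal sketch under stated assumptions: the function name is hypothetical, rows are represented as Python dictionaries keyed by column header, the header is assumed to occupy spreadsheet row 1, and the required columns shown in the usage note are placeholders. Take the real list from the columns highlighted in red in your publication's spreadsheet.

```python
def find_missing_required(rows, required_columns):
    """Return (spreadsheet_row, column) pairs for empty required cells."""
    problems = []
    # Assumption: row 1 holds the headers, so the first record is row 2.
    for row_number, row in enumerate(rows, start=2):
        for column in required_columns:
            value = row.get(column)
            if value is None or str(value).strip() == "":
                problems.append((row_number, column))
    return problems
```

Calling find_missing_required(rows, ["title", "publication_date"]) returns an empty list when every record has both values filled in.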

Revising Placeholder Submissions for Unsuccessful Items

An import spreadsheet can be accepted even though individual submissions have errors and are not imported. If this happens, placeholder submissions will be created for the unsuccessful items and you’ll have the opportunity to batch revise the placeholders to fix the errors.

  1. You’ll receive a partial success email if only some of the records import. The email will contain a link to a batch revise spreadsheet with the placeholder submissions, along with a list of the errors that require correction. Click the link to download the spreadsheet, which will also include any other placeholders that aren’t yet revised from recent imports.
  2. Review the errors listed in the email and correct the corresponding metadata in the spreadsheet. The “original_upload_rownum” column at the right of the spreadsheet lists the original row number of each failed record in the batch import spreadsheet, to help you locate the values that need correcting. Leave the right-hand, gray columns intact for system use.
  3. Save the spreadsheet and return to the Manage Submissions screen in the browser. Click the Batch Revise Excel link in the left sidebar.
  4. Select your saved file using the Choose File button and click Upload to submit your corrected batch revise spreadsheet.
  5. You’ll receive an email prompting you to accept the revisions. Once the placeholder submissions are revised, they will be queued to go live at the next update. For detailed batch revision instructions, see “To batch revise” below.
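
Because the downloaded spreadsheet can include placeholders from more than one recent import, the same original row number may appear more than once. The sketch below groups placeholder rows by original_upload_rownum so each group can be compared against the error list in the email. It is illustrative only: the function name is hypothetical, and rows are assumed to be already loaded as Python dictionaries keyed by column header.

```python
from collections import defaultdict


def placeholders_by_original_row(placeholder_rows):
    """Group placeholder rows by their original import row number."""
    grouped = defaultdict(list)
    for row in placeholder_rows:
        raw = row.get("original_upload_rownum")
        if raw in (None, ""):
            continue  # no row number recorded for this row
        # Spreadsheet readers may return 7, 7.0, or "7"; normalize to int.
        grouped[int(float(raw))].append(row)
    return dict(grouped)
```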

Batch Export and Revise

Administrators may use the batch revision feature to export or revise multiple records at once. Revising is a two-step process where the metadata is first exported, then modified via an Excel spreadsheet. New submissions can also be added to a batch revision spreadsheet for import.

To batch export:

  1. Click the Batch Revise Excel sidebar link located on the Manage Submissions screen for the publication (e.g., series, image gallery, event community, journal).
  2. Click Generate.  A spreadsheet containing the current metadata will then appear at the top of the “Spreadsheet History” list. In publications containing 10,000 submissions or more, exports are broken up into multiple spreadsheets of 5,000 items each. If previous spreadsheets are also present, you may use the date/time stamp to confirm the correct spreadsheet to export.
  3. Click the Download link to complete the export and save the file to your computer.

To batch revise:

You will first generate and download a batch export spreadsheet following the above steps.

  1. Using the spreadsheet from your batch export, enter changes and any new submissions you’d like to add. The columns highlighted in red are required for each record, and the right-hand, gray columns should be left intact for system use. When you have finished, return to the Batch Revise Excel screen in Step 2 of your batch export, and upload the revised file.

    Tip: When submissions are highlighted in yellow, this means author information has been truncated on the spreadsheet. Author information that is not displayed for these submissions must be edited through the Manage Submissions screen.
  2. You will receive an email with a summary of the changes and additions. Use the links in the email to accept or cancel. If there are formatting errors in the metadata, you will receive an email notification indicating the nature of the error. Once you have corrected the error, upload the spreadsheet and await a summary email.
  3. After you click the summary email’s Accept Changes link, the system will process the spreadsheet and you’ll receive a confirmation email with links to preview or update the publication.
  4. If you are satisfied with the preview, press the Update link for your context (e.g., Update ir_series). Revisions to live submissions, and new submissions entered via the spreadsheet, will become publicly available. Submissions that were unpublished as of Step 2 will be revised but will remain unpublished after you update the site. You may post them using your usual workflow or leave them unpublished.
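
Since the gray system columns must be left intact, one way to guard against accidental edits is to compare the revised rows with the original export before uploading. The following is a sketch, not part of Digital Commons: the function name is hypothetical, rows are assumed to be dictionaries keyed by column header, and new submissions are assumed to have an empty context_key, so they are skipped.

```python
SYSTEM_COLUMNS = (
    "calc_url", "context_key", "issue", "ctmtime", "original_upload_rownum",
)


def changed_system_cells(exported_rows, revised_rows):
    """Return (context_key, column) pairs where a system cell was altered."""
    exported = {row["context_key"]: row for row in exported_rows}
    changes = []
    for row in revised_rows:
        key = row.get("context_key")
        original = exported.get(key)
        if original is None:
            # Assumption: rows without a matching key are new submissions.
            continue
        for column in SYSTEM_COLUMNS:
            if row.get(column) != original.get(column):
                changes.append((key, column))
    return changes
```

An empty result means no system cell differs for any record that appears in both spreadsheets. A record whose context_key itself was edited will not be matched, so it is worth confirming that the number of matched records equals the number exported.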

Checking the Status of a Batch Import or Revision

Click the Batch Status link in the left sidebar within Manage Submissions to check on a batch import or revision. You may use this tool to check on batch revisions for placeholder submissions created in a previous import. The batch status tool stores information about your batch imports and revisions for 30 days.

If you have questions or encounter any difficulty, please contact us by email at dc-support@bepress.com or by phone at 510-665-1200, option 2, weekdays from 8:30 a.m. to 5:30 p.m. Pacific time.

Common Fields for Batch Upload and Revise

If you would like to add a field to your spreadsheet or need assistance with custom fields, please contact bepress Consulting Services.

abstract – The abstract/description for the article.

author1_fname – First author/creator’s first name, or given name. See also author1_is_corporate to use this field for corporate authors.

author1_mname – First author/creator’s middle name.

author1_lname – First author/creator’s last name, or family name.

author1_email – First author/creator’s email address.

author1_institution – First author/creator’s institution. Additional authors/creators are entered as author2_fname, author2_mname, etc. The spreadsheet currently includes columns for four authors, but more authors may be added by inserting more author columns.

author1_is_corporate – Indicates whether a given author is a corporate or institutional entity. The default is FALSE. To indicate a corporate author, enter TRUE in this cell, and enter the name of the institution in the author1_fname cell.
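
The author columns follow a regular pattern, so they can be generated from a list of authors. The sketch below is one possible approach under stated assumptions: the function name and the input keys (fname, mname, lname, email, institution, corporate_name) are hypothetical, and TRUE/FALSE are written as text, which may need adjusting to match how your spreadsheet stores those cells.

```python
def author_columns(authors):
    """Flatten a list of author dicts into author1_*, author2_*, ... cells."""
    cells = {}
    for position, author in enumerate(authors, start=1):
        prefix = f"author{position}_"
        if author.get("corporate_name"):
            # Corporate author: the institution name goes in the fname cell.
            cells[prefix + "fname"] = author["corporate_name"]
            cells[prefix + "is_corporate"] = "TRUE"
        else:
            cells[prefix + "fname"] = author.get("fname", "")
            cells[prefix + "mname"] = author.get("mname", "")
            cells[prefix + "lname"] = author.get("lname", "")
            cells[prefix + "is_corporate"] = "FALSE"
        cells[prefix + "email"] = author.get("email", "")
        cells[prefix + "institution"] = author.get("institution", "")
    return cells
```

For a fifth or later author, the resulting keys (author5_fname and so on) correspond to the extra columns you would insert in the spreadsheet.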

acknowledgments – The cover page footnote/acknowledgments.

comments – Additional information, notes, or acknowledgments.

create_openurl – The default is 0, which does not create an OpenURL for the article. To create an OpenURL link for an article, place a 1 in this cell.

custom_citation – Used when the series needs a specific, custom citation style rather than the default.

degree_name – Name of the degree associated with the work (e.g., Masters in Operations Research).

degree_type – The type of degree. For ETDs, this will generally be entered as: thesis or dissertation.

department – The department associated with the ETD or other work.

disciplines – Please separate disciplines with a semicolon (e.g., Arts and Humanities; American Film Studies). See the master list of disciplines, which also includes best practices for choosing disciplines.

distribution_license – Creative Commons license for the work. The available values are the URLs listed below. For CC 3.0 values, replace “4.0” with “3.0” in the following:

  • http://creativecommons.org/licenses/by/4.0/ – Attribution 4.0
  • http://creativecommons.org/licenses/by-sa/4.0/ – Attribution-Share Alike 4.0
  • http://creativecommons.org/licenses/by-nd/4.0/ – Attribution-No Derivative Works 4.0
  • http://creativecommons.org/licenses/by-nc/4.0/ – Attribution-Noncommercial 4.0
  • http://creativecommons.org/licenses/by-nc-sa/4.0/ – Attribution-Noncommercial-Share Alike 4.0
  • http://creativecommons.org/licenses/by-nc-nd/4.0/ – Attribution-Noncommercial-No Derivative Works 4.0
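
The license values differ only in the license code and version, so they can be built rather than typed. This is a minimal sketch with a hypothetical function name. It produces only the six license codes and two versions listed above.

```python
CC_CODES = {"by", "by-sa", "by-nd", "by-nc", "by-nc-sa", "by-nc-nd"}
CC_VERSIONS = {"3.0", "4.0"}


def distribution_license(code, version="4.0"):
    """Build the distribution_license URL for a CC code such as 'by-nc'."""
    if code not in CC_CODES:
        raise ValueError(f"unknown license code: {code!r}")
    if version not in CC_VERSIONS:
        raise ValueError(f"unsupported license version: {version!r}")
    return f"http://creativecommons.org/licenses/{code}/{version}/"
```

For example, distribution_license("by", "3.0") returns http://creativecommons.org/licenses/by/3.0/.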

document_type – The document type for each record. This is specific to each publication. Please contact your consultant for a list of available values.

embargo_date – The date the record will be publicly available. For batch uploads to publications using the default list asset, enter 0, 365, 540, 730, or 1095 to indicate the number of days until an embargo expires. For batch revisions, enter the date of expiration in YYYY-MM-DD format; if embargo date is a required field but the record is not under embargo, enter today’s date to proceed.
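
Because embargo_date takes a number of days for uploads but a calendar date for revisions, it is easy to mix the two up. The sketch below keeps them apart. The function names are hypothetical, and the day values apply only to publications using the default list, as noted above.

```python
from datetime import date

UPLOAD_EMBARGO_DAYS = {0, 365, 540, 730, 1095}


def embargo_for_upload(days):
    """Return the cell value for a batch upload: an allowed day count."""
    if days not in UPLOAD_EMBARGO_DAYS:
        raise ValueError(f"embargo days must be one of {sorted(UPLOAD_EMBARGO_DAYS)}")
    return str(days)


def embargo_for_revision(expires=None, today=None):
    """Return the cell value for a batch revision in YYYY-MM-DD format.

    With no expiration date, fall back to today's date, as suggested for
    records that are not under embargo when the field is required.
    """
    if expires is None:
        expires = today or date.today()
    return expires.strftime("%Y-%m-%d")
```

For example, embargo_for_revision(date(2026, 1, 15)) returns 2026-01-15.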

fulltext_url – The URL of the main document (usually ending in a file extension like .pdf, .doc, .docx, .jpg, .gif, .png, .bmp). If the file is on a publicly accessible server, the system will copy the file at the URL provided and store it in this record. A full-text file is required for batch imports to image galleries.

identifier – A unique ID value for the resource. Commonly used for images.

keywords – Please separate keywords/keyword phrases with commas, unless a different delimiter has been set up for this field in the target publication.
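
A keyword that itself contains the delimiter will be split into two keywords on import, so it is worth checking for that when joining a list into a single cell. The following sketch uses a hypothetical function name and defaults to a comma. Pass a different delimiter if your publication uses one, or "; " to build a disciplines cell.

```python
def join_terms(terms, delimiter=", "):
    """Join terms into one cell value, rejecting terms that contain the delimiter."""
    cleaned = [term.strip() for term in terms if term and term.strip()]
    # Compare against the bare delimiter character(s), ignoring padding.
    marker = delimiter.strip() or delimiter
    for term in cleaned:
        if marker in term:
            raise ValueError(f"term contains the delimiter {marker!r}: {term!r}")
    return delimiter.join(cleaned)
```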

latitude – Latitude for the geolocation feature (needs to be valid and verified before the batch upload form is completed).

longitude – Longitude for the geolocation feature (needs to be valid and verified before the batch upload form is completed).
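
A simple range check can catch swapped or malformed coordinates before the spreadsheet is completed. This sketch assumes the values are entered in decimal degrees, which this guide does not state. The function name is hypothetical, and passing the check does not verify that the point is the intended location.

```python
def valid_coordinates(latitude, longitude):
    """Return True if both values parse as decimal degrees within range."""
    try:
        lat = float(latitude)
        lon = float(longitude)
    except (TypeError, ValueError):
        return False
    # Latitude spans -90..90 and longitude spans -180..180.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0
```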

multimedia_url – The URL of the streaming media file.

multimedia_format – When using streaming media, this is the format of the media file. Use one of the following values:

  • embedly – Other rich media
  • flash_audio – Flash Audio (m4a, mp3)
  • flash – Flash Video (flv, mp4, RTMP)
  • qt_audio – QuickTime Audio (aac, aif, mid, midi, mov, wav)
  • quicktime – QuickTime Video (3g2, 3gp, mov, mpg, mpeg)
  • real_audio – RealAudio (ra, ram, rm)
  • real_player – RealVideo (ram, rm, smi, smil)
  • swf_object – SWF format
  • vimeo – Vimeo
  • windows_audio – Windows Media Audio (wma)
  • windows_media – Windows Media Video (avi, wmv)
  • youtube – YouTube
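
Since multimedia_format accepts only the values listed above, a lookup against that list can flag entries such as a file extension typed by mistake. The sketch below uses a hypothetical function name and assumes the values are case-sensitive and written exactly as shown.

```python
MULTIMEDIA_FORMATS = {
    "embedly", "flash_audio", "flash", "qt_audio", "quicktime",
    "real_audio", "real_player", "swf_object", "vimeo",
    "windows_audio", "windows_media", "youtube",
}


def is_known_multimedia_format(value):
    """Return True if the cell value is one of the listed format values."""
    return isinstance(value, str) and value.strip() in MULTIMEDIA_FORMATS
```

For example, is_known_multimedia_format("mp4") returns False. The corresponding value from the list would be flash.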

publication_date – The publication date of the record. Please use YYYY-MM-DD format to ensure Excel recognizes and preserves the date correctly for import.
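
If your source data stores dates in another layout, they can be converted to YYYY-MM-DD before being entered. This is a minimal sketch with a hypothetical function name. The default input layout (month/day/year) is an assumption, so pass the format that matches your own data.

```python
from datetime import datetime


def to_iso_date(text, input_format="%m/%d/%Y"):
    """Convert a date string in a known layout to YYYY-MM-DD.

    Raises ValueError if the text does not match the layout or is not
    a real calendar date.
    """
    return datetime.strptime(text.strip(), input_format).strftime("%Y-%m-%d")
```

For example, to_iso_date("3/7/2019") returns 2019-03-07. Entering the result as text in the spreadsheet cell may help prevent Excel from reformatting it.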

rights – Copyright and/or usage rights information, often included for images.

season – The season corresponding to the publication date. Values are: Winter, Spring, Summer, and Fall.

subject_area – The subject area for each record. This will be specific to the publication.

title – The title of the record.

The following fields will appear at the right end of the Batch Revise spreadsheet in gray and are for system use only:

calc_url – Unique URL generated by the system for each record.

context_key – Unique identifier generated by the system for each record.

issue – Publication (e.g., series, image gallery, journal, event community) identifier. If including new submissions with a batch revision, copy the value that appears for existing records into the issue column for the new records.

ctmtime – Unique time stamp for each record.

original_upload_rownum – The original row number of a batch import spreadsheet where an unsuccessful submission was located. Used as a reference for revising placeholder submissions in a batch revise spreadsheet.
