Topic: How to prepare data from Excel

Topic type:

Part of the Import Guide. Preparing the data and how to format it in an Excel spreadsheet

Part of the Import Guide. Based on a guide originally written for Aotearoa People’s Network Kaharoa.

 

Here's what you'll need to do so that your import runs smoothly into Kete:

Use Excel 2003 or later

Excel 2003 and later have the option to export the data in XML. The XML format solves a number of problems that earlier formats, such as CSV have. We'll come back to exporting to XML in the next section.

Create one spreadsheet per document

Kete only handles Excel documents that contain only one spreadsheet! If you have multiple sheets in your Excel documents, split them into separate Excel documents and treat them individually. 

Describe one type of item per speadsheet

Kete only supports adding one type of item in a given import! Your rows should describe only topics or only images or only audio recordings, etc. If you have items of many different types, you will need to split them into separate Excel documents.

If you are importing what will become a topic in Kete, it has a topic type such as Person or Event. Each import of topic items can correspond to only one topic type (i.e. your import may be about events, not events and people).

Create a row of column names

Start your spreadsheet with a row that has the name for each column. If the column name has more than one word, separate them with the _ character rather than a space. Also, avoid punctuation or other special characters in the column names, e.g. use "reference_number" not "Reference #".

A "File" column if necessary

If you are importing a set of image, audio recording, video, or document files into your site, you'll need to have a column which tells Kete where to find the file for the row in your import folder. Kete will assume the file is found under the import folder's files subfolder and will simply look for whatever filename you specify in the File column there. The filename should include the extension.

In other words, if you specify "image1.jpg" in the File column for a row, Kete will look for it at import_folder/files/image1.jpg. 

Kete can handle filenames that include a folder location, or path, in the File column as long they fit the following criteria:

  • the folder location is notated using forward slashes ala the conventions of HTML and URLs, e.g. parent_folder/child_folder/
  • the folder location is relative to the files subfolder within the import folder, e.g. 001/image1.jpg in the File column for a row means that the file is at import_folder/files/001/image1.jpg

In Kete 1.3 and above - Create a "Folder" column that will enable the generation of a PDF document if necessary

As of version 1.3, Kete has the ability to create a PDF document from a folder of JPEG or PNG files. This is especially good for creating one document from a set of images that are individual scans of pages from a manuscript.

To use this feature do the following:

  1. group your images in a folder, named for what you want the PDF document's filename to be (without the .pdf which will be added during the process), making sure that filenames will list alphanumerically in the sequence you want your PDF's pages to be ordered, e.g. import_folder/files/manuscript_1/page1.jpg, import_folder/files/manuscript_1/page2.jpg, import_folder/files /manuscript_1/page3.jpg, etc.
  2. in your spreadsheet, create a Folder column and put the folder name the contains the images files for the row in it

A Note on Cell value formats:

In many cases you simply want to import text from Excel to Kete and cell values will transition smoothly between the two. However, sometimes you may need to keep in mind what format Kete expects your data to be in.

For data that is meant to end up in Kete's standard fields (title, short summary, and description) Kete will expect text. For the standard tags field it expects the tag values to be separated by commas if there is more than one tag.

However, the field's constraints on what is a valid cell value are determined by whatever settings are configured for it. If your data uses extended fields then you'll need to make sure it will work with the particular extended field's configuration. For example, if a cell's value is meant to end up in an extended field that is set to only accept dates, Kete will expect cell values in the format YYYY-MM-DD, e.g. "2009-10-27".

You'll want to keep the relationship between your Excel cells' value formats and your target Kete fields in mind.


Exporting the Excel spreadsheet's data to an XML file that Kete can understand

Once you have gotten your spreadsheet formatted for Kete, you are ready to export it to XML.

  1. Choose Save as from under the Office button in Excel 2007 or under the File menu in Excel 2003
  2. For Save in, choose your import folder
  3. Choose the XML Spreadsheet 2003 (*.xml) from the Save as type field.
  4. Make the File name field's value be "records.xl.xml"
Important Note: Do NOT choose the 2007 XML format as it won't work with Kete.

Discuss This Topic

There are 0 comments in this discussion.

join this discussion

Creative Commons Attribution-Share Alike 3.0 New Zealand License
How to prepare data from Excel by Walter McGinnis is licensed under a Creative Commons Attribution-Share Alike 3.0 New Zealand License