Help

Searching

You can perform a simple search by typing keywords in the search box and clicking "Search". The search engine will return results that include all of your search terms both in the OCR text of newspaper articles and the metadata of photo archives. English keywords will return English articles and metadata only.

You can search using both traditional and simplified kanji keywords interchangeably. For example “大學” and “大学” yield the same results.

Boolean operators AND, OR and NOT can be used to refine your search results. AND (include all of the words) and NOT (without the words) narrow your search; OR (with at least one of the words) broadens your search. For example, plymouth NOT new will retrieve articles about Plymouth but not New Plymouth. You can group clauses using parentheses, for example (hamilton OR waikato) AND river.
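
To make the effect of the operators concrete, the following minimal Python sketch applies the two example queries to a handful of invented article titles. It is only an illustration of the logic, not the collection's search engine; the helper name `words` and the sample titles are assumptions made for this example.

```python
import re

def words(text):
    """Lowercase a text and return its set of words (hypothetical helper)."""
    return set(re.findall(r"[a-z]+", text.lower()))

# Invented sample titles, used only for illustration.
articles = [
    "Plymouth harbour news",
    "New Plymouth council meets",
    "Hamilton bridge over the river",
    "Waikato farm prices",
]

# plymouth NOT new: must contain "plymouth" and must not contain "new"
plymouth_not_new = [
    a for a in articles
    if "plymouth" in words(a) and "new" not in words(a)
]

# (hamilton OR waikato) AND river: parentheses group the OR clause
grouped = [
    a for a in articles
    if ("hamilton" in words(a) or "waikato" in words(a)) and "river" in words(a)
]

print(plymouth_not_new)
print(grouped)
```

Note that NOT excludes every article containing the word, so plymouth NOT new also drops an article that mentions Plymouth and uses "new" in an unrelated sense.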

Advanced search

The newspaper advanced search allows you to limit your search results by:

  • One or more publications
  • A date range

It also allows you to search within full text/comments, choose the number of search results you want displayed on each page, and choose whether you would like text or image previews displayed with your search results.

The Nippu Jiji photo archives advanced search allows you to limit your search results by:

  • Location
  • The year range in which the photograph was taken or published
  • Object genre
  • Subjects identified in the images

Combining a simple keyword search with the advanced search options lets you narrow your results further. For example, if you type “Scenery-Hawaii” in the keyword search box, choose a year range of 1906-1929, and select “photograph” as the object genre, the search returns photographs from 1906-1929 that belong to the “Scenery-Hawaii” collection.

Advanced query syntax

Query terms can be boosted to increase their importance in the search, changing the order of the search results. This is done by adding "^" and a boost factor at the end of the term, e.g. hamilton river^2 will treat "river" as more important than "hamilton" when ranking the search results returned.
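
The idea behind boosting can be sketched with a toy scoring function in Python. This is not how the collection actually ranks results; the functions `parse_query` and `toy_score` and the sample texts are hypothetical, and the score (occurrences multiplied by the boost factor) is a deliberate simplification.

```python
def parse_query(query):
    """Split a query such as 'hamilton river^2' into per-term weights."""
    weights = {}
    for token in query.lower().split():
        term, _, boost = token.partition("^")
        # A term without "^" gets the neutral weight 1.0.
        weights[term] = float(boost) if boost else 1.0
    return weights

def toy_score(text, weights):
    """Sum of (occurrences of each term) x (its boost); a toy stand-in for ranking."""
    tokens = text.lower().split()
    return sum(tokens.count(term) * weight for term, weight in weights.items())

# Invented sample texts, used only for illustration.
docs = ["hamilton hamilton river", "hamilton river river"]

weights = parse_query("hamilton river^2")
ranked = sorted(docs, key=lambda d: toy_score(d, weights), reverse=True)
print(ranked)
```

Without the boost both sample texts would score the same; with river^2 the text that mentions "river" more often moves to the top.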

Wildcard searches can be performed by including "?" (single character wildcard) or "*" (multiple character wildcard) in the query term. For example, hamilt* will match all words starting with "hamilt".
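
The two wildcards behave much like shell-style patterns, which Python's standard `fnmatch` module can demonstrate. This sketch only illustrates the matching rules on an invented word list; it assumes the search engine treats "?" and "*" the same way and says nothing about how the collection implements them.

```python
from fnmatch import fnmatchcase

# Invented vocabulary, used only for illustration.
vocabulary = ["hamilton", "hamiltonian", "hamlet", "roam", "foam", "roams"]

# "*" stands for a run of characters, so this matches words starting with "hamilt"
starts_with_hamilt = [w for w in vocabulary if fnmatchcase(w, "hamilt*")]

# "?" stands for exactly one character
one_letter_then_oam = [w for w in vocabulary if fnmatchcase(w, "?oam")]

print(starts_with_hamilt)
print(one_letter_then_oam)
```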

Fuzzy searching can be done by adding "~1" at the end of individual terms, e.g. roam~1 will find terms like "foam" and "roams" as well as "roam". This can help to compensate for errors in the text due to the Optical Character Recognition process.
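
One common way to define this kind of fuzzy match is edit distance: the number of single-character insertions, deletions, or substitutions needed to turn one word into another. The Python sketch below assumes that "~1" means "edit distance of at most 1"; the exact measure used by the search engine may differ, and the function name and word list are invented for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (or keep) a character
            ))
        prev = curr
    return prev[-1]

# Invented candidate words, used only for illustration.
candidates = ["roam", "foam", "roams", "room", "rooms", "frame"]

# Analogue of roam~1: keep words at most one edit away from "roam"
matches = [w for w in candidates if edit_distance("roam", w) <= 1]
print(matches)
```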

Proximity searching allows you to search for words that appear close together in the text. For example, "John Smith"~3 will find results containing both the words "John" and "Smith" where they are no more than 3 words apart. So as well as finding "John Smith" it will also find "John J. Smith", "John Frederick Smith", "John Fullerton-Smith", and even "Smith, John".
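
The examples above can be reproduced with a simplified Python check that splits the text into words and tests whether the two names occur within a given number of word positions of each other, in either order. This is a sketch under that simplifying assumption; the search engine's actual distance rule may differ, and the function name `within_words` is hypothetical.

```python
import re

def within_words(text, first, second, max_distance):
    """True if `first` and `second` occur at most `max_distance` word positions apart."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    pos_first = [i for i, t in enumerate(tokens) if t == first.lower()]
    pos_second = [i for i, t in enumerate(tokens) if t == second.lower()]
    return any(abs(i - j) <= max_distance for i in pos_first for j in pos_second)

samples = [
    "John Smith",
    "John J. Smith",
    "John Frederick Smith",
    "John Fullerton-Smith",
    "Smith, John",
    "John went to the market with Mr Smith",
]

# Analogue of "John Smith"~3
results = {s: within_words(s, "John", "Smith", 3) for s in samples}
for sample, found in results.items():
    print(found, sample)
```

In the last sample the two names are seven word positions apart, so it is not matched.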

Optical Character Recognition

Optical Character Recognition, or OCR, is a process by which software reads a page image and translates it into a text file by recognizing the shapes of the letters (The NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials).

OCR enables searching of large quantities of full-text data, but it is never 100% accurate. The level of accuracy depends on the print quality of the original issue, its condition at the time of microfilming, the level of detail captured by the microfilm scanner, and the quality of the OCR software. Issues with poor-quality paper, small print, mixed fonts, multiple-column layouts, or damaged pages may have poor OCR accuracy.

The searchable text and titles in this collection have been automatically generated using OCR software. They may not have been manually reviewed or corrected.

To look at the OCR text, click on the "Show computer-generated text for this article" link on the article page.

How to correct text

The text correction interface is accessed by clicking the "Correct this text" link when viewing section text. For handwritten articles, a "Transcribe now" link appears. The interface is split into two parts: the right side shows the page images that make up the document, and the left side is used for editing the lines of text.

When you move your mouse over the page images in the right pane, the blocks making up the pages will highlight. You can scroll this view by dragging with the mouse or zoom in/out using the buttons above the viewer. Clicking a highlighted block will select it and load a form for editing that block into the left pane.

Correct the text line by line. A red box is displayed in the right pane to help you determine what text should be included in the line. Once you have finished correcting text, click "Save". The changes you make will take effect immediately. Alternatively, clicking the "Cancel" button will discard any unsaved changes you have made.

You can then make further corrections to the same block, move onto the next block by clicking the "Next" button, select another block in the right pane, or exit the text correction view by clicking the "Return to viewing mode" link. Clicking "Save & exit" instead of "Save" will save the changes and then return you to the normal viewing mode automatically.

Hint: Many web browsers include spell-checking functionality and this can assist with your text correction by identifying misspelt words. If your web browser does not have this functionality, it's likely there is a spell-checking add-on available (see your web browser's help for information on how to install add-ons).

English and Japanese Descriptive Metadata

Nippu Jiji organized photographs and related newspaper articles and documents into titled envelopes. We created English and Japanese descriptive metadata to increase the searchability of the photographs and related information, using the original Nippu Jiji or Hawaii Times descriptions recorded on the envelopes and the backs of the photographs. In addition to the original images from the Nippu Jiji Photo Archives, we included images of the referenced publication pages when available. Each metadata category is explained in more detail below.

Collection:

The majority of the collection names were taken from the original Nippu Jiji categories, although some were absorbed into larger collections after the rescue of the photo archives. We also created two new categories, “Animals” and “Plants”, out of the envelopes originally labeled “Miscellaneous” to increase discoverability. Apart from these new collections, we minimized further reorganization of the digital images so that they stay consistent with the organization of the physical photos. Some images, therefore, may appear out of place under their collection names.

Title:

We used the original Nippu Jiji or Hawaii Times titles and added basic information recorded on the envelopes. We described locations and dates in separate metadata categories. We wrote names in the Latin alphabet in the order of first name then family name, and names in Japanese in the order of family name then first name. When American names were included, we kept the order of first name then family name even in Japanese. To ensure better searchability, we did not use macrons to differentiate long from short vowels. We used the historical transliteration where available, and the most commonly used transliteration otherwise.

Date:

The date written on each envelope or photograph was entered into “Date” in YYYYMMDD format, while any estimated dates were assigned to “Approx. Date”. We used multiple clues to derive the approximate date. If a date is completely unknown, we applied a known date range delimited by events such as the founding of Nippu Jiji (November 3, 1906), the Japanese attack on Pearl Harbor (December 7, 1941), and the bankruptcy of Hawaii Times (May 8, 1985).
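
For anyone working with exported metadata, a YYYYMMDD value can be read with Python's standard `datetime` module. The sketch below is only an illustration: the function names are hypothetical, and the two boundary dates are the founding and bankruptcy dates given in the paragraph above.

```python
from datetime import date, datetime

def parse_yyyymmdd(value):
    """Convert a string such as '19061103' into a date object."""
    return datetime.strptime(value, "%Y%m%d").date()

# Boundary dates taken from the text above.
NIPPU_JIJI_FOUNDED = date(1906, 11, 3)
HAWAII_TIMES_BANKRUPTCY = date(1985, 5, 8)

def in_known_range(d):
    """True if a date falls between the founding and the bankruptcy (inclusive)."""
    return NIPPU_JIJI_FOUNDED <= d <= HAWAII_TIMES_BANKRUPTCY

pearl_harbor = parse_yyyymmdd("19411207")
print(pearl_harbor.isoformat(), in_known_range(pearl_harbor))
```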

Locations:

The locations written on each envelope or photograph were entered into “Country,” “State,” “City,” and “Other Geographical Location”. For example, if “Honolulu” is written on an envelope, “United States,” “Hawaii,” “Honolulu,” and “Oahu” were entered in the location fields. If a location is not written on the photograph, the location written on the envelope was entered into the approximate location fields. For locations in Hawaiʻi, we did not use the ʻokina, for better searchability. Countries were entered based on modern geography.

Printing

Articles can be printed directly from your web browser.

If available, PDF versions of issues and pages can be downloaded for printing.

Identifying Locations, Dates, and Subjects of Photographs

Users registered at the Hoji Shinbun Digital Collection can contribute to the metadata of the Nippu Jiji Photo Archives, by identifying locations, dates, and subjects that appear on each photograph.

IIIF

The Hoji Shinbun Digital Collection is International Image Interoperability Framework (IIIF) compatible, providing interoperability of images between image repositories and making it easier for users to view, compare, manipulate, and annotate web-based digital images. Users can perform these functions by using a viewer. Mirador is an IIIF viewer that allows users to open and compare multiple images and annotate them.

To view or annotate images, drag and drop their IIIF icons onto Mirador.

Technical requirements

In general, you only need a common web browser like Chrome, Firefox, Internet Explorer, Safari, Opera or Microsoft Edge to search and browse this collection. To view or print PDFs, you will also need a PDF viewer like Adobe Reader.

Contact information

For web-related inquiries, please contact hojishinbun_support@stanford.edu.

Website copyright

Powered by the Veridian platform. © 2008-2020 DL Consulting. All rights reserved.