Patent Help

Patent Lens full-text worldwide patent search engine

Contents

A. Search Overview
B. Patent Collections
C. Boolean Syntax
D. Document Sections
E. Patent Number Search
F. Example Searches
G. Advanced Search Features
H. INPADOC Features
I. Search updates by RSS Feed
J. Limitations and Caveats

A. Search Overview

  1. Choose the "interface" that best suits your searching needs (by selecting from "tabs" at the top of the gray search area):

    • Quick search page. Allows keyword searches to be performed on the "full text" or one of the front page sections (title, abstract, inventor, applicant) using the appropriate syntax. In addition, particular patents may be found by their publication number (more information).
    • Structured search page. Provides a menu-driven approach to formulating more complex Boolean search expressions. Optional "restrictions" allow searches to be restricted by publication date or application date (more information).
    • Expert search page is designed for users who prefer to manually enter the full Boolean search query. Although it is more difficult to learn, it is more powerful for complex searches and is better suited to experienced searchers (more information).
  2. Select the patent collection(s) (US-B, EP-B, AU-B, WO-A, US-A, AU-A) you wish to search against.

    • US-B contains all US published granted patents (1976 - present)
    • EP-B contains all European published granted patents (1980 - present)
    • AU-B contains all Australian published granted patents (1998 - present)
    • WO-A contains all World (PCT) patent applications (1978 - present)
    • US-A contains all US published patent applications (2001 - present)
    • AU-A contains all Australian published patent applications (1998 - present)
  3. Enter a search word or a combination of words (the fulltext field is suggested as a starting point). Note, stemming is used by the search engine (more details below). The wildcard character is *, and may only be used at the end of a word (and must be preceded by at least 2 characters). Words may be combined using the Boolean operators:

    • AND
    • OR
    • AND NOT (or just "NOT" in the Structured search interface)
    • NEAR/# where the number (#) is the maximum number of word gaps between the search words, e.g. "NEAR/5" (Note: can only be used within the same "document section")
  4. [Structured & Expert pages] Select the document section you wish to search against. The default field is the "fulltext" of the entire patent. Sections include:

    • in fulltext, searches the entire text of the patent. Note that for WO-A (1990 onwards), EP-B (up to 2001) and AU-B collections, except for the front page, data are derived from the OCR text of the original document "image".
    • a particular (front page) field, e.g. in abstract, in title, in inventor, etc.
  5. Press the Search button to start the search.

    • the Reset button clears all entered values.
  6. There are a number of options available to modify the display of the Search results:

    • each page of results can be viewed in groups of 10 (default), 20, 50 or 100 results at a time, by selecting from the "Items per page" popup menu.
    • The rightmost field may be toggled to display either the publication date or the application date, by selecting from the popup menu.
    • The results list may be sorted according to relevance ranking score (default), patent number, application date or publication date, by selecting the drop-down list labelled "Sort by".
      [Note, to turn relevance ranking OFF, select one of the other three sort options: patent number, application date or publication date.]
      To reverse the sort order, click on the triangle icon at the top of the table column (clicking again toggles the order, as indicated by the direction of the triangle icon). (Not available for relevance score sorting).
    • Previous searches (last 20 searches only) are recorded under the Search History option. To re-run a previous search, select the search terms from the "Search history" pull-down menu and that search will be re-loaded.
    • Stemming (searching with word variations) can be turned on or off.
  7. Clicking on a patent number takes you to the detailed view (front page) for that patent. Where available, the "PDF" image and "full text" versions of the patent may be viewed by clicking on their respective icons.


B. Patent Collections

The Patent Database currently contains data from:


C. Boolean Search

The BiOS patent search database uses Boolean operators. We recommend this tutorial on Boolean searching (from the Syracuse University Center for Science & Technology). Multiple search terms are required to be combined with one or more of the appropriate Boolean operators:

Because AND takes precedence over OR, it is a good idea to use parentheses (round brackets) to group operators and search terms; otherwise you may not get the results you intend. Thus, (A OR B) AND C means the same as (A AND C) OR (B AND C): the document must contain at least two of the search terms, one of them C. In contrast, A OR (B AND C) means the document contains either A, or both B and C, or all three terms. Because AND is applied first, writing the expression without parentheses as A OR B AND C gives the same result as the second example.
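The effect of grouping can be illustrated with ordinary set operations. The following is a minimal Python sketch, not the search engine's implementation; the sets A, B and C are made-up sample data standing for the IDs of documents that contain each term.

```python
# Hypothetical sample data: each set holds the IDs of documents containing a term.
A = {1, 2, 3}
B = {3, 4}
C = {2, 4, 5}

# (A OR B) AND C: the document must contain C plus at least one of A or B.
grouped_first = (A | B) & C

# A OR (B AND C): the document contains A, or both B and C.
grouped_second = A | (B & C)

print(sorted(grouped_first))   # [2, 4]
print(sorted(grouped_second))  # [1, 2, 3, 4]
```

The two groupings return different documents from the same terms, which is why explicit parentheses are worth the extra typing.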

The NEAR/# operator provides a constraint on finding two terms, so that a "match" only occurs if those terms are located within a given number of words (or less) from each other. The "#" number defines the maximum number of word "gaps" that can occur between the two search terms. The NEAR operator can only be used within the context of the same document section (if specified). Thus, the search "cat NEAR/4 dog" would result in a "match" for the text "the cat is chasing the dog", as "cat" and "dog" are located within 4 word gaps of each other. However it would NOT match "the cat is a smaller animal than a dog", as "cat" and "dog" are separated by 7 word gaps in this case. More examples can be seen here.
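The word-gap rule can be sketched in a few lines of Python. This is an illustration of the counting described above, not the search engine's code: the function name `near` is hypothetical, and treating the match as case-insensitive and independent of word order is an assumption of the sketch.

```python
import re

def near(text, first, second, max_gaps):
    """Return True if `first` and `second` occur within `max_gaps` word gaps.

    A "word gap" is counted as the difference in word positions, so two
    adjacent words are 1 gap apart. Case-insensitive, order-independent
    matching is an assumption of this sketch.
    """
    words = re.findall(r"\w+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        abs(i - j) <= max_gaps
        for i in first_positions
        for j in second_positions
    )

print(near("the cat is chasing the dog", "cat", "dog", 4))              # True
print(near("the cat is a smaller animal than a dog", "cat", "dog", 4))  # False
```

In the first sentence "cat" and "dog" are 4 positions apart; in the second they are 7 apart, so NEAR/4 does not match.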

The use of parentheses (round brackets) aids readability and can be used to over-ride the precedence of operators (see examples).


D. Document Sections and Fields

Here you can select a "Document Section" or "Field" present on the front page to insert into your Boolean search. This will restrict the search to that particular section or field.

Note: If the field is omitted, the search is performed against the "fulltext" of the document (i.e. the entire patent). In other words, the default behavior is to search against the entire patent document.

Syntax: "word in field", e.g. "plant in title" or "word in section", e.g., "plant in frontpage".

The searchable fields are:

The searchable sections are:

See the example searches for real-world searches that illustrate these features.

Other sections to be implemented in future versions include: summary of the invention & examples.

The front page of a patent document (either application or granted patent) contains some basic information about the document - the title, the date of issue, etc. Only some of these items are currently searchable, and not all items are common to every patent document.

In particular, the abstract is a very brief summary of the invention, which is printed on the front page. The abstract does not have any legal meaning and may or may not be an accurate summary of the patent document. In contrast to the other documents in our collections, the cover page of a European granted patent (the EP-B collection) is published by the European Patent Office (EPO) without an abstract. Similarly, the AU-A and AU-B data we receive from IP Australia do not include abstract text (however, an abstract is usually present in the "image" files and thus requires OCR). The CAMBIA patent database provides abstract text for both EP-B and AU-B patents by extracting the text from an "equivalent" patent document, where available. For EP-B patents this is either the corresponding EP-A application or the PCT international application; for AU-B patents, it is the equivalent PCT international application or the OCR text extracted from the patent image. In both cases, the "source" of the abstract is indicated in parentheses after the abstract text itself.


E. Publication Number

For each collection in the BiOS Patent Lens databases there is a standard format used to specify a Publication Number. Formats for each patent office are described in the WIPO standard (PDF). There is also a very comprehensive "US Patent Number Guide" (PDF) by Michael White of Queen's University in Canada.

The relevant formats for the BiOS Patent Lens database are:

Document type                     Format                    Examples

PCT applications (WO-A)           WO yyyy/nnnnn [An]        WO 2001/21785
                                  WO yy/nnnnn               WO 01/21785
European granted patents (EP-B)   EP nnnnnnn [Bn]           EP 383808 or EP 383808 B1
US applications (US-A)            US nnnnnnnnnnn            US 20030121074
                                  US yyyy/nnnnnnn           US 2003/0121074
US granted patents (US-B)         US nnnnnnn                US 5916570
AU granted patents (AU-B)         AU yyyy/nnnnn[n] [Bn|C]   AU 2005/200191 B2
AU applications (AU-A)            AU yyyy/nnnnn[n] [An]     AU 2002/367775 A1
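The formats listed above can be expressed as regular expressions, which may help when checking a list of publication numbers before searching. The Python sketch below is an illustration only and is not how the search engine parses numbers. The name `classify` is hypothetical, the digit ranges (such as 1 to 7 digits for EP and US granted numbers) are assumptions, and an AU number given without a kind code is arbitrarily reported as AU-A because the format alone cannot distinguish the two AU collections.

```python
import re

# Patterns transcribed from the format listing; exact digit ranges are assumptions.
PATTERNS = [
    ("WO-A", re.compile(r"WO (\d{4}|\d{2})/\d{5}( A\d)?")),
    ("EP-B", re.compile(r"EP \d{1,7}( B\d)?")),
    ("US-A", re.compile(r"US (\d{11}|\d{4}/\d{7})")),
    ("US-B", re.compile(r"US \d{1,7}")),
    ("AU-B", re.compile(r"AU \d{4}/\d{5,6} (B\d|C)")),
    # Without a kind code an AU number is ambiguous; this sketch reports AU-A.
    ("AU-A", re.compile(r"AU \d{4}/\d{5,6}( A\d)?")),
]

def classify(publication_number):
    """Return the collection whose format matches, or None if none does."""
    candidate = publication_number.strip().upper()
    for collection, pattern in PATTERNS:
        if pattern.fullmatch(candidate):
            return collection
    return None

for number in ["WO 01/21785", "EP 383808 B1", "US 2003/0121074",
               "US 5916570", "AU 2005/200191 B2", "AU 2002/367775 A1"]:
    print(number, "->", classify(number))
```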

Note:


F. Example Searches

The following search examples will help illustrate how to use the "Boolean operators" and "document section" syntax required to successfully search the patent database. These searches were performed against the US-B collection (as of Oct 2002).

Search Query                      Hits     Comment

rice                              14,386   If no document section is specified, the default is "in fulltext".
rice in frontpage                 1,657    Finds documents that contain "rice" anywhere within the "front-page" section of the document.
agri* in title                    784      Finds documents that contain "agri" followed by any other characters (* = wildcard) anywhere within the "title" field of the document (will match agriculture, agricultural, agribusiness, etc.).
rice or maize                     17,598   Finds documents that contain either "rice" or "maize" anywhere in the document.
rice and maize                    4,467    Finds documents that contain both "rice" and "maize" anywhere in the document.
rice and not maize                9,919    Finds documents that contain "rice" but do not contain "maize" anywhere in the document.
maize and not rice                3,212    Same as "not rice and maize"; finds all documents that contain "maize" but do not contain "rice".
rice near/5 maize                 2,503    A subset of the matches found for "rice and maize", where "rice" and "maize" occur with a maximum of 4 intervening words.
rice near/100 maize               3,520    As above, but with a maximum separation of 99 intervening words.
rice and maize in title           208      A subset of "rice and maize", where "maize" appears in the title; could also have been written "rice and (maize in title)".
(rice and maize) in abstract      48       A subset of "rice and maize", where both "rice" and "maize" appear in the "abstract" field.
(rice near/2 maize) in abstract   33       A subset of "rice and maize", where "rice" appears within 2 words of "maize" in the "abstract" field.
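The hit counts listed above are mutually consistent under ordinary set arithmetic, which is a useful sanity check when combining AND, OR and AND NOT. The short Python check below uses only figures from the examples. The count for "maize" on its own is not listed and is derived here, so treat it as a calculated value rather than a reported one.

```python
rice = 14_386
rice_or_maize = 17_598
rice_and_maize = 4_467
rice_and_not_maize = 9_919
maize_and_not_rice = 3_212

# "rice and not maize" is what remains of "rice" after removing the overlap.
assert rice - rice_and_maize == rice_and_not_maize

# Inclusion-exclusion: |rice OR maize| = |rice| + |maize| - |rice AND maize|,
# so the (unlisted) count for "maize" alone can be derived.
maize = rice_or_maize - rice + rice_and_maize
assert maize - rice_and_maize == maize_and_not_rice

print(maize)  # 7679
```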


G. Advanced Search Features

Quick Search Page

Patent Number search

Structured Search Page

Expert Search Page


H. INPADOC

The INPADOC database from the European Patent Office is a collection of bibliographic data about patent documents (both applications and granted patents), and information about the legal status of those documents. The BIOS PatentLens extracts raw INPADOC data through the Open Patent Services (OPS) web services interface provided by the EPO. The raw data are interpreted by the PatentLens and presented inside a simple user interface showing a patent document's family relationships and legal events.

The bibliographic information is contributed to INPADOC from over 70 national patent offices and regional patent organizations (e.g. WIPO and the EPO itself). The legal status data is more limited: in February 2005 there were 23 organizations contributing data to this part of the INPADOC database.

The BIOS PatentLens allows you to inspect the family and legal information for any document that has been retrieved by a patent search. Simply click on the INPADOC icon to go to the INPADOC Family & Legal Status section of the search result details view. This section shows the family documents that are related to the current document (the so-called patent family). For each family member, bibliographic information and the legal events associated with that member (where available) are shown inside a tabbed interface.

The second table ("Priority applications") shows the priority applications referred to by any member of the family, sorted by application date. Each priority is assigned a priority index in the first column headed "#". This priority number is referred to in the columns labelled "Priority claims" in the table of family members. A cross ("X") in priority claim column N for family member M indicates that member M claims priority N.

Here's a typical example:

[Image: INPADOC]

Detailed information for each family member is presented inside a tabbed interface. Simply click on each tab to view its contents. Some members will have more tabs because more information has been supplied to INPADOC. Publication numbers shown in blue are hyperlinked to the BIOS PatentLens detailed view for that publication.

[Image: INPADOC tabs]

To view only the contents of a particular tab (e.g. only "Legal events") for each member of the family, click one of the buttons labelled "View options". You can show the contents of all tabs for all members by clicking "Show all". Printing the page will print the contents of the selected tabs only.

[Image: INPADOC view options]


I. Search updates by RSS Feed

Once a search has been performed, an RSS feed for that search is available via the RSS icon. The link address (URL) associated with the RSS icon must be imported into an RSS news reader (sometimes called an aggregator) by copying and pasting the link URL (right-click the link and select "copy link", or click the link and then copy the URL from the address bar). Once saved in the RSS news reader, that search will be periodically checked for new patents and you will be alerted to new "hits".
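For readers curious about what a news reader does with such a feed, the Python sketch below shows the basic idea of spotting new items. It is an illustration only: the sample feed, its titles and the example.org links are made up, and the sketch assumes a plain RSS 2.0 layout with `item`, `title` and `link` elements, which may differ from the feed the site actually produces.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 sample standing in for a search feed; the real feed's
# titles, links and element set may differ.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example search feed</title>
    <item><title>US 5916570</title><link>http://example.org/1</link></item>
    <item><title>WO 2001/21785</title><link>http://example.org/2</link></item>
  </channel>
</rss>"""

def new_hits(feed_xml, seen_links):
    """Return (title, link) pairs for feed items whose link is not in seen_links."""
    root = ET.fromstring(feed_xml)
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if link not in seen_links:
            hits.append((title, link))
    return hits

seen = {"http://example.org/1"}
print(new_hits(SAMPLE_FEED, seen))  # [('WO 2001/21785', 'http://example.org/2')]
```

A reader repeats this check periodically, adding each reported link to the set of links it has already seen.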

If you develop a particularly useful patent search for your area of expertise, you may like to share this search by emailing the RSS URL to your colleagues or publishing it on a blog. A future enhancement to the RSS feed functionality will be to allow users to share searches via the Patent Lens site and conversely to allow users to browse lists of saved searches by other users, categorised by areas of technology, etc.


J. Limitations and Caveats to the Patent Search Database

While we have strived to produce the highest quality patent database, the user should be aware that there are limitations that may affect the outcome of any search. Some of these limitations are inherent in the data provided by the Patent Offices, while others result from the processing of these data. In the interest of full disclosure, below is a list of known issues with the data and their causes.

In addition, the search engine and web interface have their own limitations, which are listed under "Search engine mechanisms" below.

Data issues

  1. Mis-spellings (typos)
    • can be inherent in the original data, in which case they will appear in the PDF document (where available);
    • can arise from OCR (optical character recognition) processing (which puts images into a full-text searchable format) in two ways, in which case the correct spelling will appear in the PDF document:
      (i) the OCR process is generally only 99% accurate;
      (ii) words may be split over two lines by hyphenation in the original patent document. Currently, such words are indexed as two separate parts by the OCR process. For example, if the word "magnetism" is split over two lines as "magnet-ism", the OCR process indexes it as two separate words, "magnet" and "ism". Where the error is noted, the affected documents will be re-processed to correct this problem.

    NOTE: OCR-derived mis-spellings apply only to the full text of WO-A, AU-B and EP-B patent documents (prior to 2000), which is generated by OCR of the original facsimile images (presented here as the PDF images).

  2. Alternate spellings
    • many words in English can be spelled differently, depending on the preference of the writer (e.g., harbor/harbour; center/centre; labeled/labelled);
    • spelling is usually, but not always, consistent within a document;
    • in US patent documents, mostly the spelling is American even if the writer is not from the U.S. while in EP patent documents, the spelling is mostly British;
    • in WO documents, the spelling preference may depend upon the country of origin or the receiving office.
  3. Names (inventors, assignees, etc)
    • names in the inventor, applicant/assignee or agent fields are indexed just like any other word. The various collections format names in different ways, e.g. "John Smith" may appear in any of the following forms: "J. Smith"; "John Smith"; "Smith, John"; "Smith, J.", etc.;
    • the best approach to searching for a particular person's name is to use just the last name, surname or family name (e.g. "Smith") and, if too many documents are returned, to refine the search with one or more additional criteria, such as an organisation name (e.g. "university AND Cornell").
  4. Using the "near" operator may return erroneous results
    • words near each other in documents that were originally formatted in columns may be far apart in full-text and vice versa;
    • some EP-B (publication date pre-2000) and PCT documents were printed in two-column format but when they were OCR processed, the scan went across the entire page regardless of whether the text was in columns or not (we are currently re-processing those documents to alleviate this problem).
  5. Inconsistency of presentation among data sets (e.g. Greek letters, layouts, fields present, order of fields)
    • these inconsistencies will affect your search strategy and search results;
    • Greek letters: It is now possible to search such characters by entering the Unicode character, e.g. beta = β. Please refer to the manual for your computer's operating system for instructions on entering non-roman characters.
    • layouts: unlike the other data sets, the U.S. patents generally have a fixed set of headings (e.g. Field of the Invention; Summary of the Invention);
    • fields: not all information on the front page is common among the datasets. For example, U.S. documents may contain fields (e.g. U.S. classification codes) not present in EP and WO documents;
    • see the collections section for more detail.
  6. Erroneous publication date for WO-A
    • WO-A applications are first published either with the search report (A1 designation) or without the search report (A2 designation);
    • the search report, if published on its own, has an A3 designation and is published later in time than the A1 document;
    • in the case where the A3 search report has been published, our database incorrectly gives the "publication date" as being the date of publication of the A3 document, instead of the true date of publication for that patent application (A1 or A2). This problem arises as a result of the order of the data we receive from the PCT, and we are in the process of correcting for it.
  7. Full-text available for selected publications
    • Full text documents include:
      • 1976 onwards - All US granted patents
      • mid-1998 onwards - All Australian granted patents
      • 1980 onwards - EP-B granted patents
      • 1978 onwards - WO-A/PCT patents
  8. PDF images not available for all publications
    • the full patent text as published is available as a PDF download for most of the collections, but occasionally an image is missing in the data;
    • AU-B patents give PDF images only from 1998 onwards (temporarily not available due to data upgrade)
  9. Published in language other than English
    • European patents may be published in French, German or English (the claims are published in all three languages);
    • WO patent applications can be in any language accepted by the office that receives the applications (e.g. Japanese, Russian).
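The hyphenation issue described under item 1 above (a word such as "magnetism" indexed as "magnet" and "ism") is commonly handled by rejoining words that were split at a line-ending hyphen before the text is indexed. The Python sketch below illustrates that general idea only. It is not the re-processing actually used for this database, and the function name `rejoin_hyphenated` is hypothetical.

```python
import re

def rejoin_hyphenated(text):
    """Join a word split by a hyphen at the end of a line, e.g. "magnet-\\nism".

    Caveat: a genuinely hyphenated term that happens to break at a line end
    (e.g. "beta-\\ncarotene") is also joined, losing its hyphen.
    """
    return re.sub(r"(\w)-\n\s*(\w)", r"\1\2", text)

print(rejoin_hyphenated("the study of magnet-\nism in metals"))
# the study of magnetism in metals
```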


Search engine mechanisms

  1. Search term requirements
    • query can consist of any combination of words (e.g. "corn"), words and numbers, and punctuation (e.g. "S35", "beta-carotene", "P-32");
    • query cannot consist only of numbers or of numbers and punctuation, i.e. it must contain at least one letter character.
  2. Search query history and "Saved Patents"
    • the patent search engine remembers the last 20 searches you have performed;
    • individual patents may be bookmarked in the “Saved Patents” basket for viewing later;
    • for these functions, cookies and JavaScript must be enabled in your web browser;
    • up to 100 patents can be stored in the basket;
    • search histories and basket listings are stored for a maximum of two months (unless there is activity) before they are automatically deleted.
  3. Wildcards
    • the wildcard character * may be used at the end of a word to search for a partial matching word, e.g. “agri*” will match "agriculture", "agricultural", "agribiotech", "agribusiness", etc.;
    • a search word cannot start with * (must be preceded by at least 2 characters).
  4. Stemming
    • Stemming, when turned on, will return documents which contain not only the word but also any of its stem variants, e.g. "separate" returns "separation", "separates", "separating". It works differently from wildcards, applying rules of English spelling, e.g. "fly" finds "flies".
    • Stemming can be turned on or off via the drop-down in the "Preferences" section.
  5. Relevance ranking
    • sorting by relevance score uses a combination of two relevance ranking algorithms, TFIDF (term frequency inverse document frequency) term weighting and proximity matrix weighting. To turn relevance ranking off, select the "sort by: patent number" drop-down option. The relevance scores, which are displayed in the last column, have no inherent meaning but simply differentiate the order of results for a given search.
  6. The NEAR operator
    • can only be used in the context of the same field.  NEAR between different fields is not allowed:
      e.g. “((crop NEAR/5 soil) in title)” is OK, but “((crop in title) NEAR/5 (soil in abstract))” is incorrect (and will produce an error).
  7. Searchable fields on front page
    • not all front-page fields are searchable;
    • the searchable fields have been restricted to certain fields;
    • some of these fields can now be searched separately, and a future version will include the ability to search all front-page fields.
  8. Date and classification searches
    • it is not possible to search solely by dates using the "optional search features" on the "Structured" search page (at least one other search term is needed), but a search query can be restricted by application date or publication date.
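The search term and wildcard rules in items 1 and 3 above can be sketched as a small pre-check to run before submitting a query. This Python example is an illustration under stated assumptions, not the search engine's own validation: the function name `check_term` and the wording of its messages are hypothetical, and applying the "at least one letter" rule to each individual term is one interpretation of the rule.

```python
def check_term(term):
    """Return a list of problems with a single search term (empty if none)."""
    problems = []
    # Item 1: a term made only of numbers and punctuation is not accepted.
    if not any(ch.isalpha() for ch in term):
        problems.append("must contain at least one letter")
    # Item 3: * is allowed only at the end, after at least 2 characters.
    if "*" in term:
        stem = term[:-1]
        if not term.endswith("*") or "*" in stem:
            problems.append("wildcard * may only be used at the end of a word")
        elif len(stem) < 2:
            problems.append("wildcard * must follow at least 2 characters")
    return problems

for term in ["agri*", "S35", "P-32", "1234", "*cat", "a*"]:
    print(term, "->", check_term(term) or "OK")
```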

