Introduction

Vector search is a new way to interact with your data. Where previous systems used API commands and query languages to retrieve data, Pulse uses natural language. This system is robust enough that we're on track to use it as the core of our user interface, which is why we refer to it as a "unified interface."

How to Use Vector Search

Basic Settings

  • Threshold – The minimum "relevance level" for search results (0 to 1, where 0.5 = 50%). Lower values return broader matches; higher values return only highly relevant matches.
  • Limit – The maximum number of search results you want retrieved (e.g., "10 funny cat pictures" or "500 FDA lawsuit filings").
  • Date Range – Filters results by the digest's created date.
    • Note: This filter uses the date the digest was ingested, not any date stored in metadata fields. To filter by a metadata date field, use a metadata search.
    • The date range is not applied to web searches. For a web search, include the date in the query text so the metasearch engine can see it.
    • In ACL rules, it is generally a good idea to remind workers to put the date in the query when using web searches, as they will sometimes try to enter it through the date fields by mistake.
  • Capture Results – By default, web searches keep only the top 3 results, and swarms and summaries return only the final top-level answer. Check this option to include the following in the final response from the system:
    • All web search results
    • All intermediate swarm steps
    • All results used for a summary
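
As a rough illustration of how these settings interact, the sketch below models them as a small Python object and applies the threshold and limit to a list of scored results. The names SearchSettings and apply_settings are hypothetical and are not part of Pulse; treating the threshold as inclusive is also an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class SearchSettings:
    """Hypothetical container for the basic settings (not Pulse's actual API)."""
    threshold: float = 0.5             # minimum relevance, from 0 to 1
    limit: int = 10                    # maximum number of results to return
    date_from: Optional[date] = None   # compared against the digest's created date
    date_to: Optional[date] = None
    capture_results: bool = False      # keep intermediate results in the response

    def __post_init__(self) -> None:
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        if self.limit < 1:
            raise ValueError("limit must be at least 1")
        if self.date_from and self.date_to and self.date_from > self.date_to:
            raise ValueError("date_from must not be after date_to")


def apply_settings(scored: List[Tuple[float, str]],
                   settings: SearchSettings) -> List[Tuple[float, str]]:
    """Keep (score, item) pairs at or above the threshold, best first, capped at limit."""
    kept = [pair for pair in scored if pair[0] >= settings.threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[: settings.limit]


settings = SearchSettings(threshold=0.5, limit=2)
top = apply_settings([(0.9, "a"), (0.4, "b"), (0.7, "c"), (0.6, "d")], settings)
# top == [(0.9, "a"), (0.7, "c")]
```

Lowering the threshold to 0.3 in this sketch would let "b" through the filter, but the limit of 2 would still cap the output at the two best matches.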

Natural Language Queries

  • Pulse's search model responds to natural language queries.
  • Example 1: "I'm looking for posts that read like they were written by a grumpy old man."
  • Example 2: "I'm trying to find parts for a 2001 Chevy Suburban."

Natural Language Operators

  • Use WEB at the start of a query to trigger a web search.
  • Use CONTEXT at the end of a query, followed by any information about your search that you want to pass to the summary or swarm workers.
  • Example: WEB canary in the coal mine CONTEXT what does this idiom mean?
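
To make the structure of these operators concrete, here is a minimal Python sketch that splits a query into its three parts. The function split_operators is hypothetical and only illustrates the layout described above; how Pulse parses queries internally is not documented here, and the sketch assumes the operators are uppercase and separated by spaces.

```python
from typing import Optional, Tuple


def split_operators(query: str) -> Tuple[bool, str, Optional[str]]:
    """Split a query into (is_web_search, search_text, context)."""
    text = query.strip()

    # WEB only counts as an operator at the very start of the query.
    is_web = text == "WEB" or text.startswith("WEB ")
    if is_web:
        text = text[3:].strip()

    # Everything after the first CONTEXT keyword is passed along as context.
    context = None
    before, separator, after = text.partition(" CONTEXT ")
    if separator:
        text, context = before.strip(), after.strip()

    return is_web, text, context


parts = split_operators("WEB canary in the coal mine CONTEXT what does this idiom mean?")
# parts == (True, "canary in the coal mine", "what does this idiom mean?")
```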

Metadata Search

  • Any sequence formatted as fieldname: fieldcontent and terminated by a newline is extracted as metadata.
  • Metadata operators can be combined with natural language to refine searches beyond traditional systems.
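
The extraction rule can be pictured with a short Python sketch. This is an illustration under assumptions, not Pulse's extraction code: the function extract_metadata is hypothetical, and it assumes a field name is a single word at the start of a line.

```python
import re
from typing import Dict

# One "fieldname: fieldcontent" pair per line; the field name is assumed to be a single word.
FIELD_LINE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*:\s*(.+?)\s*$")


def extract_metadata(text: str) -> Dict[str, str]:
    """Collect every line that looks like 'fieldname: fieldcontent' into a dict."""
    metadata = {}
    for line in text.splitlines():
        match = FIELD_LINE.match(line)
        if match:
            metadata[match.group(1)] = match.group(2)
    return metadata


digest = "price: 150\nsource: olympics\nA sentence without a field.\n"
fields = extract_metadata(digest)
# fields == {"price": "150", "source": "olympics"}
```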

Special Metadata Fields

  • "ObjectType" – Represents the type of object the digest contains (e.g., connection, search result, or standard digest).
  • "Content" – Represents the entire text of an entry. It can be used for generic queries like "content contains dog".

Example Metadata Queries

  • Example 1: "I'm looking for posts that read like they were written by a grumpy old man content contains dentures" – Limits results to posts containing the word "dentures."
  • Example 2: "I'm trying to find parts for a 2001 Chevy Suburban price < 200" – Returns digests with a price metadata field of less than $200.
  • Example 3: price<200 – A metadata-only query.
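
One way to picture how a mixed query breaks down is the Python sketch below, which separates the natural language part from the metadata conditions. The function split_query is hypothetical, and the pattern is a simplification: it assumes each metadata value is a single word, and it would also treat ordinary phrasing such as "a post that contains dogs" as a condition.

```python
import re
from typing import List, Tuple

# field, operator, single-word value (an assumption made to keep the sketch small)
CLAUSE = re.compile(r"(\w+)\s*(=|>|<|contains)\s*(\S+)")


def split_query(query: str) -> Tuple[str, List[Tuple[str, str, str]]]:
    """Return (natural_language_text, [(field, operator, value), ...])."""
    conditions = CLAUSE.findall(query)
    # Remove the matched clauses and tidy up the leftover whitespace.
    natural = " ".join(CLAUSE.sub(" ", query).split())
    return natural, conditions


mixed = split_query("I'm trying to find parts for a 2001 Chevy Suburban price < 200")
# mixed == ("I'm trying to find parts for a 2001 Chevy Suburban", [("price", "<", "200")])

metadata_only = split_query("price<200")
# metadata_only == ("", [("price", "<", "200")])
```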

Metadata Operators

Metadata operators allow you to filter structured fields directly, giving you precise control over results. Unlike natural language search, which infers meaning, metadata operators enforce strict filtering rules.

  • Equals (=) – Find posts with a specific value (e.g., source=olympics for gym and athletic competition fails).
  • Greater than (>) – Find entries above a certain value (e.g., probability > 80 for leads with a high chance of success).
  • Less than (<) – Find entries below a certain value (e.g., cost < 70000 for parts under $70,000).
  • Contains – Filter results based on a keyword in a field (e.g., description contains mantis for insect-related posts).
  • Tags – Use tags contains [tag] to refine results by tag (e.g., tags contains salesopportunity).
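
The strict filtering these operators perform can be sketched in Python as a check of one clause against one record's metadata. The function matches is hypothetical, and several details are assumptions rather than documented Pulse behavior: comparisons ignore case, a list field such as tags must contain the exact tag, and a missing field never matches.

```python
import re
from typing import Any, Dict

CLAUSE = re.compile(r"^\s*(\w+)\s*(=|>|<|contains)\s*(.+?)\s*$")


def matches(clause: str, record: Dict[str, Any]) -> bool:
    """Evaluate a single clause such as 'cost < 70000' against a metadata dict."""
    parsed = CLAUSE.match(clause)
    if parsed is None:
        raise ValueError(f"not a metadata clause: {clause!r}")
    field, operator, value = parsed.groups()

    if field not in record:
        return False  # assumption: a missing field fails the filter
    actual = record[field]

    if operator == "contains":
        if isinstance(actual, (list, tuple, set)):
            return value.lower() in {str(item).lower() for item in actual}
        return value.lower() in str(actual).lower()
    if operator == "=":
        return str(actual).lower() == value.lower()

    # > and < only make sense for numeric values.
    try:
        left, right = float(actual), float(value)
    except (TypeError, ValueError):
        return False
    return left > right if operator == ">" else left < right


record = {
    "source": "olympics",
    "probability": 85,
    "cost": "65000",
    "description": "A praying mantis on a leaf",
    "tags": ["insects", "macro"],
}
checks = [
    matches("source=olympics", record),
    matches("probability > 80", record),
    matches("cost < 70000", record),
    matches("description contains mantis", record),
    matches("tags contains insects", record),
]
# every entry in checks is True
```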

Operators AND by Default

  • Adding multiple operators combines them as required conditions.
  • Example: tags contains salesopportunity probability > 40 description contains texas-based – Filters results to sales opportunities with a probability over 40% that mention "Texas-based."
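
The AND behavior can be illustrated with a short Python sketch in which a record passes only if every condition in the query holds. The functions check and matches_all are hypothetical, and the sketch assumes single-word values and case-insensitive comparison; it is not Pulse's implementation.

```python
import re
from typing import Any, Dict

CONDITION = re.compile(r"(\w+)\s*(=|>|<|contains)\s*(\S+)")


def check(record: Dict[str, Any], field: str, operator: str, value: str) -> bool:
    """Test one (field, operator, value) condition against a record."""
    if field not in record:
        return False
    actual = record[field]
    if operator == "contains":
        if isinstance(actual, (list, tuple, set)):
            return value.lower() in {str(item).lower() for item in actual}
        return value.lower() in str(actual).lower()
    if operator == "=":
        return str(actual).lower() == value.lower()
    try:
        left, right = float(actual), float(value)
    except (TypeError, ValueError):
        return False
    return left > right if operator == ">" else left < right


def matches_all(query: str, record: Dict[str, Any]) -> bool:
    """Every condition found in the query must hold (implicit AND)."""
    conditions = CONDITION.findall(query)
    return all(check(record, f, op, v) for f, op, v in conditions)


query = "tags contains salesopportunity probability > 40 description contains texas-based"
lead = {
    "tags": ["salesopportunity", "q3"],
    "probability": 55,
    "description": "Texas-based reseller of truck parts",
}
result = matches_all(query, lead)                               # True
weak_result = matches_all(query, {**lead, "probability": 30})   # False: one condition fails
```

Because the conditions are combined with all(), a single failing condition is enough to exclude a record, which mirrors the "required conditions" behavior described above.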