Lots of improvements to the ingest tool were launched this week in anticipation of the new Explore tool. I shared some early screenshots on X if you want to see them - the tool is still pretty alpha-feeling and needs more polish before I'm willing to share it in a blog post. Here are the updates:

  • URL Ingestion
    • The ingest tool now automatically browses and summarizes web pages. As with all ingest tool results, the summaries are taggable and saved to your node for future reading, and they can be easily added via the analyze tool:
New tab in the ingest area in the analyze tool
    • Several backend/prompt improvements were made over the course of the week after this feature launched on 5/15. Web page summaries (and, as a result of what we tested, PDF summaries too) are now more detailed and capture key information from charts/data tables:
Example data table captured from a PDF via ingest
  • Improved feedback on URL ingestion from Qwen
    • At the end of the summary, Qwen will now tell you if there was a problem with the web page you tried to ingest.
  • URL ingestion is now generally much faster than it was when we launched. My estimate of how many simultaneous portions of a web page Qwen could handle was too conservative.
    • Qwen now handles large sections of the web page at once, which requires fewer API calls and keeps responses somewhere between 10 and 40 seconds on average - up to 60 or 80 seconds for a long web page.
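
To show why larger sections mean fewer API calls, here is a minimal Python sketch. It is not the actual ingest code: the function name `split_into_sections` and the character budgets are hypothetical, and it assumes one model call per section.

```python
def split_into_sections(text: str, max_chars: int) -> list[str]:
    """Greedily pack paragraphs into sections of at most max_chars characters.

    Hypothetical helper for illustration; assumes one model call per section.
    """
    sections: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Hard-split any single paragraph that is longer than the budget.
        while len(para) > max_chars:
            if current:
                sections.append(current)
                current = ""
            sections.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The paragraph does not fit, so close this section and start a new one.
            sections.append(current)
            current = para
    if current:
        sections.append(current)
    return sections


if __name__ == "__main__":
    # A made-up page: 20 paragraphs of 100 characters each.
    page = "\n\n".join(["x" * 100] * 20)
    for budget in (250, 1000):
        calls = len(split_into_sections(page, budget))
        print(f"budget={budget} chars -> {calls} calls")
```

The trade-off is that each call carries more text, so the right budget depends on how much the model can handle at once, which is the estimate that turned out to be too conservative at launch.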

Backend Improvements:

Testing the explore tool also yielded some improved data preprocessing, truncation, and memory management practices for the embedding backend used for vector search. Rather than waiting to push these with the explore tool or probes update, we just released them quietly and let everyone benefit.
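
For readers unfamiliar with what this kind of work looks like, here is a generic Python sketch of the three ideas: cleaning text, truncating it, and batching it to bound memory use. It is an illustration only, not Pulse's backend; the names `preprocess` and `batched`, the word-based truncation, and the default limit are all assumptions.

```python
import re
from typing import Iterator


def preprocess(text: str, max_words: int = 256) -> str:
    """Collapse whitespace, then truncate to at most max_words words.

    Word count is a rough stand-in for a real tokenizer's token limit.
    """
    cleaned = re.sub(r"\s+", " ", text).strip()
    words = cleaned.split(" ")
    return " ".join(words[:max_words])


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield fixed-size batches so only one batch is embedded at a time."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    docs = ["  first   document\n\twith messy   spacing ", "second document"]
    prepared = [preprocess(d, max_words=3) for d in docs]
    for batch in batched(prepared, batch_size=1):
        # A real backend would pass each batch to its embedding model here.
        print(batch)
```

Truncating before embedding keeps each input within the model's limit, and batching caps how many inputs are held in memory at once.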

Wrapping things up

It all comes down to strong fundamentals.

Explore uses a combination of the ingest and vector search APIs to make any model useful for deep, effective, fast research on web results or local data. All the results form a "knowledge graph" you can explore while the research is happening. All of the results, thinking/reasoning steps, and final outputs are saved as tagged digests that you can easily pull into the analyze tool or share with quantum search rules.

In a sentence:

If you have Pulse's explore tool, you are your own personal intelligence agency.

I think the explore tool has a chance at being what makes Pulse a less obscure project. I've mostly been focusing my effort on making "strings", a "body", and "frets" for an "instrument." Those fundamentals take a long time to get right, but now we're at a stage where we can finally 'pluck' all of the strings.