I have experimented with both pgvector (through supabase) and elasticsearch. Bot...

osigurdson · on April 13, 2023

If already using both, what would you recommend?

omneity · on April 13, 2023

It really boils down to which stack you are familiar with and which one integrates better with the rest of your infrastructure and use case.

If you're already ingesting a lot of data into ES and you want to vectorize it, ES has good support for cosine similarity in its indexes.

If you store data in a PG database and you want to make it searchable by similarity, then pgvector is a good choice. It's especially powerful coupled with the ease of use of the supabase platform. You can make a document-based chatbot very quickly.
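To make the pgvector side concrete, here is a minimal sketch of how a similarity lookup could be built from application code. The table name `documents` and the columns `id`, `content`, and `embedding` are hypothetical, and the `%s` placeholders assume a psycopg-style driver. The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives a cosine similarity.

```python
from typing import Sequence, Tuple


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format a sequence of numbers as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_similarity_query(
    query_embedding: Sequence[float], limit: int = 5
) -> Tuple[str, tuple]:
    """Return (sql, params) for a nearest-neighbour search by cosine distance.

    `documents`, `content`, and `embedding` are hypothetical names.
    `<=>` is pgvector's cosine distance operator: smaller means closer.
    """
    sql = (
        "SELECT id, content, 1 - (embedding <=> %s::vector) AS similarity "
        "FROM documents "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    literal = to_vector_literal(query_embedding)
    # The literal appears twice: once for the score, once for the ordering.
    return sql, (literal, literal, limit)
```

The returned pair can be handed to a driver call such as `cursor.execute(sql, params)`; the query embedding itself still has to come from your application code.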

In both cases the vector is more of a data type, and a lot of your logic will still reside in your application layer.

In my case I was already ingesting data into Elastic, so I just added a dense_vector property to my index and a vectorization step in my external code that calls the OpenAI API and saves the result into that dense_vector field.
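A rough sketch of what that setup could look like, not necessarily how it was done here. The field names `text` and `embedding` are hypothetical, and `embed` is a stand-in for whatever function calls the embedding API and returns a list of floats. The maximum allowed `dims` depends on the Elasticsearch version, so check it against the embedding size you use.

```python
from typing import Any, Callable, Dict, List


def build_index_mapping(dims: int) -> Dict[str, Any]:
    """Index mapping with a dense_vector field configured for cosine similarity."""
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},  # hypothetical field name
                "embedding": {  # hypothetical field name
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def vectorize_document(
    text: str, embed: Callable[[str], List[float]], dims: int
) -> Dict[str, Any]:
    """Run the vectorization step and return the document body to index.

    `embed` is an assumed callable wrapping the embedding API call.
    """
    vector = embed(text)
    if len(vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(vector)}")
    return {"text": text, "embedding": vector}
```

The mapping would be sent once when creating the index, and `vectorize_document` would run on each document before it is indexed.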

In the future, I'm planning to build an AI-powered web app, and my stack of choice will be Supabase + pgvector because it's a better option as a public app backend.

itsuka · on April 14, 2023

> It's especially powerful coupled with the ease of use of the supabase platform

Just as you said, I followed their Clippy tutorial (which uses pgvector and OpenAI embeddings) and was able to spin up a document-based chatbot quickly. In my specific use case, I stored portions of my knowledge base as embeddings in a normal Postgres table. The next step would be to implement Row-Level Security for extra protection (in case I screwed up somewhere). Thankfully, all my data and auth info are integrated with Supabase, so it's straightforward to do.
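For reference, a minimal sketch of what that Row-Level Security step might involve, expressed as the SQL statements an application or migration script would run. The table name `documents`, the column `owner_id`, and the policy name `owner_can_read` are all hypothetical; `auth.uid()` is the Supabase helper that returns the current user's ID.

```python
from typing import List


def build_rls_statements(
    table: str = "documents", owner_column: str = "owner_id"
) -> List[str]:
    """Return SQL that enables RLS and lets users read only their own rows.

    Identifiers are interpolated directly, so pass only trusted,
    hard-coded names here, never user input.
    """
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        (
            f"CREATE POLICY owner_can_read ON {table} "
            f"FOR SELECT USING (auth.uid() = {owner_column});"
        ),
    ]
```

Once RLS is enabled, rows are hidden from roles subject to it unless a policy allows access, which is what makes it useful as a safety net.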