Finding similar images with Rust and a Vector DB

October 02, 2025
laurens@ente.io

At Ente Photos, everything is end-to-end encrypted. This means a lot of the smart computations that would typically happen on servers instead run on your device. With limited compute available, we're constantly finding ways to do things more efficiently.

Recently, we decided to bring Rust into our mobile apps to unlock performance gains and pave the way for exciting new features. Our first step? Integrating a vector database in Rust, which enabled us to add a "similar images" detection feature that makes it easier for users to clean up their photo libraries.

We already offered deduplication of identical files using file hashes, but now we can easily find visually similar images too.

Example of same and closely similar images

Why Rust?

This marks the first time we've added Rust code to our mobile photos app. It's part of a larger push to (re)write performance-critical code in Rust. We write Rust code and automatically generate bindings that our Dart code can use through Flutter Rust Bridge.

The benefits are twofold:

  1. Rust gives us fine-grained control over memory and performance, and it compiles to native code that runs blazingly fast on mobile devices.
  2. The Rust code becomes the single source of truth for all platforms, as we can use it for the web and desktop apps too.

Building on existing ML

As explained in our ML whitepaper, we already generate image embeddings locally for each photo. These embeddings are essentially numerical representations of the visual content in your photos - think of them as a photo's "visual fingerprint."

In the app, we compare these image embeddings with a text embedding to power our semantic search. But you can actually compare image embeddings with each other too.

Computing the cosine similarity between two image embeddings tells us how visually similar the two images are. The closer to 1, the more similar they are.
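As a minimal sketch of that computation (not the code used in the app), cosine similarity can be written in plain Rust as follows. The function name `cosine_similarity` is our own for illustration. If the embeddings are already normalized to unit length, the result reduces to the plain dot product.

```rust
/// Cosine similarity between two embeddings of equal length.
/// Returns `None` if the lengths differ, the slices are empty,
/// or either vector has zero length (norm), where the value is undefined.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let mut dot = 0.0f32;
    let mut norm_a = 0.0f32;
    let mut norm_b = 0.0f32;
    for (x, y) in a.iter().zip(b.iter()) {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a.sqrt() * norm_b.sqrt()))
}
```

Two vectors pointing in the same direction score 1, orthogonal vectors score 0, and opposite vectors score -1.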

The problem is that comparing every image with every other image is computationally expensive. For a library of 50,000 photos, that means roughly 2.5 billion comparisons (50,000 × 50,000), or about half that if each pair is compared only once. Each comparison involves calculating a dot product of two embeddings with 512 floating-point values each.
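To make the arithmetic concrete, here is a small sketch with helper names of our own choosing. It only counts work; it does not perform any comparisons.

```rust
/// Comparisons when every image is compared with every image (n * n).
pub fn all_vs_all(n: u64) -> u64 {
    n * n
}

/// Comparisons when each unordered pair is compared only once: n * (n - 1) / 2.
pub fn unique_pairs(n: u64) -> u64 {
    n * n.saturating_sub(1) / 2
}

/// Multiply-add operations for a given number of dot products
/// over embeddings with `dims` values each.
pub fn multiply_adds(comparisons: u64, dims: u64) -> u64 {
    comparisons * dims
}
```

For 50,000 photos, `all_vs_all` gives 2,500,000,000 comparisons and `unique_pairs` gives 1,249,975,000. With 512-dimensional embeddings, the all-versus-all case amounts to 1.28 trillion multiply-adds.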

Enter the Vector DB

This is where a vector database comes in. With our new vector database, we can finally compare all these embeddings efficiently.

The core of our implementation involves using the Rust bindings of USearch, an open-source, high-performance vector database. USearch uses the HNSW (Hierarchical Navigable Small World) index, which makes similarity searches incredibly fast even with large datasets.

Instead of comparing every image with every other image (which would be O(n²) complexity), HNSW builds a layered graph structure that lets us find the approximate nearest neighbors of an image in roughly logarithmic time. Searching for every image in the library then takes on the order of O(n log n) work. It turns an impossibly slow operation into something that runs smoothly on your phone.
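To give a feel for why a graph helps, here is a toy sketch of greedy search on a single-layer neighbor graph. This is not USearch's implementation and not full HNSW, which uses several layers and keeps a list of candidates instead of a single current node. The `Graph` type, the `greedy_search` function, and the use of squared Euclidean distance are all assumptions made for illustration.

```rust
/// A toy neighbor graph: `neighbors[i]` lists the nodes linked to node `i`.
pub struct Graph {
    pub vectors: Vec<Vec<f32>>,
    pub neighbors: Vec<Vec<usize>>,
}

/// Squared Euclidean distance, used here only to keep the example simple.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Walks the graph from `entry`, always moving to the neighbor closest
/// to `query`, and stops when no neighbor is closer than the current node.
/// Returns the node reached and the number of distance evaluations made.
pub fn greedy_search(g: &Graph, entry: usize, query: &[f32]) -> (usize, usize) {
    let mut current = entry;
    let mut best = sq_dist(&g.vectors[current], query);
    let mut evals = 1;
    loop {
        let mut next = current;
        for &nb in &g.neighbors[current] {
            let d = sq_dist(&g.vectors[nb], query);
            evals += 1;
            if d < best {
                best = d;
                next = nb;
            }
        }
        if next == current {
            // Local minimum: no neighbor improves on the current node.
            return (current, evals);
        }
        current = next;
    }
}
```

With a few long-range links in the graph, the walk can skip over most nodes, so the number of distance evaluations stays well below the number of stored vectors. The result is approximate: greedy search can stop at a local minimum, which is one reason real HNSW implementations track multiple candidates.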

What would normally take around 30 minutes for a 50,000-photo library now takes less than 30 seconds. And the bigger your library, the more significant the performance improvement becomes.

Showing similar images in app

When surfacing the similar images in the app, we categorize them into three groups:

  • Same: Nearly identical photos (think burst shots or accidental duplicates)
  • Close: Very similar photos with minor differences (different angles of the same scene)
  • Related: Photos that share visual elements but are distinctly different

Each category uses a different cosine distance threshold (cosine distance being 1 minus the cosine similarity, so lower means more similar) to determine how similar photos need to be to qualify.
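A minimal sketch of such a categorization is shown below. The `Category` and `Thresholds` types and the `categorize` function are hypothetical names, and the actual threshold values used in the app are not stated here, so the caller supplies them.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Category {
    Same,
    Close,
    Related,
}

/// Maximum cosine distance allowed for each category.
/// Expected ordering: same <= close <= related.
pub struct Thresholds {
    pub same: f32,
    pub close: f32,
    pub related: f32,
}

/// Maps a cosine distance to the tightest category it qualifies for,
/// or `None` if the two images are not similar enough for any category.
pub fn categorize(distance: f32, t: &Thresholds) -> Option<Category> {
    if distance <= t.same {
        Some(Category::Same)
    } else if distance <= t.close {
        Some(Category::Close)
    } else if distance <= t.related {
        Some(Category::Related)
    } else {
        None
    }
}
```

For example, with made-up placeholder thresholds of 0.02, 0.05, and 0.10, a distance of 0.03 would fall into the Close category.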

To make the feature even more useful, we:

  • Integrate with our existing face recognition system to ensure similar-looking images actually contain the same people
  • Prioritize keeping higher resolution versions when pre-selecting images for deletion
  • Provide both bulk deletion options for confident users and individual group deletes for those who prefer manual control
  • Add symlinks to the kept photos in albums, so you don't lose track of them
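The resolution-based pre-selection could be sketched roughly as follows. The `Photo` struct and `preselect_for_deletion` function are hypothetical, and using pixel count as the only criterion is an assumption; the app may weigh other factors.

```rust
pub struct Photo {
    pub id: u64,
    pub width: u32,
    pub height: u32,
}

impl Photo {
    /// Total pixel count, computed in u64 to avoid overflow.
    fn pixels(&self) -> u64 {
        self.width as u64 * self.height as u64
    }
}

/// For one group of similar photos, keeps the photo with the most pixels
/// (the first one wins on ties) and pre-selects the rest for deletion.
/// Returns `None` for an empty group.
pub fn preselect_for_deletion(group: &[Photo]) -> Option<(u64, Vec<u64>)> {
    let mut keeper = group.first()?;
    for p in &group[1..] {
        if p.pixels() > keeper.pixels() {
            keeper = p;
        }
    }
    let to_delete = group
        .iter()
        .filter(|p| p.id != keeper.id)
        .map(|p| p.id)
        .collect();
    Some((keeper.id, to_delete))
}
```

The result is only a suggestion: the user can still change the selection before anything is deleted.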

If you're tired of scrolling through dozens of nearly identical photos from that perfect sunset you tried to capture, this feature is for you. The similar images view lets you quickly identify and remove duplicates, keeping your library clean and your storage optimized.

The interface is simple: go to the similar images view, choose your category, select what to delete, and you're done. For power users, the bulk delete option can clean up thousands of duplicates in seconds.

We're also excited to introduce more library cleanup functionality (library culling) in the future. Stay tuned!

Looking forward

This is just the beginning of our Rust journey. The performance gains we've seen with the vector database have opened doors to features previously unavailable on mobile devices. Stay tuned as we continue to push the boundaries of what's possible with local, private computation.

Want to try it out? The similar images feature is available in the Ente Photos mobile app. And yes, like everything else we build, the code is open source.


At Ente, we believe that privacy and convenience aren't mutually exclusive. By bringing high-performance computing to your device, we're giving you both.