Big Datasets Are a Mess
We believe we can help you achieve better results. It only takes three lines of code to get started.
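As a rough sketch of what those three lines can look like with the V1.0 API: import the package, create a fastdup object pointed at your images, and run it. The wrapper name `run_fastdup`, the default `work_dir` value, and the example folder path are assumptions for illustration. Check the fastdup documentation for the exact options your installed version supports.

```python
def run_fastdup(input_dir, work_dir="fastdup_work_dir"):
    """Hypothetical wrapper around the three getting-started lines.

    input_dir: folder containing the images to analyze.
    work_dir:  folder where fastdup writes its output (assumed default name).
    """
    # Imported lazily so this module can be loaded even where fastdup
    # is not installed; the import is the first of the three lines.
    import fastdup

    fd = fastdup.create(work_dir=work_dir, input_dir=input_dir)  # line two
    fd.run()                                                     # line three
    return fd
```

Calling `run_fastdup("IMAGE_FOLDER/")` would then analyze the images in that folder, assuming fastdup is installed (`pip install fastdup`).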
Read our blog post, where we expose quality issues present in many widely used academic datasets.
As a preview, the left clip showcases our LAION 400M image dataset analysis. This was done using a single CPU instance in just a few hours.
Introducing VL Profiler - A faster and easier way to diagnose and visualize dataset issues
The team behind fastdup also recently launched VL Profiler, a no-code cloud-based platform that lets you leverage fastdup in the browser.
VL Profiler lets you find duplicates/near-duplicates, outliers, mislabels and non-useful images.
Use VL Profiler for free to analyze issues on your dataset with up to 1,000,000 images.
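To make the simplest of these issue types concrete, here is a minimal sketch that finds exact duplicates by hashing file contents with the Python standard library. It is a toy illustration only: it does not reflect how fastdup or VL Profiler work internally, and it catches only byte-identical files, not near-duplicates, outliers, or mislabels. The function name `find_exact_duplicates` is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(image_dir):
    """Group files under image_dir by the SHA-256 of their bytes.

    Returns a list of groups, each a list of paths whose contents are
    byte-for-byte identical. Hypothetical helper, not part of fastdup.
    """
    groups = defaultdict(list)
    # Sort paths so the order inside each group is deterministic.
    for path in sorted(Path(image_dir).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(str(path))
    # Keep only hashes shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

A resized or re-encoded copy of an image produces a different hash, so it would be missed here. That gap is why near-duplicate detection needs a dedicated tool.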
Interactive Exploration
Not convinced yet? Interact with a collection of datasets such as ImageNet-21K, COCO, and DeepFashion here. No sign-up needed.
[New] Introducing fastdup V1.0 🎉
- Clean & simple API: The new API is simpler to use and fully backward compatible with the older API
- Native Windows support: Windows now has first-class, full feature support in fastdup
- Amazing documentation: New and improved fastdup documentation
- Sleek galleries: New and improved galleries to get a better view of your data
- Extensive labels support: Improved support for handling image and bounding box labels
- Support for additional image formats: Apple’s HEIC and HEIF, 16-bit grayscale TIFF
- Support for Python 3.10
- Fully backward compatible with the previous API
Register now to gain early access to fastdup enterprise:
Gain access to our hosted cloud-based visual data store, which offers advanced visualization and quality metrics for labels and metadata, and enables you to explore, slice, share, and export your data effortlessly.
Find insights quickly, send data for annotation, and assess the quality of the results. Export reports to PDF and HTML to share on Slack or with stakeholders.