To visualize the results or individual parts of the analysis, fastdup generates galleries as HTML files. They are saved to the galleries subdirectory of the work directory and are presented interactively when using Jupyter notebooks.
Starting from V1.0, galleries have a new layer of abstraction that automatically adds bounding boxes and labels to images where available.
For more detail, see the Galleries API Reference.
Galleries share a few methods and arguments used for visualizing labels and bounding boxes, and for setting general attributes:
slice: Visualize a subset of the data with the given label, e.g.,
sort_by: Sort images by a property. Supported values are:
comp_size - Number of images in the component.
distance - The average distance between cluster members. Clusters whose images are most similar are presented first.
area - Average image or bounding box size, from the largest to the smallest.
label_col: Column to use as labels, common options are
num_images: (default=20) The number of images to visualize.
max_width: (default=None) Pixel width of displayed gallery. Useful values are often in the 800-1200 range.
lazy_load: (default=False) When False, images are embedded into the gallery HTML files. Otherwise, images are loaded by the browser using their relative paths. Using lazy_load makes galleries lighter and faster to generate, but less portable and shareable. Without lazy loading, on the other hand, galleries can become very large files.
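The trade-off behind lazy_load can be illustrated with a small standalone sketch. This is not fastdup's implementation. The helper name img_tag and the directory layout in the usage note are assumptions made for illustration only. It shows the general difference between embedding an image in the HTML as a base64 data URI and referencing it by a path relative to the gallery directory.

```python
import base64
import mimetypes
import os
from pathlib import Path


def img_tag(image_path, gallery_dir, lazy_load=False):
    """Build an <img> tag for one gallery entry (illustrative helper only).

    lazy_load=True  -> reference the file by a path relative to gallery_dir;
                       the HTML stays small but breaks if the files move.
    lazy_load=False -> embed the file bytes as a base64 data URI;
                       the HTML is portable but grows with every image.
    """
    image_path = Path(image_path)
    if lazy_load:
        # Relative path from the gallery directory, with forward slashes for HTML.
        rel = Path(os.path.relpath(image_path, gallery_dir)).as_posix()
        return f'<img src="{rel}" loading="lazy">'

    # Guess the MIME type from the file extension; fall back to a generic type.
    mime = mimetypes.guess_type(image_path.name)[0] or "application/octet-stream"
    payload = base64.b64encode(image_path.read_bytes()).decode("ascii")
    return f'<img src="data:{mime};base64,{payload}">'
```

With a hypothetical layout of images/a.png next to work_dir/galleries, the lazy variant yields a tag pointing at ../../images/a.png, while the embedded variant carries the whole file inside the HTML. That is why embedded galleries are easy to share but can become very large.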
In most cases, visualization is as simple as calling fd.vis.component_gallery(). The rest of the parameters are optional and can be adjusted later.
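As a minimal sketch of how the shared arguments listed above might be passed, the wrapper below forwards a few of them to fd.vis.component_gallery(). The wrapper name show_components and the chosen values are assumptions for illustration. The fd object is assumed to be a fastdup instance on which the analysis has already been run.

```python
def show_components(fd, **overrides):
    """Call fd.vis.component_gallery() with a few of the shared arguments.

    `fd` is assumed to be a fastdup object whose analysis has already run.
    The values below are illustrative starting points; only num_images=20
    and lazy_load=False match the documented defaults.
    """
    options = {
        "num_images": 20,        # documented default
        "max_width": 1000,       # within the 800-1200 range mentioned above
        "lazy_load": False,      # embed images in the HTML
        "sort_by": "comp_size",  # one of: comp_size, distance, area
    }
    options.update(overrides)    # caller-supplied values win
    return fd.vis.component_gallery(**options)


# Hypothetical usage (paths are placeholders; requires fastdup and image data):
#   import fastdup
#   fd = fastdup.create(work_dir="work_dir", input_dir="images/")
#   fd.run()
#   show_components(fd, sort_by="distance", lazy_load=True)
```

Keeping the options in one dictionary makes it easy to start from the bare call and change one argument at a time once the first gallery has been inspected.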
The label_col argument controls the labels attached to each visualized image. By default, labels are taken from the label column of the annotations dataframe provided during the fastdup.run() call. When labels are not provided, or when a different column should be used, set label_col to the name of the required column.
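The choice between the default label column and a custom one can be sketched as a small guard that runs before the gallery call. The helper gallery_label_args is hypothetical and not part of fastdup. It only checks that a requested column exists in the annotations and builds the keyword argument to pass on.

```python
def gallery_label_args(annotation_columns, label_col=None):
    """Return keyword arguments selecting the label column for a gallery call.

    annotation_columns: column names of the annotations dataframe.
    label_col: None to rely on the gallery default (the `label` column),
               or the name of another column to display as labels.
    """
    if label_col is None:
        # Nothing to pass; the gallery falls back to its default behavior.
        return {}
    if label_col not in annotation_columns:
        raise KeyError(f"column {label_col!r} not found in annotations")
    return {"label_col": label_col}


# Hypothetical usage, with `df` as the annotations dataframe and "split"
# as an example column name:
#   fd.vis.component_gallery(**gallery_label_args(df.columns, "split"))
```

Failing early on a missing column gives a clearer error than discovering an unlabeled gallery after it has been generated.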