Interactive Table

This notebook demonstrates how to use interactive tables for debugging.

In [1]:

from active_vision import ActiveLearner

al = ActiveLearner(name="cycle-1")
al.load_model(model="resnet18", pretrained=True)

2025-02-07 00:10:01.724 | INFO     | active_vision.core:_detect_optimal_device:87 - CUDA GPU detected - will load model on GPU
2025-02-07 00:10:01.724 | INFO     | active_vision.core:load_model:73 - Loading a pretrained timm model `resnet18` on `cuda`

In [2]:

import pandas as pd

initial_samples = pd.read_parquet("initial_samples.parquet")

al.load_dataset(initial_samples, filepath_col="filepath", label_col="label")

2025-02-07 00:10:01.745 | INFO     | active_vision.core:load_dataset:125 - Loading dataset from `filepath` and `label` columns
2025-02-07 00:10:01.926 | INFO     | active_vision.core:load_dataset:159 - Creating new learner
2025-02-07 00:10:02.776 | INFO     | active_vision.core:_optimize_learner:100 - Enabled mixed precision training
2025-02-07 00:10:02.777 | INFO     | active_vision.core:_finalize_setup:109 - Training set size: 80
2025-02-07 00:10:02.777 | INFO     | active_vision.core:_finalize_setup:110 - Validation set size: 20
2025-02-07 00:10:02.777 | INFO     | active_vision.core:_finalize_setup:111 - Done. Ready to train.

In [3]:

al.show_batch()

[Image: batch of samples displayed by al.show_batch()]

In [4]:

al.train(epochs=5, lr=5e-3)

2025-02-07 00:10:03.324 | INFO     | active_vision.core:train:213 - Training head for 1 epochs
2025-02-07 00:10:03.326 | INFO     | active_vision.core:train:214 - Training model end-to-end for 5 epochs
2025-02-07 00:10:03.326 | INFO     | active_vision.core:train:215 - Learning rate: 0.005 with one-cycle learning rate scheduler

epoch	train_loss	valid_loss	accuracy	time
0	3.224167	0.949572	0.650000	00:01

epoch	train_loss	valid_loss	accuracy	time
0	0.772945	0.642391	0.800000	00:01
1	0.567139	0.425224	0.900000	00:01
2	0.422784	0.411804	0.900000	00:01
3	0.342287	0.462382	0.900000	00:01
4	0.282525	0.486187	0.900000	00:01

In [5]:

evaluation_df = pd.read_parquet("evaluation_samples.parquet")
evaluation_df

Out[5]:

	filepath	label
0	data/imagenette/2/00000.jpg	cassette player
1	data/imagenette/2/00001.jpg	cassette player
2	data/imagenette/2/00002.jpg	cassette player
3	data/imagenette/2/00003.jpg	cassette player
4	data/imagenette/2/00004.jpg	cassette player
...	...	...
3920	data/imagenette/5/03920.jpg	French horn
3921	data/imagenette/5/03921.jpg	French horn
3922	data/imagenette/5/03922.jpg	French horn
3923	data/imagenette/5/03923.jpg	French horn
3924	data/imagenette/5/03924.jpg	French horn

3925 rows × 2 columns

In the following cell, we evaluate the model on the first 100 samples in the evaluation set.

To view the results in an interactive table, set the interactive parameter to True. The table shows the image for each row, and you can sort it by clicking on a column header.

In [6]:


eval_df = al.evaluate(
    evaluation_df.head(100),
    filepath_col="filepath",
    label_col="label",
    interactive=True,
)

2025-02-07 00:10:12.672 | INFO     | active_vision.core:evaluate:317 - Accuracy: 95.00%
2025-02-07 00:10:12.673 | INFO     | active_vision.core:evaluate:320 - Rendering interactive table

image	filepath	label	pred_label	pred_conf	loss
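
The sort offered by the interactive table can also be reproduced in plain pandas, which is handy when the table cannot be rendered. The sketch below is only an illustration: it builds a small hand-made DataFrame with the column names from the table header above (filepath, label, pred_label, pred_conf, loss), the values are invented, and it assumes that al.evaluate returns a DataFrame of this shape.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by al.evaluate.
# Column names follow the table header above; the values are invented.
toy_eval_df = pd.DataFrame(
    {
        "filepath": ["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
        "label": ["cassette player", "cassette player", "French horn", "French horn"],
        "pred_label": ["cassette player", "French horn", "French horn", "cassette player"],
        "pred_conf": [0.98, 0.61, 0.87, 0.55],
        "loss": [0.02, 1.35, 0.14, 1.90],
    }
)

# Keep only the misclassified rows, then put the highest loss first.
mistakes = toy_eval_df[toy_eval_df["label"] != toy_eval_df["pred_label"]]
mistakes = mistakes.sort_values("loss", ascending=False).reset_index(drop=True)

print(mistakes[["filepath", "label", "pred_label", "loss"]])
```

Sorting by loss in descending order surfaces the samples the model gets most wrong, which are usually the first ones worth inspecting.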

In [7]:

df = pd.read_parquet("unlabeled_samples.parquet")
filepaths = df["filepath"].tolist()
len(filepaths)

Out[7]:

The predict method accepts the same interactive parameter.

In [8]:

pred_df = al.predict(filepaths[:1000], batch_size=128, interactive=True)

2025-02-07 00:10:12.991 | INFO     | active_vision.core:predict:224 - Running inference on 1000 samples

2025-02-07 00:10:14.598 | INFO     | active_vision.core:predict:269 - Rendering interactive table

image	filepath	pred_label	pred_conf	logits	embeddings

The sampling strategies also accept the interactive parameter.

In [9]:


samples = al.sample_combination(
    pred_df,
    num_samples=50,
    combination={
        "least-confidence": 0.4,
        "ratio-of-confidence": 0.2,
        "entropy": 0.2,
        "model-based-outlier": 0.1,
        "random": 0.1,
    },
    interactive=True,
)

2025-02-07 00:10:17.374 | INFO     | active_vision.core:sample_combination:613 - Using combination sampling to get 50 samples
2025-02-07 00:10:17.376 | INFO     | active_vision.core:sample_uncertain:361 - Using least confidence strategy to get top 20 samples
2025-02-07 00:10:17.382 | INFO     | active_vision.core:sample_uncertain:428 - Rendering interactive table

	image	filepath	strategy	score	pred_label	pred_conf

2025-02-07 00:10:17.466 | INFO     | active_vision.core:sample_uncertain:384 - Using ratio of confidence strategy to get top 10 samples
2025-02-07 00:10:17.471 | INFO     | active_vision.core:sample_uncertain:428 - Rendering interactive table

	image	filepath	strategy	score	pred_label	pred_conf

2025-02-07 00:10:17.505 | INFO     | active_vision.core:sample_uncertain:398 - Using entropy strategy to get top 10 samples
/home/dnth/Desktop/active-vision/src/active_vision/core.py:401: RuntimeWarning: divide by zero encountered in log2
  df.loc[:, "score"] = df["probs"].apply(lambda x: -np.sum(x * np.log2(x)))
/home/dnth/Desktop/active-vision/src/active_vision/core.py:401: RuntimeWarning: invalid value encountered in multiply
  df.loc[:, "score"] = df["probs"].apply(lambda x: -np.sum(x * np.log2(x)))
2025-02-07 00:10:17.511 | INFO     | active_vision.core:sample_uncertain:428 - Rendering interactive table

	image	filepath	strategy	score	pred_label	pred_conf

2025-02-07 00:10:17.566 | INFO     | active_vision.core:sample_diverse:465 - Using model-based outlier strategy to get top 5 samples
2025-02-07 00:10:17.567 | INFO     | active_vision.core:predict:224 - Running inference on 20 samples

2025-02-07 00:10:18.303 | INFO     | active_vision.core:sample_diverse:527 - Rendering interactive table

	image	filepath	strategy	score	pred_label	pred_conf

2025-02-07 00:10:18.322 | INFO     | active_vision.core:sample_random:557 - Sampling 5 random samples
2025-02-07 00:10:18.323 | INFO     | active_vision.core:sample_random:567 - Rendering interactive table

	image	filepath	strategy	score	pred_label	pred_conf

2025-02-07 00:10:18.342 | INFO     | active_vision.core:sample_combination:662 - Rendering interactive table

image	filepath	strategy	score	pred_label	pred_conf
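
The per-strategy counts in the log above (20, 10, 10, 5 and 5) match the combination weights multiplied by num_samples. The sketch below reproduces that arithmetic. It is an assumption about how the split works, inferred from the log lines only; the library's actual rounding rule may differ for weights that do not divide evenly.

```python
num_samples = 50
combination = {
    "least-confidence": 0.4,
    "ratio-of-confidence": 0.2,
    "entropy": 0.2,
    "model-based-outlier": 0.1,
    "random": 0.1,
}

# Assumption: each strategy receives round(num_samples * weight) samples.
allocation = {name: round(num_samples * weight) for name, weight in combination.items()}

print(allocation)
# {'least-confidence': 20, 'ratio-of-confidence': 10, 'entropy': 10, 'model-based-outlier': 5, 'random': 5}
print(sum(allocation.values()))  # 50
```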

In [12]:

sample_uncertainty = al.sample_uncertain(
    pred_df, num_samples=50, strategy="entropy", interactive=True
)

2025-02-07 00:10:58.931 | INFO     | active_vision.core:sample_uncertain:398 - Using entropy strategy to get top 50 samples
/home/dnth/Desktop/active-vision/src/active_vision/core.py:401: RuntimeWarning: divide by zero encountered in log2
  df.loc[:, "score"] = df["probs"].apply(lambda x: -np.sum(x * np.log2(x)))
/home/dnth/Desktop/active-vision/src/active_vision/core.py:401: RuntimeWarning: invalid value encountered in multiply
  df.loc[:, "score"] = df["probs"].apply(lambda x: -np.sum(x * np.log2(x)))
2025-02-07 00:10:58.938 | INFO     | active_vision.core:sample_uncertain:428 - Rendering interactive table

	image	filepath	strategy	score	pred_label	pred_conf
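
The RuntimeWarning lines in the output above appear when a probability vector contains an exact zero: np.log2(0) evaluates to -inf, and 0 * -inf evaluates to nan. The sketch below shows one common way to compute the same entropy score without the warnings, by using the convention that 0 * log2(0) is 0. The helper name entropy_bits is hypothetical and this is not the library's implementation.

```python
import numpy as np


def entropy_bits(probs):
    """Shannon entropy in bits, treating 0 * log2(0) as 0."""
    p = np.asarray(probs, dtype=float)
    nonzero = p[p > 0]  # drop zero entries so log2 never sees 0
    return float(-np.sum(nonzero * np.log2(nonzero)))


# A prediction split evenly between two classes is maximally uncertain
# for those two classes; a one-hot prediction carries no uncertainty.
uncertain = entropy_bits([0.5, 0.5, 0.0])
certain = entropy_bits([1.0, 0.0, 0.0])
print(uncertain, certain)
```

With the zero entries masked out, the score is always a finite number, so rows with a fully confident prediction can be ranked alongside the rest.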

In [13]:

sample_diverse = al.sample_diverse(pred_df, num_samples=50, interactive=True)

2025-02-07 00:11:49.041 | INFO     | active_vision.core:sample_diverse:465 - Using model-based outlier strategy to get top 50 samples
2025-02-07 00:11:49.042 | INFO     | active_vision.core:predict:224 - Running inference on 20 samples

2025-02-07 00:11:49.758 | INFO     | active_vision.core:sample_diverse:527 - Rendering interactive table

	image	filepath	strategy	score	pred_label	pred_conf

In [14]:

sample_random = al.sample_random(pred_df, num_samples=50, interactive=True)

2025-02-07 00:12:08.513 | INFO     | active_vision.core:sample_random:557 - Sampling 50 random samples
2025-02-07 00:12:08.515 | INFO     | active_vision.core:sample_random:567 - Rendering interactive table

	image	filepath	strategy	score	pred_label	pred_conf
