The .pave accessor
One import, then .pave is available on every
DataFrame and Series — no need to call pavement.summary by name.
pandas
Import pavement.pandas once. That registers the .pave accessor on every pandas DataFrame and Series.
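For readers unfamiliar with import-time registration, the following is a minimal sketch of how a callable accessor can be attached to every DataFrame with pandas' public extension API. It is not pavement's implementation: the accessor name demo_tally, the class DemoTallyAccessor, and its missing() method are made up for illustration.

```python
import pandas as pd


# "demo_tally" is a hypothetical accessor name, not part of pavement.
@pd.api.extensions.register_dataframe_accessor("demo_tally")
class DemoTallyAccessor:
    def __init__(self, pandas_obj: pd.DataFrame) -> None:
        # pandas passes in the DataFrame the accessor was reached from.
        self._df = pandas_obj

    def __call__(self) -> dict:
        """df.demo_tally(): missing-cell count for every column."""
        return {col: self.missing(col) for col in self._df.columns}

    def missing(self, column: str) -> int:
        """df.demo_tally.missing(column): missing-cell count for one column."""
        return int(self._df[column].isna().sum())


df = pd.DataFrame({"plan": ["free", None, "pro"], "age": [31, 45, None]})
print(df.demo_tally())                # whole-frame call
print(df.demo_tally.missing("plan"))  # per-column method
```

Defining both __call__ and ordinary methods on the accessor class is what allows a call shape like df.pave() to sit alongside df.pave.tally(column) under a single name.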
df.pave(): whole-frame summary

    import pavement.pandas  # registers .pave
    df.pave()

Output: a rendered summary table for a frame of 4 columns by 300 rows. Each column gets a tally followed by a 16-bin percentile sparkline or a value-proportion strip.

    rows:           300 distinct, 0 duplicate, 0 missing of 300 rows (300 appearing once)
    first column:   300 distinct, 0 duplicate, 0 missing of 300 entries (300 appearing once); sparkline of 300 values, 100,000 to 100,299
    plan:           3 distinct, 266 duplicate, 31 missing of 300 entries (0 appearing once); value proportions of 269 values: free 50% (135), pro 36% (98), team 13% (36)
    age:            51 distinct, 230 duplicate, 19 missing of 300 entries (9 appearing once); sparkline of 281 values, 8 to 68
    fourth column:  240 distinct, 60 duplicate, 0 missing of 300 entries (191 appearing once); sparkline of 300 values, 0.1 to 170.7
df.pave.spark(column): one column's sparkline

    df.pave.spark("age")

Output: a pavement sparkline of 281 values in four percentile bins.

    range      percentiles   share of values
    8 to 31    p0 to p25     22% (63 of 281)
    31 to 38   p25 to p50    17% (48 of 281)
    38 to 45   p50 to p75    23% (64 of 281)
    45 to 68   p75 to p100   25% (69 of 281)

    Values exactly at the boundaries: 8 (p0) 1; 31 (p25) 10; 38 (p50) 19; 45 (p75) 6; 68 (p100) 1.
df.pave.tally(column): distinct / duplicate / missing

    df.pave.tally("plan")

Output: column tally of 300 entries.

    distinct    1% (3 of 300 entries), 0 appearing once
    duplicate  89% (266 of 300 entries)
    missing    10% (31 of 300 entries)
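The three counts in the tally above sum to the number of entries (3 + 266 + 31 = 300), which suggests that "duplicate" means non-missing entries beyond the first occurrence of each value. The sketch below reproduces those numbers in plain Python under that assumption. The tally function here is a hypothetical stand-in, not pavement's code, and the sample list is rebuilt from the counts shown in the proportion output.

```python
from collections import Counter


def tally(values):
    """Count distinct, duplicate, and missing entries (None marks missing).

    Assumed definition: duplicate = non-missing entries - distinct values.
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    distinct = len(counts)
    return {
        "distinct": distinct,
        "duplicate": len(present) - distinct,
        "missing": len(values) - len(present),
        "appearing_once": sum(1 for n in counts.values() if n == 1),
        "total": len(values),
    }


# Rebuilt from the documented counts: 135 free, 98 pro, 36 team, 31 missing.
plan = ["free"] * 135 + ["pro"] * 98 + ["team"] * 36 + [None] * 31
print(tally(plan))
```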
df.pave.proportion(column): value-counts strip

    df.pave.proportion("plan")

Output: value proportions of 269 values across 3 distinct values.

    free  50% (135 of 269 values)
    pro   36% (98 of 269 values)
    team  13% (36 of 269 values)
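The percentages above are taken over the 269 non-missing values, not over all 300 entries. A minimal sketch of that calculation follows. The function names proportions and percent_label are hypothetical, and the "<1%" rule (a nonzero share that rounds to zero) is inferred from the labels that appear in the rendered output rather than from pavement's source.

```python
from collections import Counter


def percent_label(count, total):
    """Format a share as a whole percent, using '<1%' for tiny nonzero shares."""
    if count == 0:
        return "0%"
    pct = round(100 * count / total)
    return "<1%" if pct == 0 else f"{pct}%"


def proportions(values):
    """Return (value, count, label) per distinct value, most frequent first.

    Missing entries (None) are excluded before shares are computed.
    """
    present = [v for v in values if v is not None]
    total = len(present)
    return [
        (value, n, percent_label(n, total))
        for value, n in Counter(present).most_common()
    ]


# Rebuilt from the documented counts: 135 free, 98 pro, 36 team, 31 missing.
plan = ["free"] * 135 + ["pro"] * 98 + ["team"] * 36 + [None] * 31
for value, n, label in proportions(plan):
    print(f"{value}: {label} ({n} of 269 values)")
```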
Series.pave(): a single-row summary

    df['age'].pave()

Output: a one-row summary of 300 entries, consisting of the column tally followed by a 16-bin percentile sparkline.

    tally:      51 distinct, 230 duplicate, 19 missing of 300 entries (9 appearing once)
    sparkline:  281 values, 8 to 68
enable_repr(): make summary the default display

    import pavement.pandas
    pavement.pandas.enable_repr()
    # From this point on, displaying any DataFrame or Series in a
    # notebook cell shows pavement.summary instead of the default repr.
    pavement.pandas.disable_repr()  # restore the original display

Call once at the top of a notebook to make pavement.summary every DataFrame's default cell preview. This is strictly opt-in: a plain import pavement.pandas only adds the .pave accessor, nothing else.
polars
The API is identical: import pavement.polars instead. Polars uses a plugin namespace rather than a pandas accessor, but the call shapes are the same.
df.pave(): whole-frame summary

    import pavement.polars  # registers .pave namespace
    df.pave()

Output: a rendered summary table for a frame of 4 columns by 300 rows. Each column gets a tally followed by a 16-bin percentile sparkline or a value-proportion strip.

    rows:           300 distinct, 0 duplicate, 0 missing of 300 rows (300 appearing once)
    first column:   300 distinct, 0 duplicate, 0 missing of 300 entries (300 appearing once); sparkline of 300 values, 100,000 to 100,299
    plan:           3 distinct, 262 duplicate, 35 missing of 300 entries (0 appearing once); value proportions of 265 values: free 55% (145), pro 26% (69), team 19% (51)
    age:            52 distinct, 233 duplicate, 15 missing of 300 entries (8 appearing once); sparkline of 285 values, -6 to 72
    fourth column:  239 distinct, 61 duplicate, 0 missing of 300 entries (189 appearing once); sparkline of 300 values, 0.2 to 179.8
df.pave.spark(column): one column's sparkline

    df.pave.spark("age")

Output: a pavement sparkline of 285 values in four percentile bins.

    range      percentiles   share of values
    -6 to 30   p0 to p25     23% (65 of 285)
    30 to 39   p25 to p50    24% (68 of 285)
    39 to 47   p50 to p75    22% (62 of 285)
    47 to 72   p75 to p100   24% (67 of 285)

    Values exactly at the boundaries: -6 (p0) 1; 30 (p25) 8; 39 (p50) 9; 47 (p75) 4; 72 (p100) 1.
df.pave.tally(column): distinct / duplicate / missing

    df.pave.tally("plan")

Output: column tally of 300 entries.

    distinct    1% (3 of 300 entries), 0 appearing once
    duplicate  87% (262 of 300 entries)
    missing    12% (35 of 300 entries)
df.pave.proportion(column): value-counts strip

    df.pave.proportion("plan")

Output: value proportions of 265 values across 3 distinct values.

    free  55% (145 of 265 values)
    pro   26% (69 of 265 values)
    team  19% (51 of 265 values)
Series.pave(): a single-row summary

    df['age'].pave()

Output: a one-row summary of 300 entries, consisting of the column tally followed by a 16-bin percentile sparkline.

    tally:      52 distinct, 233 duplicate, 15 missing of 300 entries (8 appearing once)
    sparkline:  285 values, -6 to 72
In a notebook
In Jupyter, df.pave() at the end of a cell renders the summary
table inline automatically, because the result has a _repr_html_.
enable_repr() goes one step further: after calling it once,
any DataFrame displayed in a cell shows the pavement summary instead
of the default pandas/polars table.
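As background on the _repr_html_ hook: Jupyter looks for that method on the value of a cell's last expression and, when it exists, displays the HTML it returns instead of the plain-text repr. The class below is a generic sketch of the mechanism. SummaryTable is a made-up name and bears no relation to the object df.pave() actually returns.

```python
from html import escape


class SummaryTable:
    """Hypothetical result object that renders as an HTML table in Jupyter."""

    def __init__(self, rows):
        self.rows = rows  # list of (label, text) pairs

    def _repr_html_(self):
        # Jupyter calls this for rich display when the object ends a cell.
        body = "".join(
            f"<tr><td>{escape(str(label))}</td><td>{escape(str(text))}</td></tr>"
            for label, text in self.rows
        )
        return f"<table>{body}</table>"

    def __repr__(self):
        # Plain-text fallback used by terminals and print().
        return f"SummaryTable({len(self.rows)} rows)"


table = SummaryTable([("plan", "3 distinct"), ("age", "51 distinct")])
print(repr(table))
print(table._repr_html_())
```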