Overview

Dataset statistics

Number of variables: 11
Number of observations: 3802
Missing cells: 227
Missing cells (%): 0.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 326.9 KiB
Average record size in memory: 88.0 B
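
The derived figures in this block can be re-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; neither assumption is stated in the report itself.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 11
n_observations = 3802
missing_cells = 227
total_size_kib = 326.9

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both values round to the figures reported above, which suggests the assumptions match how the report was computed.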

Variable types

Categorical: 9
Numeric: 2

Warnings

contracts/0/implementation/transactions/0/value/currency has constant value "CRC" Constant
contracts/0/implementation/transactions/0/payer/id has constant value "2-300-042155" Constant
contracts/0/implementation/transactions/0/payer/name has constant value "Corte Suprema de Justicia Poder Judicial" Constant
ocid has a high cardinality: 169 distinct values High cardinality
id has a high cardinality: 169 distinct values High cardinality
contracts/0/id has a high cardinality: 288 distinct values High cardinality
contracts/0/implementation/transactions/0/id has a high cardinality: 3761 distinct values High cardinality
contracts/0/implementation/transactions/0/date has a high cardinality: 347 distinct values High cardinality
contracts/0/implementation/transactions/0/payee/name has a high cardinality: 112 distinct values High cardinality
contracts/0/implementation/transactions/0/payer/name is highly correlated with contracts/0/implementation/transactions/0/payer/id and 1 other fields High correlation
contracts/0/implementation/transactions/0/payer/id is highly correlated with contracts/0/implementation/transactions/0/payer/name and 1 other fields High correlation
contracts/0/implementation/transactions/0/value/currency is highly correlated with contracts/0/implementation/transactions/0/payer/name and 1 other fields High correlation
contracts/0/implementation/transactions/0/payee/name has 227 (6.0%) missing values Missing
contracts/0/implementation/transactions/0/id is uniformly distributed Uniform

Reproduction

Analysis started: 2021-03-27 16:53:53.489820
Analysis finished: 2021-03-27 16:53:55.659654
Duration: 2.17 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

ocid
Categorical

HIGH CARDINALITY

Distinct: 169
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 29.8 KiB

Value                    Count
ocds-fnha3a-002238-2017  1662
ocds-fnha3a-002235-2017  550
ocds-fnha3a-000467-2017  289
ocds-fnha3a-002962-2017  84
ocds-fnha3a-003075-2017  64
Other values (164)       1153

Length

Max length: 23
Median length: 23
Mean length: 23
Min length: 23

Characters and Unicode

Total characters: 87446
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
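
As a rough illustration of how such character properties can be read, the sketch below counts the Unicode general categories of one ocid value using Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One ocid value from this dataset (23 characters long).
counts = category_counts("ocds-fnha3a-000004-2017")

# 'Nd' = Decimal Number, 'Ll' = Lowercase Letter, 'Pd' = Dash Punctuation.
print(counts)
```

The three category codes correspond to the three distinct categories reported for this variable.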

Unique

Unique: 43
Unique (%): 1.1%

Sample

1st row: ocds-fnha3a-000004-2017
2nd row: ocds-fnha3a-000004-2017
3rd row: ocds-fnha3a-000004-2017
4th row: ocds-fnha3a-000005-2017
5th row: ocds-fnha3a-000005-2017

Value                    Count  Frequency (%)
ocds-fnha3a-002238-2017  1662   43.7%
ocds-fnha3a-002235-2017  550    14.5%
ocds-fnha3a-000467-2017  289    7.6%
ocds-fnha3a-002962-2017  84     2.2%
ocds-fnha3a-003075-2017  64     1.7%
ocds-fnha3a-000067-2017  57     1.5%
ocds-fnha3a-001082-2017  56     1.5%
ocds-fnha3a-000881-2017  39     1.0%
ocds-fnha3a-000617-2017  38     1.0%
ocds-fnha3a-002460-2017  35     0.9%
Other values (159)       928    24.4%

[Figure: Histogram of lengths of the category]

Most occurring characters

Value             Count  Frequency (%)
0                 12695  14.5%
-                 11406  13.0%
2                 9019   10.3%
a                 7604   8.7%
3                 6384   7.3%
1                 4494   5.1%
7                 4467   5.1%
o                 3802   4.3%
c                 3802   4.3%
d                 3802   4.3%
Other values (9)  19971  22.8%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    41822  47.8%
Lowercase Letter  34218  39.1%
Dash Punctuation  11406  13.0%

Most frequent character per category

ValueCountFrequency (%)
012695
30.4%
29019
21.6%
36384
15.3%
14494
 
10.7%
74467
 
10.7%
82096
 
5.0%
5886
 
2.1%
6752
 
1.8%
4733
 
1.8%
9296
 
0.7%
ValueCountFrequency (%)
a7604
22.2%
o3802
11.1%
c3802
11.1%
d3802
11.1%
s3802
11.1%
f3802
11.1%
n3802
11.1%
h3802
11.1%
ValueCountFrequency (%)
-11406
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common53228
60.9%
Latin34218
39.1%

Most frequent character per script

ValueCountFrequency (%)
012695
23.9%
-11406
21.4%
29019
16.9%
36384
12.0%
14494
 
8.4%
74467
 
8.4%
82096
 
3.9%
5886
 
1.7%
6752
 
1.4%
4733
 
1.4%
ValueCountFrequency (%)
a7604
22.2%
o3802
11.1%
c3802
11.1%
d3802
11.1%
s3802
11.1%
f3802
11.1%
n3802
11.1%
h3802
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII87446
100.0%

Most frequent character per block

ValueCountFrequency (%)
012695
14.5%
-11406
13.0%
29019
10.3%
a7604
 
8.7%
36384
 
7.3%
14494
 
5.1%
74467
 
5.1%
o3802
 
4.3%
c3802
 
4.3%
d3802
 
4.3%
Other values (9)19971
22.8%

id
Categorical

HIGH CARDINALITY

Distinct: 169
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 29.8 KiB

Value                 Count
2017LN-000013-PROV    1662
2017CD-000014-PROVEX  550
2017LA-000025-PROV    289
2017CD-000678-PROVCD  84
2017LN-000019-PROV    64
Other values (164)    1153

Length

Max length: 20
Median length: 18
Mean length: 18.77117307
Min length: 18

Characters and Unicode

Total characters: 71368
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 1.1%

Sample

1st row: 2017LA-000003-PROV
2nd row: 2017LA-000003-PROV
3rd row: 2017LA-000003-PROV
4th row: 2017LA-000004-PROV
5th row: 2017LA-000004-PROV

Value                 Count  Frequency (%)
2017LN-000013-PROV    1662   43.7%
2017CD-000014-PROVEX  550    14.5%
2017LA-000025-PROV    289    7.6%
2017CD-000678-PROVCD  84     2.2%
2017LN-000019-PROV    64     1.7%
2017CD-000025-PROVCD  57     1.5%
2017CD-000008-PROVEX  56     1.5%
2017CD-000207-PROVCD  39     1.0%
2017CD-000005-PROVEX  38     1.0%
2017LN-000014-PROV    35     0.9%
Other values (159)    928    24.4%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
2017ln-000013-prov1662
43.7%
2017cd-000014-provex550
 
14.5%
2017la-000025-prov289
 
7.6%
2017cd-000678-provcd84
 
2.2%
2017ln-000019-prov64
 
1.7%
2017cd-000025-provcd57
 
1.5%
2017cd-000008-provex56
 
1.5%
2017cd-000207-provcd39
 
1.0%
2017cd-000005-provex38
 
1.0%
2017ln-000014-prov35
 
0.9%
Other values (159)928
24.4%

Most occurring characters

ValueCountFrequency (%)
018730
26.2%
-7604
10.7%
16586
 
9.2%
24553
 
6.4%
74088
 
5.7%
P3802
 
5.3%
R3802
 
5.3%
O3802
 
5.3%
V3802
 
5.3%
L2336
 
3.3%
Other values (13)12263
17.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number38020
53.3%
Uppercase Letter25744
36.1%
Dash Punctuation7604
 
10.7%

Most frequent character per category

ValueCountFrequency (%)
P3802
14.8%
R3802
14.8%
O3802
14.8%
V3802
14.8%
L2336
9.1%
C2214
8.6%
D2086
8.1%
N1778
6.9%
E718
 
2.8%
X718
 
2.8%
Other values (2)686
 
2.7%
ValueCountFrequency (%)
018730
49.3%
16586
 
17.3%
24553
 
12.0%
74088
 
10.8%
31880
 
4.9%
4735
 
1.9%
5569
 
1.5%
6400
 
1.1%
8280
 
0.7%
9199
 
0.5%
ValueCountFrequency (%)
-7604
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common45624
63.9%
Latin25744
36.1%

Most frequent character per script

ValueCountFrequency (%)
P3802
14.8%
R3802
14.8%
O3802
14.8%
V3802
14.8%
L2336
9.1%
C2214
8.6%
D2086
8.1%
N1778
6.9%
E718
 
2.8%
X718
 
2.8%
Other values (2)686
 
2.7%
ValueCountFrequency (%)
018730
41.1%
-7604
16.7%
16586
 
14.4%
24553
 
10.0%
74088
 
9.0%
31880
 
4.1%
4735
 
1.6%
5569
 
1.2%
6400
 
0.9%
8280
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII71368
100.0%

Most frequent character per block

ValueCountFrequency (%)
018730
26.2%
-7604
10.7%
16586
 
9.2%
24553
 
6.4%
74088
 
5.7%
P3802
 
5.3%
R3802
 
5.3%
O3802
 
5.3%
V3802
 
5.3%
L2336
 
3.3%
Other values (13)12263
17.2%

contracts/0/id
Categorical

HIGH CARDINALITY

Distinct: 288
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 29.8 KiB

Value               Count
2018-038118         671
2017-071117         550
2018-040118         535
2018-039118         456
2017-088117         289
Other values (283)  1301

Length

Max length: 12
Median length: 12
Mean length: 11.93161494
Min length: 11

Characters and Unicode

Total characters: 45364
Distinct characters: 16
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 209
Unique (%): 5.5%

Sample

1st row: 2017-073349
2nd row: 2017-073349
3rd row: 2017-073350
4th row: 2017-074015
5th row: 2017-074016

Value               Count  Frequency (%)
2018-038118         671    17.6%
2017-071117         550    14.5%
2018-040118         535    14.1%
2018-039118         456    12.0%
2017-088117         289    7.6%
2018-Res. 22        84     2.2%
2018-032118         64     1.7%
2017-Res. 21        57     1.5%
2017-018217         56     1.5%
2017-075117         39     1.0%
Other values (278)  1001   26.3%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
2018-038118671
17.0%
2017-071117550
13.9%
2018-040118535
13.6%
2018-039118456
11.6%
2017-088117289
 
7.3%
2284
 
2.1%
2018-res84
 
2.1%
2018-03211864
 
1.6%
2157
 
1.4%
2017-res57
 
1.4%
Other values (280)1096
27.8%

Most occurring characters

ValueCountFrequency (%)
111300
24.9%
08310
18.3%
85521
12.2%
24635
10.2%
74127
 
9.1%
-3802
 
8.4%
3542
 
7.8%
31471
 
3.2%
4888
 
2.0%
9660
 
1.5%
Other values (6)1108
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number37456
82.6%
Dash Punctuation3802
 
8.4%
Space Separator3542
 
7.8%
Lowercase Letter282
 
0.6%
Uppercase Letter141
 
0.3%
Other Punctuation141
 
0.3%

Most frequent character per category

ValueCountFrequency (%)
111300
30.2%
08310
22.2%
85521
14.7%
24635
12.4%
74127
 
11.0%
31471
 
3.9%
4888
 
2.4%
9660
 
1.8%
5323
 
0.9%
6221
 
0.6%
ValueCountFrequency (%)
e141
50.0%
s141
50.0%
ValueCountFrequency (%)
-3802
100.0%
ValueCountFrequency (%)
3542
100.0%
ValueCountFrequency (%)
R141
100.0%
ValueCountFrequency (%)
.141
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common44941
99.1%
Latin423
 
0.9%

Most frequent character per script

ValueCountFrequency (%)
111300
25.1%
08310
18.5%
85521
12.3%
24635
10.3%
74127
 
9.2%
-3802
 
8.5%
3542
 
7.9%
31471
 
3.3%
4888
 
2.0%
9660
 
1.5%
Other values (3)685
 
1.5%
ValueCountFrequency (%)
R141
33.3%
e141
33.3%
s141
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII45364
100.0%

Most frequent character per block

ValueCountFrequency (%)
111300
24.9%
08310
18.3%
85521
12.2%
24635
10.2%
74127
 
9.1%
-3802
 
8.4%
3542
 
7.8%
31471
 
3.2%
4888
 
2.0%
9660
 
1.5%
Other values (6)1108
 
2.4%

contracts/0/implementation/transactions/0/id
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 3761
Distinct (%): 98.9%
Missing: 0
Missing (%): 0.0%
Memory size: 29.8 KiB

Value                          Count
301-019468-2017-AP1893-GOB-17  3
301-017829-2017-AP1690-GOB-17  3
301-003651-2020-AP00669-GOB20  2
301-020372-2017-AP1968-GOB-17  2
301-016652-2017-AP1593-GOB-17  2
Other values (3756)            3790

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 110258
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3722
Unique (%): 97.9%

Sample

1st row: 301-029261-2017-AP2887-GOB-17
2nd row: 301-031263-2017-AP3050-GOB-17
3rd row: 301-029260-2017-AP2887-GOB-17
4th row: 301-034651-2017-AP3489-GOB-17
5th row: 301-034652-2017-AP3489-GOB-17

Value                          Count  Frequency (%)
301-019468-2017-AP1893-GOB-17  3      0.1%
301-017829-2017-AP1690-GOB-17  3      0.1%
301-003651-2020-AP00669-GOB20  2      0.1%
301-020372-2017-AP1968-GOB-17  2      0.1%
301-016652-2017-AP1593-GOB-17  2      0.1%
301-010108-2020-AP02165-GOB20  2      0.1%
301-027730-2017-AP2721-GOB-17  2      0.1%
301-045169-2019-AP8905-GOB-19  2      0.1%
301-026887-2017-AP2650-GOB-17  2      0.1%
301-010850-2020-AP02292-GOB20  2      0.1%
Other values (3751)            3780   99.4%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
301-017829-2017-ap1690-gob-173
 
0.1%
301-019468-2017-ap1893-gob-173
 
0.1%
301-031525-2017-ap3076-gob-172
 
0.1%
301-010107-2020-ap02165-gob202
 
0.1%
301-017561-2017-ap1694-gob-172
 
0.1%
301-020447-2020-ap04880-gob202
 
0.1%
301-016652-2017-ap1593-gob-172
 
0.1%
301-032563-2017-ap3230-gob-172
 
0.1%
301-010104-2020-ap02165-gob202
 
0.1%
301-033520-2017-ap3342-gob-172
 
0.1%
Other values (3751)3780
99.4%

Most occurring characters

ValueCountFrequency (%)
-18138
16.5%
017592
16.0%
114096
12.8%
29230
8.4%
37845
 
7.1%
95822
 
5.3%
84849
 
4.4%
44011
 
3.6%
A3802
 
3.4%
P3802
 
3.4%
Other values (6)21071
19.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number73110
66.3%
Uppercase Letter19010
 
17.2%
Dash Punctuation18138
 
16.5%

Most frequent character per category

ValueCountFrequency (%)
017592
24.1%
114096
19.3%
29230
12.6%
37845
10.7%
95822
 
8.0%
84849
 
6.6%
44011
 
5.5%
73542
 
4.8%
63083
 
4.2%
53040
 
4.2%
ValueCountFrequency (%)
A3802
20.0%
P3802
20.0%
G3802
20.0%
O3802
20.0%
B3802
20.0%
ValueCountFrequency (%)
-18138
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common91248
82.8%
Latin19010
 
17.2%

Most frequent character per script

ValueCountFrequency (%)
-18138
19.9%
017592
19.3%
114096
15.4%
29230
10.1%
37845
8.6%
95822
 
6.4%
84849
 
5.3%
44011
 
4.4%
73542
 
3.9%
63083
 
3.4%
ValueCountFrequency (%)
A3802
20.0%
P3802
20.0%
G3802
20.0%
O3802
20.0%
B3802
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII110258
100.0%

Most frequent character per block

ValueCountFrequency (%)
-18138
16.5%
017592
16.0%
114096
12.8%
29230
8.4%
37845
 
7.1%
95822
 
5.3%
84849
 
4.4%
44011
 
3.6%
A3802
 
3.4%
P3802
 
3.4%
Other values (6)21071
19.1%

contracts/0/implementation/transactions/0/date
Categorical

HIGH CARDINALITY

Distinct: 347
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.8 KiB

Value                      Count
2019-01-22T00:00:00-06:00  88
2018-09-27T00:00:00-06:00  72
2018-12-28T00:00:00-06:00  70
2019-01-25T00:00:00-06:00  53
2019-12-18T00:00:00-06:00  46
Other values (342)         3473

Length

Max length: 25
Median length: 25
Mean length: 25
Min length: 25

Characters and Unicode

Total characters: 95050
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 1.1%

Sample

1st row: 2017-10-13T14:53:33-06:00
2nd row: 2017-10-31T16:56:35-06:00
3rd row: 2017-10-13T14:53:33-06:00
4th row: 2017-11-22T11:01:29-06:00
5th row: 2017-11-22T11:01:29-06:00

Value                      Count  Frequency (%)
2019-01-22T00:00:00-06:00  88     2.3%
2018-09-27T00:00:00-06:00  72     1.9%
2018-12-28T00:00:00-06:00  70     1.8%
2019-01-25T00:00:00-06:00  53     1.4%
2019-12-18T00:00:00-06:00  46     1.2%
2019-05-07T00:00:00-06:00  44     1.2%
2020-01-06T00:00:00-06:00  40     1.1%
2018-12-13T00:00:00-06:00  40     1.1%
2019-10-29T00:00:00-06:00  36     0.9%
2020-04-17T00:00:00-06:00  35     0.9%
Other values (337)         3278   86.2%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
2019-01-22t00:00:00-06:0088
 
2.3%
2018-09-27t00:00:00-06:0072
 
1.9%
2018-12-28t00:00:00-06:0070
 
1.8%
2019-01-25t00:00:00-06:0053
 
1.4%
2019-12-18t00:00:00-06:0046
 
1.2%
2019-05-07t00:00:00-06:0044
 
1.2%
2020-01-06t00:00:00-06:0040
 
1.1%
2018-12-13t00:00:00-06:0040
 
1.1%
2019-10-29t00:00:00-06:0036
 
0.9%
2020-04-17t00:00:00-06:0035
 
0.9%
Other values (337)3278
86.2%

Most occurring characters

ValueCountFrequency (%)
040443
42.5%
-11406
 
12.0%
:11406
 
12.0%
27407
 
7.8%
16571
 
6.9%
64868
 
5.1%
T3802
 
4.0%
92591
 
2.7%
81748
 
1.8%
31366
 
1.4%
Other values (3)3442
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number68436
72.0%
Dash Punctuation11406
 
12.0%
Other Punctuation11406
 
12.0%
Uppercase Letter3802
 
4.0%

Most frequent character per category

ValueCountFrequency (%)
040443
59.1%
27407
 
10.8%
16571
 
9.6%
64868
 
7.1%
92591
 
3.8%
81748
 
2.6%
31366
 
2.0%
71304
 
1.9%
51189
 
1.7%
4949
 
1.4%
ValueCountFrequency (%)
-11406
100.0%
ValueCountFrequency (%)
T3802
100.0%
ValueCountFrequency (%)
:11406
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common91248
96.0%
Latin3802
 
4.0%

Most frequent character per script

ValueCountFrequency (%)
040443
44.3%
-11406
 
12.5%
:11406
 
12.5%
27407
 
8.1%
16571
 
7.2%
64868
 
5.3%
92591
 
2.8%
81748
 
1.9%
31366
 
1.5%
71304
 
1.4%
Other values (2)2138
 
2.3%
ValueCountFrequency (%)
T3802
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII95050
100.0%

Most frequent character per block

ValueCountFrequency (%)
040443
42.5%
-11406
 
12.0%
:11406
 
12.0%
27407
 
7.8%
16571
 
6.9%
64868
 
5.1%
T3802
 
4.0%
92591
 
2.7%
81748
 
1.8%
31366
 
1.4%
Other values (3)3442
 
3.6%

contracts/0/implementation/transactions/0/value/amount
Numeric

Distinct: 1585
Distinct (%): 41.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6897188.212
Minimum: 0
Maximum: 438393687.4
Zeros: 5
Zeros (%): 0.1%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 109287.5
Q1: 584699.78
median: 2971282.99
Q3: 7033367.31
95-th percentile: 19426583.67
Maximum: 438393687.4
Range: 438393687.4
Interquartile range (IQR): 6448667.53

Descriptive statistics

Standard deviation: 20138083.77
Coefficient of variation (CV): 2.91975268
Kurtosis: 137.9526967
Mean: 6897188.212
Median Absolute Deviation (MAD): 2697161.75
Skewness: 10.00618223
Sum: 2.622310958 × 10^10
Variance: 4.055424179 × 10^14
Monotonicity: Not monotonic
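
Several of these statistics follow directly from others in the same block. A small sketch, using only the values printed above and the standard definitions (CV = standard deviation / mean, IQR = Q3 - Q1, variance = standard deviation squared, sum = mean × number of observations), can be used to sanity-check them. The tolerances only absorb the rounding in the reported figures.

```python
import math

# Values copied from the statistics reported for the amount variable.
mean = 6897188.212
std = 20138083.77
q1 = 584699.78
q3 = 7033367.31
n_observations = 3802

cv = std / mean                # coefficient of variation
iqr = q3 - q1                  # interquartile range
variance = std ** 2            # variance is the squared standard deviation
total = mean * n_observations  # sum of all amounts

print(f"CV       = {cv:.8f}")
print(f"IQR      = {iqr:.2f}")
print(f"Variance = {variance:.9e}")
print(f"Sum      = {total:.9e}")
```

A CV close to 3 together with the large skewness indicates that a few very large payments dominate the spread of this column.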
[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:

Value                Count  Frequency (%)
196000               35     0.9%
9193506.87           25     0.7%
141666.66            22     0.6%
470400               21     0.6%
140000               21     0.6%
285450               20     0.5%
1021597.08           18     0.5%
169981               18     0.5%
343000               17     0.4%
7350508.29           17     0.4%
Other values (1575)  3588   94.4%

Smallest values:

Value    Count  Frequency (%)
0        5      0.1%
569.5    1      < 0.1%
2940     1      < 0.1%
3000     1      < 0.1%
3016.78  1      < 0.1%

Largest values:

Value        Count  Frequency (%)
438393687.4  1      < 0.1%
349790016.8  2      0.1%
255057336.2  2      0.1%
229928793.6  3      0.1%
193914378.7  1      < 0.1%

contracts/0/implementation/transactions/0/value/currency
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.8 KiB

Value  Count
CRC    3802

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 11406
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CRC
2nd row: CRC
3rd row: CRC
4th row: CRC
5th row: CRC

Value  Count  Frequency (%)
CRC    3802   100.0%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
crc3802
100.0%

Most occurring characters

ValueCountFrequency (%)
C7604
66.7%
R3802
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter11406
100.0%

Most frequent character per category

ValueCountFrequency (%)
C7604
66.7%
R3802
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin11406
100.0%

Most frequent character per script

ValueCountFrequency (%)
C7604
66.7%
R3802
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII11406
100.0%

Most frequent character per block

ValueCountFrequency (%)
C7604
66.7%
R3802
33.3%

contracts/0/implementation/transactions/0/payer/id
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.8 KiB

Value         Count
2-300-042155  3802

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 45624
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2-300-042155
2nd row: 2-300-042155
3rd row: 2-300-042155
4th row: 2-300-042155
5th row: 2-300-042155

Value         Count  Frequency (%)
2-300-042155  3802   100.0%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
2-300-0421553802
100.0%

Most occurring characters

ValueCountFrequency (%)
011406
25.0%
27604
16.7%
-7604
16.7%
57604
16.7%
33802
 
8.3%
43802
 
8.3%
13802
 
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number38020
83.3%
Dash Punctuation7604
 
16.7%

Most frequent character per category

ValueCountFrequency (%)
011406
30.0%
27604
20.0%
57604
20.0%
33802
 
10.0%
43802
 
10.0%
13802
 
10.0%
ValueCountFrequency (%)
-7604
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common45624
100.0%

Most frequent character per script

ValueCountFrequency (%)
011406
25.0%
27604
16.7%
-7604
16.7%
57604
16.7%
33802
 
8.3%
43802
 
8.3%
13802
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII45624
100.0%

Most frequent character per block

ValueCountFrequency (%)
011406
25.0%
27604
16.7%
-7604
16.7%
57604
16.7%
33802
 
8.3%
43802
 
8.3%
13802
 
8.3%

contracts/0/implementation/transactions/0/payer/name
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.8 KiB

Value                                     Count
Corte Suprema de Justicia Poder Judicial  3802

Length

Max length: 40
Median length: 40
Mean length: 40
Min length: 40

Characters and Unicode

Total characters: 152080
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Corte Suprema de Justicia Poder Judicial
2nd row: Corte Suprema de Justicia Poder Judicial
3rd row: Corte Suprema de Justicia Poder Judicial
4th row: Corte Suprema de Justicia Poder Judicial
5th row: Corte Suprema de Justicia Poder Judicial

Value                                     Count  Frequency (%)
Corte Suprema de Justicia Poder Judicial  3802   100.0%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
de3802
16.7%
poder3802
16.7%
judicial3802
16.7%
corte3802
16.7%
justicia3802
16.7%
suprema3802
16.7%

Most occurring characters

ValueCountFrequency (%)
19010
12.5%
e15208
10.0%
i15208
10.0%
r11406
 
7.5%
u11406
 
7.5%
a11406
 
7.5%
d11406
 
7.5%
o7604
 
5.0%
t7604
 
5.0%
J7604
 
5.0%
Other values (8)34218
22.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter114060
75.0%
Uppercase Letter19010
 
12.5%
Space Separator19010
 
12.5%

Most frequent character per category

ValueCountFrequency (%)
e15208
13.3%
i15208
13.3%
r11406
10.0%
u11406
10.0%
a11406
10.0%
d11406
10.0%
o7604
6.7%
t7604
6.7%
c7604
6.7%
p3802
 
3.3%
Other values (3)11406
10.0%
ValueCountFrequency (%)
J7604
40.0%
C3802
20.0%
S3802
20.0%
P3802
20.0%
ValueCountFrequency (%)
19010
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin133070
87.5%
Common19010
 
12.5%

Most frequent character per script

ValueCountFrequency (%)
e15208
11.4%
i15208
11.4%
r11406
8.6%
u11406
8.6%
a11406
8.6%
d11406
8.6%
o7604
 
5.7%
t7604
 
5.7%
J7604
 
5.7%
c7604
 
5.7%
Other values (7)26614
20.0%
ValueCountFrequency (%)
19010
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII152080
100.0%

Most frequent character per block

ValueCountFrequency (%)
19010
12.5%
e15208
10.0%
i15208
10.0%
r11406
 
7.5%
u11406
 
7.5%
a11406
 
7.5%
d11406
 
7.5%
o7604
 
5.0%
t7604
 
5.0%
J7604
 
5.0%
Other values (8)34218
22.5%

contracts/0/implementation/transactions/0/payee/id
Numeric

Distinct: 124
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3125976570
Minimum: 104060231
Maximum: 1.170011981 × 10^11
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.8 KiB

Quantile statistics

Minimum: 104060231
5-th percentile: 109070854
Q1: 3101059070
median: 3101229409
Q3: 3101229409
95-th percentile: 3101531254
Maximum: 1.170011981 × 10^11
Range: 1.168971379 × 10^11
Interquartile range (IQR): 170339

Descriptive statistics

Standard deviation: 6993155182
Coefficient of variation (CV): 2.237110556
Kurtosis: 256.3836316
Mean: 3125976570
Median Absolute Deviation (MAD): 130547
Skewness: 15.90259659
Sum: 1.188496292 × 10^13
Variance: 4.89042194 × 10^19
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:

Value               Count  Frequency (%)
3101229409          1221   32.1%
3101059070          420    11.0%
3101098862          402    10.6%
3101310098          289    7.6%
105520206           84     2.2%
3101531254          73     1.9%
3101623039          71     1.9%
104230576           64     1.7%
3101489187          56     1.5%
109070854           49     1.3%
Other values (114)  1073   28.2%

Smallest values:

Value      Count  Frequency (%)
104060231  3      0.1%
104230576  64     1.7%
105190418  20     0.5%
105200052  3      0.1%
105520206  84     2.2%

Largest values:

Value                Count  Frequency (%)
1.170011981 × 10^11  14     0.4%
4000042147           18     0.5%
3102517780           1      < 0.1%
3102026972           2      0.1%
3101720520           2      0.1%

contracts/0/implementation/transactions/0/payee/name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 112
Distinct (%): 3.1%
Missing: 227
Missing (%): 6.0%
Memory size: 29.8 KiB

Value                                       Count
Eulen De Costa Rica S.A.                    1221
Distribuidora Y Envasadora De Químicos S.A  420
Servicios Rápidos de Costa Rica S.A.        402
Manejo Profesional de Desechos S.A.         289
Marta Eugenia Chacón González               84
Other values (107)                          1159

Length

Max length: 68
Median length: 26
Mean length: 29.72027972
Min length: 10

Characters and Unicode

Total characters: 106250
Distinct characters: 61
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): 0.5%

Sample

1st row: Swat Consulting Services Latin America S. A.
2nd row: Swat Consulting Services Latin America S. A.
3rd row: Swat Consulting Services Latin America S. A.
4th row: Central de Servicios P C, S. A.
5th row: Central de Servicios P C, S. A.

Value                                                Count  Frequency (%)
Eulen De Costa Rica S.A.                             1221   32.1%
Distribuidora Y Envasadora De Químicos S.A           420    11.0%
Servicios Rápidos de Costa Rica S.A.                 402    10.6%
Manejo Profesional de Desechos S.A.                  289    7.6%
Marta Eugenia Chacón González                        84     2.2%
Carlos Manuel Abarca López                           64     1.7%
Pharos Street S. A.                                  56     1.5%
Weimng Chin Ng                                       49     1.3%
Corporación de Profesionales Médicos COPROMED S. A.  48     1.3%
Sthena Mobiliaria S.A.                               38     1.0%
Other values (102)                                   904    23.8%
(Missing)                                            227    6.0%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
s.a2656
15.0%
de2568
14.5%
costa1685
 
9.5%
rica1685
 
9.5%
eulen1221
 
6.9%
y434
 
2.4%
distribuidora422
 
2.4%
químicos420
 
2.4%
envasadora420
 
2.4%
servicios414
 
2.3%
Other values (276)5815
32.8%

Most occurring characters

ValueCountFrequency (%)
14323
 
13.5%
a8231
 
7.7%
e7674
 
7.2%
o6699
 
6.3%
i6625
 
6.2%
s5890
 
5.5%
.5651
 
5.3%
S3885
 
3.7%
n3859
 
3.6%
c3746
 
3.5%
Other values (51)39667
37.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter64565
60.8%
Uppercase Letter21691
 
20.4%
Space Separator14323
 
13.5%
Other Punctuation5671
 
5.3%

Most frequent character per category

ValueCountFrequency (%)
a8231
12.7%
e7674
11.9%
o6699
10.4%
i6625
10.3%
s5890
9.1%
n3859
 
6.0%
c3746
 
5.8%
r3711
 
5.7%
t2913
 
4.5%
l2760
 
4.3%
Other values (21)12457
19.3%
ValueCountFrequency (%)
S3885
17.9%
A3563
16.4%
D2549
11.8%
R2503
11.5%
C2494
11.5%
E2126
9.8%
M886
 
4.1%
P508
 
2.3%
Y471
 
2.2%
Q451
 
2.1%
Other values (16)2255
10.4%
ValueCountFrequency (%)
.5651
99.6%
,13
 
0.2%
&7
 
0.1%
ValueCountFrequency (%)
14323
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin86256
81.2%
Common19994
 
18.8%

Most frequent character per script

ValueCountFrequency (%)
a8231
 
9.5%
e7674
 
8.9%
o6699
 
7.8%
i6625
 
7.7%
s5890
 
6.8%
S3885
 
4.5%
n3859
 
4.5%
c3746
 
4.3%
r3711
 
4.3%
A3563
 
4.1%
Other values (47)32373
37.5%
ValueCountFrequency (%)
14323
71.6%
.5651
 
28.3%
,13
 
0.1%
&7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII104716
98.6%
None1534
 
1.4%

Most frequent character per block

ValueCountFrequency (%)
14323
 
13.7%
a8231
 
7.9%
e7674
 
7.3%
o6699
 
6.4%
i6625
 
6.3%
s5890
 
5.6%
.5651
 
5.4%
S3885
 
3.7%
n3859
 
3.7%
c3746
 
3.6%
Other values (44)38133
36.4%
ValueCountFrequency (%)
á629
41.0%
í480
31.3%
ó303
19.8%
é102
 
6.6%
Á10
 
0.7%
ñ7
 
0.5%
ú3
 
0.2%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
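
A minimal, dependency-free sketch of this calculation follows. The function name `pearson_r` and the example data are illustrative only, and this is not the code the report itself ran. The 1/n factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centred products; the common 1/n factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: y is an exact increasing linear function of x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0 * v + 5.0 for v in x]
print(pearson_r(x, y))
```

The explicit error for a constant input reflects that the denominator vanishes in that case, so r cannot be computed for a column with a single value.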

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
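
The sketch below follows that description literally: it converts each variable to ranks and then applies the covariance-over-standard-deviations formula to the ranks. Tied values are given the average of their positions, which is a common convention but an assumption here; all names and data are illustrative.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's r, used here on the rank variables."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: a monotonic but strongly nonlinear relationship.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relationship ρ reaches its maximum while Pearson's r stays below it, which is the behaviour the paragraph above describes.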

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
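
The pair-counting rule can be sketched as follows. This is the plain form described above (often called tau-a), in which tied pairs count as neither concordant nor discordant; tie-corrected variants used by statistical libraries can give different values when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # A positive product means both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```

The double loop is quadratic in the number of observations, which is acceptable for a few thousand rows but is the reason libraries use faster algorithms.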

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
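
For orientation, the following sketch implements the bias-corrected estimator as it is usually stated after Bergsma (2013): the squared phi coefficient and the table dimensions are each adjusted before taking the ratio. It is written from the published formula, not taken from the profiler's source, and the function name and data are illustrative.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    joint = Counter(zip(x, y))
    rows = Counter(x)
    cols = Counter(y)
    r, k = len(rows), len(cols)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_total in rows.items():
        for b, col_total in cols.items():
            expected = row_total * col_total / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return float("nan")  # undefined, e.g. when a column is constant
    return math.sqrt(phi2_corr / denom)


# Hypothetical data: y is fully determined by x.
x = ["a", "b"] * 50
y = ["u", "v"] * 50
print(cramers_v_corrected(x, y))
```

Returning NaN for a constant column is a choice made for this sketch; with a single distinct value the measure has no meaningful denominator.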

Missing values

2021-03-27T10:53:54.805913image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
A simple visualization of nullity by column.
2021-03-27T10:53:55.261695image/svg+xmlMatplotlib v3.3.2, https://matplotlib.org/
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
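
The per-column nullity behind these displays amounts to counting missing cells. A minimal sketch, assuming records are held as dictionaries with `None` marking a missing cell; the function name `missing_summary` and the toy records are hypothetical. The two percentages at the end only re-derive figures quoted in the report's overview (227 missing cells, 3802 observations, 11 variables).

```python
def missing_summary(records, columns):
    """Map each column to (missing count, missing percentage)."""
    n = len(records)
    summary = {}
    for column in columns:
        missing = sum(1 for record in records if record.get(column) is None)
        percent = 100.0 * missing / n if n else 0.0
        summary[column] = (missing, percent)
    return summary


# Toy records (hypothetical), not rows from the profiled dataset.
records = [
    {"payee/id": 1, "payee/name": "A"},
    {"payee/id": 2, "payee/name": None},
    {"payee/id": 3, "payee/name": "C"},
    {"payee/id": 4, "payee/name": None},
]
summary = missing_summary(records, ["payee/id", "payee/name"])

# Re-deriving the report's own percentages from its stated counts.
payee_name_missing_pct = round(100 * 227 / 3802, 1)        # share of rows
all_cells_missing_pct = round(100 * 227 / (11 * 3802), 1)  # share of cells
```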

Sample

First rows

(index) | ocid | id | contracts/0/id | contracts/0/implementation/transactions/0/id | contracts/0/implementation/transactions/0/date | contracts/0/implementation/transactions/0/value/amount | contracts/0/implementation/transactions/0/value/currency | contracts/0/implementation/transactions/0/payer/id | contracts/0/implementation/transactions/0/payer/name | contracts/0/implementation/transactions/0/payee/id | contracts/0/implementation/transactions/0/payee/name
0 | ocds-fnha3a-000004-2017 | 2017LA-000003-PROV | 2017-073349 | 301-029261-2017-AP2887-GOB-17 | 2017-10-13T14:53:33-06:00 | 5.610245e+07 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101415074 | Swat Consulting Services Latin America S. A.
1 | ocds-fnha3a-000004-2017 | 2017LA-000003-PROV | 2017-073349 | 301-031263-2017-AP3050-GOB-17 | 2017-10-31T16:56:35-06:00 | 3.677869e+06 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101415074 | Swat Consulting Services Latin America S. A.
2 | ocds-fnha3a-000004-2017 | 2017LA-000003-PROV | 2017-073350 | 301-029260-2017-AP2887-GOB-17 | 2017-10-13T14:53:33-06:00 | 5.610245e+07 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101415074 | Swat Consulting Services Latin America S. A.
3 | ocds-fnha3a-000005-2017 | 2017LA-000004-PROV | 2017-074015 | 301-034651-2017-AP3489-GOB-17 | 2017-11-22T11:01:29-06:00 | 2.550573e+08 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101096527 | Central de Servicios P C, S. A.
4 | ocds-fnha3a-000005-2017 | 2017LA-000004-PROV | 2017-074016 | 301-034652-2017-AP3489-GOB-17 | 2017-11-22T11:01:29-06:00 | 2.550573e+08 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101096527 | Central de Servicios P C, S. A.
5 | ocds-fnha3a-000005-2017 | 2017LA-000004-PROV | 2017-074016 | 301-039645-2017-AP4149-GOB-17 | 2017-12-22T08:58:33-06:00 | 9.139343e+06 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101096527 | Central de Servicios P C, S. A.
6 | ocds-fnha3a-000005-2017 | 2017LA-000004-PROV | 2017-075744 | 301-041443-2017-AP4374-GOB-17 | 2017-12-27T09:49:10-06:00 | 1.267950e+08 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101096527 | Central de Servicios P C, S. A.
7 | ocds-fnha3a-000005-2017 | 2017LA-000004-PROV | 2017-075745 | 301-041444-2017-AP4374-GOB-17 | 2017-12-27T09:49:10-06:00 | 1.267950e+08 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101096527 | Central de Servicios P C, S. A.
8 | ocds-fnha3a-000006-2017 | 2017LA-000005-PROV | 2017-072799 | 301-030022-2017-AP3021-GOB-17 | 2017-10-13T14:58:08-06:00 | 1.470508e+08 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101187066 | Asesorías Asepro de Centroamérica S. A.
9 | ocds-fnha3a-000006-2017 | 2017LA-000005-PROV | 2017-072800 | 301-030022-2017-AP3021-GOB-17 | 2017-10-13T14:58:08-06:00 | 1.470508e+08 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 3101187066 | Asesorías Asepro de Centroamérica S. A.

Last rows

(index) | ocid | id | contracts/0/id | contracts/0/implementation/transactions/0/id | contracts/0/implementation/transactions/0/date | contracts/0/implementation/transactions/0/value/amount | contracts/0/implementation/transactions/0/value/currency | contracts/0/implementation/transactions/0/payer/id | contracts/0/implementation/transactions/0/payer/name | contracts/0/implementation/transactions/0/payee/id | contracts/0/implementation/transactions/0/payee/name
3792 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-028296-2018-AP3775-GOB-18 | 2018-09-05T00:00:00-06:00 | 1910388.96 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3793 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-030351-2019-AP5518-GOB-19 | 2019-09-12T00:00:00-06:00 | 953262.46 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3794 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-031803-2019-AP5876-GOB-19 | 2019-09-27T00:00:00-06:00 | 965581.16 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3795 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-036132-2018-AP5264-GOB-18 | 2018-11-07T00:00:00-06:00 | 2032254.91 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3796 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-036133-2018-AP5264-GOB-18 | 2018-11-07T00:00:00-06:00 | 2032254.91 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3797 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-038286-2019-AP7392-GOB-19 | 2019-11-01T00:00:00-06:00 | 937582.40 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3798 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-044028-2018-AP7065-GOB-18 | 2019-01-22T00:00:00-06:00 | 993521.16 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3799 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-045258-2019-AP9149-GOB-19 | 2019-12-06T00:00:00-06:00 | 946508.65 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3800 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-047743-2019-AP10122-GOB19 | 2019-12-17T00:00:00-06:00 | 941323.56 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez
3801 | ocds-fnha3a-003187-2017 | 2017LA-000074-PROV | 2018-027118 | 301-050026-2019-AP10416-GOB19 | 2020-01-23T00:00:00-06:00 | 940068.58 | CRC | 2-300-042155 | Corte Suprema de Justicia Poder Judicial | 800520213 | Edi Darío Velázquez Chávez