Overview

Dataset statistics

Number of variables: 15
Number of observations: 45726
Missing cells: 29682
Missing cells (%): 4.3%
Duplicate rows: 10
Duplicate rows (%): < 0.1%
Total size in memory: 24.7 MiB
Average record size in memory: 566.8 B
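
The percentages above follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows; the variable names are illustrative only:

```python
# Counts copied from the dataset statistics above.
n_variables = 15
n_observations = 45726
missing_cells = 29682
duplicate_rows = 10

# Assumption: percentages are taken over all cells and all rows respectively.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.3f}%")
```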

Variable types

Text: 3
Numeric: 5
Categorical: 4
DateTime: 1
Boolean: 1
Unsupported: 1

Alerts

source has constant value "NASA" [Constant]
Dataset has 10 (< 0.1%) duplicate rows [Duplicates]
nametype is highly imbalanced (98.2%) [Imbalance]
fall is highly imbalanced (83.4%) [Imbalance]
reclat has 7315 (16.0%) missing values [Missing]
reclong has 7315 (16.0%) missing values [Missing]
GeoLocation has 7315 (16.0%) missing values [Missing]
reclat_city has 7315 (16.0%) missing values [Missing]
mass (g) is highly skewed (γ1 = 76.91847245) [Skewed]
unhashable is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
reclat has 6438 (14.1%) zeros [Zeros]
reclong has 6214 (13.6%) zeros [Zeros]
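
The report does not say how the imbalance percentages are computed. One formula that reproduces both reported figures is 1 minus the Shannon entropy of the class shares, normalised by the log of the number of classes. The sketch below works under that assumption, using the class counts listed in the nametype and fall sections; `imbalance_score` is a hypothetical helper name, not a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

nametype_score = imbalance_score([45651, 75])   # Valid, Relict
fall_score = imbalance_score([44609, 1117])     # Found, Fell
print(f"nametype: {100 * nametype_score:.1f}%")
print(f"fall: {100 * fall_score:.1f}%")
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by one class approaches 1.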

Reproduction

Analysis started: 2024-03-18 18:23:16.492791
Analysis finished: 2024-03-18 18:23:22.133442
Duration: 5.64 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json

Variables

name
Text

Distinct: 45716
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 MiB

Length

Max length: 28
Median length: 25
Mean length: 17.782487
Min length: 2

Characters and Unicode

Total characters: 813122
Distinct characters: 96
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45706
Unique (%): > 99.9%

Sample

1st row: Aachen
2nd row: Aarhus
3rd row: Abee
4th row: Acapulco
5th row: Achiras

Value | Count | Frequency (%)
yamato | 7269 | 5.7%
range | 6575 | 5.2%
africa | 4502 | 3.6%
northwest | 4499 | 3.5%
hills | 3995 | 3.2%
queen | 3445 | 2.7%
alexandra | 3444 | 2.7%
mountains | 3004 | 2.4%
al | 2663 | 2.1%
grove | 2496 | 2.0%
Other values (37726) | 84860 | 66.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 81032 | 10.0%
a | 72715 | 8.9%
e | 48167 | 5.9%
n | 38392 | 4.7%
0 | 34943 | 4.3%
r | 33097 | 4.1%
i | 32658 | 4.0%
l | 31873 | 3.9%
t | 30898 | 3.8%
o | 30428 | 3.7%
Other values (86) | 378919 | 46.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 813122 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 813122 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 813122 | 100.0%

id
Real number (ℝ)

Distinct: 45716
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26883.906
Minimum: 1
Maximum: 57458
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2388.75
Q1: 12681.25
median: 24256.5
Q3: 40653.5
95-th percentile: 54890.75
Maximum: 57458
Range: 57457
Interquartile range (IQR): 27972.25

Descriptive statistics

Standard deviation: 16863.446
Coefficient of variation (CV): 0.62726917
Kurtosis: -1.1601308
Mean: 26883.906
Median Absolute Deviation (MAD): 13264
Skewness: 0.26653007
Sum: 1.2292935 × 10^9
Variance: 2.843758 × 10^8
Monotonicity: Not monotonic
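
Several of the statistics above are derived from others in the same block. The following sketch recomputes them from the reported values for id, as a consistency check rather than a description of how the report itself calculates them:

```python
# Values copied from the statistics reported for id.
mean = 26883.906
std = 16863.446
q1, q3 = 12681.25, 40653.5
minimum, maximum = 1, 57458

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance as the square of the standard deviation

print(cv, iqr, value_range, variance)
```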
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 2 | < 0.1%
6 | 2 | < 0.1%
10 | 2 | < 0.1%
370 | 2 | < 0.1%
379 | 2 | < 0.1%
390 | 2 | < 0.1%
392 | 2 | < 0.1%
398 | 2 | < 0.1%
417 | 2 | < 0.1%
2 | 2 | < 0.1%
Other values (45706) | 45706 | > 99.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 2 | < 0.1%
2 | 2 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 2 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 2 | < 0.1%
11 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
57458 | 1 | < 0.1%
57457 | 1 | < 0.1%
57456 | 1 | < 0.1%
57455 | 1 | < 0.1%
57454 | 1 | < 0.1%
57453 | 1 | < 0.1%
57436 | 1 | < 0.1%
57435 | 1 | < 0.1%
57434 | 1 | < 0.1%
57433 | 1 | < 0.1%

nametype
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB
Valid: 45651
Relict: 75

Length

Max length: 6
Median length: 5
Mean length: 5.0016402
Min length: 5

Characters and Unicode

Total characters: 228705
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Valid
2nd row: Valid
3rd row: Valid
4th row: Valid
5th row: Valid

Common Values

Value | Count | Frequency (%)
Valid | 45651 | 99.8%
Relict | 75 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
valid | 45651 | 99.8%
relict | 75 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
l | 45726 | 20.0%
i | 45726 | 20.0%
V | 45651 | 20.0%
a | 45651 | 20.0%
d | 45651 | 20.0%
R | 75 | < 0.1%
e | 75 | < 0.1%
c | 75 | < 0.1%
t | 75 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 228705 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 228705 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 228705 | 100.0%

recclass
Text

Distinct: 466
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Length

Max length: 26
Median length: 2
Mean length: 3.0525303
Min length: 1

Characters and Unicode

Total characters: 139580
Distinct characters: 62
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 145
Unique (%): 0.3%

Sample

1st row: L5
2nd row: H6
3rd row: EH4
4th row: Acapulcoite
5th row: L6

Value | Count | Frequency (%)
l6 | 8341 | 17.6%
h5 | 7165 | 15.1%
l5 | 4818 | 10.2%
h6 | 4530 | 9.6%
h4 | 4223 | 8.9%
ll5 | 2766 | 5.8%
ll6 | 2046 | 4.3%
l4 | 1256 | 2.7%
iron | 1070 | 2.3%
h4/5 | 428 | 0.9%
Other values (434) | 10712 | 22.6%

Most occurring characters

Value | Count | Frequency (%)
L | 28467 | 20.4%
H | 18396 | 13.2%
5 | 16419 | 11.8%
6 | 16132 | 11.6%
4 | 6930 | 5.0%
e | 3972 | 2.8%
i | 3834 | 2.7%
r | 3648 | 2.6%
t | 3327 | 2.4%
3 | 3278 | 2.3%
Other values (52) | 35177 | 25.2%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 139580 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 139580 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 139580 | 100.0%

mass (g)
Real number (ℝ)

SKEWED 

Distinct: 12576
Distinct (%): 27.6%
Missing: 131
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 13278.426
Minimum: 0
Maximum: 60000000
Zeros: 19
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 357.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.1
Q1: 7.2
median: 32.61
Q3: 202.9
95-th percentile: 4000
Maximum: 60000000
Range: 60000000
Interquartile range (IQR): 195.7

Descriptive statistics

Standard deviation: 574926.01
Coefficient of variation (CV): 43.297752
Kurtosis: 6798.3984
Mean: 13278.426
Median Absolute Deviation (MAD): 30.51
Skewness: 76.918472
Sum: 6.0542985 × 10^8
Variance: 3.3053992 × 10^11
Monotonicity: Not monotonic
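
The large positive skewness reflects a handful of very heavy meteorites against a median of about 33 g. The sketch below illustrates the effect on a tiny, purely illustrative sample (the quantile values listed above, not the real data) and shows how a log transform reduces the skew. It uses the plain population formula g1 = m3 / m2^1.5, which may differ from the estimator behind the reported figure.

```python
import math

def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Illustrative sample only: the quantile values reported for mass (g).
sample = [1.1, 7.2, 32.61, 202.9, 4000.0, 60000000.0]
raw_skew = skewness(sample)
log_skew = skewness([math.log1p(v) for v in sample])
print(raw_skew, log_skew)
```

Working with log1p(mass) rather than the raw mass is a common choice for heavy-tailed, non-negative quantities, since it also handles the zero values present in this column.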
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1.3 | 171 | 0.4%
1.2 | 140 | 0.3%
1.4 | 138 | 0.3%
2.1 | 130 | 0.3%
2.4 | 126 | 0.3%
1.6 | 120 | 0.3%
0.5 | 119 | 0.3%
1.1 | 116 | 0.3%
3.8 | 114 | 0.2%
1.5 | 111 | 0.2%
Other values (12566) | 44310 | 96.9%
(Missing) | 131 | 0.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 19 | < 0.1%
0.01 | 2 | < 0.1%
0.013 | 1 | < 0.1%
0.02 | 1 | < 0.1%
0.03 | 1 | < 0.1%
0.04 | 1 | < 0.1%
0.05 | 1 | < 0.1%
0.06 | 1 | < 0.1%
0.07 | 3 | < 0.1%
0.08 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
60000000 | 1 | < 0.1%
58200000 | 1 | < 0.1%
50000000 | 1 | < 0.1%
30000000 | 1 | < 0.1%
28000000 | 1 | < 0.1%
26000000 | 1 | < 0.1%
24300000 | 1 | < 0.1%
24000000 | 1 | < 0.1%
23000000 | 1 | < 0.1%
22000000 | 1 | < 0.1%

fall
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB
Found: 44609
Fell: 1117

Length

Max length: 5
Median length: 5
Mean length: 4.9755719
Min length: 4

Characters and Unicode

Total characters: 227513
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Fell
2nd row: Fell
3rd row: Fell
4th row: Fell
5th row: Fell

Common Values

Value | Count | Frequency (%)
Found | 44609 | 97.6%
Fell | 1117 | 2.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
found | 44609 | 97.6%
fell | 1117 | 2.4%

Most occurring characters

Value | Count | Frequency (%)
F | 45726 | 20.1%
o | 44609 | 19.6%
u | 44609 | 19.6%
n | 44609 | 19.6%
d | 44609 | 19.6%
l | 2234 | 1.0%
e | 1117 | 0.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 227513 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 227513 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 227513 | 100.0%

year
Date

Distinct: 265
Distinct (%): 0.6%
Missing: 291
Missing (%): 0.6%
Memory size: 357.4 KiB
Minimum: 1970-01-01 00:00:00
Maximum: 1970-01-01 00:00:00.000002
Histogram with fixed size bins (bins=50)
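
The minimum and maximum both fall within the first microseconds of 1970-01-01, and the sample rows show values such as 1970-01-01 00:00:00.000001880. One plausible reading, not confirmed by the report, is that integer years were interpreted as nanosecond offsets from the Unix epoch. Under that assumption, the year can be recovered from the fractional-second digits; `year_from_misparsed` is a hypothetical helper:

```python
def year_from_misparsed(timestamp: str) -> int:
    """Read the fractional-second digits as nanoseconds since the epoch."""
    _, _, fraction = timestamp.partition(".")
    # Pad to nine digits so shorter fractions are still read as nanoseconds.
    return int(fraction.ljust(9, "0")) if fraction else 0

print(year_from_misparsed("1970-01-01 00:00:00.000001880"))
```

Strings already rounded to microsecond precision, such as the reported maximum, no longer carry the last three digits, so only full nanosecond-precision values can be recovered exactly this way.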

reclat
Real number (ℝ)

MISSING  ZEROS 

Distinct: 12738
Distinct (%): 33.2%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -39.107095
Minimum: -87.36667
Maximum: 81.16667
Zeros: 6438
Zeros (%): 14.1%
Negative: 23416
Negative (%): 51.2%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -87.36667
5-th percentile: -84.35476
Q1: -76.71377
median: -71.5
Q3: 0
95-th percentile: 34.494325
Maximum: 81.16667
Range: 168.53334
Interquartile range (IQR): 76.71377

Descriptive statistics

Standard deviation: 46.386011
Coefficient of variation (CV): -1.1861278
Kurtosis: -1.4768651
Mean: -39.107095
Median Absolute Deviation (MAD): 12.76459
Skewness: 0.49131573
Sum: -1502142.6
Variance: 2151.662
Monotonicity: Not monotonic
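
Zero is the single most frequent latitude, and reclong shows a similar spike. A likely explanation, which the report itself does not confirm, is that the pair (0.0, 0.0) stands in for an unknown location. A minimal sketch under that assumption, with `clean_coordinates` as a hypothetical helper:

```python
from typing import Optional, Tuple

def clean_coordinates(
    lat: Optional[float], lon: Optional[float]
) -> Optional[Tuple[float, float]]:
    """Return None for missing values or the (0.0, 0.0) placeholder pair."""
    if lat is None or lon is None:
        return None
    # Assumption: a point at exactly (0, 0) is a placeholder, not a real find site.
    if lat == 0.0 and lon == 0.0:
        return None
    return (lat, lon)

print(clean_coordinates(0.0, 0.0))
print(clean_coordinates(50.775, 6.08333))
```

Rows where only one of the two coordinates is zero are kept, because a latitude or longitude of zero on its own can be genuine.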
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 6438 | 14.1%
-71.5 | 4761 | 10.4%
-84 | 3040 | 6.6%
-72 | 1506 | 3.3%
-79.68333 | 1130 | 2.5%
-76.71667 | 680 | 1.5%
-76.18333 | 539 | 1.2%
-84.21667 | 263 | 0.6%
-86.36667 | 226 | 0.5%
-86.71667 | 217 | 0.5%
Other values (12728) | 19611 | 42.9%
(Missing) | 7315 | 16.0%

Minimum 10 values

Value | Count | Frequency (%)
-87.36667 | 4 | < 0.1%
-87.03333 | 3 | < 0.1%
-86.93333 | 3 | < 0.1%
-86.71667 | 217 | 0.5%
-86.56667 | 17 | < 0.1%
-86.54488 | 1 | < 0.1%
-86.5379 | 1 | < 0.1%
-86.53734 | 1 | < 0.1%
-86.53725 | 1 | < 0.1%
-86.53035 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
81.16667 | 1 | < 0.1%
76.53333 | 1 | < 0.1%
76.13333 | 1 | < 0.1%
72.88333 | 1 | < 0.1%
72.68333 | 1 | < 0.1%
70.73333 | 1 | < 0.1%
70 | 1 | < 0.1%
69.1 | 1 | < 0.1%
68 | 1 | < 0.1%
67.88333 | 1 | < 0.1%

reclong
Real number (ℝ)

MISSING  ZEROS 

Distinct: 14640
Distinct (%): 38.1%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.052594
Minimum: -165.43333
Maximum: 354.47333
Zeros: 6214
Zeros (%): 13.6%
Negative: 4057
Negative (%): 8.9%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -165.43333
5-th percentile: -90.427
Q1: 0
median: 35.66667
Q3: 157.16667
95-th percentile: 168
Maximum: 354.47333
Range: 519.90666
Interquartile range (IQR): 157.16667

Descriptive statistics

Standard deviation: 80.655258
Coefficient of variation (CV): 1.3210783
Kurtosis: -0.73139356
Mean: 61.052594
Median Absolute Deviation (MAD): 39.53972
Skewness: -0.17438133
Sum: 2345091.2
Variance: 6505.2706
Monotonicity: Not monotonic
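
The maximum of 354.47333 lies outside the usual -180 to 180 degree range for longitude. If that value was recorded in a 0 to 360 degree convention (an assumption; it could equally be a data-entry error), it can be wrapped back into the conventional range. `wrap_longitude` is a hypothetical helper:

```python
def wrap_longitude(lon: float) -> float:
    """Map a longitude in degrees onto the interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0

print(wrap_longitude(354.47333))
```

Values already inside the conventional range pass through unchanged, up to floating-point rounding.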
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 6214 | 13.6%
35.66667 | 4985 | 10.9%
168 | 3040 | 6.6%
26 | 1506 | 3.3%
159.75 | 657 | 1.4%
159.66667 | 637 | 1.4%
157.16667 | 542 | 1.2%
155.75 | 473 | 1.0%
160.5 | 263 | 0.6%
-70 | 228 | 0.5%
Other values (14630) | 19866 | 43.4%
(Missing) | 7315 | 16.0%

Minimum 10 values

Value | Count | Frequency (%)
-165.43333 | 9 | < 0.1%
-165.11667 | 17 | < 0.1%
-163.16667 | 1 | < 0.1%
-162.55 | 1 | < 0.1%
-157.86667 | 1 | < 0.1%
-157.78333 | 1 | < 0.1%
-149.5 | 4 | < 0.1%
-148.55 | 2 | < 0.1%
-148 | 3 | < 0.1%
-146.26667 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
354.47333 | 1 | < 0.1%
178.2 | 1 | < 0.1%
178.08333 | 1 | < 0.1%
175.73028 | 1 | < 0.1%
175.13333 | 1 | < 0.1%
175 | 185 | 0.4%
174.50043 | 1 | < 0.1%
174.4 | 1 | < 0.1%
172.7 | 1 | < 0.1%
172.6 | 1 | < 0.1%

GeoLocation
Text

MISSING 

Distinct: 17100
Distinct (%): 44.5%
Missing: 7315
Missing (%): 16.0%
Memory size: 2.9 MiB

Length

Max length: 24
Median length: 22
Mean length: 17.304809
Min length: 10

Characters and Unicode

Total characters: 664695
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16363
Unique (%): 42.6%

Sample

1st row: (50.775, 6.08333)
2nd row: (56.18333, 10.23333)
3rd row: (54.21667, -113.0)
4th row: (16.88333, -99.9)
5th row: (-33.16667, -64.95)
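
GeoLocation is profiled as text because each cell is a string of the form "(lat, lon)". A minimal sketch of turning such a string into a numeric pair, assuming every non-missing value follows the pattern shown in the sample rows; `parse_geolocation` is a hypothetical helper:

```python
import ast

def parse_geolocation(text: str) -> tuple:
    """Parse a '(lat, lon)' string into a pair of floats."""
    lat, lon = ast.literal_eval(text)
    return (float(lat), float(lon))

print(parse_geolocation("(50.775, 6.08333)"))
```

`ast.literal_eval` only evaluates Python literals, so it is a safer choice here than `eval` for strings read from a file.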

Value | Count | Frequency (%)
0.0 | 12652 | 16.5%
35.66667 | 4991 | 6.5%
71.5 | 4761 | 6.2%
84.0 | 3041 | 4.0%
168.0 | 3040 | 4.0%
26.0 | 1512 | 2.0%
72.0 | 1506 | 2.0%
79.68333 | 1130 | 1.5%
76.71667 | 680 | 0.9%
159.75 | 657 | 0.9%
Other values (26608) | 42852 | 55.8%

Most occurring characters

Value | Count | Frequency (%)
. | 76822 | 11.6%
6 | 67560 | 10.2%
7 | 52499 | 7.9%
0 | 49033 | 7.4%
3 | 44771 | 6.7%
1 | 44476 | 6.7%
5 | 42757 | 6.4%
( | 38411 | 5.8%
, | 38411 | 5.8%
(space) | 38411 | 5.8%
Other values (6) | 171544 | 25.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 664695 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 664695 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 664695 | 100.0%

source
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB
NASA: 45726

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 182904
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NASA
2nd row: NASA
3rd row: NASA
4th row: NASA
5th row: NASA

Common Values

Value | Count | Frequency (%)
NASA | 45726 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
nasa | 45726 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
A | 91452 | 50.0%
N | 45726 | 25.0%
S | 45726 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 182904 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 182904 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 182904 | 100.0%

boolean
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 44.8 KiB

Value | Count | Frequency (%)
True | 22934 | 50.2%
False | 22792 | 49.8%

mixed
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB
A: 22889
1: 22837

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45726
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: A
3rd row: 1
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
A | 22889 | 50.1%
1 | 22837 | 49.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
a | 22889 | 50.1%
1 | 22837 | 49.9%

Most occurring characters

Value | Count | Frequency (%)
A | 22889 | 50.1%
1 | 22837 | 49.9%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 45726 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 45726 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 45726 | 100.0%

unhashable
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 3.1 MiB
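
The sample rows show list values such as [1] in this column. Python lists cannot be hashed, which is a plausible reason the column is rejected from frequency-based analysis. A minimal sketch of one workaround, converting each list to a tuple before counting; the `values` list is hypothetical data for illustration:

```python
from collections import Counter

# Hypothetical list-valued cells, similar to the [1] values in the sample rows.
values = [[1], [1], [1, 2], [1]]

try:
    hash(values[0])
except TypeError as exc:
    print("lists are unhashable:", exc)

# Tuples are hashable, so they can serve as keys when tallying values.
counts = Counter(tuple(v) for v in values)
print(counts.most_common())
```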

reclat_city
Real number (ℝ)

MISSING 

Distinct: 38401
Distinct (%): > 99.9%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -39.153542
Minimum: -104.31717
Maximum: 77.749011
Zeros: 0
Zeros (%): 0.0%
Negative: 26603
Negative (%): 58.2%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -104.31717
5-th percentile: -87.871058
Q1: -78.407752
median: -68.975293
Q3: 4.7886449
95-th percentile: 35.42981
Maximum: 77.749011
Range: 182.06618
Interquartile range (IQR): 83.196397

Descriptive statistics

Standard deviation: 46.685687
Coefficient of variation (CV): -1.1923745
Kurtosis: -1.446385
Mean: -39.153542
Median Absolute Deviation (MAD): 17.255843
Skewness: 0.48160358
Sum: -1503926.7
Variance: 2179.5534
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
50.51806008 | 2 | < 0.1%
43.27957156 | 2 | < 0.1%
52.01104434 | 2 | < 0.1%
-32.5810219 | 2 | < 0.1%
49.60726921 | 2 | < 0.1%
-29.65152821 | 2 | < 0.1%
36.5165896 | 2 | < 0.1%
-23.28864666 | 2 | < 0.1%
23.16596589 | 2 | < 0.1%
52.70663547 | 2 | < 0.1%
Other values (38391) | 38391 | 84.0%
(Missing) | 7315 | 16.0%

Minimum 10 values

Value | Count | Frequency (%)
-104.3171665 | 1 | < 0.1%
-102.4312375 | 1 | < 0.1%
-102.0868253 | 1 | < 0.1%
-101.5556373 | 1 | < 0.1%
-101.3269284 | 1 | < 0.1%
-101.2084341 | 1 | < 0.1%
-101.0146935 | 1 | < 0.1%
-100.9191264 | 1 | < 0.1%
-100.7856947 | 1 | < 0.1%
-100.5751117 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
77.74901083 | 1 | < 0.1%
72.80622023 | 1 | < 0.1%
72.75730423 | 1 | < 0.1%
72.42607973 | 1 | < 0.1%
72.25809595 | 1 | < 0.1%
71.78938297 | 1 | < 0.1%
71.42543169 | 1 | < 0.1%
70.89755212 | 1 | < 0.1%
70.53373183 | 1 | < 0.1%
70.48523932 | 1 | < 0.1%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index | name | id | nametype | recclass | mass (g) | fall | year | reclat | reclong | GeoLocation | source | boolean | mixed | unhashable | reclat_city
0 | Aachen | 1 | Valid | L5 | 21.0 | Fell | 1970-01-01 00:00:00.000001880 | 50.77500 | 6.08333 | (50.775, 6.08333) | NASA | True | 1 | [1] | 50.518060
1 | Aarhus | 2 | Valid | H6 | 720.0 | Fell | 1970-01-01 00:00:00.000001951 | 56.18333 | 10.23333 | (56.18333, 10.23333) | NASA | False | A | [1] | 52.011044
2 | Abee | 6 | Valid | EH4 | 107000.0 | Fell | 1970-01-01 00:00:00.000001952 | 54.21667 | -113.00000 | (54.21667, -113.0) | NASA | False | 1 | [1] | 52.706635
3 | Acapulco | 10 | Valid | Acapulcoite | 1914.0 | Fell | 1970-01-01 00:00:00.000001976 | 16.88333 | -99.90000 | (16.88333, -99.9) | NASA | False | A | [1] | 23.165966
4 | Achiras | 370 | Valid | L6 | 780.0 | Fell | 1970-01-01 00:00:00.000001902 | -33.16667 | -64.95000 | (-33.16667, -64.95) | NASA | False | A | [1] | -23.288647
5 | Adhi Kot | 379 | Valid | EH4 | 4239.0 | Fell | 1970-01-01 00:00:00.000001919 | 32.10000 | 71.80000 | (32.1, 71.8) | NASA | True | 1 | [1] | 36.516590
6 | Adzhi-Bogdo (stone) | 390 | Valid | LL3-6 | 910.0 | Fell | 1970-01-01 00:00:00.000001949 | 44.83333 | 95.16667 | (44.83333, 95.16667) | NASA | True | 1 | [1] | 43.279572
7 | Agen | 392 | Valid | H5 | 30000.0 | Fell | 1970-01-01 00:00:00.000001814 | 44.21667 | 0.61667 | (44.21667, 0.61667) | NASA | False | A | [1] | 49.607269
8 | Aguada | 398 | Valid | L6 | 1620.0 | Fell | 1970-01-01 00:00:00.000001930 | -31.60000 | -65.23333 | (-31.6, -65.23333) | NASA | False | 1 | [1] | -32.581022
9 | Aguila Blanca | 417 | Valid | L | 1440.0 | Fell | 1970-01-01 00:00:00.000001920 | -30.86667 | -64.55000 | (-30.86667, -64.55) | NASA | False | A | [1] | -29.651528

Last rows

index | name | id | nametype | recclass | mass (g) | fall | year | reclat | reclong | GeoLocation | source | boolean | mixed | unhashable | reclat_city
45716 | Aachen | 1 | Valid | L5 | 21.0 | Fell | 1970-01-01 00:00:00.000001880 | 50.77500 | 6.08333 | (50.775, 6.08333) | NASA | True | 1 | [1] | 50.518060
45717 | Aarhus | 2 | Valid | H6 | 720.0 | Fell | 1970-01-01 00:00:00.000001951 | 56.18333 | 10.23333 | (56.18333, 10.23333) | NASA | False | A | [1] | 52.011044
45718 | Abee | 6 | Valid | EH4 | 107000.0 | Fell | 1970-01-01 00:00:00.000001952 | 54.21667 | -113.00000 | (54.21667, -113.0) | NASA | False | 1 | [1] | 52.706635
45719 | Acapulco | 10 | Valid | Acapulcoite | 1914.0 | Fell | 1970-01-01 00:00:00.000001976 | 16.88333 | -99.90000 | (16.88333, -99.9) | NASA | False | A | [1] | 23.165966
45720 | Achiras | 370 | Valid | L6 | 780.0 | Fell | 1970-01-01 00:00:00.000001902 | -33.16667 | -64.95000 | (-33.16667, -64.95) | NASA | False | A | [1] | -23.288647
45721 | Adhi Kot | 379 | Valid | EH4 | 4239.0 | Fell | 1970-01-01 00:00:00.000001919 | 32.10000 | 71.80000 | (32.1, 71.8) | NASA | True | 1 | [1] | 36.516590
45722 | Adzhi-Bogdo (stone) | 390 | Valid | LL3-6 | 910.0 | Fell | 1970-01-01 00:00:00.000001949 | 44.83333 | 95.16667 | (44.83333, 95.16667) | NASA | True | 1 | [1] | 43.279572
45723 | Agen | 392 | Valid | H5 | 30000.0 | Fell | 1970-01-01 00:00:00.000001814 | 44.21667 | 0.61667 | (44.21667, 0.61667) | NASA | False | A | [1] | 49.607269
45724 | Aguada | 398 | Valid | L6 | 1620.0 | Fell | 1970-01-01 00:00:00.000001930 | -31.60000 | -65.23333 | (-31.6, -65.23333) | NASA | False | 1 | [1] | -32.581022
45725 | Aguila Blanca | 417 | Valid | L | 1440.0 | Fell | 1970-01-01 00:00:00.000001920 | -30.86667 | -64.55000 | (-30.86667, -64.55) | NASA | False | A | [1] | -29.651528

Duplicate rows

Most frequently occurring

index | name | id | nametype | recclass | mass (g) | fall | year | reclat | reclong | GeoLocation | source | boolean | mixed | reclat_city | # duplicates
0 | Aachen | 1 | Valid | L5 | 21.0 | Fell | 1970-01-01 00:00:00.000001880 | 50.77500 | 6.08333 | (50.775, 6.08333) | NASA | True | 1 | 50.518060 | 2
1 | Aarhus | 2 | Valid | H6 | 720.0 | Fell | 1970-01-01 00:00:00.000001951 | 56.18333 | 10.23333 | (56.18333, 10.23333) | NASA | False | A | 52.011044 | 2
2 | Abee | 6 | Valid | EH4 | 107000.0 | Fell | 1970-01-01 00:00:00.000001952 | 54.21667 | -113.00000 | (54.21667, -113.0) | NASA | False | 1 | 52.706635 | 2
3 | Acapulco | 10 | Valid | Acapulcoite | 1914.0 | Fell | 1970-01-01 00:00:00.000001976 | 16.88333 | -99.90000 | (16.88333, -99.9) | NASA | False | A | 23.165966 | 2
4 | Achiras | 370 | Valid | L6 | 780.0 | Fell | 1970-01-01 00:00:00.000001902 | -33.16667 | -64.95000 | (-33.16667, -64.95) | NASA | False | A | -23.288647 | 2
5 | Adhi Kot | 379 | Valid | EH4 | 4239.0 | Fell | 1970-01-01 00:00:00.000001919 | 32.10000 | 71.80000 | (32.1, 71.8) | NASA | True | 1 | 36.516590 | 2
6 | Adzhi-Bogdo (stone) | 390 | Valid | LL3-6 | 910.0 | Fell | 1970-01-01 00:00:00.000001949 | 44.83333 | 95.16667 | (44.83333, 95.16667) | NASA | True | 1 | 43.279572 | 2
7 | Agen | 392 | Valid | H5 | 30000.0 | Fell | 1970-01-01 00:00:00.000001814 | 44.21667 | 0.61667 | (44.21667, 0.61667) | NASA | False | A | 49.607269 | 2
8 | Aguada | 398 | Valid | L6 | 1620.0 | Fell | 1970-01-01 00:00:00.000001930 | -31.60000 | -65.23333 | (-31.6, -65.23333) | NASA | False | 1 | -32.581022 | 2
9 | Aguila Blanca | 417 | Valid | L | 1440.0 | Fell | 1970-01-01 00:00:00.000001920 | -30.86667 | -64.55000 | (-30.86667, -64.55) | NASA | False | A | -29.651528 | 2
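
The header of the table above omits the unhashable column, which suggests that duplicates were detected on the remaining columns only. The sketch below counts repeated rows under that assumption; `duplicate_counts` and the three example rows are hypothetical and carry only a few of the real columns:

```python
from collections import Counter

def duplicate_counts(rows, skip=("unhashable",)):
    """Count identical rows, ignoring the columns named in skip."""
    keys = [
        tuple((k, v) for k, v in sorted(row.items()) if k not in skip)
        for row in rows
    ]
    # Keep only rows that occur more than once.
    return {key: n for key, n in Counter(keys).items() if n > 1}

rows = [
    {"name": "Aachen", "id": 1, "unhashable": [1]},
    {"name": "Aarhus", "id": 2, "unhashable": [1]},
    {"name": "Aachen", "id": 1, "unhashable": [1]},
]
dups = duplicate_counts(rows)
print(dups)
```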