Dataset statistics
Number of variables | 15 |
---|---|
Number of observations | 45726 |
Missing cells | 29682 |
Missing cells (%) | 4.3% |
Duplicate rows | 10 |
Duplicate rows (%) | < 0.1% |
Total size in memory | 24.0 MiB |
Average record size in memory | 550.8 B |
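The missing-cell totals can be checked against the per-variable missing counts listed later in this report (7315 each for reclat, reclong, GeoLocation and reclat_city, 131 for mass (g), 291 for year). A minimal sketch of that arithmetic follows; the variable names are chosen for illustration only.

```python
# Per-variable missing counts as listed in this report.
missing_by_variable = {
    "reclat": 7315,
    "reclong": 7315,
    "GeoLocation": 7315,
    "reclat_city": 7315,
    "mass (g)": 131,
    "year": 291,
}

n_variables = 15
n_observations = 45726

missing_cells = sum(missing_by_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 29682
print(f"{missing_pct:.1f}%")  # 4.3%
```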
Variable types
Text | 3 |
---|---|
Numeric | 5 |
Categorical | 4 |
DateTime | 1 |
Boolean | 1 |
Unsupported | 1 |
source has constant value "NASA" | Constant |
Dataset has 10 (< 0.1%) duplicate rows | Duplicates |
reclat is highly overall correlated with reclong and 1 other fields | High correlation |
reclong is highly overall correlated with reclat and 1 other fields | High correlation |
reclat_city is highly overall correlated with reclat and 1 other fields | High correlation |
nametype is highly imbalanced (98.2%) | Imbalance |
fall is highly imbalanced (83.4%) | Imbalance |
reclat has 7315 (16.0%) missing values | Missing |
reclong has 7315 (16.0%) missing values | Missing |
GeoLocation has 7315 (16.0%) missing values | Missing |
reclat_city has 7315 (16.0%) missing values | Missing |
mass (g) is highly skewed (γ1 = 76.91847245) | Skewed |
unhashable is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
reclat has 6438 (14.1%) zeros | Zeros |
reclong has 6214 (13.6%) zeros | Zeros |
Reproduction
Analysis started | 2023-09-12 08:37:54.979373 |
---|---|
Analysis finished | 2023-09-12 08:38:01.952463 |
Duration | 6.97 seconds |
Software version | ydata-profiling v0.0.dev0 |
Download configuration | config.json |
name
Text
Distinct | 45716 |
---|---|
Distinct (%) | > 99.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 3.3 MiB |
Value | Count | Frequency (%) |
yamato | 7269 | 5.7% |
range | 6575 | 5.2% |
africa | 4502 | 3.6% |
northwest | 4499 | 3.5% |
hills | 3995 | 3.2% |
queen | 3445 | 2.7% |
alexandra | 3444 | 2.7% |
mountains | 3004 | 2.4% |
al | 2663 | 2.1% |
grove | 2496 | 2.0% |
Other values (37726) | 84860 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 81032 | 10.0% |
a | 72715 | 8.9% |
e | 48167 | 5.9% |
n | 38392 | 4.7% |
0 | 34943 | 4.3% |
r | 33097 | 4.1% |
i | 32658 | 4.0% |
l | 31873 | 3.9% |
t | 30898 | 3.8% |
o | 30428 | 3.7% |
Other values (86) | 378919 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 440949 | |
Decimal Number | 205415 | |
Uppercase Letter | 84942 | 10.4% |
Space Separator | 81032 | 10.0% |
Close Punctuation | 295 | < 0.1% |
Open Punctuation | 295 | < 0.1% |
Dash Punctuation | 98 | < 0.1% |
Other Punctuation | 96 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
a | 72715 | |
e | 48167 | |
n | 38392 | |
r | 33097 | 7.5% |
i | 32658 | 7.4% |
l | 31873 | 7.2% |
t | 30898 | 7.0% |
o | 30428 | 6.9% |
s | 20972 | 4.8% |
m | 12393 | 2.8% |
Other values (39) | 89356 |
Uppercase Letter
Value | Count | Frequency (%) |
A | 14120 | |
M | 11173 | |
R | 7599 | |
Y | 7327 | |
N | 5796 | 6.8% |
H | 5676 | 6.7% |
G | 4682 | 5.5% |
L | 4630 | 5.5% |
D | 3777 | 4.4% |
Q | 3478 | 4.1% |
Other values (21) | 16684 |
Decimal Number
Value | Count | Frequency (%) |
0 | 34943 | |
9 | 24444 | |
8 | 22179 | |
1 | 21986 | |
2 | 19839 | |
7 | 19347 | |
3 | 17379 | |
4 | 16001 | |
5 | 14812 | |
6 | 14485 |
Other Punctuation
Value | Count | Frequency (%) |
' | 67 | |
. | 29 |
Space Separator
Value | Count | Frequency (%) |
(space) | 81032 |
Close Punctuation
Value | Count | Frequency (%) |
) | 295 |
Open Punctuation
Value | Count | Frequency (%) |
( | 295 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 98 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 525891 | |
Common | 287231 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
a | 72715 | |
e | 48167 | 9.2% |
n | 38392 | 7.3% |
r | 33097 | 6.3% |
i | 32658 | 6.2% |
l | 31873 | 6.1% |
t | 30898 | 5.9% |
o | 30428 | 5.8% |
s | 20972 | 4.0% |
A | 14120 | 2.7% |
Other values (70) | 172571 |
Common
Value | Count | Frequency (%) |
(space) | 81032 | |
0 | 34943 | |
9 | 24444 | 8.5% |
8 | 22179 | 7.7% |
1 | 21986 | 7.7% |
2 | 19839 | 6.9% |
7 | 19347 | 6.7% |
3 | 17379 | 6.1% |
4 | 16001 | 5.6% |
5 | 14812 | 5.2% |
Other values (6) | 15269 | 5.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 812638 | |
None | 484 | 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 81032 | 10.0% |
a | 72715 | 8.9% |
e | 48167 | 5.9% |
n | 38392 | 4.7% |
0 | 34943 | 4.3% |
r | 33097 | 4.1% |
i | 32658 | 4.0% |
l | 31873 | 3.9% |
t | 30898 | 3.8% |
o | 30428 | 3.7% |
Other values (58) | 378435 |
None
Value | Count | Frequency (%) |
é | 204 | |
ş | 125 | |
Ö | 63 | 13.0% |
á | 11 | 2.3% |
ö | 11 | 2.3% |
ä | 10 | 2.1% |
ó | 8 | 1.7% |
ü | 8 | 1.7% |
ñ | 8 | 1.7% |
ã | 5 | 1.0% |
Other values (18) | 31 | 6.4% |
id
Real number (ℝ)
Distinct | 45716 |
---|---|
Distinct (%) | > 99.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 26883.906 |
Minimum | 1 |
---|---|
Maximum | 57458 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 357.4 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 2388.75 |
Q1 | 12681.25 |
median | 24256.5 |
Q3 | 40653.5 |
95-th percentile | 54890.75 |
Maximum | 57458 |
Range | 57457 |
Interquartile range (IQR) | 27972.25 |
Descriptive statistics
Standard deviation | 16863.446 |
---|---|
Coefficient of variation (CV) | 0.62726917 |
Kurtosis | -1.1601308 |
Mean | 26883.906 |
Median Absolute Deviation (MAD) | 13264 |
Skewness | 0.26653007 |
Sum | 1.2292935 × 10^9 |
Variance | 2.843758 × 10^8 |
Monotonicity | Not monotonic |
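Several of the figures above are derived from others in the same tables, which gives a quick consistency check. The sketch below recomputes the range, IQR, coefficient of variation and variance for id from the reported minimum, maximum, quartiles, mean and standard deviation; small differences are expected because the inputs are rounded.

```python
# Values copied from the id tables above.
q1, q3 = 12681.25, 40653.5
minimum, maximum = 1, 57458
mean, std = 26883.906, 16863.446

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range)         # 57457
print(iqr)                 # 27972.25
print(round(cv, 4))        # 0.6273
print(f"{variance:.4e}")   # 2.8438e+08
```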
Value | Count | Frequency (%) |
1 | 2 | < 0.1% |
6 | 2 | < 0.1% |
10 | 2 | < 0.1% |
370 | 2 | < 0.1% |
379 | 2 | < 0.1% |
390 | 2 | < 0.1% |
392 | 2 | < 0.1% |
398 | 2 | < 0.1% |
417 | 2 | < 0.1% |
2 | 2 | < 0.1% |
Other values (45706) | 45706 |
Value | Count | Frequency (%) |
1 | 2 | |
2 | 2 | |
4 | 1 | |
5 | 1 | |
6 | 2 | |
7 | 1 | |
8 | 1 | |
9 | 1 | |
10 | 2 | |
11 | 1 |
Value | Count | Frequency (%) |
57458 | 1 | |
57457 | 1 | |
57456 | 1 | |
57455 | 1 | |
57454 | 1 | |
57453 | 1 | |
57436 | 1 | |
57435 | 1 | |
57434 | 1 | |
57433 | 1 |
nametype
Categorical
Alerts: Imbalance
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.7 MiB |
Common Values
Value | Count | Frequency (%) |
Valid | 45651 | |
Relict | 75 | 0.2% |
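The overview reports nametype as 98.2% imbalanced and fall as 83.4% imbalanced. Both figures are consistent with one minus the normalized Shannon entropy of the class counts. That formula is an assumption, checked only against the counts in this report and not taken from the tool's documentation; `imbalance_score` is a hypothetical helper.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / log(number of classes)).

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # hypothetical convention for a single-class column
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(n_classes)

nametype = imbalance_score([45651, 75])  # Valid, Relict
fall = imbalance_score([44609, 1117])    # Found, Fell
print(f"{nametype:.1%}")  # 98.2%
print(f"{fall:.1%}")      # 83.4%
```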
Length
Common Values (Plot)
Value | Count | Frequency (%) |
valid | 45651 | |
relict | 75 | 0.2% |
Most occurring characters
Value | Count | Frequency (%) |
l | 45726 | |
i | 45726 | |
V | 45651 | |
a | 45651 | |
d | 45651 | |
R | 75 | < 0.1% |
e | 75 | < 0.1% |
c | 75 | < 0.1% |
t | 75 | < 0.1% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 182979 | |
Uppercase Letter | 45726 | 20.0% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
l | 45726 | |
i | 45726 | |
a | 45651 | |
d | 45651 | |
e | 75 | < 0.1% |
c | 75 | < 0.1% |
t | 75 | < 0.1% |
Uppercase Letter
Value | Count | Frequency (%) |
V | 45651 | |
R | 75 | 0.2% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 228705 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
l | 45726 | |
i | 45726 | |
V | 45651 | |
a | 45651 | |
d | 45651 | |
R | 75 | < 0.1% |
e | 75 | < 0.1% |
c | 75 | < 0.1% |
t | 75 | < 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 228705 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
l | 45726 | |
i | 45726 | |
V | 45651 | |
a | 45651 | |
d | 45651 | |
R | 75 | < 0.1% |
e | 75 | < 0.1% |
c | 75 | < 0.1% |
t | 75 | < 0.1% |
recclass
Text
Distinct | 466 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.6 MiB |
Value | Count | Frequency (%) |
l6 | 8341 | |
h5 | 7165 | |
l5 | 4818 | |
h6 | 4530 | |
h4 | 4223 | 8.9% |
ll5 | 2766 | 5.8% |
ll6 | 2046 | 4.3% |
l4 | 1256 | 2.7% |
iron | 1070 | 2.3% |
h4/5 | 428 | 0.9% |
Other values (434) | 10712 |
Most occurring characters
Value | Count | Frequency (%) |
L | 28467 | |
H | 18396 | |
5 | 16419 | |
6 | 16132 | |
4 | 6930 | 5.0% |
e | 3972 | 2.8% |
i | 3834 | 2.7% |
r | 3648 | 2.6% |
t | 3327 | 2.4% |
3 | 3278 | 2.3% |
Other values (52) | 35177 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 57793 | |
Decimal Number | 44118 | |
Lowercase Letter | 29926 | |
Other Punctuation | 3293 | 2.4% |
Dash Punctuation | 1835 | 1.3% |
Space Separator | 1747 | 1.3% |
Math Symbol | 320 | 0.2% |
Open Punctuation | 274 | 0.2% |
Close Punctuation | 274 | 0.2% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 3972 | |
i | 3834 | |
r | 3648 | |
t | 3327 | |
n | 2520 | |
o | 2458 | |
c | 1767 | 5.9% |
u | 1469 | 4.9% |
a | 1409 | 4.7% |
l | 1016 | 3.4% |
Other values (12) | 4506 |
Uppercase Letter
Value | Count | Frequency (%) |
L | 28467 | |
H | 18396 | |
I | 2753 | 4.8% |
C | 1785 | 3.1% |
E | 1261 | 2.2% |
A | 985 | 1.7% |
M | 913 | 1.6% |
B | 754 | 1.3% |
O | 542 | 0.9% |
V | 350 | 0.6% |
Other values (10) | 1587 | 2.7% |
Decimal Number
Value | Count | Frequency (%) |
5 | 16419 | |
6 | 16132 | |
4 | 6930 | |
3 | 3278 | 7.4% |
2 | 646 | 1.5% |
7 | 251 | 0.6% |
8 | 216 | 0.5% |
9 | 111 | 0.3% |
1 | 100 | 0.2% |
0 | 35 | 0.1% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 1174 | |
. | 1064 | |
, | 1031 | |
? | 24 | 0.7% |
Math Symbol
Value | Count | Frequency (%) |
~ | 319 | |
< | 1 | 0.3% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1835 |
Space Separator
Value | Count | Frequency (%) |
(space) | 1747 |
Open Punctuation
Value | Count | Frequency (%) |
( | 274 |
Close Punctuation
Value | Count | Frequency (%) |
) | 274 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 87719 | |
Common | 51861 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
L | 28467 | |
H | 18396 | |
e | 3972 | 4.5% |
i | 3834 | 4.4% |
r | 3648 | 4.2% |
t | 3327 | 3.8% |
I | 2753 | 3.1% |
n | 2520 | 2.9% |
o | 2458 | 2.8% |
C | 1785 | 2.0% |
Other values (32) | 16559 |
Common
Value | Count | Frequency (%) |
5 | 16419 | |
6 | 16132 | |
4 | 6930 | |
3 | 3278 | 6.3% |
- | 1835 | 3.5% |
(space) | 1747 | 3.4% |
/ | 1174 | 2.3% |
. | 1064 | 2.1% |
, | 1031 | 2.0% |
2 | 646 | 1.2% |
Other values (10) | 1605 | 3.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 139580 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
L | 28467 | |
H | 18396 | |
5 | 16419 | |
6 | 16132 | |
4 | 6930 | 5.0% |
e | 3972 | 2.8% |
i | 3834 | 2.7% |
r | 3648 | 2.6% |
t | 3327 | 2.4% |
3 | 3278 | 2.3% |
Other values (52) | 35177 |
mass (g)
Real number (ℝ)
Alerts: Skewed
Distinct | 12576 |
---|---|
Distinct (%) | 27.6% |
Missing | 131 |
Missing (%) | 0.3% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 13278.426 |
Minimum | 0 |
---|---|
Maximum | 60000000 |
Zeros | 19 |
Zeros (%) | < 0.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 357.4 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 1.1 |
Q1 | 7.2 |
median | 32.61 |
Q3 | 202.9 |
95-th percentile | 4000 |
Maximum | 60000000 |
Range | 60000000 |
Interquartile range (IQR) | 195.7 |
Descriptive statistics
Standard deviation | 574926.01 |
---|---|
Coefficient of variation (CV) | 43.297752 |
Kurtosis | 6798.3984 |
Mean | 13278.426 |
Median Absolute Deviation (MAD) | 30.51 |
Skewness | 76.918472 |
Sum | 6.0542985 × 10^8 |
Variance | 3.3053992 × 10^11 |
Monotonicity | Not monotonic |
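A skewness of about 76.9, alongside a median of 32.61 g and a maximum of 60,000,000 g, points to a heavy right tail. A log transform is one common way to compress such a tail before plotting or modelling. The sketch below uses the six quantile values reported above as a stand-in sample (not the real data) and a simple moment-based skewness, which may differ from the estimator used by the report. Dropping zero masses before taking logs is an assumption.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Stand-in sample: 5th percentile, Q1, median, Q3, 95th percentile, maximum (grams).
masses = [1.1, 7.2, 32.61, 202.9, 4000.0, 60_000_000.0]

# log10 is undefined at 0, so zero masses are excluded (an assumption).
log_masses = [math.log10(m) for m in masses if m > 0]

print(skewness(masses) > skewness(log_masses))  # True
```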
Value | Count | Frequency (%) |
1.3 | 171 | 0.4% |
1.2 | 140 | 0.3% |
1.4 | 138 | 0.3% |
2.1 | 130 | 0.3% |
2.4 | 126 | 0.3% |
1.6 | 120 | 0.3% |
0.5 | 119 | 0.3% |
1.1 | 116 | 0.3% |
3.8 | 114 | 0.2% |
1.5 | 111 | 0.2% |
Other values (12566) | 44310 | |
(Missing) | 131 | 0.3% |
Value | Count | Frequency (%) |
0 | 19 | |
0.01 | 2 | < 0.1% |
0.013 | 1 | < 0.1% |
0.02 | 1 | < 0.1% |
0.03 | 1 | < 0.1% |
0.04 | 1 | < 0.1% |
0.05 | 1 | < 0.1% |
0.06 | 1 | < 0.1% |
0.07 | 3 | < 0.1% |
0.08 | 2 | < 0.1% |
Value | Count | Frequency (%) |
60000000 | 1 | |
58200000 | 1 | |
50000000 | 1 | |
30000000 | 1 | |
28000000 | 1 | |
26000000 | 1 | |
24300000 | 1 | |
24000000 | 1 | |
23000000 | 1 | |
22000000 | 1 |
fall
Categorical
Alerts: Imbalance
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.7 MiB |
Common Values
Value | Count | Frequency (%) |
Found | 44609 | |
Fell | 1117 | 2.4% |
Length
Common Values (Plot)
Value | Count | Frequency (%) |
found | 44609 | |
fell | 1117 | 2.4% |
Most occurring characters
Value | Count | Frequency (%) |
F | 45726 | |
o | 44609 | |
u | 44609 | |
n | 44609 | |
d | 44609 | |
l | 2234 | 1.0% |
e | 1117 | 0.5% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 181787 | |
Uppercase Letter | 45726 | 20.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
o | 44609 | |
u | 44609 | |
n | 44609 | |
d | 44609 | |
l | 2234 | 1.2% |
e | 1117 | 0.6% |
Uppercase Letter
Value | Count | Frequency (%) |
F | 45726 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 227513 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
F | 45726 | |
o | 44609 | |
u | 44609 | |
n | 44609 | |
d | 44609 | |
l | 2234 | 1.0% |
e | 1117 | 0.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 227513 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
F | 45726 | |
o | 44609 | |
u | 44609 | |
n | 44609 | |
d | 44609 | |
l | 2234 | 1.0% |
e | 1117 | 0.5% |
year
Date
Distinct | 265 |
---|---|
Distinct (%) | 0.6% |
Missing | 291 |
Missing (%) | 0.6% |
Memory size | 357.4 KiB |
Minimum | 1970-01-01 00:00:00 |
---|---|
Maximum | 1970-01-01 00:00:00.000002 |
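A minimum of 1970-01-01 00:00:00 and a maximum only two microseconds later suggest that integer years were interpreted as nanoseconds since the Unix epoch. The sample rows are consistent with this: values such as 1970-01-01 00:00:00.000001880 end in digits that read as the year 1880. The sketch below recovers the year from such a string, assuming that interpretation is correct; `recover_year` is a hypothetical helper. The Minimum and Maximum above are printed at lower precision, so only full nanosecond strings can be decoded this way.

```python
def recover_year(timestamp: str) -> int:
    """Read the nanosecond offset of an epoch-based timestamp string as a year.

    Assumes the 'YYYY-MM-DD HH:MM:SS.fffffffff' layout seen in the sample
    rows, with the date and time parts fixed at the epoch.
    """
    date_time, _, fraction = timestamp.partition(".")
    if date_time != "1970-01-01 00:00:00":
        raise ValueError(f"not an epoch-offset timestamp: {timestamp!r}")
    # Pad to nine digits so shorter fractions are still read as nanoseconds.
    return int(fraction.ljust(9, "0")) if fraction else 0

print(recover_year("1970-01-01 00:00:00.000001880"))  # 1880
print(recover_year("1970-01-01 00:00:00.000001951"))  # 1951
```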
reclat
Real number (ℝ)
Alerts: High correlation, Missing, Zeros
Distinct | 12738 |
---|---|
Distinct (%) | 33.2% |
Missing | 7315 |
Missing (%) | 16.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | -39.107095 |
Minimum | -87.36667 |
---|---|
Maximum | 81.16667 |
Zeros | 6438 |
Zeros (%) | 14.1% |
Negative | 23416 |
Negative (%) | 51.2% |
Memory size | 357.4 KiB |
Quantile statistics
Minimum | -87.36667 |
---|---|
5-th percentile | -84.35476 |
Q1 | -76.71377 |
median | -71.5 |
Q3 | 0 |
95-th percentile | 34.494325 |
Maximum | 81.16667 |
Range | 168.53334 |
Interquartile range (IQR) | 76.71377 |
Descriptive statistics
Standard deviation | 46.386011 |
---|---|
Coefficient of variation (CV) | -1.1861278 |
Kurtosis | -1.4768651 |
Mean | -39.107095 |
Median Absolute Deviation (MAD) | 12.76459 |
Skewness | 0.49131573 |
Sum | -1502142.6 |
Variance | 2151.662 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 6438 | 14.1% |
-71.5 | 4761 | 10.4% |
-84 | 3040 | 6.6% |
-72 | 1506 | 3.3% |
-79.68333 | 1130 | 2.5% |
-76.71667 | 680 | 1.5% |
-76.18333 | 539 | 1.2% |
-84.21667 | 263 | 0.6% |
-86.36667 | 226 | 0.5% |
-86.71667 | 217 | 0.5% |
Other values (12728) | 19611 | |
(Missing) | 7315 | 16.0% |
Value | Count | Frequency (%) |
-87.36667 | 4 | < 0.1% |
-87.03333 | 3 | < 0.1% |
-86.93333 | 3 | < 0.1% |
-86.71667 | 217 | |
-86.56667 | 17 | < 0.1% |
-86.54488 | 1 | < 0.1% |
-86.5379 | 1 | < 0.1% |
-86.53734 | 1 | < 0.1% |
-86.53725 | 1 | < 0.1% |
-86.53035 | 1 | < 0.1% |
Value | Count | Frequency (%) |
81.16667 | 1 | |
76.53333 | 1 | |
76.13333 | 1 | |
72.88333 | 1 | |
72.68333 | 1 | |
70.73333 | 1 | |
70 | 1 | |
69.1 | 1 | |
68 | 1 | |
67.88333 | 1 |
reclong
Real number (ℝ)
Alerts: High correlation, Missing, Zeros
Distinct | 14640 |
---|---|
Distinct (%) | 38.1% |
Missing | 7315 |
Missing (%) | 16.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 61.052594 |
Minimum | -165.43333 |
---|---|
Maximum | 354.47333 |
Zeros | 6214 |
Zeros (%) | 13.6% |
Negative | 4057 |
Negative (%) | 8.9% |
Memory size | 357.4 KiB |
Quantile statistics
Minimum | -165.43333 |
---|---|
5-th percentile | -90.427 |
Q1 | 0 |
median | 35.66667 |
Q3 | 157.16667 |
95-th percentile | 168 |
Maximum | 354.47333 |
Range | 519.90666 |
Interquartile range (IQR) | 157.16667 |
Descriptive statistics
Standard deviation | 80.655258 |
---|---|
Coefficient of variation (CV) | 1.3210783 |
Kurtosis | -0.73139356 |
Mean | 61.052594 |
Median Absolute Deviation (MAD) | 39.53972 |
Skewness | -0.17438133 |
Sum | 2345091.2 |
Variance | 6505.2706 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 6214 | 13.6% |
35.66667 | 4985 | 10.9% |
168 | 3040 | 6.6% |
26 | 1506 | 3.3% |
159.75 | 657 | 1.4% |
159.66667 | 637 | 1.4% |
157.16667 | 542 | 1.2% |
155.75 | 473 | 1.0% |
160.5 | 263 | 0.6% |
-70 | 228 | 0.5% |
Other values (14630) | 19866 | |
(Missing) | 7315 | 16.0% |
Value | Count | Frequency (%) |
-165.43333 | 9 | |
-165.11667 | 17 | |
-163.16667 | 1 | < 0.1% |
-162.55 | 1 | < 0.1% |
-157.86667 | 1 | < 0.1% |
-157.78333 | 1 | < 0.1% |
-149.5 | 4 | < 0.1% |
-148.55 | 2 | < 0.1% |
-148 | 3 | < 0.1% |
-146.26667 | 1 | < 0.1% |
Value | Count | Frequency (%) |
354.47333 | 1 | < 0.1% |
178.2 | 1 | < 0.1% |
178.08333 | 1 | < 0.1% |
175.73028 | 1 | < 0.1% |
175.13333 | 1 | < 0.1% |
175 | 185 | |
174.50043 | 1 | < 0.1% |
174.4 | 1 | < 0.1% |
172.7 | 1 | < 0.1% |
172.6 | 1 | < 0.1% |
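The maximum of 354.47333 lies outside the conventional longitude range of -180 to 180 degrees. If that value follows a 0 to 360 convention (an assumption; it could equally be a data-entry error), it can be wrapped into the conventional range as sketched below. `wrap_longitude` is a hypothetical helper.

```python
def wrap_longitude(lon: float) -> float:
    """Map a longitude in degrees onto the half-open interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0

print(round(wrap_longitude(354.47333), 5))  # -5.52667
print(wrap_longitude(168.0))                # 168.0
```

Values already inside the range are returned unchanged, so the function can be applied to the whole column.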
GeoLocation
Text
Alerts: Missing
Distinct | 17100 |
---|---|
Distinct (%) | 44.5% |
Missing | 7315 |
Missing (%) | 16.0% |
Memory size | 2.9 MiB |
Length
Max length | 24 |
---|---|
Median length | 22 |
Mean length | 17.304809 |
Min length | 10 |
Characters and Unicode
Total characters | 664695 |
---|---|
Distinct characters | 16 |
Distinct categories | 6 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 16363 |
---|---|
Unique (%) | 42.6% |
Sample
1st row | (50.775, 6.08333) |
---|---|
2nd row | (56.18333, 10.23333) |
3rd row | (54.21667, -113.0) |
4th row | (16.88333, -99.9) |
5th row | (-33.16667, -64.95) |
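The sample values are strings holding a latitude and longitude pair. The sketch below parses such a string into floats with the standard library. It also treats an exact (0.0, 0.0) pair as missing, on the assumption that the zeros flagged for reclat (6438) and reclong (6214) are placeholders, not real find sites. `parse_geolocation` is a hypothetical helper.

```python
import ast
from typing import Optional, Tuple

def parse_geolocation(text: Optional[str]) -> Optional[Tuple[float, float]]:
    """Parse a '(lat, lon)' string into floats; return None when unusable."""
    if not text:
        return None
    lat, lon = ast.literal_eval(text)
    lat, lon = float(lat), float(lon)
    # Assumption: an exact (0.0, 0.0) pair is a placeholder, not a real location.
    if lat == 0.0 and lon == 0.0:
        return None
    return lat, lon

print(parse_geolocation("(50.775, 6.08333)"))  # (50.775, 6.08333)
print(parse_geolocation("(0.0, 0.0)"))         # None
print(parse_geolocation(None))                 # None
```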
Value | Count | Frequency (%) |
0.0 | 12652 | 16.5% |
35.66667 | 4991 | 6.5% |
71.5 | 4761 | 6.2% |
84.0 | 3041 | 4.0% |
168.0 | 3040 | 4.0% |
26.0 | 1512 | 2.0% |
72.0 | 1506 | 2.0% |
79.68333 | 1130 | 1.5% |
76.71667 | 680 | 0.9% |
159.75 | 657 | 0.9% |
Other values (26608) | 42852 |
Most occurring characters
Value | Count | Frequency (%) |
. | 76822 | |
6 | 67560 | 10.2% |
7 | 52499 | 7.9% |
0 | 49033 | 7.4% |
3 | 44771 | 6.7% |
1 | 44476 | 6.7% |
5 | 42757 | 6.4% |
( | 38411 | 5.8% |
, | 38411 | 5.8% |
(space) | 38411 | 5.8% |
Other values (6) | 171544 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 406756 | |
Other Punctuation | 115233 | 17.3% |
Open Punctuation | 38411 | 5.8% |
Space Separator | 38411 | 5.8% |
Close Punctuation | 38411 | 5.8% |
Dash Punctuation | 27473 | 4.1% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
6 | 67560 | |
7 | 52499 | |
0 | 49033 | |
3 | 44771 | |
1 | 44476 | |
5 | 42757 | |
8 | 32680 | |
2 | 29923 | |
4 | 23646 | 5.8% |
9 | 19411 | 4.8% |
Other Punctuation
Value | Count | Frequency (%) |
. | 76822 | |
, | 38411 |
Open Punctuation
Value | Count | Frequency (%) |
( | 38411 |
Space Separator
Value | Count | Frequency (%) |
(space) | 38411 |
Close Punctuation
Value | Count | Frequency (%) |
) | 38411 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 27473 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 664695 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
. | 76822 | |
6 | 67560 | 10.2% |
7 | 52499 | 7.9% |
0 | 49033 | 7.4% |
3 | 44771 | 6.7% |
1 | 44476 | 6.7% |
5 | 42757 | 6.4% |
( | 38411 | 5.8% |
, | 38411 | 5.8% |
(space) | 38411 | 5.8% |
Other values (6) | 171544 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 664695 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
. | 76822 | |
6 | 67560 | 10.2% |
7 | 52499 | 7.9% |
0 | 49033 | 7.4% |
3 | 44771 | 6.7% |
1 | 44476 | 6.7% |
5 | 42757 | 6.4% |
( | 38411 | 5.8% |
, | 38411 | 5.8% |
(space) | 38411 | 5.8% |
Other values (6) | 171544 |
source
Categorical
Alerts: Constant
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.7 MiB |
Common Values
Value | Count | Frequency (%) |
NASA | 45726 |
Length
Common Values (Plot)
Value | Count | Frequency (%) |
nasa | 45726 |
Most occurring characters
Value | Count | Frequency (%) |
A | 91452 | |
N | 45726 | |
S | 45726 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 182904 |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
A | 91452 | |
N | 45726 | |
S | 45726 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 182904 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
A | 91452 | |
N | 45726 | |
S | 45726 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 182904 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
A | 91452 | |
N | 45726 | |
S | 45726 |
boolean
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 44.8 KiB |
Value | Count | Frequency (%) |
True | 22934 | |
False | 22792 |
mixed
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 2.5 MiB |
Common Values
Value | Count | Frequency (%) |
A | 22889 | |
1 | 22837 |
Length
Common Values (Plot)
Value | Count | Frequency (%) |
a | 22889 | |
1 | 22837 |
Most occurring characters
Value | Count | Frequency (%) |
A | 22889 | |
1 | 22837 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 22889 | |
Decimal Number | 22837 |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
A | 22889 |
Decimal Number
Value | Count | Frequency (%) |
1 | 22837 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 22889 | |
Common | 22837 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
A | 22889 |
Common
Value | Count | Frequency (%) |
1 | 22837 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 45726 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
A | 22889 | |
1 | 22837 |
unhashable
Unsupported
Alerts: Rejected, Unsupported
Missing | 0 |
---|---|
Missing (%) | 0.0% |
Memory size | 2.4 MiB |
reclat_city
Real number (ℝ)
Alerts: High correlation, Missing
Distinct | 38401 |
---|---|
Distinct (%) | > 99.9% |
Missing | 7315 |
Missing (%) | 16.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | -39.153542 |
Minimum | -104.31717 |
---|---|
Maximum | 77.749011 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 26603 |
Negative (%) | 58.2% |
Memory size | 357.4 KiB |
Quantile statistics
Minimum | -104.31717 |
---|---|
5-th percentile | -87.871058 |
Q1 | -78.407752 |
median | -68.975293 |
Q3 | 4.7886449 |
95-th percentile | 35.42981 |
Maximum | 77.749011 |
Range | 182.06618 |
Interquartile range (IQR) | 83.196397 |
Descriptive statistics
Standard deviation | 46.685687 |
---|---|
Coefficient of variation (CV) | -1.1923745 |
Kurtosis | -1.446385 |
Mean | -39.153542 |
Median Absolute Deviation (MAD) | 17.255843 |
Skewness | 0.48160358 |
Sum | -1503926.7 |
Variance | 2179.5534 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
50.51806008 | 2 | < 0.1% |
43.27957156 | 2 | < 0.1% |
52.01104434 | 2 | < 0.1% |
-32.5810219 | 2 | < 0.1% |
49.60726921 | 2 | < 0.1% |
-29.65152821 | 2 | < 0.1% |
36.5165896 | 2 | < 0.1% |
-23.28864666 | 2 | < 0.1% |
23.16596589 | 2 | < 0.1% |
52.70663547 | 2 | < 0.1% |
Other values (38391) | 38391 | |
(Missing) | 7315 | 16.0% |
Value | Count | Frequency (%) |
-104.3171665 | 1 | |
-102.4312375 | 1 | |
-102.0868253 | 1 | |
-101.5556373 | 1 | |
-101.3269284 | 1 | |
-101.2084341 | 1 | |
-101.0146935 | 1 | |
-100.9191264 | 1 | |
-100.7856947 | 1 | |
-100.5751117 | 1 |
Value | Count | Frequency (%) |
77.74901083 | 1 | |
72.80622023 | 1 | |
72.75730423 | 1 | |
72.42607973 | 1 | |
72.25809595 | 1 | |
71.78938297 | 1 | |
71.42543169 | 1 | |
70.89755212 | 1 | |
70.53373183 | 1 | |
70.48523932 | 1 |
Correlations
Variable | id | mass (g) | reclat | reclong | reclat_city | nametype | fall | boolean | mixed |
---|---|---|---|---|---|---|---|---|---|
id | 1.000 | -0.142 | 0.261 | -0.316 | 0.219 | 0.130 | 0.126 | 0.000 | 0.009 |
mass (g) | -0.142 | 1.000 | 0.409 | -0.281 | 0.424 | 0.000 | 0.012 | 0.000 | 0.003 |
reclat | 0.261 | 0.409 | 1.000 | -0.650 | 0.943 | 0.349 | 0.450 | 0.000 | 0.013 |
reclong | -0.316 | -0.281 | -0.650 | 1.000 | -0.618 | 0.044 | 0.195 | 0.007 | 0.000 |
reclat_city | 0.219 | 0.424 | 0.943 | -0.618 | 1.000 | 0.379 | 0.424 | 0.015 | 0.000 |
nametype | 0.130 | 0.000 | 0.349 | 0.044 | 0.379 | 1.000 | 0.000 | 0.000 | 0.000 |
fall | 0.126 | 0.012 | 0.450 | 0.195 | 0.424 | 0.000 | 1.000 | 0.000 | 0.000 |
boolean | 0.000 | 0.000 | 0.000 | 0.007 | 0.015 | 0.000 | 0.000 | 1.000 | 0.000 |
mixed | 0.009 | 0.003 | 0.013 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 |
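The high-correlation alerts in the overview can be reproduced from this matrix by keeping pairs whose absolute coefficient reaches a cut-off. The sketch below uses the coefficients among the five numeric variables and a threshold of 0.5. That threshold is an assumption, chosen because it yields the same three pairs as the alerts; the largest coefficient involving a categorical variable is 0.450, so none of those pairs would qualify either.

```python
# Coefficients among the numeric variables, copied from the matrix above.
correlations = {
    ("id", "mass (g)"): -0.142,
    ("id", "reclat"): 0.261,
    ("id", "reclong"): -0.316,
    ("id", "reclat_city"): 0.219,
    ("mass (g)", "reclat"): 0.409,
    ("mass (g)", "reclong"): -0.281,
    ("mass (g)", "reclat_city"): 0.424,
    ("reclat", "reclong"): -0.650,
    ("reclat", "reclat_city"): 0.943,
    ("reclong", "reclat_city"): -0.618,
}

THRESHOLD = 0.5  # assumed cut-off, not confirmed by the report

high = sorted(pair for pair, r in correlations.items() if abs(r) >= THRESHOLD)
for pair in high:
    print(pair, correlations[pair])
```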
First rows
Index | name | id | nametype | recclass | mass (g) | fall | year | reclat | reclong | GeoLocation | source | boolean | mixed | unhashable | reclat_city |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Aachen | 1 | Valid | L5 | 21.0 | Fell | 1970-01-01 00:00:00.000001880 | 50.77500 | 6.08333 | (50.775, 6.08333) | NASA | True | 1 | [1] | 50.518060 |
1 | Aarhus | 2 | Valid | H6 | 720.0 | Fell | 1970-01-01 00:00:00.000001951 | 56.18333 | 10.23333 | (56.18333, 10.23333) | NASA | False | A | [1] | 52.011044 |
2 | Abee | 6 | Valid | EH4 | 107000.0 | Fell | 1970-01-01 00:00:00.000001952 | 54.21667 | -113.00000 | (54.21667, -113.0) | NASA | False | 1 | [1] | 52.706635 |
3 | Acapulco | 10 | Valid | Acapulcoite | 1914.0 | Fell | 1970-01-01 00:00:00.000001976 | 16.88333 | -99.90000 | (16.88333, -99.9) | NASA | False | A | [1] | 23.165966 |
4 | Achiras | 370 | Valid | L6 | 780.0 | Fell | 1970-01-01 00:00:00.000001902 | -33.16667 | -64.95000 | (-33.16667, -64.95) | NASA | False | A | [1] | -23.288647 |
5 | Adhi Kot | 379 | Valid | EH4 | 4239.0 | Fell | 1970-01-01 00:00:00.000001919 | 32.10000 | 71.80000 | (32.1, 71.8) | NASA | True | 1 | [1] | 36.516590 |
6 | Adzhi-Bogdo (stone) | 390 | Valid | LL3-6 | 910.0 | Fell | 1970-01-01 00:00:00.000001949 | 44.83333 | 95.16667 | (44.83333, 95.16667) | NASA | True | 1 | [1] | 43.279572 |
7 | Agen | 392 | Valid | H5 | 30000.0 | Fell | 1970-01-01 00:00:00.000001814 | 44.21667 | 0.61667 | (44.21667, 0.61667) | NASA | False | A | [1] | 49.607269 |
8 | Aguada | 398 | Valid | L6 | 1620.0 | Fell | 1970-01-01 00:00:00.000001930 | -31.60000 | -65.23333 | (-31.6, -65.23333) | NASA | False | 1 | [1] | -32.581022 |
9 | Aguila Blanca | 417 | Valid | L | 1440.0 | Fell | 1970-01-01 00:00:00.000001920 | -30.86667 | -64.55000 | (-30.86667, -64.55) | NASA | False | A | [1] | -29.651528 |
Last rows
Index | name | id | nametype | recclass | mass (g) | fall | year | reclat | reclong | GeoLocation | source | boolean | mixed | unhashable | reclat_city |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
45716 | Aachen | 1 | Valid | L5 | 21.0 | Fell | 1970-01-01 00:00:00.000001880 | 50.77500 | 6.08333 | (50.775, 6.08333) | NASA | True | 1 | [1] | 50.518060 |
45717 | Aarhus | 2 | Valid | H6 | 720.0 | Fell | 1970-01-01 00:00:00.000001951 | 56.18333 | 10.23333 | (56.18333, 10.23333) | NASA | False | A | [1] | 52.011044 |
45718 | Abee | 6 | Valid | EH4 | 107000.0 | Fell | 1970-01-01 00:00:00.000001952 | 54.21667 | -113.00000 | (54.21667, -113.0) | NASA | False | 1 | [1] | 52.706635 |
45719 | Acapulco | 10 | Valid | Acapulcoite | 1914.0 | Fell | 1970-01-01 00:00:00.000001976 | 16.88333 | -99.90000 | (16.88333, -99.9) | NASA | False | A | [1] | 23.165966 |
45720 | Achiras | 370 | Valid | L6 | 780.0 | Fell | 1970-01-01 00:00:00.000001902 | -33.16667 | -64.95000 | (-33.16667, -64.95) | NASA | False | A | [1] | -23.288647 |
45721 | Adhi Kot | 379 | Valid | EH4 | 4239.0 | Fell | 1970-01-01 00:00:00.000001919 | 32.10000 | 71.80000 | (32.1, 71.8) | NASA | True | 1 | [1] | 36.516590 |
45722 | Adzhi-Bogdo (stone) | 390 | Valid | LL3-6 | 910.0 | Fell | 1970-01-01 00:00:00.000001949 | 44.83333 | 95.16667 | (44.83333, 95.16667) | NASA | True | 1 | [1] | 43.279572 |
45723 | Agen | 392 | Valid | H5 | 30000.0 | Fell | 1970-01-01 00:00:00.000001814 | 44.21667 | 0.61667 | (44.21667, 0.61667) | NASA | False | A | [1] | 49.607269 |
45724 | Aguada | 398 | Valid | L6 | 1620.0 | Fell | 1970-01-01 00:00:00.000001930 | -31.60000 | -65.23333 | (-31.6, -65.23333) | NASA | False | 1 | [1] | -32.581022 |
45725 | Aguila Blanca | 417 | Valid | L | 1440.0 | Fell | 1970-01-01 00:00:00.000001920 | -30.86667 | -64.55000 | (-30.86667, -64.55) | NASA | False | A | [1] | -29.651528 |
Most frequently occurring
Index | name | id | nametype | recclass | mass (g) | fall | year | reclat | reclong | GeoLocation | source | boolean | mixed | reclat_city | # duplicates |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Aachen | 1 | Valid | L5 | 21.0 | Fell | 1970-01-01 00:00:00.000001880 | 50.77500 | 6.08333 | (50.775, 6.08333) | NASA | True | 1 | 50.518060 | 2 |
1 | Aarhus | 2 | Valid | H6 | 720.0 | Fell | 1970-01-01 00:00:00.000001951 | 56.18333 | 10.23333 | (56.18333, 10.23333) | NASA | False | A | 52.011044 | 2 |
2 | Abee | 6 | Valid | EH4 | 107000.0 | Fell | 1970-01-01 00:00:00.000001952 | 54.21667 | -113.00000 | (54.21667, -113.0) | NASA | False | 1 | 52.706635 | 2 |
3 | Acapulco | 10 | Valid | Acapulcoite | 1914.0 | Fell | 1970-01-01 00:00:00.000001976 | 16.88333 | -99.90000 | (16.88333, -99.9) | NASA | False | A | 23.165966 | 2 |
4 | Achiras | 370 | Valid | L6 | 780.0 | Fell | 1970-01-01 00:00:00.000001902 | -33.16667 | -64.95000 | (-33.16667, -64.95) | NASA | False | A | -23.288647 | 2 |
5 | Adhi Kot | 379 | Valid | EH4 | 4239.0 | Fell | 1970-01-01 00:00:00.000001919 | 32.10000 | 71.80000 | (32.1, 71.8) | NASA | True | 1 | 36.516590 | 2 |
6 | Adzhi-Bogdo (stone) | 390 | Valid | LL3-6 | 910.0 | Fell | 1970-01-01 00:00:00.000001949 | 44.83333 | 95.16667 | (44.83333, 95.16667) | NASA | True | 1 | 43.279572 | 2 |
7 | Agen | 392 | Valid | H5 | 30000.0 | Fell | 1970-01-01 00:00:00.000001814 | 44.21667 | 0.61667 | (44.21667, 0.61667) | NASA | False | A | 49.607269 | 2 |
8 | Aguada | 398 | Valid | L6 | 1620.0 | Fell | 1970-01-01 00:00:00.000001930 | -31.60000 | -65.23333 | (-31.6, -65.23333) | NASA | False | 1 | -32.581022 | 2 |
9 | Aguila Blanca | 417 | Valid | L | 1440.0 | Fell | 1970-01-01 00:00:00.000001920 | -30.86667 | -64.55000 | (-30.86667, -64.55) | NASA | False | A | -29.651528 | 2 |
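The duplicate table omits the unhashable column, whose values are lists such as [1] and therefore cannot be hashed directly. One way to detect duplicates across all columns is to convert list values to tuples before building a lookup key. The sketch below uses plain dictionaries and abbreviated sample rows; `row_key` and `drop_duplicates` are hypothetical helpers.

```python
def row_key(row: dict) -> tuple:
    """Build a hashable key, converting list values to tuples."""
    return tuple(
        (name, tuple(value) if isinstance(value, list) else value)
        for name, value in sorted(row.items())
    )

def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = row_key(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

rows = [
    {"name": "Aachen", "id": 1, "unhashable": [1]},
    {"name": "Aarhus", "id": 2, "unhashable": [1]},
    {"name": "Aachen", "id": 1, "unhashable": [1]},  # repeat of the first row
]
print(len(drop_duplicates(rows)))  # 2
```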