Overview

Dataset statistics

Number of variables: 5
Number of observations: 189
Missing cells: 188
Missing cells (%): 19.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.5 KiB
Average record size in memory: 40.7 B
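
The missing-cell percentage follows from the counts listed here: cells are variables times observations. A minimal sketch of that arithmetic, with illustrative variable names that are not taken from the profiling tool:

```python
# Figures copied from the dataset statistics in this report.
n_variables = 5
n_observations = 189
missing_cells = 188

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a fraction of all cells.
missing_pct = missing_cells / total_cells

print(total_cells)            # 945
print(f"{missing_pct:.1%}")   # 19.9%
```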

Variable types

URL: 1
Categorical: 2
DateTime: 1
Text: 1

Alerts

notes has constant value "" (Constant)
source is highly imbalanced (81.6%) (Imbalance)
notes has 188 (99.5%) missing values (Missing)
url has unique values (Unique)

Reproduction

Analysis started: 2024-03-18 18:33:24.197943
Analysis finished: 2024-03-18 18:33:24.542135
Duration: 0.34 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
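
The reported duration is consistent with the two timestamps. As a small check, assuming both are in the same time zone, the difference can be computed with the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-03-18 18:33:24.197943")
finished = datetime.fromisoformat("2024-03-18 18:33:24.542135")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} seconds")  # 0.34 seconds
```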

Variables

url
URL

UNIQUE 

Distinct: 189
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
Value  Count  Frequency (%)
http://abrahadesta.wordpress.com/  1  0.5%
http://www.ocha-eth.org/  1  0.5%
http://www.medhin.org/  1  0.5%
http://www.mediaethiopia.com/  1  0.5%
http://www.mediaethiopia.com/blog/  1  0.5%
http://www.mereja.com/  1  0.5%
http://www.mesfinwoldemariam.org/  1  0.5%
http://www.meskelsquare.com/  1  0.5%
http://www.nazret.com/  1  0.5%
http://www.nazret.com/news/view_amharic.php?feed=5&how=paged&what=all  1  0.5%
Other values (179)  179  94.7%

Scheme

Value  Count  Frequency (%)
http   173    91.5%
https  16     8.5%

Netloc

Value                Count  Frequency (%)
nazret.com           8      4.2%
www.cafpde.org       3      1.6%
www.hrw.org          3      1.6%
www.ethpress.gov.et  2      1.1%
web.worldbank.org    2      1.1%
www.tzta.ca          2      1.1%
www.twitter.com      2      1.1%
www.aeup.org         2      1.1%
www.aigaforum.com    2      1.1%
www.torproject.org   2      1.1%
Other values (134)   161    85.2%

Path

Value  Count  Frequency (%)
/  127  67.2%
/blog/index.php  7  3.7%
/index.html  2  1.1%
/index.htm  2  1.1%
/tzta/english.htm  1  0.5%
/doc  1  0.5%
/ethiopia/  1  0.5%
/public/english/region/afpro/addisababa/ethiopia.htm  1  0.5%
/external/country/ETH/index.htm  1  0.5%
/research-publications/speaksafe-media-workers-toolkit-safer-online-and-mobile-practices  1  0.5%
Other values (45)  45  23.8%

Query

Value                      Count  Frequency (%)
(empty)                    174    92.1%
blog=12                    1      0.5%
blog=13                    1      0.5%
blog=14                    1      0.5%
blog=15                    1      0.5%
blog=16                    1      0.5%
blog=7                     1      0.5%
blog=9                     1      0.5%
c=ethiop&t=africa          1      0.5%
feed=5&how=paged&what=all  1      0.5%
Other values (6)           6      3.2%

Fragment

Value     Count  Frequency (%)
(empty)   188    99.5%
ethiopia  1      0.5%
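
The per-component tables for url (scheme, network location, path, query, fragment) correspond to the standard way of splitting a URL. The sketch below shows one way to produce such counts with Python's urllib.parse. It is an illustration using four URLs copied from this report, not the profiling tool's own code, and the helper name component_counts is hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

# A few URLs copied from this report's value tables and sample rows.
urls = [
    "http://abrahadesta.wordpress.com/",
    "http://www.nazret.com/news/view_amharic.php?feed=5&how=paged&what=all",
    "https://www.twitter.com/",
    "https://www.dropbox.com/s/n65b3d67f82asn2/Leaked%20National%20Entrance%20Exam_English.pdf?dl=0",
]

COMPONENTS = ("scheme", "netloc", "path", "query", "fragment")


def component_counts(urls):
    """Count the values of each URL component, one Counter per component."""
    counts = {name: Counter() for name in COMPONENTS}
    for url in urls:
        parts = urlsplit(url)
        for name in COMPONENTS:
            # Missing components come back as empty strings and are counted as such.
            counts[name][getattr(parts, name)] += 1
    return counts


counts = component_counts(urls)
for name in COMPONENTS:
    print(name, counts[name].most_common(3))
```

An empty string in the query or fragment counter plays the same role as the blank first row of the Query and Fragment tables.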

category_code
Categorical

Distinct: 15
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
NEWS               65
HUMR               45
POLR               32
ECON               13
ANON               8
Other values (10)  26

Length

Max length: 5
Median length: 4
Mean length: 4
Min length: 3

Characters and Unicode

Total characters: 756
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 2.1%

Sample

1st row: CULTR
2nd row: NEWS
3rd row: MISC
4th row: MISC
5th row: NEWS

Common Values

Value             Count  Frequency (%)
NEWS              65     34.4%
HUMR              45     23.8%
POLR              32     16.9%
ECON              13     6.9%
ANON              8      4.2%
CULTR             7      3.7%
XED               5      2.6%
MISC              3      1.6%
HOST              3      1.6%
MILX              2      1.1%
Other values (5)  6      3.2%

Length

[Figure: Histogram of lengths of the category]

Words

Value             Count  Frequency (%)
news              65     34.4%
humr              45     23.8%
polr              32     16.9%
econ              13     6.9%
anon              8      4.2%
cultr             7      3.7%
xed               5      2.6%
misc              3      1.6%
host              3      1.6%
milx              2      1.1%
Other values (5)  6      3.2%

Most occurring characters

Value              Count  Frequency (%)
N                  95     12.6%
R                  86     11.4%
E                  85     11.2%
S                  72     9.5%
W                  65     8.6%
O                  56     7.4%
U                  54     7.1%
H                  51     6.7%
M                  50     6.6%
L                  42     5.6%
Other values (11)  100    13.2%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  756    100.0%

Most frequent character per category

(unknown): same distribution as the Most occurring characters table, because all 756 characters fall in this single category.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  756    100.0%

Most frequent character per script

(unknown): same distribution as the Most occurring characters table, because all 756 characters fall in this single script.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  756    100.0%

Most frequent character per block

(unknown): same distribution as the Most occurring characters table, because all 756 characters fall in this single block.

date_added
DateTime

Distinct: 6
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
Minimum: 2014-04-15 00:00:00
Maximum: 2018-04-10 00:00:00
[Figure: Histogram with fixed size bins (bins=6)]

source
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB
citizenlab       178
CIPIT            4
OONI             4
BBC              2
defenddefenders  1

Length

Max length: 15
Median length: 10
Mean length: 9.7195767
Min length: 3
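
These length statistics can be re-derived from the value counts listed for source. The following is a minimal sketch of that check. The counts are copied from this report and the variable names are illustrative.

```python
from statistics import mean, median

# Value counts for `source`, copied from this report.
source_counts = {
    "citizenlab": 178,
    "CIPIT": 4,
    "OONI": 4,
    "BBC": 2,
    "defenddefenders": 1,
}

# One string length per observation (189 lengths in total).
lengths = [len(value) for value, n in source_counts.items() for _ in range(n)]

print("Total characters:", sum(lengths))
print("Max length:", max(lengths))
print("Median length:", median(lengths))
print("Mean length:", round(mean(lengths), 7))
print("Min length:", min(lengths))
```

The total printed here should agree with the Total characters figure reported for this variable (1837).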

Characters and Unicode

Total characters: 1837
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: citizenlab
2nd row: citizenlab
3rd row: citizenlab
4th row: citizenlab
5th row: citizenlab

Common Values

Value            Count  Frequency (%)
citizenlab       178    94.2%
CIPIT            4      2.1%
OONI             4      2.1%
BBC              2      1.1%
defenddefenders  1      0.5%
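
The 81.6% imbalance reported for source can be reproduced from these counts as one minus the normalized Shannon entropy. The sketch below illustrates that calculation. It is offered as a formula that matches the reported figure, not as a statement of how the profiling tool computes it, and the function name imbalance_score is hypothetical.

```python
import math

# Value counts for `source`, copied from the Common Values table.
source_counts = {
    "citizenlab": 178,
    "CIPIT": 4,
    "OONI": 4,
    "BBC": 2,
    "defenddefenders": 1,
}


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy over k classes.

    The score is 0 for a uniform distribution and approaches 1 as a
    single class dominates.
    """
    counts = [c for c in counts if c > 0]
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two non-empty classes")
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(k)


score = imbalance_score(source_counts.values())
print(f"{score:.1%}")  # 81.6%
```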

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Words

Value            Count  Frequency (%)
citizenlab       178    94.2%
cipit            4      2.1%
ooni             4      2.1%
bbc              2      1.1%
defenddefenders  1      0.5%

Most occurring characters

Value              Count  Frequency (%)
i                  356    19.4%
e                  183    10.0%
n                  180    9.8%
c                  178    9.7%
t                  178    9.7%
z                  178    9.7%
l                  178    9.7%
a                  178    9.7%
b                  178    9.7%
I                  12     0.7%
Other values (10)  38     2.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1837   100.0%

Most frequent character per category

(unknown): same distribution as the Most occurring characters table, because all 1837 characters fall in this single category.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1837   100.0%

Most frequent character per script

(unknown): same distribution as the Most occurring characters table, because all 1837 characters fall in this single script.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1837   100.0%

Most frequent character per block

(unknown): same distribution as the Most occurring characters table, because all 1837 characters fall in this single block.

notes
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 188
Missing (%): 99.5%
Memory size: 1.6 KiB

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Characters and Unicode

Total characters: 18
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Reportedly blocked

Words

Value       Count  Frequency (%)
reportedly  1      50.0%
blocked     1      50.0%

Most occurring characters

Value             Count  Frequency (%)
e                 3      16.7%
o                 2      11.1%
d                 2      11.1%
l                 2      11.1%
R                 1      5.6%
p                 1      5.6%
r                 1      5.6%
t                 1      5.6%
y                 1      5.6%
(space)           1      5.6%
Other values (3)  3      16.7%
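
Because notes has a single non-missing value, its character table can be checked by hand. A minimal sketch with collections.Counter, counting case-sensitively as the table does (R and r are separate rows):

```python
from collections import Counter

# The only non-missing value of `notes`, copied from the Sample section.
note = "Reportedly blocked"

# Case-sensitive count of every character, including the space.
char_counts = Counter(note)
total = sum(char_counts.values())

print("Total characters:", total)
print("Distinct characters:", len(char_counts))
for char, n in char_counts.most_common(4):
    print(repr(char), n, f"{n / total:.1%}")
```

The totals match the 18 characters and 13 distinct characters reported for this variable, and the four most common entries are e, o, d and l.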

Most occurring categories

Value      Count  Frequency (%)
(unknown)  18     100.0%

Most frequent character per category

(unknown): same distribution as the Most occurring characters table, because all 18 characters fall in this single category.

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  18     100.0%

Most frequent character per script

(unknown): same distribution as the Most occurring characters table, because all 18 characters fall in this single script.

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  18     100.0%

Most frequent character per block

(unknown): same distribution as the Most occurring characters table, because all 18 characters fall in this single block.

Correlations

               category_code  source
category_code  1.000          0.100
source         0.100          1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

#  url  category_code  date_added  source  notes
0  http://abrahadesta.wordpress.com/  CULTR  2014-04-15  citizenlab  NaN
1  http://aljazeera.net/  NEWS  2014-04-15  citizenlab  NaN
2  http://am.wikipedia.org/  MISC  2014-04-15  citizenlab  NaN
3  http://am.wikipedia.org/wiki/%E1%8B%8B%E1%8A%93%E1%8B%8D_%E1%8C%88%E1%8C%BD  MISC  2014-04-15  citizenlab  NaN
4  http://amharic.voanews.com/  NEWS  2014-04-15  citizenlab  NaN
5  http://ancientgebts.org/  HUMR  2014-04-15  citizenlab  NaN
6  http://carpediemethiopia.blogspot.com/  POLR  2014-04-15  citizenlab  NaN
7  http://citizenlab.org/  NEWS  2014-04-15  citizenlab  NaN
8  http://cpj.org/  NEWS  2014-04-15  citizenlab  NaN
9  http://egoportal.blogspot.com/  POLR  2014-04-15  citizenlab  NaN

Last rows

#  url  category_code  date_added  source  notes
179  https://www.citizenlab.org/  NEWS  2014-04-15  citizenlab  NaN
180  https://www.dropbox.com/s/n65b3d67f82asn2/Leaked%20National%20Entrance%20Exam_English.pdf?dl=0  FILE  2016-05-30  OONI  NaN
181  https://www.facebook.com/Jawarmd  NEWS  2016-05-30  OONI  NaN
182  https://www.facebook.com/pages/Addis-Neger/49967100821  NEWS  2014-04-15  citizenlab  NaN
183  https://www.hrw.org/  HUMR  2014-04-15  citizenlab  NaN
184  https://www.mereja.com/  NEWS  2016-09-09  CIPIT  NaN
185  https://www.oromiamedia.org/  NEWS  2016-05-30  OONI  NaN
186  https://www.privacyinternational.org/  HUMR  2014-04-15  citizenlab  NaN
187  https://www.torproject.org/  NEWS  2014-04-15  citizenlab  NaN
188  https://www.twitter.com/  HOST  2014-04-15  citizenlab  NaN