Dataset statistics
| Number of variables | 5 |
|---|---|
| Number of observations | 189 |
| Missing cells | 188 |
| Missing cells (%) | 19.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.5 KiB |
| Average record size in memory | 40.7 B |
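The missing-cell percentage follows from the other figures in this table. A minimal check, with the three inputs copied from the table above (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 189
missing_cells = 188

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

All 188 missing cells belong to the `notes` column; the other four columns report zero missing values.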
Variable types
| URL | 1 |
|---|---|
| Categorical | 2 |
| DateTime | 1 |
| Text | 1 |
Reproduction
| Analysis started | 2023-09-12 08:35:44.234938 |
|---|---|
| Analysis finished | 2023-09-12 08:35:45.846111 |
| Duration | 1.61 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
url
URL
UNIQUE 
| Distinct | 189 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 KiB |
| http://abrahadesta.wordpress.com/ | 1 |
|---|---|
| http://www.ocha-eth.org/ | 1 |
| http://www.medhin.org/ | 1 |
| http://www.mediaethiopia.com/ | 1 |
| http://www.mediaethiopia.com/blog/ | 1 |
| Other values (184) | 184 |
| Value | Count | Frequency (%) |
| http://abrahadesta.wordpress.com/ | 1 | 0.5% |
| http://www.ocha-eth.org/ | 1 | 0.5% |
| http://www.medhin.org/ | 1 | 0.5% |
| http://www.mediaethiopia.com/ | 1 | 0.5% |
| http://www.mediaethiopia.com/blog/ | 1 | 0.5% |
| http://www.mereja.com/ | 1 | 0.5% |
| http://www.mesfinwoldemariam.org/ | 1 | 0.5% |
| http://www.meskelsquare.com/ | 1 | 0.5% |
| http://www.nazret.com/ | 1 | 0.5% |
| http://www.nazret.com/news/view_amharic.php?feed=5&how=paged&what=all | 1 | 0.5% |
| Other values (179) | 179 | 94.7% |
Scheme
| Value | Count | Frequency (%) |
| http | 173 | 91.5% |
| https | 16 | 8.5% |
Netloc
| Value | Count | Frequency (%) |
| nazret.com | 8 | 4.2% |
| www.cafpde.org | 3 | 1.6% |
| www.hrw.org | 3 | 1.6% |
| www.ethpress.gov.et | 2 | 1.1% |
| web.worldbank.org | 2 | 1.1% |
| www.tzta.ca | 2 | 1.1% |
| www.twitter.com | 2 | 1.1% |
| www.aeup.org | 2 | 1.1% |
| www.aigaforum.com | 2 | 1.1% |
| www.torproject.org | 2 | 1.1% |
| Other values (134) | 161 | 85.2% |
Path
| Value | Count | Frequency (%) |
| / | 127 | 67.2% |
| /blog/index.php | 7 | 3.7% |
| /index.html | 2 | 1.1% |
| /index.htm | 2 | 1.1% |
| /tzta/english.htm | 1 | 0.5% |
| /doc | 1 | 0.5% |
| /ethiopia/ | 1 | 0.5% |
| /public/english/region/afpro/addisababa/ethiopia.htm | 1 | 0.5% |
| /external/country/ETH/index.htm | 1 | 0.5% |
| /research-publications/speaksafe-media-workers-toolkit-safer-online-and-mobile-practices | 1 | 0.5% |
| Other values (45) | 45 | 23.8% |
Query
| Value | Count | Frequency (%) |
| (empty) | 174 | 92.1% |
| blog=12 | 1 | 0.5% |
| blog=13 | 1 | 0.5% |
| blog=14 | 1 | 0.5% |
| blog=15 | 1 | 0.5% |
| blog=16 | 1 | 0.5% |
| blog=7 | 1 | 0.5% |
| blog=9 | 1 | 0.5% |
| c=ethiop&t=africa | 1 | 0.5% |
| feed=5&how=paged&what=all | 1 | 0.5% |
| Other values (6) | 6 | 3.2% |
Fragment
| Value | Count | Frequency (%) |
| (empty) | 188 | 99.5% |
| ethiopia | 1 | 0.5% |
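The five frequency tables above appear to break each URL into its scheme, network location, path, query and fragment. A rough sketch of how such counts can be reproduced with the Python standard library follows; the helper name `component_counts` is hypothetical, and the four sample URLs are taken from the rows shown in this report rather than from the full dataset.

```python
from collections import Counter
from urllib.parse import urlsplit

# A handful of URLs that appear in the report (not the full 189 rows).
urls = [
    "http://www.nazret.com/",
    "http://www.nazret.com/news/view_amharic.php?feed=5&how=paged&what=all",
    "https://www.hrw.org/",
    "https://www.twitter.com/",
]

def component_counts(urls):
    """Count how often each scheme, netloc, path, query and fragment occurs."""
    parts = [urlsplit(u) for u in urls]
    return {
        "scheme": Counter(p.scheme for p in parts),
        "netloc": Counter(p.netloc for p in parts),
        "path": Counter(p.path for p in parts),
        "query": Counter(p.query for p in parts),        # "" when absent
        "fragment": Counter(p.fragment for p in parts),  # "" when absent
    }

counts = component_counts(urls)
for name, counter in counts.items():
    print(name, counter.most_common(3))
```

URLs without a query or fragment yield an empty string for that component, which is why the largest row in the query and fragment tables has no visible value.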
category_code
Categorical
| Distinct | 15 |
|---|---|
| Distinct (%) | 7.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 KiB |
| NEWS | 65 |
|---|---|
| HUMR | 45 |
| POLR | 32 |
| ECON | 13 |
| ANON | 8 |
| Other values (10) | 26 |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 3 |
Characters and Unicode
| Total characters | 756 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
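As an illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The function name `category_counts` is hypothetical, and the sample values are copied from rows shown in this report. The standard library does not expose the script or block properties, so only the general category is covered here.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Tally the Unicode general category (e.g. 'Lu', 'Ll', 'Zs') of every character."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# The first five category_code values shown in the report's sample.
codes = category_counts(["CULTR", "NEWS", "MISC", "MISC", "NEWS"])
# The single non-missing value of the notes column.
note = category_counts(["Reportedly blocked"])

print(codes)
print(note)
```

Here `Lu` is "Uppercase Letter", `Ll` is "Lowercase Letter" and `Zs` is "Space Separator", matching the category names used in the tables of this report.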
Unique
| Unique | 4 |
|---|---|
| Unique (%) | 2.1% |
Sample
| 1st row | CULTR |
|---|---|
| 2nd row | NEWS |
| 3rd row | MISC |
| 4th row | MISC |
| 5th row | NEWS |
Common Values
| Value | Count | Frequency (%) |
| NEWS | 65 | 34.4% |
| HUMR | 45 | 23.8% |
| POLR | 32 | 16.9% |
| ECON | 13 | 6.9% |
| ANON | 8 | 4.2% |
| CULTR | 7 | 3.7% |
| XED | 5 | 2.6% |
| MISC | 3 | 1.6% |
| HOST | 3 | 1.6% |
| MILX | 2 | 1.1% |
| Other values (5) | 6 | 3.2% |
Length
Histogram of lengths of the category
Common Values (Plot)
| Value | Count | Frequency (%) |
| news | 65 | 34.4% |
| humr | 45 | 23.8% |
| polr | 32 | 16.9% |
| econ | 13 | 6.9% |
| anon | 8 | 4.2% |
| cultr | 7 | 3.7% |
| xed | 5 | 2.6% |
| misc | 3 | 1.6% |
| host | 3 | 1.6% |
| milx | 2 | 1.1% |
| Other values (5) | 6 | 3.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 95 | 12.6% |
| R | 86 | 11.4% |
| E | 85 | 11.2% |
| S | 72 | 9.5% |
| W | 65 | 8.6% |
| O | 56 | 7.4% |
| U | 54 | 7.1% |
| H | 51 | 6.7% |
| M | 50 | 6.6% |
| L | 42 | 5.6% |
| Other values (11) | 100 | 13.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 756 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 95 | 12.6% |
| R | 86 | 11.4% |
| E | 85 | 11.2% |
| S | 72 | 9.5% |
| W | 65 | 8.6% |
| O | 56 | 7.4% |
| U | 54 | 7.1% |
| H | 51 | 6.7% |
| M | 50 | 6.6% |
| L | 42 | 5.6% |
| Other values (11) | 100 | 13.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 756 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 95 | 12.6% |
| R | 86 | 11.4% |
| E | 85 | 11.2% |
| S | 72 | 9.5% |
| W | 65 | 8.6% |
| O | 56 | 7.4% |
| U | 54 | 7.1% |
| H | 51 | 6.7% |
| M | 50 | 6.6% |
| L | 42 | 5.6% |
| Other values (11) | 100 | 13.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 756 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| N | 95 | 12.6% |
| R | 86 | 11.4% |
| E | 85 | 11.2% |
| S | 72 | 9.5% |
| W | 65 | 8.6% |
| O | 56 | 7.4% |
| U | 54 | 7.1% |
| H | 51 | 6.7% |
| M | 50 | 6.6% |
| L | 42 | 5.6% |
| Other values (11) | 100 | 13.2% |
date_added
Date
| Distinct | 6 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 KiB |
| Minimum | 2014-04-15 00:00:00 |
|---|---|
| Maximum | 2018-04-10 00:00:00 |
Histogram with fixed size bins (bins=6)
source
Categorical
IMBALANCE 
| Distinct | 5 |
|---|---|
| Distinct (%) | 2.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 KiB |
| citizenlab | 178 |
|---|---|
| CIPIT | 4 |
| OONI | 4 |
| BBC | 2 |
| defenddefenders | 1 |
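The IMBALANCE flag reflects how heavily one value dominates this column. One common way to quantify that is one minus the normalized Shannon entropy of the value counts: 0 for a perfectly uniform column, approaching 1 when a single class dominates. The sketch below applies that idea to the counts reported for `source`; the function name is hypothetical, and whether the report uses exactly this score or a particular threshold is an assumption.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class counts.

    0.0 means perfectly balanced; values near 1.0 mean one class dominates.
    This is an illustrative score, not necessarily the report's own formula.
    """
    counts = [c for c in counts if c > 0]
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to compare against
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1 - entropy / math.log2(k)

# Value counts of the source column as reported:
# citizenlab, CIPIT, OONI, BBC, defenddefenders.
source_counts = [178, 4, 4, 2, 1]
score = imbalance_score(source_counts)
print(f"imbalance score: {score:.3f}")
```

With 178 of 189 rows carrying the same value, the score lands well above the halfway mark, which is consistent with the column being flagged.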
Length
| Max length | 15 |
|---|---|
| Median length | 10 |
| Mean length | 9.7195767 |
| Min length | 3 |
Characters and Unicode
| Total characters | 1837 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | citizenlab |
|---|---|
| 2nd row | citizenlab |
| 3rd row | citizenlab |
| 4th row | citizenlab |
| 5th row | citizenlab |
Common Values
| Value | Count | Frequency (%) |
| citizenlab | 178 | 94.2% |
| CIPIT | 4 | 2.1% |
| OONI | 4 | 2.1% |
| BBC | 2 | 1.1% |
| defenddefenders | 1 | 0.5% |
Length
Histogram of lengths of the category
Common Values (Plot)
| Value | Count | Frequency (%) |
| citizenlab | 178 | 94.2% |
| cipit | 4 | 2.1% |
| ooni | 4 | 2.1% |
| bbc | 2 | 1.1% |
| defenddefenders | 1 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 356 | 19.4% |
| e | 183 | 10.0% |
| n | 180 | 9.8% |
| c | 178 | 9.7% |
| t | 178 | 9.7% |
| z | 178 | 9.7% |
| l | 178 | 9.7% |
| a | 178 | 9.7% |
| b | 178 | 9.7% |
| I | 12 | 0.7% |
| Other values (10) | 38 | 2.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1795 | 97.7% |
| Uppercase Letter | 42 | 2.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 356 | 19.8% |
| e | 183 | 10.2% |
| n | 180 | 10.0% |
| c | 178 | 9.9% |
| t | 178 | 9.9% |
| z | 178 | 9.9% |
| l | 178 | 9.9% |
| a | 178 | 9.9% |
| b | 178 | 9.9% |
| d | 4 | 0.2% |
| Other values (3) | 4 | 0.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 12 | 28.6% |
| O | 8 | 19.0% |
| C | 6 | 14.3% |
| P | 4 | 9.5% |
| T | 4 | 9.5% |
| N | 4 | 9.5% |
| B | 4 | 9.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1837 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 356 | 19.4% |
| e | 183 | 10.0% |
| n | 180 | 9.8% |
| c | 178 | 9.7% |
| t | 178 | 9.7% |
| z | 178 | 9.7% |
| l | 178 | 9.7% |
| a | 178 | 9.7% |
| b | 178 | 9.7% |
| I | 12 | 0.7% |
| Other values (10) | 38 | 2.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1837 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 356 | 19.4% |
| e | 183 | 10.0% |
| n | 180 | 9.8% |
| c | 178 | 9.7% |
| t | 178 | 9.7% |
| z | 178 | 9.7% |
| l | 178 | 9.7% |
| a | 178 | 9.7% |
| b | 178 | 9.7% |
| I | 12 | 0.7% |
| Other values (10) | 38 | 2.1% |
notes
Text
CONSTANT  MISSING 
| Distinct | 1 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 188 |
| Missing (%) | 99.5% |
| Memory size | 1.6 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 18 |
| Min length | 18 |
Characters and Unicode
| Total characters | 18 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | Reportedly blocked |
|---|---|
| Value | Count | Frequency (%) |
| reportedly | 1 | 50.0% |
| blocked | 1 | 50.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 3 | 16.7% |
| o | 2 | 11.1% |
| d | 2 | 11.1% |
| l | 2 | 11.1% |
| R | 1 | 5.6% |
| p | 1 | 5.6% |
| r | 1 | 5.6% |
| t | 1 | 5.6% |
| y | 1 | 5.6% |
| (space) | 1 | 5.6% |
| Other values (3) | 3 | 16.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 16 | 88.9% |
| Uppercase Letter | 1 | 5.6% |
| Space Separator | 1 | 5.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 3 | 18.8% |
| o | 2 | 12.5% |
| d | 2 | 12.5% |
| l | 2 | 12.5% |
| p | 1 | 6.2% |
| r | 1 | 6.2% |
| t | 1 | 6.2% |
| y | 1 | 6.2% |
| b | 1 | 6.2% |
| c | 1 | 6.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 1 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 17 | 94.4% |
| Common | 1 | 5.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 3 | 17.6% |
| o | 2 | 11.8% |
| d | 2 | 11.8% |
| l | 2 | 11.8% |
| R | 1 | 5.9% |
| p | 1 | 5.9% |
| r | 1 | 5.9% |
| t | 1 | 5.9% |
| y | 1 | 5.9% |
| b | 1 | 5.9% |
| Other values (2) | 2 | 11.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 1 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 18 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 3 | 16.7% |
| o | 2 | 11.1% |
| d | 2 | 11.1% |
| l | 2 | 11.1% |
| R | 1 | 5.6% |
| p | 1 | 5.6% |
| r | 1 | 5.6% |
| t | 1 | 5.6% |
| y | 1 | 5.6% |
| (space) | 1 | 5.6% |
| Other values (3) | 3 | 16.7% |
| | category_code | source |
|---|---|---|
| category_code | 1.000 | 0.100 |
| source | 0.100 | 1.000 |
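The matrix above relates two categorical columns. A standard association measure for that case is Cramér's V, derived from the chi-squared statistic of the contingency table. The sketch below is a plain, uncorrected implementation using only the standard library. The function name and toy inputs are hypothetical, and it is an assumption that the report's 0.100 comes from this measure; the profiling tool may apply a bias correction, so this code is not expected to reproduce that figure exactly.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    rows = Counter(xs)             # row marginals
    cols = Counter(ys)             # column marginals
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# Toy inputs: perfectly associated labels vs. independent labels.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A value near 0, such as the 0.100 reported here, indicates that knowing the `source` of a row says little about its `category_code`.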
A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
| | url | category_code | date_added | source | notes |
|---|---|---|---|---|---|
| 0 | http://abrahadesta.wordpress.com/ | CULTR | 2014-04-15 | citizenlab | NaN |
| 1 | http://aljazeera.net/ | NEWS | 2014-04-15 | citizenlab | NaN |
| 2 | http://am.wikipedia.org/ | MISC | 2014-04-15 | citizenlab | NaN |
| 3 | http://am.wikipedia.org/wiki/%E1%8B%8B%E1%8A%93%E1%8B%8D_%E1%8C%88%E1%8C%BD | MISC | 2014-04-15 | citizenlab | NaN |
| 4 | http://amharic.voanews.com/ | NEWS | 2014-04-15 | citizenlab | NaN |
| 5 | http://ancientgebts.org/ | HUMR | 2014-04-15 | citizenlab | NaN |
| 6 | http://carpediemethiopia.blogspot.com/ | POLR | 2014-04-15 | citizenlab | NaN |
| 7 | http://citizenlab.org/ | NEWS | 2014-04-15 | citizenlab | NaN |
| 8 | http://cpj.org/ | NEWS | 2014-04-15 | citizenlab | NaN |
| 9 | http://egoportal.blogspot.com/ | POLR | 2014-04-15 | citizenlab | NaN |
| | url | category_code | date_added | source | notes |
|---|---|---|---|---|---|
| 179 | https://www.citizenlab.org/ | NEWS | 2014-04-15 | citizenlab | NaN |
| 180 | https://www.dropbox.com/s/n65b3d67f82asn2/Leaked%20National%20Entrance%20Exam_English.pdf?dl=0 | FILE | 2016-05-30 | OONI | NaN |
| 181 | https://www.facebook.com/Jawarmd | NEWS | 2016-05-30 | OONI | NaN |
| 182 | https://www.facebook.com/pages/Addis-Neger/49967100821 | NEWS | 2014-04-15 | citizenlab | NaN |
| 183 | https://www.hrw.org/ | HUMR | 2014-04-15 | citizenlab | NaN |
| 184 | https://www.mereja.com/ | NEWS | 2016-09-09 | CIPIT | NaN |
| 185 | https://www.oromiamedia.org/ | NEWS | 2016-05-30 | OONI | NaN |
| 186 | https://www.privacyinternational.org/ | HUMR | 2014-04-15 | citizenlab | NaN |
| 187 | https://www.torproject.org/ | NEWS | 2014-04-15 | citizenlab | NaN |
| 188 | https://www.twitter.com/ | HOST | 2014-04-15 | citizenlab | NaN |