datasets

I was just awarded a small but not insignificant award as part of the Carleton University COVID-19 Rapid Response Research Grants. Below is a description of what I will be up to, along with some great students and expert advisors.  I will share everyone’s names later.  Results of the work will be published here as it becomes available!  Stay tuned. Also, let me know if you want to contribute in any way! Tracey dot Lauriault at Carleton dot CA

Research Summary

There is much official COVID-19 data reporting by federal, provincial, territorial and Indigenous Communities. As the pandemic evolves, and more information comes to light, there is a call to add data attributes about Indigenous, Black and Racialized groups and of the affected labour force, and to report where cases predominate. The pandemic also revealed that foundational datasets are missing, such as a national list of elder care homes, maps of local health regions and data about the digital divide. This project will embrace technological citizenship, adopt a critical data studies theoretical framework and a data humanitarian approach to rapidly assess data shortfalls, identify standards, and support the building of infrastructure. This involves training students, conducting rapid response research, developing a network of experts, learning by doing and a transdisciplinary team of peer reviewers to assess results. The knowledge will be mobilized in open access blog posts, infographics, policy briefs and scholarly publications.

Research challenge:

Official COVID-19 public health reports by Federal, Provincial, and Territorial (F/P/T) and First Nation Communities are uneven and there are calls to improve them (1 CBC News, Toronto Star). Asymmetries can be attributed to dynamically evolving challenges associated with the pandemic, such as working while practicing social distancing; jurisdictional divisions of power in terms of health delivery; and responding to a humanitarian crisis, where resources are stretched and infrastructures are splintered (e.g. digital divide, nursing home conditions).

The Harvard Humanitarian Initiative (HHI) developed a rights-based approach to the management of data and technologies during crisis situations, which includes the rights to information, protection, privacy and security, data agency, and rectification and redress (2). These apply to contact tracing (3 ITWorld, Scassa) and to equity groups calling for demographic data (1). Others have conducted rapid response data reporting: for example, after the Haiti earthquake, volunteers developed real-time crowdsourcing data collection systems to support humanitarian responders (4 Meier), and WeRobotics mobilizes local drone expertise to objectively assess proposed pandemic response technologies (5 WeRobotics).

This research will apply a critical data studies (CDS) theoretical framework (6 Kitchin & Lauriault) and the principles of the HHI, and will practice technological citizenship (7 Feenberg) in the study of the Canadian COVID-19 data response. Lauriault will leverage her expertise and her Canadian and international network of open data, open government, and civic technology experts in government, civil society, and Indigenous Communities (see CV), as seen in the policy briefs published on DataLibre.ca (8), to rapidly assess and support COVID-19 data management and reporting.

The objective is to carry out the following activities:

  1. Compare official COVID-19 public health data reports to identify gaps and best practices (9 Lauriault & Shields).
  2. Identify and support the building of framework datasets to standardize reporting (10 Lauriault).
  3. Analyze data standards and protocols to support data management, interoperability and cross-jurisdictional reporting (11 GeoConnections).
  4. Publish case-studies, resources, an archive of official reporting, and a glossary; and
  5. Rapidly conduct expert analysis, peer review, knowledge mobilization and provide evidence-based recommendations to improve data reporting.

The rationale for this research is as follows:

  1. Official COVID-19 public health data are inconsistently reported, impeding comparability and the ability to assess impact and target actions. Also, predictions missed seniors’ homes, precarious labour, Indigenous communities and social determinants (12 Global News, NCCDH), resulting in an increase in cases and deaths. Currently, job classifications and Indigenous, Black, and Racialized people classifications (13 CTV News) remain absent. This research will create a corpus of F/P/T and Indigenous Communities’ official reports, compare results, and identify gaps.
  2. Framework data are standard information infrastructures upon which other analysis can consistently be done (14 Toronto Star). When these are lacking, analysis is impeded: for example, there is no national reporting by health region since no national framework dataset exists (15 Lauriault), and mitigating the digital divide is thwarted by a lack of broadband maps (16 Potter & Lauriault et al.). Other missing national datasets include senior care facilities, homeless shelters, precarious labour, and Indigenous Communities (17 Gaetz et al.). Needed framework datasets will be identified and, if necessary, their building will be coordinated (18 SPCOStatCan LODE); advocacy for the opening of public datasets such as corporate registries may be carried out (19 Fed. Registry, Open Corporate, Open Contracting); and experts from public health, social planning, and Indigenous Communities will help identify localized frameworks.
  3. Consistent COVID-19 reporting requires an interoperable infrastructure which builds upon standards developed through consensus processes (20 CIHI, PHAC). Current uneven reporting may be attributed to a lack of standards adoption and formalization in terms of data flows. This research will develop a repository of standards and protocols and share these with decision-makers to improve interoperability (e.g. Data Standards for the Identification and Monitoring of Systemic Racism (21 ON Govt) and FNIGC OCAP Principles (22 FNIGC)).
  4. Rapidly mobilizing knowledge is important to improve reporting and manage data, and to build a crisis data reporting infrastructure for the future. This project will compile and archive information, rapidly assess and peer review results with experts, report results on DataLibre.ca and other websites, produce infographics and policy briefs, deliver online webinars, and help administrators and Indigenous Communities improve their data and technology policies.
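
As a rough illustration of what "compare results and identify gaps" could look like in practice, here is a minimal Python sketch. It is not the project's actual method. The jurisdiction names and attribute names are hypothetical placeholders, not taken from any real report; the idea is only to show that once each report's attributes are listed, gaps and comparable fields fall out of simple set operations.

```python
# Hypothetical placeholders: these attribute lists are NOT from any real report.
REPORTED = {
    "Jurisdiction A": {"cases", "deaths", "age_group", "health_region"},
    "Jurisdiction B": {"cases", "deaths", "age_group"},
    "Jurisdiction C": {"cases", "deaths", "health_region", "occupation"},
}


def find_gaps(reported):
    """For each jurisdiction, list attributes that others report but it does not."""
    all_attributes = set().union(*reported.values())
    return {name: sorted(all_attributes - attrs) for name, attrs in reported.items()}


def common_attributes(reported):
    """Attributes that every jurisdiction reports, i.e. what is comparable today."""
    return sorted(set.intersection(*reported.values()))


if __name__ == "__main__":
    for name, missing in find_gaps(REPORTED).items():
        print(f"{name} is missing: {missing}")
    print("Comparable across all jurisdictions:", common_attributes(REPORTED))
```

With real reports, the attribute lists would have to be compiled by hand or scraped from each official source, which is where most of the effort would likely go.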

A CDS framework recognizes that data have social and material shaping qualities, that they are never politically neutral, and that they are inseparable from the people and institutions who create them, including their practices, techniques, and infrastructures. This project involves a team of data, technology, legal, social and health, and Indigenous experts who will rapidly assess official COVID-19 data assemblages and act as technological citizens by applying knowledge in real time. They will mobilize results to mitigate the data shortfalls witnessed during this crisis and support decision makers in responding with a data humanitarian and rights-based approach, both now and in the future.

Expected Impact:

The target audience for this rapid response data and technology reporting is F/P/T public officials and Indigenous Community Leaders who manage public health, socio-economic, statistical and official record data flows; and civil society actors and the public involved in open data, open government and open contracting, transparency and accountability. This includes C-class executives and chief technology, information, data, and digital officers.

The outcome of this research is to standardize and improve humanitarian crisis data management and data reporting in the short term to ensure consistent reporting, and in the long term establish standardized data workflows and operationalize data infrastructures for this pandemic in preparation for the next.

The timing to compile, inventory and build an open access archives of official data reporting is now as the fractures in the system have become apparent in real-time and have had negative consequences. It is important to monitor the response as it evolves so as to be able to improve it while our collective institutional memory is fresh and to have the evidence available as a reminder for if and when we forget, but also to build more robust systems.

The results of this research will be continuously reported and made openly accessible as it becomes available and will lead to the formation of a new research team.

It is very odd that national health organizations are not reporting COVID-19 cases aggregated into health regions even though provinces and territories are mostly reporting them in that way. And where are the national health framework datasets?

Framework data are a “set of continuous and fully integrated geospatial data that provide context and reference information for the country. Framework data are expected to be widely used and generally applicable, either underpinning or enabling geospatial applications” (p. 7).

Federal Electoral Districts, for example, are the official framework data for Elections Canada and these data are updated for each election.  They are used to administer elections, report the results of exit polls during the elections, and show the results after an election.  Framework data are available in multiple formats as well as in cartographic or mapping products: for Geographic Information Systems (GIS) such as ESRI, MapInfo or Tableau (Shapefiles), in KML format for Google Maps, and in the standardized online mapping format GML, which also happens to be a Treasury Board Secretariat standard for geospatial data. Election result data are aggregated into these framework data along with other socio-economic data, and once these data are mapped we can compare and tell a more nuanced local, regional and national story, and we can see patterns across the country.  The benefits of framework data are many: they are created once by an authoritative source, they are updated and reliable, they are used many times, they are open data and everyone knows where to get them.
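
To make "aggregated into these framework data" a little more concrete, here is a minimal Python sketch. It assumes a framework dataset can be reduced to a lookup of region codes, and that each record carries one of those codes. The codes, names and counts are invented placeholders, not real data, and a real workflow would use the boundary files themselves in a GIS.

```python
# Hypothetical framework dataset: region code -> official name (placeholders).
FRAMEWORK = {"HR-01": "Region One", "HR-02": "Region Two", "HR-03": "Region Three"}

# Hypothetical records, each tagged with a region code (invented values).
RECORDS = [
    {"region": "HR-01", "cases": 4},
    {"region": "HR-02", "cases": 1},
    {"region": "HR-01", "cases": 2},
    {"region": "HR-99", "cases": 3},  # code that is not in the framework
]


def aggregate(records, framework):
    """Sum cases per framework region; keep records whose code is unknown."""
    totals = {code: 0 for code in framework}  # every region appears, even with 0
    unmatched = []
    for record in records:
        code = record["region"]
        if code in framework:
            totals[code] += record["cases"]
        else:
            unmatched.append(record)
    return totals, unmatched


if __name__ == "__main__":
    totals, unmatched = aggregate(RECORDS, FRAMEWORK)
    for code, count in totals.items():
        print(f"{FRAMEWORK[code]} ({code}): {count}")
    print("Records that could not be placed:", unmatched)
```

The unmatched list is the point of the exercise: without one authoritative set of codes, every analyst has to decide separately what to do with records that do not fit.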

Considering that health care spending is one of the largest expenditures we have as a nation state, and that we are in an era of accountability and transparency where outcomes-based management is the norm, it is astonishing that health data, including social determinants data, are not disseminated in this way.  Yes, there are privacy issues, but we are capable of addressing those with the Census and Elections, which means we can also do so for health. We need to have an evidence-based conversation about population health now more than ever, and we will need these data to tell a socio-economic story as well. Could we have done better? Who is doing great and why, who is not doing so great and why, what can we learn and what is the remedy?

Numerous useful and insightful interactive maps were published after the elections (CBC, CTV, Macleans, ESRI and many others), and these generated much discussion. People could see the results, they could situate themselves, and they could see what friends and family in other places were experiencing.  Analysts and policy makers also had what they needed to understand and plan for a new context. This is what democratic, evidence-based data journalism and policy making is all about!

Natural Resources Canada is normally the producer of Canada’s framework data but it does not produce a health region framework dataset for Canada.  Arguably, these data would not only be useful during a pandemic, but also for administering and reporting health associated with natural resources such as allergies in the spring and fall, food insecurity, health and farming, or health after a natural disaster such as flooding and fires.  These data would also be useful to see where money is spent, providing Canadians with the evidence they require to advocate for change.

So why is there no national health reporting by administrative boundaries, and where is the health region framework dataset?

National Health Reporting Canada:

Virihealth.com and ESRI Canada produced the first national geo-COVID-19 reporting:

https://virihealth.com/

https://resources-covid19canada.hub.arcgis.com/app/eb0ec6ffdb654e71ab3c758726c55b68

Federal Government:

Canada as a federation has jurisdictional divisions of power, and one of those jurisdictional divides is health. We have the Canada Health Act (CHA) that

“establishes criteria and conditions related to insured health services and extended health care services that the provinces and territories must fulfill to receive the full federal cash contribution under the Canada Health Transfer (CHT)”.

The Canada Health Transfer (CHT) provides long-term predictable funding for health care, on a per capita basis and

“supports the principles of the Canada Health Act which are: universality; comprehensiveness; portability; accessibility; and, public administration”.

The provinces and territories receive cash transfers to deliver health care to Canadians and health care data reporting is done by each province and territory separately. This alone justifies the creation of a national health region framework dataset. Which organization should be responsible for it?

There are three main organizations which are part of the Canada Health Portfolio  that currently report official COVID-19 cases. At the moment, they do not publish COVID-19 case data by health regions.

Health Canada “is the Federal department responsible for helping Canadians maintain and improve their health, while respecting individual choices and circumstances.” Health Canada is an official and authoritative national source of COVID-19 data and it publishes the Coronavirus disease (COVID-19): Outbreak update. Reporting includes an interactive map and a line graph of data by Province and Territory.

https://www.canada.ca/en/public-health/services/diseases/2019-novel-coronavirus-infection.html

Public Health Agency of Canada (PHAC) promotes and protects the health of Canadians through leadership, partnership, innovation and action in public health and it does so by: Promoting health; Preventing and controlling chronic diseases and injuries; Preventing and controlling infectious diseases; Preparing for and responding to public health emergencies; Serving as a central point for sharing Canada’s expertise with the rest of the world; Applying international research and development to Canada’s public health programs; and Strengthening intergovernmental collaboration on public health and facilitating national approaches to public health policy and planning. PHAC now disseminates an excellent interactive dashboard entitled the National Epidemiological Summary of COVID-19 Cases in Canada. Their data sources are: Public Health Agency of Canada, Surveillance and Risk Assessment, Epidemiology update; Natural Resources Canada – Grey basemap with Credit: COVID-19 Situational Awareness tiger team Powered by ESRI-Canada and COVID-19 Canadian Geostatistical Platform, a collaboration between Public Health Agency of Canada, Statistics Canada and Natural Resources Canada.

https://phac-aspc.maps.arcgis.com/apps/opsdashboard/index.html#/e968bf79f4694b5ab290205e05cfcda6

Canadian Institutes of Health Research (CIHR) is the Government of Canada’s health research investment agency and its mandate is to “excel, according to internationally accepted standards of scientific excellence, in the creation of new knowledge and its translation into improved health for Canadians, more effective health services and products and a strengthened Canadian health care system.” Although it is a research funding organization, CIHR could publish a national framework dataset of health units to help researchers in Canada and also to disseminate the findings of research, either about COVID-19 or any other topic, according to those administrative boundaries. (Update 07/04/2020 CIHR does not have a framework data file)

A national non-governmental organization, the Canadian Institute for Health Information (CIHI) also disseminates national comparative health data, mostly about the administration of health and it would make sense for them to also publish data by health units and to have such a framework dataset. CIHI is an independent, not-for-profit organization that provides essential information on Canada’s health system and the health of Canadians. (Update 07/04/2020 CIHI does not have a framework data file). CIHI’s mandate is

“to deliver comparable and actionable information to accelerate improvements in health care, health system performance and population health across the continuum of care”.

Natural Resources Canada is the producer of most of Canada’s Framework data, and with the help of the Canadian Council on Geomatics Provincial and Territorial Accord it could create this framework file. This was discussed at the 4th Annual SDI Summit meetings hosted in Quebec City in the Fall of 2019.

Statistics Canada produces Provincial and Territorial Health Geographies and it does seem to have a national GIS Health Regions: Boundaries and Correspondence with Census Geography file for 2018, and if that is the case, why are health geographies not reported by these boundaries? (Update 07/04/2020 StatCan has a 2018 GIS national health geography file).  Here is a PDF version of the 2018 map.

https://www150.statcan.gc.ca/n1/pub/82-402-x/2018001/maps-cartes/rm-cr14-eng.htm
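
A correspondence between health regions and census geography is what lets data collected by census unit be re-tabulated by health region. The Python sketch below shows the general idea only. The correspondence table, codes and values are hypothetical placeholders, and the layout of the actual Statistics Canada file is not assumed here.

```python
# Hypothetical correspondence table: census unit code -> health region code.
CORRESPONDENCE = {"CU-001": "HR-01", "CU-002": "HR-01", "CU-003": "HR-02"}

# Hypothetical values tabulated by census unit (invented numbers).
VALUES_BY_CENSUS_UNIT = {"CU-001": 10, "CU-002": 5, "CU-003": 7, "CU-004": 2}


def roll_up(values, correspondence):
    """Re-tabulate census-unit values by health region via the correspondence."""
    totals = {}
    unassigned = []  # census units with no health region in the table
    for unit, value in values.items():
        region = correspondence.get(unit)
        if region is None:
            unassigned.append(unit)
            continue
        totals[region] = totals.get(region, 0) + value
    return totals, unassigned


if __name__ == "__main__":
    totals, unassigned = roll_up(VALUES_BY_CENSUS_UNIT, CORRESPONDENCE)
    print("Totals by health region:", totals)
    print("Census units without a health region:", unassigned)
```

This simple version assumes each census unit falls entirely inside one health region; where boundaries do not nest, some form of apportionment would be needed.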

Provincial and Territorial Official COVID-19 Case Reports and health geographies:

Below I have compiled a list of official COVID-19 Case reporting by province and territory, and when I could find them, I included a link to health administration geographies. That does not mean that data are reported in maps, but data are generally tabulated according to health administration geographies.

Alberta

British Columbia

Manitoba (Updated RHA and Map info. 07/04/2020)

Newfoundland and Labrador (Updated RHA and Map info. 07/04/2020)

New Brunswick (Updated RHA and Map info. 07/04/2020)

Northwest Territories

Nova Scotia

Nunavut

Ontario

Prince Edward Island (Updated Health PEI info. 07/04/2020)

Quebec (Updated Map info 08/04/2020)

Saskatchewan

Yukon (Updated Health Region info. 07/04/2020)

I have emailed each of the Provincial and Territorial governments to confirm that I have the latest health geography framework data.  I have received updates from Yukon, Quebec, PEI, New Brunswick, and Manitoba, and have updated map data accordingly. I have also received correspondence from Statistics Canada and CIHI.

For the moment ESRI Canada and some of the Provinces and Territories are reporting Official COVID-19 Cases by health region geographies.  Why aren’t Health Canada and the Public Health Agency of Canada doing so?  And where is the National Health Region Framework Data file?

Canada has signed on to the G8 Open Data Charter. The official UK G8 Presidency site includes the Charter and its associated technical Annex.  This Charter falls under one of this year’s G8 agenda items, which is to promote greater transparency. The main points of the charter are:

  1. Principle 1: Open Data by Default
  2. Principle 2: Quality and Quantity
  3. Principle 3: Usable by All
  4. Principle 4: Releasing Data for Improved Governance
  5. Principle 5: Releasing Data for Innovation

The G8 countries have committed to the following 3 actions:

  • Action 1: G8 National Action Plans
  • Action 2: Release of high value data (List is pasted below)
  • Action 3: Metadata mapping

These are all good things.  The devil will be in the details, and implementation in Canada will depend on collaboration and interoperability between provinces, territories, cities and municipalities and the federal government.  It will be interesting to see how crown corporations such as Canada Post, Canada Mortgage and Housing Corporation (CMHC), CBC/Radio-Canada and others fare.  At the moment, it is uncertain whether these are to follow the same rules.

The list of high value data that should be released somewhat overlaps with information collected by the Open Knowledge Foundation Open Data Census.  Note the postal code data requested under geospatial; this is a big ask for Canada, especially in light of the Geocoder copyright lawsuit instigated by Canada Post. Digital Copyright Canada has also done some good work on the postal code file.

It will be very interesting to see if greater access to data will mean an increase in evidence-based policy making and greater participatory democracy. The government will need to be more receptive to citizen input, and so far, if the census and issues around science are any indication, this does not look promising.  Releasing data is one thing; acting on the evidence and having the mechanisms in place and the willingness to hear from citizens is another.

| Data Category (alphabetical order) | Example datasets |
| --- | --- |
| Companies | Company/business register |
| Crime and Justice | Crime statistics, safety |
| Earth observation | Meteorological/weather, agriculture, forestry, fishing, and hunting |
| Education | List of schools; performance of schools, digital skills |
| Energy and Environment | Pollution levels, energy consumption |
| Finance and contracts | Transaction spend, contracts let, call for tender, future tenders, local budget, national budget (planned and spent) |
| Geospatial | Topography, postcodes, national maps, local maps |
| Global Development | Aid, food security, extractives, land |
| Government Accountability and Democracy | Government contact points, election results, legislation and statutes, salaries (pay scales), hospitality/gifts |
| Health | Prescription data, performance data |
| Science and Research | Genome data, research and educational activity, experiment results |
| Statistics | National Statistics, Census, infrastructure, wealth, skills |
| Social mobility and welfare | Housing, health insurance and unemployment benefits |
| Transport and Infrastructure | Public transport timetables, access points, broadband penetration |

Watching this is a great New Year’s morning activity, and for Sep Kamvar I feel that data and statistics are the new black!  This is worth the 1 hour of your time!  Damn, most online TV shows are 42 minutes and you learn way less…I should know 🙁

Merci Karl!

Abstract: Canada’s Information Commissioners have adopted a resolution toward Open Government, and part of the open government process is open access to public administrative, census, map and research data.  A number of Canadian cities have become OpenData cities, innovative government programs such as GeoConnections have implemented data sharing infrastructures, and forward-thinking research funding programs such as International Polar Year fund data sharing science.  Access to data is one part of the open government conversation, and it is argued that open data bring us closer to more informed democratic deliberations on public policy.

1. Event: Open Access Week 2010, Carleton University, October 21, Noon to 1PM.

2. Event: Open Access Week, Université d’Ottawa, Apps4Ottawa Showcase, October 21, 5-7PM.

  • Title: OpenData & Public Research
  • Abstract: Researchers use OpenData to inform their work, and are also producers of data and software that can be re-shared to the public.  In Canada, much of university research is supported by public funds and an argument can be made that the results of that research should be accessible to the public.  The research at the Geomatics and Cartographic Research Centre will be featured as will community based social policy research in Ottawa.  In Canada some data are accessible, but mostly data are not, and if they are, cost recovery policies and regressive licensing impede their use.  The talk will feature examples where data are open and where opportunities for evidence based decision making are restricted.

3. Event: Statistical Society of Ottawa 8th annual seminar – Our Statistics Community on Monday the 25th of October.

  • Title: The Real Census informs Neighbourhood Research in Canada
  • Abstract: Ms. Tracey P. Lauriault will discuss neighbourhood scale research using Census data.  She will introduce the Cybercartographic Pilot Atlas of the Risk of Homelessness created at the Geomatics and Cartographic Research Centre and will feature community based research used to inform public policy as part of the Canadian Social Data Strategy (CSDS).  She will feature maps and data about social issues in Canadian cities & metropolitan areas (e.g. Calgary, Toronto, Halton, Sault Ste. Marie, Ottawa, Montreal, & others) and will focus on the importance of local analysis and what the loss of the Long-Form Census could mean for evidence-based decision making in communities across Canada.

The Canadian Government cuts the Long-Form Census, creates a survey that costs $35 million for less reliable data and then cuts the agency back again by $7 million!

Canadian Press: Troubled StatsCan facing $7M in cuts

Hamilton Spectator: StatsCan to cut 5 more surveys

The article lists the following surveys. I think I have the correct links, but I am unsure:

  1. The Industrial Pollutant Release Survey (I cannot find a link)
  2. The article says The Quarterly Energy Use  (Households and the Environment: Energy Use or Quarterly Industrial Consumption of Energy Survey which one?) and the  Greenhouse Gas Emissions Survey (Greenhouse Gas Emissions from Private Vehicles in Canada, 1990 to 2007 or Greenhouse Gas Emissions Report which one?) both pilot projects;
  3. The National Population Health Survey;
  4. The Survey of the Suppliers of Business Financing; and
  5. The Survey on Financing of Small and Medium-Sized Enterprises.

Information may or may not yearn to be free — but you shouldn’t have to pay to get it from the government.

Since it’s yours to begin with.

That was the unanimous conclusion of a GTEC panel discussion Tuesday on the implications of “open data,” including the potential for government departments to earn revenue from it.

Ottawa Citizen: ‘Open data’ should mean end to fees: GTEC panel

From the Toronto Star

Toronto Sun
by Patrick Corrigan

Theo Moudakis/Toronto Star
By Theo Moudakis


By Michael de Adder
