canada


The issue of public access to government data has a number of components: availability (is it available?), format (is it in a usable/open format?), cost (is it free?), and copyright (do I need permission to use it, may I do with it what I wish?).
one cent
The City of Toronto has recently launched a campaign to get more money for cities from the Federal government, asking for one cent from the GST. The campaign is called: onecentnow.ca, and uses the Canadian penny in ads and on their web site.

They’ve received a retroactive bill from the Royal Canadian Mint for $47,000+ for use of the image of the Canadian penny, and for use of the words “one cent” (!).

There are political/moral issues here about how government agencies use (or abuse) existing laws. Notably, the Royal Canadian Mint is a crown corporation that answers to the federal government, and the federal government is a target of the onecentnow.ca campaign, so this retroactive charge could be interpreted as politically motivated. Perhaps not.

And of course there are policy issues about how Crown Copyright ought to be used, or whether it should exist at all. In the USA, for instance, federal government documents, designs and publications are de facto in the public domain.

But other than these abstract concerns, there is a more crucial point: the Mint appears to be on the wrong side of the Canadian Copyright Act. As Howard Knopf points out in Excess Copyright, Canadian copyright law provides copyright protection until 50 years after the death of the creator. Crown Copyright extends 50 years after date of publication.

The Canadian penny was designed by G.E. Kruger Gray in 1937. He died in 1943, meaning that the design for the Canadian penny went into the public domain 50 years later, in 1993. Which means that no one, including the Royal Canadian Mint, can claim ownership of the image, much less charge for its use.

He notes further that it seems unlikely that any court would agree with the Mint that they own a copyright or trademark on the words “one cent.”

So it seems possible that the Royal Canadian Mint has developed an Intellectual Property policy that is claiming – and charging for – ownership where none exists.
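For anyone who wants to check the dates, here is a minimal sketch in Python of the term arithmetic described above. It applies the simplified rule as stated in this post (death year or publication year plus 50) and ignores finer points such as the exact date on which a term ends. The function names are made up for illustration, and treating 1937 as the publication year for the Crown Copyright case is an assumption.

```python
TERM_YEARS = 50  # term quoted above for both rules


def author_term_expiry(death_year):
    """Year the term ends under the 'death of the creator + 50' rule."""
    return death_year + TERM_YEARS


def crown_term_expiry(publication_year):
    """Year the term ends under the 'publication + 50' Crown Copyright rule."""
    return publication_year + TERM_YEARS


if __name__ == "__main__":
    # G.E. Kruger Gray died in 1943.
    print("Creator rule:", author_term_expiry(1943))  # 1993
    # Assumption: the 1937 design year is treated as the publication year.
    print("Crown rule:", crown_term_expiry(1937))  # 1987
```

Under either rule, the term would have run out well before the onecentnow.ca campaign.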

Spacingmontreal.ca and spacingtoronto.ca are:

your hub for daily dispatches from the streets of Toronto/Montreal to cities around the world, offering both analysis and a forum for discussion. Our contributors examine city hall, architecture, urban planning, public transit, transportation infrastructure and just about anything that involves the public realm of our cities.

Both blogs are published by spacing magazine.

There is an interesting article in the Globe today by Eric Sager, a professor of history at the University of Victoria, about access to the names of respondents to Censuses gone by and those in the future.

I consider the privacy aspects of the Census to be sacred and so does StatCan. I fill it out because I know I am anonymous and that the data will be aggregated and therefore not traced back to my personal address. Many people feel the same way; recall the Lockheed Martin online Census debacle. Fortunately for Canadians, we do not live in Nazi Germany, Stalin’s Ukraine, or Idi Amin’s Uganda, where Censuses were explicitly used to target, kill or expel ‘undesirable’ populations or to mask the death toll of massive mistakes. Censuses can be and have been used to trace and target people on the basis of ethnicity, religion, sexual orientation, or race. This 2006 Census year included a question as to whether or not we would be willing to give consent to sharing our private information 92 years from now. I responded with an educated no.

Historians and genealogists argue that past census respondents’ names should be made available and that we should have future access to current censuses:

The census is the only complete inventory of our population, an indispensable historical record of the Canadian people. It’s critical to genealogy, our most popular form of history. Of all visitors to our national archives today, half are doing genealogical research. If you had ancestors in Canada in 1901 or 1911, you can find them in the censuses of those years, online from Library and Archives Canada. Your children will also be able to find their grandparents and great-grandparents in the censuses of the past century — but only after a legally mandated delay of 92 years.

Seems like our friends to the south are sharing their Census information, as the U.S. Census information is released

through their National Archives after a delay of 72 years. They apply the principle of “implied consent” — a principle well known to privacy experts. When completing their census forms, Americans are consenting to the present-day use of their information by the Census Bureau, and to its use by other researchers in the distant future. Americans do not complain about the future use of their information, and there is no evidence that public release after 72 years has made them reluctant to participate.

Spammers and telemarketers have been using “implied consent” when they send me unsolicited email garbage, drop popups on my computer or call my home to sell me stuff. I have to say there are dubious elements to this concept. I do however like the concept of informed consent and think the Census had it right by leaving it up to census respondents to decide if they wish to share their personal information with future generations of researchers or potentially less progressive political regimes (see the question and your options). StatCan even provided a very extensive section on the historical and genealogical position. See the informed consent Question 8 on the short form and Question 53 on the long form. These are perfectly legitimate questions supported with a ton of explanatory text, and they are a perfect compromise in this debate.

Prof. Sager makes a compelling argument for access to this private information, but he believes we should give up our right to informed consent, and that we are not smart enough to understand on our own the importance of historical and genealogical research. I vehemently disagree with these points. He does however correctly point out the importance of the Census for research and decision making.

I would like to have free – as in no cost – access to the non-private Census data and maps in the same way we have free access to the forms and the methodological guides. Now that, along with informed consent, is what a democracy looks like!

Check out the Statistics Canada Canadian Environmental Sustainability Indicators report.

It discusses three main indicators:

Air quality indicator tracks Canadians’ exposure to ground-level ozone—a key component of smog and one of the most common and harmful air pollutants to which people are exposed.

The greenhouse gas emissions indicator tracks the annual releases of the six greenhouse gases that are the major contributors to climate change. The indicator comes directly from the greenhouse gas inventory report prepared by Environment Canada for the United Nations Framework Convention on Climate Change and the Kyoto Protocol.

The freshwater quality indicator reports the status of surface water quality at selected monitoring sites across the country. For this first report, the focus of the indicator is on the protection of aquatic life, such as plants, invertebrates and fish.

The report also has links in the references to some of the data used to build these indicators. Also check out the methodology section to get the lowdown on how to use these data. Perhaps some data and ideas to play with!

But now that we know that

the three indicators reported here raise concerns for Canada’s environmental sustainability, the health and well-being of Canadians, and our economic performance. The trends for air quality and greenhouse gas emissions are pointing to greater threats to human health and the planet’s climate. The water quality results show that guidelines are being exceeded, at least occasionally, at most of the selected monitoring sites across the country.

What do we do?

This CBC.ca video gives a brief on how 2D and 3D street view data are collected. In this case it is the city of Toronto and the data collector is Tele Atlas. The things cartographers do to make maps! Tele Atlas seems to be selling georeferenced landmarks, street networks, and a variety of other data it collects simply by driving the streets with cameras and GPS mounted on the roofs of cars. At 500 km a day and terabytes of data, these folks are collecting and selling tons of geo-information that we like to play with on Google Earth, that helps us find places in MapQuest, and that allows city planners or police forces to prepare evacuation plans, understand the characteristics of the route planned for a protest or know the point address in a 911 call.

The video also briefly discusses privacy issues. It seems the street is public space, and if you happen to be naughty going into some tawdry establishment and your act happens to be caught on film, well, so be it. Either behave or accept the digital consequences of your private acts in public space, or so the video suggests!

Regarding access to these data, well, my guess is a big price tag. It is a private company after all!

I met with Wendy Watkins at the Carleton University Data Library yesterday. She is one of the founders and current co-chair of DLI and CAPDU (Canadian Association of Public Data Users), a member of the governing council of the International Association of Social Science Information Service and Technology (IASSIST) and a great advocate for data accessibility and whatever else you can think of in relation to data.

Wendy introduced me to a very interesting project that is happening between and among university libraries in Ontario called the Ontario Data Documentation, Extraction Service Infrastructure Initiative (ODESI). ODESI will make discovery, access and integration of social science data from a variety of databases much easier.

Administration of the Project:

Carleton University Data Library in cooperation with the University of Guelph. The portal will be hosted at the Scholars’ Portal at the University of Toronto, which makes online journal discovery and access a dream. The project is partially funded by the Ontario Council of University Libraries (OCUL) and OntarioBuys, operated out of the Ontario Ministry of Finance. It is a 3-year project with $ 1 040 000 in funding.

How it works:

ODESI operates on a distributed data access model, where servers that host data from a variety of organizations will be accessed via Scholars’ Portal. The metadata are written in the DDI standard which produces XML. DDI is the

Data Documentation Initiative [which] is an international effort to establish a standard for technical documentation describing social science data. A membership-based Alliance is developing the DDI specification, which is written in XML.

The standard has been adopted by several international organizations such as IASSIST, Interuniversity Consortium for Political and Social Research (ICPSR), Council of European Social Science Data Archives (CESSDA) and several governmental departments including Statistics Canada, Health Canada and HRSDC.
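Because the metadata are plain XML, any standard XML parser can search them. Here is a minimal sketch in Python, assuming a heavily simplified and entirely made-up record. The element names are loosely modelled on the DDI Codebook layout, and the study title and variables are invented for illustration; real DDI files use namespaces and many more elements, and none of this is taken from ODESI itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified record loosely modelled on a DDI Codebook
# document. The study and its variables are made up for illustration.
SAMPLE_DDI = """
<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Example Household Survey (made up)</titl>
      </titlStmt>
    </citation>
  </stdyDscr>
  <dataDscr>
    <var name="AGE"><labl>Age of respondent</labl></var>
    <var name="PROV"><labl>Province of residence</labl></var>
  </dataDscr>
</codeBook>
"""


def summarize(ddi_xml):
    """Return the study title and a {variable name: label} mapping."""
    root = ET.fromstring(ddi_xml.strip())
    title = root.findtext("stdyDscr/citation/titlStmt/titl")
    variables = {var.get("name"): var.findtext("labl") for var in root.iter("var")}
    return title, variables


if __name__ == "__main__":
    title, variables = summarize(SAMPLE_DDI)
    print(title)
    for name, label in variables.items():
        print(f"  {name}: {label}")
```

A portal that indexes fields like these across many servers can offer one search box over all of them, which is the appeal of agreeing on a shared metadata standard.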

Collaboration:

This project will integrate with and is based on the existing and fully operational Council of European Social Science Data Archives (CESSDA), which is a cross-boundary data initiative. CESSDA

promotes the acquisition, archiving and distribution of electronic data for social science teaching and research in Europe. It encourages the exchange of data and technology and fosters the development of new organisations in sympathy with its aims. It associates and cooperates with other international organisations sharing similar objectives.

The CESSDA Trans-Border Agreement and Constitution are very interesting models of collaboration. CESSDA is the governing body of a group of national European Social Science Data Archives. The CESSDA data portal is accompanied by a multilingual thesaurus. Currently 13 nations and 20 organizations are involved, and data from thousands of studies are made available to students, faculty and researchers at participating institutions. The portal search mechanism is quite effective although not pretty!

In addition, CESSDA is associated with a series of National Data Archives. Wow! Canada does not have a data archive!

Users:

Users would come to the portal, search across the various servers on the metadata fields, and access the data. Additionally, users will be provided with some tools to integrate myriad data sets and conduct analyses with the statistical tools that are part of the service. For some of the data, basic thematic maps can also be made.

Eventually the discovery tools will be integrated with the journal search tools of the Scholars’ Portal. You will be able to search for data and find the journals that have used that data or, vice versa, find the journal and then the data. This will hugely simplify the search and integration process of data analysis. At the moment, any data intensive research endeavour or data based project needs to dedicate 80-95% of the job to finding the data in a bunch of different databases, navigating the complex licensing and access regimes, maybe paying a large sum of money, and organizing the data in such a way that comparisons are statistically sound, then making those comparisons. Eventually one gets to talk about results!

Data Access:

Both the CESSDA data portal project and ODESI are groundbreaking initiatives that are making data accessible to the research community. These data however will only be available to students, faculty and researchers at participating institutions. Citizens who do not fall into those categories can only search the metadata elements, see what is available but will not get access to the data.

Comment:

It is promising that a social and physical infrastructure exists to make data discoverable and accessible between and among national and international institutions. What is needed is a massive cultural shift in the institutions that create and manage social science data, one that would make them amenable to policies that unlock these same public data assets, along with some private sector data assets (polls, etc.), and make them freely (as in no cost) available to all citizens.

What is the cost to taxpayers of public institutions purchasing public data? As citizens we do not like to pay for the same thing many times. Yet public institutions, whose job it is to work for the public interest, re-purchase data that citizens have already paid for once through taxation. So here is a real scenario and a best-guess estimate of the #s on what taxpayers pay, many times over, for public data:

a) Each Canadian municipality, city or town purchases demographic data from Statistics Canada. Let’s suggest there are approximately 2000 of these entities. Let’s say they each purchase a subset of the Census at varying scales, with a specialized geography to match their boundaries, so let’s say they conservatively spend $ 10 000 each (factoring that some small towns will buy less and others more).

2000 Towns/municipalities/cities * $ 10 000 = $ 20 000 000

b) Since many cities/towns/municipalities do not have efficient data infrastructures to manage their data assets, different departments sometimes purchase the same data two or three times. So you may get planning, health and social welfare departments each purchasing the same data and not sharing, as they are unaware of each other’s purchases and there is no central accessible repository they can mutually search. So let’s pretend that the top 100 (a conservative #) cities in Canada purchase the same/similar data 3 times each. We already counted one purchase above, but we will keep to 3, as some may have purchased 4 times while the other 1900 units may have duplicated a purchase at least once.

100 Towns/municipalities/cities * 3 (duplicate copies of the same data) * $ 10 000 = $3 000 000

c) The best part: often each of these Towns/municipalities/cities is purchasing data for its entire province, as it wishes to do some cross comparisons. This means that each of these entities is paying for the exact same/similar data set each time! Damn! Talk about a non-rivalrous good, and how smart is StatCan? Damn, we thought the public service did not have a corporate mindset!

d) The Provinces and Territories also each purchase Census data. They do not necessarily have a centralized data infrastructure either, and they have bigger bureaucracies, more departments, more specialized needs and bigger data requirements. So let’s suggest that each Province and Territory spends $ 15 000 * 5 duplicate/similar sets, and an additional $ 10 000 each on multiple special orders between censuses.

13 Provinces/Territories * $ 15 000 * 5 = $ 975 000

13 Provinces/Territories * $ 10 000 = $ 130 000

e) Again, many of the Provinces and Territories will purchase National scale datasets for comparison purposes, so, like the Towns/municipalities/cities, they are purchasing the exact same/similar data sets for the exact same geography numerous times. Recall that the great part about information is its non-rivalrousness! We can each consume the same entity many times and none will suffer as a result. Unless of course you are a Canadian Tax Payer.

f) Then we have the Federal Government with approximately 350 departments and agencies, and let’s say each purchases some city data, some provincial data and a whole bunch of national data for $ 17 000 each. Then many, let’s say 175, of these departments and agencies are purchasing special-ordered data sets to meet their particular needs, each at $ 7 500.

350 Federal Departments and Agencies * $ 17 000 = $ 5 950 000

175 Federal Departments and Agencies * $ 7 500 = $ 1 312 500

TOTAL:

  1. 2000 Towns/municipalities/cities * $ 10 000 = $ 20 000 000
  2. 100 Towns/municipalities/cities * 3 (duplicate copies of the same data) * $ 10 000 = $3 000 000
  3. 13 Provinces/Territories * $ 15 000 * 5 = $ 975 000
  4. 13 Provinces/Territories * $ 10 000 = $ 130 000
  5. 350 Federal Departments and Agencies * $ 17 000 = $ 5 950 000
  6. 175 Federal Departments and Agencies * $ 7 500 = $ 1 312 500

Grand Total of Census Data Expenditures by Taxpayers via Public Institutions in Canada: $ 31 367 500
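To make it easy to swap in better numbers, the tally above can be sketched as a few lines of Python. The figures are the same guesses used above, not real costs, and the labels, function names and layout are only illustrative.

```python
# Each line item: (label, number of buyers, copies per buyer, unit cost in dollars).
# All figures are the rough guesses from the estimate above, not measured costs.
LINE_ITEMS = [
    ("Towns/municipalities/cities", 2000, 1, 10_000),
    ("Top 100 cities, duplicate copies", 100, 3, 10_000),
    ("Provinces/Territories, duplicate sets", 13, 5, 15_000),
    ("Provinces/Territories, special orders", 13, 1, 10_000),
    ("Federal departments and agencies", 350, 1, 17_000),
    ("Federal special orders", 175, 1, 7_500),
]


def subtotal(buyers, copies, unit_cost):
    """Cost of one line item."""
    return buyers * copies * unit_cost


def grand_total(items):
    """Sum of all line items."""
    return sum(subtotal(b, c, u) for _, b, c, u in items)


def format_dollars(amount):
    """Format with spaces as thousands separators, e.g. '$ 975 000'."""
    return "$ " + f"{amount:,}".replace(",", " ")


if __name__ == "__main__":
    for label, buyers, copies, unit_cost in LINE_ITEMS:
        print(f"{label}: {format_dollars(subtotal(buyers, copies, unit_cost))}")
    print("Grand total:", format_dollars(grand_total(LINE_ITEMS)))
```

Change any entry in `LINE_ITEMS` (say, 1500 municipalities instead of 2000) and the grand total updates accordingly, which is handy if anyone sends in real costs.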

The grand total above is a conservative number, as it does not include human resource expenditures like the following:

  1. Person hours associated for each public servant to negotiate and discuss their data needs
  2. Person hours for the StatCan officials to fill in the orders
  3. Person hours of the public servant lawyers to take care of licensing
  4. Person hours associated with all of the purchasing and accounting work to pay for, acquire and account for this money
  5. Person hours for each official who has to work the data in the same way to meet their needs
  6. Dunno if public agencies pay taxes on these! That would add insult to injury would it not?

It is also important to note that hospitals, school boards, universities, crown corporations and a host of other quasi-public institutions are doing the same thing. And these numbers are only for census data; they do not include the cost of other datasets like road networks, water quality, maps, environment data and so on.

It would seem to me that we could spend a fraction of that cost to deliver the data online to all of these institutions, the private sector, NGOs, and Citizens, and we would all be better off financially. We would eliminate all the administration costs and the license management costs, and we would all be smarter too! Further, we could reinvest that money into more research, air quality infrastructure, healthcare, waiving recreation fees in municipalities, etc. We could reinvest wisely in quality of life and know more about how to do so at the same time.

PS-If anyone has:

  • come across any type of cost analysis reports etc.
  • has a better way to calculate this
  • knows of some real costs

Please pass them along! The more we have on this the better.

Looks like some of us are using fewer pesticides, purchasing a few more energy efficient and water conservation devices, and composting only very slightly more than before. It also seems we dunno what to do with our toxic waste: we still throw out medicines and electronics in the regular curb pick up. And we still commute to work one person per car, which is too bad since

Passenger transportation accounts for about 12 per cent of Canada’s greenhouse gas emissions and efforts to improve efficiency are a high-profile part of the global warming debate.

Also, sadly we drink way more bottled water than is necessary in a country with an excellent drinking water infrastructure.

It would be great to get a hold of the raw data and play with it. It could be mapped and studied with other variables like income, city versus rural, ethnicity, mother tongue, population density, etc. This type of analysis could help target campaigns in certain under-performing areas and study why others are doing better.

Sources:

Putting Canadian “Piracy” in Perspective, a video from Geist and Albahary, is a great way to present an argument. In Geist’s words:

over the past year, Canadians have faced a barrage of claims painting Canada as a “piracy haven.” This video – the second in my collaboration with Daniel Albahary – moves beyond the headlines to demonstrate how the claims do not tell the whole story.

The video also uses quite a bit of public and private sector data to support its argument. This to me is what public data are for and this is what democracy looks like – when civil society has access to the data it requires to keep its government accountable, can keep citizens informed and can temper industry desires with public interest!

One of the cultural issues that has become pervasive of late is the proliferation of policies and decisions based on assumptions and not on facts. In the case of the very powerful lobby against Canada on IP in the cultural sector, that means really biased reports that are based not on facts but on an industry’s desires and self interests. Look for the sources of the data and the methodology in all reports. Even in this great video! Geist and Albahary do a great job of contrasting what is being said and repeated (memes) about the cultural industry in Canada with the reality.

It is interesting that the video ends with a slide acknowledging the photos used, the music heard, the creators of the video and the license but not all the data sources in the charts! Some of the data references are in some of the bar charts while most statements are referenced with their source at the bottom of the slide. I always look for data references, else how can I go back and verify what was purported!

The data in the charts were:

  • Hollywood Studio Revenue Growth – Data Source unknown
  • Top Hollywood International Markets – Data Source unknown
  • Canadian Music Releases – Statistics Canada
  • Canadian Artist Share of Sales – Canadian Heritage Music Industry Profile
  • Digital Music Download Sales Growth – Data Source unknown
  • Private Copying Revenues 2000-2005 – Data Source unknown
  • RCMP Crime Data – Data Source unknown but assume the RCMP

*************************************
NOTE: See the comments of this post, the references to the data, quotes and reports that were not listed in the credits or with the information in the film are now fully described on Michael Geist’s Blog here.

Datalibre.ca received an excellent comment on the DLI post about access to some of the Statistics Canada data in schools and public libraries. Today I am looking at E-STAT online and am quite impressed, but alas I have not yet gone to a public library to check out what is actually there and what I can do. Nor do I know the limitations of CANSIM data. I did however speak on the phone with a fine librarian at the Main Ottawa Public Library this morning and look forward to digging for data later on today or tomorrow.

E-STAT is:

Statistics Canada’s interactive learning tool designed with the needs and interests of the education community in mind. E-STAT offers an enormous warehouse of reliable and timely statistics about Canada and its ever-changing people.

Using approximately 2,600 tables from CANSIM*, track trends in virtually every aspect of the lives of Canadians. Updated once a year during the summer, CANSIM contains more than 36 million time series.

Hundreds of schools across the country and Depository Service Program Libraries make these data accessible if you go in person to access them. You can get access to these data online only if you are registered with one of these institutions.

The E-STAT license on the data is quite restrictive.

The Government of Canada (Statistics Canada) is the owner or authorized licensee of all intellectual property rights (including copyright) in the data product referred to as E-STAT. Statistics Canada grants the educational institution a non-exclusive, non-assignable and non-transferable licence to use the data product subject to the terms below.

The data product supplied under this agreement shall at all times remain under the control of the institution. It may not be sold, rented, leased, lent, sub-licensed or transferred to any other institution or organization, and may not be traded or exchanged for any other product or service. The data product may not be used for the personal or commercial gain of any authorized user, nor to develop or derive for sale any other data product that incorporates or uses any part of this data product.

The data that are made available are the yearly updated Canadian Socio-economic Information Management System (CANSIM) data; the daily updates are sold for commercial purposes. I am also not sure how fine the geography is for E-STAT data, for instance whether the data are available by Dissemination Block (DB), Dissemination Area (DA), Census Tract (CT), or Urban Area (UA) (note the cost associated with these and other maps). These make a difference, since DB is the finest granularity, DA is a larger neighbourhood level, CT covers a larger area, and UAs are larger still. Each scale suits a different level of analysis, and if you aggregate any of these the boundaries do not necessarily line up. Additionally, DB and DA are only for the 2006 Census while CT and UA are for others. I am guessing E-STAT is CT scale data and larger.

E-STAT also has some census data, agricultural data, aboriginal survey data, some environmental data and health behaviour data for school aged children. Clearly not all the data are available, and certainly not the specialized surveys such as business, waste management, household spending, health, and surveys of particular sectors. The data come with explanations, and with teacher and user guides.

Let’s see what we can get once I make a visit!
