datasets

I was looking for some cross-city comparison data yesterday and recalled the Federation of Canadian Municipalities (FCM) Quality of Life Reporting System (QoLRS).

Developed by FCM, the Quality of Life Reporting System (QOLRS) measures, monitors and reports on the quality of life in Canadian urban municipalities using data from a variety of national and municipal sources.

Starting with 16 municipalities in 1999, the QOLRS has grown to include 22 municipalities, comprising some of Canada’s largest urban centres and many of the suburban municipalities surrounding them.

The FCM’s QoLRS site includes all the documentation, data, metadata and methodologies related to the development of their indicators and the system they have developed.

:: Reports
:: Annexes
:: Indicators

Their data are most impressive. You can download a spreadsheet of the data for each indicator for 1991, 1996 and 2001, and I expect the 2006 QoLRS data will be coming soon. Each variable was also adjusted to the current geographies of amalgamated cities, which makes comparison across time and space possible (see the guide to geographies). This was not easy to do at the time. Each spreadsheet includes the data source, the variable, and a tab that provides the metadata. This means that you can verify what was done, reuse those data or, if you had some money and loads of time, purchase the data pertaining to your city and add them to the indicator system. Unfortunately, the FCM had to purchase these datasets, and it cost them many thousands of dollars.

There are 11 themes and 72 indicators over 3 census periods for 20 cities (Sudbury, Regina, Winnipeg, Niagara, CMQ, Saskatoon, Edmonton, Hamilton, Halifax, Windsor, Toronto, Kingston, London, Ottawa, Vancouver, Waterloo, Halton, Calgary, Peel, York).  Datasets come from:

  • Statistics Canada
  • Canada Mortgage and Housing Corporation
  • Environment Canada
  • the 22 cities themselves
  • Elections Canada
  • Audit Bureau of Circulation
  • Tax Filer Data
  • Human Resources and Development Services Canada
  • FCM Special Surveys
  • Industry Canada
  • Anielski Management (Ecological Footprint)
  • Canadian Centre for Justice Statistics

Putting something like this together is no small feat, so please go check out what is available, play with the data a little, and if you cannot find data for your city, call up your local councillor and ask them to become a member of the QoLRS team! Also let the FCM know they are doing a good job, as this is one way for us Canadians to see what is going on in our cities over time.

From gasbuddy.com:

Now you can see what gas prices are around the country at a glance. Areas are color coded according to their price for the average price for regular unleaded gasoline.

Here is the US map.

[via infosthetics]

Remember hearing about SETI@home? Check out, and download, Gridrepublic.org:

GridRepublic members run a screensaver that allows their computers to work on public-interest research projects when the machines are not otherwise in use. This screensaver does not affect performance of the host computer any more than an ordinary screensaver does.

By aggregating idle resources from users around the world, we create a massive supercomputer.

GridRepublic is built on the system that started as SETI@home, which was turned into a general distributed computing platform, BOINC. GridRepublic is a central place for all projects using this distributed platform, where you can download and install the system and, even better, choose which projects your computer’s idle time will support, including:

Einstein@home: you can contribute your computer’s idle time to a search for spinning neutron stars (also called pulsars) using data from the LIGO and GEO gravitational wave detectors.

BBC Climate Change: The same model that the Met Office uses to make daily weather forecasts has been adapted to run on home PCs. The model incorporates many variable parameters, allowing thousands of sets of conditions. Your computer will run one individual set of conditions– in effect your individual version of how the world’s climate works– and then report back to the research team what it calculates. This experiment was described on the BBC television documentary Meltdown (BBC-4, February 20th, 2006). Note: workunits require several months of screensaver time; faster computers recommended.

Rosetta@home: needs your help to determine the 3-dimensional shape of proteins as part of research that may ultimately contribute to cures for major human diseases such as AIDS / HIV, Malaria, Cancer, and Alzheimer’s.

Proteins@Home: investigating the “Inverse Protein Folding Problem”: Whereas “Protein Folding” seeks to determine a protein’s shape from its amino acid sequence, “Inverse Protein Folding” begins with a protein of known shape and seeks to “work backwards” to determine the amino acid sequence from which it is generated.

Quantum Monte Carlo: Reactions between molecules are important for virtually all parts of our lives. The structure and reactivity of molecules can be predicted by Quantum Chemistry, but the solution of the vastly complex equations of Quantum Theory often require huge amounts of computing power. This project seeks to raise the necessary computing time in order to further develop the very promising Quantum Monte Carlo (QMC) method for general use in Quantum Chemistry.

Donate here.

Science 2.0 — Is Open Access Science the Future?
Is posting raw results online, for all to see, a great tool or a great risk?
By M. Mitchell Waldrop, Scientific American

The first generation of World Wide Web capabilities rapidly transformed retailing and information search. More recent attributes such as blogging, tagging and social networking, dubbed Web 2.0, have just as quickly expanded people’s ability not just to consume online information but to publish it, edit it and collaborate about it—forcing such old-line institutions as journalism, marketing and even politicking to adopt whole new ways of thinking and operating.

Science could be next. A small but growing number of researchers (and not just the younger ones) have begun to carry out their work via the wide-open tools of Web 2.0. And although their efforts are still too scattered to be called a movement—yet—their experiences to date suggest that this kind of Web-based “Science 2.0” is not only more collegial than traditional science but considerably more productive…

read the rest of the article…

Via Zzzoot

Coalition Casualty Count is a site managed by independent US citizens who analytically count the coalition casualties for Operation Iraqi Freedom and Operation Enduring Freedom [Afghanistan]. We attempt to be up to date, precise, accurate and reliable.

There are many other sites on the web that list information of Fatalities from Iraq, but few if any of them do this in an analytical fashion. We endeavor to provide not just a list of names but a resource detailing when, where and how fatalities occurred.

You can read their methodology here. I am always happy when I get to see the data and read how they were assembled, as this gives me the means to critically assess what is being presented to me. I love the myriad visualization tools that are emerging on the net; however, I wish they were accompanied by metadata, which would help me better understand and decide whether or not I trust what is being said to me.

There are a lot of data points and even some maps on this site, and these folks are to be commended for doing this work and telling this important story. There is also a list of the Canadian men and women casualties in Afghanistan.

via: Spatial Sustain

It’s amazing the flowering of data visualization projects – and how well they sometimes bring to life abstract issues.

Here is a beautiful little project, which helps you understand the scale of the financial woes brought on by the subprime mortgage troubles in the US. It’s a complex problem with all sorts of reasons and ramifications, but the simplest explanation is this: in the past decade, banks have been falling over themselves to give out loans to really, really bad credit risks. This means that lots of money that’s gone out in loans isn’t coming back. Which means banks are going to start to fail.

You can see this by asking: how many loan repayments are more than 90 days late? And you could split that out among various banks, and track it over the period from 2002-2007, and see not just how many, but the value of those overdue payments. And if you did that, you’d get this:

bank mortgage

If you made that graph into a little movie over time, you’d be in good shape. Which is what “and still i persist” has done.
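For anyone who wants to try the same kind of tally on their own numbers, here is a minimal Python sketch of the counting described above. It is not the method behind the graph: the record layout, bank names and amounts are all hypothetical and invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (bank, year, days_late, amount_overdue).
# Every value here is invented for illustration only.
LOANS = [
    ("Bank A", 2002, 30, 1_000.0),
    ("Bank A", 2002, 120, 2_500.0),
    ("Bank A", 2007, 95, 4_000.0),
    ("Bank B", 2002, 10, 500.0),
    ("Bank B", 2007, 200, 7_500.0),
    ("Bank B", 2007, 91, 1_500.0),
]


def overdue_by_bank_and_year(loans, threshold_days=90):
    """Return {(bank, year): (count, total_value)} for loans
    that are more than threshold_days late."""
    totals = defaultdict(lambda: [0, 0.0])
    for bank, year, days_late, amount in loans:
        if days_late > threshold_days:
            entry = totals[(bank, year)]
            entry[0] += 1        # how many overdue loans
            entry[1] += amount   # how much money they represent
    return {key: (count, value) for key, (count, value) in totals.items()}


summary = overdue_by_bank_and_year(LOANS)
for (bank, year), (count, value) in sorted(summary.items()):
    print(f"{bank} {year}: {count} loans, {value:.2f} overdue")
```

Each (bank, year) pair in the result would be one point on a chart like this; stepping through the years in order gives the frames of the movie.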

PS time to dump your shares of Wells Fargo, I’d say.

[thanks, as always, to infosthetics]

the world clock.

Check out the new UNdata, the United Nations Data Access System.

The new UN data access system (UNdata) will improve the dissemination of statistics by United Nations Statistics Division (UNSD) to the widest possible audience. An easy to use data access system was developed that meets UNSD’s vision of providing an integrated information resource with current, relevant and reliable statistics free of charge to the global community.

Subsequent stages of the development of the UN data access system will extend to UN system data as well as to data of national statistical offices – providing the user with a simple single-entry point to global statistics.

UNdata

Imagine if we could do that in Canada!

I have a thing about cars, idling and air quality, and I really appreciate it when people develop interesting visualizations and sonifications that make car population issues tangible by using metaphors that make those data meaningful. While this is an HR-intensive and expensive visualization project, it could not have been done without access to some free data, in this case from Madrid Movilidad. I would have liked a bit more metadata and methodological explanation to accompany the visualizations though! Nonetheless, this project reinforces the argument that experimentation and innovation come with free data!

Cascade on Wheels is a visualization project that intends to express the quantity of cars we live with in big cities nowadays. The data set we worked on is the daily average of cars passing by streets, over a year. In this case, a section of the Madrid city center, during 2006. The averages are grouped down into four categories of car types. Light vehicles, taxis, trucks, and buses.

We made two different visualizations of the same data set. We intended not just to visualize the data in a readable way, but also to express its meaning, with the use of metaphors. In the Walls Map piece, car counts are represented by 3D vertical columns emerging from the streets map, like walls. The Traffic Mixer piece, where noise is the metaphor, is an hybrid of a visualization and a sound toy. The first piece focuses more on showing the data in a readable and functional way, while the latter focuses more on expressing the meaning of the data and immersing the user into these numbers. Both pieces try to complete each other.
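To make the Walls Map idea concrete, here is a small Python sketch of how daily average counts could be turned into column heights. It assumes a simple linear scaling, which is my guess and not necessarily what the project did; the street names and counts are hypothetical.

```python
# The four vehicle categories named in the project description.
CATEGORIES = ("light", "taxi", "truck", "bus")

# Hypothetical daily-average counts per street; all numbers are invented.
STREETS = {
    "Street A": {"light": 12000, "taxi": 3000, "truck": 800, "bus": 200},
    "Street B": {"light": 4000, "taxi": 1500, "truck": 300, "bus": 200},
}


def column_heights(streets, max_height=100.0):
    """Scale each street's total daily count linearly so that
    the busiest street gets a column of max_height."""
    totals = {
        name: sum(counts[c] for c in CATEGORIES)
        for name, counts in streets.items()
    }
    busiest = max(totals.values())
    return {name: max_height * total / busiest for name, total in totals.items()}


heights = column_heights(STREETS)
for name, height in heights.items():
    print(f"{name}: column height {height:.1f}")
```

With real counts, the same function could be run per category to draw one wall per vehicle type.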

Check out their videos!

Well, the folks (Matt Ball and Jeff Thurston) over at Spatial Sustain, a Vector 1 Media blog, have a great article on exactly that topic here. The article discusses free data as a platform for economic expansion, how free geospatial data weighed against cost represents a return on investment, and industry creation based on free government data in the US.

Free federal data spurred free market competition. If the data were locked up to begin with, the market would never have taken off. There wouldn’t be the level of investment in technology, and we’d be much poorer in terms of both economic benefit and our knowledge of our world.

A few years back, Gabe Sawhney and I co-prepared, and Gabe gave, the presentation entitled CivicAccess.ca: Democracy in an information age and the need for free and open civic data at Geotec, which was organized by Matt. It is nice to see Matt doing some new stuff.
