datalibre.ca

cities of knowledge conference – dublin

November 18, 2007 in conference, government, infrastructure, policy, projects by Hugh | Permalink

If you are in Dublin, the Cities of Knowledge conference on November 20 looks interesting:

Cities of Knowledge
An International eGovernment/Public Sector Knowledge Management event, co-organised by Dublin City Council and DIT.

The event is part of ICiNG (Innovative Cities for the Next Generation) which is a project funded through the European 6th Framework Research programme. It aims to develop effective e-communities and e-access to city administration.

The project is based in Dublin, Barcelona, and Helsinki. Each city is providing ‘City Laboratory’ test-bed sites in strategic development/city regeneration locations where users will trial and evaluate technologies and services.

Speakers include:
Jon Udell, Technology Evangelist, Microsoft
Graham Colclough, Vice President, Capgemini
Martin Curley, Head of Innovation, Intel
Prof John Ratcliffe, DIT Futures Academy
Mark Wardle, Head of Innovation Programmes, BT

The agenda is here.

DataNet – Data Archive

October 4, 2007 in academia, infrastructure by Tracey | 5 comments

Imagine a cyberinfrastructure that builds a data archive! Well, the National Science Foundation (NSF) in the US has a massive call for proposals to build just that: a Sustainable Digital Data Preservation and Access Network Partners (DataNet). I am so jealous of those folks! Canada has no equivalent to an NSF and does not invest in the future access of data at all! The Canadian Digital Information Strategy document will be released for public consultation in October, but it is nowhere near as comprehensive as the Cyberinfrastructure work. The Cyberinfrastructure Vision for 21st Century Discovery is well worth reading.

Call for proposals info via – Jon Udell
Musings (about infrastructures) from my reading of Cyberinfrastructure documents some time ago: Infrastructure learning, More, even more, public reason!

spacing

October 2, 2007 in canada, infrastructure, projects, web by Hugh | Permalink

Spacingmontreal.ca and spacingtoronto.ca are:

your hub for daily dispatches from the streets of Toronto/Montreal to cities around the world, offering both analysis and a forum for discussion. Our contributors examine city hall, architecture, urban planning, public transit, transportation infrastructure and just about anything that involves the public realm of our cities.

Both blogs are published by spacing magazine.

Wearable knitted data! I want one!

September 6, 2007 in infrastructure by Tracey | Permalink

Damn! Just-in-time delivery manufacturing meets data!

Newsknitter

is a data visualization project which focuses on knitted garments as an alternative medium to visualize large scale data.

The production of knitted garments is a highly complex process which involves computer support at various steps starting with the designs of both the fabric and the shape of garments until they are ready-to-wear. In recent years, technical innovations in machine knitting have especially focused on the patterning facilities. The patterns are designed by individuals generally depending on the current trends of fashion and the intended target markets and multiplied through mass production. News Knitter translates this individual design process into a world-wide collaboration by utilizing live data streams as a base for pattern generation. Due to the dynamic nature of live data streams, the system generates patterns with unpredictable visuality.

News Knitter is initiated as a quest for an alternative medium to visualize live data streams. The key motive is to translate digital information into the language of the physical world.

News Knitter converts information that is gathered from the daily political news into clothing. Live news feed from the Internet that is broadcasted within 24 hours is analyzed, filtered and converted into a unique visual pattern for a knitted sweater. The system consists of software that receives the content from live feeds, another software that converts it into visual patterns and a fully computerized flat knitting machine that produces the final output. Each product, sweater of News Knitter will be an evidence/result of a specific day.

The last thing I need is more stuff, but really, data and clothing – how seductive is that!

Via: information aesthetics

Tele Atlas is mapping Toronto’s Streets in 3d

August 9, 2007 in canada, datasets, infrastructure, Uncategorized by Tracey | 1 comment

This CBC.ca video gives a brief on how 2d and 3d street view data are collected. In this case it is the city of Toronto and the data collector is Tele Atlas. The things cartographers do to make maps! Tele Atlas seems to be selling georeferenced landmarks, street networks, and a variety of other data it collects simply by driving the streets with cameras and GPS mounted on the roofs of cars. At 500 km a day and terabytes of data, these folks are collecting and selling tons of geo-information that we like to play with on Google Earth, that helps us find places in MapQuest, and that allows city planners or police forces to prepare evacuation plans, understand the characteristics of the route planned for a protest or know the point address in a 911 call.

The video also briefly discusses privacy issues. It seems the street is public space, and if you happen to be naughty going into some tawdry establishment and your act happens to be caught on film, well, so be it: either behave or accept the digital consequences of your private acts in public space, or so the video suggests!

Regarding access to these data, well, my guess is a big price tag. It is a private company after all!

Aging Infrastructure Data

August 3, 2007 in infrastructure by Tracey | 3 comments

I have been paying attention to infrastructures lately. More recently, I seem to be coming across more stories about infrastructural failures: Submarine Cables in Asia, or in the Ring of Fire. The most recent failure was in Minneapolis. Today’s Globe and Mail online has a story that links to some AP video data, and this one in particular – U.S. Infrastructure under scrutiny – does a good review of how engineers gather their primary data, the nature of that data, and the making of safety reports. Seems like those reports get shelved a lot! William Ibbs from UC Berkeley, an expert on construction risk, said it well with a knowing smirk on his face:

well, ah, we’ve had had ah maybe some other social priorities for the past few years in the nation and public works have taken, ah, a bit of a back seat.

The map below shows the distribution of deficient bridges in the US. I thought I was hearing more stories and this data seems to support that my assumptions were not entirely off base!

Then I wondered about Canada, so I did some superficial digging and found the following report – The Age of Public Infrastructure, produced by Statistics Canada. The great thing about all of their reports is that you can access their methodology documents, data sources and contacts, which is great educational material for amateur data geeks who want to collect data themselves and want to find a systematic and statistically sound way to do so. I also found an Infrastructure Canada report that discusses the Government’s Infrastructure Assets and their management. The collapse in Minneapolis created a media context and receptivity on the subject, as seen here – Canada’s infrastructure needs urgent attention – while some specialized think tanks look at particular infrastructures related to investment and stock prices in the energy industry – Aging Energy Infrastructure Could Drive Molybdenum Demand Higher – which is loaded with data particular to engineers in that field.

Why talk about that here? Well, mostly because infrastructure is a boring thing that we rarely think about, yet there is a ton of citizen money locked into these very huge material physical artefacts. Also because there is little citizen-generated data on the topic, and the data available or the decisions that are being made rarely have a price tag or the name of the responsible agent attached to them! Yet without infrastructure we cannot function! Infrastructure is what distinguishes a good city to live in from a not-so-good city to live in, and, well, infrastructure is an inseparable part of our human habitat.

Imagine a concerted effort by citizens to collect data about satellite dishes, or receiving ground stations, server farms, ISP offices, aging bridges, cool sewers, following the complete cycle of one’s local water purification plant, or telephone switching station, where one’s poo goes once flushed, where one’s data is stored, and sharing and visualizing all that data on a map. We are starting to see some really interesting adventurer/art urban exploration projects, or how some boyz are navigating the 3d elements of a city’s hardware in parkour. I love stuff like this Pothole reporter. Could we develop collaborative tools to report missing manhole covers, Ottawa’s thriving roadside ragweed cultivations, where the public washrooms are/are not along with public water fountains, Montreal’s missing trees in sidewalk planters (Michael’s idea on location portal content gathering) and so on?

New Social Science Data Infrastructure for Ontario Libraries

July 25, 2007 in academia, Access, canada, datasets, infrastructure, Uncategorized by Tracey | Permalink

I met with Wendy Watkins at the Carleton University Data Library yesterday. She is one of the founders and current co-chair of DLI and CAPDU (Canadian Association of Public Data Users), a member of the governing council of the International Association of Social Science Information Service and Technology (IASSIST) and a great advocate for data accessibility and whatever else you can think of in relation to data.

Wendy introduced me to a very interesting project that is happening between and among university libraries in Ontario called the Ontario Data Documentation, Extraction Service Infrastructure Initiative (ODESI). ODESI will make discovery, access and integration of social science data from a variety of databases much easier.

Administration of the Project:

Carleton University Data Library in cooperation with the University of Guelph. The portal will be hosted at the Scholar’s Portal at the University of Toronto, which makes online journal discovery and access a dream. The project is partially funded by the Ontario Council of University Libraries (OCUL) and OntarioBuys, operated out of the Ontario Ministry of Finance. It is a 3-year project with $1 040 000 in funding.

How it works:

ODESI operates on a distributed data access model, where servers that host data from a variety of organizations will be accessed via Scholars’ Portal. The metadata are written in the DDI standard which produces XML. DDI is the

Data Documentation Initiative [which] is an international effort to establish a standard for technical documentation describing social science data. A membership-based Alliance is developing the DDI specification, which is written in XML.

The standard has been adopted by several international organizations such as IASSIST, Interuniversity Consortium for Political and Social Research (ICPSR), Council of European Social Science Data Archives (CESSDA) and several governmental departments including Statistics Canada, Health Canada and HRSDC.
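To make the idea of DDI metadata a little more concrete, here is a minimal sketch in Python that reads a study title and identifier out of a small DDI-style XML record. The sample record is invented and heavily simplified (no namespaces, only a handful of elements modelled loosely on the DDI codebook layout), and the function name `read_study_metadata` is hypothetical. It is an illustration only, not ODESI's actual code.

```python
import xml.etree.ElementTree as ET

# Invented, simplified DDI-style record (assumption: real records are
# namespaced and far richer than this).
SAMPLE_DDI = """
<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Example Household Survey</titl>
        <IDNo>EX-0001</IDNo>
      </titlStmt>
    </citation>
  </stdyDscr>
</codeBook>
"""


def read_study_metadata(xml_text):
    """Return the title and ID of the study described in xml_text."""
    root = ET.fromstring(xml_text)
    # Walk down from the root element to the title statement.
    title_stmt = root.find("stdyDscr/citation/titlStmt")
    return {
        "title": title_stmt.findtext("titl"),
        "id": title_stmt.findtext("IDNo"),
    }


print(read_study_metadata(SAMPLE_DDI))
```

Because every record follows the same documented layout, the same few lines could in principle read metadata from any participating archive, which is the point of agreeing on a standard.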

Collaboration:

This project will integrate with and is based on the existing and fully operational Council of European Social Science Data Archives (CESSDA), which is a cross-boundary data initiative. CESSDA

promotes the acquisition, archiving and distribution of electronic data for social science teaching and research in Europe. It encourages the exchange of data and technology and fosters the development of new organisations in sympathy with its aims. It associates and cooperates with other international organisations sharing similar objectives.

The CESSDA Trans-Border Agreement and Constitution are very interesting models of collaboration. CESSDA is the governing body of a group of national European Social Science Data Archives. The CESSDA data portal is accompanied by a multilingual thesaurus; currently 13 nations and 20 organizations are involved, and data from thousands of studies are made available to students, faculty and researchers at participating institutions. The portal search mechanism is quite effective, although not pretty!

In addition, CESSDA is associated with a series of National Data Archives. Wow! Canada does not have a data archive!

Users:

Users would come to the portal, search across the various servers on the metadata fields, and access the data. Additionally, users will be provided with some tools to integrate myriad data sets and conduct analyses with the use of statistical tools that are part of the service. For some of the data, basic thematic maps can also be made.
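The search step described above can be sketched roughly as follows. This is a toy model under stated assumptions: the "servers" are plain in-memory lists, the records and field names are invented, and `search_all` is a hypothetical helper, not part of ODESI or the Scholars Portal.

```python
# Hypothetical in-memory stand-ins for the distributed servers.
SERVERS = {
    "server_a": [
        {"id": "A-1", "title": "Labour Force Survey",
         "keywords": ["employment", "income"]},
        {"id": "A-2", "title": "Housing Conditions Study",
         "keywords": ["housing"]},
    ],
    "server_b": [
        {"id": "B-1", "title": "Youth Employment Poll",
         "keywords": ["employment", "youth"]},
    ],
}


def search_all(servers, field, term):
    """Search one metadata field on every server and merge the hits."""
    term = term.lower()
    hits = []
    for name, records in servers.items():
        for record in records:
            value = record.get(field, "")
            # A field may hold a single string or a list of strings.
            values = value if isinstance(value, list) else [value]
            if any(term in v.lower() for v in values):
                hits.append((name, record["id"], record["title"]))
    return hits


for hit in search_all(SERVERS, "keywords", "employment"):
    print(hit)
```

The user asks one question and gets back one merged list, without needing to know which institution hosts which study.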

Eventually the discovery tools will be integrated with the journal search tools of the Scholar’s Portal. You will be able to search for data and find the journals that have used that data or, vice versa, find the journal and then the data. This will hugely simplify the search and integration process of data analysis. At the moment, any data-intensive research endeavour or data-based project needs to dedicate 80-95% of the job to finding the data in a bunch of different databases, navigating the complex licensing and access regimes, maybe paying a large sum of money, and organizing the data in such a way that it is statistically accurate, before making any comparisons. Eventually one gets to talk about results!

Data Access:

Both the CESSDA data portal project and ODESI are groundbreaking initiatives that are making data accessible to the research community. These data however will only be available to students, faculty and researchers at participating institutions. Citizens who do not fall into those categories can only search the metadata elements, see what is available but will not get access to the data.

Comment:

It is promising that a social and physical infrastructure exists to make data discoverable and accessible between and among national and international institutions. What is needed is a massive cultural shift in our social science data creating and managing institutions that would make them amenable to the creation of policies to unlock these same public data assets, some of the private sector data assets (Polls, etc.) and make them freely (as in no cost) available to all citizens.

Cost Recovery Policies are NOT Synonymous with Data Quality

July 17, 2007 in academia, Access, datasets, infrastructure, policy, Uncategorized by Tracey | Permalink

One of the great data myths is that cost recovery policies are synonymous with higher data quality. Often the myth making stems from effective communications from nations with heavy cost recovery policies such as the UK, which often argue that their data are of better quality than those of the US, which has open access policies. Canada, depending on the data and the agencies they come from, is at either end of this spectrum and often in between.

I just read an interesting study that examined open access versus cost recovery for two framework datasets. The researchers looked at the technical characteristics and use of datasets from jurisdictions of similar socio-economic status, size, population density, and government type (the Netherlands, Denmark, the German state of North Rhine-Westphalia, the US state of Massachusetts and the US metropolitan region of Minneapolis-St. Paul). The study compared parcel and large scale topographic datasets typically found as framework datasets in geospatial data infrastructures (see SDI def. page 8). Some of these datasets were free, some were extremely expensive, and all were under different licensing regimes that defined use. They looked at both technical (e.g. data quality, metadata, coverage, etc.) and non-technical characteristics (e.g. legal access, financial access, acquisition procedures, etc.).

For Parcel Datasets, the study found that datasets assembled by a centralized authority were judged to be technically more advanced. Datasets that required assembly from multiple jurisdictions were of higher quality when standards or a central integrating institution were in place, while those from multiple jurisdictions without standards were of poor quality, as the sets were not harmonized and/or coverage was inconsistent. Regarding non-technical characteristics, many datasets came at a high cost, most were not easy to access from one location, and there were a variety of access and use restrictions on the data.

For Topographic Information, the technical averages were less than ideal. On non-technical criteria, access was impeded in some cases by the involvement of utilities (with a tendency toward cost recovery); in other cases multiple jurisdictions – over 50 for some – needed to be contacted to acquire complete coverage, and in some cases coverage was simply not complete.

The study’s hypothesis was:

that technically excellent datasets have restrictive-access policies and technically poor datasets have open access policies.

General conclusion:

All five jurisdictions had significant levels of primary and secondary uses but few value-adding activities, possibly because of restrictive-access and cost-recovery policies.

Specific Results:

The case studies yielded conflicting findings. We identified several technically advanced datasets with less advanced non-technical characteristics…We also identified technically insufficient datasets with restrictive-access policies…Thus cost recovery does not necessarily signify excellent quality.

Although the links between access policy and use and between quality and use are apparent, we did not find convincing evidence for a direct relation between the access policy and the quality of a dataset.

Conclusion:

The institutional setting of a jurisdiction affects the way data collection is organized (e.g. centralized versus decentralized control), the extent to which data collection and processing are incorporated in legislation, and the extent to which legislation requires use within government.

…We found a direct link between institutional setting and the characteristics of the datasets.

In jurisdictions where information collection was centralized in a single public organization, datasets (and access policies) were more homogenous than datasets that were not controlled centrally (such as those of local governments). Ensuring that data are prepared to a single consistent specification is more easily done by one organization than by many.

…The institutional setting can affect access policy, accessibility, technical quality, and consequently, the type and number of users.

My Observations:
It is really difficult to find solid studies like this one that systematically look at both technical and access issues related to data. It is easy to find off-the-cuff statements without sufficient backup proof though! While these studies are a bit of a dry read, they demonstrate the complexities of the issues, try to tease out the truth, and reveal that there is no one-stop shopping for data at any given scale in any country. In other words, there is merit in pushing for some sort of centralized, standardized and interoperable way – which could also mean distributed – to discover and access public data assets. In addition, there is an argument to be made for making those data freely (no cost) accessible in formats we can readily use and reuse. This of course includes standardizing licensing policies!

Reference: Institutions Matter: The Impact of Institutional Choices Relative to Access Policy and Data Quality on the Development of Geographic Information Infrastructures, by Van Loenen and De Jong, in Research and Theory in Advancing Data Infrastructure Concepts, edited by Harlan Onsrud, ESRI Press, 2007.

If you have references to more studies send them along!

Statistical Data in Schools and Public Libraries

July 11, 2007 in Access, canada, datasets, infrastructure by Tracey | Permalink

Datalibre.ca received an excellent comment on the DLI post about access to some of the Statistics Canada data in schools and public libraries. Today I am looking at E-STAT online and am quite impressed, but alas I have not yet gone to a public library to check out what is actually there and what I can do. Nor do I know the limitations of CANSIM data. I did however speak on the phone with a fine librarian at the Main Ottawa Public Library this morning and look forward to digging for data later on today or tomorrow.

E-STAT is:

Statistics Canada’s interactive learning tool designed with the needs and interests of the education community in mind. E-STAT offers an enormous warehouse of reliable and timely statistics about Canada and its ever-changing people.

Using approximately 2,600 tables from CANSIM*, track trends in virtually every aspect of the lives of Canadians. Updated once a year during the summer, CANSIM contains more than 36 million time series.

Hundreds of schools across the country and Depository Service Program Libraries make these data accessible if you go in person to access them. You can get access to these data online only if you are registered with one of these institutions.

The E-STAT license on the data is quite restrictive.

The Government of Canada (Statistics Canada) is the owner or authorized licensee of all intellectual property rights (including copyright) in the data product referred to as E-STAT. Statistics Canada grants the educational institution a non-exclusive, non-assignable and non-transferable licence to use the data product subject to the terms below.

…

The data product supplied under this agreement shall at all times remain under the control of the institution. It may not be sold, rented, leased, lent, sub-licensed or transferred to any other institution or organization, and may not be traded or exchanged for any other product or service. The data product may not be used for the personal or commercial gain of any authorized user, nor to develop or derive for sale any other data product that incorporates or uses any part of this data product.

The data that are made available are yearly updated Canadian Socio-economic Information Management System (CANSIM) data; the daily updates are sold for commercial purposes. I am also not sure how fine the geography is for E-STAT data, for instance whether the data are available by Dissemination Block (DB), Dissemination Area (DA), Census Tract (CT) or Urban Area (UA) (note the cost associated with these and other maps). These make a difference, since DB is the finest granularity, DA is a larger neighbourhood level, CT covers a larger area, and UAs are larger still. Each scale suits a different level of analysis, and the boundaries, if you aggregate any of these, do not necessarily line up. Additionally, DB and DA are only for the 2006 Census while CT and UA are for others. I am guessing E-STAT is CT-scale data and larger.
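Why granularity matters can be shown with a small sketch. It rolls counts from a fine geography up into larger areas using a lookup table, and reports any fine unit that has no matching larger area, which is the kind of thing that happens when boundaries do not line up. All codes, counts and the `roll_up` helper are invented for illustration; they are not Statistics Canada identifiers or figures.

```python
from collections import defaultdict

# Hypothetical lookup: each fine-grained block belongs to one larger area.
BLOCK_TO_AREA = {
    "block-1": "area-X",
    "block-2": "area-X",
    "block-3": "area-Y",
}

# Hypothetical counts at the finest level; block-4 has no parent area.
BLOCK_COUNTS = {"block-1": 120, "block-2": 80, "block-3": 200, "block-4": 50}


def roll_up(counts, lookup):
    """Sum fine-grained counts into larger areas; list unmatched blocks."""
    totals = defaultdict(int)
    unmatched = []
    for block, count in counts.items():
        area = lookup.get(block)
        if area is None:
            # No parent area found: the boundaries do not line up.
            unmatched.append(block)
        else:
            totals[area] += count
    return dict(totals), unmatched


totals, unmatched = roll_up(BLOCK_COUNTS, BLOCK_TO_AREA)
print(totals, unmatched)
```

Going from fine to coarse is a simple sum when the units nest; going the other way, or across units that do not nest, is not possible without extra assumptions, which is why it matters which scale E-STAT actually offers.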

E-STAT also has some census data, agricultural data, aboriginal survey data, some environmental data and health behaviour data for school-aged children. Clearly not all the data are available, and certainly not the specialized surveys such as business, waste management, household spending surveys, health, the survey of particular sectors etc. The data come with explanations, and teachers’ and users’ guides.

Let’s see what we can get once I make a visit!

Why should government spatial data be free?

July 4, 2007 in Access, infrastructure, openmovement, Uncategorized by Tracey | 2 comments

I tripped over this yesterday while looking for some arguments for and against cost recovery. The arguments are quite good and comprehensive. If any of you can think of more, send them to the civicaccess.ca list or leave comments here.

This text, I believe, was put together by Jo Walsh and colleagues as they were preparing positions for the INSPIRE Directive that became official May 7, 2007. Public Geo Data put together a great campaign, an online petition, a discussion list and superb material to lobby EUROGI for Free and Open Access to Geo Data. At the time the UK was pushing heavily for the Ordnance Survey’s extreme cost recovery model for the EU while other European nations were working towards more open and free access models. You can read more about it by going through the archive of their mailing list.

Here is the full text for Why Should Government Spatial Data be Free?
