Notes from the workshop.
Initial Agenda
- Introductions
- Set agenda and outline for the day
People
- Martin – software engineer. Interested in design and how government works.
- Chrastian – Ontotext. Interested in open semantic data. http://www.ontotext.com/
- Elena – from Sofia University (teaches sociology). Teaches a course on content analysis. Excited that there is growing interest in public data. You can process a lot but need a purpose.
- Martin – OpenStreetMap’er. How can we integrate with other data, e.g. missing people?
- Ivo – works for Ontotext
- Galia – just interested in open data
- Bogdan – software author. Curiosity!
- Peio – legal adviser by day, IT background. Curious.
- Plamen – ex-software engineer. Aggregating data from the Bulgarian parliament.
- Alex – interested in using new technologies, electronics and music!
- Stoian Mishinev – IT Specialist
- Yana Petrova – journalism student
Agenda
- Data (and problem) mapping
- Problems with getting data
- Tools for working with data and developing a community around it (using it)
Summary
Gov data mapping
- Legislation
- Finances
- Civic info
- Transport
- Geodata
- News / Gazettes
Government structure in Bulgaria
- Central Gov – executive, parliament and courts
- Regions (28)
- Municipalities (cities are sometimes municipalities by themselves)
- Districts (possibly)
- (mayors in smallest villages)
Legal status of gov material (e.g. legislation): ЗАПСП (Law on Copyright and Neighbouring Rights), http://lex.bg/bg/laws/ldoc/2133094401, Article 4, item 4: not subject to copyright.
Question: how far does this extend? Does it cover all gov documents?
The Law
The law: published in the State Gazette, mostly online (HTML and PDF)? http://dv.parliament.bg
Public procurement
Fourth tab on http://dv.parliament.bg/ (no direct link because the site has no URLs!)
Parliamentary
Committee debates: http://www.parliament.bg/bg/parliamentarycommittees/members/226/steno
Plenary sessions debates: http://www.parliament.bg/bg/plenaryst
Legal decisions
Local stuff
Finances
Have CKAN package: http://ckan.net/package/bg-budget
Transport
- Trains: publicly owned
- Trams in Sofia: publicly owned
- Bus: part private / part public http://www.sofiatraffic.bg/
- Subway: publicly owned
Civic Info (Health, Education etc)
Company Register
The company register was publicly available until 2011. At some point in 2011 it was closed, and access to it is now available only for a fee.
[ACTION: Peio - get old dump and analysis and add to relevant CKAN dataset]
Geodata and Cadastral
Problems getting data
- Gov objections to giving out data (and what can you do about it).
- Data format
- Data persistence
- Data quality
[ACTION: Peio - clarify scope of public domain provision for gov data (is this just legislation and gov documents or all gov data?)]
What do we do about PDF?
- Ask – directly or via http://isitopendata.org/
- Find a contact if you can
- Find out what the worries are …
- Transcribe
- Find tools – http://getthedata.org/questions/339/excel-table-from-a-pdf
[ACTION: Rufus Pollock: ask Julian Todd to write up instructions on PDF parsing based on UNDemocracy experience]
Tools and Communities
Basic process:
- Extract
- Transform (clean and integrate)
- Load
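A minimal sketch of the three steps above, using only the Python standard library. Everything specific here is an assumption made for illustration: the sample rows, the column names (municipality, year, amount), the number format, and the choice of an in-memory SQLite table as the place to load into. Real gov data will need its own cleaning rules.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: stray spaces, mixed casing, comma decimals,
# a duplicate row and a row with a missing amount.
RAW_CSV = """municipality,year,amount
 Sofia ,2011,"1 200,50"
plovdiv,2011,"300,00"
Sofia,2011,"1 200,50"
Varna,2011,
"""


def extract(text):
    """Extract: read the raw rows from CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, parse amounts, drop blanks and duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        name = row["municipality"].strip().title()
        raw_amount = row["amount"].strip()
        if not raw_amount:
            continue  # no amount given, nothing to load
        # "1 200,50" -> 1200.5 (space as thousands separator, comma as decimal)
        amount = float(raw_amount.replace(" ", "").replace(",", "."))
        record = (name, int(row["year"]), amount)
        if record in seen:
            continue  # exact duplicate after cleaning
        seen.add(record)
        cleaned.append(record)
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS budget "
        "(municipality TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO budget VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
records = transform(extract(RAW_CSV))
load(records, conn)
```

Keeping the steps as separate functions means the transform rules can be changed and re-run without fetching the source again, which helps when the source itself is not persistent.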
Tools:
Proprietary but free (in some form or other):
- Google Docs and Google Fusion Tables
- Google Refine
- Tableau, Needlebase …
Ideas / Wanted
- crowdsourcing the collection of all the Bulgarian legislative data
- extract structured info from plenary and committee debates
- list of municipalities
- http://wiki.openspending.org/Countries – find volunteers to populate data for the Bulgarian budget
- on-time stats for public transport
- Wi-Fi locations
- ‘Tell me about my area’: on my phone (on Facebook even!)