An early mockup for the dashboard
I thought it would be useful to provide some notes about my progress with the OKF dashboard.
What to measure? How to measure it?
At one level, tickets are quite specific about what is to be measured:
- Listing, map and by interest
- Basic listing done by BuddyPress
- Interest groups (provided by BuddyPress groups)
- Project listing
- Activity by member and project
- Mailing list activity
- Web analytics
- Repo commits
- Blog posts and comments (WordPress should give us most of this)
- Tickets (closed, opened, currently outstanding …?)
Despite this specificity, there’s still quite a lot of deciding to do. Take the example of changes to code. Perhaps only the relative change over the last few weeks is worth reporting. In that case, we would report that we’re producing five more commits per week than we did last week, which is quite different from providing a simple count of commits. Other questions follow, such as whether all projects have equal weight. Should CKAN extension projects count towards a global CKAN statistic?
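To make the distinction concrete, here is a minimal sketch of counting commits per week versus reporting the week-on-week delta. The dates are hypothetical sample data, not real OKF repository history:

```python
from collections import Counter
from datetime import date

# Hypothetical sample: one date per commit, across all repositories.
commit_dates = [
    date(2011, 1, 3), date(2011, 1, 4), date(2011, 1, 10),
    date(2011, 1, 11), date(2011, 1, 12), date(2011, 1, 14),
]

# Bucket commits by ISO week number: the "simple count" view.
weekly = Counter(d.isocalendar()[1] for d in commit_dates)

# The alternative view: how much each week differs from the previous one.
deltas = {w: weekly[w] - weekly.get(w - 1, 0) for w in sorted(weekly)}
```

The same raw data supports both views; the choice is purely about which number goes on the dashboard.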
From a design perspective, I’ve had quite a lot of difficulty reconciling several points. The sheer scope of the foundation makes it very difficult: we have dozens of mailing lists, code repositories, projects and blogs. It’s possible to get everything onto a page, but then all we have is a long list from which it’s very difficult to get a single view of what’s happening. The other angle is that it would be great to include a map. We have some good geospatial information, such as the locations of the OKF’s members, and plotting things on a map is a simple and attractive way of assembling that information.
The dashboard should be a project that enables people to gather information from it, and the obvious way to do that is to include everything. However, aggregation necessarily coarsens the detail: the granules become too large to meet the needs of the ticket. The requirements call for being able to see quickly what a project or member is up to, yet placing that level of detail on the dashboard obscures what the foundation as a whole is up to.
My original concept was to have a single value that represents “OKF hotness”. It would decrease over time and increase whenever a goal was met. Goals could include website hits, blog posts, emails to mailing lists and so on. I thought the community could have quite a lot of fun creating goals. That way, the impact of the foundation could be measured as a single value that is very easy to compare over time.
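One way the “hotness” idea could work is an exponentially decaying score that jumps when a goal is met. This is only a sketch of the concept; the function names and the half-life constant are my own inventions, not part of any spec:

```python
# A sketch of the "OKF hotness" idea: the score halves after every
# quiet half-life, and each met goal adds points back.

HALF_LIFE_DAYS = 7.0  # assumed tuning constant


def decay(score, days_elapsed):
    """Halve the score for every HALF_LIFE_DAYS days of inactivity."""
    return score * 0.5 ** (days_elapsed / HALF_LIFE_DAYS)


def on_goal(score, points):
    """A met goal (blog post, mailing-list message, ...) adds points."""
    return score + points


score = 100.0
score = decay(score, 7)     # one quiet week halves the score
score = on_goal(score, 10)  # a new blog post bumps it back up
```

The attraction is that the whole foundation collapses to one comparable number, while the community tunes what counts as a goal and how many points it is worth.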
Here are a few thoughts about ways to represent the different fields of information that could appear.
A brainstorm of different images that could be captured and visualised. Circles represent topic areas, triangles represent types of information.
- New members joining
- Membership count
- Member locations
I think it would be interesting to present a delta in addition to whole counts for the membership lists. That way, it’s easier to see how membership changes in response to specific events.
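The delta view is a one-liner over the same totals. The numbers below are hypothetical, just to show the shape of the data:

```python
# Sketch: weekly membership totals (hypothetical numbers) and their deltas,
# so a spike can be lined up against a specific event.
totals = [120, 124, 131, 131, 150]
deltas = [b - a for a, b in zip(totals, totals[1:])]
```

A flat week shows up as a zero and a launch or conference shows up as a spike, which is exactly what the whole counts hide.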
Membership locations are probably most naturally represented as a map. However, a frequency distribution might be less computationally intensive, and it also lets clusters be seen much more easily. For example, on the current map it’s fairly difficult to see that there’s a high concentration of the OKF’s membership in Europe. That is, the current visualisation doesn’t reflect the actual membership distribution as well as it could.
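A frequency distribution is cheap to compute compared with plotting every member on a map. A minimal sketch, assuming the locations have already been pulled out of member profiles (the sample values are made up):

```python
from collections import Counter

# Hypothetical member locations; the real data would come from the
# member profiles mentioned above.
member_locations = ["UK", "Germany", "UK", "France", "US", "UK", "Germany"]

# Sorting by frequency makes a cluster obvious at a glance,
# without rendering a single map tile.
distribution = Counter(member_locations).most_common()
```

Rendered as a simple bar chart, this would make the European concentration visible immediately.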
Mailing list activity
Some of my thinking about how to present mailing lists as a visualisation:
The OKF possesses a large quantity of prosaic material. I thought a good way to represent this would be a feed, similar to how Gmail presents email: show the author and subject header, then fill the remainder of the line with the opening sentence in a lighter typeface.
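The Gmail-style line is simple to produce: a fixed prefix of author and subject, then as much of the body as fits. This is a sketch with an invented helper name, not an existing component:

```python
def feed_line(author, subject, body, width=80):
    """Render one feed entry Gmail-style: author, subject, then as much
    of the opening text as fits (to be styled in a lighter typeface)."""
    prefix = f"{author}  {subject} - "
    snippet = " ".join(body.split())  # collapse runs of whitespace
    room = max(0, width - len(prefix))
    return prefix + snippet[:room]
```

In the real dashboard the lighter typeface would come from CSS rather than truncation, but the structure of the line is the same.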
The problem with this approach is that the OKF blog has recently adopted a tradition of opening each post with a preliminary paragraph about the author. This means the start of every blog post is identical, so the introductory line would give the reader no information about whether the post is relevant to them.
Statistics from websites and other logs are almost exclusively time series data. The problem I faced when looking through the OKF’s statistics is that there are dozens of websites. Should they all be given equal weight? Is it important to be able to drill down from an aggregate figure to information about individual lists?
One thing I tried was a streamgraph of all the mailing list activity since the OKF began. Streamgraphs tend to work well when there are many data points over a long period; they were first used to visualise play counts in people’s music collections. However, the mailing list data just looked ugly: the aspect ratio was mucked up and it was quite difficult to tell what was going on. Besides, are messages from 2006 still relevant to a dashboard about today’s status?