Svein Ølnes
User testing, as Nielsen describes it, could have been an alternative method for ensuring a real user perspective in the evaluations. It had to be rejected, however, due to a lack of resources. It would nevertheless be very interesting to compare the results of user tests with the expert evaluations of the same websites. Such a comparison was done in connection with the Danish quality evaluations and showed that the expert evaluation was not necessarily in accordance with the user test of the same site.1
3. Method
The main part of the data underpinning this paper has been derived from several years of benchmarking public websites. Datasets from the years 2007 – 2011 have been used, restricted to a subset covering the municipalities. The quality evaluations of public websites in Norway are mainly based on the principle of heuristic evaluation. A set of indicators has been formulated by combining different heuristics, technical requirements, and governmental policy guidelines for accessibility, usability, and useful services. Each indicator is given a score in terms of points. A yes/no indicator has two possible values, while an indicator with several possible results is given a range of values. The maximum number of points for an indicator is also the weight put on that indicator; there is no other weighting in the indicator set.
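The scoring scheme described above can be sketched in code. This is a minimal illustration only: the indicator names, point values, and data structures below are hypothetical, since the paper specifies just that each indicator's maximum points also serve as its weight, with no separate weighting step.

```python
from dataclasses import dataclass

@dataclass
class Indicator:
    """One quality indicator; its maximum points double as its weight."""
    name: str
    max_points: int
    binary: bool = False  # yes/no indicators have exactly two possible values

    def score(self, result) -> int:
        """Return the points awarded for an observed result."""
        if self.binary:
            # yes/no: either full points or none
            return self.max_points if result else 0
        # graded indicator: result is an integer in the range 0..max_points
        if not 0 <= result <= self.max_points:
            raise ValueError("result outside the indicator's value range")
        return result

def total_score(indicators, results):
    """Sum per-indicator points; no additional weighting is applied."""
    return sum(ind.score(results[ind.name]) for ind in indicators)

# Hypothetical indicators and evaluation results
indicators = [
    Indicator("valid_html", max_points=3, binary=True),
    Indicator("contact_info_findable", max_points=2, binary=True),
    Indicator("service_depth", max_points=5),  # graded 0..5
]
results = {"valid_html": True, "contact_info_findable": False, "service_depth": 4}
print(total_score(indicators, results))  # → 7 (3 + 0 + 4)
```

The point of the sketch is that weighting is implicit: an indicator worth 5 points simply contributes up to 5 points to the total, so no second weighting pass is needed.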
The method for assessing the quality of a website is an expert evaluation involving a group of experts, altogether 5 – 10 experts from 2 – 3 different organizations. Before the annual evaluations start, the group of evaluators is trained using the set of indicators: they all evaluate the same websites, discuss any differences in scoring, and try to harmonize their understanding of the indicators and the way they should be tested.
The actual evaluation starts with the expert looking up the municipality's home page and going through the website with the list of indicators at hand (or rather, on his or her PC). The evaluation is a combination of observation and technical validation using different HTML testing tools. The expert is given 75 minutes to evaluate the site against the 32 – 33 indicators, which leaves a little more than 2 minutes for assessing each indicator.
The survey of web administrators was a quantitative study using a questionnaire of 16 questions, mainly about the benchmarking system. The online questionnaire was sent to all 429 municipalities in Norway and 245 of them responded, giving a response rate of 57 %.
The third data source used in this paper is a survey of around 30 000 citizens above 18 years of age, drawn from the Norwegian census database. The first part of the survey was conducted in 2009 and the second part in 2010, and the plan is to conduct the survey on a biennial basis. 42 % of the sample responded. Surprisingly, as many as 83 % of the respondents chose to deliver their answers on paper; the rest answered online. Part one of the survey asked how satisfied the citizens were with the municipality they lived in and with various municipal and governmental services. Part two, undertaken in 2010, asked more detailed questions about a set of predefined public services. The questionnaire for the second part was sent to those respondents of part one who had any experience with the selected services during the last year, which turned out to be 11 135 of the 12 659 citizens. Of these, a total of 6 646 citizens answered (60 %). In this paper, mostly results from the second part of the citizen survey have been used.
4. Results from Norwegian benchmarks and surveys
Here the results from the three different studies are presented: (a) the results from the expert evaluation of public websites 2007 to 2011, (b) the survey of web administrators of public websites, and (c) the user survey targeted at the users of public websites.
4.1 Expert evaluations of websites 2007 – 2011
The set of indicators used for the expert evaluation of public websites in Norway has undergone only modest changes from 2007 to 2011. The set is divided into subsets; the following table lists the subsets together with the number of indicators and the maximum score in points.
1 A user test was conducted of the municipal website that scored among the highest in the expert evaluation. The user test, however, showed that the site was poor in terms of usability as experienced by the users.