How we have identified and analysed trends in the Knowledge Base
The trend identification method is based on a mixed-methods design combining qualitative and quantitative research approaches. The first section below explains how we applied data mining techniques in the trend identification process. The second section shows how we described trend tendencies in the Knowledge Base, based on specific Web of Science search results.
Trend identification on Twitter and Web of Science through data mining
To gain a broader understanding of relevant trends in the public sector, we used data mining techniques to identify trend frequencies in the Web of Science database and on Twitter.
Because of the large amount of data they hold, the Web of Science database and the social network Twitter are well suited to efficient data mining.
With 1.4 billion indexed references and over 20,000 journals, Web of Science is one of the main multidisciplinary academic literature collections. It also has the major advantage of allowing results to be exported for further analysis. With 330 million monthly active users, Twitter is one of the most widely used social networking sites in the world. The site is especially important for the present case because its users include many professionals from politics and from the public and private sectors.
To identify relevant trends, we performed a keyword search on both platforms using the following query: (“policy OR policies”) AND (“big data” OR “open data” OR “data analytics”) AND (“public”). We obtained over 2,700 related tweets on Twitter and roughly 4,600 related records on Web of Science (WOS). The words in these texts were reduced to their roots (stemmed) in a data mining process using the statistical programming language R. From this process we obtained ranked frequencies of two-word combinations.
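The ranking step can be illustrated with a minimal sketch. It is written in Python for illustration only; the original analysis was carried out in R, and the stop-word list, the crude suffix-stripping rule, and the function names below are assumptions, not the procedure actually used.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the list used in the original analysis is not given.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "or", "is", "are", "to", "for", "on"}

def crude_stem(word):
    """Very rough suffix stripping, a stand-in for a real stemmer."""
    if word.endswith("ss"):
        return word
    for suffix, replacement in (("ies", "y"), ("ing", ""), ("s", "")):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)] + replacement
    return word

def tokenize(text):
    """Lowercase the text, drop stop words, and stem the remaining words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]

def ranked_bigrams(documents):
    """Count two-word combinations across all documents, most frequent first."""
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        counts.update(f"{a}_{b}" for a, b in zip(tokens, tokens[1:]))
    return counts.most_common()

# Invented example texts, not taken from the collected data.
sample = [
    "Open data policies change public health",
    "Big data changes public health",
]
for bigram, count in ranked_bigrams(sample)[:3]:
    print(bigram, count)
```

Joining the two words with an underscore mirrors the form of terms such as “change_public” in the ranking. A real analysis would use an established stemmer rather than the simplified rule shown here.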
Notably, the health domain appears to be the one most affected by data analytics innovations: a health-related combination is the first two-word combination in the ranking that contains no terms from the search query.
Both the Web of Science query and the Twitter query showed that the terms “health” and “social media” rank very high. However, most of the resulting terms are highly varied and are mentioned only once or twice.
In the Twitter ranking, the term “change_public” also appeared at a high rank. It points to the transformation process in data analytics strategies that is under way in the public sector.
It is also noteworthy that a considerable number of tweets relate to the term “dehumanizing”, which indicates a controversial discussion about the use of data analytics technologies in the public sector.
Trend Tendencies in the BPC Knowledge Base
We also identified trends through qualitative expert interviews and desk research. To derive trend tendencies, we ran a corresponding trend query for each trend in the Web of Science database. The search results were exported to create trend visualisations in the open-source data analytics software “RStudio”. The figures in the trend items in the Knowledge Base show relative frequencies for the period from 2008 to 2017.
We focused on relative rather than absolute frequencies, defining the relative frequency as the number of trend records as a proportion of the total number of all records.
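As a minimal sketch of this definition, the relative frequency can be computed per year from two sets of record counts. The example is in Python for illustration only (the original visualisations were produced in R); the per-year breakdown, the function name, and all numbers are assumptions and not actual Web of Science figures.

```python
def relative_frequencies(trend_counts, total_counts):
    """Return, for each year, trend records divided by all records.

    Years with no records at all are skipped to avoid division by zero.
    """
    return {
        year: trend_counts.get(year, 0) / total
        for year, total in sorted(total_counts.items())
        if total > 0
    }

# Invented counts for illustration only.
trend_records = {2008: 5, 2009: 10}
all_records = {2008: 100, 2009: 100}

for year, share in relative_frequencies(trend_records, all_records).items():
    print(year, f"{share:.1%}")
```

Dividing by the total number of records makes years comparable even when the overall volume of indexed publications grows over time.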
To provide a specific view of public sector trends, we refined the results in a next step by applying the filter categories “public administration” and “political science”. All refined trend queries are referred to as “limited category selection”. As a result, trends are represented in the Knowledge Base with a focus on the general tendency (solid line) and on the public sector tendency (dotted line). Of course, the total number of tagged records in the limited category selection is significantly lower, but it shows the direct impact of the respective trend on the public sector.